Mailing List Archive

Devil's Advocate - Segment Routing, Why?
Hi all.

When the whole SR concept was being first dreamed up, I was mildly
excited about it. But then real life happened and global deployment (be
it basic SR-MPLS or SRv6) is what it is, and I became less excited. This
was back in 2015.

All the talk about LDPv6 this and last week has had me reflecting a
great deal on where we are, as an industry, in why we are having to
think about SR and all its incarnations.

So, let me be the one that stirs up the hornets' nest...

Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?

I've heard a lot about "network programmability", etc., but can anyone
point me toward a solution that actually does this in the way that it
has been touted for years? A true flow that shows the implementation of
"network programming" over any incarnation of SR? Perhaps one a customer
can go to the shop and grab off the shelf?

A lot of kit does not currently support SR, be it in hardware or
software. So are operators expected to dispose of boxes that are happily
moving MPLS frames along with no complaints, and replace them with some
newfangled creations that will support SR in code and silicon? At whose
cost? Not just money, but time, people and working the day-to-day kinks out?

I've heard about "end-to-end service chaining" as a use-case for SR. To
service-chain what? Classic telcos don't offer complex over-the-top
services that operate at such a scale that "service chaining" in SR
would make lives easier. More than half of the traffic we are carrying
is coming in over the public Internet, and not some private VPN. And if
"service chaining" makes sense to the cloud and content operators who
run humongous data centres where the servers significantly outnumber the
routing/switching/transport gear, I'd naively posit that they have built
a myriad of custom, in-house solutions, systems, tools and controllers
to do all the "service chaining" they could ever need, and have been at
it for 10 years, if not more, all to manage an MPLS/DWDM
backbone. So what off-the-shelf "service chaining" controller are they
going to walk into the shop and pay money for?

If I had to think of the number of network, content and cloud operators
who have either said they've deployed some kind of SR, or intend to,
you're looking at probably 10% - 15% of the market. What about the other
85% - 90% of operators, whose requirements are so simple that dumping
existing boxes, systems, tools and solutions that work very well in
order to join the SR club doesn't seem feasible? What
problems are 90% of the operators running MPLS having that SR will truly
fix, given that they don't operate large, distributed data centres or
have a 5G license?

What's even wilder is that there are also a number of networks
that are stalling IPv6 deployment, for some reason or other, meaning it
will probably take us another 1 to 2 decades to see worldwide adoption
of IPv6. If SRv6 or SRv6+ is "where the market is dying to go", and a
bunch of operators don't have IPv6 in their plans, what gives?

To be clear, I'm not against SR; what has to come will come. What I am
less enthused about is being forced into an all-or-nothing scenario for
the going concern of my network. For those that are keen on SR, give
them SR. But for those who would prefer to keep things simple in
networks that are not about to fall over and die, let's have LDPv6 and
let's implement RFC 7439.

Then let the operators choose.

On a personal note, it's a pity Juniper gave in to the SRv6 fight,
despite all the initial resistance. If they'd gone a different direction
and simply implemented RFC 7439 (they have LDPv6 already), not only
would that have put Cisco under serious pressure, but it would have
solved the problems of many network operators that are desperately
looking to go IPv6-only, and still maintain the rich MPLS services they
and their customers have grown to like.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
Hey,

> Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?

I don't like this, SR-MPLS and SRv6 are just utterly different things
to me, and no answer meaningfully applies to both.

I would ask, why do we need LDP, why not use IGP to carry labels?

Less state, protocols, SLOC, cost, bug surface

And we get more features to boot. With LDP, if you want LFA, you need
to form tLDP to every Q-space node, on top of your normal LDP, because
you don't know the label view of anyone but yourself. With SR, by
nature, you know the label view of everyone, thus you have full LFA
coverage for free, by design.
Also by-design IGP/LDP Sync.

So no need to justify it by any magic new things, it's just a lot
simpler than LDP, you don't need to need new things to justify
SR-MPLS, you need to want to do existing things while reducing
complexity and state.
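
The "label view for everyone" point can be sketched as follows. In SR-MPLS each node floods its label block (SRGB) and its prefix-SID index in the IGP, so any router can compute the label any other router expects without a per-neighbor label session. This is a minimal illustration only: the node names, SRGB bases and SID indexes below are made-up assumptions, not taken from any real network.

```python
# Hypothetical link-state database contents for SR-MPLS.
# Node names, SRGB bases/sizes and SID indexes are illustrative assumptions.
SRGB_BASE = {"A": 16000, "B": 16000, "C": 20000}  # start of each node's label block
SRGB_SIZE = {"A": 8000, "B": 8000, "C": 8000}     # size of each node's label block
PREFIX_SID_INDEX = {"A": 1, "B": 2, "C": 3}       # index each node advertises for its loopback


def label_expected_by(node: str, destination: str) -> int:
    """Label that `node` expects on packets headed to `destination`."""
    index = PREFIX_SID_INDEX[destination]
    if index >= SRGB_SIZE[node]:
        raise ValueError("SID index falls outside the node's SRGB")
    return SRGB_BASE[node] + index


def label_view(destination: str) -> dict:
    """Every node's label for `destination`, derived from flooded data alone."""
    return {node: label_expected_by(node, destination) for node in SRGB_BASE}
```

For example, `label_expected_by("C", "B")` is computed locally as C's SRGB base plus B's index, which is the information a node would otherwise need a targeted LDP session to learn.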

--
++ytti
Re: Devil's Advocate - Segment Routing, Why?
On Wed, 17 Jun 2020 at 18:42, Saku Ytti <saku@ytti.fi> wrote:

> Hey,
>
> > Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?
>
> I don't like this, SR-MPLS and SRv6 are just utterly different things
> to me, and no answer meaningfully applies to both.
>

I don't understand the point of SRv6. What equipment can support IPv6
routing, but can't support MPLS label switching?

I'm a big fan of SR-MPLS however.




> And we get more features to boot, with LDP if you want LFA, you need
> to form tLDP to every Q-space node, on top of your normal LDP, because
> you don't know label view from anyone else but yourself. With SR by
> nature you know the label view for everyone, thus you have full LFA
> coverage for free, by-design.


Not just this, but the LFA path is always the post-convergence path. You
don't get microloops.
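
As a rough illustration of what "post-convergence path" means here: it is the shortest path the IGP would pick once the failed link is gone, which can be computed ahead of time by running SPF on the topology with that link removed. The sketch below assumes a toy four-node topology with made-up costs, and it only computes the path; a real implementation would also have to derive a segment list that steers traffic along it, which is not shown.

```python
import heapq


def shortest_path(graph, src, dst, failed_link=None):
    """Dijkstra SPF; `failed_link` is an undirected (u, v) pair to ignore."""
    skip = {failed_link, failed_link[::-1]} if failed_link else set()
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if (u, v) in skip:
                continue
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy topology (assumption): A-B-D is the primary path, A-C-D the alternative.
TOPOLOGY = {
    "A": {"B": 10, "C": 10},
    "B": {"A": 10, "D": 10},
    "C": {"A": 10, "D": 20},
    "D": {"B": 10, "C": 20},
}

primary = shortest_path(TOPOLOGY, "A", "D")
post_convergence = shortest_path(TOPOLOGY, "A", "D", failed_link=("A", "B"))
```

If the repair path equals `post_convergence`, traffic does not move a second time when the IGP finishes converging.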

You can implement TE on top if that is your thing. No need to run RSVP.
Another protocol you don't need to run.

You don't need to throw out all your old kit, and replace with new in one
go. You can incrementally roll it out, and leave islands of LDP where
needed. LDP-SR interworking is pretty simple.

We are currently introducing it into our core. It will probably be a while
before we fully phase out LDP, but it's definitely on the roadmap.

Regards,
Dave
RE: Devil's Advocate - Segment Routing, Why?
> From: NANOG <nanog-bounces@nanog.org> On Behalf Of Mark Tinka
> Sent: Wednesday, June 17, 2020 6:07 PM
>
>
> I've heard a lot about "network programmability", e.t.c.,
First of all, "SR = network programmability" is BS. SR = MPLS; any programmability we've ever had for MPLS works the same way for SR.

> but can anyone
> point me toward a solution that actually does this in the way that it has been
> touted for years? A true flow that shows the implementation of "network
> programming" over any incarnation of SR? Perhaps one a customer can go to
> the shop and grab off the shelf?
>
Yes, anything that works for RSVP-TE (i.e. PCEP). If you want to play, there's a free app on top of ODL (acting as PCEP+BGP-LS) to program LSPs (can't recall the name).


> I've heard about "end-to-end service chaining" as a use-case for SR.
"service chaining" = traffic-engineering, you can do that with or without SR just fine.

> To
> service-chain what?
To service-chain DC or as hipsters call it "cloud" stuff. To TE path from VM to FW to ...whatever, or to TE mice flows around elephant flows.

> Classic telco's don't offer complex over-the-top services
They do via telco cloud.

> What problems are 90% of the
> operators running MPLS having that SR will truly fix,
>
None,
The same point I was trying to get across in our LDPv6 (or any v6 in control-plane or management plane for that matter) discussion, there's no problem to solve.
Personally I'll be doing SR only in brand new greenfield deployments or if I start running out of RSVP-TE scale on existing deployments.


adam
Re: Devil's Advocate - Segment Routing, Why?
On 17/06/2020 18:38, Saku Ytti wrote:
>> Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?
> I don't like this, SR-MPLS and SRv6 are just utterly different things
> to me, and no answer meaningfully applies to both.
>
> I would ask, why do we need LDP, why not use IGP to carry labels?
>
> Less state, protocols, SLOC, cost, bug surface
>
> And we get more features to boot, with LDP if you want LFA, you need
> to form tLDP to every Q-space node, on top of your normal LDP, because
> you don't know label view from anyone else but yourself. With SR by
> nature you know the label view for everyone, thus you have full LFA
> coverage for free, by-design.
> Also by-design IGP/LDP Sync.
>
> So no need to justify it by any magic new things, it's just a lot
> simpler than LDP, you don't need to need new things to justify
> SR-MPLS, you need to want to do existing things while reducing
> complexity and state.


Unsurprisingly, there would be no way on Earth that I could have said
that better, so you shall find only loud cheering from over here.

--
Tom
Re: Devil's Advocate - Segment Routing, Why?
On 17/Jun/20 19:38, Saku Ytti wrote:

> I don't like this, SR-MPLS and SRv6 are just utterly different things
> to me, and no answer meaningfully applies to both.

I know they are different, but that was on purpose, because even with
SR-MPLS, there are a couple of things to consider:

* IOS XR does not appear to support SR-OSPFv3.
* IOS XE does not appear to support SR-ISISv6.
* IOS XE does not appear to support SR-OSPFv3.
* Junos does not appear to support SR-OSPFv3.
* MPLS/VPN service signaling in IPv6-only networks also has gaps in SR.

So for networks that run OSPF and don't run Juniper, they'd need to move
to IS-IS in order to have SR forward IPv6 traffic in an MPLS
encapsulation. Seems like a bit of an ask. Yes, code needs to be
written, which is fine by me, as it also does for LDPv6.


> I would ask, why do we need LDP, why not use IGP to carry labels?
>
> Less state, protocols, SLOC, cost, bug surface

I'd be curious to understand what bugs you've suffered with LDP in the
last 10 or so years, that likely still have open tickets.

Yes, we all love less state, I won't argue that. But it's the same
question that is being asked less and less with each passing year - what
scales better in 2020, OSPF or IS-IS? That is becoming less relevant as
control planes keep getting faster and cheaper.

I'm not saying that if you are dealing with 100,000 T-LDP sessions you
should not consider SR, but if you're not, and SR still requires a bit
more development (never mind deployment experience), what's wrong with
having LDPv6? If it makes near-as-no-difference to your control plane in
2020 or 2030 as to whether your 10,000-node network is running LDP or
SR, why not have the choice?


>
> And we get more features to boot, with LDP if you want LFA, you need
> to form tLDP to every Q-space node, on top of your normal LDP, because
> you don't know label view from anyone else but yourself. With SR by
> nature you know the label view for everyone, thus you have full LFA
> coverage for free, by-design.
> Also by-design IGP/LDP Sync.
>
> So no need to justify it by any magic new things, it's just a lot
> simpler than LDP, you don't need to need new things to justify
> SR-MPLS, you need to want to do existing things while reducing
> complexity and state.

Again, it's a question of scale and requirements. Some large networks
don't run any RSVP, while some small networks do.

I'm not saying let's not do SR; but for those who want something mature,
and for those who want something new, I don't see a reason why the
choice can't be left up to the operator.

Routers, in 2020, still ship with RIPv2. If anyone wants to use it (as I
am sure there are some that do), who are we to stand in their way, if it
makes sense for them?

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 17/Jun/20 20:40, Dave Bell wrote:


> I don't understand the point of SRv6. What equipment can support IPv6
> routing, but can't support MPLS label switching?

Indeed.

Anything that can support LDPv4 today can support LDPv6, in hardware.

SRv6 and SRv6+ are a whole other issue, not to mention the amount of work
needed to write code for it.


> Not just this, but the LFA path is always the post-convergence path.
> You don't get microloops.
>
> You can implement TE on top if that is your thing. No need to run
> RSVP. Another protocol you don't need to run.
>
> You don't need to throw out all your old kit, and replace with new in
> one go. You can incrementally roll it out, and leave islands of LDP
> where needed. LDP-SR interworking is pretty simple.
>
> We are currently introducing it into our core. It will probably be a
> while before we fully phase out LDP, but it's definitely on the roadmap.

Happy to hear, and I have nothing against your choice if you are happy
with it.

But for a network that may not see the need in spending cycles doing
yet-another roll out, it tastes funny when you are forced down a new path.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 17/Jun/20 23:07, adamv0025@netconsultings.com wrote:

> First of all the "SR = network programmability" is BS, SR = MPLS, any
> programmability we've had for MPLS since ever works the same way for SR.

I see it the same way.


> Yes anything that works for RSVP-TE (i.e. PCEP), if you want to play there's this free app on top of ODL(acting as PCEP+BGP-LS) to program LSPs (can't recall the name).

In short, more work, and not the panacea it was made out to be. No
problem with that, if you're one to roll your sleeves up.


> "service chaining" = traffic-engineering, you can do that with or without SR just fine.

I don't make the terms up... best-of-breed and all that :-).


> To service-chain DC or as hipsters call it "cloud" stuff. To TE path from VM to FW to ...whatever, or to TE mice flows around elephant flows.

And how many classic telcos are doing this at scale in a way that only
SR can solve?


> They do via telco cloud.

What's that :-)?


> None,
> The same point I was trying to get across in our LDPv6 (or any v6 in control-plane or management plane for that matter) discussion, there's no problem to solve.
> Personally I'll be doing SR only in brand new greenfield deployments or if I start running out of RSVP-TE scale on existing deployments.

If I want to remove BGP state in the core (which is a good thing, given
how heavy BGP code and FIB requirements are), LDPv4 and LDPv6 are useful
for native dual-stack networks that do not share fate between either IP
protocol.

But, YMMV on that one :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 17/Jun/20 23:46, Tom Hill wrote:

> Unsurprisingly, there would be no way on Earth that I could have said
> that better, so you shall find only loud cheering from over here.

Out of pure curiosity, have you deployed (or are you deploying)?

Mark.
Re: Devil's Advocate - Segment Routing, Why?
>
> Anything that can support LDPv4 today can support LDPv6, in hardware.
>

While I am trying to stay out of this interesting discussion the above
statement is not fully correct.

Yes, in the MPLS2MPLS path you are correct.

But the ingress and egress switching vectors are very different for LDPv6:
at ingress you need to match on IPv6, whereas with LDPv4 you match on IPv4,
to map the packet to the correct label stack rewrite.

Example: if your hardware ASICs support IPv4 but not IPv6, LDPv4 will work
just fine while LDPv6 will have rather a hard time
:)
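
The distinction can be sketched like this: a transit router only does an exact match on the incoming label, so the address family never matters, while the ingress router must do a longest-prefix match on the IP destination to pick the label to push, and that lookup needs forwarding support for the address family in question. The tables, labels, prefixes and interface names below are hypothetical, and `supported_versions` is just a stand-in for what the hardware can look up.

```python
import ipaddress

# Transit: incoming label -> (outgoing label, interface). Hypothetical values.
LFIB = {24001: (24017, "if1"), 24002: (24018, "if1")}

# Ingress: destination prefix -> (label to push, interface). Hypothetical values.
FTN = {
    ipaddress.ip_network("198.51.100.0/24"): (24001, "if0"),
    ipaddress.ip_network("2001:db8:100::/48"): (24002, "if0"),
}


def transit_swap(in_label):
    """MPLS-to-MPLS: exact match on the label, no IP lookup involved."""
    return LFIB[in_label]


def ingress_push(dst, supported_versions=(4, 6)):
    """IP-to-MPLS: longest-prefix match in the destination's address family."""
    addr = ipaddress.ip_address(dst)
    if addr.version not in supported_versions:
        raise LookupError(f"no IPv{addr.version} lookup support at ingress")
    matches = [n for n in FTN if n.version == addr.version and addr in n]
    if not matches:
        raise LookupError("no matching prefix")
    best = max(matches, key=lambda n: n.prefixlen)
    return FTN[best]
```

With `supported_versions=(4,)`, `transit_swap(24002)` still works for a label that happens to carry IPv6 traffic, while `ingress_push("2001:db8:100::1", supported_versions=(4,))` fails.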

Cheers,
R.
Re: Devil's Advocate - Segment Routing, Why?
On Thu, 18 Jun 2020 at 01:17, Mark Tinka <mark.tinka@seacom.mu> wrote:

> IOS XR does not appear to support SR-OSPFv3.
> IOS XE does not appear to support SR-ISISv6.
> IOS XE does not appear to support SR-OSPFv3.
> Junos does not appear to support SR-OSPFv3.

The IGP mess we are in is horrible, but I can't blame SR for it. It's
really unacceptable we spend NRE hours developing 3 identical IGP
(OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
IGP.

In a sane world, we'd retire all of them except OSPFv3 and put all NRE
focus on there or move some of the NRE dollars to some other problems
we have, perhaps we would have room to support some different
non-Dijkstra IGP.

In a half sane world, IGP code, 90% of your code would be identical,
then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
translates internal struct to wire and wire to internal struct. So any
features you code, come for free to all of them. But no one is doing
this, it's 300% effort, and we all pay a premium for that.
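
A minimal sketch of that adapter idea, under loud assumptions: the feature logic works on one protocol-independent internal struct, and each IGP gets a thin adapter that only translates struct to wire and back. The framing below is a toy type-length-value layout with made-up type codes and header widths; it is not the real OSPFv2, OSPFv3 or IS-IS encoding of anything.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixSid:
    """Protocol-independent internal struct for one piece of feature data."""
    prefix: str  # e.g. "192.0.2.1/32"
    index: int   # SID index


class ToyAdapter:
    """Internal struct <-> wire. Toy framing only, NOT a real OSPF/IS-IS TLV."""

    def __init__(self, type_code: int, header_fmt: str):
        self.type_code = type_code
        self.header_fmt = header_fmt  # struct format of the (type, length) header

    def encode(self, sid: PrefixSid) -> bytes:
        value = struct.pack("!I", sid.index) + sid.prefix.encode("ascii")
        return struct.pack(self.header_fmt, self.type_code, len(value)) + value

    def decode(self, data: bytes) -> PrefixSid:
        header_len = struct.calcsize(self.header_fmt)
        type_code, length = struct.unpack(self.header_fmt, data[:header_len])
        if type_code != self.type_code:
            raise ValueError("unexpected type code")
        value = data[header_len:header_len + length]
        (index,) = struct.unpack("!I", value[:4])
        return PrefixSid(prefix=value[4:].decode("ascii"), index=index)


# One thin adapter per IGP; type codes and header widths are made up.
ADAPTERS = {
    "ospfv2": ToyAdapter(type_code=1, header_fmt="!HH"),
    "ospfv3": ToyAdapter(type_code=1, header_fmt="!HH"),
    "isis": ToyAdapter(type_code=2, header_fmt="!BB"),
}


def advertise(sid: PrefixSid) -> dict:
    """Shared feature code: written once, emitted through every adapter."""
    return {igp: adapter.encode(sid) for igp, adapter in ADAPTERS.items()}
```

Adding a new feature in this model means adding a new internal struct and its value encoding once, rather than implementing it three times.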

In a quarter sane world we'd have some CIC, common-igp-container RFC
and then new features like SR would be specified as CIC-format,
instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
features do not need to write 4 drafts, one is enough.

I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
is supported on platforms I care about for IPV4+IPV6, so I'm already
there.

> MPLS/VPN service signaling in IPv6-only networks also has gaps in SR.

I don't understand this.


> So for networks that run OSPF and don't run Juniper, they'd need to move to IS-IS in order to have SR forward IPv6 traffic in an MPLS encapsulation. Seems like a bit of an ask. Yes, code needs to be written, which is fine by me, as it also does for LDPv6.

And it's really just adding TLV, if it already does IPv4 all the infra
should be in place, only thing missing is transporting the
information. Adding TLV to IGP is a lot less work than LDPv6.

> I'd be curious to understand what bugs you've suffered with LDP in the last 10 or so years, that likely still have open tickets.

3 within a year.
- PR1436119
- PR1428081
- PR1416032

I don't have IOS-XR LDP bugs within a year, but we had a bunch back
when going from 4 to 5. And none of these are cosmetic, these are
blackholing.

I'm not saying LDP is bad, it's just that, of course, the more lines of
code you exercise, the more bugs you see.

Yes, LDP has a lot of bug surface compared to SR, but in _your
network_ a lot of that bug surface and complexity is amortised
complexity. So status quo bias is strong to keep running LDP, it is
simpler _NOW_ as a lot of the tax has been paid and moving to an
objectively simpler solution carries risk, as its complexity is not
amortised yet.


> Yes, we all love less state, I won't argue that. But it's the same question that is being asked less and less with each passing year - what scales better in 2020, OSPF or IS-IS. That is becoming less relevant as control planes keep getting faster and cheaper.

I don't think it ever was relevant.

> I'm not saying that if you are dealing with 100,000 T-LDP sessions you should not consider SR, but if you're not, and SR still requires a bit more development (never mind deployment experience), what's wrong with having LDPv6? If it makes near-as-no-difference to your control plane in 2020 or 2030 as to whether your 10,000-node network is running LDP or SR, why not have the choice?

I can't add anything to the upside of going from LDP to SR that I've
not already said. You get more by spending less, it's win:win. Only
reason to stay in LDP is status quo bias which makes short term sense.

> Routers, in 2020, still ship with RIPv2. If anyone wants to use it (as I am sure there are some that do), who are we to stand in their way, if it makes sense for them?

RIP might make sense in some deployments, because it's essentially
stateless (routes age out, no real 'session') so if you have 100k VM
per router that you need to support and you want dynamic routing, RIP
might be the path of least resistance with the highest scale. Timing
wheels should help it scale and maintain a great number of timers.
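
For readers unfamiliar with the term, a timing wheel is a ring of buckets indexed by tick, so arming or refreshing a route's age-out timer is O(1) no matter how many timers exist. Below is a minimal single-level sketch, assuming integer ticks and timeouts shorter than one wheel revolution; the class and method names are illustrative and not from any router implementation.

```python
class TimingWheel:
    """Single-level timing wheel: O(1) arm/refresh, O(expired) work per tick."""

    def __init__(self, slots: int):
        self.slots = [set() for _ in range(slots)]
        self.now = 0       # current tick
        self.slot_of = {}  # key -> index of the slot currently holding it

    def arm(self, key, ticks: int):
        """Start or refresh the timer for `key` to fire `ticks` ticks from now."""
        if not 0 < ticks < len(self.slots):
            raise ValueError("timeout must fit within one wheel revolution")
        self.cancel(key)
        slot = (self.now + ticks) % len(self.slots)
        self.slots[slot].add(key)
        self.slot_of[key] = slot

    def cancel(self, key):
        """Remove the timer for `key`, if any."""
        slot = self.slot_of.pop(key, None)
        if slot is not None:
            self.slots[slot].discard(key)

    def tick(self):
        """Advance one tick and return the set of keys whose timers expired."""
        self.now += 1
        slot = self.now % len(self.slots)
        expired, self.slots[slot] = self.slots[slot], set()
        for key in expired:
            del self.slot_of[key]
        return expired
```

In a RIP-like setting, each received update would call `arm(route, timeout)` again, and any route returned by `tick()` has aged out because no refresh arrived in time.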

--
++ytti
Re: Devil's Advocate - Segment Routing, Why?
On 18/Jun/20 00:29, Robert Raszuk wrote:

>
> Example: If your hardware ASICs do not support IPv6 while support IPv4
> - LDPv4 will work just fine while LDPv6 will have rather a hard
> time :)

Well, safe to say that if your box doesn't support IPv6, MPLSv6 is
probably the least of your worries :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 18/Jun/20 07:25, Saku Ytti wrote:

> The IGP mess we are in is horrible, but I can't blame SR for it. It's
> really unacceptable we spend NRE hours developing 3 identical IGP
> (OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
> IGP.
>
> In a sane world, we'd retire all of them except OSPFv3 and put all NRE
> focus on there or move some of the NRE dollars to some other problems
> we have, perhaps we would have room to support some different
> non-Dijkstra IGP.
>
> In a half sane world, IGP code, 90% of your code would be identical,
> then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
> translates internal struct to wire and wire to internal struct. So any
> features you code, come for free to all of them. But no one is doing
> this, it's 300% effort, and we all pay a premium for that.
>
> In a quarter sane world we'd have some CIC, common-igp-container RFC
> and then new features like SR would be specified as CIC-format,
> instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
> ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
> features do not need to write 4 drafts, one is enough.

While I don't have a real opinion on how to fix the IGP mess, the point
is we sit with it now. Getting all these fixed is going to increase the
bug surface area for some time to come as both vendors and operators
work the kinks out, in addition to SR's own kinks.

Yes, it's all par for the course for new features, which is why I'd also
like to have an alternative that has been baked in for many years to
give me an option for stability, as we roll the new kid out.

I probably will deploy SR-MPLS at some point in my lifetime, but I'm not
feeling awfully comfortable to do so right now; and yet I do need MPLSv6
forwarding.


>
> I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
> is supported on platforms I care about for IPV4+IPV6, so I'm already
> there.

Which is great for you, me, and a ton of other folk that run IS-IS on
Juniper. What about folk that don't have Juniper, or run OSPF?

I know, not your or my problem, but the Internet isn't just a few networks.



> I don't understand this.

I mean the same gaps that exist in RFC 7439, for would-be IPv6-only MPLS
networks.



> And it's really just adding TLV, if it already does IPv4 all the infra
> should be in place, only thing missing is transporting the
> information. Adding TLV to IGP is a lot less work than LDPv6.

What we theorize as "should be easy" can turn out to be a whole
discussion with the vendors about it being months or years of work. Not
being inside their meeting rooms, I can't quite challenge how they
present the task.

Fundamentally, LDPv6 already has 5+ years in implementation (and LDPv4
is 20 years old), inter-op issues seem to be mostly fixed, and for what
we need it to do, it's working very well.

There are probably as many networks running SR-MPLS as there are running
LDPv6, likely fewer if your SR deployment doesn't yet support OSPFv3 or
SR-ISISv6. I concede that for some networks looking to go SR-MPLS, label
distribution state reduction is probably higher up on the agenda than
MPLSv6 forwarding. For me, I'd like the option to have both, and decide
whether my network is in a position to handle the additional state
required for LDPv6, if I feel that I'd prefer to deal with a protocol
that has had more exposure to the sun.

Ultimately, boxes with LDPv6 have been shipping for some time, and we
have a ton of them deployed and running for a while now. If it comes
down to kicking out the 20% that won't support it because of an
all-or-nothing vendor approach on a platform without full SR-MPLS
support for all IGP's, it is what it is.



> 3 within a year.
> - PR1436119
> - PR1428081
> - PR1416032
>
> I don't have IOS-XR LDP bugs within a year, but we had a bunch back
> when going from 4 to 5. And none of these are cosmetic, these are
> blackholing.
>
> I'm not saying LDP is bad, it's just, of course more code lines you
> exercise more bugs you see.
>
> But yes, LDP has a lot of bug surface compared to SR, but in _your
> network_ lot of that bug surface and complexity is amortised
> complexity. So status quo bias is strong to keep running LDP, it is
> simpler _NOW_ as a lot of the tax has been paid and moving to an
> objectively simpler solution carries risk, as its complexity is not
> amortised yet.

And FWIW, if some operators are willing to benefit from all the
experience that has gone into developing and maintaining LDP, while we
let SR settle down, I don't see why that choice shouldn't be there.

I'm not saying it should be an SR vs. LDP debate like it was
BGP-signaling vs. LDP-signaling for VPLS 12+ years ago. All I'm saying
is for those who want to go bleeding edge with SR, go for it. For those
who prefer to gracefully transition toward SR over time by settling on
LDP that has been in the field for a minute, go for it too.

I won't claim to know whether LDP or SR have a smaller or larger bug
surface area. What I do know is that there will be plenty of bugs for
SR, as there have been for MPLS and all related protocols in the last
20+ years. From my side, I'd prefer to give SR the time it needs to get
all of its Vitamin D, but don't oppose anyone that prefers to deploy it.


> I can't add anything to the upside of going from LDP to SR that I've
> not already said. You get more by spending less, it's win:win. Only
> reason to stay in LDP is status quo bias which makes short term sense.

I can't argue the usefulness of reducing label distribution state in
MPLS. Heck, that is what got me excited about SR back in 2013, and also
what caused me to pump the brakes on the noise I was making to vendors
about developing LDPv6 (which started in 2008), because I was finally
going to get native MPLSv6 forwarding in SR without all the LDP/RSVP
fluff. But, things took their own turn, and with the IGP mess that it
currently is, we are where we are. Thankfully, some vendors did develop
LDPv6 anyway, so we got MPLSv6 in the end as SR was still in the embryo.

If I'm still in the game in half-a-decade from now or so, I will very
likely dump LDP and move to SR-MPLS. I'm just not too comfortable doing
so now because IGP support is not where it needs to be, and it still has
to go through its own life cycle of bugs and fixes, which will be quite an
effort as global deployment is still far behind LDP and RSVP.


> RIP might make sense in some deployments, because it's essentially
> stateless (routes age out, no real 'session') so if you have 100k VM
> per router that you need to support and you want dynamic routing, RIP
> might be the least resistance solution with the highest scale. Timing
> wheels should help it scale and maintain great number of timers.

I guess my point was the vendors won't be dumping RIP, even if general
consensus is to avoid it whenever possible.

If I'm not concerned about LDP state, and protocol stability is more
important to me in the near-to-medium term, we'd be remiss to start a
culture of taking that choice away.

Because the next time vendors get bored with what they've built and sold,
and decide that SR or some other feature has seen enough light of day
and it's time to dream up something else to shout about in the 2030 -
2040 decade, they'll have had the experience of cornering operators into
making rash decisions, and they'd never let us forget it.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On Thu, 18 Jun 2020 at 10:13, Mark Tinka <mark.tinka@seacom.mu> wrote:

> Which is great for you, me, and a ton of other folk that run IS-IS on
> Juniper. What about folk that don't have Juniper, or run OSPF?
>
> I know, not your or my problem, but the Internet isn't just a few networks.

Yes work left to be done. Ultimately the root problem is, no one cares
about IPv6. But perhaps work with vendors in parallel to LDPv6 to get
them to fix OSPFv3 and/or ISIS.

> I'm not saying it should be an SR vs. LDP debate like it was
> BGP-signaling vs. LDP-signaling for VPLS 12+ years ago. All I'm saying

FWIW I am definitely saying that, and it should be IGP+BGP. I do
accept and realise a lot of platforms only did and do Martini not
Kompella, so reality isn't quite there.

--
++ytti
Re: Devil's Advocate - Segment Routing, Why?
On 18/Jun/20 09:30, Saku Ytti wrote:

> Yes work left to be done. Ultimately the root problem is, no one cares
> about IPv6. But perhaps work with vendors in parallel to LDPv6 to get
> them to fix OSPFv3 and/or ISIS.

Yes, this.

Vendor feedback for those not supporting LDPv6 is that there is no
demand for it. And like I said in the previous thread, LDPv6 demand is
not about LDPv6, it's about IPv6.

If the majority of the vendors' favorite, high-paying customers that pay
for CGN's continue to do so, what incentive do they have to ask for
IPv6? The T-Mobile US's of the world are few and far between, sadly.

I suppose I would not be unwilling to push the vendors to support
SR-OSPFv3 and SR-ISISv6 as I am also pushing them to support LDPv6 where
it is lacking, because at some point in the future, I do want to deploy
SR-MPLS in the same way I envisioned doing so back in 2014. I just need
to take it on a few dates first before I bring it home to meet the folks
:-).



> FWIW I am definitely saying that, and it should be IGP+BGP. I do
> accept and realise a lot of platforms only did and do Martini not
> Kompella, so reality isn't quite there.

That was me in 2013/2014. Dump LDP, dump RSVP, get SR deployed, forward
IPv4 natively in MPLSv4, and IPv6 natively in MPLSv6. But life happened.

Nonetheless, I will go SR-MPLS in many years to come, after I'm feeling
comfortable about it. That's a promise. But until then, I'd like
trusted, stable IPv4-IPv6 MPLS forwarding parity.

I have never cared much for VPLS because I thought it was a very messy
piece of tech. from Day 1. And while EVPN makes more sense, for our
market, more than 98% of the traffic we sell is IP-based, so we have no
demand for mp2mp Ethernet VPN's. But for those that adore VPLS (or
EVPN), let them have the choice of LDP or BGP, which Cisco and
Juniper, after years of muscle-flexing, both ended up agreeing on
anyway, despite all the fuss.

So the LDPv6 vs. SR-MPLS vs. SRv6 vs. SRv6+ posturing is a rehash of
those LDP vs. BGP days, which just wastes everyone's time.

Mark.
RE: Devil's Advocate - Segment Routing, Why?
> From: Saku Ytti
> Sent: Thursday, June 18, 2020 6:26 AM
>
> On Thu, 18 Jun 2020 at 01:17, Mark Tinka <mark.tinka@seacom.mu> wrote:
>
> > Yes, we all love less state, I won't argue that. But it's the same question
> that is being asked less and less with each passing year - what scales better in
> 2020, OSPF or IS-IS. That is becoming less relevant as control planes keep
> getting faster and cheaper.
>
> I don't think it ever was relevant.
>
True in 99% of cases; there are cases, however, where supporting 1M+ routes in the IGP is one of the viable options to consider, or running multiple 100k's of LSPs through a core node...
But these are core MPLS networks that have no boundaries because they literally wrap around the globe, and that have access to custom code.

adam
Re: Devil's Advocate - Segment Routing, Why?
Hi Saku,

To your IGP point, let me observe that OSPF runs over IP and ISIS does not.
That is the first fundamental difference. There are customers using both all
over the world, and therefore any suggestion to just use OSPFv3 is IMHO
quite unrealistic. Keep in mind that the OSPF hierarchy is 2 (or 3 with super
area) while in the IETF there is ongoing work to extend ISIS to 8 levels. There
are a lot of fundamental differences between those two (or three) IGPs and I
am sure many folks on the list know them. Lastly, there are a lot of
enterprise networks happily using IPv4 RFC1918 all over their global WAN
and DC infrastructure that have no reason to deploy IPv6 there any time
soon.

If you are serious about converging to a single IGP, I would rather
look towards an OpenR type of IGP architecture with a message bus underneath.

Thx,
R.

On Thu, Jun 18, 2020 at 7:26 AM Saku Ytti <saku@ytti.fi> wrote:

> On Thu, 18 Jun 2020 at 01:17, Mark Tinka <mark.tinka@seacom.mu> wrote:
>
> > IOS XR does not appear to support SR-OSPFv3.
> > IOS XE does not appear to support SR-ISISv6.
> > IOS XE does not appear to support SR-OSPFv3.
> > Junos does not appear to support SR-OSPFv3.
>
> The IGP mess we are in is horrible, but I can't blame SR for it. It's
> really unacceptable we spend NRE hours developing 3 identical IGP
> (OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
> IGP.
>
> In a sane world, we'd retire all of them except OSPFv3 and put all NRE
> focus on there or move some of the NRE dollars to some other problems
> we have, perhaps we would have room to support some different
> non-Dijkstra IGP.
>
> In a half sane world, IGP code, 90% of your code would be identical,
> then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
> translates internal struct to wire and wire to internal struct. So any
> features you code, come for free to all of them. But no one is doing
> this, it's 300% effort, and we all pay a premium for that.
>
> In a quarter sane world we'd have some CIC, common-igp-container RFC
> and then new features like SR would be specified as CIC-format,
> instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
> ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
> features do not need to write 4 drafts, one is enough.
>
> I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
> is supported on platforms I care about for IPV4+IPV6, so I'm already
> there.
>
> > MPLS/VPN service signaling in IPv6-only networks also has gaps in SR.
>
> I don't understand this.
>
>
> > So for networks that run OSPF and don't run Juniper, they'd need to
> > move to IS-IS in order to have SR forward IPv6 traffic in an MPLS
> > encapsulation. Seems like a bit of an ask. Yes, code needs to be
> > written, which is fine by me, as it also does for LDPv6.
>
> And it's really just adding TLV, if it already does IPv4 all the infra
> should be in place, only thing missing is transporting the
> information. Adding TLV to IGP is a lot less work than LDPv6.
>
> > I'd be curious to understand what bugs you've suffered with LDP in
> > the last 10 or so years, that likely still have open tickets.
>
> 3 within a year.
> - PR1436119
> - PR1428081
> - PR1416032
>
> I don't have IOS-XR LDP bugs within a year, but we had a bunch back
> when going from 4 to 5. And none of these are cosmetic, these are
> blackholing.
>
> I'm not saying LDP is bad, it's just, of course more code lines you
> exercise more bugs you see.
>
> But yes, LDP has a lot of bug surface compared to SR, but in _your
> network_ lot of that bug surface and complexity is amortised
> complexity. So status quo bias is strong to keep running LDP, it is
> simpler _NOW_ as a lot of the tax has been paid and moving to an
> objectively simpler solution carries risk, as its complexity is not
> amortised yet.
>
>
> > Yes, we all love less state, I won't argue that. But it's the same
> > question that is being asked less and less with each passing year -
> > what scales better in 2020, OSPF or IS-IS. That is becoming less
> > relevant as control planes keep getting faster and cheaper.
>
> I don't think it ever was relevant.
>
> > I'm not saying that if you are dealing with 100,000 T-LDP sessions
> > you should not consider SR, but if you're not, and SR still requires
> > a bit more development (never mind deployment experience), what's
> > wrong with having LDPv6? If it makes near-as-no-difference to your
> > control plane in 2020 or 2030 as to whether your 10,000-node network
> > is running LDP or SR, why not have the choice?
>
> I can't add anything to the upside of going from LDP to SR that I've
> not already said. You get more by spending less, it's win:win. Only
> reason to stay in LDP is status quo bias which makes short term sense.
>
> > Routers, in 2020, still ship with RIPv2. If anyone wants to use it
> > (as I am sure there are some that do), who are we to stand in their
> > way, if it makes sense for them?
>
> RIP might make sense in some deployments, because it's essentially
> stateless (routes age out, no real 'session') so if you have 100k VM
> per router that you need to support and you want dynamic routing, RIP
> might be the least resistance solution with the highest scale. Timing
> wheels should help it scale and maintain great number of timers.
>
> --
> ++ytti
>
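
For illustration, the "half sane world" layout quoted above (one shared
IGP core plus thin per-protocol adapters) could look roughly like the
sketch below. Everything in it is hypothetical: the PrefixSid struct, the
adapter class names and both toy encodings are invented for the example
and are not the real OSPFv2, OSPFv3 or IS-IS wire formats. The only point
is that a feature defined once on the internal struct becomes available
to every adapter.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class PrefixSid:
    """Protocol-neutral internal struct (hypothetical, for illustration)."""
    prefix: str      # e.g. "192.0.2.1/32"
    sid_index: int   # index into the label block


class WireAdapter(Protocol):
    """What each per-protocol adapter must provide."""
    def encode(self, item: PrefixSid) -> bytes: ...
    def decode(self, data: bytes) -> PrefixSid: ...


class ToyTlvAdapter:
    """Toy type-length-value framing; NOT a real IS-IS TLV layout."""
    TYPE = 1

    def encode(self, item: PrefixSid) -> bytes:
        value = struct.pack("!I", item.sid_index) + item.prefix.encode("ascii")
        return struct.pack("!BB", self.TYPE, len(value)) + value

    def decode(self, data: bytes) -> PrefixSid:
        _type, length = struct.unpack("!BB", data[:2])
        value = data[2:2 + length]
        (sid_index,) = struct.unpack("!I", value[:4])
        return PrefixSid(value[4:].decode("ascii"), sid_index)


class ToyTextAdapter:
    """Toy text framing; NOT a real OSPF LSA layout."""

    def encode(self, item: PrefixSid) -> bytes:
        return f"{item.prefix} {item.sid_index}".encode("ascii")

    def decode(self, data: bytes) -> PrefixSid:
        prefix, sid_index = data.decode("ascii").split(" ")
        return PrefixSid(prefix, int(sid_index))


# The shared core only ever sees PrefixSid; adapters own the wire format.
ADAPTERS: Dict[str, WireAdapter] = {
    "toy-tlv": ToyTlvAdapter(),
    "toy-text": ToyTextAdapter(),
}


def advertise(item: PrefixSid) -> Dict[str, bytes]:
    """Encode one internal feature for every configured adapter."""
    return {name: adapter.encode(item) for name, adapter in ADAPTERS.items()}
```

Adding a new feature would then mean extending the internal struct and
the adapters' translation code, rather than implementing the feature
three times.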
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 18/Jun/20 12:28, Robert Raszuk wrote:

> To your IGP point let me observe that OSPF runs over IP and ISIS does
> not. That is first fundamental difference. There are customers using
> both all over the world and therefore any suggestion to just use
> OSPFv3 is IMHO quite unrealistic.

Are you saying that OSPF houses that want IPv6 should just move to
IS-IS? Don't get me wrong, I support that very much as I think IS-IS is
a great IGP. That said, while it's good to convince OSPF operators to
consider IS-IS, it's not our place to force them to use it.

Also, OSPFv3-only for your dual-stack IGP needs is a supported
capability. Last time I tested it in Juniper in 2010/2011, it worked
well. I don't know if anyone is actually running IPv4 and IPv6 on OSPFv3
only, but it does work.


> Keep in mind that OSPF hierarchy is 2 (or 3 with super area) while in
> IETF there is ongoing work to extend ISIS to 8 levels. There is a lot
> of fundamental differences between those two (or three) IGPs and I am
> sure many folks on the lists know them.

15+ years ago, I'd have said that one protocol may have been better
suited to a specific task than another due to the control plane
limitations of the day.

In 2020, with the state-of-the-art of control planes today, it near as
makes no difference, IMHO.


> Last there is a lot of enterprise networks happily using IPv4 RFC1918
> all over their global WAN and DCs infrastructure and have no reason to
> deploy IPv6 there any time soon.

No wonder the vendors aren't seeing any LDPv6, SR-ISISv6 or SR-OSPFv3
demand :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Thu, 18 Jun 2020 at 13:28, Robert Raszuk <robert@raszuk.net> wrote:

> To your IGP point let me observe that OSPF runs over IP and ISIS does
> not. That is first fundamental difference. There are customers using
> both all over the world and therefore any suggestion to just use
> OSPFv3 is IMHO quite unrealistic. Keep in mind that OSPF hierarchy is
> 2 (or 3 with super area) while in IETF there is ongoing work to extend
> ISIS to 8 levels. There is a lot of fundamental differences between
> those two (or three) IGPs and I am sure many folks on the lists know
> them. Last there is a lot of enterprise networks happily using IPv4
> RFC1918 all over their global WAN and DCs infrastructure and have no
> reason to deploy IPv6 there any time soon.

I view 802.3 and CLNS as a liability, not an asset. People who
actually route CLNS are a dying breed; think just the DCN of a legacy
optical network.

Many platforms have no facilities to protect ISIS; any connected
attacker can kill the box. Nokia handles classification of generated
packets by assigning a DSCP value to the application and then mapping
DSCP to forwarding-class, which precludes configuring ISIS QoS. Very
few people understand how ISIS works before the ISIS PDU is handed to
them; the world from 802.3 up to that point is largely a huge pile of
hacks instead of a complete CLNS stack implementation. There is no
standard way to send large frames over 802.3, so there is a
non-standard way to encap ISIS for those links. Also, due to the lack
of LSP roll-over, ISIS is subject to a horrible attack vector which is
very difficult to troubleshoot and solve.

--
++ytti
RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> From: NANOG <nanog-bounces@nanog.org> On Behalf Of Mark Tinka
> Sent: Thursday, June 18, 2020 8:13 AM
>
> There are probably as many networks running SR-MPLS as there are running
> LDPv6, likely fewer if your SR deployment doesn't yet support OSPFv3 or SR-
> ISISv6. I concede that for some networks looking to go SR-MPLS, label
> distribution state reduction is probably higher up on the agenda than
> MPLSv6 forwarding. For me, I'd like the option to have both, and decide
> whether my network is in a position to handle the additional state required
> for LDPv6, if I feel that I'd prefer to deal with a protocol that has had more
> exposure to the sun.
>
You do have the LDP vs SR choice (in v4 anyway). Yes, there's not good 1:1 feature parity with v6, but the important point is that the current state is not the end state. This is a pretty dynamic industry that I'm sure is converging/evolving towards v4:v6 parity, whatever the pace may be, which is understandable considering the scope of ground to be covered. Yes, you're right in acknowledging that we're not living in a perfect world and that choices are limited, but it's been like that forever, yet we managed to thrive by analysing our options and striving for optimal strategies year by year.

adam
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 18/Jun/20 13:23, adamv0025@netconsultings.com wrote:

> You do have the LDP vs SR choice (in v4 anyway). Yes, there's not good 1:1 feature parity with v6, but the important point...

But the lack of IPv4/IPv6 parity is a crucial one.

There is only so long we can stretch IPv4, if one can still manage the
tangible and intangible costs of doing so. But that's for another
discussion.


> is that the current state is not the end state. This is a pretty dynamic industry that I'm sure is converging/evolving towards v4:v6 parity, whatever the pace may be, which is understandable considering the scope of ground to be covered.

Which I am fine with - if you give me a timeline to say LDPv6,
SR-OSPFv3 and SR-ISISv6 will be available on X date, I can manage my
operation and expectations accordingly.

But if you say, "No LDPv6, no SR-OSPFv3, no SR-ISISv6... only SRv6",
then that's an entirely different issue.

The good news is there currently is choice on the matter, but upending
hundreds or thousands of boxes to prove that point should really be a
last resort, as there are more pressing things we all have to deal with.


> Yes, you're right in acknowledging that we're not living in a perfect world and that choices are limited, but it's been like that forever, yet we managed to thrive by analysing our options and striving for optimal strategies year by year.

We can thank NAT44, CIDR, DHCP and PPPoE for that strategy over the
years :-).

IPv6 is the future, and at some point, we'll have to stop hiding from it.

Mark.
RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> From: Mark Tinka <mark.tinka@seacom.mu>
> Sent: Thursday, June 18, 2020 12:51 PM
>
> On 18/Jun/20 13:23, adamv0025@netconsultings.com wrote:
>
> > is that the current state is not the end state. This is a pretty
> > dynamic industry that I'm sure is converging/evolving towards v4:v6
> > parity, whatever the pace may be, which is understandable considering
> > the scope of ground to be covered.
>
> Which I am fine with - if you give me a time line to say LDPv6,
> SR-OSPFv3 and SR-ISISv6 will be available on X date, I can manage my
> operation and expectations accordingly.
>
> But if you say, "No LDPv6, no SR-OSPFv3, no SR-ISISv6... only SRv6", then
> that's an entirely different issue.
>
> The good news is there currently is choice on the matter, but upending
> hundreds or thousands of boxes to prove that point should really be a last
> resort, as there are more pressing things we all have to deal with.
>
Hence our current strategy is to stay on an IPv4 control plane (and IPv4 management plane), as it suits, and for the foreseeable future will suit, all our needs (which are to transport v4 & v6 data packets via L2 & L3 MPLS VPN services). There are simply more important projects than experimenting with a v6 control plane, for instance perfecting/securing the v6 customer-facing services (delivered over the underlying v4-signalled MPLS infrastructure, which no customer really cares about).

But I understand your frustrations, because it seems like you're taking the bullet for us late adopters, and in a sense you are: say in 10 years from now, when I decide to migrate to a v6 control plane and management plane because by then it might be viewed as common courtesy, it will all be there on a silver plate waiting for me, allowing for a relatively effortless and painless move. All thanks to you fighting the good fight today.

>
> > Yes, you're right in acknowledging that we're not living in a perfect
> > world and that choices are limited, but it's been like that forever,
> > yet we managed to thrive by analysing our options and striving for
> > optimal strategies year by year.
>
> We can thank NAT44, CIDR, DHCP and PPPoE for that strategy over the years
> :-).
>
> IPv6 is the future, and at some point, we'll have to stop hiding from it.
>
And I'd say the future is now, because there is an actual need for v6 services.
But a need for a v6 control & management plane? It's not like operators are losing business opportunities by not having that (they might even be viewed as conservative->stable, which might be preferred by some customers).

adam
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 18/Jun/20 14:30, adamv0025@netconsultings.com wrote:

> Hence our current strategy is to stay on an IPv4 control plane (and IPv4 management plane), as it suits, and for the foreseeable future will suit, all our needs (which are to transport v4 & v6 data packets via L2 & L3 MPLS VPN services). There are simply more important projects than experimenting with a v6 control plane, for instance perfecting/securing the v6 customer-facing services (delivered over the underlying v4-signalled MPLS infrastructure, which no customer really cares about).

Fair enough.


> But I understand your frustrations, because it seems like you're taking the bullet for us late adopters, and in a sense you are: say in 10 years from now, when I decide to migrate to a v6 control plane and management plane because by then it might be viewed as common courtesy, it will all be there on a silver plate waiting for me, allowing for a relatively effortless and painless move. All thanks to you fighting the good fight today.

You better hope and pray I don't run out of wine. Equipment manufacturers
make me drink, and I like my wine :-).


> And I'd say the future is now, because there is an actual need for v6 services.
> But a need for a v6 control & management plane? It's not like operators are losing business opportunities by not having that (they might even be viewed as conservative->stable, which might be preferred by some customers).

Well, the other way to look at it, especially if you are a Broadband or
mobile network operator, is what your plan is when you can no longer
stretch the IPv4 you have, can no longer obtain IPv4 from an RIR, and
can't afford to buy IPv4 on the open market.

For mobile operators, paying US$50 million/year in CGN line cards and
licensing is not even a rounding error on the books. But the telco space
has been under pressure for some time now, further amplified and
accelerated by this Coronavirus pandemic. Even though mobile networks are
ATM machines printing money for the shareholders, they probably made
more money in the days of SMS than they do now building and selling
4G/5G packet cores. At some point, that US$50 million/year is going to
start getting some ex-co and Board level visibility, as capex spend
begins to pinch revenue because of the data demands of subscribers, and
the ever-falling ARPU's to go along with it.

Perhaps, at that point, massive IPv6 deployment in the mobile space is
what will wake everyone else up, as the race to grab on to every $$
tightens.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Wed, Jun 17, 2020, at 20:40, Dave Bell wrote:
>
> I don't understand the point of SRv6. What equipment can support IPv6
> routing, but can't support MPLS label switching?

A whole ocean of "datacenter" hardware, from pretty much every vendor. Because many of them automatically link MPLS to RSVP and IPv4 L3VPN (which may still be an interesting feature in the datacenter), many try to stay as far away from it as possible. Other than some scared C-level guys imposing this, I don't really see a good reason (lack of market demand, which is sometimes invoked, doesn't stand).

> where needed. LDP-SR interworking is pretty simple.

Mapping servers or something else ?
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Fri, 19 Jun 2020 at 10:24, Radu-Adrian Feurdean
<nanog@radu-adrian.feurdean.net> wrote:


> > I don't understand the point of SRv6. What equipment can support IPv6
> > routing, but can't support MPLS label switching?
>
> A whole ocean of "datacenter" hardware, from pretty much every vendor. Because many of them automatically link MPLS to RSVP and IPv4 L3VPN (which may still be an interesting feature in the datacenter), many try to stay as far away from it as possible. Other than some scared C-level guys imposing this, I don't really see a good reason (lack of market demand, which is sometimes invoked, doesn't stand).

I'm sure such devices exist, I can't name any off the top of my head.
But this market perversion is caused by DC people who did not understand
networks and suffer from not-invented-here. Everyone needs a tunnel
solution, but DC people decided, before looking into or understanding
the topic, that MPLS is bad and complex, let's invent something new.
Then we re-invented solutions that already had _MORE_ efficient
solutions in MPLS, and a lot of those technologies are now becoming
established in DC space, creating confusion in SP space.

Maybe these inferior technologies will win, due to the marketing
strength of DC solutions. Or maybe DC will later figure out the
fundamental aspect in tunneling cost, and invent even-better-MPLS,
which is entirely possible now that we have a bit more understanding
how we use MPLS.

--
++ytti
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 09:20, Radu-Adrian Feurdean wrote:
>
> A whole ocean of "datacenter" hardware, from pretty much every vendor.

You mean the ones deliberately castrated so that we can create a
specific "DC vertical", even if they are, pretty much, the same box a
service provider will buy, just given a darker color so it can glow more
brightly in the data centre night?

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 09:50, Saku Ytti wrote:


> I'm sure such devices exist, I can't name any off the top of my head.
> But this market perversion is caused by DC people who did not understand
> networks and suffer from not-invented-here. Everyone needs a tunnel
> solution, but DC people decided, before looking into or understanding
> the topic, that MPLS is bad and complex, let's invent something new.
> Then we re-invented solutions that already had _MORE_ efficient
> solutions in MPLS, and a lot of those technologies are now becoming
> established in DC space, creating confusion in SP space.
>
> Maybe these inferior technologies will win, due to the marketing
> strength of DC solutions. Or maybe DC will later figure out the
> fundamental aspect in tunneling cost, and invent even-better-MPLS,
> which is entirely possible now that we have a bit more understanding
> how we use MPLS.

Let me work out how to print all this on a t-shirt.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Fri, Jun 19, 2020, at 10:11, Mark Tinka wrote:
>
> On 19/Jun/20 09:20, Radu-Adrian Feurdean wrote:
> >
> > A whole ocean of "datacenter" hardware, from pretty much every vendor.
>
> You mean the ones deliberately castrated so that we can create a
> specific "DC vertical", even if they are, pretty much, the same box a
> service provider will buy, just given a darker color so it can glow more
> brightly in the data centre night?

Yes, exactly that one.
Which also happens to spill outside the DC area, because the main "vertical" allows it to be sold at lower prices.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 10:57, Radu-Adrian Feurdean wrote:

>
> Yes, exactly that one.
> Which also happens to spill outside the DC area, because the main "vertical" allows it to be sold at lower prices.

These days, half the gig is filtering the snake oil.

Mark.
RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> Saku Ytti
> Sent: Friday, June 19, 2020 8:50 AM
>
> On Fri, 19 Jun 2020 at 10:24, Radu-Adrian Feurdean <nanog@radu-
> adrian.feurdean.net> wrote:
>
>
> > > I don't understand the point of SRv6. What equipment can support
> > > IPv6 routing, but can't support MPLS label switching?
> >
> Maybe these inferior technologies will win, due to the marketing strength of
> DC solutions. Or maybe DC will later figure out the fundamental aspect in
> tunneling cost, and invent even-better-MPLS, which is entirely possible now
> that we have a bit more understanding how we use MPLS.
>
Looking back at history (VXLAN or Google's Espresso "architecture"), I'm not holding my breath for anything reasonable coming out of the DC camp....

adam
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 16:45, Masataka Ohta wrote:

> The problem of MPLS, or label switching in general, is that, though
> it was advertised to be topology driven to scale better than flow
> driven, it is actually flow driven with poor scalability.
>
> Thus, it is impossible to deploy any technology scalably over MPLS.
>
> MPLS was considered to scale, because it supports nested labels
> corresponding to hierarchical, thus, scalable, routing table.
>
> However, to assign nested labels at the source, the source
> must know hierarchical routing table at the destination, even
> though the source only knows hierarchical routing table at
> the source itself.
>
> So, the routing table must be flat, which does not scale, or
> the source must detect flows to somehow request hierarchical
> destination routing table on demand, which means MPLS is flow
> driven.
>
> People, including some data center people, avoiding MPLS, know
> network scalability better than those deploying MPLS.
>
> It is true that some performance improvement is possible with
> label switching by flow driven ways, if flows are manually
> detected. But, it means extra label-switching-capable equipment
> and administrative effort to detect flows, neither of which
> scales and both of which cost a lot.
>
> It cost a lot less to have more plain IP routers than insisting
> on having a little fewer MPLS routers.

I wouldn't agree.

MPLS is a purely forwarding paradigm, as is hop-by-hop IP. Even with
hop-by-hop IP, you need the edge to be routing-aware.

I wasn't at the table when the MPLS spec. was being dreamed up, but I'd
find it very hard to accept that someone drafting the idea advertised it
as being a replacement or alternative for end-to-end IP routing and
forwarding.

Whether you run MPLS or not, you will always have routing table scaling
concerns. So I'm not quite sure how that is MPLS's problem. If you can
tell me how NOT running MPLS affords you a "hierarchical, scalable"
routing table, I'm all ears.

Whether you forward in IP or in MPLS, scaling routing is an ever clear &
present concern. Where MPLS can directly mitigate that particular
concern is in the core, where you can remove BGP. But you still need
routing in the edge, whether you forward in IP or MPLS.
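
A rough sketch of that split, assuming a toy topology (PE1, P1, P2, PE2,
PE3), documentation prefixes and made-up label values, none of which come
from a real network: only the ingress edge does a longest-prefix IP
lookup, while the core routers hold one label entry per egress PE and
never see the prefixes at all.

```python
import ipaddress

# Hypothetical edge (PE1) table: prefix -> (egress PE, label to push).
# This is the only place that needs the full routing table.
EDGE_ROUTES = {
    ipaddress.ip_network("203.0.113.0/24"): ("PE2", 100),
    ipaddress.ip_network("198.51.100.0/24"): ("PE3", 200),
}

# Hypothetical core (P) label tables: in-label -> (out-label, next hop).
# An out-label of None means "pop and hand over to the egress PE".
CORE_LFIB = {
    "P1": {100: (101, "P2"), 200: (None, "PE3")},
    "P2": {101: (None, "PE2")},
}


def ingress_lookup(dst: str):
    """Longest-prefix match at the edge; returns (egress PE, label)."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in EDGE_ROUTES if addr in net]
    if not matches:
        raise LookupError(f"no route to {dst}")
    best = max(matches, key=lambda net: net.prefixlen)
    return EDGE_ROUTES[best]


def forward(dst: str, first_hop: str = "P1"):
    """Return the list of nodes a packet to dst traverses from PE1."""
    _egress, label = ingress_lookup(dst)
    path, hop = ["PE1"], first_hop
    while label is not None:
        path.append(hop)
        # Core routers switch on the label only; no IP or BGP lookup.
        label, hop = CORE_LFIB[hop][label]
    path.append(hop)
    return path


print(forward("203.0.113.7"))
print(forward("198.51.100.9"))
```

The core tables here grow with the number of egress PEs, not with the
number of prefixes, which is the property that lets BGP be removed from
the core while the edge still has to carry the routes.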

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Hi Mark,

As someone who was actually at that table you are referring to, I must say
that MPLS was never proposed as a replacement for IP.

MPLS was, since day one, proposed as an enabler for services, originally
L3VPNs and RSVP-TE. Then a bunch of others jumped on the same encapsulation
train. If at that very time the GSR had been able to do proper GRE
encapsulation at line rate in all of its engines, MPLS for transport would
never have taken off. As a service demux, sure, but that is completely
separate.

But since at that time shipping hardware could not do the right
encapsulation, and since SPs were looking for more revenue and a new way to
move ATM and FR customers to IP backbones, L3VPN was proposed, which really
required hiding the service addresses from anyone's core. So some form of
encapsulation was a MUST. Hence tag switching, then MPLS switching, was
rolled out.

So I think Ohta-san's point is about the scalability of services, not flat
underlay RIB and FIB sizes. Many years ago we had requests to support 5M
L3VPN routes while the underlay was just 500K IPv4.

Last - when I originally discussed just plain MPLS with customers, with the
single application of hierarchical routing (no BGP in the core), frankly no
one was interested. Till L3VPN arrived, which was a game changer and a run
for new revenue streams ...

Best,
R.


On Fri, Jun 19, 2020 at 5:00 PM Mark Tinka <mark.tinka@seacom.mu> wrote:

>
>
> On 19/Jun/20 16:45, Masataka Ohta wrote:
>
> > The problem of MPLS, or label switching in general, is that, though
> > it was advertised to be topology driven to scale better than flow
> > driven, it is actually flow driven with poor scalability.
> >
> > Thus, it is impossible to deploy any technology scalably over MPLS.
> >
> > MPLS was considered to scale, because it supports nested labels
> > corresponding to hierarchical, thus, scalable, routing table.
> >
> > However, to assign nested labels at the source, the source
> > must know hierarchical routing table at the destination, even
> > though the source only knows hierarchical routing table at
> > the source itself.
> >
> > So, the routing table must be flat, which does not scale, or
> > the source must detect flows to somehow request hierarchical
> > destination routing table on demand, which means MPLS is flow
> > driven.
> >
> > People, including some data center people, avoiding MPLS, know
> > network scalability better than those deploying MPLS.
> >
> > It is true that some performance improvement is possible with
> > label switching by flow driven ways, if flows are manually
> > detected. But, it means extra label-switching-capable equipment
> > and administrative effort to detect flows, neither of which
> > scales and both of which cost a lot.
> >
> > It cost a lot less to have more plain IP routers than insisting
> > on having a little fewer MPLS routers.
>
> I wouldn't agree.
>
> MPLS is a purely forwarding paradigm, as is hop-by-hop IP. Even with
> hop-by-hop IP, you need the edge to be routing-aware.
>
> I wasn't at the table when the MPLS spec. was being dreamed up, but I'd
> find it very hard to accept that someone drafting the idea advertised it
> as being a replacement or alternative for end-to-end IP routing and
> forwarding.
>
> Whether you run MPLS or not, you will always have routing table scaling
> concerns. So I'm not quite sure how that is MPLS's problem. If you can
> tell me how NOT running MPLS affords you a "hierarchical, scalable"
> routing table, I'm all ears.
>
> Whether you forward in IP or in MPLS, scaling routing is an ever clear &
> present concern. Where MPLS can directly mitigate that particular
> concern is in the core, where you can remove BGP. But you still need
> routing in the edge, whether you forward in IP or MPLS.
>
> Mark.
>
>
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> MPLS was since day one proposed as enabler for services originally
> L3VPNs and RSVP-TE.

MPLS day one was mike o'dell wanting to move his city/city traffic
matrix from ATM to tag switching and open cascade's hold on tags.

randy
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Mark Tinka wrote:

> I wouldn't agree.
>
> MPLS is a purely forwarding paradigm, as is hop-by-hop IP.

As the first person to have proposed the forwarding paradigm of
label switching, I have been fully aware from the beginning that:

https://tools.ietf.org/html/draft-ohta-ip-over-atm-01

Conventional Communication over ATM in a Internetwork Layer

The conventional communication, that is communication that does not
assume connectivity, is no different from that of the existing IP, of
course.

special, prioritized forwarding should be done only by special
request by end users (by properly designed signaling mechanism, for
which RSVP failed to be) or administration does not scale.

> Even with
> hop-by-hop IP, you need the edge to be routing-aware.

The edge to be routing-aware around itself does scale.

The edge to be routing-aware at the destinations of all the flows
over it does not scale, which is the problem of MPLS.

Though the lack of equipment scalability was unnoticed by many,
thanks to Moore's law, unscalable administration costs a lot.

As a result, administration of MPLS has been costing a lot.

> I wasn't at the table when the MPLS spec. was being dreamed up,

I was there before poor MPLS was dreamed up.

> If you can
> tell me how NOT running MPLS affords you a "hierarchical, scalable"
> routing table, I'm all ears.

Are you saying inter-domain routing table is not "hierarchical,
scalable" except for the reason of multihoming?

As for multihoming problem, see, for example:

https://tools.ietf.org/html/draft-ohta-e2e-multihoming-03

> Whether you forward in IP or in MPLS, scaling routing is an ever clear &
> present concern.

Not. Even without MPLS, fine tuning of BGP does not scale.

However, just as using plain IP router costs less than using
MPLS capable IP routers, BGP-only administration costs less than
BGP and MPLS administration.

For better networking infrastructure, extra cost should be spent
for L1, not MPLS or very complicated technologies around it.

Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Robert Raszuk wrote:

> MPLS was since day one proposed as enabler for services originally L3VPNs
> and RSVP-TE.
There seems to be serious confusion between label switching
with explicit flows and MPLS, which was believed to scale
without detecting/configuring flows.

At the time I proposed label switching, there already was RSVP
but RSVP-TE was proposed long after MPLS was proposed.

But, today, people seem to be using so-called MPLS with
explicitly configured flows, administration of which does not
scale and is annoying.

Remember that the original point of MPLS was that it should work
scalably without a lot of configuration, which is not the reality
recognized by people on this thread.

> So I think Ohta-san's point is about scalability services not flat underlay
> RIB and FIB sizes. Many years ago we had requests to support 5M L3VPN
> routes while underlay was just 500K IPv4.

That is certainly a problem. However, a worse problem is knowing the
label values nested deeply in the MPLS label chain.

Even worse, if the router near the destination that is expected to pop
the label chain goes down, how can the source know that the router has
gone down and choose an alternative router near the destination?

> Last - when I originally discussed just plain MPLS with customers with
> single application of hierarchical routing (no BGP in the core) frankly no
> one was interested.

MPLS with hierarchical routing just does not scale.

Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 17:13, Robert Raszuk wrote:

>
> So I think Ohta-san's point is about scalability services not flat
> underlay RIB and FIB sizes. Many years ago we had requests to support
> 5M L3VPN routes while underlay was just 500K IPv4.

Ah, if the context, then, was l3vpn scaling, yes, that is a known issue.

Apart from the global table vs. VRF parity concerns I've always had (one
of which was illustrated earlier this week, on this list, with RPKI in a
VRF), the other reason I don't do Internet in a VRF is because it was
always a trade-off:

    - More routes per VRF = fewer VRF's.
    - More VRF's  = fewer routes per VRF.

Going forward, I believe the l3vpn pressures (for pure VPN services, not
Internet in a VRF) should begin to subside as businesses move on-prem
workloads to the cloud, bite into the SD-WAN train, and generally, do
more stuff over the public Internet than via inter-branch WAN links
formerly driven by l3vpn.

Time will tell, but in Africa, bar South Africa, l3vpn's were never a
big thing, mostly because Internet connectivity was best served from one
or two major cities, where most businesses had a branch that warranted
connectivity.

But even in South Africa (as the rest of our African market), 98% of our
business is plain IP. The other 2% is mostly l2vpn. l3vpn's don't really
feature, except for some in-house enterprise VoIP carriage + some
high-speed in-band management.

Even with the older South African operators that made a killing off
l3vpn's, these are falling away as their customers either move to the
cloud and/or accept SD-WAN thingies.


>
> Last - when I originally discussed just plain MPLS with customers with
> single application of hierarchical routing (no BGP in the core)
> frankly no one was interested. Till L3VPN arrived which was game
> changer and run for new revenue streams ...

The BGP-free core has always sounded like a dark art. More so in the
days when hardware was precious, core routers doubled as inline route
reflectors and the size of the IPv4 DFZ wasn't rapidly exploding like it
is today, and no one was even talking about the IPv6 DFZ.

Might be useful speaking with them again, in 2020 :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> But, today, people seem to be using so-called MPLS with
> explicitly configured flows, administration of which does not
> scale and is annoying.

I am actually not sure what you are talking about here.

The only per flow action in any MPLS deployments I have seen was mapping
flow groups to specific TE-LSPs. In all other TDP or LDP cases flow == IP
destination so it is exact based on the destination reachability. And such
mapping is based on the LDP FEC to IGP (or BGP) match.
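
As a rough illustration of that FEC-to-IGP match (a toy sketch, not any
vendor's implementation; the router names, prefixes and label values are
all made up), an LSR can be modelled as picking, for each FEC, the label
advertised by whichever neighbor the IGP currently uses as next hop:

```python
# Toy model of how an LSR could derive label forwarding entries under LDP.
# Router names, prefixes and label values are hypothetical.

# IGP view: FEC (prefix) -> next-hop neighbor chosen by shortest path.
igp_next_hop = {
    "192.0.2.1/32": "P1",
    "192.0.2.2/32": "P2",
}

# LDP view: label bindings received from each neighbor, per FEC.
ldp_bindings = {
    "P1": {"192.0.2.1/32": 24001, "192.0.2.2/32": 24002},
    "P2": {"192.0.2.1/32": 31001, "192.0.2.2/32": 31002},
}


def build_lfib(igp_next_hop, ldp_bindings):
    """Pick, for each FEC, the label advertised by the IGP next hop."""
    lfib = {}
    for fec, neighbor in igp_next_hop.items():
        label = ldp_bindings.get(neighbor, {}).get(fec)
        if label is not None:
            lfib[fec] = (neighbor, label)
    return lfib


lfib = build_lfib(igp_next_hop, ldp_bindings)
for fec, (neighbor, label) in sorted(lfib.items()):
    print(f"{fec}: outgoing label {label} via {neighbor}")

# An IGP change only re-selects among bindings already held per destination.
igp_next_hop["192.0.2.1/32"] = "P2"
relearned = build_lfib(igp_next_hop, ldp_bindings)
print(relearned["192.0.2.1/32"])
```

The only state in this model is keyed by destination prefix; nothing is
keyed by flow.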

> Even worse, if the router near the destination expected to pop the label
> chain goes down, how can the source know that the router went down
> and choose an alternative router near the destination?

In normal MPLS the src does not pick the transit paths. Transit is 100%
driven by the IGP, and if you lose a node, local connectivity restoration
techniques (FRR or IGP convergence) apply. If the egress signalled
implicit NULL, it would signal it to any IGP peer.

That is also possible with SR-MPLS. No change ... no per-flow state at
all beyond per-IP-destination routing. If you want to control your
transit hops you can - but this is an option, not a requirement.

> MPLS with hierarchical routing just does not scale.


While I am not defending MPLS here and 100% agree that IP as transit is a
much better option today and tomorrow I also would like to make sure we
communicate true points. So when you say it does not scale - it could be
good to list what exactly does not scale by providing a real network
operational example.

Many thx,
R.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> On Jun 19, 2020, at 11:34 AM, Randy Bush <randy@psg.com> wrote:
>
> ?
>>
>> MPLS was since day one proposed as enabler for services originally
>> L3VPNs and RSVP-TE.
>
> MPLS day one was mike o'dell wanting to move his city/city traffic
> matrix from ATM to tag switching and open cascade's hold on tags.

And IIRC, Tag switching day one was Cisco overreacting to Ipsilon.

-dorian
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Wed, Jun 17, 2020 at 11:40 AM Dave Bell <me@geordish.org> wrote:

>
>
> On Wed, 17 Jun 2020 at 18:42, Saku Ytti <saku@ytti.fi> wrote:
>
>> Hey,
>>
>> > Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?
>>
>> I don't like this, SR-MPLS and SRv6 are just utterly different things
>> to me, and no answer meaningfully applies to both.
>>
>
> I don't understand the point of SRv6. What equipment can support IPv6
> routing, but can't support MPLS label switching?
>
> I'm a big fan of SR-MPLS however.
>
One of the advantages cited for SRv6 over MPLS is that the packet contains
a record of where it has been.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> One of the advantages cited for SRv6 over MPLS is that the packet contains
> a record of where it has been.
>
Not really ... packets are not tourists in a bus.

First, there are real studies proving that most large production networks,
for the goal of good TE, only need to place 1, 2 or 3 hops to traverse
through. The rest is the shortest path between those hops.

Then, even if you place those node SIDs, you have no control over which
interfaces are chosen as outbound. There is often more than one IGP ECMP
path in between. You would need to insert adj. SIDs, which requires a
pretty fine level of controller capability to start with.
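
To make the node SID versus adj. SID point concrete, here is a small
sketch under simplifying assumptions: a made-up five-node topology, unit
link costs, and hop count standing in for the IGP metric. It enumerates
the equal-cost paths that remain open when steering only by a node SID,
and shows how pinning one adjacency narrows them down.

```python
from collections import deque

# Hypothetical topology with unit link costs: two equal-cost paths A->D.
LINKS = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def ecmp_paths(adj, src, dst):
    """Return every shortest (hop-count) path from src to dst."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    if dst not in dist:
        return []

    def walk(node):
        # Walk backwards along links that sit on some shortest path.
        if node == src:
            return [[src]]
        paths = []
        for prev in adj[node]:
            if dist.get(prev) == dist[node] - 1:
                paths.extend(p + [node] for p in walk(prev))
        return paths

    return sorted(walk(dst))


adj = adjacency(LINKS)

# Steering with only the node SID of D: the IGP may use either path.
node_sid_paths = ecmp_paths(adj, "A", "D")
print(len(node_sid_paths), node_sid_paths)

# Pinning the first hop with an adjacency SID (A->C) leaves a single path.
pinned = [["A"] + p for p in ecmp_paths(adj, "C", "D")]
print(len(pinned), pinned)
```

With the node SID alone, the sender cannot tell which of the two paths a
given packet took; that choice is left to each router's ECMP hashing.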

I just hope that no one sane proposes that all packets should now get
encapsulated in a new IPv6 header when entering a transit ISP network and
carry a long list of the hop-by-hop adjacencies they are to travel through.
Besides, even if they did, the list would be valid only within a given ASN
and would have no visibility outside it.

Thx,
R.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 17:40, Masataka Ohta wrote:
 
>
> As the first person to have proposed the forwarding paradigm of
> label switching, I have been fully aware from the beginning that:
>
>    https://tools.ietf.org/html/draft-ohta-ip-over-atm-01
>
>    Conventional Communication over ATM in a Internetwork Layer
>
>    The conventional communication, that is communication that does not
>    assume connectivity, is no different from that of the existing IP, of
>    course.
>
> special, prioritized forwarding should be done only by special
> request by end users (by properly designed signaling mechanism, for
> which RSVP failed to be) or administration does not scale.

I could be wrong, but I get the feeling that you are speaking about RSVP
in its original form, where hosts were meant to make calls (CAC) into
the network to reserve resources on their behalf.

As we all know, that never took off, even though I saw some ideas about
it being proposed for mobile phones as well.

I don't think there ever was another attempt to get hosts to reserve
resources within the network, since the RSVP failure.



>
> Not. Even without MPLS, fine tuning of BGP does not scale.

We all know this, and like I said, that is a current concern.


>
> However, just as using plain IP router costs less than using
> MPLS capable IP routers, BGP-only administration costs less than
> BGP and MPLS administration.
>
> For better networking infrastructure, extra cost should be spent
> for L1, not MPLS or very complicated technologies around it.

In the early 2000's, I would have agreed with that.

Nowadays, there is a very good chance that a box you require a BGP DFZ
on inherently supports MPLS, likely without extra licensing.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 19/Jun/20 18:00, Masataka Ohta wrote:
 
> There seems to be serious confusion between label switching
> with explicit flows and MPLS, which was believed to scale
> without detecting/configuring flows.
>
> At the time I proposed label switching, there already was RSVP
> but RSVP-TE was proposed long after MPLS was proposed.

RSVP failed to take off, for whatever reason (I can think of many).

I'm not sure any network operator, today, would allow an end-host to
make reservation requests in their core.

Even in the Transport world, this was the whole point of GMPLS. After
they saw how terrible that idea was, it shifted from customers to being
an internal fight between the IP teams and the Transport teams.
Ultimately, I don't think anybody really cared about routers
automatically using GMPLS to reserve and direct the DWDM network.

In our Transport network, we use GMPLS/ASON in the Transport network
only. When the IP team needs capacity, it's a telephone job :-).


>
> But, today, people seem to be using so-called MPLS with
> explicitly configured flows, administration of which does not
> scale and is annoying.
>
> Remember that the original point of MPLS was that it should work
> scalably without a lot of configuration, which is not the reality
> recognized by people on this thread.

Well, you get the choice of LDP (low-touch) or RSVP-TE (high-touch).

Pick your poison.

We don't use RSVP-TE because of the issues you describe above.

We use LDP to avoid the issues you describe above.

In the end, SR-MPLS is meant to solve this issue for TE requirements. So
the signaling state-of-the-art improves with time.


> That is certainly a problem. However, a worse problem is knowing the
> label values nested deeply in the MPLS label chain.

Why, how, is that a problem? For load balancing?


>
> Even worse, if the router near the destination expected to pop the label
> chain goes down, how can the source know that the router went down
> and choose an alternative router near the destination?

If by source you mean end-host, if the edge router they are connected to
only ran IP and they were single-homed, they'd still go down.

If the end-host were multi-homed to two edge routers, one of them
failing won't cause an outage for the host.

Unless I misunderstand.


> MPLS with hierarchical routing just does not scale.

With Internet in a VRF, I truly agree.

But if you run a simple global BGP table and no VRF's, I don't see an
issue. This is what we do, and our scaling concerns are exactly the same
whether we run plain IP or IP/MPLS.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Sat, Jun 20, 2020 at 11:08 AM Mark Tinka <mark.tinka@seacom.mu> wrote:

> > MPLS with hierarchical routing just does not scale.
>
> With Internet in a VRF, I truly agree.
>
> But if you run a simple global BGP table and no VRF's, I don't see an
> issue. This is what we do, and our scaling concerns are exactly the same
> whether we run plain IP or IP/MPLS.
>
> Mark.
>
>
We run the Internet in a VRF to get watertight separation between
management and the Internet. I do also have a CGN vrf but that one has very
few routes in it (99% being subscriber management created, eg. one route
per customer). Why would this create a scaling issue? If you collapse our
three routing tables into one, you would have exactly the same number of
routes. All we did was separate the routes into namespaces, to establish a
firewall that prevents traffic to flow where it shouldn't.

Regards,

Baldur
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 20/Jun/20 11:27, Baldur Norddahl wrote:

>
>
> We run the Internet in a VRF to get watertight separation between
> management and the Internet. I do also have a CGN vrf but that one has
> very few routes in it (99% being subscriber management created, eg.
> one route per customer). Why would this create a scaling issue? If you
> collapse our three routing tables into one, you would have exactly the
> same number of routes. All we did was separate the routes into
> namespaces, to establish a firewall that prevents traffic to flow
> where it shouldn't.

It may be less of an issue in 2020 with the current control planes and
how far the code has come, but in the early days of l3vpn's, the number
of VRF's you could have was directly proportional to the number of
routes you had in each one. More VRF's, less routes for each. More
routes per VRF, less VRF's in total.

I don't know if that's still an issue today, as we don't run the
Internet in a VRF. I'd defer to those with that experience, who knew
about the scaling limitations of the past.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Mark Tinka wrote:

>> At the time I proposed label switching, there already was RSVP
>> but RSVP-TE was proposed long after MPLS was proposed.
>
> RSVP failed to take off, for whatever reason (I can think of many).

There are many. So, our research group tried to improve RSVP.

Practically, the most serious problem of RSVP is, like OSPF, using
unreliable link multicast to reliably exchange signalling messages
between routers, making specification and implementations very
complicated.

So, we developed SRSVP (Simple RSVP) replacing link multicast by,
like BGP, link local TCP mesh (thanks to the CATENET model, unlike
BGP, there is no scalability concern). Then, it was not so difficult
to remove other problems.

However, perhaps, most people think show stopper to RSVP is lack
of scalability of weighted fair queueing, though, it is not a
problem specific to RSVP and MPLS shares the same problem.

Obviously, weighted fair queueing does not scale because it is
based on deterministic traffic model of token bucket model
and, these days, people just use some ad-hoc ways for BW
guarantee implicitly assuming stochastic traffic model. I
even developed a little formal theory on scalable queueing
with stochastic traffic model.

So, we have specification and working implementation of
hop-by-hop, scalable, stable unicast/multicast interdomain
QoS routing protocol supporting routing hierarchy without
crankback.

See

http://www.isoc.org/inet2000/cdproceedings/1c/1c_1.htm

for rough description of design guideline.

> I'm not sure any network operator, today, would allow an end-host to
> make reservation requests in their core.

I didn't attempt to standardize our result in IETF, partly
because optical packet switching was a lot more interesting.

> Even in the Transport world, this was the whole point of GMPLS. After
> they saw how terrible that idea was, it shifted from customers to being
> an internal fight between the IP teams and the Transport teams.
> Ultimately, I don't think anybody really cared about routers
> automatically using GMPLS to reserve and direct the DWDM network.

That should be a reasonable way of practical operation, though I'm
not very interested in OCS (optical circuit switching) of GMPLS

> In our Transport network, we use GMPLS/ASON in the Transport network
> only. When the IP team needs capacity, it's a telephone job :-).

For IP layer, that should be enough. For ASON, so complicated
GMPLS is actually overkill.

When I was playing with ATM switches, I established a control
plane network with VPI/VCI=0/0 and assigned control plane IP
addresses to the ATM switches. To control other VCs, simple UDP
packets were sent to the switches from controlling hosts.

Similar technology should be applicable to ASON. Maintaining
integrity between wavelength switches is responsibility
of controllers.

>> Remember that the original point of MPLS was that it should work
>> scalably without a lot of configuration, which is not the reality
>> recognized by people on this thread.
>
> Well, you get the choice of LDP (low-touch) or RSVP-TE (high-touch).

No, I just explained what was advertised to be MPLS by people
around Cisco against Ipsilon.

According to the advertisements, you should call what you
are using LS or GLS, not MPLS or GMPLS.

> We don't use RSVP-TE because of the issues you describe above.
>
> We use LDP to avoid the issues you describe above.

Good.

> In the end, SR-MPLS is meant to solve this issue for TE requirements. So
> the signaling state-of-the-art improves with time.
Assuming a central controller (and its collocated or distributed
back up controllers), we don't need complicated protocols in
the network to maintain integrity of the entire network.

>> That is certainly a problem. However, a worse problem is knowing the
>> label values nested deeply in the MPLS label chain.
>
> Why, how, is that a problem? For load balancing?
What if, an inner label becomes invalidated around the
destination, which is hidden, for route scalability,
from the equipments around the source?

>> Even worse, if the router near the destination expected to pop the label
>> chain goes down, how can the source know that the router went down
>> and choose an alternative router near the destination?
>
> If by source you mean end-host, if the edge router they are connected to
> only ran IP and they were single-homed, they'd still go down.

No, as "the destination expected to pop the label" is located somewhere
around the final destination end-host.

If, at the destination site, connectivity between the router that pops the
nested label and the final destination end-host is lost, we are at a loss,
unless the source side changes the inner label.

>> MPLS with hierarchical routing just does not scale.
>
> With Internet in a VRF, I truly agree.
>
> But if you run a simple global BGP table and no VRF's, I don't see an
> issue. This is what we do, and our scaling concerns are exactly the same
> whether we run plain IP or IP/MPLS.

If you are using intra-domain hierarchical routing for
scalability within the domain, you still suffer from
lack of scalability of MPLS.

And, VRF is, in a sense, a form of intra-domain hierarchical
routing with a lot of flexibility, which means a lot of
unnecessary complications.

Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 20/Jun/20 14:41, Masataka Ohta wrote:

>  
> There are many. So, our research group tried to improve RSVP.

I'm a lot younger than the Internet, but I read a fair bit about its
history. I can't remember ever coming across an implementation of RSVP
between a host and the network in a commercial setting. If I missed it,
kindly share, as I'd be keen to see how that went.


>
> Practically, the most serious problem of RSVP is, like OSPF, using
> unreliable link multicast to reliably exchange signalling messages
> between routers, making specification and implementations very
> complicated.
>
> So, we developed SRSVP (Simple RSVP) replacing link multicast by,
> like BGP, link local TCP mesh (thanks to the CATENET model, unlike
> BGP, there is no scalability concern). Then, it was not so difficult
> to remove other problems.

Was "S-RSVP" ever implemented, and deployed?


>
> However, perhaps, most people think show stopper to RSVP is lack
> of scalability of weighted fair queueing, though, it is not a
> problem specific to RSVP and MPLS shares the same problem.

QoS has nothing to do with MPLS. You can do QoS with or without MPLS.

I should probably point out, also, that RSVP (or RSVP-TE) is not MPLS.
They collaborate, yes, but we'd be doing the community a disservice by
interchanging them for one another.


>
> Obviously, weighted fair queueing does not scale because it is
> based on deterministic traffic model of token bucket model
> and, these days, people just use some ad-hoc ways for BW
> guarantee implicitly assuming stochastic traffic model. I
> even developed a little formal theory on scalable queueing
> with stochastic traffic model.

Maybe so, but I still don't see the relation to MPLS.

All MPLS can do is convey IPP or DSCP values as an EXP code point in the
core. I'm not sure how that creates a scaling problem within MPLS itself.

If you didn't have MPLS, you'd be encoding those values in IPP or DSCP.
So what's the issue?
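
For reference, a minimal sketch of how those three bits sit in a label
stack entry. The field layout (20-bit label, 3-bit EXP/TC, 1-bit
bottom-of-stack, 8-bit TTL) follows RFC 3032. The DSCP-to-EXP mapping
below simply copies the top three DSCP bits, which is an assumption for
illustration; real platforms let you configure the mapping. The label
value is made up.

```python
import struct


def dscp_to_exp(dscp):
    """Assumed mapping: keep the three high-order DSCP bits (the IPP bits)."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP is a 6-bit value")
    return dscp >> 3


def encode_lse(label, exp, bottom_of_stack, ttl):
    """Pack one 32-bit MPLS label stack entry in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (int(bool(bottom_of_stack)) << 8) | ttl
    return struct.pack("!I", word)


# Hypothetical label 24001 carrying an EF-marked (DSCP 46) packet, TTL 64.
entry = encode_lse(24001, dscp_to_exp(46), True, 64)
print(entry.hex())
```

The marking costs three bits per label stack entry regardless of how many
labels or routes the network carries.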


>
> So, we have specification and working implementation of
> hop-by-hop, scalable, stable unicast/multicast interdomain
> QoS routing protocol supporting routing hierarchy without
> crankback.
>
> See
>
>     http://www.isoc.org/inet2000/cdproceedings/1c/1c_1.htm
>
> for rough description of design guideline.
>

If I understand this correctly, would this be the IntServ QoS model?

 
>
> I didn't attempt to standardize our result in IETF, partly
> because optical packet switching was a lot more interesting.

Still is, even today :-)?


> That should be a reasonable way of practical operation, though I'm
> not very interested in OCS (optical circuit switching) of GMPLS

Design goals are often what they are, and then the real world hits you.



> For IP layer, that should be enough. For ASON, so complicated
> GMPLS is actually overkill.
>
> When I was playing with ATM switches, I established a control
> plane network with VPI/VCI=0/0 and assigned control plane IP
> addresses to the ATM switches. To control other VCs, simple UDP
> packets were sent to the switches from controlling hosts.
>
> Similar technology should be applicable to ASON. Maintaining
> integrity between wavelength switches is responsibility
> of controllers.

Well, GMPLS and ASON is basically skinny OSPF, IS-IS and RSVP running in
a DWDM node's control plane.


>
> No, I just explained what was advertised to be MPLS by people
> around Cisco against Ipsilon.
>
> According to the advertisements, you should call what you
> are using LS or GLS, not MPLS or GMPLS.

It takes a while for new technology to be fully understood, which is why
I'm not rushing on to the SR bandwagon :-).

I can't blame the sales droids or the customers of the day. It probably
sounded like dark magic.


> Assuming a central controller (and its collocated or distributed
> back up controllers), we don't need complicated protocols in
> the network to maintain integrity of the entire network.

Well, that's a point of view, I suppose.

I still can't walk into a shop and "buy a controller". I don't know what
this controller thing is, 10 years on.

IGP's, BGP and label distribution protocols have proven themselves, in
the interim.


> What if, an inner label becomes invalidated around the
> destination, which is hidden, for route scalability,
> from the equipments around the source?

I can't say I've ever come across that scenario running MPLS since 2004.

Do you have an example from a production network that you can share with
us? I'd really like to understand this better.


> No, as "the destination expected to pop the label" is located somewhere
> around the final destination end-host.
>
> If, at the destination site, connectivity between the router that pops the
> nested label and the final destination end-host is lost, we are at a loss,
> unless the source side changes the inner label.

Maybe a diagram would help, as I still don't get this failure scenario.

If a host lost connectivity with the service provider network, getting
label switching to work is pretty low on the priority list.

Again, unless I misunderstand.


>
> If you are using intra-domain hierarchical routing for
> scalability within the domain, you still suffer from
> lack of scalability of MPLS.
>
> And, VRF is, in a sense, a form of intra-domain hierarchical
> routing with a lot of flexibility, which means a lot of
> unnecessary complications.

I don't think stuffing your VRF's full of routes is an intrinsic problem
of MPLS.

MPLS works whether you run l3vpn's or not. That MPLS provides a
forwarding paradigm for VRF's does not put it and the potential poor
scalability VRF's in the same WhatsApp group.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Sat, Jun 20, 2020 at 12:38 PM Mark Tinka <mark.tinka@seacom.mu> wrote:

>
>
> On 20/Jun/20 11:27, Baldur Norddahl wrote:
>
>
>>
> We run the Internet in a VRF to get watertight separation between
> management and the Internet. I do also have a CGN vrf but that one has very
> few routes in it (99% being subscriber management created, eg. one route
> per customer). Why would this create a scaling issue? If you collapse our
> three routing tables into one, you would have exactly the same number of
> routes. All we did was separate the routes into namespaces, to establish a
> firewall that prevents traffic to flow where it shouldn't.
>
>
> It may be less of an issue in 2020 with the current control planes and how
> far the code has come, but in the early days of l3vpn's, the number of
> VRF's you could have was directly proportional to the number of routes you
> had in each one. More VRF's, less routes for each. More routes per VRF,
> less VRF's in total.
>
> I don't know if that's still an issue today, as we don't run the Internet
> in a VRF. I'd defer to those with that experience, who knew about the
> scaling limitations of the past.
>
>
I can't speak for the year 2000 as I was not doing networking at this level
at that time. But when I check the specs for the base mx204 it says
something like 32 VRFs, 2 million routes in FIB and 6 million routes in
RIB. Clearly those numbers are the total of routes across all VRFs
otherwise you arrive at silly numbers (64 million FIB if you multiply, about 62k
FIB if you divide by 32). My conclusion is that scale wise you are ok as
long as you do not try to have more than one VRF with a complete copy of the
DFZ.
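
The arithmetic behind that reading, as a quick sketch. The 32 VRF and
2 million FIB figures are the approximate ones quoted above; the
per-table route counts are invented for illustration and are not taken
from any datasheet or from our network.

```python
FIB_LIMIT = 2_000_000   # approximate figure quoted above
VRF_LIMIT = 32          # approximate figure quoted above

# The two "per VRF" readings of the datasheet numbers.
if_multiplied = FIB_LIMIT * VRF_LIMIT     # 2M routes in each of 32 VRFs
if_divided = FIB_LIMIT // VRF_LIMIT       # 2M routes split evenly over 32
print(if_multiplied, if_divided)


def fits(tables, fib_limit=FIB_LIMIT, vrf_limit=VRF_LIMIT):
    """Shared-budget reading: all VRFs draw on one FIB total."""
    return len(tables) <= vrf_limit and sum(tables.values()) <= fib_limit


# Hypothetical route counts for three tables.
tables = {"internet": 900_000, "mgmt": 2_000, "cgn": 50_000}
print(sum(tables.values()), fits(tables))
```

Under the shared-budget reading, splitting routes into VRFs costs nothing
extra in FIB; only the total matters.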

More worrying is that 2 million routes will soon not be enough to install
all routes with a backup route, invalidating BGP FRR.

Regards,

Baldur
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 20/Jun/20 00:41, Anoop Ghanwani wrote:

> One of the advantages cited for SRv6 over MPLS is that the packet
> contains a record of where it has been.

I can't see how advantageous that is, or how possible it would be to
implement, especially for inter-domain traffic.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
----- On Jun 20, 2020, at 2:27 PM, Mark Tinka <mark.tinka@seacom.mu> wrote:

Hi Mark,

> On 20/Jun/20 00:41, Anoop Ghanwani wrote:

>> One of the advantages cited for SRv6 over MPLS is that the packet contains a
>> record of where it has been.

> I can't see how advantageous that is,

That will be very advantageous in a datacenter environment, or any other
environment dealing with a lot of ECMP paths.

I can't tell you how often during my eBay time I've been troubleshooting
end-to-end packetloss between hosts in two datacenters where there were at least
10 or more layers of up to 16 way ECMP between them. Having a record of which
path is being taken by a packet is very helpful to determine the one with a crappy
transceiver.
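
To give a feel for the numbers, and for why a per-packet path record
helps, here is a rough sketch. The layer and fan-out figures are the ones
above, taken as an upper bound; the link names and per-packet records are
invented for illustration.

```python
from collections import Counter

layers, fanout = 10, 16
max_paths = fanout ** layers   # upper bound if every layer is 16-way
print(max_paths)

# Hypothetical per-packet records: (links traversed, delivered?).
records = [
    (("l1a", "l2a"), True),
    (("l1a", "l2b"), False),
    (("l1b", "l2a"), True),
    (("l1b", "l2b"), False),
    (("l1a", "l2a"), True),
]


def loss_by_link(records):
    """Fraction of packets lost among those that crossed each link."""
    sent, lost = Counter(), Counter()
    for path, delivered in records:
        for link in path:
            sent[link] += 1
            if not delivered:
                lost[link] += 1
    return {link: lost[link] / sent[link] for link in sent}


ratios = loss_by_link(records)
suspect = max(ratios, key=ratios.get)
print(suspect, ratios[suspect])
```

Without the recorded path, the same loss would have to be hunted down by
probing combinations of those paths one by one.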

> or how possible it would be to implement,

That work is already underway, albeit not specifically for MPLS. For example,
I've worked with an experimental version of In-Band Network Telemetry (INT)
as described in this draft: https://tools.ietf.org/html/draft-kumar-ippm-ifa-02

I even demonstrated a very basic implementation during SuperCompute 19 in Denver
last year. Most people who were interested in the demo were academics however,
probably because it wasn't a real networking event.

Note that there are several caveats that come with this draft and previous
versions, and that it is still very much work in progress. But the potential is
huge, at least in the DC.

> especially for inter-domain traffic.

That's a different story, but not entirely impossible. A probe packet can
be sent across AS borders, and as long as the two NOCs are cooperating, the
entire path can be reconstructed.

Thanks,

Sabri
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> On Jun 20, 2020, at 2:27 PM, Mark Tinka <mark.tinka@seacom.mu> wrote:
>
>
>
> On 20/Jun/20 00:41, Anoop Ghanwani wrote:
>
>> One of the advantages cited for SRv6 over MPLS is that the packet contains a record of where it has been.
>
> I can't see how advantageous that is, or how possible it would be to implement, especially for inter-domain traffic.
>
> Mark.
>

Since the packet is essentially source-routed, and the labels aren’t popped off the way they are in MPLS, but preserved in the hop by hop headers (AIUI), the implementation isn’t particularly difficult.
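
A toy sketch of that difference, based on my reading of the Segment Routing Header (RFC 8754): the segment list is stored with the last segment at index 0 and a Segments Left index counts down, while an MPLS stack shrinks as labels are popped. The addresses and labels are made up, and options such as removing the header before the final segment are ignored. Note that the list holds the segments the source chose, not every router the packet actually crossed.

```python
def srv6_trace(path_segments):
    """Return (destination, segments_left, segment_list) at each segment."""
    if not path_segments:
        return []
    segment_list = list(reversed(path_segments))  # entry 0 is the last segment
    segments_left = len(segment_list) - 1
    trace = []
    while True:
        active = segment_list[segments_left]
        trace.append((active, segments_left, tuple(segment_list)))
        if segments_left == 0:
            return trace
        segments_left -= 1  # a segment endpoint activates the next segment


def mpls_trace(labels):
    """Return the label stack as seen after each pop (top of stack first)."""
    stack = list(labels)
    trace = []
    while stack:
        trace.append(tuple(stack))
        stack.pop(0)
    return trace


srv6 = srv6_trace(["2001:db8::a", "2001:db8::b", "2001:db8::c"])
mpls = mpls_trace([100, 200, 300])
for step in srv6:
    print(step)
for step in mpls:
    print(step)
```

In the SRv6 trace the full list is still present at the last step; in the MPLS trace only the final label is left.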

Owen
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 00:54, Sabri Berisha wrote:

> That will be very advantageous in a datacenter environment, or any other
> environment dealing with a lot of ECMP paths.
>
> I can't tell you how often during my eBay time I've been troubleshooting
> end-to-end packetloss between hosts in two datacenters where there were at least
> 10 or more layers of up to 16 way ECMP between them. Having a record of which
> path is being taken by a packet is very helpful to determine the one with a crappy
> transceiver.
>
> That work is already underway, albeit not specifically for MPLS. For example,
> I've worked with an experimental version of In-Band Network Telemetry (INT)
> as described in this draft: https://tools.ietf.org/html/draft-kumar-ippm-ifa-02
>
> I even demonstrated a very basic implementation during SuperCompute 19 in Denver
> last year. Most people who were interested in the demo were academics however,
> probably because it wasn't a real networking event.
>
> Note that there are several caveats that come with this draft and previous
> versions, and that it is still very much work in progress. But the potential is
> huge, at least in the DC.

Alright, we'll wait and see, then.



> That's a different story, but not entirely impossible. A probe packet can
> be sent across AS borders, and as long as the two NOCs are cooperating, the
> entire path can be reconstructed.

Yes, for once-off troubleshooting, I suppose that would work.

My concern is if it's for normal day-to-day operations. But who knows,
maybe someone will propose that too :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 20/Jun/20 22:00, Baldur Norddahl wrote:

>
> I can't speak for the year 2000 as I was not doing networking at this
> level at that time. But when I check the specs for the base mx204 it
> says something like 32 VRFs, 2 million routes in FIB and 6 million
> routes in RIB. Clearly those numbers are the total of routes across
> all VRFs otherwise you arrive at silly numbers (64 million FIB if you
> multiply, about 62k FIB if you divide by 32). My conclusion is that scale
> wise you are ok as long as you do not try to have more than one VRF with
> a complete copy of the DFZ.

I recall a number of networks holding multiple VRF's, including at least
2x Internet VRF's, for numerous use-cases. I don't know if they still do
that today, but one can get creative real quick :-).


>
> More worrying is that 2 million routes will soon not be enough to
> install all routes with a backup route, invalidating BGP FRR.

I have a niggling feeling this will be solved before we get there.

Now, whether we can afford it is a whole other matter.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Mark Tinka wrote:

>> There are many. So, our research group tried to improve RSVP.
>
> I'm a lot younger than the Internet, but I read a fair bit about its
> history. I can't remember ever coming across an implementation of RSVP
> between a host and the network in a commercial setting.

No, of course, because, as we agreed, RSVP has a lot of problems.

> Was "S-RSVP" ever implemented, and deployed?

It was implemented and some technology was used by commercial
router from Furukawa (a Japanese vendor selling optical
fiber now not selling routers).

>> However, perhaps, most people think show stopper to RSVP is lack
>> of scalability of weighted fair queueing, though, it is not a
>> problem specific to RSVP and MPLS shares the same problem.
>
> QoS has nothing to do with MPLS. You can do QoS with or without MPLS.

GMPLS, which you are using, is a mechanism to guarantee QoS by
reserving wavelength resources. It is impossible for GMPLS
not to offer QoS.

Moreover, as some people say they offer QoS with MPLS, they
should be using some prioritized queueing mechanism, perhaps
not poor WFQ.

> I should probably point out, also, that RSVP (or RSVP-TE) is not MPLS.

They are different, of course. But, GMPLS is to reserve bandwidth
resource. MPLS, in general, is to reserve label values, at least.

> All MPLS can do is convey IPP or DSCP values as an EXP code point in the
> core. I'm not sure how that creates a scaling problem within MPLS itself.

I didn't say scaling problem caused by QoS.

But, as you are avoiding extensive use of MPLS, I think you
are aware that extensive use of MPLS needs management of a
lot of labels, which does not scale.

Or, do I misunderstand something?

> If I understand this correctly, would this be the IntServ QoS model?

No. IntServ specifies format to carry QoS specification in RSVP
packets without assuming any specific model of QoS.

>> I didn't attempt to standardize our result in IETF, partly
>> because optical packet switching was a lot more interesting.
>
> Still is, even today :-)?

No. As experimental switches were working years ago and making
it work >10Tbps is not difficult (switching is easy, generating
10Tbps packets needs a lot of parallel equipment), there is little
remaining for research.

https://www.osapublishing.org/abstract.cfm?URI=OFC-2010-OWM4

>> Assuming a central controller (and its collocated or distributed
>> back up controllers), we don't need complicated protocols in
>> the network to maintain integrity of the entire network.
>
> Well, that's a point of view, I suppose.
>
> I still can't walk into a shop and "buy a controller". I don't know what
> this controller thing is, 10 years on.

SDN, maybe. Though I'm not saying SDN scales, it should be no
worse than MPLS.

> I can't say I've ever come across that scenario running MPLS since 2004.

I did some retrospective research.

https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching
History
1994: Toshiba presented Cell Switch Router (CSR) ideas to IETF BOF
1996: Ipsilon, Cisco and IBM announced label switching plans
1997: Formation of the IETF MPLS working group
1999: First MPLS VPN (L3VPN) and TE deployments
2000: MPLS traffic engineering
2001: First MPLS Request for Comments (RFCs) released

as I was a co-chair of 1994 BOF and my knowledge on MPLS is
mostly on 1997 ID:

https://tools.ietf.org/html/draft-ietf-mpls-arch-00

there seems to be a lot of terminology changes.

I'm saying that, if some failure occurs and IGP changes, a
lot of LSPs must be recomputed, which does not scale
if # of LSPs is large, especially in a large network
where IGP needs hierarchy (such as OSPF area).

Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Sun, Jun 21, 2020 at 9:56 AM Mark Tinka <mark.tinka@seacom.mu> wrote:

>
>
> On 20/Jun/20 22:00, Baldur Norddahl wrote:
>
>
> I can't speak for the year 2000 as I was not doing networking at this
> level at that time. But when I check the specs for the base mx204 it says
> something like 32 VRFs, 2 million routes in FIB and 6 million routes in
> RIB. Clearly those numbers are the total of routes across all VRFs
> otherwise you arrive at silly numbers (64 million FIB if you multiply, roughly
> 62k FIB if you divide by 32). My conclusion is that scale-wise you are OK as
> long as you do not try to have more than one VRF with a complete copy of the
> DFZ.
>
>
> I recall a number of networks holding multiple VRF's, including at least
> 2x Internet VRF's, for numerous use-cases. I don't know if they still do
> that today, but one can get creative real quick :-).
>
>
Yes I once made a plan to have one VRF per transit provider plus a peering
VRF. That way our BGP customers could have a session with each of those
VRFs to allow them full control of the route mix. I would of course also
need an Internet VRF for our own needs.

But the reality of that would be too many copies of the DFZ in the routing
tables. Although not necessary in the FIB as each of the transit VRFs could
just have a default route installed.

Regards,

Baldur
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 12:10, Masataka Ohta wrote:

>  
> It was implemented and some technology was used by commercial
> router from Furukawa (a Japanese vendor selling optical
> fiber now not selling routers).

I won't lie, never heard of it.


> GMPLS, you are using, is the mechanism to guarantee QoS by
> reserving wavelength resource. It is impossible for GMPLS
> not to offer QoS.

That is/was the idea.

In practice (at least in our Transport network), deploying capacity as
an offline exercise is significantly simpler. In such a case, we
wouldn't use GMPLS for capacity reservation, just path re-computation in
failure scenarios.

Our Transport network isn't overly meshed. It's just stretchy. Perhaps
if one was trying to build a DWDM backbone into, out of and through
every city in the U.S., capacity reservation in GMPLS may be a use-case.
But unless someone is willing to pipe up and confess to implementing it
in this way, I've not heard of it.


>
> Moreover, as some people say they offer QoS with MPLS, they
> should be using some prioritized queueing mechanisms, perhaps
> not poor WFQ.

It would be a combination - PQ and WFQ depending on the traffic type and
how much customers want to pay.

But carrying an MPLS EXP code point does not make MPLS unscalable. It's
no different to carrying a DSCP or IPP code point in plain IP. Or even
an 802.1p code point in Ethernet.
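
As a rough illustration of how small that marking is, here is a minimal
sketch that assumes the common default of copying the three most significant
DSCP bits (the old IP Precedence) into the 3-bit EXP field. Real platforms let
you override the mapping with policy, and the function name is hypothetical.

```python
def dscp_to_exp(dscp: int) -> int:
    """Map a 6-bit DSCP value to a 3-bit MPLS EXP value (assumed default mapping)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field (0-63)")
    # Keep only the top three bits, i.e. the IP Precedence portion.
    return dscp >> 3


# A few well-known code points: EF, AF41, CS1 and best effort.
for name, dscp in [("EF", 46), ("AF41", 34), ("CS1", 8), ("BE", 0)]:
    print(f"{name:5} DSCP {dscp:2} -> EXP {dscp_to_exp(dscp)}")
```

The per-packet state is three bits either way, which is why the marking itself
adds nothing to MPLS scaling.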


> They are different, of course. But, GMPLS is to reserve bandwidth
> resource.

In theory. What are people doing in practice? I just told you our story.


> MPLS, in general, is to reserve label values, at least.

MPLS is the forwarding paradigm. Label reservation/allocation can be
done manually or with a label distribution protocol. MPLS doesn't care
how labels are generated and learned. It will just push, swap and pop as
it needs to.


> I didn't say scaling problem caused by QoS.
>
> But, as you are avoiding to extensively use MPLS, I think you
> are aware that extensive use of MPLS needs management of a
> lot of labels, which does not scale.
>
> Or, do I misunderstand something?

I'm not avoiding extensive use of MPLS. I want extensive use of MPLS.

In IPv4, we forward in MPLS 100%. In IPv6, we forward in MPLS 80%. This
is due to vendor nonsense. Trying to fix.



> No. IntServ specifies format to carry QoS specification in RSVP
> packets without assuming any specific model of QoS.

Then I'm failing to understand your point, especially since it doesn't
sound like any operator is deploying such a model, or if so, publicly
suffering from it.



> No. As experimental switches were working years ago and making
> it work >10Tbps is not difficult (switching is easy, generating
> 10Tbps packets needs a lot of parallel equipment), there is little
> remaining for research.

We'll get there. This doesn't worry me so much :-). Either horizontally
or vertically. I can see a few models to scale IP/MPLS carriage.


>    
> SDN, maybe. Though I'm not saying SDN scales, it should be no
> worse than MPLS.

I still can't tell you what SDN is :-). I won't suffer it in this
decade, thankfully.


> I did some retrospective research.
>
>    https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching
>    History
>    1994: Toshiba presented Cell Switch Router (CSR) ideas to IETF BOF
>    1996: Ipsilon, Cisco and IBM announced label switching plans
>    1997: Formation of the IETF MPLS working group
>    1999: First MPLS VPN (L3VPN) and TE deployments
>    2000: MPLS traffic engineering
>    2001: First MPLS Request for Comments (RFCs) released
>
> as I was a co-chair of 1994 BOF and my knowledge on MPLS is
> mostly on 1997 ID:
>
>    https://tools.ietf.org/html/draft-ietf-mpls-arch-00
>
> there seems to be a lot of terminology changes.

My comment to that was in reference to your text, below:

    "What if, an inner label becomes invalidated around the
    destination, which is hidden, for route scalability,
    from the equipments around the source?"

I've never heard of such an issue in 16 years.


>
> I'm saying that, if some failure occurs and IGP changes, a
> lot of LSPs must be recomputed, which does not scale
> if # of LSPs is large, especially in a large network
> where IGP needs hierarchy (such as OSPF area).

That happens everyday, already. Links fail, IGP re-converges, LDP keeps
humming. RSVP-TE too, albeit all that state does need some consideration
especially if code is buggy.

Particularly, where you have LFA/IP-FRR both in the IGP and LDP, I've
not come across any issue where IGP re-convergence caused LSP's to fail.

In practice, IGP hierarchy (OSPF Areas or IS-IS Levels) doesn't help
much if you are running MPLS. FEC's are forged against /32 and /128
addresses. Yes, as with everything else, it's a trade-off.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 12:45, Baldur Norddahl wrote:

>
> Yes I once made a plan to have one VRF per transit provider plus a
> peering VRF. That way our BGP customers could have a session with each
> of those VRFs to allow them full control of the route mix. I would of
> course also need an Internet VRF for our own needs.
>
> But the reality of that would be too many copies of the DFZ in the
> routing tables. Although not necessary in the FIB as each of the
> transit VRFs could just have a default route installed.

We just opted for BGP communities :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Sun, Jun 21, 2020 at 1:30 PM Mark Tinka <mark.tinka@seacom.mu> wrote:

>
>
> On 21/Jun/20 12:45, Baldur Norddahl wrote:
>
>
> Yes I once made a plan to have one VRF per transit provider plus a peering
> VRF. That way our BGP customers could have a session with each of those
> VRFs to allow them full control of the route mix. I would of course also
> need an Internet VRF for our own needs.
>
> But the reality of that would be too many copies of the DFZ in the routing
> tables. Although not necessary in the FIB as each of the transit VRFs could
> just have a default route installed.
>
>
> We just opted for BGP communities :-).
>
>
Not really the same. Let's say the best path is through transit 1 but the
customer thinks transit 1 sucks balls and wants his egress traffic to go
through your transit 2. Only the VRF approach lets every BGP customer, even
single homed ones, make his own choices about upstream traffic.

You would be more like a transit broker than a traditional ISP with a
routing mix. Your service is to buy one place, but get the exact same
product as you would have if you bought from top X transits in your area.
Delivered as X distinct BGP sessions to give you total freedom to send
traffic via any of the transit providers.

This is also the reason you do not actually need any routes in the FIB for
each of those transit VRFs. Just a default route because all traffic will
unconditionally go to said transit provider. The customer routes would
still be there of course.

Regards,

Baldur
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> I'm saying that, if some failure occurs and IGP changes, a
> lot of LSPs must be recomputed, which does not scale
> if # of LSPs is large, especially in a large network
> where IGP needs hierarchy (such as OSPF area).
>
> Masataka Ohta
>


Actually when IGP changes LSPs are not recomputed with LDP or SR-MPLS (when
used without TE :).

"LSP" term is perhaps what drives your confusion --- in LDP MPLS there is
no "Path" - in spite of the acronym (Label Switched *Path*). Labels are
locally significant and swapped at each LSR - resulting essentially with a
bunch of one hop crossconnects.

In other words MPLS LDP strictly follows IGP SPT at each LSR hop.
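
A toy sketch of that behaviour, under stated assumptions: the router names,
label values and the single FEC are all hypothetical, and every router is
assumed to already hold the label each neighbour advertised for the FEC
(liberal label retention). Each hop looks only at its own IGP next hop, so an
IGP change at P1 changes P1's swap and nothing else.

```python
IMPLICIT_NULL = 3  # reserved label value signalling penultimate-hop popping

# Label each router advertises for the FEC (PE2's loopback); values are made up.
advertised = {"P1": 24001, "P2": 30017, "P3": 41000, "PE2": IMPLICIT_NULL}

# IGP next hop toward the FEC at each router before any failure.
igp_next_hop = {"PE1": "P1", "P1": "P2", "P2": "PE2"}


def forward(ingress, next_hop, advertised):
    """Walk a packet hop by hop; each router decides using local state only."""
    trace, node, label = [], ingress, None
    while node in next_hop:
        nh = next_hop[node]
        out = advertised[nh]  # label the downstream neighbour asked us to use
        if out == IMPLICIT_NULL:
            action = "forward unlabelled" if label is None else f"pop {label}"
            out = None
        elif label is None:
            action = f"push {out}"
        else:
            action = f"swap {label}->{out}"
        trace.append(f"{node}: {action}, send to {nh}")
        node, label = nh, out
    return trace


before = forward("PE1", igp_next_hop, advertised)

# Link P1-P2 fails; the IGP reconverges and P1 now points at P3.
after_failure = dict(igp_next_hop, P1="P3", P3="PE2")
after = forward("PE1", after_failure, advertised)

print("\n".join(before))
print("\n".join(after))
```

PE1's entry is identical in both traces; only P1 picks a different outgoing
label, which is the "bunch of one hop crossconnects" point.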

Many thx,
R.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 14:58, Baldur Norddahl wrote:

>
> Not really the same. Let's say the best path is through transit 1 but
> the customer thinks transit 1 sucks balls and wants his egress traffic
> to go through your transit 2. Only the VRF approach lets every BGP
> customer, even single homed ones, make his own choices about upstream
> traffic.
>
> You would be more like a transit broker than a traditional ISP with a
> routing mix. Your service is to buy one place, but get the exact same
> product as you would have if you bought from top X transits in your
> area. Delivered as X distinct BGP sessions to give you total freedom
> to send traffic via any of the transit providers.

We received such requests years ago, and calculated the cost of
complexity vs. BGP communities. In the end, if the customer wants to use
a particular upstream on our side, we'd rather setup an EoMPLS circuit
between them and they can have their own contract.

Practically, 90% of our traffic is peering. We don't do that much with
upstream providers.


>
> This is also the reason you do not actually need any routes in the FIB
> for each of those transit VRFs. Just a default route because all
> traffic will unconditionally go to said transit provider. The customer
> routes would still be there of course.

Glad it works for you. We just found it too complex, not just for the
problems it would solve, but also for the parity issues between VRF's
and the global table.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 15:48, Robert Raszuk wrote:

>
>
> Actually when IGP changes LSPs are not recomputed with LDP or SR-MPLS
> (when used without TE :). 
>
> "LSP" term is perhaps what drives your confusion --- in LDP MPLS there
> is no "Path" - in spite of the acronym (Label Switched *Path*). Labels
> are locally significant and swapped at each LSR - resulting
> essentially with a bunch of one hop crossconnects. 
>
> In other words MPLS LDP strictly follows IGP SPT at each LSR hop.

Yep, which is what I tried to explain as well. With LDP, MPLS-enabled
hosts simply push, swap and pop. There is no concept of an "end-to-end
LSP" as such. We just use the term "LSP" to define an FEC. But really,
each node in the FEC's path is making its own push, swap and pop decisions.

The LFIB in each node need only be as large as the number of LDP-enabled
routers in the network. You can get scenarios where FEC's are also
created for infrastructure links, but if you employ filtering to save on
FIB slots, you really just need to allocate labels to Loopback addresses
only.
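
A back-of-the-envelope sketch of that sizing argument. The router and link
counts are hypothetical, and the model assumes one loopback per router and one
prefix per infrastructure link.

```python
def ldp_fec_count(routers: int, links: int, loopbacks_only: bool) -> int:
    """Rough number of IGP FECs that receive an LDP label binding."""
    # One /32 (or /128) loopback per router, plus one prefix per link
    # unless label allocation is filtered down to loopbacks.
    return routers if loopbacks_only else routers + links


routers, links = 500, 1200  # hypothetical backbone
print("all IGP routes :", ldp_fec_count(routers, links, loopbacks_only=False))
print("loopbacks only :", ldp_fec_count(routers, links, loopbacks_only=True))
```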

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> The LFIB in each node need only be as large as the number of LDP-enabled
> routers in the network.

That is true for P routers ... not so much for PEs.

Please observe that label space in each PE router is divided for IGP and
BGP as well as other label-hungry services ... there are many consumers of
local label block.

So it is always the case that LFIB table (max 2^20 entries - 1M) on PEs is
much larger than LFIB on P nodes.
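
The arithmetic behind the 1M figure, as a small sketch. The 20-bit label field
and the 16 reserved values are from the MPLS label encoding; the per-service
label counts for the P and PE routers are made-up numbers for illustration.

```python
LABEL_BITS = 20
RESERVED_LABELS = 16  # label values 0-15 are reserved

label_space = 2 ** LABEL_BITS
usable_labels = label_space - RESERVED_LABELS

# Hypothetical label consumers on each router type.
p_router = {"igp_loopbacks": 500}
pe_router = {"igp_loopbacks": 500, "l3vpn": 40_000, "l2_circuits": 5_000}


def lfib_entries(consumers: dict) -> int:
    """Total local labels allocated across all consumers."""
    return sum(consumers.values())


print("label space   :", label_space)
print("usable labels :", usable_labels)
print("P  LFIB usage :", lfib_entries(p_router))
print("PE LFIB usage :", lfib_entries(pe_router))
```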

Thx,
R.




RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> From: NANOG <nanog-bounces@nanog.org> On Behalf Of Mark Tinka
> Sent: Friday, June 19, 2020 7:28 PM
>
>
> On 19/Jun/20 17:13, Robert Raszuk wrote:
>
> >
> > So I think Ohta-san's point is about scalability services not flat
> > underlay RIB and FIB sizes. Many years ago we had requests to support
> > 5M L3VPN routes while underlay was just 500K IPv4.
>
> Ah, if the context, then, was l3vpn scaling, yes, that is a known issue.
>
I wouldn't say it's known to many as not many folks are actually limited by only up to ~1M customer connections, or next level up, only up to ~1M customer VPNs.

> Apart from the global table vs. VRF parity concerns I've always had (one of
> which was illustrated earlier this week, on this list, with RPKI in a VRF),
>
Well yeah, things work differently in VRFs, not a big surprise.
And what about an example of bad flowspec routes/filters cutting the boxes off net -where having those flowspec routes/filters contained within an Internet VRF would not have such an effect.
See, it goes either way.
Would be interesting to see a comparison of good vs bad for the Internet routes in VRF vs in Internet routes in global/default routing table.


> the
> other reason I don't do Internet in a VRF is because it was always a trade-off:
>
> - More routes per VRF = fewer VRF's.
> - More VRF's = fewer routes per VRF.
>
No, that's just a result of having a finite FIB/RIB size -if you want to cut these resources into virtual pieces you'll naturally get your equations above.
But if you actually construct your testing to showcase the delta between how much FIB/RIB space is taken by x prefixes with each in a VRF as opposed to all in a single default VRF (global routing table) the delta is negligible.
(Yes negligible even in case of per prefix VPN label allocation method -which I'm assuming no one is using anyways as it inherently doesn't scale and would limit you to ~1M VPN prefixes though per-CE/per-next-hop VPN label allocation method gives one the same functionality as per-prefix one while pushing the limit to ~1M PE-CE links/IFLs which from my experience is sufficient for most folks out there).
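
A rough sketch of why the allocation method matters for the ~1M ceiling. The
customer numbers are hypothetical and the function is only a counting model,
not any vendor's allocator.

```python
USABLE_LABELS = 2 ** 20 - 16  # 20-bit label space minus reserved values


def vpn_labels(ce_links: int, prefixes_per_ce: int, mode: str) -> int:
    """Count VPN labels a PE would allocate under a given allocation method."""
    if mode == "per-prefix":
        return ce_links * prefixes_per_ce  # one label per VPN prefix
    if mode == "per-ce":
        return ce_links                    # one label per PE-CE link / next hop
    raise ValueError(f"unknown mode: {mode}")


ce_links, prefixes_per_ce = 2_000, 50  # hypothetical PE
for mode in ("per-prefix", "per-ce"):
    used = vpn_labels(ce_links, prefixes_per_ce, mode)
    print(f"{mode:10} uses {used:7} labels ({used / USABLE_LABELS:.2%} of the space)")
```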

adam
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 19:34, Robert Raszuk wrote:
>
> That is true for P routers ... not so much for PEs. 
>
> Please observe that label space in each PE router is divided for IGP
> and BGP as well as other label-hungry services ... there are many
> consumers of local label block. 
>
> So it is always the case that LFIB table (max 2^20 entries - 1M) on
> PEs is much larger than LFIB on P nodes.

I should point out that all of my input here is based on simple MPLS
forwarding of IP traffic in the global table. In this scenario, labels
are only assigned to BGP next-hops, which is typically an IGP Loopback
address.

Labels don't get assigned to BGP routes in a global table. There is no
use for that.

Of course, as this is needed in VRF's and other BGP-based VPN services,
the extra premium customers pay for that privilege may be considered
warranted :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
>
> I should point out that all of my input here is based on simple MPLS
> forwarding of IP traffic in the global table. In this scenario, labels
> are only assigned to BGP next-hops, which is typically an IGP Loopback
> address.
>

Well this is true for one company :) Name starts with j ....

Other company name starting with c - at least some time back by default
allocated labels for all routes in the RIB either connected or static or
sourced from IGP. Sure you could always limit that with a knob if desired.

The issue with allocating labels only for BGP next hops is that your
IP/MPLS LFA breaks (or more directly is not possible) as you do not have a
label to PQ node upon failure. Hint: PQ node is not even running BGP :).
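
To make the PQ node point concrete, here is a simplified sketch of the
link-protection form of the remote LFA computation. It assumes symmetric link
metrics, ignores extended P-space and node protection, and uses a hypothetical
five-node ring where S protects its link to E.

```python
import heapq


def spf(graph, src):
    """Plain Dijkstra: shortest distance from src to every node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def pq_nodes(graph, s, e):
    """Nodes S can reach, and that can reach E, without using link S-E."""
    cost_se = graph[s][e]
    d_s, d_e = spf(graph, s), spf(graph, e)
    # P-space: no shortest path from S to N crosses the protected link.
    p_space = {n for n in graph if d_s[n] < cost_se + d_e[n]}
    # Q-space: no shortest path from N to E crosses the protected link.
    q_space = {n for n in graph if d_e[n] < d_s[n] + cost_se}
    return (p_space & q_space) - {s, e}


ring = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "E": 1},
    "E": {"C": 1, "S": 1},
}
print(pq_nodes(ring, "S", "E"))
```

In this ring the only PQ node is B, two hops away from S. B is a plain core
router, so S can only tunnel repair traffic to it if a label for B's loopback
exists, regardless of whether B is anyone's BGP next hop.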

Sure selective folks still count on "IGP Convergence" to restore
connectivity. But I hope those will move to much faster connectivity
restoration techniques soon.


> Labels don't get assigned to BGP routes in a global table. There is no
> use for that.
>

Sure - True.

Cheers,
R.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 21:15, adamv0025@netconsultings.com wrote:

> I wouldn't say it's known to many as not many folks are actually limited by only up to ~1M customer connections, or next level up, only up to ~1M customer VPNs.

It's probably less of a problem now than it was 10 years ago. But, yes,
I don't have any real-world experience.



> Well yeah, things work differently in VRFs, not a big surprise.
> And what about an example of bad flowspec routes/filters cutting the boxes off net -where having those flowspec routes/filters contained within an Internet VRF would not have such an effect.
> See, it goes either way.
> Would be interesting to see a comparison of good vs bad for the Internet routes in VRF vs in Internet routes in global/default routing table.

Well, the global table is the basics, and VRF's is where sexy lives :-).


> No, that's just a result of having a finite FIB/RIB size -if you want to cut these resources into virtual pieces you'll naturally get your equations above.
> But if you actually construct your testing to showcase the delta between how much FIB/RIB space is taken by x prefixes with each in a VRF as opposed to all in a single default VRF (global routing table) the delta is negligible.
> (Yes negligible even in case of per prefix VPN label allocation method -which I'm assuming no one is using anyways as it inherently doesn't scale and would limit you to ~1M VPN prefixes though per-CE/per-next-hop VPN label allocation method gives one the same functionality as per-prefix one while pushing the limit to ~1M PE-CE links/IFLs which from my experience is sufficient for most folks out there).

Like I said, with today's CPU's and memory, probably not an issue. But
it's not an area I play in, so those with more experience - like
yourself - would know better.

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 22:21, Robert Raszuk wrote:

>
> Well this is true for one company :) Name starts with j .... 
>
> Other company name starting with c - at least some time back by
> default allocated labels for all routes in the RIB either connected or
> static or sourced from IGP. Sure you could always limit that with a
> knob if desired.


Juniper allocates labels to the Loopback only.

Cisco allocates labels to all IGP and interface routes.

Neither allocates labels to BGP routes for the global table.


>
> The issue with allocating labels only for BGP next hops is that your
> IP/MPLS LFA breaks (or more directly is not possible) as you do not
> have a label to PQ node upon failure.  Hint: PQ node is not even
> running BGP :).

Wouldn't T-LDP fix this, since LDP LFA is a targeted session?

Need to test.


>
> Sure selective folks still count on "IGP Convergence" to restore
> connectivity. But I hope those will move to much faster connectivity
> restoration techniques soon.

We are happy :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> Wouldn't T-LDP fix this, since LDP LFA is a targeted session?

Nope. You need to get to PQ node via potentially many hops. So you need to
have either ordered or independent label distribution to its loopback in
place.

Best,
R.

Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 21/Jun/20 23:01, Robert Raszuk wrote:

>
> Nope. You need to get to PQ node via potentially many hops. So you
> need to have either ordered or independent label distribution to its
> loopback in place.

I have some testing I want to do with IS-IS only announcing the Loopback
from a set of routers to the rest of the backbone, and LDP allocating
labels for it accordingly, to solve a particular problem.

I'll test this out and see what happens re: LDP LFA.

Mark.
RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
> From: NANOG <nanog-bounces@nanog.org> On Behalf Of Masataka Ohta
> Sent: Friday, June 19, 2020 5:01 PM
>
> Robert Raszuk wrote:
>
> > So I think Ohta-san's point is about scalability services not flat
> > underlay RIB and FIB sizes. Many years ago we had requests to support
> > 5M L3VPN routes while underlay was just 500K IPv4.
>
> That is certainly a problem. However, worse problem is to know label
> values nested deeply in MPLS label chain.
>
> Even worse, if the router near the destination expected to pop the label chain
> goes down, how can the source know that the router went down and
> choose an alternative router near the destination?
>
Via IGP or controller, but for sub 50ms convergence there are edge node
protection mechanisms, so the point is the source doesn't even need to know
about it for the restoration to happen.

adam
RE: Devil's Advocate - Segment Routing, Why? [ In reply to ]
Hi Baldur,



From memory mx204 FIB is 10M (v4/v6) and RIB 30M for each v4 and v6.

And remember the FIB is hierarchical -so it’s the next-hops per prefix you are referring to with BGP FRR. And also going from memory of past scaling testing, if pfx1+NH1 == x, then Pfx1+NH1+NH2 !== 2x, where x is used FIB space.
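
A toy counting model of that observation. It is not how any real forwarding
chip lays out memory; it only assumes that prefixes point at a shared
path-list object rather than each carrying its own copy of every next hop. The
class and names are hypothetical.

```python
class HierarchicalFib:
    """Prefixes reference shared path lists instead of embedding next hops."""

    def __init__(self):
        self.path_lists = {}  # tuple of next hops -> path-list id
        self.prefixes = {}    # prefix -> tuple of next hops (shared key)

    def install(self, prefix, primary, backup=None):
        key = (primary,) if backup is None else (primary, backup)
        self.path_lists.setdefault(key, len(self.path_lists))
        self.prefixes[prefix] = key

    def units(self):
        # One unit per prefix entry plus one per next hop held in a path list.
        return len(self.prefixes) + sum(len(k) for k in self.path_lists)


prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(10_000)]

primary_only = HierarchicalFib()
with_backup = HierarchicalFib()
for p in prefixes:
    primary_only.install(p, "PE1")
    with_backup.install(p, "PE1", backup="PE2")

print("x  (primary only) :", primary_only.units())
print("primary + backup  :", with_backup.units())
print("2x for comparison :", 2 * primary_only.units())
```

In this model adding a backup next hop to every prefix costs one extra unit in
total, which is the sense in which Pfx1+NH1+NH2 is far from 2x.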



adam



From: NANOG <nanog-bounces+adamv0025=netconsultings.com@nanog.org> On Behalf Of Baldur Norddahl
Sent: Saturday, June 20, 2020 9:00 PM



I can't speak for the year 2000 as I was not doing networking at this level at that time. But when I check the specs for the base mx204 it says something like 32 VRFs, 2 million routes in FIB and 6 million routes in RIB. Clearly those numbers are the total of routes across all VRFs otherwise you arrive at silly numbers (64 million FIB if you multiply, roughly 62k FIB if you divide by 32). My conclusion is that scale-wise you are OK as long as you do not try to have more than one VRF with a complete copy of the DFZ.
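
The arithmetic in that paragraph, spelled out. The 2 million and 32 figures are
the ones quoted above; the DFZ size is an assumption (roughly 800,000 IPv4
prefixes, an approximate mid-2020 figure), and dividing 2 million by 32 gives
about 62,500.

```python
FIB_LIMIT = 2_000_000  # routes in FIB, as quoted above
VRF_LIMIT = 32         # VRFs, as quoted above
DFZ_ROUTES = 800_000   # assumption: approximate IPv4 full table in mid-2020

if_per_vrf_limit = FIB_LIMIT * VRF_LIMIT   # reading the FIB limit as per-VRF
if_split_evenly = FIB_LIMIT // VRF_LIMIT   # reading it as shared equally
full_tables_that_fit = FIB_LIMIT // DFZ_ROUTES

print("multiply :", if_per_vrf_limit)
print("divide   :", if_split_evenly)
print("full DFZ copies that fit in the FIB:", full_tables_that_fit)
```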



More worrying is that 2 million routes will soon not be enough to install all routes with a backup route, invalidating BGP FRR.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Wed, 17 Jun 2020 at 22:09, <adamv0025@netconsultings.com> wrote:
>
> > From: NANOG <nanog-bounces@nanog.org> On Behalf Of Mark Tinka
> > Sent: Wednesday, June 17, 2020 6:07 PM
> >
> >
> > I've heard a lot about "network programmability", e.t.c.,
> First of all the "SR = network programmability" is BS, SR = MPLS, any programmability we've had for MPLS since ever works the same way for SR.

It works because SR != MPLS.

SR is a protocol which describes many aspects, such as how traffic
forwarding decisions made at the ingress node to a PSN can be
guaranteed across the PSN, even though the nodes along the PSN path
use per-hop forwarding behaviour and different nodes along the path
have made different forwarding decisions.

With SR-MPLS, segment IDs are used as an index into the label
range (SRGB), so SIDs don't correlate 1:1 to MPLS labels. Equally,
with SRv6 the segment IDs are encoded as IPv6 addresses and don't
correlate 1:1 to an IPv6 address. There is a Venn diagram with an
overlapping section in the middle which is "generic SR" with a bunch
of core features that are supported agnostic of the encoding
mechanism.
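
A minimal sketch of the index-to-label relationship for SR-MPLS. The SRGB
bases, sizes and the SID index are hypothetical; the point is only that the
same SID index can resolve to different label values on nodes with different
SRGBs.

```python
def sid_to_label(srgb_base: int, srgb_size: int, sid_index: int) -> int:
    """Resolve a prefix-SID index to the local MPLS label on one node."""
    if not 0 <= sid_index < srgb_size:
        raise ValueError("SID index falls outside this node's SRGB")
    return srgb_base + sid_index


# Two nodes with different (made-up) SRGBs resolving the same SID index.
node_a = {"srgb_base": 16_000, "srgb_size": 8_000}
node_b = {"srgb_base": 100_000, "srgb_size": 8_000}
sid_index = 42

print("node A label:", sid_to_label(sid_index=sid_index, **node_a))
print("node B label:", sid_to_label(sid_index=sid_index, **node_b))
```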

Cheers,
James.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Wed, 17 Jun 2020 at 18:08, Mark Tinka <mark.tinka@seacom.mu> wrote:
>
> Hi all.
>
> When the whole SR concept was being first dreamed up, I was mildly excited about it. But then real life happened and global deployment (be it basic SR-MPLS or SRv6) is what it is, and I became less excited. This was back in 2015.
>
> All the talk about LDPv6 this and last week has had me reflecting a great deal on where we are, as an industry, in why we are having to think about SR and all its incarnations.
>
> So, let me be the one that stirs up the hornets' nest...
>
> Why do we really need SR? Be it SR-MPLS or SRv6 or SRv6+?

I am clearly very far behind on my emails, but of the emails I've read
so far in this thread, you have mentioned this at least twice:

On Wed, 17 Jun 2020 at 18:08, Mark Tinka <mark.tinka@seacom.mu> wrote:
> What I am less enthused about is being forced

On Wed, 17 Jun 2020 at 23:22, Mark Tinka <mark.tinka@seacom.mu> wrote:
> it tastes funny when you are forced

Mark, does someone have a gun to your head? Are you in trouble? Blink
63 times for yes, 64 times for no ;)

Cheers,
James.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Wed, 17 Jun 2020 at 23:19, Mark Tinka <mark.tinka@seacom.mu> wrote:
> Yes, we all love less state, I won't argue that. But it's the same question that is being asked less and less with each passing year - what scales better in 2020, OSPF or IS-IS. That is becoming less relevant as control planes keep getting faster and cheaper.
>
> I'm not saying that if you are dealing with 100,000 T-LDP sessions you should not consider SR, but if you're not, and SR still requires a bit more development (never mind deployment experience), what's wrong with having LDPv6? If it makes near-as-no-difference to your control plane in 2020 or 2030 as to whether your 10,000-node network is running LDP or SR, why not have the choice?

I'm going to kick the nest in the other direction now :D ... There
would be no need to massively scale an IGP or worry about running
LDPv4 + LDPv6 or SR-MPLS if we had put more development time into MPLS
over UDP. I think it's a great technology which solves a lot of
problems and I've been itching to deploy it for ages now, but vendor
support for it is nowhere near the level of MPLS over Ethernet.

Cheers,
James.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 30/Jun/20 20:37, James Bensley wrote:

> Mark, does someone have a gun to your head? Are you in trouble? Blink
> 63 times for yes, 64 times for no ;)

You're pretty late to this party, mate...

Mark.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On Tue, 30 Jun 2020 at 22:07, Mark Tinka <mark.tinka@seacom.com> wrote:
>
>
>
> On 30/Jun/20 20:37, James Bensley wrote:
>
> > Mark, does someone have a gun to your head? Are you in trouble? Blink
> > 63 times for yes, 64 times for no ;)
>
> You're pretty late to this party, mate...

True, but what's changed in two weeks with regards to LDPv6 and SR?

What was your use case / requirement for LDPv6 - to remove the full
table v6 feed from your core or to remove IPv4 from your IGP or both?

Cheers,
James.
Re: Devil's Advocate - Segment Routing, Why? [ In reply to ]
On 1/Jul/20 09:10, James Bensley wrote:

> True, but what's changed in two weeks with regards to LDPv6 and SR?
>
> What was your use case / requirement for LDPv6 - to remove the full
> table v6 feed from your core or to remove IPv4 from your IGP or both?

Give me a year to work this and report back, hopefully at a NANOG
lectern near you :-).

Mark.