Mailing List Archive

L3VPNs and on-prem DDoS scrubbing architecture
Hi there,

We're a US research and education ISP, and we've been tasked with coming up with an architecture to allow on-premises DDoS scrubbing with an appliance. As a first pass I've created a cleanL3VPN routing-instance to function as a clean VRF that uses rib-groups to mirror the relevant parts of inet.0. It is in production and is working great for customer-learned BGP routes. It falls apart when I try to protect a directly attached destination that has a MAC address in inet.0. I think I understand why, and the purpose of this message is to see if anyone has been in a similar situation and has thoughts/advice/warnings about alternative designs.

To explain what I see: I noticed that MAC-address-based next hops don't seem to be copied from inet.0 into cleanL3VPN.inet.0. I assume this means that MAC-address-based forwarding must be referencing inet.0 [see far below]. This obviously creates a loop once the best path in inet.0 becomes a BGP /32. For example, when I'm announcing a /32 for 1.2.3.4 out of a locally attached 1.2.3.0/26, traceroute implies the packet enters inet.0 and is correctly sent to 5.6.7.8 as the next hop, then arrives in cleanL3VPN, which decides to forward to 5.6.7.8 again, in a loop; even though the BGP /32 isn't part of cleanL3VPN [see below], cleanL3VPN is dependent on inet.0 for resolution. Even if I could copy the inet.0 MAC addresses into cleanL3VPN, eventually the MAC address would age out of inet.0 because the /32 would no longer be directly connected. So if I want to be able to protect locally attached destinations, I think my design is unworkable, and my candidate solutions are:

= use flowspec redirection to a dirty VRF, keep inet.0 as clean, and use the flowspec interface filter-group appropriately on backbone interfaces [routing-options flow interface-group exclude, which I already have deployed correctly]. This seems easy but is less performant.
= put my customers into a customerVRF and deal with route leaking between the global table and customerVRF. This is a well-known tactic but more complicated and disruptive to deploy, as I have to airlift basically all the customers into a VRF to have full coverage.

For redirection, to date I've been looking at longest-prefix-match solutions due to the presumed scalability advantage over flowspec. I have an unknown number of "always on" redirects I might be asked to entertain. 10? 100? 1000? I'm trying to come up with a solution that doesn't rely on touching the routers themselves. I did think about creating a normal [non-flowspec] input firewall term on untrusted interfaces that redirects to the dirty VRF based on a single destination prefix-list, and just relying on flowspec for on-demand stuff, with the assumption that one firewall term with, let's say, 1000 prefixes is more performant than 1000 standalone flowspec rules. I think my solution is fundamentally workable, but I don't think the purchased turnkey DDoS orchestration is going to natively interact with our Junipers, so that is looked down upon, since it would require "a router guy" or writing custom automation when adding/removing always-on protection. It seems technically very viable to me; I just bring up these details because I feel that, without a ton of effort, VRF redirection can be made nearly as performant as longest prefix match.
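
A rough sketch of that filter-based redirect, with hypothetical filter, prefix-list, and instance names (the dirty VRF is assumed to already exist and carry the routes toward the scrubber):

set policy-options prefix-list ALWAYS-ON-PROTECTED 198.51.100.10/32
set firewall family inet filter UNTRUSTED-IN term to-dirty from destination-prefix-list ALWAYS-ON-PROTECTED
set firewall family inet filter UNTRUSTED-IN term to-dirty then count always-on-redirect
set firewall family inet filter UNTRUSTED-IN term to-dirty then routing-instance dirtyVRF
set firewall family inet filter UNTRUSTED-IN term everything-else then accept
set interfaces et-0/0/1 unit 100 family inet filter input UNTRUSTED-IN

With that shape, adding or removing an always-on protection becomes a one-line prefix-list change, which is the property being argued for above.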

While we run MPLS, currently all of our customers/transit are in the global table. I'm trying to avoid solutions for now that put the 1M+ route DFZ RIB into an L3VPN; it's an awfully big change I don't want to rush into, especially for this proof of concept, but I'd like to hear opinions if that's the best solution to this specific problem. I'm not sure it's fundamentally different from creating a customerVRF; it seems like I just need to separate the customers from the internet ingress.

My gut says "the best" thing to do is to create a customerVRF but it feels a bit complicated as I have to worry about things like BGP/static/direct and will lose addPath [.I recently discovered add-path and route-target are mutually exclusive in JunOS].

My gut says "the quickest" and least disruptive thing to do is to go the flowspec/filter route and frankly I'm beginning to lean that way since I'm already partially in production and needed to have a solution 5 days ago to this problem :>

I've done all of these things before [flowspec, rib leaking]; I think it's just a matter of figuring out the next best step, and I was looking to see if anyone has been in a similar situation and has thoughts/advice/warnings.

I'm talking about IPv4 below but I ack IPv6 is a thing and I would just do the same solution.

-Michael

===/===

@$myrouter> show route forwarding-table destination 1.2.3.4 extensive
Apr 02 08:39:10
Routing table: default.inet [Index 0]
Internet:

Destination: 1.2.3.4/32
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE
Next-hop type: indirect Index: 1048588 Reference: 3
Nexthop: 5.6.7.8
Next-hop type: unicast Index: 981 Reference: 3
Next-hop interface: et-0/1/10.3099

Destination: 1.2.3.4/32
Route type: destination
Route reference: 0 Route interface-index: 85
Multicast RPF nh index: 0
P2mpidx: 0
Flags: none
Nexthop: 0:50:56:b3:4f:fe
Next-hop type: unicast Index: 1562 Reference: 1
Next-hop interface: ae17.3347

Routing table: cleanL3VPN.inet [Index 21]
Internet:

Destination: 1.2.3.0/26
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE, rt nh decoupled
Next-hop type: table lookup Index: 1 Reference: 40
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Tue, Apr 02, 2024 at 03:25:21PM +0000, Michael Hare via juniper-nsp wrote:

Hi!

A workaround that we're using (not elegant, but working): set up "self-pointing" routes to directly connected destinations:

set routing-options static route A.B.C.D/32 next-hop A.B.C.D

and export these to cleanL3VPN. Resulting forwarding-table:

Routing table: default.inet [Index 0]
Internet:

Destination: A.B.C.D/32
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE, rt nh decoupled
Nexthop: 0:15:17:b0:e6:f8
Next-hop type: unicast Index: 2930 Reference: 4
Next-hop interface: ae3.200
RPF interface: ae3.200

[...]

Routing table: cleanL3VPN.inet [Index 6]
Internet:

Destination: 87.245.206.15/32
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE, rt nh decoupled
Nexthop: 0:15:17:b0:e6:f8
Next-hop type: unicast Index: 2930 Reference: 4
Next-hop interface: ae3.200

Unfortunately, we found no way to provision such routes via BGP,
so you have to have all those in configuration :(

If there is a better workaround, I'd like to know it too :)
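
One possible way to wire up that export, sketched with assumed policy names (an instance-import policy is shown; rib-groups applied to the statics would be another option, and in practice you would likely also match the specific protected prefixes):

set policy-options policy-statement LEAK-SELF-POINTING-STATICS term statics from instance master
set policy-options policy-statement LEAK-SELF-POINTING-STATICS term statics from protocol static
set policy-options policy-statement LEAK-SELF-POINTING-STATICS term statics then accept
set policy-options policy-statement LEAK-SELF-POINTING-STATICS then reject
set routing-instances cleanL3VPN routing-options instance-import LEAK-SELF-POINTING-STATICS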


Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Tue, Apr 02, 2024 at 07:43:01PM +0300, Alexandre Snarskii via juniper-nsp wrote:
> On Tue, Apr 02, 2024 at 03:25:21PM +0000, Michael Hare via juniper-nsp wrote:
>
> Hi!
>
> Workaround that we're using (not elegant, but working): setup a
> "self-pointing" routes to directly connected destinations:
>
> set routing-options static route A.B.C.D/32 next-hop A.B.C.D

Forgot to note one thing: these self-pointing routes shall have a preference of 200 (or anything higher than BGP's 170):

set routing-options static route A.B.C.D/32 next-hop A.B.C.D
set routing-options static route A.B.C.D/32 preference 200

so, when traffic is to be diverted to scrubbing, the BGP route will be active in inet.0 and the static route will be active in cleanL3VPN:

snar@RT1.OV.SPB> show route A.B.C.D/32
inet.0: ...
+ = Active Route, - = Last Active, * = Both

A.B.C.D/32 *[BGP/170] 00:06:33, localpref 100
AS path: 65532 I, validation-state: unverified
> to Scrubbing via ae3.232
[Static/200] 00:02:22
> to A.B.C.D via ae3.200

cleanL3VPN.inet.0: ....
+ = Active Route, - = Last Active, * = Both

A.B.C.D/32 *[Static/200] 00:02:22
> to A.B.C.D via ae3.200


and the corresponding forwarding entry:

Routing table: default.inet [Index 0]
Internet:

Destination: A.B.C.D/32
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE, rt nh decoupled
Nexthop: Scrubbing
Next-hop type: unicast Index: 2971 Reference: 6
Next-hop interface: ae3.232
RPF interface: ae3.200
RPF interface: ae3.232

Destination: A.B.C.D/32
Route type: destination
Route reference: 0 Route interface-index: 431
Multicast RPF nh index: 0
P2mpidx: 0
Flags: none
Nexthop: 0:15:17:b0:e6:f8
Next-hop type: unicast Index: 2930 Reference: 3
Next-hop interface: ae3.200
RPF interface: ae3.200

[...]
Routing table: cleanL3VPN.inet [Index 6]
Internet:

Destination: A.B.C.D/32
Route type: user
Route reference: 0 Route interface-index: 0
Multicast RPF nh index: 0
P2mpidx: 0
Flags: sent to PFE, rt nh decoupled
Nexthop: 0:15:17:b0:e6:f8
Next-hop type: unicast Index: 2930 Reference: 3
Next-hop interface: ae3.200


Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Tue, 2 Apr 2024 at 18:25, Michael Hare via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:


If I understand you correctly, the problem is not that you can't copy direct into CleanVRF; the problem is that the ScrubberPE doing the clean lookup in CleanVRF has a label stack of [EgressPE TableLabel] instead of [EgressPE EgressCE]. This causes the EgressPE to do an IP lookup, which will then see the /32 advertised by the scrubber for the directly connected destination, causing a loop. What you want is an end-to-end MPLS lookup, so that the egress PE MPLS lookup resolves to the egress MAC.

I believe you could fix this with BGP-LU, without actually paying for a duplicate RIB/FIB and without opportunistically copying routes to CleanVRF; every prefix would be scrubbable by default. You'd have per-CE labels for the rest, but per-prefix labels for connected routes. I believe you would then have an [EgressPE EgressMAC_CE] label for connected routes, so each host route would have its own label, allowing the MAC rewrite without an additional local IP lookup.

I'm not sure if this is the only way, and I'm not sure whether there would be a way in CleanVRF to force each direct /32 to have a label as well, avoiding the egress IP lookup loops. One doesn't immediately spring to mind, but technically an implementation could certainly allow such a mode.
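
As a very rough sketch of only the session side such a BGP-LU design would start from (group names and addresses are assumptions, and LU-EXPORT is a hypothetical policy selecting the routes to advertise with labels; how per-prefix labels get allocated for the connected host routes is exactly the open question above and is not shown):

set protocols bgp group clean-lu type internal
set protocols bgp group clean-lu local-address 192.0.2.1
set protocols bgp group clean-lu family inet labeled-unicast
set protocols bgp group clean-lu neighbor 192.0.2.2
set protocols bgp group clean-lu export LU-EXPORT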

--
++ytti
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On 4/3/24 08:07, Saku Ytti via juniper-nsp wrote:


At my old job, we managed to do this with a virtual-router VRF that carried traffic between the scrubbing PE and the egress PE via MPLS, to avoid the IP loop.

It was quite an involved configuration, but it actually worked. It sort of mimicked the scrubbing device being able to participate in your IGP (which is something I wanted, but the scrubbing vendor was not keen to support).

Mark.
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Wed, 3 Apr 2024 at 09:37, Mark Tinka via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> At old job, we managed to do this with a virtual-router VRF that carried
> traffic between the scrubbing PE and the egress PE via MPLS, to avoid
> the IP loop.

Actually I think I'm confused. I think it will just work. Because even as the EgressPE does an IP lookup due to the table label, the IP lookup still points to the egress MAC instead of looping back, because it's doing it in the CleanVRF.
So I think it just works.

So the OP just needs to copy the direct route as-is (not as a host /32) into cleanVRF, with something like this:

routing-options {
interface-routes {
rib-groups {
cleanVRF {
import-rib [ inet.0 cleanVRF.inet.0 ];
import-policy cleanVRF:EXPORT;
}}}}

Now cleanVRF.inet.0 has the connected route behind the table label, and as the lookup is done in cleanVRF, without the Scrubber /32 route, traffic will be sent to the correct egress CE despite the egress IP lookup.
--
++ytti
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On 4/3/24 08:45, Saku Ytti wrote:


Sounds like it should work if I logic through your example, but in our case, we took a different path.

We did not use RIB Groups. Everything happened in the virtual-router
instance (including IS-IS + LDP + a dedicated Loopback interface), and
then we connected it to the global table using an lt- interface (classic
virtual-router vibes).

Basically, we cut the router in half (or doubled it, whichever way you
look at it) so that one side of the router was dealing with traffic
on-ramp to send to the scrubber for cleaning, and the other side of the
router was dealing with traffic off-ramp to send the cleaned traffic
toward the egress PE. Both sides of the router were virtually independent, even though they lived in the same physical hardware.
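
A bare-bones sketch of that kind of split, with assumed names and addressing (on MX you also need tunnel-services enabled on a PIC to get lt- interfaces; the IGP, LDP, and a dedicated loopback would then be configured inside the instance as described above):

set chassis fpc 0 pic 0 tunnel-services bandwidth 10g
set interfaces lt-0/0/0 unit 0 encapsulation ethernet
set interfaces lt-0/0/0 unit 0 peer-unit 1
set interfaces lt-0/0/0 unit 0 family inet address 198.51.100.0/31
set interfaces lt-0/0/0 unit 1 encapsulation ethernet
set interfaces lt-0/0/0 unit 1 peer-unit 0
set interfaces lt-0/0/0 unit 1 family inet address 198.51.100.1/31
set routing-instances scrub-offramp instance-type virtual-router
set routing-instances scrub-offramp interface lt-0/0/0.1

Here unit 0 stays in the global table and unit 1 lives in the hypothetical scrub-offramp virtual-router, giving the "two halves" of the router an internal point-to-point link.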

You could achieve the same using two physical routers, but with the
available tech., it would have been a waste.

We did this with an MX204, which means there could be a PFE penalty down
the line if traffic grows, but I did not spend too much time digging
into that, as at the time, we were only dealing with about 40Gbps of
traffic, and needed to get the setup going ASAP.

Mark.
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Wed, 3 Apr 2024 at 09:45, Saku Ytti <saku@ytti.fi> wrote:

> Actually I think I'm confused. I think it will just work. Because even
> as the EgressPE does IP lookup due to table-label, the IP lookup still
> points to egressMAC, instead looping back, because it's doing it in
> the CleanVRF.
> So I think it just works.

> routing-options {
> interface-routes {
> rib-groups {
> cleanVRF {
> import-rib [ inet.0 cleanVRF.inet.0 ];
> import-policy cleanVRF:EXPORT;
> }}}}

This isn't exactly correct. You need to put the cleanVRF rib-group under interface-routes and close the stanzas properly.

Anyhow, I'm 90% sure this will just work, and I'm pretty sure I've done it.
The confusion I had was about the scrubbing route that on the clean side is already a host /32. For this, I can't figure out a cleanVRF solution, but a BGP-LU solution exists even for this problem.
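
For concreteness, the corrected shape would be something along these lines (same group and policy names as above; the rib-group is referenced under interface-routes and defined separately under rib-groups):

routing-options {
    interface-routes {
        rib-group {
            inet cleanVRF;
        }
    }
    rib-groups {
        cleanVRF {
            import-rib [ inet.0 cleanVRF.inet.0 ];
            import-policy cleanVRF:EXPORT;
        }
    }
}

Michael's config in the next message has the same structure.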


--
++ytti
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
Saku, Mark-

Thanks for the responses. Unless I'm mistaken, short of specifying a selective import policy, I think I'm already doing what Saku suggests; see the relevant config snippet below. Our clean VRF is L3VPN-4205. But after I saw the lack of MAC-based next hops, I started searching to see if there was a protocol other than direct that I wasn't aware of. I intend to take a look at Alexandre's workaround to understand/test it, I just haven't gotten there yet.

I was able to get FBF via dirtyVRF working quickly in the meantime while I figure out how to salvage the longest-prefix approach.

-Michael

==/==

@ # show routing-options | display inheritance no-comments
...
interface-routes {
    rib-group {
        inet rib-interface-routes-v4;
        inet6 rib-interface-routes-v6;
    }
}
rib-groups {
    rib-interface-routes-v4 {
        import-rib [ inet.0 L3VPN-4205.inet.0 ];
    }
    ...
    rib-interface-routes-v6 {
        import-rib [ inet6.0 L3VPN-4205.inet6.0 ];
    }
    ...
}

Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
This might be grounds for a feature request to Juniper, if there isn't
already some magic toggle to MakeItGo.

But yeah, the forwarding table looks suspect, as if it'll do a table lookup, then fail to discover the more-specific host route, and discard, as the ARP entries are not copied. And yeah, Alexandre's workaround seems like a cute way to force the host route into the VRF, if provisioning-intensive.

I think two features would be nice to have:

a) a way to copy the ARP/ND entries from inet to the VRF (if not already possible)
b) a feature to assign labels to each ARP/ND host route, to avoid the egress PE IP lookup (this labeled route would only be imported on the interface facing the scrubber's clean side; the rest of the network sees the unlabeled direct aggregate)

On Wed, 3 Apr 2024 at 17:04, Michael Hare <michael.hare@wisc.edu> wrote:



--
++ytti
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
>
> but a BGP-LU solution exists even for this problem.
>

My first thought was also to use BGP-LU.

On Wed, Apr 3, 2024 at 2:58 AM Saku Ytti via juniper-nsp <juniper-nsp@puck.nether.net> wrote:

Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On 4/3/24 18:06, Tom Beecher wrote:

>
> My first thought was also to use BGP-LU.

Would a virtual router with an lt- interface connecting the VRF to the
global table be too expensive?

Mark.
Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
Alexandre,

Thanks for your emails. I finally got around to trying it myself; it definitely works! I first "broke" my A.B.C.D destination and =then= added a static. When I reproduced this, instead of putting the static route into inet.0, I chose to install it in my cleanVRF, which gets around the admin-distance issues. Is there any reason you install the routes in the global table instead of cleanVRF that I'm overlooking?
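
In set form, the variant described here amounts to something like the following (instance name as in the output below; the /32 is assumed to resolve against the covering direct route already present in the VRF, as Alexandre notes in the next message, so no preference tweak is needed there):

set routing-instances cleanVRF routing-options static route A.B.C.D/32 next-hop A.B.C.D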

I'm curious to know how safe it is to rely on this working in the future. How long have you been using this trick? I'll probably follow up with our Juniper support channels, as Saku suggests; maybe something even better can come out of this.

Thanks again,
-Michael

=========/========

@# run show route A.B.C.D

inet.0: 933009 destinations, 2744517 routes (932998 active, 0 holddown, 360 hidden)
+ = Active Route, - = Last Active, * = Both

A.B.C.D/32 *[BGP/170] 00:24:03, localpref 100, from 2.3.4.5
AS path: I, validation-state: unverified
> to 5.6.7.8 via et-0/1/10.3099

cleanVRF.inet.0: 319 destinations, 1179 routes (318 active, 0 holddown, 1 hidden)
Limit/Threshold: 5000/4000 destinations
+ = Active Route, - = Last Active, * = Both

A.B.C.D/32 *[Static/5] 00:07:36
> to A.B.C.D via ae17.3347

@# run show route forwarding-table destination A.B.C.D
Routing table: default.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
A.B.C.D/32 user 0 indr 1048588 3
5.6.7.8 ucst 981 5 et-0/1/10.3099
A.B.C.D/32 dest 0 0:50:56:b3:4f:fe ucst 1420 3 ae17.3347

Routing table: cleanVRF.inet
Internet:
Destination Type RtRef Next hop Type Index NhRef Netif
A.B.C.D/32 user 0 0:50:56:b3:4f:fe ucst 1420 3 ae17.3347

Re: L3VPNs and on-prem DDoS scrubbing architecture [ In reply to ]
On Thu, Apr 04, 2024 at 04:15:03PM +0000, Michael Hare wrote:
> Alexandre,
>
> Thanks for your emails. I finally got around to trying it myself;
> it definitely works! I first "broke" my A.B.C.D destination and =then=
> added a static. When I reproduced this, instead of putting the static
> route into inet.0 I chose to install in my cleanVRF, which gets around
> the admin distance issues. Any reason you install the routes in global
> instead of cleanVRF that I'm overlooking?

I guess you have not only the static A.B.C.D/32 but also the covering direct A.B.C.0/24 in your cleanVRF? Looks like I overlooked that, with the help of the imported direct route, the /32 can be resolved into a MAC inside the VRF (not yet tested, but it should work).

> I'm curious to know how safe it is to rely on working in the future.
> How long have you been using this trick?

The trick with self-pointing routes is from the times when there was no 'vrf-table-label' option in routing instances (there was no other way to provide VRF access without a CE router). The trick of redistributing them into cleanVRF has been in use since 2018 or so.
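
For reference, the option mentioned above is a single knob on the routing instance; a sketch, assuming an instance named cleanL3VPN:

set routing-instances cleanL3VPN vrf-table-label

With vrf-table-label the PE allocates one label for the whole VRF table and performs an IP lookup in the VRF on packets arriving with that label.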

> I'll probably follow up with our Juniper support channels, as Saku
> suggests, maybe something even better can come out of this.
>
> Thanks again,
> -Michael
>
> =========/========
>
> @# run show route A.B.C.D
>
> inet.0: 933009 destinations, 2744517 routes (932998 active, 0 holddown, 360 hidden)
> + = Active Route, - = Last Active, * = Both
>
> A.B.C.D/32 *[BGP/170] 00:24:03, localpref 100, from 2.3.4.5
> AS path: I, validation-state: unverified
> > to 5.6.7.8 via et-0/1/10.3099
>
> cleanVRF.inet.0: 319 destinations, 1179 routes (318 active, 0 holddown, 1 hidden)
> Limit/Threshold: 5000/4000 destinations
> + = Active Route, - = Last Active, * = Both
>
> A.B.C.D/32 *[Static/5] 00:07:36
> > to A.B.C.D via ae17.3347
>
> @# run show route forwarding-table destination A.B.C.D
> Routing table: default.inet
> Internet:
> Destination Type RtRef Next hop Type Index NhRef Netif
> A.B.C.D/32 user 0 indr 1048588 3
> 5.6.7.8 ucst 981 5 et-0/1/10.3099
> A.B.C.D/32 dest 0 0:50:56:b3:4f:fe ucst 1420 3 ae17.3347
>
> Routing table: cleanVRF.inet
> Internet:
> Destination Type RtRef Next hop Type Index NhRef Netif
> A.B.C.D/32 user 0 0:50:56:b3:4f:fe ucst 1420 3 ae17.3347
>
> > -----Original Message-----
> > From: Alexandre Snarskii <snar@snar.spb.ru>
> > Sent: Tuesday, April 2, 2024 12:20 PM
> > To: Michael Hare <michael.hare@wisc.edu>
> > Cc: juniper-nsp@puck.nether.net
> > Subject: Re: [j-nsp] L3VPNs and on-prem DDoS scrubbing architecture
> >
> > On Tue, Apr 02, 2024 at 07:43:01PM +0300, Alexandre Snarskii via juniper-
> > nsp wrote:
> > > On Tue, Apr 02, 2024 at 03:25:21PM +0000, Michael Hare via juniper-nsp
> > wrote:
> > >
> > > Hi!
> > >
> > > Workaround that we're using (not elegant, but working): setup a
> > > "self-pointing" routes to directly connected destinations:
> > >
> > > set routing-options static route A.B.C.D/32 next-hop A.B.C.D
> >
> > Forgot to note one thing: these self-pointing routes shall have
> > preference of 200 (or anytning more than BGP's 170):
> >
> > set routing-options static route A.B.C.D/32 next-hop A.B.C.D
> > set routing-options static route A.B.C.D/32 preference 200
> >
> > so, in case when traffic shall be diverted to scrubbing, bgp route
> > will be active in inet.0 and static route will be active in cleanL3VPN:
> >
> > snar@RT1.OV.SPB> show route A.B.C.D/32
> > inet.0: ...
> > + = Active Route, - = Last Active, * = Both
> >
> > A.B.C.D/32 *[BGP/170] 00:06:33, localpref 100
> > AS path: 65532 I, validation-state: unverified
> > > to Scrubbing via ae3.232
> > [Static/200] 00:02:22
> > > to A.B.C.D via ae3.200
> >
> > cleanL3VPN.inet.0: ....
> > + = Active Route, - = Last Active, * = Both
> >
> > A.B.C.D/32 *[Static/200] 00:02:22
> > > to A.B.C.D via ae3.200
> >
> >
> > and the corresponding forwarding entry:
> >
> > Routing table: default.inet [Index 0]
> > Internet:
> >
> > Destination: A.B.C.D/32
> > Route type: user
> > Route reference: 0 Route interface-index: 0
> > Multicast RPF nh index: 0
> > P2mpidx: 0
> > Flags: sent to PFE, rt nh decoupled
> > Nexthop: Scrubbing
> > Next-hop type: unicast Index: 2971 Reference: 6
> > Next-hop interface: ae3.232
> > RPF interface: ae3.200
> > RPF interface: ae3.232
> >
> > Destination: A.B.C.D/32
> > Route type: destination
> > Route reference: 0 Route interface-index: 431
> > Multicast RPF nh index: 0
> > P2mpidx: 0
> > Flags: none
> > Nexthop: 0:15:17:b0:e6:f8
> > Next-hop type: unicast Index: 2930 Reference: 3
> > Next-hop interface: ae3.200
> > RPF interface: ae3.200
> >
> > [...]
> > Routing table: cleanL3VPN.inet [Index 6]
> > Internet:
> >
> > Destination: A.B.C.D/32
> > Route type: user
> > Route reference: 0 Route interface-index: 0
> > Multicast RPF nh index: 0
> > P2mpidx: 0
> > Flags: sent to PFE, rt nh decoupled
> > Nexthop: 0:15:17:b0:e6:f8
> > Next-hop type: unicast Index: 2930 Reference: 3
> > Next-hop interface: ae3.200
> >
> >
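Putting the pieces together, a minimal sketch of the whole workaround might look like the following. The rib-group name and its application to static routes are assumptions on my part; whatever mechanism already leaks inet.0 into cleanL3VPN (an instance-import policy, for example) would do the same job:

  # self-pointing static: loses to BGP/170 in inet.0, wins in cleanL3VPN
  set routing-options static route A.B.C.D/32 next-hop A.B.C.D
  set routing-options static route A.B.C.D/32 preference 200
  # copy statics from inet.0 into the clean table as well
  set routing-options rib-groups STATIC-TO-CLEAN import-rib [ inet.0 cleanL3VPN.inet.0 ]
  set routing-options static rib-group STATIC-TO-CLEAN

A rib-group applied at the [routing-options static] level copies every static route, not just the protected /32s, so an import-policy on the rib-group may be wanted to restrict what gets leaked.
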
> > >
> > > and export these to cleanL3VPN. Resulting forwarding-table:
> > >
> > > Routing table: default.inet [Index 0]
> > > Internet:
> > >
> > > Destination: A.B.C.D/32
> > > Route type: user
> > > Route reference: 0 Route interface-index: 0
> > > Multicast RPF nh index: 0
> > > P2mpidx: 0
> > > Flags: sent to PFE, rt nh decoupled
> > > Nexthop: 0:15:17:b0:e6:f8
> > > Next-hop type: unicast Index: 2930 Reference: 4
> > > Next-hop interface: ae3.200
> > > RPF interface: ae3.200
> > >
> > > [...]
> > >
> > > Routing table: cleanL3VPN.inet [Index 6]
> > > Internet:
> > >
> > > Destination: A.B.C.D/32
> > > Route type: user
> > > Route reference: 0 Route interface-index: 0
> > > Multicast RPF nh index: 0
> > > P2mpidx: 0
> > > Flags: sent to PFE, rt nh decoupled
> > > Nexthop: 0:15:17:b0:e6:f8
> > > Next-hop type: unicast Index: 2930 Reference: 4
> > > Next-hop interface: ae3.200
> > >
> > > Unfortunately, we found no way to provision such routes via BGP,
> > > so you have to keep all of them in the configuration :(
> > >
> > > If there is a better workaround, I'd like to know it too :)
> > >
> > >
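Since the protected /32s have to live in the configuration, one way to keep them manageable (a sketch; the group name is made up) is to collect them in a single configuration group, so that whatever adds or removes an always-on protection only ever touches that one group:

  # all self-pointing statics kept in one configuration group
  set groups PROTECTED-HOSTS routing-options static route A.B.C.D/32 next-hop A.B.C.D
  set groups PROTECTED-HOSTS routing-options static route A.B.C.D/32 preference 200
  set apply-groups PROTECTED-HOSTS
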
> > > > [...]
> > > > ===/===
> > > >
> > > > @$myrouter> show route forwarding-table destination 1.2.3.4 extensive
> > > > Apr 02 08:39:10
> > > > Routing table: default.inet [Index 0]
> > > > Internet:
> > > >
> > > > Destination: 1.2.3.4/32
> > > > Route type: user
> > > > Route reference: 0 Route interface-index: 0
> > > > Multicast RPF nh index: 0
> > > > P2mpidx: 0
> > > > Flags: sent to PFE
> > > > Next-hop type: indirect Index: 1048588 Reference: 3
> > > > Nexthop: 5.6.7.8
> > > > Next-hop type: unicast Index: 981 Reference: 3
> > > > Next-hop interface: et-0/1/10.3099
> > > >
> > > > Destination: 1.2.3.4/32
> > > > Route type: destination
> > > > Route reference: 0 Route interface-index: 85
> > > > Multicast RPF nh index: 0
> > > > P2mpidx: 0
> > > > Flags: none
> > > > Nexthop: 0:50:56:b3:4f:fe
> > > > Next-hop type: unicast Index: 1562 Reference: 1
> > > > Next-hop interface: ae17.3347
> > > >
> > > > Routing table: cleanL3VPN.inet [Index 21]
> > > > Internet:
> > > >
> > > > Destination: 1.2.3.0/26
> > > > Route type: user
> > > > Route reference: 0 Route interface-index: 0
> > > > Multicast RPF nh index: 0
> > > > P2mpidx: 0
> > > > Flags: sent to PFE, rt nh decoupled
> > > > Next-hop type: table lookup Index: 1 Reference: 40
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp