Mailing List Archive

BGP timer
Hello everyone,

Having difficulty finding a way to prevent BGP from re-establishing after a
BFD down detect. I am looking for a way to keep the session from
re-establishing for a configured amount of time (say 5 minutes) to ensure
we don't have a flapping session for a link having issues.

We asked JTAC, but they came back with the reverse, which would keep the
session up for a certain amount of time before it drops (Not what we want).

Is there a way to do this? We are using MX204 routers and the latest
23.4R1.9 Junos.

Best,

-Lee
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: BGP timer [ In reply to ]
Hi Lee

Would flap damping fix it for you?

I also find that BFD can cause more problems than it fixes if you go too
aggressive with it!

regards

Sean


Re: BGP timer [ In reply to ]
Hello Lee,

At least for link flapping issues (but not other causes of session flapping), you could set the interface hold-time:

set interfaces xy hold-time up 300000

The value is in milliseconds, so this would delay the link coming up by 5 minutes.

kind regards
Rolf

Re: BGP timer [ In reply to ]
The only way to achieve something like this is either the link hold-time or
an event script. Depending on your requirements, you can use either option.

Re: BGP timer [ In reply to ]
On Sat, Apr 27, 2024 at 6:34 AM Sean Clarke via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> I also find that BFD can cause more problems than it fixes if you go too
> aggressive with it !

Concur here. BFD has its uses in specific circumstances, but it's almost
always much better to rely on interface state change and hold-time up FOO.
Re: BGP timer [ In reply to ]
On Sat, 27 Apr 2024 at 14:29, Rolf Hanßen via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> at least for link flapping issues (but not other session flapping reasons) you could set the hold-time:
> set interfaces xy hold-time up 300000

Since 14.1, Junos has caught up with Cisco and implements exponential
back-off for interface damping. So you don't have to impose a static
penalty as above, but can penalise interfaces that are actually flapping,
instead of killing convergence on the first transition.

But indeed this doesn't really address what OP is asking, and I don't
think, outside scripting, there is a direct solution to what OP wants.
Clearly any vendor could implement exponential back-off damping to any
protocol which has up and down state, and they could write the code
once, and reuse it for everything, so it's not a tall order at all.
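
A minimal sketch of what exponential back-off damping means in general, for
readers who have not met it. This is a toy model, not Junos's or Cisco's
implementation: the class name, the per-flap penalty, the thresholds and the
half-life are all hypothetical values chosen for illustration. Each down event
adds a fixed penalty, the penalty decays with a half-life, and the protocol is
held down only while the accumulated penalty is above the thresholds.

```python
from dataclasses import dataclass


@dataclass
class Damper:
    """Toy exponential back-off damper; every default here is illustrative."""
    penalty_per_flap: float = 1000.0
    suppress: float = 1500.0   # start suppressing at or above this penalty
    reuse: float = 750.0       # stop suppressing once decayed to or below this
    half_life: float = 15.0    # seconds for the penalty to halve
    penalty: float = 0.0
    last_update: float = 0.0
    suppressed: bool = False

    def _decay(self, now: float) -> None:
        # Halve the penalty once per elapsed half-life, then re-check reuse.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.suppressed and self.penalty <= self.reuse:
            self.suppressed = False

    def flap(self, now: float) -> bool:
        """Record a down event at time `now`; return True if now suppressed."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True
        return self.suppressed

    def is_suppressed(self, now: float) -> bool:
        """Report the suppression state at time `now` (seconds)."""
        self._decay(now)
        return self.suppressed


if __name__ == "__main__":
    d = Damper()
    print(d.flap(0))            # single down event: below the threshold
    print(d.flap(5))            # second event shortly after: suppressed
    print(d.is_suppressed(30))  # penalty has decayed below the reuse threshold
```

With these made-up numbers a single isolated down event never triggers
suppression, while two events within a few seconds do, which is the trade-off
between convergence and stability being argued for here.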

--
++ytti
Re: BGP timer [ In reply to ]
On 2024-04-27 09:44, Lee Starnes via juniper-nsp wrote:

> Having difficulty finding a way to prevent BGP from re-establishing after a
> BFD down detect. I am looking for a way to keep the session from
> re-establishing for a configured amount of time (say 5 minutes) to ensure
> we don't have a flapping session for a link having issues.

Isn't that what the holddown-interval setting does? It is limited
to 255 seconds (4 minutes 15 seconds), though, and for BGP it is
only allowed for EBGP sessions, not iBGP sessions.

The documentation also says that you need to set holddown-interval
on *both* ends. I'm guessing that the holddown only prevents your
end from initiating the BGP session, but that it will still accept
a connection initiated from the other end.

https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/bfd-liveness-detection-edit-protocols-bgp.html

I haven't used BFD for BGP myself, though, only for static routes
on a couple of links. But there I do use holddown-interval, and
at least when I set it up several years ago, it seemed to do what
I expected: after the link and the BFD session came up again, it
waited (in my case) 15 seconds before enabling my static route
again.
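
A small sketch of the behaviour described above, assuming the holddown means
"the BFD session must stay up for the whole interval before the client (BGP or
a static route) is told it is up". The function name and the event format are
hypothetical; only the 255-second ceiling comes from this thread.

```python
from typing import Iterable, Optional, Tuple

MAX_HOLDDOWN_S = 255  # upper limit mentioned in this thread


def notify_up_time(events: Iterable[Tuple[float, str]],
                   holddown_s: float) -> Optional[float]:
    """Return when the client would first be told "up", or None if never.

    events: (timestamp_seconds, "up" | "down") pairs in time order.
    The session must stay up for holddown_s seconds without a "down".
    """
    if not 0 <= holddown_s <= MAX_HOLDDOWN_S:
        raise ValueError("holddown must be between 0 and 255 seconds")
    up_since: Optional[float] = None
    for ts, state in events:
        # If the timer already ran out before this event, we are done.
        if up_since is not None and ts - up_since >= holddown_s:
            return up_since + holddown_s
        if state == "up":
            if up_since is None:
                up_since = ts
        else:
            up_since = None  # any "down" restarts the wait
    return None if up_since is None else up_since + holddown_s


if __name__ == "__main__":
    flappy = [(0, "down"), (10, "up"), (20, "down"), (30, "up")]
    print(notify_up_time(flappy, 15))  # the brief "up" at t=10 is ignored
```

Under this model a 5-minute wait (300 seconds) is rejected, which matches the
point above that the setting tops out at 255 seconds.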


--
Thomas Bellman, National Supercomputer Centre, Linköping Univ., Sweden
"We don't understand the software, and sometimes we don't understand
the hardware, but we can *see* the blinking lights!"
Re: BGP timer [ In reply to ]
BFD holddown is the right feature for this.
WARNING: BFD holddown is known to be problematic between Juniper and Cisco implementations due to where each starts its state machines for BFD vs. BGP.

It was a partial motivation for BGP BFD strict:
https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode

BGP BFD strict was added in 23.2R1.

-- Jeff


Re: BGP timer [ In reply to ]
On Sun, 28 Apr 2024 at 21:20, Jeff Haas via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> BFD holddown is the right feature for this.
> WARNING: BFD holddown is known to be problematic between Juniper and Cisco implementations due to where each start their state machines for BFD vs. BGP.
>
> It was a partial motivation for BGP BFD strict:
> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode
>
> BGP BFD strict was added in 23.2R1.

But why is this desirable? Why do I want to prioritise stability
always, instead of prioritising convergence on well-behaved interfaces
and stability on poorly behaved interfaces?

If I can pick just one, I'll prioritise convergence every time for both.

That is, if I cannot have exponential back-off, I won't kill
convergence 'just in case', because it's not me who will feel the pain
of my decisions, it's my customers. Netengs and particularly infosec
people quite often are unnecessarily conservative in their policies,
because they don't have skin in the game, they feel the upside, but
not the downside.

--
++ytti
Re: BGP timer [ In reply to ]
On 4/29/24 08:31, Saku Ytti via juniper-nsp wrote:

> But why is this desirable? Why do I want to prioritise stability
> always, instead of prioritising convergence on well-behaved interfaces
> and stability on poorly behaved interfaces?
>
> If I can pick just one, I'll prioritise convergence every time for both.
>
> That is, if I cannot have exponential back-off, I won't kill
> convergence 'just in case', because it's not me who will feel the pain
> of my decisions, it's my customers. Netengs and particularly infosec
> people quite often are unnecessarily conservative in their policies,
> because they don't have skin in the game, they feel the upside, but
> not the downside.

Over the decades, I've had a handful of customers that preferred uptime
to convergence, because they were measured on that by their boss,
organization or auditors.

You know - the kind of people that would refuse to reboot a router to
implement new code, because "Last Reboot: 5y, 6w ago" looks far better
than "Last Reboot: 15min ago" - those people.

Protocols staying up despite the underlay being unstable means traffic
dies and users are not happy. It's really that simple.

Mark.
Re: BGP timer [ In reply to ]
Hi,

On Mon, Apr 29, 2024 at 08:52:17AM +0200, Mark Tinka via juniper-nsp wrote:
> Protocols staying up despite the underlay being unstable means traffic dies
> and users are not happy. It's really that simple.

Yes, but that's a slightly different tangent. If the underlay is unstable,
I think we're all in agreement that higher layers should not send packets
there.

The interesting question is "how to react when underlay seems to be stable
again"? "bring up upper layers right away, with exponential decay flap
dampening" or "always wait 15 minutes to be SURE it's stable!!!"...

I go for flap dampening and taking ports into service quickly ("one of the
trainees might be on a rampage, killing the other uplink right next") ;-)

gert

--
"If was one thing all people took for granted, was conviction that if you
feed honest figures into a computer, honest figures come out. Never doubted
it myself till I met a computer with a sense of humor."
Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany gert@greenie.muc.de
Re: BGP timer [ In reply to ]
On 4/29/24 09:06, Gert Doering wrote:

> Yes, but that's a slightly different tangent. If the underlay is unstable,
> I think we're all in agreement that higher layers should not send packets
> there.

It comes down to how you classify stable (well-behaved) vs. unstable
(misbehaving) interfaces.

This will vary for networks, backbones, providers, etc.

In many cases, manual intervention will be required because even the
most aggressive or the most conservative dampening settings will not be
able to account for what stable and unstable interfaces mean. I suppose
one could "AI" it, but that's outside the realm of my abilities.

In other words, a one-size-fits-all is unlikely to work here. Plenty of
tools exist, and I think it is up to the operator to educate themselves
on all of them and make the best decision for a given scenario.

Mark.
Re: BGP timer [ In reply to ]
On Mon, 29 Apr 2024 at 10:07, Gert Doering via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> The interesting question is "how to react when underlay seems to be stable
> again"? "bring up upper layers right away, with exponential decay flap
> dampening" or "always wait 15 minutes to be SURE it's stable!!!"...

100%, what Mark implied was not what I was trying to communicate.
Sure, go ahead and damp flapping interfaces, but to penalise on first
down event, when most of them are just that, one event, to me, is just
bad policy made by people who don't feel the cost.

--
++ytti
Re: BGP timer [ In reply to ]
On Mon, 29 Apr 2024 at 10:13, Mark Tinka via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:

> It comes down to how you classify stable (well-behaved) vs. unstable
> (misbehaving) interfaces.

You are making this unnecessarily complicated.

You could simply configure that first down event doesn't add enough
points to damp, 2nd does. And you are wildly better off.

Perfect is the enemy of done and kills all movement towards better.
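
To make "first down event doesn't add enough points to damp, 2nd does"
concrete, here is a rough sizing sketch. It assumes a generic damping scheme
with a fixed penalty per down event and exponential decay; the function names
and the numbers in the example are hypothetical, not any vendor's defaults.

```python
import math


def flaps_to_suppress(penalty_per_flap: float, suppress: float) -> int:
    """Back-to-back down events needed to reach the suppress threshold.

    Decay between events is ignored, so this is a lower bound.
    """
    return math.ceil(suppress / penalty_per_flap)


def seconds_until_reuse(penalty: float, reuse: float, half_life_s: float) -> float:
    """Time for `penalty` to decay exponentially down to `reuse`."""
    if penalty <= reuse:
        return 0.0
    return half_life_s * math.log2(penalty / reuse)


if __name__ == "__main__":
    # A threshold above one penalty but at or below two penalties means
    # the first event passes and the second one damps.
    print(flaps_to_suppress(1000, 1500))
    # How long a penalty of 2000 takes to fall back to a reuse level of 1000.
    print(seconds_until_reuse(2000, 1000, 15))
```

Picking a suppress threshold between one and two per-flap penalties is all it
takes to get the behaviour described: a lone event costs nothing, a repeat
offender is held down for a time that grows with how often it flaps.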

--
++ytti
Re: BGP timer [ In reply to ]
On 4/29/24 09:15, Saku Ytti wrote:

> You are making this unnecessarily complicated.
>
> You could simply configure that first down event doesn't add enough
> points to damp, 2nd does. And you are wildly better off.
>
> Perfect is the enemy of done and kills all movement towards better.

Fair enough.

My perspective is from this side of the world where backbone is not the
greatest experience in most of the inland markets. But I grant that such
scenarios are not the norm in more mature regions.

Mark.
Re: BGP timer [ In reply to ]
On 4/29/24 09:13, Saku Ytti wrote:

> 100%, what Mark implied was not what I was trying to communicate.
> Sure, go ahead and damp flapping interfaces, but to penalise on first
> down event, when most of them are just that, one event, to me, is just
> bad policy made by people who don't feel the cost.

Yes, agree with this. Didn't mean to cause a mix-up.

As before, my perspective is from circuits where it can be continuous
events in, let's say, a 12-hour period every few moments. Yes, this is
not the norm in most mature markets, but we have had to deal with this
sort of thing several times a year, down here, and it can get complex
especially if the route you are dealing with has no suitable alternative
options other than going round the continent and back.

Mark.
Re: BGP timer [ In reply to ]
On 4/29/24, 02:41, "Saku Ytti" <saku@ytti.fi> wrote:
> On Sun, 28 Apr 2024 at 21:20, Jeff Haas via juniper-nsp
> > BFD holddown is the right feature for this.
>
> But why is this desirable? Why do I want to prioritise stability
> always, instead of prioritising convergence on well-behaved interfaces
> and stability on poorly behaved interfaces?

This feature is "don't bring up BGP on interfaces that aren't stable enough to
let BFD stay up". The intended use case is when you have an interface noisy
enough that TCP can fight its way through keeping BGP up... enough, but not
stable enough that you'd really want to forward over it. The assessment for
that is "BFD will go down in short order".

> That is, if I cannot have exponential back-off, I won't kill
> convergence 'just in case', because it's not me who will feel the pain
> of my decisions, it's my customers. Netengs and particularly infosec
> people quite often are unnecessarily conservative in their policies,
> because they don't have skin in the game, they feel the upside, but
> not the downside.

People make decisions that are appropriate for their networks. Using BFD on
your BGP sessions is probably overkill *for you*. Don't do that then.

-- Jeff

Re: BGP timer [ In reply to ]
Thank you everyone for the replies on this topic. For us, we would rather
keep a link down longer when it has an issue and goes down than to have it
come back up and then go down again. This is because the flapping is very
destructive to live video and VoIP. Having several diverse backbone
connections, we can tolerate having one down. This topic came up because we
have had one of our backbone carriers become problematic and the flapping
caused by their issues caused a lot of damage in terms of customer
relations. So we certainly would want to let a failed link sit failed for a
little bit after it restores before bringing BGP back up.

As for BFD and stability with aggressive settings, we don't run too
aggressive on this, but certainly do require it because the physical links
have not gone down in our cases when we have had issues, causing a larger
delay in killing the routes for that path. Not being able to rely on link
state failure leaves us with requiring the use of BFD.

Again, thanks for all the replies everyone. I will check out the BFD
holddown.

-Lee

Re: BGP timer [ In reply to ]
On 4/29/24 17:42, Lee Starnes via juniper-nsp wrote:
> As for BFD and stability with aggressive settings, we don't run too
> aggressive on this, but certainly do require it because the physical links
> have not gone down in our cases when we have had issues, causing a larger
> delay in killing the routes for that path. Not being able to rely on link
> state failure leaves us with requiring the use of BFD.

Is this link carrying eBGP or iBGP?

If the latter, have you considered using BFD to track the IGP instead of
BGP?

Mark.
Re: BGP timer [ In reply to ]
Hello Mark,

Thanks for asking. This is eBGP, and the issue is that there have been
failures where the link itself does not go down, so link state can't tell
us that the routes should be removed. The BGP session has stayed up in some
cases as well, yet no traffic passes.

Re: BGP timer [ In reply to ]
On 5/3/24 19:54, Lee Starnes wrote:

> Hello Mark,
>
> Thanks for asking. This is eBGP and the issue is that there have been
> failures whereby the link does not fail, and thus can't track that
> routes should be removed. BGP session has stayed up in some cases as
> well, yet no traffic.

Yeah, if the physical media is unable to indicate link-layer failure to
the system, BFD is probably your best bet at detecting link failure.

Have you worked with your eBGP neighbor (I'm guessing your transit
provider) to see what they can do to improve link failure detection at
the link layer? Perhaps avoid an intermediate Ethernet switch between
both your routers, etc.

Mark.