Mailing List Archive

RSVP-TE broken between pre and post 16.1 code?
Hi gents,

Just wondering if anyone experienced RSVP-TE incompatibility issues when
moving from pre 16.1 code to post 16.1 code.
Didn't get much out of Juniper folks thus far so I figured I'll ask here as
well.

The problem we're facing: with 17 code as the LSP head-end and 15 code as
the tail-end, signalling works, but in the opposite direction, 17/15-to-17
(basically any case where 17 is the LSP tail-end), the LSP signalling fails.
The trace reveals that the 17 box gets the PATH message for a bunch of LSPs,
accepts it (yes, reduction and acks are used), creates the session, then
deletes it right away for some reason.
Our testing suggests there are two workarounds for this:
1. You might be aware that in 16.1, among other RSVP-TE changes, the default
refresh-time (governing generation of successive Path/Resv refresh messages)
changed to 1200s. Contrary to what you might think, setting it to 1200 on
the 15 side won't do; it has to be less (e.g. 1199s).
2. If you want to keep the refresh time at 1200 or higher, then the other
option, strangely enough, is to disable CSPF on the affected LSPs (I didn't
know that SPF/CSPF changes the contents of the PATH msg such that in one
case 17 code is cool with the PATH msg and in the other case not).

Would appreciate any pointers.


adam


_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: RSVP-TE broken between pre and post 16.1 code?
I have MXs and ACXs all running RSVP, ranging in versions from 13 to
18. I haven't had any issues.
What platform are you using?


Re: RSVP-TE broken between pre and post 16.1 code?
Adam-

Have you accounted for this behavioral change?

https://kb.juniper.net/InfoCenter/index?page=content&id=KB32883&pmv=print&actp=LIST&searchid=&type=currentpaging

-Michael

Re: RSVP-TE broken between pre and post 16.1 code?
> From: Michael Hare <michael.hare@wisc.edu>
> Sent: Friday, June 28, 2019 7:02 PM
>
> Adam-
>
> Have you accounted for this behavioral change?
>
> https://kb.juniper.net/InfoCenter/index?page=content&id=KB32883&pmv=print&actp=LIST&searchid=&type=currentpaging
>
Thank you, yes, we're aware of that, but even with this the issue is
still present unless the refresh timer is <1200 or CSPF is disabled.

adam


Re: RSVP-TE broken between pre and post 16.1 code?
We have seen similar issues with single hop LSPs and the stats being wrong unless we explicitly configure the single hop LSPs differently with UHP.

Sent from my iCar

Re: RSVP-TE broken between pre and post 16.1 code?
I had that issue between QFX5110's and MX's. Some feature at the time
forced me to run 17.4 on the QFX's and they wouldn't establish LSP's with
older MX80's in our fleet that were still running 14.2.

I had to either downgrade the QFX's to 15.1 or upgrade the MX's to 16.1 or
greater. I ended up upgrading the MX's as they were overdue anyway.

Simon.




--

Dicko.
Re: RSVP-TE broken between pre and post 16.1 code?
Looks like the PR about this is now available: PR1443811 «RSVP refresh-timer interoperability between 15.1 and 16.1+».

«Path message with long refresh interval (equal to or more than 20 minutes) from a node that does not support Refresh-interval Independent RSVP (RI-RSVP) is dropped by the receiver with RI-RSVP.»


Re: RSVP-TE broken between pre and post 16.1 code?
> From: Nathan Ward <nward@daork.net>
> Sent: Friday, August 16, 2019 8:39 AM
>
> > On 1/07/2019, at 9:59 PM, adamv0025@netconsultings.com wrote:
> >
> >> From: Michael Hare <michael.hare@wisc.edu>
> >> Sent: Friday, June 28, 2019 7:02 PM
> >>
> >> Adam-
> >>
> >> Have you accounted for this behavioral change?
> >>
> >>
> >> https://kb.juniper.net/InfoCenter/index?page=content&id=KB32883&pmv=print&actp=LIST&searchid=&type=currentpaging
> >>
> > Thank you, yes please we're aware of that, but even with this the
> > issue is still present if the refresh timer is not <1200 or CSPF is enabled.
>
> I’m confused by this one - what’s the refresh timer and CSPF got to do with
> it?
>
Not much; it's a bug. It appears from the logs that the path message has "something different" in the ERO when CSPF is enabled, triggering the bug ...

> LSPs on 16.1 will do self-ping after they come up before they put traffic on
> them. The lo0 filter has to permit that, or you’ve got to disable self-ping.
>
LSPs will do self ping when switching onto a new/optimized path, not when the LSP is first brought up -which in this case doesn’t happen.

> Or am I parsing this weird, and you’re saying this is still an issue even with the
> self ping disabled (or permitted in filters), under those conditions?
>
Yes that is correct, this problem appears even before the self-ping is engaged (the LSP is not even signalled -the RESV msg is never sent as a response to PATH msg in this case).

adam

Re: RSVP-TE broken between pre and post 16.1 code?
> On 16/08/2019, at 10:01 PM, <adamv0025@netconsultings.com> <adamv0025@netconsultings.com> wrote:
>
>
>
>> From: Nathan Ward <nward@daork.net>
>> Sent: Friday, August 16, 2019 8:39 AM
>>
>>> On 1/07/2019, at 9:59 PM, adamv0025@netconsultings.com wrote:
>>>
>>>> From: Michael Hare <michael.hare@wisc.edu>
>>>> Sent: Friday, June 28, 2019 7:02 PM
>>>>
>>>> Adam-
>>>>
>>>> Have you accounted for this behavioral change?
>>>>
>>>>
>>>> https://kb.juniper.net/InfoCenter/index?page=content&id=KB32883&pmv=print&actp=LIST&searchid=&type=currentpaging
>>>>
>>> Thank you, yes please we're aware of that, but even with this the
>>> issue is still present if the refresh timer is not <1200 or CSPF is enabled.
>>
>> I’m confused by this one - what’s the refresh timer and CSPF got to do with
>> it?
>>
> Not much it's a bug, it appears form the logs that the path message has "something different" in the ERO when CSPF is enabled triggering the bug ...
>
>> LSPs on 16.1 will do self-ping after they come up before they put traffic on
>> them. The lo0 filter has to permit that, or you’ve got to disable self-ping.
>>
> LSPs will do self ping when switching onto a new/optimized path, not when the LSP is first brought up -which in this case doesn’t happen.

I am fairly certain they’ll do self ping on new LSPs as well. It’s, in theory, whenever the ingress LSR gets a RESV message - for either a new path, or a whole new LSP it doesn’t matter.

>> Or am I parsing this weird, and you’re saying this is still an issue even with the
>> self ping disabled (or permitted in filters), under those conditions?
>>
> Yes that is correct, this problem appears even before the self-ping is engaged (the LSP is not even signalled -the RESV msg is never sent as a response to PATH msg in this case).

Yeah got it, that’s certainly Not Right.

Weird that the refresh timer <1200 triggers it. Do you mean the refresh timer in the time values object is 1200 - i.e. 1.2s? JunOS default is 30,000 (30s), 1200 seems very short, or very long? Certainly it sounds like a bug, but, I’m curious why you’d have timers like that..

--
Nathan Ward

Re: RSVP-TE broken between pre and post 16.1 code?
Could it be:
Number: PR1443811
Title: RSVP refresh-timer interoperability between 15.1 and 16.1+
Release Note

Path message with long refresh interval (equal to or more than 20
minutes) from a node that does not support Refresh-interval Independent
RSVP (RI-RSVP) is dropped by the receiver with RI-RSVP.

Severity: Minor
Status: Open
Last Modified: 2019-07-26 08:51:18 EDT
Resolved In (Junos releases): 18.3R3, 18.4R3, 17.4R3, 19.2R2, 19.3R1, 19.1R2
Product: J Series, M Series, T Series, MX-series, EX Series, SRX Series,
QFX Series, NFX Series, PTX Series
Functional Area: software
Feature Group: Multiprotocol Label Switching (MPLS)
Workaround

1. Use default rsvp refresh-time config. No config is needed.
30 seconds in 15.1 and 20 minutes in 16.1+

2. If you must configure rsvp refresh-time, configure it to be less than
20 minutes.
set protocols rsvp refresh-time 1199

Problem

Starting with Junos OS Release 16.1, RSVP Traffic Engineering (TE)
protocol extensions to support Refresh-interval Independent RSVP
(RI-RSVP), defined in RFC 8370, for fast reroute (FRR) facility protection
were introduced to allow greater scalability of label-switched paths
(LSPs), faster convergence times, and decreased RSVP signaling message
overhead from periodic refreshes. RI-RSVP mode is enabled by default and
includes protocol extensions to support RI-RSVP for FRR facility bypass
originally specified in RFC 4090. The default refresh time for RSVP
messages has increased from 30 seconds to 20 minutes.
In mixed environments, where a subset of LSPs traverse nodes that do not
include this feature, Junos RSVP-TE running in enhanced FRR mode will
automatically turn off the new protocol extensions in its signaling
exchanges with nodes that do not support the new extensions. However,
path messages with long refresh interval (equal to or more than 20
minutes) from such nodes will be dropped by the receiver with RI-RSVP.
It is assumed that non-RI-RSVP nodes should have lower refresh time
because it is used for failure detection in non-RI-RSVP environments.
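
For a sense of scale, here is a rough sketch of how the refresh time feeds the state timeout on a node that relies on refreshes for failure detection. It assumes the RFC 2205 lifetime rule L >= (K + 0.5) * 1.5 * R with K = 3; the function name is hypothetical and a given implementation may compute its timeout differently.

```python
def state_lifetime_s(refresh_time_s: float, keep_multiplier: int = 3) -> float:
    """Minimum RSVP state lifetime per RFC 2205: (K + 0.5) * 1.5 * R.

    refresh_time_s is R in seconds; keep_multiplier is K (3 is the value
    suggested in RFC 2205, assumed here).
    """
    return (keep_multiplier + 0.5) * 1.5 * refresh_time_s


if __name__ == "__main__":
    # Compare the old 30s default with values around the 20-minute mark.
    for r in (30, 1199, 1200):
        print(f"refresh {r:>4}s -> state lifetime about {state_lifetime_s(r):.2f}s")
```

Under these assumptions a 30s refresh ages out stale state in under three minutes, while a 1200s refresh takes 105 minutes, which is far too slow to serve as failure detection.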

With this fix, configuring 'no-enhanced-frr-bypass' on 16.1+ nodes will
solve the silent path message drop and will allow 20 minutes and higher
refresh times to be used on non-RI-RSVP nodes.

Triggers

- 'protocols rsvp refresh-time 1200' or higher is used on a non-RI-RSVP
node (Junos <16.1).
- There is a RI-RSVP (16.1 or later) node after non-RI-RSVP node.
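
The trigger conditions above can be written down as a small predicate. This is only a sketch of the PR text as quoted, not of what the Junos code actually does; the function and parameter names are hypothetical, and it does not model the CSPF effect reported earlier in the thread.

```python
# "equal to or more than 20 minutes", expressed in seconds.
RI_RSVP_DROP_THRESHOLD_S = 20 * 60


def path_is_dropped(sender_supports_ri_rsvp: bool,
                    receiver_runs_ri_rsvp: bool,
                    sender_refresh_time_s: int) -> bool:
    """Return True when the PR text says the Path message is silently dropped.

    All three conditions must hold: the sender is a non-RI-RSVP node
    (Junos <16.1), the receiver runs RI-RSVP (16.1 or later, not disabled),
    and the sender's refresh time is 1200s or higher.
    """
    return (not sender_supports_ri_rsvp
            and receiver_runs_ri_rsvp
            and sender_refresh_time_s >= RI_RSVP_DROP_THRESHOLD_S)


if __name__ == "__main__":
    print(path_is_dropped(False, True, 1200))  # 15.x sender configured to 1200s
    print(path_is_dropped(False, True, 1199))  # workaround 2 from the PR
    print(path_is_dropped(False, True, 30))    # 15.x default refresh time
```

Passing receiver_runs_ri_rsvp=False stands in for a 16.1+ node with 'no-enhanced-frr-bypass' configured once the fix is in place.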

Kind regards,
Andrey Kostin

Re: RSVP-TE broken between pre and post 16.1 code?
> From: Nathan Ward <juniper-nsp@daork.net>
> Sent: Friday, August 16, 2019 11:20 AM
>
> Weird that the refresh timer <1200 triggers it. Do you mean the refresh timer
> in the time values object is 1200 - i.e. 1.2s? JunOS default is 30,000 (30s), 1200
> seems very short, or very long? Certainly it sounds like a bug, but, I’m curious
> why you’d have timers like that..
>
The refresh timer is in seconds.
The idea was to change the refresh timer from 30s to 1200s, among other settings, to match the post-16.1 defaults, so that when the time came to upgrade, these settings would already match for better interop. In this case it totally backfired.
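
On the units: the TIME_VALUES object defined in RFC 2205 carries the refresh period in milliseconds, while the refresh-time setting discussed here is in seconds, so 30s shows up on the wire as 30,000. A quick sketch of the conversion (the function name is hypothetical):

```python
def time_values_refresh_ms(refresh_time_s: int) -> int:
    """Convert a refresh time in seconds to the millisecond value
    carried in the RFC 2205 TIME_VALUES object."""
    return refresh_time_s * 1000


if __name__ == "__main__":
    for seconds in (30, 1199, 1200):
        print(f"{seconds}s configured -> {time_values_refresh_ms(seconds)} ms in TIME_VALUES")
```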

adam
