Mailing List Archive

Strange issue on 100 Gbps interconnection Juniper - Cisco
Dear experts,
we have a couple of BGP peers over a 100 Gbps interconnection between
Juniper (MX10003) and Cisco (Nexus N9K-C9364C) devices in two different
datacenters, like this:

DC1
MX1 -- bgp -- NEXUS1
MX2 -- bgp -- NEXUS2

DC2
MX3 -- bgp -- NEXUS3
MX4 -- bgp -- NEXUS4

The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
flaps, only in DC1, on both interconnections (though not at the same time).
There is still no traffic on the links, since once we noticed the flaps we
blocked the deployment to production.

We've already swapped the SFPs (we moved the ones from DC2 to DC1 and vice
versa) and the cables on both interconnections at DC1, without any improvement.

The SFPs we use in both DCs:

Juniper - QSFP-100G-SR4-T2
Cisco - QSFP-100G-SR4

over OM4 MPO cable.

The distance is 70 m in DC1 and 80 m in DC2, so the issue shows up on the
shorter run.

Any idea or suggestion on what to check or do?

Thanks in advance
Cheers
James
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: Strange issue on 100 Gbps interconnection Juniper - Cisco
Open JTAC and CTAC cases.

The amount of information provided is wildly insufficient.

'BGP flaps': what does that mean, exactly? Is it always the same direction?
If so, which direction thinks it's not seeing keepalives? Do you also
observe loss in 'ping' across the links during the period?
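
To see which side is tearing the session down, check which peer logged the
hold-timer expiry: the side that sends the NOTIFICATION with "hold timer
expired" is the one that stopped seeing keepalives. A sketch using standard
Junos and NX-OS show commands (the neighbor address 192.0.2.1 is a
placeholder):

```
# Junos: last error and flap count for the peer
show bgp neighbor 192.0.2.1 | match "Last Error|Flap"
show log messages | match "bgp"

# NX-OS: session state and BGP log entries
show bgp sessions
show logging logfile | include BGP
```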

Purely stabbing in the dark, I'd say you always observe it in a single
direction, because in that direction you are reliably losing every nth
keepalive, and statistically it takes 1-3 days to lose 3 in a row with
the probability you're seeing. Now why exactly that is: is one end not
sending to the wire, or is one end not receiving from the wire? Again
stabbing in the dark, the problem is more likely in the punt path than
in the inject path, so I would focus the investigation on the party that
is tearing down the session due to lack of keepalives, on the thesis
that this device has a problem in its punt path and is dropping BGP
packets from the wire at some reliable probability.
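
The "every nth keepalive" arithmetic above can be sketched numerically.
Assuming the default 60 s keepalive / 180 s hold time (so three consecutive
lost keepalives expire the hold timer) and independent drops, this estimates
what per-keepalive loss rate would produce a mean time-to-flap of 1-3 days;
the function names are my own illustration, not vendor tooling:

```python
# Back-of-envelope sketch: if each keepalive is independently dropped with
# probability p, the expected number of keepalive intervals until the first
# run of 3 consecutive losses is (1 - p**3) / ((1 - p) * p**3).

def expected_intervals_to_flap(p: float, run: int = 3) -> float:
    """Expected keepalive intervals until `run` consecutive losses."""
    return (1 - p**run) / ((1 - p) * p**run)

def loss_rate_for_mean_flap(days: float, interval_s: float = 60.0) -> float:
    """Bisect for the per-keepalive loss probability whose mean
    time-to-flap equals `days` (assuming independent drops)."""
    target = days * 86400 / interval_s
    lo, hi = 1e-6, 0.999
    for _ in range(100):
        mid = (lo + hi) / 2
        # expected intervals to flap falls as the loss rate rises
        if expected_intervals_to_flap(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    for days in (1, 2, 3):
        p = loss_rate_for_mean_flap(days)
        print(f"mean flap every {days} day(s) -> ~{p:.1%} keepalive loss")
```

Roughly 6-9% keepalive loss is enough to flap every few days, i.e. losing
about every 12th to 16th keepalive, which is easy to miss in casual pings.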

On Sun, 11 Feb 2024 at 12:09, james list via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:
> [original message quoted above]



--
++ytti