Mailing List Archive

100g PCS Errors
We've got a 100g qsfp in an mx204 that has 1207 bit errors and 29666 errored blocks after 24 hours of just being linked up...
I would assume this is not normal behavior, but I haven't used 100g before. Do others see high error rates on their 100g optics?
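
For reference, the numbers above come from the PCS statistics section of the extensive interface output; roughly the following, with et-0/0/0 standing in for the real port (exact layout can vary by Junos release):

  > show interfaces et-0/0/0 extensive | find "PCS statistics"
  PCS statistics                      Seconds
    Bit errors                           1207
    Errored blocks                      29666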
Re: 100g PCS Errors
I'd hope by this point you've already reseated not only the card but the connection to the card as well?

Possibly a faulty card.

> On Aug 19, 2020, at 07:46, Nicholas Warren <nwarren@barryelectric.com> wrote:
>
> We've got a 100g qsfp in an mx204 that has 1207 bit errors and 29666 errored blocks after 24 hours of just being linked up...
> I would assume this is not normal behavior, but I haven't used 100g before. Do others see high error rates on their 100g optics?


--

J. Hellenthal

The fact that there's a highway to Hell but only a stairway to Heaven says a lot about anticipated traffic volume.
Re: 100g PCS Errors
On Wed, Aug 19, 2020 at 9:16 AM J. Hellenthal via NANOG <nanog@nanog.org>
wrote:

> I'd hope by this point you've already reseated not only the card but the
> connection to the card as well?
>
> Possibly a faulty card.
>

I'm guessing by card you mean the optic? These are QSFP28 ports.

Clean fiber as Daniel mentioned, reseat optic, the usual stuff. Try
replacement parts you have in stock, then go from there replacing the
cheapest components first before trying to replace the more costly ones.

What manufacturer optic are you using, and what sort of media specifically?
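
If you're not sure off-hand, both are easy to pull from the box itself.
A rough sketch on Junos, with the port number as a placeholder:

  > show chassis hardware | match Xcvr
      (vendor and part number of each installed transceiver)
  > show interfaces diagnostics optics et-0/0/0
      (DOM readings: per-lane Tx/Rx power, temperature, alarms)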

Matt Harris|Infrastructure Lead Engineer
816-256-5446|Direct
Re: 100g PCS Errors
It's not normal, no.

On Wed, Aug 19, 2020 at 10:02 AM Nicholas Warren <nwarren@barryelectric.com>
wrote:

> We've got a 100g qsfp in an mx204 that has 1207 bit errors and 29666
> errored blocks after 24 hours of just being linked up...
> I would assume this is not normal behavior, but I haven't used 100g
> before. Do others see high error rates on their 100g optics?
>
Re: 100g PCS Errors
We have been making 100G packet capture systems for 5 years now (fmad.io).
In the early days, vendor-qualified transceivers really did make a
difference; it's 25Gbps signaling per differential pair, which is anything
but easy. Back then (4-5 years ago) the cheap QSFP28 vendors had some
really marginal parts... we had to tune the FPGA a lot to get the QSFP28s
to work correctly, and even then some just wouldn't work at all / had a
lot of errors.

If you're using the latest Finisar or Avago-level transceivers you should
be fine, and currently (last 12 months) the cheap transceivers don't need
any tuning either. I guess it depends on whether you're using old HW / old
transceivers or new HW with new transceivers.

Aaron

On Wed, 19 Aug 2020 at 23:21, Tom Beecher <beecher@beecher.cc> wrote:

> It's not normal, no.
>
> On Wed, Aug 19, 2020 at 10:02 AM Nicholas Warren <
> nwarren@barryelectric.com> wrote:
>
>> We've got a 100g qsfp in an mx204 that has 1207 bit errors and 29666
>> errored blocks after 24 hours of just being linked up...
>> I would assume this is not normal behavior, but I haven't used 100g
>> before. Do others see high error rates on their 100g optics?
>>
>
Re: 100g PCS Errors
What is the device on the other side of the MX204 100G link? We've had some incrementing PCS errors on 100G links when the other side was a Juniper PTX1000 using port et-0/0/25. Using a different port on the PTX1000 resolved the incrementing PCS errors. We opened JTAC cases for two incidents and a root cause was never found.
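
If it isn't obvious which side is contributing, zeroing the counters on both ends and watching which one climbs again narrows it down quickly. A rough sketch on Junos, with et-0/0/25 standing in for whatever port is in use:

  > clear interfaces statistics et-0/0/25
  > show interfaces et-0/0/25 extensive | find "PCS statistics"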

--
Clinton Work
Airdrie, AB

On Wed, Aug 19, 2020, at 6:46 AM, Nicholas Warren wrote:
> We've got a 100g qsfp in an mx204 that has 1207 bit errors and 29666
> errored blocks after 24 hours of just being linked up...
> I would assume this is not normal behavior, but I haven't used 100g
> before. Do others see high error rates on their 100g optics?
>
Re: 100g PCS Errors
On 19/Aug/20 19:34, Clinton Work wrote:

> What is the device on the other side of the MX204 100G link? We've had some incrementing PCS errors on 100G links when the other side was a Juniper PTX1000 using port et-0/0/25. Using a different port on the PTX1000 resolved the incrementing PCS errors. We opened JTAC cases for two incidents and a root cause was never found.

Good to know, we are just about to start deploying a bunch of PTX1000s.

Mark.
Re: 100g PCS Errors
On Wed, 19 Aug 2020 at 21:39, Mark Tinka <mark.tinka@seacom.com> wrote:

> > What is the device on the other side of the MX204 100G link? We've had some incrementing PCS errors on 100G links when the other side was a Juniper PTX1000 using port et-0/0/25. Using a different port on the PTX1000 resolved the incrementing PCS errors. We opened JTAC cases for two incidents and a root cause was never found.
>
> Good to know, we are just about to start deploying a bunch of PTX1000s.

On QSFP28 devices I would recommend always running RS-FEC when
possible. By default LR4 doesn't run it, but the added value is
fantastic. During turn-up you will immediately know whether the
circuit works, without any ping testing or live traffic, and you will
know when a circuit stops working before it impacts customers.
Combine preFEC with DDM and you have fantastic predictive power over
failures, and you can reroute/schedule maintenance to fix issues
before they become symptomatic.
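
As a rough sketch of what that looks like on Junos (interface name is
a placeholder and the knob can differ by platform/release), Clause 91
RS-FEC is enabled per port with something like:

  # set interfaces et-0/0/0 gigether-options fec fec91
  # commit

Both ends of the link need to agree on the FEC setting, otherwise the
link generally will not come up cleanly.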

Unfortunately there are no SNMP counters for RS-FEC. Not for Juniper,
not for Nokia, not for Arista, so screenscraping you go. I have
ER-079886 open with JNPR, if someone wants to chip in.
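
If you do end up scraping, the XML view of the same counters tends to
be less fragile to parse than the human-formatted output; a sketch,
with a placeholder interface name (element names vary by release):

  > show interfaces et-0/0/0 extensive | display xml

The same data can also be pulled as structured XML over NETCONF, which
avoids parsing CLI output entirely.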


--
++ytti
Re: 100g PCS Errors
On 20/Aug/20 08:16, Saku Ytti wrote:

> On QSFP28 devices I would recommend always running RS-FEC when
> possible. By default LR4 doesn't run it, but the added value is
> fantastic. During turn-up you will immediately know whether the
> circuit works, without any ping testing or live traffic, and you will
> know when a circuit stops working before it impacts customers.
> Combine preFEC with DDM and you have fantastic predictive power over
> failures, and you can reroute/schedule maintenance to fix issues
> before they become symptomatic.

Yes, thanks. I recall you recommended this to someone earlier in the
year, so we have it in our library for deployment when we do the turn-up.

All our 100Gbps deployments are currently within the data centre, so
SR4 optics, for which Juniper enables RS-FEC by default. But yes, we
shall definitely do the same (manually) for LR4.
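
The easy sanity check after turn-up is presumably to look at the port
itself rather than trust the per-optic default; on the Junos releases
I've looked at, something along these lines shows the active FEC mode
and the corrected/uncorrected counters, though the exact field names
may vary:

  > show interfaces et-0/0/0 extensive | match "FEC"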

IOS XR appears to suggest that FEC configuration is not explicitly
needed, unless the optic is a non-Cisco qualified non-LR4 unit.

Mark.