Mailing List Archive: Bottlenecks and link upgrades

Bottlenecks and link upgrades

Aug 12, 2020, 12:31 AM

Post #1 of 37 (2921 views)

At what point do commercial ISPs upgrade links in their backbone as well as peering and transit links that are congested? At 80% capacity? 90%? 95%?

Thanks,
Hank

Caveat: The views expressed above are solely my own and do not express the views or opinions of my employer

Re: Bottlenecks and link upgrades [ In reply to ]

saku at ytti

Aug 12, 2020, 12:44 AM

Post #2 of 37 (2921 views)

On Wed, 12 Aug 2020 at 10:35, Hank Nussbacher <hank@interall.co.il> wrote:

> At what point do commercial ISPs upgrade links in their backbone as well as peering and transit links that are congested? At 80% capacity? 90%? 95%?

I've worked for employers where policy has been anywhere from 50% to
80%. And I know this isn't the complete range. Most do not subscribe to
any single simple rule but act more tactically.

Personally if the link is in a growth market, you should upgrade
really early, 50% seems late, cost is negligible if you anticipate
growth to continue. If it's not a growth market cost may become less
than negligible.

Sometimes networks congest particularly their edge interfaces
strategically due to poor incentives, where irrelevant revenue
wholesale arm might see some benefit from strategic congestion while
also significantly hurting their money printing mobile arm reducing
company wide bottom line while improving wholesale arm bottom line.

--
++ytti

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 12, 2020, 12:52 AM

Post #3 of 37 (2921 views)

On 12/Aug/20 09:31, Hank Nussbacher wrote:

> At what point do commercial ISPs upgrade links in their backbone as
> well as peering and transit links that are congested? At 80%
> capacity? 90%? 95%?
>

We start the process at 50% utilization, and work toward completing the
upgrade by 70% utilization.

The period between 50% - 70% is just internal paperwork.
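
As a rough illustration of how much time a 50%-to-70% window leaves for that paperwork, here is a minimal sketch that assumes steady compound traffic growth. The growth rates and the function name are hypothetical examples, not figures from this thread.

```python
import math

def months_between(start_util, end_util, monthly_growth):
    """Months for utilization to grow from start_util to end_util,
    assuming constant compound monthly growth (an illustrative assumption)."""
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    return math.log(end_util / start_util) / math.log(1.0 + monthly_growth)

if __name__ == "__main__":
    for growth in (0.02, 0.05, 0.10):  # hypothetical monthly growth rates
        m = months_between(0.50, 0.70, growth)
        print(f"{growth:.0%} monthly growth: {m:.1f} months from 50% to 70%")
```

Under these assumptions, faster growth shrinks the window quickly, which is one reason to start the process early.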

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 12, 2020, 1:07 AM

Post #4 of 37 (2920 views)

On 12/Aug/20 09:44, Saku Ytti wrote:

> Personally if the link is in a growth market, you should upgrade
> really early, 50% seems late, cost is negligible if you anticipate
> growth to continue. If it's not a growth market cost may become less
> than negligible.

The problem you have is "what is a growth market", especially over time
as it stabilizes and sees new entrants, but growth is now in a phase
where you need massive scale to keep playing.

You then shift from a "sales are guaranteed Day 1" to a "build it and
hope for the best". Many Commercial get fearful at that point, because
of the temptation to link capacity to guaranteed sales.

> Sometimes networks congest particularly their edge interfaces
> strategically due to poor incentives, where irrelevant revenue
> wholesale arm might see some benefit from strategic congestion while
> also significantly hurting their money printing mobile arm reducing
> company wide bottom line while improving wholesale arm bottom line.

I know a few :-).

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

marc101.maxmaok at gmail

Aug 12, 2020, 8:08 AM

Post #5 of 37 (2916 views)

Just out of curiosity: may I ask how link capacity loading is measured?
What is meant by 50%, 70%, or 90% capacity loading? Is load sampled and
measured instantaneously, or averaged over a certain period of time
(granularity)?

These are questions that have bothered me for a long time. I don't know
if I can ask about these here, by the way. I take care of radio access
network performance at work, and have found many unknowns in the
transport network.

Thanks and best regards,
Taichi

Re: Bottlenecks and link upgrades [ In reply to ]

jurado98 at mail

Aug 12, 2020, 9:21 AM

Post #6 of 37 (2916 views)

When I worked for an ISP, it was about 70%, not sure if that is the case
with the other ones.

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 12, 2020, 9:30 AM

Post #7 of 37 (2916 views)

On 12/Aug/20 17:08, m.Taichi wrote:

>
> Just my curiosity. May I ask how we can measure the link capacity
> loading? What does it mean by a 50%, 70%, or 90% capacity loading?
> Load sampled and measured instantaneously, or averaging over a certain
> period of time (granularity)?
>
> These are questions have bothered me for long. Don't know if I can ask
> about these by the way. I take care of the radio access network
> performance at work. Found many things unknown in transport network.

For this, we look at simple 5-minute SNMP data over the period.
Nothing too fancy. It's stable
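
For readers unfamiliar with how a 5-minute figure is derived, here is a minimal sketch of turning two octet-counter readings into an average utilization. It assumes a monotonically increasing interface octet counter such as IF-MIB's 64-bit ifHCInOctets or ifHCOutOctets; the thread does not say which counters or tooling are actually used, and the function name is hypothetical.

```python
def utilization(octets_start, octets_end, interval_s, link_bps, counter_bits=64):
    """Average utilization (0..1) between two octet-counter readings.

    Handles at most one counter wrap; link_bps is the link speed in bits/s.
    """
    delta = octets_end - octets_start
    if delta < 0:                      # the counter wrapped once between polls
        delta += 2 ** counter_bits
    return (delta * 8) / (interval_s * link_bps)

if __name__ == "__main__":
    # Hypothetical readings 300 s apart on a 10 Gb/s link.
    print(utilization(0, 187_500_000_000, 300, 10_000_000_000))
```

The result is an average over the whole polling interval, so anything shorter than the interval is invisible in it.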

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

ted at io-tx

Aug 12, 2020, 1:11 PM

Post #9 of 37 (2916 views)

On Wed, 12 Aug 2020, Hank Nussbacher wrote:

>
> At what point do commercial ISPs upgrade links in their backbone as well as peering and transit links that are congested? At
> 80% capacity? 90%? 95%?
>
>
> Thanks,
> Hank
>
>
> Caveat: The views expressed above are solely my own and do not express the views or opinions of my employer
>
>
>

Why upgrade when you can legislate the problem instead?

Charter tries to convince FCC that broadband customers want data caps.

https://arstechnica.com/tech-policy/2020/08/charter-tries-to-convince-fcc-that-broadband-customers-want-data-caps/

Ted

Re: Bottlenecks and link upgrades [ In reply to ]

simon.leinen at switch

Aug 13, 2020, 2:56 AM

Post #10 of 37 (2909 views)

m Taichi writes:
> Just my curiosity. May I ask how we can measure the link capacity
> loading? What does it mean by a 50%, 70%, or 90% capacity loading?
> Load sampled and measured instantaneously, or averaging over a certain
> period of time (granularity)?

Very good question!

With tongue in cheek, one could say that measured instantaneously, the
load on a link is always either zero or 100% link rate...

ISPs typically sample link load in 5-minute intervals and look at graphs
that show load (at this 5-minute sampling resolution) over ~24 hours, or
longer-term graphs where the resolution has been "downsampled", where
downsampling usually smoothes out short-term peaks.
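
To make the smoothing effect concrete, here is a small sketch that consolidates twelve 5-minute samples into one hourly value, once by averaging and once by taking the maximum. The sample values and the function name are invented for illustration.

```python
def downsample(samples, factor, how="avg"):
    """Consolidate consecutive groups of `factor` samples by average or max."""
    out = []
    for i in range(0, len(samples), factor):
        group = samples[i:i + factor]
        out.append(max(group) if how == "max" else sum(group) / len(group))
    return out

# Twelve hypothetical 5-minute utilization samples (one hour) with one busy interval.
five_min = [0.30, 0.32, 0.31, 0.95, 0.33, 0.30, 0.29, 0.31, 0.30, 0.32, 0.31, 0.30]
hourly_avg = downsample(five_min, 12, "avg")   # the 95% interval is averaged away
hourly_max = downsample(five_min, 12, "max")   # the 95% interval is preserved
```

With averaging, the busy interval disappears into an hourly value of roughly 36%; keeping a max series alongside the average is one way to retain peaks in long-term graphs.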

From my own experience, upgrade decisions are made by looking at those
graphs and checking whether peak traffic (possibly ignoring "spikes" :-)
crosses the threshold repeatedly.

At some places this might be codified in terms of percentiles, e.g. "the
Nth percentile of the M-minute utilization samples exceeds X% of link
capacity over a Y-day period". I doubt that anyone uses such rules to
automatically issue upgrade orders, but maybe to generate alerts like
"please check this link, we might want to upgrade it".

I'd be curious whether other operators have such alert rules, and what
N/M/X/Y they use - might well be different values for different kinds of
links.
--
Simon.
PS. We use the "stare at graphs" method, but if we had automatic alerts,
I guess it would be something like "the 95th percentile of 5-minute
samples exceeds 50% over 30 days".
PPS. My colleagues remind me that we do alert on output queue drops.
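
A rule of the form "the Nth percentile of the utilization samples exceeds X%" could be sketched as below. This is only an illustration using the nearest-rank percentile definition; the defaults mirror the values guessed in the PS (95th percentile, 50%), and the caller is assumed to pass in the M-minute samples covering the Y-day window. The function names are hypothetical.

```python
import math

def percentile(samples, n):
    """Nearest-rank Nth percentile (0 < n <= 100) of a non-empty list."""
    ordered = sorted(samples)
    rank = math.ceil(n * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def needs_review(samples, n=95, x=0.50):
    """True if the Nth percentile of the utilization samples exceeds X."""
    return percentile(samples, n) > x

if __name__ == "__main__":
    busy = [0.1] * 90 + [0.6] * 10    # 10% of samples above the threshold
    quiet = [0.1] * 96 + [0.6] * 4    # only 4% of samples above the threshold
    print(needs_review(busy), needs_review(quiet))
```

With a 95th-percentile rule, up to 5% of the samples can exceed the threshold without raising a flag, which is how short spikes get ignored.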

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 13, 2020, 3:03 AM

Post #11 of 37 (2909 views)

On 13/Aug/20 11:56, Simon Leinen wrote:

> I'd be curious whether other operators have such alert rules, and what
> N/M/X/Y they use - might well be different values for different kinds of
> links.

We use alerts to tell us about links that hit a threshold, in our NMS.
But yes, this is based on 5-minute samples, not percentile data.

The alerts are somewhat redundant for any long-term planning. They are
more useful when problems happen out of the blue.

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

edepa at ieee

Aug 13, 2020, 3:05 AM

Post #12 of 37 (2909 views)

>
> With tongue in cheek, one could say that measured instantaneously, the
> load on a link is always either zero or 100% link rate...
>

Actually, that's a first-class observation!

--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale

Re: Bottlenecks and link upgrades [ In reply to ]

nanog at nanog

Aug 13, 2020, 3:23 AM

Post #13 of 37 (2909 views)

On 12.08.2020 09:31, Hank Nussbacher wrote:
>
> At what point do commercial ISPs upgrade links in their backbone as
> well as peering and transit links that are congested? At 80%
> capacity? 90%? 95%?
>

Hi,

Wouldn't it be better to measure basic performance, like packet drop
rates and queue sizes?

These days live video is needed, and these parameters are essential to
its quality.

Queues build up in milliseconds, yet people average over minutes to
estimate quality.

If you measure queue delay with high-frequency one-way-delay
measurements, you can advise better on what the consequences of a
highly loaded link are.

We are running a research project on end-to-end quality, and the enclosed
image is yesterday's report on queue size (h_ddelay) in ms. It shows
stats on delays between some peers.

I would have looked at the trends on the involved links to see if an
upgrade is necessary. 421 ms might be too much if it happens often.
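
One common way to turn one-way-delay probes into a queueing-delay estimate is to subtract the minimum delay seen in the window, treating that minimum as the unloaded path delay. The sketch below assumes that method; the thread does not describe how h_ddelay is actually computed, and the probe values and function names are invented.

```python
def queue_delay_ms(owd_ms):
    """Estimate per-probe queueing delay as OWD minus the minimum OWD in the window."""
    base = min(owd_ms)
    return [d - base for d in owd_ms]

def summarize(delays_ms, threshold_ms):
    """Return (max delay, fraction of probes above threshold_ms)."""
    over = sum(1 for d in delays_ms if d > threshold_ms)
    return max(delays_ms), over / len(delays_ms)

owd = [12.0, 12.5, 13.0, 12.0, 433.0, 50.0, 12.0, 12.2]   # hypothetical probe results (ms)
q = queue_delay_ms(owd)
peak, frac = summarize(q, threshold_ms=100.0)
```

Reporting both the peak and how often a threshold is exceeded addresses the "if it happens often" part of the question; subtracting the minimum also cancels any constant clock offset between the two endpoints.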

Best regards

Olav Kvittem

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 13, 2020, 3:31 AM

Post #14 of 37 (2909 views)

On 13/Aug/20 12:23, Olav Kvittem via NANOG wrote:

> Wouldn't it be better to measure the basic performance like packet
> drop rates and queue sizes ?
>
> These days live video is needed and these parameters are essential to
> the quality.
>
> Queues are building up in milliseconds and people are averaging over
> minutes to estimate quality.
>
>
> If you are measuring queue delay with high frequent one-way-delay
> measurements
>
> you would then be able to advice better on what the consequences of a
> highly loaded link are.
>
>
> We are running a research project on end-to-end quality and the
> enclosed image is yesterdays report on
>
> queuesize(h_ddelay) in ms. It shows stats on delays between some peers.
>
> I would have looked at the trends on the involved links to see if
> upgrade is necessary -
>
> 421 ms might be too much ig it happens often.
>

I'm confident everyone (even the cheapest CFO) knows the consequences of
congesting a link and choosing not to upgrade it.

Optical issues, dirty patch cords, faulty line cards and wrong
configurations will most likely lead to packet loss. Link congestion
due to insufficient bandwidth will most certainly lead to packet loss.

It's great to monitor packet loss, latency, pps, etc. But packet loss
at 10% link utilization is not a foreign occurrence. No amount of
bandwidth upgrades will fix that.

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

nick at foobar

Aug 13, 2020, 4:00 AM

Post #15 of 37 (2909 views)

Mark Tinka wrote on 13/08/2020 11:31:
> It's great to monitor packet loss, latency, pps, e.t.c. But packet loss
> at 10% link utilization is not a foreign occurrence. No amount of
> bandwidth upgrades will fix that.

You could easily have 10% utilization and see packet loss due to
insufficient bandwidth if you have egress << ingress and proportionally
low buffering, e.g. UDP or iSCSI from a 40G/100G port with egress to a
low-buffer 1G port.

This sort of thing is less likely in the imix world, but it can easily
happen with high capacity CDN nodes injecting content where the
receiving port is small and subject to bursty traffic.
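
The arithmetic behind this can be sketched with a simple fluid model: a line-rate burst fills the egress buffer at the difference between ingress and egress rates. The 1 MB buffer, the burst timing and the function names are assumptions chosen for illustration, not values from this thread.

```python
def burst_until_drop(buffer_bytes, ingress_bps, egress_bps):
    """Seconds a line-rate burst can last before an initially empty
    egress buffer overflows."""
    if ingress_bps <= egress_bps:
        return float("inf")            # the queue never grows
    return buffer_bytes * 8 / (ingress_bps - egress_bps)

def offered_load(burst_s, period_s, ingress_bps, egress_bps):
    """Offered load on the egress port, averaged over period_s, when one
    burst of burst_s seconds arrives per period (capped at 1.0)."""
    return min(1.0, burst_s * ingress_bps / (period_s * egress_bps))

if __name__ == "__main__":
    # Hypothetical: 40G ingress, 1G egress, 1 MB of egress buffer.
    t = burst_until_drop(1_000_000, 40e9, 1e9)
    print(f"buffer overflows after {t * 1e3:.3f} ms of line-rate burst")
    print(f"offered load: {offered_load(0.00025, 0.1, 40e9, 1e9):.0%}")
```

In this made-up case the buffer overflows after about 0.2 ms, so a 0.25 ms burst once every 100 ms drops packets even though the offered load averages only 10% of the 1G port.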

Nick

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 13, 2020, 4:18 AM

Post #16 of 37 (2907 views)

On 13/Aug/20 13:00, Nick Hilliard wrote:

>
> you could easily have 10% utilization and see packet loss due to
> insufficient bandwidth if you have egress << ingress and
> proportionally low buffering, e.g. UDP or iSCSI from a 40G/100 port
> with egress to a low-buffer 1G port.
>
> This sort of thing is less likely in the imix world, but it can easily
> happen with high capacity CDN nodes injecting content where the
> receiving port is small and subject to bursty traffic.

Indeed.

The smaller the capacity gets toward egress, the closer you are getting
to an end-user, in most cases.

End-user link upgrades will always be the weakest link in the chain, as
the incentive is more on their side than you, their provider. Your final
egress port buffer sizing notwithstanding, of course.

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

nanog at nanog

Aug 13, 2020, 4:44 AM

Post #17 of 37 (2907 views)

Hi Mark,

Just comments on your points below.

On 13.08.2020 12:31, Mark Tinka wrote:
>
> On 13/Aug/20 12:23, Olav Kvittem via NANOG wrote:
>
>> Wouldn't it be better to measure the basic performance like packet
>> drop rates and queue sizes ?
>>
>> These days live video is needed and these parameters are essential to
>> the quality.
>>
>> Queues are building up in milliseconds and people are averaging over
>> minutes to estimate quality.
>>
>>
>> If you are measuring queue delay with high frequent one-way-delay
>> measurements
>>
>> you would then be able to advice better on what the consequences of a
>> highly loaded link are.
>>
>>
>> We are running a research project on end-to-end quality and the
>> enclosed image is yesterdays report on
>>
>> queuesize(h_ddelay) in ms. It shows stats on delays between some peers.
>>
>> I would have looked at the trends on the involved links to see if
>> upgrade is necessary -
>>
>> 421 ms might be too much ig it happens often.
>>
> I'm confident everyone (even the cheapest CFO) knows the consequences of
> congesting a link and choosing not to upgrade it.
>
> Optical issues, dirty patch cords, faulty line cards, wrong
> configurations, will almost likely lead to packet loss.
> Link congestion
> due to insufficient bandwidth will most certainly lead to packet loss.
Sure, but I guess the loss rate depends on the nature of the traffic.
>
> It's great to monitor packet loss, latency, pps, e.t.c. But packet loss
> at 10% link utilization is not a foreign occurrence. No amount of
> bandwidth upgrades will fix that.

I guess that having more reports would support the judgements better.

A basic question is : what is the effect on the perceived quality of the
customers ?

And the relation between that and the 5-minute load is not known to me.

Actually, one good indicator of the congestion loss rate is, of course,
the SNMP OutputDiscards counter.

Curves for queueing delay, link load and discard rate are surprisingly
different.
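
A discard rate can be derived from two polls of a discard counter and a packet counter, as in the sketch below. It assumes the discard counter is something like IF-MIB's ifOutDiscards, and that the "offered" packet counter includes packets that were later discarded; what a given platform's counters actually include should be checked. The function name and readings are hypothetical.

```python
def discard_ratio(discards_start, discards_end, offered_start, offered_end):
    """Fraction of offered egress packets discarded between two polls.

    Counters are assumed not to have wrapped or reset between the polls.
    """
    dropped = discards_end - discards_start
    offered = offered_end - offered_start
    return dropped / offered if offered > 0 else 0.0

if __name__ == "__main__":
    # Hypothetical readings taken 5 minutes apart.
    print(discard_ratio(100, 150, 1_000_000, 1_050_000))
```

Plotting this ratio next to link load and queueing delay is one way to see how differently the three curves behave.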

regards

Olav

>
> Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

mark.tinka at seacom

Aug 13, 2020, 5:01 AM

Post #18 of 37 (2907 views)

On 13/Aug/20 13:44, Olav Kvittem wrote:

> sure, but I guess the loss rate depends of the nature of the traffic.

Packet loss is packet loss.

Some applications are more sensitive to it (live video, live voice, for
example), while others are less so. However, packet loss always
manifests badly if left unchecked.

>> I guess that having more reports would support the judgements better.

For sure, yes. Any decent NMS can provide a number of data points so you
aren't shooting in the dark.

>>
>> A basic question is : what is the effect on the perceived quality of the
>> customers ?

Depends on the application.

Gamers tend to complain the most, so that's a great indicator.

Some customers that think bandwidth solves all problems will perceive
their inability to attain their advertised contract as a problem, if
packet loss is in the way.

Generally, other bad things, including unruly human beings :-).

>>
>> And the relation between that and /5min load is not known to me.

For troubleshooting, being able to have a tighter resolution is more
important. 5-minute averages are for day-to-day operations, and
long-term planning.

>>
>> Actually one good indicator of the congestion loss rate are of course
>> the SNMP OutputDiscards.
>>
>>
>> Curves for queueing delay, link load and discard rate are surprisingly
>> different.

Yes, that then gets into the guts of the router hardware, and its design.

In such cases, that's when your 100Gbps link is peaking and causing
packet loss, not understanding that the forwarding chip on it is only
good for 60Gbps, for example.

Mark.

Re: Bottlenecks and link upgrades [ In reply to ]

baldur.norddahl at gmail

Aug 13, 2020, 6:20 AM

Post #19 of 37 (2907 views)

Is it possible to monitor metrics such as max queue length in 5-minute
intervals, and is anyone doing so? It might be a better metric than
average load in 5-minute intervals.
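
If a platform exposes a queue-depth reading that can be sampled at a high rate (an assumption that depends on the hardware), keeping the maximum per 5-minute bucket could look like this. The sample data and the function name are hypothetical.

```python
from collections import defaultdict

def max_per_interval(samples, interval_s=300):
    """Map interval start time -> max queue depth seen in that interval.

    `samples` is an iterable of (unix_time_s, queue_depth) pairs.
    """
    peaks = defaultdict(int)
    for ts, depth in samples:
        bucket = int(ts) - int(ts) % interval_s
        peaks[bucket] = max(peaks[bucket], depth)
    return dict(peaks)

if __name__ == "__main__":
    readings = [(0, 10), (1, 900), (299, 5), (300, 20), (450, 15)]
    print(max_per_interval(readings))
```

Unlike an average, the per-interval maximum keeps a one-second spike visible in the 5-minute record.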

Regards

Baldur

Re: Bottlenecks and link upgrades [ In reply to ]

nanog at ics-il

Aug 13, 2020, 6:28 AM

Post #20 of 37 (2907 views)

I suppose it would depend on if your hardware has an OID for what you want to monitor.

-----
Mike Hammett
Intelligent Computing Solutions

Midwest Internet Exchange

The Brothers WISP

Re: Bottlenecks and link upgrades [ In reply to ]

baldur.norddahl at gmail

Aug 13, 2020, 7:45 AM

Post #21 of 37 (2907 views)

I expect my hardware does not have such a metric, but maybe it should.
Max queue length tells us how full the link is with respect to microbursts.

On Thu, 13 Aug 2020 at 15:28, Mike Hammett <nanog@ics-il.net> wrote:

> I suppose it would depend on if your hardware has an OID for what you want
> to monitor.

Re: Bottlenecks and link upgrades [ In reply to ]

beecher at beecher

Aug 13, 2020, 8:33 AM

Post #22 of 37 (2907 views)

>
> Wouldn't it be better to measure the basic performance like packet drop
> rates and queue sizes ?
>

Those values should be a standard part of monitoring and data collection,
but whether they MATTER in a given situation very much depends.

The traffic profile traversing the link may be such that the observed drop
% and buffer depths are acceptable for that traffic, and there is no need
for further tuning or changes. In other scenarios they may not be, in which
case either network or application adjustments are warranted.

There is rarely a one-size-fits-all answer when it comes to these things.

Re: Bottlenecks and link upgrades [ In reply to ]

beecher at beecher

Aug 13, 2020, 8:39 AM

Post #23 of 37 (2907 views)

It is possible to gather a lot of information about buffers and queues, at
least with the vendors we work with. That can be very helpful in a lot of
ways. :)

Re: Bottlenecks and link upgrades [ In reply to ]

edepa at ieee

Aug 13, 2020, 8:44 AM

Post #24 of 37 (2907 views)

>
> There is rarely a one sized fits all answer when it comes to these
> things.
>

Absolutely true: every application has characteristic QoS parameters.

Unfortunately, it seems that 5-minute averages of data rates through links
are the one-size-fits-all answer ... which doesn't fit all.

Etienne

--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale

Re: Bottlenecks and link upgrades [ In reply to ]

bill at herrin

Aug 13, 2020, 10:35 AM

Post #25 of 37 (2906 views)

On Wed, Aug 12, 2020 at 12:33 AM Hank Nussbacher <hank@interall.co.il> wrote:
> At what point do commercial ISPs upgrade links in their backbone as well as peering and transit links that are congested? At 80% capacity? 90%? 95%?

Hi Hank,

As others have noted, the answer is rarely that simple.

First, what is your consumption? Usually it is the 90th or 95th
percentile: after all, 100% between 9 and 5 is 100%, not 33%, but 100%
for two minutes is not 100%. It gets more complicated if any kind of QoS
is in play, because capacity-wise QoS essentially gives you not a single
fixed-speed line but many interdependent variable-speed lines.

Next, capacity is not the only question. Here are some of the other factors:

1) A residential customer on the cheapest plan does not merit as clean
a channel as a high-paying business customer you'd like to keep
milking.

2) Upgrades can take months of planning so the capacity now is beside
the point. You'll use your best-guess projection for the capacity at
the time an upgrade can be complete.

3) Some upgrades tend to be significantly more expensive than others.
Lit service to dark fiber, for example. It's pretty ordinary to run
closer to the limit before making an expensive upgrade than a modest
upgrade.

4) A dirty link merits replacement sooner than a clean one. If the
higher-capacity service also clears up packet loss, you'll want to
trigger the decision at a lower consumption threshold.

5) Switching a single path to two paths is more valuable than
switching two paths to three. It has priority at a lower level of
consumption.
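
Point 2 above, projecting utilization to the time an upgrade can be complete, could be sketched with a plain least-squares trend over monthly peak utilization. A linear trend is only one possible "best guess" and is an assumption here; the monthly figures, lead time and function names are invented.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def projected_utilization(monthly_peaks, lead_time_months):
    """Extrapolate the trend to the month the upgrade would be finished."""
    months = list(range(len(monthly_peaks)))
    slope, intercept = linear_fit(months, monthly_peaks)
    return slope * (months[-1] + lead_time_months) + intercept

if __name__ == "__main__":
    peaks = [0.40, 0.42, 0.44, 0.46, 0.48, 0.50]   # hypothetical monthly peaks
    print(f"{projected_utilization(peaks, 6):.0%} expected when the upgrade lands")
```

In this made-up series the link sits at 50% today but is projected at about 62% by the time a six-month upgrade completes, which is the number the decision would be based on.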

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
