Mailing List Archive

Re: Bottlenecks and link upgrades [ In reply to ]
On Wed, Aug 12, 2020, at 09:31, Hank Nussbacher wrote:
> At what point do commercial ISPs upgrade links in their backbone as
> well as peering and transit links that are congested? At 80% capacity?
> 90%? 95%?

Some reflections about link capacity:
At 90% and over, you should panic.
Between 80% and 90% you should be (very) scared.
Between 70% and 80% you should be worried.
Between 60% and 70% you should seriously consider speeding up the upgrades that you effectively started at 50% and have been planning since 40%.

Of course, that differs from one ISP to another. Some only upgrade after several months with at least 4 hours a day, every day (or almost) at over 95%. Others deploy 10x expected capacity, and upgrade well before 40%.
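
A minimal sketch of the cheat sheet above in Python, for anyone who wants to wire it into monitoring. The function name and the handling of exact boundaries (80% counts as "scared", not "worried") are assumptions; the list itself does not say which side a boundary falls on.

```python
def capacity_action(utilization_pct: float) -> str:
    """Map a link's peak utilization (percent of capacity) to a reaction."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    # Boundaries are treated as belonging to the more severe band (assumption).
    if utilization_pct >= 90:
        return "panic"
    if utilization_pct >= 80:
        return "be (very) scared"
    if utilization_pct >= 70:
        return "be worried"
    if utilization_pct >= 60:
        return "speed up the upgrade"
    if utilization_pct >= 50:
        return "start the upgrade"
    if utilization_pct >= 40:
        return "start planning the upgrade"
    return "no action"
```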
Re: Bottlenecks and link upgrades [ In reply to ]
On Thu, Aug 13, 2020, at 12:31, Mark Tinka wrote:
> I'm confident everyone (even the cheapest CFO) knows the consequences of
> congesting a link and choosing not to upgrade it.

I think you're over-confident.

> It's great to monitor packet loss, latency, pps, e.t.c. But packet loss
> at 10% link utilization is not a foreign occurrence. No amount of
> bandwidth upgrades will fix that.

That, plus the fact that by the time delay becomes an indication of congestion, it's way too late to start an upgrade. That event should not occur.
Re: Bottlenecks and link upgrades [ In reply to ]
Beyond a pure percentage, you might want to account for how long you can
stay below a certain threshold. If your target for a link is to keep its
95th percentile peaks below 70%, then first get an understanding of your
traffic growth and try to project when you will reach that number.
You have to decide whether you care about the occasional peak, or the
consistent peak, or somewhere in between, like weekday vs weekends, etc.
Now you know how much lead time you will have.
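
The projection described above can be sketched roughly as follows. This is only an illustration: the nearest-rank percentile method, the compound monthly growth model and both function names are assumptions, and real traffic may well grow differently.

```python
import math

def percentile_95(samples):
    """Return the 95th percentile of utilization samples (nearest-rank method)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def months_until_threshold(p95_now_pct, threshold_pct, monthly_growth):
    """Months until the 95th percentile reaches the threshold.

    Assumes compound growth: utilization is multiplied by
    (1 + monthly_growth) every month.
    """
    if p95_now_pct >= threshold_pct:
        return 0.0          # already at or over the target
    if monthly_growth <= 0:
        return math.inf     # flat or shrinking traffic never gets there
    return math.log(threshold_pct / p95_now_pct) / math.log(1 + monthly_growth)
```

For example, a link whose 95th percentile sits at 50% and grows 3% per month reaches 70% in a little over 11 months under this model. That figure is the lead time you have.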

Then consider how long it will take you to upgrade that link. If it's a
matter of adding a couple of crossconnects, then you might just need a
week. If you have to ship and install optics, modules, a card, then add
another week. If you have to get a sales order signed by senior management,
add another week. If you have to put it through legal and finance, add a
month. (kidding) If you are doing your annual re-negotiation, well...good
luck.

It's always good to ask your circuit vendors what their lead times are, then
double them and add 5.
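
As a rough sketch, the rule of thumb above can be compared against the time you have left before the link crosses your threshold. The function names and the `internal_delay` parameter are made up for illustration, and everything is assumed to be in the same time unit as the vendor's quote (the "5" has no stated unit).

```python
def padded_lead_time(quoted):
    """Apply the 'double it and add 5' rule to a vendor-quoted lead time."""
    return 2 * quoted + 5

def must_order_now(time_until_threshold, quoted_lead_time, internal_delay=0):
    """True if the padded lead time no longer fits before the threshold is hit.

    internal_delay covers sign-offs, shipping, legal and so on (assumption:
    expressed in the same unit as the other two arguments).
    """
    total = padded_lead_time(quoted_lead_time) + internal_delay
    return total >= time_until_threshold
```

With a quoted lead time of 4, the padded figure is 13, so a link expected to cross its threshold in 12 needs an order placed already.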

And sometimes, if you need a low latency connection, traffic utilization
levels might not even be something you look at.

Louie
Peering Coordinator at a start-up ISP


Re: Bottlenecks and link upgrades [ In reply to ]
I've seen the weekly profiles of traffic sourced from caches for the major
global services (video, social media, search and general) for a specific
metro area.

For all services, the weekly profile is a repetition of the daily profile,
within +/- 20%.
That is: the weekly profile is obtained from the daily profile within +/-
20% of the average daily profile height.

Given this regularity, it seems that growth projections, as suggested by
Louie Lee, are meaningful.
That is, the weekly profile data seem to provide a sound empirical basis
for link upgrades.
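
One possible way to test that regularity on your own data is sketched below. It is an interpretation, not the method used for the observation above: it assumes equally spaced samples per day, takes "profile height" to mean the peak of the average daily profile, and uses hypothetical function names.

```python
def average_daily_profile(days):
    """Per-slot mean across days; each day is a list of equally spaced samples."""
    slots = len(days[0])
    if any(len(day) != slots for day in days):
        raise ValueError("every day needs the same number of samples")
    return [sum(day[s] for day in days) / len(days) for s in range(slots)]

def week_repeats_day(days, tolerance=0.20):
    """True if every sample stays within tolerance * peak of the average profile."""
    avg = average_daily_profile(days)
    band = tolerance * max(avg)  # 'height' taken as the peak of the average (assumption)
    return all(abs(day[s] - avg[s]) <= band
               for day in days for s in range(len(avg)))
```

If the check fails, the day that breaks the band is worth a look before trusting any projection built on the profile.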

Since I'm not an operator, my comments need to be sprinkled with a pinch of
salt :)

Cheers,

Etienne

--
Ing. Etienne-Victor Depasquale
Assistant Lecturer
Department of Communications & Computer Engineering
Faculty of Information & Communication Technology
University of Malta
Web. https://www.um.edu.mt/profile/etiennedepasquale
Re: Bottlenecks and link upgrades [ In reply to ]
On Sat, Aug 15, 2020, at 02:39, Louie Lee wrote:

> get an understanding of your traffic growth and try to project when you
> will reach that number. You have to decide whether you care about the
> occasional peak, or the consistent peak, or somewhere in between, like
> weekday vs weekends, etc. Now you know how much lead time you will have.

Get an understanding, and try to make a plan on the longer term (like 2-3 years) if you can. If you're reaching some important milestones (e.g. need to buy expensive hardware), make a presentation for the management.
You will definitely need adjustments, during the timespan covered (some things will need to be done sooner, others may leave you some extra time) but it should reduce the amount of surprise.

That is valid if you have visibility. If you don't (that may happen), the cheatsheet I described previously is a good start. It could be applied at $job[-1], where I applied it to grow the network from almost zero to 35 Gbps, and it is kind of applied at $job[$now] where long term visibility is kind of missing and we need to be ready for rapid capacity variations.

> And sometimes, if you need a low latency connection, traffic
> utilization levels might not even be something you look at.

This goes to the "understand your traffic" chapter. All the traffic (since sometimes there may be a mix, e.g. regular eyeball traffic + voice traffic).
Re: Bottlenecks and link upgrades [ In reply to ]
No plan survives contact with the enemy. Your carefully made growth
projection was fine until the brass made a deal with some major customer,
which caused a traffic spike. Or any of the infinite other events that
could and eventually will happen to you.

One hard thing, that almost everyone will get wrong at some point, is
simulating load in the event that multiple outages take some links out,
causing excessive traffic to reroute onto links that previously seemed fine.
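
A deliberately simplified sketch of such a simulation follows. It assumes a set of parallel links between two points, with traffic shared across the surviving links in proportion to their capacity; a real network needs the actual topology and routing behaviour, which this does not model. The function name is hypothetical.

```python
from itertools import combinations

def worst_case_after_failures(capacities_gbps, offered_gbps, max_failures):
    """Return (worst utilization, failed link indices) over all failure sets.

    Assumption: load is split in proportion to capacity, so every surviving
    link ends up at offered / (sum of surviving capacity).
    """
    worst = (offered_gbps / sum(capacities_gbps), ())  # baseline, nothing failed
    indices = range(len(capacities_gbps))
    for k in range(1, max_failures + 1):
        for failed in combinations(indices, k):
            surviving = sum(c for i, c in enumerate(capacities_gbps)
                            if i not in failed)
            util = float("inf") if surviving == 0 else offered_gbps / surviving
            if util > worst[0]:
                worst = (util, failed)
    return worst
```

With links of 100, 100 and 40 Gbps carrying 120 Gbps, everything looks fine at 50%, one failure pushes the survivors to about 86%, and losing both 100 Gbps links leaves the 40 Gbps link offered three times its capacity.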

Regards,

Baldur


Re: Bottlenecks and link upgrades [ In reply to ]
On Sat, Aug 15, 2020, at 11:35, Baldur Norddahl wrote:
> No plan survives contact with the enemy. Your carefully made growth
> projection was fine until the brass made a deal with some major
> customer, which caused a traffic spike.

Capacity planning also includes keeping an eye on what is being sold and what is being prepared.
Having the traffic more than double within a 48h timespan (peaks at N Gbps until day X, peaks at 2.5*N Gbps from day X+2 onwards) has been handled successfully when the correct information ("partner X will change delivery system") arrived 4 months in advance.

Having multiple 200 Mbps and 500 Mbps connections over an already-used 1 Gbps port and pretending that "everything's gonna be all right": in that case you should confront your enemy.

> Or any of the infinite other events that could and eventually will happen to you.

Among which you try to protect yourself against the most realistic ones.

> One hard thing, that almost everyone will get wrong at some point, is
> simulating load in the event that multiple outages take some links out,
> causing excessive traffic to reroute onto links that previously seemed
> fine.

You should scale the network to absorb a certain degree of "surprise"/damage, and clearly explain that beyond that certain level, service will be degraded (or even absent) and there is nothing that can and nothing that will be done immediately.

Every network fails at a certain moment in time. You just need to make sure you know how to make it work again, within a reasonable time frame. Or have a good run-away plan (sometimes this is the best solution).
Re: Bottlenecks and link upgrades [ In reply to ]
+1

You can't foresee everything, but no plan means foreseeing nothing, =
blindfold.

Cheers,

Etienne

Re: Bottlenecks and link upgrades [ In reply to ]
On 15/Aug/20 01:45, Radu-Adrian Feurdean wrote:

>
> I think you're over-confident.

If you can resist the "let me make a plan" offer that CFOs would want
you to give them, you can be confident :-). Because when it hits the
fan, the CFO will say, "But Feurdean said he would make a plan. If he
thought the situation was urgent, he didn't make it known clearly enough".

Better to say, "CFO, if you don't do this upgrade, the network breaks".

And walk away.

Don't accept risk on behalf of someone else, because at the end of the
day, no one will blame the network... but those that operate it.

Mark.
Re: Bottlenecks and link upgrades [ In reply to ]
On 15/Aug/20 10:47, Etienne-Victor Depasquale wrote:

> I've seen the weekly profiles of traffic sourced from caches for the
> major global services (video, social media, search and general) for a
> specific metro area.
>
> For all services, the weekly profile is a repetition of the daily
> profile, within +/- 20%. 
> That is: the weekly profile is obtained from the daily profile
> within +/- 20% of the average daily profile height.
>
> Given this regularity, as suggested by Louie Lee, then it seems that
> growth projections are meaningful.
> That is, the weekly profile data seem to provide a sound empirical
> basis for link upgrades.
>
> Since I'm not an operator, my comments need to be sprinkled with a
> pinch of salt :)

Provided your NMS has been stable over any period of time, you can
extract historical data over 1 year or more and see how linearly things
grew.
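
A small sketch of that linearity check, assuming you have already exported one value per period (say, the monthly 95th percentile) from the NMS. It is a plain least-squares fit; the function name and the use of R-squared as the "how linear" measure are choices made here, not something the NMS provides.

```python
def linear_fit(values):
    """Least-squares line through (0, v0), (1, v1), ...

    Returns (slope, intercept, r_squared); r_squared near 1 means the
    growth was close to linear over the period.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in values)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, values))
    r_squared = 1.0 if ss_tot == 0 else 1 - ss_res / ss_tot
    return slope, intercept, r_squared
```

A low R-squared over a year is a hint that a straight-line projection is the wrong model, for instance because growth is compounding or a one-off event shifted the baseline.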

It's sometimes difficult to see the growth rate when you are close to
the daily action.

Mark.
Re: Bottlenecks and link upgrades [ In reply to ]
On 15/Aug/20 11:35, Baldur Norddahl wrote:

> No plan survives contact with the enemy. Your carefully made growth
> projection was fine until the brass made a deal with some major
> customer, which caused a traffic spike. Or any of the infinite other
> events that could and eventually will happen to you.

That's why your operations teams cannot work separately from the Sales
teams. If a big deal is in the pipeline, there should be someone
operational to do a simple feasibility check to see if the segment in
question will handle the traffic. If not, defer to standard lead times
to deliver. Or even extended ones if the deal is larger than usual.
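
A simple feasibility check of that kind might look like the sketch below. It assumes the new customer's peak coincides with the existing peak (the pessimistic case) and uses a 50% utilization ceiling as the default; the function and parameter names are illustrative only.

```python
def deal_is_feasible(link_capacity_gbps, current_peak_gbps, deal_peak_gbps,
                     max_utilization=0.5):
    """Return (feasible, projected_utilization) for a segment.

    Assumption: peaks add up, i.e. the new traffic peaks at the same
    time as the traffic already on the link.
    """
    projected = (current_peak_gbps + deal_peak_gbps) / link_capacity_gbps
    return projected <= max_utilization, projected
```

If the check fails for any segment on the path, the deal goes through standard (or extended) lead times rather than being promised immediately.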


>
> One hard thing, that almost everyone will get wrong at some point, is
> simulating load in the event that multiple outages take some links out,
> causing excessive traffic to reroute onto links that previously seemed
> fine.

So rather than simulate, insure, I say. By insure, I mean upgrade each
and every backbone link when it hits 50%, and you'll have less to worry
about when things start crumbling all over the place.

Mark.
Re: Bottlenecks and link upgrades [ In reply to ]
On 15/Aug/20 12:32, Etienne-Victor Depasquale wrote:

> +1
>
> You can't foresee everything, but no plan means foreseeing nothing, =
> blindfold.

In the absence of guidance from your Sales team on a forecast, keep the
50% threshold trigger, and standardize on lead times if urgent
feasibilities don't immediately pass.

The more you do this, the more you will encourage better planning on the
Sales side. It just happens automatically.

The worst thing you can do for yourself and your team is try to be the
hero.

Mark.
