Mailing List Archive

FCC: Staff Report on T-Mobile Outage on June 15 2020
FCC Issues Staff Report On T-Mobile Outage

https://www.fcc.gov/document/fcc-issues-staff-report-t-mobile-outage-0


The outage was initially caused by an equipment failure and then
exacerbated by a network routing misconfiguration that occurred when
T-Mobile introduced a new router into its network. In addition, the outage
was magnified by a software flaw in T-Mobile’s network that had been
latent for months and interfered with customers’ ability to initiate or
receive voice calls during the outage.


[...]
44. While fiber link failures are common, PSHSB finds that these steps,
taken together, will reduce the likelihood that a fiber link failure could
result in the recurrence of a similar event in T-Mobile’s network because
traffic would be routed to an alternative path that could handle it.
Moreover, if such an event recurred on T-Mobile’s network, it would not
cause such a large service disruption because T-Mobile would have improved
its networks’ ability to manage congestion in the case of a similar event
and would have increased network capacity to maintain the network in a
working state even with an increased volume of traffic.
Re: FCC: Staff Report on T-Mobile Outage on June 15 2020
----- On Nov 12, 2020, at 9:35 AM, Sean Donelan sean@donelan.com wrote:

Hi,

> FCC Issues Staff Report On T-Mobile Outage
>
> https://www.fcc.gov/document/fcc-issues-staff-report-t-mobile-outage-0

This part, I find most interesting as well:

> However, they were unable to resolve the issue by restoring the link because
> the network management tools required to do so remotely relied on the same
> paths they had just disabled.

I can't begin to tell you how often I battled senior mgmt to get some investment
into an OOB network. This only proves the point.

Parantap, are you reading this? I know you are.

Thanks,

Sabri
Re: FCC: Staff Report on T-Mobile Outage on June 15 2020
The larger story here is...

"7. Routing. Routers connect T-Mobile’s LTE towers to T-Mobile’s LTE
network. These routers utilize a routing protocol called Open
Shortest Path First."

Calling Vijay Gill to the courtesy phone.

On Thu, Nov 12, 2020 at 3:16 PM Sabri Berisha <sabri@cluecentral.net> wrote:
>
> ----- On Nov 12, 2020, at 9:35 AM, Sean Donelan sean@donelan.com wrote:
>
> Hi,
>
> > FCC Issues Staff Report On T-Mobile Outage
> >
> > https://www.fcc.gov/document/fcc-issues-staff-report-t-mobile-outage-0
>
> This part, I find most interesting as well:
>
> > However, they were unable to resolve the issue by restoring the link because
> > the network management tools required to do so remotely relied on the same
> > paths they had just disabled.
>
> I can't begin to tell you how often I battled senior mgmt to get some investment
> into an OOB network. This only proves the point.
>
> Parantap, are you reading this? I know you are.
>
> Thanks,
>
> Sabri
Re: FCC: Staff Report on T-Mobile Outage on June 15 2020
> The larger story here is...
>
> "7. Routing. Routers connect T-Mobile’s LTE towers to T-Mobile’s LTE
> network. These routers utilize a routing protocol called Open
> Shortest Path First."

you can blow it with is-is, just as you can with ospf, just as you can
with pretty much any dynamic [routing] protocol. though i am an is-is
fanboy, i would not blame the protocol. and if they can not manage the
currently deployed protocol, i am not sure i would recommend they try a
delicate transition.

randy
Re: FCC: Staff Report on T-Mobile Outage on June 15 2020
On Fri, Nov 13, 2020 at 11:53 AM Randy Bush <randy@psg.com> wrote:
>
> > The larger story here is...
> >
> > "7. Routing. Routers connect T-Mobile’s LTE towers to T-Mobile’s LTE
> > network. These routers utilize a routing protocol called Open
> > Shortest Path First."
>
> you can blow it with is-is, just as you can with ospf, just as you can
> with pretty much any dynamic [routing] protocol. though i am an is-is
> fanboy, i would not blame the protocol. and if they can not manage the
> currently deployed protocol, i am not sure i would recommend they try a
> delicate transition.

Absolutely, all of these guns pointed at toes can be problematic.
I don't often get a chance to poke fun at vijay though :)

On the bright side that write up by TMO was pretty great as a read...
good details (or more than I expected from telco).