Mailing List Archive

Re: meeting agenda
>Hello Alan - Here's a PARTIAL list of presentations we've planned so far:

I hesitate to suggest a presentation, because I don't have any solutions,
only problems. A couple of months ago this list had a thread on communicating
network outage information, network usability, and generally keeping
network operators and users informed about the current state of the
net.

- Overall network reliability tracking, better or worse?

While individual network providers track their own network
reliability, is there a need to report and track some data
on an Internet-wide basis similar to the reliability reporting
done in other industries (telephone, airline, etc)?

While anyone could track network outages on their own through
massive invasive testing, it usually doesn't reveal the cause
of the outage. What is the biggest threat to network reliability?
A farmer with a backhoe, or a network engineer at a console?

Is there a neutral third-party which could blind and summarize
the data? I'm not an academic type, so I don't know what would
be involved in getting funding at one of the national labs for
such a project. Or do we wait for the FCC to mandate something?

- No one is perfect

Everyone should plan for the disaster which will hit their network
at some point. Whenever a network melts down, the next thing to go
is the NOC communication lines. I haven't seen a network provider
with sufficient staff to answer all the calls, and repair their
network at the same time when it goes down. Either calls go unanswered,
or the network doesn't get repaired, or sometimes both.

The 1-800 problem reporting method isn't scaling well. Alternatives?

- Everything hasn't failed at once [for a long time]

I don't think there has been an Internet-wide ('net-wide) failure
since BBN made Butterfly gateways and one lost its mind.

This means, even though one network provider is wiped out, other
networks could pass along reports about the current state of
the network. How can this reporting function be decentralized?

- Finally, keep network users informed

Since we have a hard time tracking who is using what (if we even
wanted to track users), out-of-band notification isn't great for
notifying users. Ideally the network itself could be used to
inform just those users affected why things aren't working. Any
chance of wedging a "user information" field into the IPng ICMP
destination unreachable message? It would be nice to tell the
user in the ICMP message: "Beep BOOP BEEP, We're sorry your
packet could not be delivered as addressed due to a ...." Instead
of waiting for the users to call the NOC which probably is already
snowed under with calls.

Since the 'net as a whole doesn't fail that often, but pieces
of the 'net fail frequently, in-band notification isn't as crazy
an idea as it seems.
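
To make the "user information" idea concrete, here is a minimal sketch
in Python. Everything about the layout is an assumption for
illustration: no such field exists in the ICMP specification, and the
function names and the trick of borrowing one byte of the unused
header field as a length are hypothetical. The checksum is left at
zero, since computing it needs the IPng pseudo-header.

```python
import struct

ICMPV6_DEST_UNREACHABLE = 1  # ICMPv6 type for "destination unreachable"


def pack_unreachable(code: int, user_info: str, invoking_packet: bytes) -> bytes:
    """Build a destination-unreachable message carrying a text note.

    HYPOTHETICAL layout: the first byte of the normally unused 4-byte
    field holds the length of the note, which precedes the invoking packet.
    """
    note = user_info.encode("utf-8")
    if len(note) > 255:
        raise ValueError("user_info must fit in 255 bytes")
    # type, code, checksum (0 in this sketch), note length, 3 pad bytes
    header = struct.pack("!BBHB3x", ICMPV6_DEST_UNREACHABLE, code, 0, len(note))
    return header + note + invoking_packet


def unpack_unreachable(message: bytes):
    """Return (code, user_info, invoking_packet) from a message built above."""
    msg_type, code, _checksum, note_len = struct.unpack("!BBHB3x", message[:8])
    if msg_type != ICMPV6_DEST_UNREACHABLE:
        raise ValueError("not a destination unreachable message")
    note = message[8:8 + note_len].decode("utf-8")
    return code, note, message[8 + note_len:]
```

A router would call pack_unreachable() with code 0 (no route to
destination) and a short reason; the user's stack would call
unpack_unreachable() and show the note instead of a bare error.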

Any thoughts how to turn this into a presentation topic?
--
Sean Donelan, Data Research Associates, Inc, St. Louis, MO
Affiliation given for identification not representation
Re: meeting agenda
......... Sean Donelan is rumored to have said:
]
] >Hello Alan - Here's a PARTIAL list of presentations we've planned so far:
]
] inform just those users affected why things aren't working. Any
] chance of wedging a "user information" field into the IPng ICMP
] destination unreachable message? It would be nice to tell the
] user in the ICMP message: "Beep BOOP BEEP, We're sorry your
] packet could not be delivered as addressed due to a ...." Instead

Cool! I could really dig that. Pop the tcp stack to jive on a
message, and the hubs and routers could determine the cause...
Very nice. My telnet throws back a message via icmp that the site
is unreachable because they don't know how to subnet and flap too
much.. ;)

Still doesn't change the fact that IPv6 will never solve any real
problems.

] Since the 'net as a whole doesn't fail that often, but pieces
] of the 'net fail frequently, in-band notification isn't as crazy
] an idea as it seems.

Hmm, how about this: any 'decent' ISP engages in a 'subscription'
broadcast system, where each site's web site is decorated with a
dynamic list of all significant outages? And if we, Global
Internet/MIDnet, go south, a nifty little modem dials up to DRA or
MCI and spouts out our problem description, and this propagates to
the group. I don't know the best way to make it happen, but I do
think with some discussion and brainstorming we could make it
work. Jonathan Heiliger (MFS) and I have been working on a draft
of such a thing, and would appreciate some input.
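
As a rough illustration of the propagation step only (this is not the
draft itself), here is a minimal Python sketch. The Provider class,
its method names and the report IDs are all hypothetical. Each
provider remembers which reports it has seen and passes new ones to
its peers, so a report still reaches the group even when the peering
graph has loops.

```python
class Provider:
    """One participant in a hypothetical outage-report exchange."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = {}  # report id -> report text

    def peer_with(self, other):
        """Set up a two-way report exchange with another provider."""
        self.peers.append(other)
        other.peers.append(self)

    def receive(self, report_id, text):
        """Record a report and forward it to peers, once per report id."""
        if report_id in self.seen:
            return  # already handled; stops loops in the peering graph
        self.seen[report_id] = text
        for peer in self.peers:
            peer.receive(report_id, text)
```

The provider with the outage only needs to reach one peer (by modem,
say); that peer calls receive() and the rest follows.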

] Any thoughts how to turn this into a presentation topic?

This is a good way. Let's see if others are interested. I'll see
you in San Diego, and if nothing else, let's sit down and talk.

-alan
Re: meeting agenda
On Sat, 13 Jan 1996, Sean Donelan wrote:

> I hesitate to suggest a presentation, because I don't have any solutions,
> only problems. A couple of months ago this list had a thread on communicating

Generally problems are what start people thinking about solutions.. ;-)

> While individual network providers track their own network
> reliability, is there a need to report and track some data
> on an Internet-wide basis similar to the reliability reporting
> done in other industries (telephone, airline, etc)?

Definitely. I see the critical need as publishing not generic uptime
statistics, but network health information. That not only tells users
why they can't reach X web site, but also helps consumers choose their
providers, among other benefits.

> While anyone could track network outages on their own through
> massive invasive testing, it usually doesn't reveal the cause
> of the outage. What is the biggest threat to network reliability?
> A farmer with a backhoe, or a network engineer at a console?

This sort of testing would also lead to false conclusions, IMHO. Lately
the backhoe appears to be taking the lead. :-)

> Is there a neutral third-party which could blind and summarize
> the data? I'm not an academic type, so I don't know what would
> be involved in getting funding at one of the national labs for
> such a project. Or do we wait for the FCC to mandate something?

I can't suggest a third-party at this point, but as my cohort Alan Hannan
pointed out, following the discussion last month we have begun working on
proposal material. Unfortunately it really hasn't received the
attention it should on my priority list.

The key in obtaining the data is for the individual providers themselves
to contribute it. There should be no reason why an organization needs to
maintain a staff simply to monitor another's backbone and publish the
results.

> is the NOC communication lines. I haven't seen a network provider
> with sufficient staff to answer all the calls, and repair their
> network at the same time when it goes down. Either calls go unanswered,
> or the network doesn't get repaired, or sometimes both.
>
> The 1-800 problem reporting method isn't scaling well. Alternatives?

Unfortunately the telephone is still one of the best methods for reliable
communication, IMHO. One of the bits that amazed me back in the day
('bout 2yrs ago) at BARRNet (not a pot shot) -- is that when the network
had an outage, an Email was sent to an outage list. Well, if your network
link is down -- how can you get the Email? One particular customer
mentioned to me that when I called him during an outage, it was the first
time he had been contacted during an outage. Normally he had to wait and
digest the Email after his service was restored.

> chance of wedging a "user information" field into the IPng ICMP
> destination unreachable message? It would be nice to tell the
> user in the ICMP message: "Beep BOOP BEEP, We're sorry your
> packet could not be delivered as addressed due to a ...." Instead
> of waiting for the users to call the NOC which probably is already
> snowed under with calls.

Nice idea, but much more difficult to implement. You're talking about
convincing quite a few people to implement it. Part of a NOC's mission
(or in certain cases a Customer Service Center's) is to deal with
incoming calls.

> Since the 'net as a whole doesn't fail that often, but pieces
> of the 'net fail frequently, in-band notification isn't as crazy
> an idea as it seems.

Definitely; the model we're currently toying with would be open
enough to be accessed both by providers' NOC staff as well as individual
consumers on the Internet. The access method would be Web based and
offer an Email interface for those desiring automated status reporting or
simply a different view.

Providers would be responsible for submitting incident reports and
keeping them current (e.g. ticket updates); and a user could browse at
his/her leisure.
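
For discussion, the model could be sketched roughly like this in
Python. It is only an illustration under my own assumptions: the
Incident and StatusBoard names, their fields and the text layout are
hypothetical, and nothing here is a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """One provider-submitted incident report (hypothetical fields)."""
    provider: str
    summary: str
    updates: list = field(default_factory=list)
    resolved: bool = False

    def add_update(self, note, resolved=False):
        """Append a ticket update; optionally mark the incident closed."""
        self.updates.append(note)
        if resolved:
            self.resolved = True


class StatusBoard:
    """Shared list of incidents that NOC staff and users can browse."""

    def __init__(self):
        self.incidents = []

    def submit(self, incident):
        self.incidents.append(incident)
        return incident

    def open_incidents(self, provider=None):
        """Unresolved incidents, optionally limited to one provider."""
        return [i for i in self.incidents
                if not i.resolved and (provider is None or i.provider == provider)]

    def as_text(self):
        """Plain-text view, e.g. for the body of an Email status report."""
        lines = []
        for i in self.open_incidents():
            latest = i.updates[-1] if i.updates else "no updates yet"
            lines.append(f"{i.provider}: {i.summary} ({latest})")
        return "\n".join(lines)
```

The same open_incidents() list could feed both the Web pages and the
Email interface, so the two views never disagree.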

> Any thoughts how to turn this into a presentation topic?

Probably lots more effort than has been put in thus far. :) I would
imagine a short presentation could be prepared explaining the model and
who has "agreed" to support it by NANOG. However, a worthwhile
presentation should include stats on who has used the service and if it's
worthwhile, etc.


\|/ _____ \|/
Jonathan Heiliger @~/ . . \~@ MFS Global Network Services, Inc.
________________________/_( \___/ )_\______________________________________
\__U__/
E-Mail: loco@mfst.com Data Services Network Engineering