Mailing List Archive

RING measurements don't match access-networks (Was: Some very nice broken IPv6 networks...)
On 2014-11-09 20:18, Job Snijders wrote:
> On Sun, Nov 09, 2014 at 08:03:01PM +0100, Jeroen Massar wrote:
>>> No. I feel that 250+ successes vs 10 failures is enough to conclude
>>> that Akamai and Google are *not* universally broken, far from it.
>>
>> Testing from colod boxes on well behaved networks (otherwise they would
>> not know or be part of the RING), while the problem lies with actual
>> home users is quite a difference.
>
> I can't comment on the validity of the tests performed, but I'd like to
> point out one thing: I like that the NLNOG RING is very diverse,
> especially in terms of the node's IPv6 connectivity.

As they are distributed amongst a large range of ISPs, that is correct.
But that primarily affects routing paths, not their link types.

> Some hosts are behind exotic 6to4 NATted tunnels,

I am a bit surprised by such a statement, or the need for it; everybody
knows of the positive value of the one true RING. But most of these
nodes are nicely sitting on colo'd boxes, and thus while that excludes
any problem in that setup for that single query made, it does not
exclude the problems reported.

Also, claiming that they are on 6to4, which is known to be false, is a
bit weird, as there is no 2002::/16 address among the participants' hosts:

$ wget https://ring.nlnog.net/images/ring-logos/participants.js
..
$ (for i in `cat participants.js | grep http | grep -v ris.ripe.net |
      cut -f8 -d\' | sed 's/,//g'`; do
    dig +short aaaa $i.ring.nlnog.net
  done) >> aaaa
$ grep 2002: aaaa
2607:f0d0:2002:6e::2

The below MTU tracepaths also do not reveal anything in 2002::/16
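
Note that the plain "grep 2002:" matches the string anywhere in an
address, which is why the non-6to4 address 2607:f0d0:2002:6e::2 shows up.
A minimal sketch of a stricter check, assuming one address per line on
stdin; the function name is_6to4 and the second sample address are made
up for illustration:

```shell
# Print only addresses whose first hextet is 2002, i.e. those inside the
# 6to4 prefix 2002::/16. Reads one address per line on stdin.
is_6to4() {
    grep -i -E '^2002:'
}

# The first address merely contains "2002" in its third hextet and is
# filtered out; the second (an illustrative 6to4 address) is printed.
printf '%s\n' '2607:f0d0:2002:6e::2' '2002:c000:204::1' | is_6to4
```

Run against the aaaa file as "is_6to4 < aaaa", it should print nothing if
no participant really sits in 2002::/16.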

And also "6to4 NAT" does not work unless you hack a kernel AND the NAT
box to understand that they need to use fake addresses in those 6in4
packets. Unless you mean "The NAT is terminated on a CPE that does NAT
and 6to4 but forwards packets to the CPE's internal network". In that
case though there is no NAT involved.

Hence, I can only assume that you meant "6in4" instead of "6to4", a
common mistake it seems. But let's not confuse the two, as they are
totally unlike each other from a "working" perspective.


But let's test MTU according to PMTU on those nodes:

(for i in `cat aaaa`; do echo "tracepath6 $i"; tracepath6 -n $i; done) >>mtu.txt

See attached.

Hmm, indeed what is odd about those traces is the "hops xx, back
xx+~40", indicating quite some asymmetric routing; but doing tracepath6s
from two sides indicates the same on both sides of a trace, something to
dig into one day.

Except for:

tracepath6 2a02:78:d443:8314::1
Resume: pmtu 1480 hops 11 back 53
tracepath6 2001:4830:123:1::d85d:f0b3
Resume: pmtu 1480 hops 18 back 52
tracepath6 2001:4c48:2:71::ffff
Resume: pmtu 1476 hops 13 back 54
tracepath6 2001:67c:105c:800::2
Resume: pmtu 1480 hops 14 back 54
tracepath6 2001:4830:ed::2
Resume: pmtu 1480 hops 16 back 51
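
A list like the one above can be pulled out of mtu.txt mechanically. A
rough sketch, assuming the file layout produced by the loop above (a
"tracepath6 ADDRESS" marker line followed by that trace and its "Resume:"
line); the function name low_pmtu is made up for illustration:

```shell
# Print "address pmtu" for every trace whose "Resume:" line reports a
# path MTU below 1500. Reads the named files, or stdin if none are given.
low_pmtu() {
    awk '
        # Marker line written by the echo in the loop: remember the target.
        $1 == "tracepath6" { addr = $2 }
        # Summary line of a trace: report it if the PMTU is reduced.
        $1 == "Resume:" && $2 == "pmtu" && $3 + 0 < 1500 { print addr, $3 }
    ' "$@"
}
```

Usage would be "low_pmtu mtu.txt". Traces that never produce a "Resume:"
line are not reported, so the broken ones still need a look by hand.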


Also, broken pMTU/traceroute for:

2a02:58:3:110::23:1
2a01:310:8312:1001::19
2a00:1f00:dc06:11::10
2001:48c8:3:2::2
2607:fcc0:2:1:208:70:247:50
2001:67c:2274:4021::101
2001:67c:2814:60::7c:1
2001:4ce8:c134:1019::1 <--- routing loop
2001:8d8:580:400::1:0
2a03:3400:3:2501::150:255
2a00:1798:fff0::4
2001:4b28:198:0:21c:42ff:fe2a:d38a <-- intermediary nodes broken
2a01:528:2000:0:21c:42ff:fe4a:c3a0 <-- "" (same network it seems)
2001:500:a:500::229
2a01:8e40:1:12::2
2a03:b980:c037::1
2a02:e0:1::2
2001:12f0:500:1dc::1916:12 <-- routing loop
2001:67c:2a0:8::beef <-- intermediary nodes broken
2405:fc00::a001
2a02:74c0::232:52
2620:0:2d0:203::158
2620:0:2830:203::158
2a02:c10:3f00:5::2
2a02:c10:3f00:7::2
2001:fb0:100:ffff:211:25ff:fe40:9468 <-- intermediates broken
2a00:1c18:0:101::5
2a00:59c0:1:3::130
2a02:480:3411:6::2
etc...

Actually, what that demonstrates is that out of a set of well-managed
nodes, there are issues with PMTU. Hence, please contact these networks
(they are part of the ring, thus that should be easy) and make them
behave properly. That solves another set of problems.


From the above, it is actually pure magic that the majority has working
connectivity. Now, the above nodes likely are not affected by PMTUD as
they are likely on full 1500-MTU networks...


You might want to run a similar one from all the nodes, thus making a
cross-RING MTU determination to get even better results, as I did it
only from one vantage point. Maybe time for ring-mtu? :)

And of course, contacting those nodes as they obviously have broken
connectivity; as you have the contacts directly, it should be quick to
get resolved.

> others behind regular
> tunnels, some inadvertently block useful ICMPv6 messages, some networks
> are just broken.

Yes, there were 10 broken nodes in the list, but apparently quite a few
more nodes have issues.

> For NLNOG RING applications we mandate that there is 1 globally unique
> IPv6 address on the host, we do not specify how this should be
> accomplished. This leads to some variety, not all of those
> implementations I would describe as "well behaved".

While that is absolutely true, most of those boxes ARE well behaved, as
the providers involved will make sure that there are no problems.

And they definitely are not located in an access network on a dingy home
DSL line...

Greets,
Jeroen
Re: RING measurements don't match access-networks (Was: Some very nice broken IPv6 networks...)
> Also, broken pMTU/traceroute for:
>
> 2a02:58:3:110::23:1
> 2a01:310:8312:1001::19
> 2a00:1f00:dc06:11::10
> 2001:48c8:3:2::2
> 2607:fcc0:2:1:208:70:247:50
> 2001:67c:2274:4021::101

I took a look at the tracepath information you sent for these nodes,
which showed a bunch of unresponsive nodes but no information that
might be useful for assigning blame. It'd be cool to see these paths
with scamper's pmtud traceroute, which tries to find out the MTU for
the hops that aren't sending a PTB.

with that list of IP addresses:
scamper -c "trace -P udp-paris -M" -f <file>

http://www.caida.org/tools/measurement/scamper/
http://www.caida.org/~mjl/pubs/debugging-pmtud.imc2005.pdf

Happy to help any way I can (I wrote scamper)

Matthew
Re: RING measurements don't match access-networks (Was: Some very nice broken IPv6 networks...)
On 2014-11-11 06:38, Matthew Luckie wrote:
>> Also, broken pMTU/traceroute for:
>>
>> 2a02:58:3:110::23:1
>> 2a01:310:8312:1001::19
>> 2a00:1f00:dc06:11::10
>> 2001:48c8:3:2::2
>> 2607:fcc0:2:1:208:70:247:50
>> 2001:67c:2274:4021::101
>
> I took a look at the tracepath information you sent for these nodes,
> which showed a bunch of unresponsive nodes but no information that
> might be useful for assigning blame. It'd be cool to see these paths
> with scamper's pmtud traceroute, which tries to find out the MTU for
> the hops that aren't sending a PTB.
>
> with that list of IP addresses:
> scamper -c "trace -P udp-paris -M" -f <file>

See attached f.out, though I used:

(for i in `cat f`; do echo "==================== $i"; tracepath6 -n $i;
scamper -I "trace -P udp-paris -M $i"; done) >>f.out

This is to show the differences between tracepath6 and scamper output;
there are some to be seen, some quite scary (e.g. the 1455 change).
It could be that one just gets through the ICMP rate limits in one run
and not the other.

Those nodes are just blackholes it seems. Only the operators of that
network will know what is going on.

I am always surprised to see networks filtering out packets, and
especially wonder what they are trying to achieve with such a filter.

> http://www.caida.org/tools/measurement/scamper/
> http://www.caida.org/~mjl/pubs/debugging-pmtud.imc2005.pdf
>
> Happy to help any way I can (I wrote scamper)

I am quite aware. Great tool, but not very verbose unfortunately. Hence,
typically it just does/outputs nothing.

Greets,
Jeroen
Re: RING measurements don't match access-networks (Was: Some very nice broken IPv6 networks...)
On Tue, Nov 11, 2014 at 12:56:06PM +0100, Jeroen Massar wrote:
> On 2014-11-11 06:38, Matthew Luckie wrote:
> >> Also, broken pMTU/traceroute for:
> >>
> >> 2a02:58:3:110::23:1
> >> 2a01:310:8312:1001::19
> >> 2a00:1f00:dc06:11::10
> >> 2001:48c8:3:2::2
> >> 2607:fcc0:2:1:208:70:247:50
> >> 2001:67c:2274:4021::101
> >
> > with that list of IP addresses:
> > scamper -c "trace -P udp-paris -M" -f <file>
>
> See attached f.out, though I used:
>
> (for i in `cat f`; do echo "==================== $i"; tracepath6 -n $i;
> scamper -I "trace -P udp-paris -M $i"; done) >>f.out
>
> This to show the difference between tracepath6 and scamper output, there
> are some to be seen, some quite scary (eg the 1455 change).
> Could be that one just gets through the ICMP ratelimits in one run and
> not the other.

Unsure what happened with that one in your file, scamper got an
address unreachable response eventually. When I tried it just now
it worked and has a 1500 PMTU.

traceroute from 2001:48d0:101:501:d267:e5ff:fe14:a701 to 2a01:310:8312:3900::2
1 2001:48d0:101:501::18 0.195 ms [mtu: 1500]
2 2001:468:e00:c48::1 2.592 ms [mtu: 1500]
3 2607:f380::108:9a41:af60 3.132 ms [mtu: 1500]
4 2001:468:f000:2300::1 2.622 ms [mtu: 1500]
5 2001:468:f000:f17::1 11.068 ms [mtu: 1500]
6 2001:504:d::5580:1 20.959 ms [mtu: 1500]
7 2a02:d28:5580:1::d1 69.738 ms [mtu: 1500]
8 2a02:d28:5580::1:c9 75.740 ms [mtu: 1500]
9 2a02:d28:5580:5:1008::5 155.666 ms [mtu: 1500]
10 2a02:d28:5580:1::136 159.654 ms [mtu: 1500]
11 2a02:d28:5580:1::156 159.484 ms [mtu: 1500]
12 2a02:d28:5580:e:1000::136 158.690 ms [mtu: 1500]
13 2a01:310:8312:3900::2 158.653 ms [mtu: 1500]

For the rest, those addresses are just unresponsive to traceroute,
i.e. no port unreachable response comes back from the destination.

> I am always surprised to see networks filtering out packets, and
> especially wonder what they are trying to achieve with such a filter.
>
> > http://www.caida.org/tools/measurement/scamper/
> > http://www.caida.org/~mjl/pubs/debugging-pmtud.imc2005.pdf
> >
> > Happy to help any way I can (I wrote scamper)
>
> I am quite aware. Great tool, but not very verbose unfortunately. Hence,
> typically it just does/outputs nothing.

It is designed for doing lots of measurements in parallel, so does not
output anything until it is done. To do PMTU debugging, it relies
on the end host being responsive to at least some probes, to distinguish
between all packets being discarded, and just big ones.

If you want the full details on what is going on, output to warts
format and use sc_wartsdump or sc_warts2json.
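
A rough sketch of that workflow, assuming scamper's "-O warts" and "-o"
output options and that sc_wartsdump takes the file as an argument; the
helper name warts_trace and the file names are made up for illustration:

```shell
# Hypothetical helper: trace every address in a list file, store the
# results in warts format, then dump the recorded details.
#   $1 = file with one address per line
#   $2 = warts output file (defaults to trace.warts)
warts_trace() {
    list=$1
    out=${2:-trace.warts}
    scamper -O warts -o "$out" -c "trace -P udp-paris -M" -f "$list" &&
        sc_wartsdump "$out"
}

# Example: warts_trace f f.warts
```

Swapping sc_wartsdump for sc_warts2json in the last step should give the
same data as JSON instead.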

Matthew
Re: RING measurements don't match access-networks (Was: Some very nice broken IPv6 networks...)
On 2014-11-11 18:29, Matthew Luckie wrote:
> On Tue, Nov 11, 2014 at 12:56:06PM +0100, Jeroen Massar wrote:
>> On 2014-11-11 06:38, Matthew Luckie wrote:
>>>> Also, broken pMTU/traceroute for:
>>>>
>>>> 2a02:58:3:110::23:1
>>>> 2a01:310:8312:1001::19
>>>> 2a00:1f00:dc06:11::10
>>>> 2001:48c8:3:2::2
>>>> 2607:fcc0:2:1:208:70:247:50
>>>> 2001:67c:2274:4021::101
>>>
>>> with that list of IP addresses:
>>> scamper -c "trace -P udp-paris -M" -f <file>
>>
>> See attached f.out, though I used:
>>
>> (for i in `cat f`; do echo "==================== $i"; tracepath6 -n $i;
>> scamper -I "trace -P udp-paris -M $i"; done) >>f.out
>>
>> This to show the difference between tracepath6 and scamper output, there
>> are some to be seen, some quite scary (eg the 1455 change).
>> Could be that one just gets through the ICMP ratelimits in one run and
>> not the other.
>
> Unsure what happened with that one in your file, scamper got an
> address unreachable response eventually. When I tried it just now
> it worked and has a 1500 PMTU.

I would say, somebody is reading along and fixed it, same result here:

$ scamper -I "trace -P udp-paris -M 2a01:310:8312:3900::2"
traceroute from 2a02:2528:fa:420::66 to 2a01:310:8312:3900::2
1 2a02:2528:fa:420::1 0.201 ms [mtu: 1500]
2 2a02:2528:503:2::ffff 0.556 ms [mtu: 1500]
3 2a02:2528:502:1::1 13.036 ms [mtu: 1500]
4 2001:7f8:c:8235:194:42:48:80 1.780 ms [mtu: 1500]
5 2001:7f8:24::ad 2.455 ms [mtu: 1500]
6 2a02:d28:5580:1::a2 19.674 ms [mtu: 1500]
7 2a02:d28:5580::1:36 24.782 ms [mtu: 1500]
8 2a02:d28:5580:1::145 17.503 ms [mtu: 1500]
9 2a02:d28:5580:1::21 28.141 ms [mtu: 1500]
10 2a02:d28:5580:e:1000::136 17.942 ms [mtu: 1500]
11 2a01:310:8312:3900::2 18.194 ms [mtu: 1500]

$ tracepath6 -n 2a01:310:8312:3900::2
1?: [LOCALHOST] 0.033ms pmtu 1500
1: 2a02:2528:fa:420::1 0.126ms
1: 2a02:2528:fa:420::1 0.070ms
2: 2a02:2528:503:2::ffff 131.162ms
3: 2a02:2528:502:1::1 2.183ms
4: 2001:7f8:c:8235:194:42:48:80 6.248ms
5: 2001:7f8:24::ad 2.831ms
6: 2a02:d28:5580:1::a2 17.387ms asymm 8
7: 2a02:d28:5580::1:36 17.438ms asymm 8
8: 2a02:d28:5580:1::145 18.135ms asymm 9
9: 2a02:d28:5580:1::21 28.188ms
10: 2a02:d28:5580:e:1000::136 18.782ms asymm 11
11: 2a01:310:8312:3900::2 18.558ms reached
Resume: pmtu 1500 hops 11 back 54

[..]
>>> http://www.caida.org/tools/measurement/scamper/
>>> http://www.caida.org/~mjl/pubs/debugging-pmtud.imc2005.pdf
>>>
>>> Happy to help any way I can (I wrote scamper)
>>
>> I am quite aware. Great tool, but not very verbose unfortunately. Hence,
>> typically it just does/outputs nothing.
>
> It is designed for doing lots of measurements in parallel, so does not
> output anything until it is done. To do PMTU debugging, it relies
> on the end host being responsive to at least some probes, to distinguish
> between all packets being discarded, and just big ones.
>
> If you want the full details on what is going on, output to warts
> format and use sc_wartsdump or sc_warts2json.

Well, a simple '-V' or --verbose option would be useful too ;)

Greets,
Jeroen