Paul Jakma wrote:
>>We are also testing another patch which improves the quality of
>>IPv6 dumps by including the IPv6 nexthop address in "dump bgp
>>routes-mrt" dumps and by providing the correct IPv6 addresses of
>>peers using IPv6 peerings in "dump bgp updates" dumps.
>
> Sounds useful, but are there people out there with tools/scripts/etc
> who depend on the existing format? And how would they be affected?
> AFAICT, it doesnt change AFI_IP dumps.
That's correct, the patch only adds the extra attribute to AFI_IP6
dumps. We have tested route_btoa on the resulting dumps, since that is
what many people use, and the results are as follows:
If route_btoa is compiled with IPv6 support (e.g. on Solaris), it works
fine. It even picks up the extra info added by our patches.
If route_btoa is compiled without IPv6 support, it has two problems:
1. Sometimes (depends on context) it quits halfway through the
IPv6 portion of RIB dumps with a "MSG_NULL" message. We don't
see this as a big problem, since:quagga dumps all AFI_IP routes
before starting on AFI_IP6, soo route_btoa will already have
decoded all IPv4 routes when this occurs. Also, if route_btoa
is compiled without IPv6 support, it can't decode AFI_IP6
records anyway (it just prints empty records). So you're
actually not losing any information unless you are simply
counting the number of INET6 records.
2. It wrongly decodes all IPv6 updates as NULL messages with
garbage FROM: and TO: IPv4 addresses. Without the patch, it
does marginally better, decoding IPv6 updates as
empty UPDATE messages with no FROM: or TO:.
Again, it can't decode the IPv6 stuff anyway, so the only
way someone can run into problems is if he is counting the
number of updates or if he is parsing the IPv4 addresses
from NULL records, which he probably shouldn't be doing.
So, by and large, if people only look at IPv4 and do a minimum of sanity
checking (e.g. not blindly supposing that everything is an IPv4 update),
everything is fine.
We have a simple patch for route_btoa that fixes these problems. The
patch is only cosmetic though; route_btoa still can't decode IPv6 dumps
unless it is compiled with IPv6 support. it is available at:
http://www.ris.ripe.net/source/mrt-patches/mrt-2.2.2a-v6-wallpaper.patch These problems seem to be due to the fact that route_btoa takes an "all
or nothing" approach to IPv6 and unknown attributes in general: instead
of skipping over them, it often attempts to decode them, only to abort
in the middle, get out of sync, and bail out. This is probably because
the BGP library that route_btoa uses is written with a routing daemon,
not a decoding tool, in mind.
Probably, purpose-written decoding tools are more robust in the face of
unknown but well-formed input and cleanly skip over the parts they don't
understand.
>>1. Dumps are produced at fixed times during the day and not at fixed
>>time intervals after bgpd is started.
>
> sounds like a good idea, one question though - what is interval2 for?
> it doesnt appear to be used for anything.
Sorry, I made a mistake when forward porting the patch. The one attached
to this message should be correct.
>>quagga-dump-v6-nexthop-and-peers.patch
>
>>--------------------------------------
>>1. The IP addresses of peers in "dump bgp updates" dumps are correct: if
>>it's an IPv4 peering, the peer's IPv4 address is dumped; if it's an IPv6
>>peering, the IPv6 address is dumped.
>
> as opposed to present where IPv4 is dumped regardless?
I don't really understand the code, but it doesn't seem that way. For
example, my test environment consists of an IPv4 peering over which I
exchange IPv6 routes. The updates dumped are wrongly marked as AFI_IP6
and contain garbage IPv6 addresses in the source and destination IP
address fields.
I think that the code seems to be confusing the address family of the
NLRI information exchanged via BGP with the address family of the socket
the peering is run on, which of course may be different.
>>2. IPv6 next hop values are included in "dump bgp routes-mrt" dumps.
>>This is done by constructing an artificial MP_REACH_NLRI attribute with
>>this information and adding it to each IPv6 prefix dumped.
>
>
> sounds reasonable.
>
> aside from wondering whether anyone would presently rely on IPv4
> nexthops for IPv6 routes being dumped - eg IPv6 address family
> prefixes can be carried over IPv4 peerings surely? - the patch looks
> grand.
Yes, any other address family's routes can be carried over an IPv4
peering. But the relevant next hop is the one contained in the MP_NLRI
attribute, and is of the same family as the routes exchanged. In fact,
if an IPv4 peering only exchanges IPv6 routes, that the IPv4 next hop
should not be included all, as stated in RFC 2858:
> An UPDATE message that carries no NLRI, other than the one encoded in
> the MP_REACH_NLRI attribute, should not carry the NEXT_HOP attribute.
> If such a message contains the NEXT_HOP attribute, the BGP speaker
> that receives the message should ignore this attribute.
An IPv4 next hop makes sense if an update contains both IPv4 and IPv6
routes, but here we're only dumping one IPv6 route, so only the IPv6
next hop is needed.
A different matter, of course, is the address of the peer, which may be
an IPv4 address even for an IPv6 prefix. But we don't store the IP
address of the peer in RIB dumps anyway, not even for IPv4.
> - else if (peer->afc[AFI_IP6][SAFI_UNICAST])
> + else if (peer->su.sa.sa_family == AF_INET6)
> {
> /* Interface Index and Address family. */
> stream_putw (obuf, peer->ifindex);
>
> Surely testing for configured address-family, rather than the socket
> family, is the more correct behaviour?
Hmmm... but from what I see in the code, peer->su _is_ determined
directly from the configuration. peer_remote_as_vty() calls
str2sockunion() on the neighbor address string, and obtains su. Then it
passes su to peer_remote_as(), which passes it to peer_create, which
actually creates the peer and stuffs su into peer->su. The peer's
configured address family doesn't seem to be stored anywhere else. Am I
missing something?
Regards,
Lorenzo