Mailing List Archive

Duplication of kernel routes
Hello,

I’m running into a peculiar problem, and I’m hoping someone can give me some pointers for further diagnostics, or even ways to mitigate it.

I’m running quagga-1.1.1-68.1 on SLES11SP4, 3.0.101-108.21-default #1 SMP Thu Jan 11 12:58:47 CET 2018 x86_64, with current patches. zebra, ospfd, ospf6d and bgpd are running.

I can reproduce the issue on a vSphere VM with 3 vmxnet interfaces, two of which are bonded as an active/backup pair. On the production machine, it’s two bonded interfaces with LACP. The production setup is a pair of machines with keepalived, where VLANs are split between both nodes (odd/even VLAN numbers).

The router receives a full table.
ip -4 route show | wc -l: 669651
ip -6 route show | wc -l: 43739

There are quite a number of customer machines connected to various VLANs; only a single VLAN is producing this effect.

The problem presents like this: whenever one of the customer FreeBSD 11.1 machines has an active IPv6 configuration, the number of IPv6 kernel routes on the router balloons. About every 30 seconds, the number of routes increases by one full table size (adding about 45,000); at some point, the entire IPv6 table gets reset and the process starts over. IPv4 appears to be completely unaffected. Other customers’ machines running FreeBSD do not exhibit this problem, nor do any others.
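
The growth pattern described above could be recorded with a small sampling loop. This is only a sketch, assuming a POSIX shell and the same iproute2 ip command used elsewhere in this post; the function names count_routes and watch_routes are made up for illustration.

```shell
# count_routes: number of IPv6 routes currently in the kernel table.
# tr strips the padding that some wc implementations add.
count_routes() {
    ip -6 route show | wc -l | tr -d ' '
}

# watch_routes INTERVAL SAMPLES: print "TIME COUNT DELTA" once per INTERVAL
# seconds, SAMPLES times. DELTA is the change since the previous sample
# ("-" for the first one). Hypothetical helper, for illustration only.
watch_routes() {
    interval=$1
    samples=$2
    prev=
    i=0
    while [ "$i" -lt "$samples" ]; do
        now=$(count_routes)
        if [ -n "$prev" ]; then
            delta=$(($now - $prev))
        else
            delta=-
        fi
        echo "$(date +%H:%M:%S) $now $delta"
        prev=$now
        i=$(($i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

Running watch_routes 10 30 would sample every 10 seconds for about five minutes; a delta close to one full table roughly every 30 seconds, followed by a large negative delta at the reset, would match the behaviour described here.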

The FreeBSD configuration appears to be standard: NDP and auto link local are active and seemingly working OK; no routing protocol is active. With NUD active, there are about 5-10 NDP packets per minute, all confirming reachability with solicited requests. With NUD off, the frequency drops to about 2 NDP packets per minute. The customer VLAN has about 4 MACs active, two of which are locally administered.
# ifconfig bridge0 inet6
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: vm-bridge0
	inet6 fe80::3c:9fff:fe37:2500%bridge0/64 scopeid 0x7
	inet6 2a00:14b0:4200:xyza::1e1/64
	inet6 2a00:14b0:4200:xyza::1e2/128
	inet6 2a00:14b0:4200:xyza::1fd/128
	inet6 2a00:14b0:4200:xyza::1e5/128
	nd6 options=8020<AUTO_LINKLOCAL,DEFAULTIF>
# ndp -i bridge0
linkmtu=0, maxmtu=0, curhlim=64, basereachable=30s0ms, reachable=43s, retrans=1s0ms
Flags: auto_linklocal

The effect stops within minutes once both machines’ IPv6 interfaces are disabled.

tcpdump/wireshark traces don’t seem to be showing anything out of the ordinary. Traffic on the VLAN is very moderate.

What’s puzzling to me is that there are lots of identical duplicate routes installed in the kernel:
# ip -6 route show | grep 2a00:14b0:4200:xyza
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
2a00:14b0:4200:xyza::/64 dev vlan503 proto kernel metric 256
^C
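
To put numbers on the duplication instead of reading it off the screen, the route dump can be collapsed with sort and uniq. A minimal sketch, assuming a POSIX shell; the function name dup_routes is made up for illustration, and it only processes text on standard input.

```shell
# dup_routes: read route lines on stdin and print "COUNT ROUTE" for every
# line that occurs more than once, most frequent first.
# Hypothetical helper name; not part of iproute2 or quagga.
dup_routes() {
    sort | uniq -c |
        awk '$1 > 1 { n = $1; sub(/^ *[0-9]+ /, ""); print n, $0 }' |
        sort -rn
}
```

For example, ip -6 route show | dup_routes | head would list the ten most duplicated routes together with their counts.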

I’m at a loss as to how and why these duplicate routes are being installed. I would appreciate any hints on how to diagnose this further.

The following versions of quagga were tried, without any change in behaviour:
# 2012-10-02 17:21:35 quagga-0.99.15-0.10.1.x86_64.rpm installed ok
# 2013-01-14 18:55:55 quagga-0.99.15-0.12.1.x86_64.rpm installed ok
# 2013-09-27 18:27:25 quagga-0.99.15-0.14.1.x86_64.rpm installed ok
# 2015-08-26 16:40:56 quagga-0.99.15-0.14.11.x86_64.rpm installed ok
# 2016-04-06 15:40:30 quagga-0.99.15-0.21.1.x86_64.rpm installed ok
# 2016-06-08 10:45:33 quagga-0.99.15-0.24.2.x86_64.rpm installed ok
# 2016-10-26 14:55:44 quagga-0.99.15-0.29.1.x86_64.rpm installed ok
# 2017-04-18 15:48:51 quagga-1.1.1-65.1.x86_64.rpm installed ok
# 2017-12-12 16:41:30 quagga-1.1.1-68.1.x86_64.rpm installed ok


Thanks,
Stefan

--
Stefan Bethke <stb@lassitu.de> Fon +49 151 14070811
Re: Duplication of kernel routes
It appears the issue was triggered by configuring a global unicast address (GUA) on the router's interface and pointing the default route of the hosts in the VLAN at that GUA.

Using router advertisements (RAs) and having the default route point at the link-local address (LLA) of the router's interface stopped the duplicate routing table entries from appearing.

> On 16.01.2018, at 19:48, Stefan Bethke <stb@lassitu.de> wrote:

--
Stefan Bethke <stb@lassitu.de> Fon +49 151 14070811