Mailing List Archive

kernel routing socket handling problems
I found that zebra was crashing with a segfault and had a trashed
stack. I traced the problem to zebra/kernel_socket:ifm_read() which
assumed that a struct if_msghdr was always followed by a sockaddr_dl.
NetBSD sends if_msghdr with status on interface change, and there were
two problems:

1) ifm_read was copying junk from after the ifm_read using a length
read from the junk.

2) ifm_read was failing to match the if_msghdr with a struct
interface, and thus failing to cause the interface to go up and
down.

With the following patch zebra does not crash, and doing 'ifconfig
foo0 down' causes the routes to foo0 (which are sent via 'redistribute
connected') to be withdrawn over OSPF and RIPng. ifconfig up causes
the routes to be restored.

On NetBSD, ifm_read is apparently not used for getting the initial
interface list.

I plan to commit this soon if it runs stably for a while.

Comments?


Index: kernel_socket.c
===================================================================
RCS file: /var/cvsroot/quagga/zebra/kernel_socket.c,v
retrieving revision 1.10
diff -u -r1.10 kernel_socket.c
--- kernel_socket.c 22 Oct 2003 02:51:38 -0000 1.10
+++ kernel_socket.c 4 Jan 2004 01:04:22 -0000
@@ -211,7 +211,11 @@
}
#endif /* RTM_IFANNOUNCE */

-/* Interface adding function called from interface_list. */
+/*
+ * Handle struct if_msghdr obtained from reading routing socket or
+ * sysctl (from interface_list).
+ * There may or may not be sockaddrs present after the header.
+ */
int
ifm_read (struct if_msghdr *ifm)
{
@@ -219,6 +223,17 @@
struct sockaddr_dl *sdl = NULL;
char ifname[IFNAMSIZ];

+ /* paranoia - should not happen */
+ if (ifm->ifm_msglen < sizeof(struct if_msghdr))
+ {
+ zlog_err ("ifm_read: ifm->ifm_msglen %d too short\n",
+ ifm->ifm_msglen);
+ return -1;
+ }
+
+ /*
+ * Check for a sockaddr_dl following the message.
+ */
#ifdef SUNOS_5
int i;
struct sockaddr *sa;
@@ -247,32 +262,64 @@
}
}
#else
- sdl = (struct sockaddr_dl *)(ifm + 1);
+ /* sockaddrs_present? */
+ if (ifm->ifm_addrs)
+ {
+ if (ifm->ifm_addrs == RTA_IFP)
+ {
+ /* just the one we want */
+ sdl = (struct sockaddr_dl *)(ifm + 1);
+ }
+ else
+ {
+ /* Perhaps not an error, but more complicated parsing is needed. */
+ zlog_err ("ifm_read: ifm_addrs != 0 != RTA_IFP %x\n",
+ ifm->ifm_addrs);
+ return -1;
+ }
+ }
#endif

/*
+ * Look up on ifindex. This is useful if this is an up/down
+ * notification for an interface of which we are already aware.
+ * (This happens on NetBSD 1.6.2, for example.)
+ */
+ if (ifp == NULL)
+ ifp = if_lookup_by_index (ifm->ifm_index);
+
+ /*
* Check if ifp already exists. If the interface has already been specified
* in the conf file, but is just getting created, we would have an
* entry in the iflist with incomplete data (e.g., ifindex == -1),
* so we lookup on name.
*/
- if (sdl != NULL)
+ if (ifp != NULL && sdl != NULL)
{
- memcpy (ifname, sdl->sdl_data, sdl->sdl_nlen);
- ifname[sdl->sdl_nlen] = '\0';
+ /* Avoid relying on sdl->sdl_nlen. */
+ bzero(ifname, sizeof(ifname));
+ memcpy (ifname, sdl->sdl_data, sizeof(ifname)-1);
ifp = if_lookup_by_name (ifname);
}

+ /* New interface? */
if ((ifp == NULL) || (ifp->ifindex == -1))
{
/* Check interface's address.*/
if (! (ifm->ifm_addrs & RTA_IFP))
{
- zlog_warn ("There must be RTA_IFP address for ifindex %d\n",
+ zlog_warn ("ifindex %d: RTA_IFP sockaddr required for new interface\n",
ifm->ifm_index);
return -1;
}

+ if (sdl == NULL)
+ {
+ /* Should not happen with RTA_IFP check. */
+ zlog_warn ("ifm_read: no sockaddr_dl present");
+ return -1;
+ }
+
if (ifp == NULL)
ifp = if_create (sdl->sdl_data, sdl->sdl_nlen);

@@ -296,6 +343,7 @@
if_add_update (ifp);
}
else
+ /* Flag change notification on existing interface. */
{
/* There is a case of promisc, allmulti flag modification. */
if (if_is_up (ifp))
@@ -823,6 +871,12 @@
rtmsg_debug (&buf.r.rtm);

rtm = &buf.r.rtm;
+
+ if (rtm->rtm_msglen != nbytes)
+ {
+ zlog_warn ("kernel_read: rtm->rtm_msglen %d, nbytes %d, type %d\n",
+ rtm->rtm_msglen, nbytes, rtm->rtm_type);
+ }

switch (rtm->rtm_type)
{
Re: kernel routing socket handling problems [ In reply to ]
On Sat, 3 Jan 2004, Greg Troxel wrote:

> Comments?

use memset instead of bzero. looks grand otherwise, apart from one
tiny question:

> + /* Avoid relying on sdl->sdl_nlen. */
> + bzero(ifname, sizeof(ifname));
> + memcpy (ifname, sdl->sdl_data, sizeof(ifname)-1);

surely if sdl is present and sdl->sdl_nlen is present, it will be
valid? further, could it occur that sdl_nlen could ever possibly be
0?

if you want to be ultra-paranoid (which i think is the intent, right?
at least wrt over flowing ifname), why not go for the lesser value of
sdl_nlen and ifname? (ifname-1 and sdl_nlen)?

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
Beware of bugs in the above code; I have only proved it correct, not tried it.
-- Donald Knuth
Re: kernel routing socket handling problems [ In reply to ]
[zebra/kernel_socket.c:ifm_read() problems]

I have addressed Paul's comments, added more sanity checks, and
committed the changes.

I would appreciate it if anyone can test on a system other than NetBSD
that uses the routing socket. I tried pretty hard not to break
Solaris, but the existing code was very clearly unsafe on inspection,
and I may have erred on the side of rejecting situations that are ok.

--
Greg Troxel <gdt@ir.bbn.com>
Re: kernel routing socket handling problems [ In reply to ]
>
> I would appreciate it if anyone can test on a system other than NetBSD
> that uses the routing socket. I tried pretty hard not to break
> Solaris, but the existing code was very clearly unsafe on inspection,
> and I may have erred on the side of rejecting situations that are ok.
>
> --
> Greg Troxel <gdt@ir.bbn.com>

I still haven't tested on Solaris yet, but some questions/comments :

- the original problem that you were trying to solve is that

> I traced the problem to zebra/kernel_socket:ifm_read() which
> assumed that a struct if_msghdr was always followed by a sockaddr_dl

This is exactly the reason why solaris does:

274 for (i = 1; i != 0; i <<= 1)
275 {
276 if (i & ifm->ifm_addrs)
277 {
278 sa = (struct sockaddr *)cp;
279 cp += WRAPUP(sa);
280 if (i & RTA_IFP)
281 {
282 sdl = (struct sockaddr_dl *)sa;
283 break;
284 }
285 }
286 }

Note that your patch tests if RTA_IFP is set and if yes, it assumes
that the first address is a sockaddr_dl. In practice, it's true that
the only sockaddr that gets sent with an if_msghdr is the sockaddr_dl,
but since we can't know what may be added in the future, or the peculiarities
of OS-es, it's probably safer to follow the flags as prescribed.
Also, it allows for the same code for all the OS-es.

- as I was telling you in private email, there are problems with looking up
interfaces by ifindex.
1. sometimes (e.g., you add the interface to the conf file before
the interface is plumbed on solaris; or, you add info about a tunnel
interface to the conf file, and then start zebra, and then create
the tunnel) the ifm_index that we have locally can be -1, and then
the if__lookup_by_index will fail. After that, we'd end up with
2 nodes on the list for the same interface, one with -1 index (wasted
memory) and another with the right index.

2. Solaris has a nasty way of creating a new ifindex each time a tunnel
is plumbed, unplumbed and (re)plumbed.


- not sure what you mean by
411 /* XXX sockaddr_dl can be larger than base structure */
Is this for the case of !HAVE_SOCKADDR_DL?

--Sowmini
Re: kernel routing socket handling problems [ In reply to ]
/*
+ * Look up on ifindex. This is useful if this is an up/down
+ * notification for an interface of which we are already aware.
+ * (This happens on NetBSD 1.6.2, for example.)
+ */
+ if (ifp == NULL)
+ ifp = if_lookup_by_index (ifm->ifm_index);

the problem I can think of with this approach is that
1. sometimes (e.g., you add the interface to the conf file before
the interface is plumbed on solaris; or, you add info about a tunnel
interface to the conf file, and then start zebra, and then create
the tunnel) the ifm_index that we have locally can be -1, and then
the if__lookup_by_index will fail. After that, we'd end up with
2 nodes on the list for the same interface, one with -1 index (wasted
memory) and another with the right index.

2. Solaris has a nasty way of creating a new ifindex each time a tunnel
is plumbed, unplumbed and (re)plumbed.

These were the reasons why I'd done the lookup based on interface name,
rather than index.


I hadn't looked at the history of zebra/kernel_socket.c. Looking
back, I see that the problems I found were introduced in September:

revision 1.9
date: 2003/09/24 00:05:45; author: paul; state: Exp; lines: +53 -6
2003-09-24 sowmini.varadhan@sun.com

* zebra/kernel_socket.c: Fix up WRAPUP macro to deal with multiple
address families in the absence of sa_len element in struct
sockaddr.
(ifm_read): Handle solaris 9 if_msghdr_t.
Deal with interfaces which are incomplete, lookup on name rather
than the placeholder interface index of -1.

The original code assigned ifm+1 to sdl, but then didn't reference it
until it had checked that rtm_addrs had RTA_IFP set. The new code
uses sdl before it has been verified to exist.

On NetBSD, up/down notifications do not have a sockaddr_dl in the
message (and don't set the bit). This is legal behavior per my
understanding of the 'routing socket spec', such as it is :-(

0n systems with routing sockets, ifindex is supposed to be the primary
handle for an interface. So not using it when present seems wrong.
(And on systems which don't always include the sockaddr, it is wrong.)

I left in using the sockaddr (if present) when lookup by index fails.
This should find an interface such as your case 1 which has ifindex
-1.

Regarding (2), where Solaris creates a new ifindex when deconfiguring
a tunnel and then reconfiguring it: Well, that's the kernel's way of
telling you that the new interface is not connected with the old
interface. If that's not what's going on you should probably file a
bug with Solaris. Does the deconfiguring get reported on the routing
socket? If so, it should be possible for the interface structure to
get mostly torn down and ifindex set back to -1, so that the
reconfigure will again find the name.

Also, on reading the code I have a few questions:

/*
* if_msghdr_t on 64 bit kernels in Solaris 9 and earlier versions
* is 12 bytes larger than the 32 bit version, so make adjustment
* here.
*/
sa = (struct sockaddr *)cp;
if (sa->sa_family == AF_UNSPEC)
cp = cp + 12;

This seems odd (of solaris, not your code). Have you filed a bug
report with Solaris that what the kernel sends doesn't match the
system .h files, or is this documented behavior? Is this an alignment
issue?

Probably this should be ifdef'd or tested for 64-bit systems; it seems
likely to do the wrong thing on 32-bit systems. I can see why testing
possible padding against AF_UNSPEC, but is it guaranteed that padding
is zero? It is likely not guaranteed that all messages have a
sockaddr_dl, so this really does seem dangerous.
Re: kernel routing socket handling problems [ In reply to ]
This is exactly the reason why solaris does:

Yes, I see that you did check that in the Solaris case. But you
changed the code to unconditionally refer to sdl for all operating
systems, and did not add a corresponding check for sockaddr_dl
presence to the others.

Note that your patch tests if RTA_IFP is set and if yes, it assumes
that the first address is a sockaddr_dl.

No, I did:

if (ifm->ifm_addrs == RTA_IFP)

which checks if RTA_IFP is the only flag set. If so, the first
sockaddr must be the RTA_IFP one. The case of multiple sockaddrs is
left as unhandled.

In practice, it's true that the only sockaddr that gets sent with an
if_msghdr is the sockaddr_dl, but since we can't know what may be
added in the future, or the peculiarities of OS-es, it's probably
safer to follow the flags as prescribed. Also, it allows for the
same code for all the OS-es.

That probably would be good. I didn't look at the history when fixing
the problems I found (I should have), and assumed that the
Solaris-specific code was Solaris-specific for a good reason.

- as I was telling you in private email, there are problems with
looking up interfaces by ifindex.

I just responded to that. It would be a good thing to fix those
problems, but we'll have to be careful not to break anything else
(particularly normal cases that have worked for a long time in zebra).
Re: kernel routing socket handling problems [ In reply to ]
>
> Also, on reading the code I have a few questions:
>
> /*
> * if_msghdr_t on 64 bit kernels in Solaris 9 and earlier versions
> * is 12 bytes larger than the 32 bit version, so make adjustment
> * here.
> */
> sa = (struct sockaddr *)cp;
> if (sa->sa_family == AF_UNSPEC)
> cp = cp + 12;
>
> This seems odd (of solaris, not your code).
Yes.

> Have you filed a bug
> report with Solaris that what the kernel sends doesn't match the
> system .h files
Yes.

> possible padding against AF_UNSPEC, but is it guaranteed that padding
> is zero?

fortunately yes. The fix has been tested on both 32- and 64 bit systems.

This will be fixed in future releases of Solaris.

Meanwhile,

On testing, I ran into another problem. The RTM_IFINFO message is
getting ignored on Solaris because of the lines

928 if (rtm->rtm_msglen != nbytes)
929 {
930 zlog_warn ("kernel_read: rtm->rtm_msglen %d, nbytes %d, ..
931 rtm->rtm_msglen, nbytes, rtm->rtm_type);
932 return -1;
933 }

On Solaris, the sockaddr_dl struct is huge (sdl_data alone has 244 bytes),
whereas the read command is asking for
912 nbytes= read (sock, &buf, sizeof buf);
with buf set to 244 bytes. So, when the returned rtm_msglen is 336,
the code bails out at line 932. Not good.

I'd imagine this could happen in other situations, when one has multiple
socket addresses (e.g., ipv6 and sockaddr_dl's).

My proposal for this is to just bloat up buf a bit (see diffs below)

> No, I did:
>
> if (ifm->ifm_addrs == RTA_IFP)
>
> which checks if RTA_IFP is the only flag set. If so, the first
> sockaddr must be the RTA_IFP one. The case of multiple sockaddrs is
> left as unhandled.

Ok, but as you can see in rt_msg1() in
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/net/rtsock.c
the BSD kernel could set flags other than RTA_IFP.

Since we're going through this exercise, maybe it would be a good
idea to bite the bullet and make the solaris specific code to be common
code, and get rid of

/*
* Not strictly an error, but more complicated parsing than
* is implemented is required to handle this case.
*/
zlog_err ("ifm_read: addrs %x != RTA_IFP (unhandled, ignoring)\n",
ifm->ifm_addrs);
return -1;
???

> - as I was telling you in private email, there are problems with
> looking up interfaces by ifindex.
>
> I just responded to that. It would be a good thing to fix those
> problems, but we'll have to be careful not to break anything else
> (particularly normal cases that have worked for a long time in zebra).

On solaris, this is more than just a corner case- unlike BSD-based
systems (I'm not familiar with linux), an interface doesn't show up
on the GIFCONF list (or in the output of 'ifconfig -a') until it's
explicitly plumbed ("configured").

Here's my suggestion:
- move the endif up so that the goofy cp = cp+12 thing is the only
solaris specific oddity and the rest is shared.
- if there's no sdl in the RTM_IFINFO message, *then* (and only then)
use the ifm_index to find the ifp.

How does that sound? Diffs to this effect are attached. Seems to
work for me...

--Sowmini

--- /home/sowmini/tmp/kernel_socket.c.1 Mon Jan 5 15:10:32 2004
+++ ./kernel_socket.c Mon Jan 5 15:43:57 2004
@@ -247,6 +247,7 @@
sa = (struct sockaddr *)cp;
if (sa->sa_family == AF_UNSPEC)
cp = cp + 12;
+#endif

for (i = 1; i != 0; i <<= 1)
{
@@ -262,34 +263,9 @@
}
}
/*
- * After here, If RTA_IFP was set in ifm_addrs, sdl should point to
- * the sockaddr_dl.
- */
-#else
- /* sockaddrs_present? */
- if (ifm->ifm_addrs)
- {
- if (ifm->ifm_addrs == RTA_IFP)
- {
- /* just the one we want */
- sdl = (struct sockaddr_dl *)(ifm + 1);
- }
- else
- {
- /*
- * Not strictly an error, but more complicated parsing than
- * is implemented is required to handle this case.
- */
- zlog_err ("ifm_read: addrs %x != RTA_IFP (unhandled, ignoring)\n",
- ifm->ifm_addrs);
- return -1;
- }
- }
- /*
* Past here, either ifm_addrs == 0 or ifm_addrs == RTA_IFP and sdl
* points to a RTA_IFP sockaddr.
*/
-#endif

/* Check that sdl, if present, is actually a sockaddr_dl before use. */
if (sdl != NULL)
@@ -311,14 +287,6 @@
}

/*
- * Look up on ifindex. This is useful if this is an up/down
- * notification for an interface of which we are already aware.
- * (This happens on NetBSD 1.6.2, for example.)
- */
- if (ifp == NULL)
- ifp = if_lookup_by_index (ifm->ifm_index);
-
- /*
* If lookup by index was unsuccessful and we have a name, try
* looking up by name. Interfaces specified in the configuration
* file for which the ifindex has not been determined will have
@@ -325,7 +293,7 @@
* ifindex == -1, and such interfaces are found by this search, and
* then their ifindex values can be filled in.
*/
- if (ifp == NULL && sdl != NULL)
+ if (sdl != NULL)
{
/*
* paranoia: sanity check name length. nlen does not include
@@ -343,6 +311,14 @@
}

/*
+ * Look up on ifindex. This is useful if this is an up/down
+ * notification for an interface of which we are already aware.
+ * (This happens on NetBSD 1.6.2, for example.)
+ */
+ if (ifp == NULL)
+ ifp = if_lookup_by_index (ifm->ifm_index);
+
+ /*
* If ifp does not exist or has an invalid index (-1), create or
* fill in an interface.
*/
@@ -903,6 +879,7 @@
struct sockaddr addr[RTAX_MAX];
} ian;
#endif /* RTM_IFANNOUNCE */
+ int msg[2048/sizeof(int)];

} buf;
Re: kernel routing socket handling problems [ In reply to ]
> Have you filed a bug
> report with Solaris that what the kernel sends doesn't match the
> system .h files
Yes.

I'm glad to hear it. So probably the whole cp += 12 should be ifdefed
a lot more tightly, specifically for the broken versions.

[check on getting the whole rtm message]

On Solaris, the sockaddr_dl struct is huge (sdl_data alone has 244 bytes),
whereas the read command is asking for
912 nbytes= read (sock, &buf, sizeof buf);
with buf set to 244 bytes. So, when the returned rtm_msglen is 336,
the code bails out at line 932. Not good.

Well, I count it as good that the check caught this problem. So in
the solaris case there was a risk of reading past the end of the
buffer.

My proposal for this is to just bloat up buf a bit (see diffs below)

The more complex strategy is to retry with twice the buf every
failure, but I concur that just increasing the buffer size makes sense
in this case.

On solaris, this is more than just a corner case- unlike BSD-based
systems (I'm not familiar with linux), an interface doesn't show up
on the GIFCONF list (or in the output of 'ifconfig -a') until it's
explicitly plumbed ("configured").

Sure, I don't mean this shouldn't work. I just am not willing to
endanger the normal cases, like booting a machine with two ethernets
that are configured before quagga starts.

- move the endif up so that the goofy cp = cp+12 thing is the only
solaris specific oddity and the rest is shared.

That sounds fine.

- if there's no sdl in the RTM_IFINFO message, *then* (and only then)
use the ifm_index to find the ifp.

I am not convinced that this is reasonable. ifindex is the primary
kernel reference handle for an interface in BSD44-based systems (which
Solaris seems to be at least mostly for networking). Changing to use
something else as the primary key could have unintended consequences,
and I'm a bit shy about that, having spent too much time cleaning up
breakage due to unintended consequences lately.

The only issue seems to be interfaces that are configured in the
quagga config file, do not exist on startup, and that then get
configured. For these, the quagga interface structure will have
ifindex -1, so no real interface will match that, and then the name
lookup will work and the ifindex will get stored. After that the
ifindex case will be right.
Re: kernel routing socket handling problems [ In reply to ]
I just increased the buffer size for reads. Please try CVS and see
if it works now.

2004-01-05 Greg Troxel <gdt@fnord.ir.bbn.com>

* kernel_socket.c (kernel_read): Add a sockaddr_dl to the ifmsg
structure, because on Solaris sockaddr_dl is far larger than the
base sockaddr structure. (The code had previously been failing to
read all the data.)
Re: kernel routing socket handling problems [ In reply to ]
I just committed what was in your patch, plus a bunch of cleanups,
except for making lookup by name be primary. Please have a look over
it; I may have done something incorrect.

2004-01-05 Greg Troxel <gdt@fnord.ir.bbn.com>
* kernel_socket.c (ifm_read): Major cleanup. Use Sowmini's code
to find the sockaddr_dl in all cases, narrowing the Solaris ifdef
to just the accomodation of broken kernels. Check sockaddr_dl
carefully up front, and later assume any non-NULL sdl pointer is
valid. Clean up types and variable declarations, and rename
WRAPUP to SAROUNDUP to make the name fit the behavior.
Re: kernel routing socket handling problems [ In reply to ]
I'm still reviewing your changes, but wanted to address this:

> On solaris, this is more than just a corner case- unlike BSD-based
> systems (I'm not familiar with linux), an interface doesn't show up
> on the GIFCONF list (or in the output of 'ifconfig -a') until it's
> explicitly plumbed ("configured").
>
> Sure, I don't mean this shouldn't work. I just am not willing to
> endanger the normal cases, like booting a machine with two ethernets
> that are configured before quagga starts.

And why wouldn't this work, if you first did a lookup by name and then
did a lookup by index? Do the two ethernet interfaces have the same name?

From what I remember about BSD device drivers, the name has to be pretty
unique, and all that configuration and device driver initialization takes
place at boot time, so the machine should start up with 2 interfaces
with pretty distinct names.

I really don't see what the problem is.

--Sowmini
Re: kernel routing socket handling problems [ In reply to ]
Several comments.

- First, please slow down. I think we should not be in such a rush to
commit to cvs.

- Regarding
-/* Socket length roundup function. */
+/*
+ * Given a sockaddr length, round it up to include pad bytes following
+ * it. Assumes the kernel pads to sizeof(long).
+ *
+ * XXX: why is ROUNDUP(0) sizeof(long)? 0 is an illegal sockaddr
+ * length anyway (< sizeof (struct sockaddr)), so this shouldn't
+ * matter.
+ */

The reason is that you would call it as ROUNDUP(sa->sa_len), so
if you ran into a sockaddr with sa_len set to 0, you want to advance
by sizeof(long) bytes. I think the XXX can be deleted. Please
also take a look at BSD source code for /sbin/route (remember that the
command 'route monitor' does exactly the same thing that zebra is trying
to do. Source code for route can be found at
http://cvsweb.netbsd.org/bsdweb.cgi/src/sbin/route/route.c

- Let's please do the prescribed thing, and check the flags properly,
instead of hacking for just (flags == RTA_IFP). Please also take a look
at the function pmsg_addrs in
http://cvsweb.netbsd.org/bsdweb.cgi/src/sbin/route/route.c
which does exactly the same thing. If we are going to do it, let's get
it right. I'm referring to the changes around:

- sa = (struct sockaddr *)cp;
- cp += WRAPUP(sa);
- if (i & RTA_IFP)
+ if (i == RTA_IFP)
{
- sdl = (struct sockaddr_dl *)sa;
+ sdl = (struct sockaddr_dl *)cp;
break;
}

- Adding a sockaddr_dl to the buf in kernel_read() does not solve the
real problem. The sockaddr structure is pretty lame, and was not designed
to be the same size as sockaddr_in6 or sockaddr_dl. The real answer is
to use sockaddr_storage (see rfc 2553, section 3.10) but that would
mean that we'd have to change a lot of code to cast things around etc.

The simpler solution is to bloat the structure in the same way that
bsd's /sbin/route does it, and notice that monitor() in
http://cvsweb.netbsd.org/bsdweb.cgi/src/sbin/route/route.c
uses a buffer of size 2048.

The bottom line is that I think we should do whatever 'route monitor'
is doing, so that if there are any problems, they will be consistent!

The linux guys are awfully quiet in all of this. What does linux have,
for sizeof (struct sockaddr_dl)? What does linux's 'route monitor' code do?

--Sowmini
Re: kernel routing socket handling problems [ In reply to ]
- First, please slow down. I think we should not be in such a rush to
commit to cvs.

The time for caution would have been before the incorrect changes that
started all this were committed in the fall (that used sdl when it was
invalid, leading to stack corruption, and that did lookups on name
only). We are now cleaning up from those problems, so in my view
the faster the better.

The reason is that you would call it as ROUNDUP(sa->sa_len), so
if you ran into a sockaddr with sa_len set to 0, you want to advance
by sizeof(long) bytes.

Can a sockaddr with sa_len set to 0 ever be generated by a correct
kernel? If not, one should log an error and stop parsing on
encountering such a length.

- Let's please do the prescribed thing, and check the flags properly,
instead of hacking for just (flags == RTA_IFP). Please also take a look

The code loops through all flag values one by one, checks if each is
set, and then if so if the flag being tested is RTA_IFP, which is what
we are looking for. The expressions 'i & RTA_IFP' and 'i == RTA_IFP'
will evaluate the same in this function (since i has only one bit
set), and the explicit == conveys better the sense of what is going
on. The function you point to has different behavior; it calls a
procedure for each sockaddr.

- Adding a sockaddr_dl to the buf in kernel_read() does not solve the
real problem. The sockaddr structure is pretty lame, and was not designed

You said that sockaddr_dl was large on Solaris, so I added it.

The simpler solution is to bloat the structure in the same way that

But this may fail in the future when Solaris makes struct sockaddr_dl
16KB or so :-)

You have an excellent point about sockaddr_storage; that is clearly
the right thing to use.

The linux guys are awfully quiet in all of this. What does linux have,
for sizeof (struct sockaddr_dl)? What does linux's 'route monitor' code do?

AFAIK Linux doesn't use a BSD44-derived routing socket, but /proc or
netlink.
Re: kernel routing socket handling problems [ In reply to ]
And why wouldn't this work, if you first did a lookup by name and then
did a lookup by index? Do the two ethernet interfaces have the same name?

I didn't say that I was sure it wouldn't. I have spent way too much
time cleaning up from inadequately considered changes that caused
breakage on platforms other than the one the changes were intended to help.

I really don't see what the problem is.

Sorry, but that doesn't give me a warm fuzzy that there are no
problems.

I said earlier that ifindex is the primary handle for user/kernel
communciation about interfaces. If looking up by ifindex does the
wrong thing, it must be because the kernel interface state and zebra's
data structures about interfaces are out of sync. If that happens,
why this arises should be understood and the underlying consistency
problem fixed.
Re: kernel routing socket handling problems [ In reply to ]
> - First, please slow down. I think we should not be in such a rush to
> commit to cvs.
>
> The time for caution would have been before the incorrect changes that
> started all this were committed in the fall (that used sdl when it was
> invalid, leading to stack corruption, and that did lookups on name
> only). We are now cleaning up from those problems, so in my view
> the faster the better.

The patch was posted on Sep 8 2003. Comments were invited, none were
forthcoming?

> The reason is that you would call it as ROUNDUP(sa->sa_len), so
> if you ran into a sockaddr with sa_len set to 0, you want to advance
> by sizeof(long) bytes.
>
> Can a sockaddr with sa_len set to 0 ever be generated by a correct
> kernel? If not, one should log an error and stop parsing on
> encountering such a length.

So why doesn't BSD (and all the other code out there) do this?

> - Let's please do the prescribed thing, and check the flags properly,
> instead of hacking for just (flags == RTA_IFP). Please also take a look
>
> The code loops through all flag values one by one, checks if each is
> set, and then if so if the flag being tested is RTA_IFP, which is what
> we are looking for. The expressions 'i & RTA_IFP' and 'i == RTA_IFP'
> will evaluate the same in this function (since i has only one bit
> set), and the explicit == conveys better the sense of what is going
> on. The function you point to has different behavior; it calls a
> procedure for each sockaddr.

whatever. I'm tired of arguing that we should follow BSD's example.

> - Adding a sockaddr_dl to the buf in kernel_read() does not solve the
> real problem. The sockaddr structure is pretty lame, and was not designed
>
> You said that sockaddr_dl was large on Solaris, so I added it.
> But this may fail in the future when Solaris makes struct sockaddr_dl
> 16KB or so :-)

I'm not sure I see the point of your comment. I'm guessing, from the ":-)"
that the 16KB is supposed to be facile, so I'm going to ignore it.

I still repeat, it's safest to follow the example in /sbin/route, which
is code that's been around (and emulated and supported by all kinds
of vendors) for a long time time, but as I repeated, I give up.

--Sowmini
Re: kernel routing socket handling problems [ In reply to ]
On Tue, 6 Jan 2004 sowmini.varadhan@sun.com wrote:

> The linux guys are awfully quiet in all of this. What does linux
> have, for sizeof (struct sockaddr_dl)? What does linux's 'route
> monitor' code do?

Linux uses something different, rtnetlink messages. See man pages for
rtnetlink (section 7 describing rtnetlink protocol and 3 for various
macros). rtnetlink is part of the netlink family, man netlink (secs 3
and 7 again).

its reasonably well specced and easy to use.

> --Sowmini

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
What's another word for "thesaurus"?
-- Steven Wright
Re: kernel routing socket handling problems [ In reply to ]
>
> - First, please slow down. I think we should not be in such a rush to
> commit to cvs.
>
> The time for caution would have been before the incorrect changes that
> started all this were committed in the fall (that used sdl when it was
> invalid, leading to stack corruption, and that did lookups on name
> only). We are now cleaning up from those problems, so in my view
> the faster the better.

I'm probably going to kick myself for continuing this debate, but I'm
going to try to stick to the technical issues here.

Please let's make haste slowly. It doesn't make sense to be
committing kernel_socket.c every hour (literally), when we haven't
converged on a solution that actually compiles cleanly and works
on our respective test machines. Right now, there's a warning
when I compile:

kernel_socket.c: In function `kernel_read':
kernel_socket.c:870: warning: declaration does not declare anything

so obviously, there's still a problem.

> - Adding a sockaddr_dl to the buf in kernel_read() does not solve the
> real problem. The sockaddr structure is pretty lame, and was not designed
>
> You said that sockaddr_dl was large on Solaris, so I added it.

some research has given me more info about this debate.

Adding a sockaddr_dl at line 870 is a hack- it is assuming that the
kernel is going to always send a sockaddr_dl as the first address
following the ifa_msghdr. This assumption would quickly fail if the
ifm_addrs contained some other flag whose value was less than RTA_IFP.

I know that we are already committing ourselves to this assumption in
ifm_read, and we should probably agree to disagree on lines 272-283,
but the fix at line 870 does not really solve the problem that can
be encountered in kernel_read, not just for Solaris, but for other
BSD44 based OS-es as well!

The bottom line is that the problem is not just a solaris specific
problem, but one that can be encountered for any OS which sends
up a routing socket message with the right mix of sockaddr_dl and
ipv6 addresses.

BTW, even netbsd has the following comment in <net/if_dl.h>:
char sdl_data[12]; /* minimum work area, can be larger;
contains both if name and ll address */


So why does BSD use a 2048 sized buffer to receive messages in /sbin/route?
I think that bsd's /sbin/route uses a 2048 sized buffer because that's
the size of the buffer from an M_EXT mbuf. (For details refer to
TCP/IP Illustrated Vol 2 by Stevens). In other words, a good expectation
for the largest buffer that can be received from the kernel is 2048
bytes. And since BSD is the trend-setter in this case, we can safely
assume that OS vendors will keep conformance to this expectation in mind
as they send routing socket messages up from the kernel.

On another note, more info on the interface index issue, and why Solaris
uses a different index for each incarnation of an interface-
turns out that SNMP requires that the stats that the snmp daemons
maintain for an interface _must_always_increase. In Solaris, an
'ifconfig <intf> unplumb; ifconfig <intf> plumb'
pair will break this rule unless the interface index changes on
re-plumb. This is probably not an issue for BSD where there's no
equivalent for the "unplumb" and "plumb" operations.

So, without getting into OS-bashing, clearly, interface index can
sometimes be an unreliable identifier for identifying an interface
within zebra, whereas interface names are likely more reliable.

Having said that, I'll agree that lookup-by-index followed
by lookup-by-name is in itself not harmful (except that it's going
to be a tad slower for Solaris in some cases), and let it slide
even though I'm not convinced about the comments about ifindices being
the "primary handle" for interfaces.

This brings up an interesting, and related, question in my mind:
if linux can rename interfaces, isn't it possible to break the
snmp rule for increasing-stats-only on linux? How does linux address
this issue?

Here are the diffs that I think are appropriate, in this case:

--- /home/sowmini/tmp/kernel_socket.c Wed Jan 7 08:52:07 2004
+++ kernel_socket.c Wed Jan 7 11:59:10 2004
@@ -847,13 +847,6 @@
int nbytes;
struct rt_msghdr *rtm;

- /*
- * This must be big enough for any message the kernel might send.
- * The code previously used RTAX_MAX struct sockaddrs in all cases,
- * but now that sockaddrs are variable size, this doesn't work
- * (Solaris has 244 bytes of sdl_data!). For now, add a struct
- * sockaddr_dl to the case where it is used.
- */
union
{
/* Routing information. */
@@ -867,7 +860,6 @@
struct
{
struct if_msghdr ifm;
- struct sockaddr_dl;
struct sockaddr addr[RTAX_MAX-1];
} im;

@@ -886,6 +878,7 @@
struct sockaddr addr[RTAX_MAX];
} ian;
#endif /* RTM_IFANNOUNCE */
+ int msg[2048];

} buf;
Re: kernel routing socket handling problems [ In reply to ]
kernel_socket.c:870: warning: declaration does not declare anything

indeed the line I added is wrong and has no variable name. I'm
stunned that this isn't a syntax error, but I suppose it is an
allowable form of forward decl for one to later use a pointer to an
undefined structure.

Adding a sockaddr_dl at line 870 is a hack- it is assuming that the
kernel is going to always send a sockaddr_dl as the first address
following the ifa_msghdr. This assumption would quickly fail if the
ifm_addrs contained some other flag whose value was less than RTA_IFP.

The structures following the msghdrs are never actually referenced or
used in any way. They are just there, as far as i can tell, to make
the buffer big enough.

I did notice that sockaddr_dl is 20 bytes on netbsd, but have recently
seen 24 byte sockaddr_dls returned (but all the actual data is in the
first 18 bytes). Since there is one 24-byte sockaddr_dl and no other
sockaddrs, the message fits easily in 7 or 8 * 20 bytes. But you are
right that the approach is unsound.

Is sockaddr_storage big enough to store sockaddr_dl on Solaris? It
seems to be on NetBSD, and it's slightly unclear to me if the 'can
store address from any AF' is supposed to include sockaddrs that don't
really have an AF of a real PF. If it's big enough, using
sockaddr_storage[RTAX_MAX] probably makes sense. I'm also somewhat
inclined to rip out the whole hairy structure since all that routine
does is look at the type, but I think it's better to leave well enough
alone here, since part of the structure's point seems to be to avoid
casting to the various types (not that union overlays are any safer!).

So why does BSD use a 2048 sized buffer to receive messages in /sbin/route?
I think that bsd's /sbin/route uses a 2048 sized buffer because that's
the size of the buffer from an M_EXT mbuf.

A good point, but I've modified PF_KEY sockets to support much larger
messages. I think it uses 2048 because that is arguably big enough
for now (probably for the reason you note) and it was easy, not
because it's the only right way. I'm a BSD fan, yes, but certainly
not everthing in the code base is correct or elegant. Also the
/sbin/route code might predate sockaddr_storage, which I think arose
due to ISO and IPv6 addresses.

On another note, more info on the interface index issue, and why Solaris
uses a different index for each incarnation of an interface-
turns out that SNMP requires that the stats that the snmp daemons
maintain for an interface _must_always_increase. In Solaris, an
'ifconfig <intf> unplumb; ifconfig <intf> plumb'
pair will break this rule unless the interface index changes on
re-plumb. This is probably not an issue for BSD where there's no
equivalent for the "unplumb" and "plumb" operations.

That's interesting. So the new interface, once replumbed, is actually
a _different interface_ from the Solaris and SNMP point of view.
So there is merit to zebra treating it as a new entity as well. This
decision probably has significant ramifications in such a world.

I suppose BSD should perhaps also do this. While there is no explicit
notion of plumbing, interfaces are destroyed when cardbus cards are
removed, etc., and things like gif can be 'create'd and 'destroy'd.
But, this rule could also be enforced by the SNMP daemon.

interface index can
sometimes be an unreliable identifier for identifying an interface
within zebra, whereas interface names are likely more reliable.

Well, I think this is a deep issue as to what is meant by an interface
- is it a new entity with the same name, or the same entity again?
The SNMP rules above say it is a new entity.

Does Solaris send messages on the routing socket announcing that
interfaces have been plumbed and unplumbed? It would be cool for
zebra to be able to not only do 'down' processing (presumably that
notice comes first when you unplumb an up interface) but to tear down
the rest of the interface state when it is unplumbed.

Having said that, I'll agree that lookup-by-index followed
by lookup-by-name is in itself not harmful (except that it's going
to be a tad slower for Solaris in some cases), and let it slide
even though I'm not convinced about the comments about ifindices being
the "primary handle" for interfaces.

OK, but I don't see it being much slower on balance - index lookup
should be much faster than searching for a name, and these messages
are fairly rare anyway.
Re: kernel routing socket handling problems [ In reply to ]
> Is sockaddr_storage big enough to store sockaddr_dl on Solaris? It

Yes.

> really have an AF of a real PF. If it's big enough, using
> sockaddr_storage[RTAX_MAX] probably makes sense. I'm also somewhat

Absolutely. But the problem is that it'll be a nasty job to dereference
and cast things around from sockaddr_storage to
sockaddr_in/sockaddr_in6/sockaddr_whatever. A reasonable solution here
would be to add another member to the union like
struct
{
struct rt_msghdr rtm; /* is the if_announcemsghdr larger? */
struct sockaddr_storage addr[RTAX_MAX];
} msg;
so that it bloats up without any magic "2048" values, and
we've retained the ability to navigate the message relatively painlessly.

> inclined to rip out the whole hairy structure since all that routine
> does is look at the type, but I think it's better to leave well enough
> alone here, since part of the structure's point seems to be to avoid
> casting to the various types (not that union overlays are any safer!).

> not everthing in the code base is correct or elegant. Also the
> /sbin/route code might predate sockaddr_storage, which I think arose
> due to ISO and IPv6 addresses.

Yes. I know.

>
> Does Solaris send messages on the routing socket announcing that
> interfaces have been plumbed and unplumbed? It would be cool for

Unfortunately, no. :-( Just the RTM_IFINFO saying ~IFF_UP.

> Having said that, I'll agree that lookup-by-index followed
> by lookup-by-name is in itself not harmful (except that it's going
> to be a tad slower for Solaris in some cases), and let it slide
> even though I'm not convinced about the comments about ifindices being
> the "primary handle" for interfaces.
>
> OK, but I don't see it being much slower on balance - index lookup
> should be much faster than searching for a name, and these messages
> are fairly rare anyway.

If you have a lot of interfaces, the worst case scenario would be
that you'd march down the whole list twice. But as I said, I'll let
it slide.

--Sowmini
Re: kernel routing socket handling problems [ In reply to ]
On Mon, 5 Jan 2004, Greg Troxel wrote:

> 2. Solaris has a nasty way of creating a new ifindex each time a tunnel
> is plumbed, unplumbed and (re)plumbed.
>
> These were the reasons why I'd done the lookup based on interface name,
> rather than index.

> On NetBSD, up/down notifications do not have a sockaddr_dl in the
> message (and don't set the bit). This is legal behavior per my
> understanding of the 'routing socket spec', such as it is :-(
>
> 0n systems with routing sockets, ifindex is supposed to be the
> primary handle for an interface. So not using it when present
> seems wrong. (And on systems which don't always include the
> sockaddr, it is wrong.)

Probably not immediately relevant here, but for the purposes of
Quagga, I would consider the name to be the canonical identifier for
an interface.

The name is the label by which operators refer to devices. Who really
looks at the interface index number? What you care about is the name
- eth0 or hme0 is what has meaning, not the interface index. Solaris
is not the only system which regularly creates new if indexes for
interfaces.

> I left in using the sockaddr (if present) when lookup by index
> fails. This should find an interface such as your case 1 which has
> ifindex -1.

Ok, but the lookup by index should verify the name is still the same.
The name is the canonical handle, the index is just a convenience.
(granted, we should be able to track the index number and should,
hopefully, always have the correct number. but we should forget the
number if the interface goes down, ie set it to -1).

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
"On a normal ascii line, the only safe condition to detect is a 'BREAK'
- everything else having been assigned functions by Gnu EMACS."
(By Tarl Neustaedter)
Re: kernel routing socket handling problems [ In reply to ]
Probably not immediately relevant here, but for the purposes of
Quagga, I would consider the name to be the canonical identifier for
an interface.

The name is the label by which operators refer to devices. Who really
looks at the interface index number? What you care about is the name
- eth0 or hme0 is what has meaning, not the interface index. Solaris
is not the only system which regularly creates new if indexes for
interfaces.

Sure, for the interface to the config file or anything else that
humans deal with, I agree that names are primary. But here we are
talking about zebra/kernel communication on the routing socket, and
the conventions there are not necessarily the same, or determined by
the human/zebra interface.

> I left in using the sockaddr (if present) when lookup by index
> fails. This should find an interface such as your case 1 which has
> ifindex -1.

Ok, but the lookup by index should verify the name is still the same.
The name is the canonical handle, the index is just a convenience.
(granted, we should be able to track the index number and should,
hopefully, always have the correct number. but we should forget the
number if the interface goes down, ie set it to -1).

Well, I'm still afraid that there is underyling complexity being
hidden by this, and I don't want to paper over it without fully
understanding it. The whole notion that Solaris has about having a
new ifindex for a 'reused' name to comply with SNMP makes a lot of
sense, and has implications about whether we have a 'new interface'
v.s. a new version of the same interface.

I think on all systems if the index matches there is truly a match.
The issue is that a name can appear with a new index, and perhaps that
should cause the old interface structure to be deleted. Really
kernels which do this should issue a routing socket message that the
name/ifindex association has been ended - one of the design goals of
the routing socket is that a program can listen and track the state of
the kernel's interface/routing activities, and that's just what zebra
does. So if there is trouble I think it will turn out to be because
zebra's database and the kernel's are getting out of sync.

These are all important issues in a world where interfaces come and go
far more often than they used to in the Unibus days.

Sowmini: you might ask the Solaris people responsible for
plumb/unplumb the mechanism a program like zebra is supposed to use to
become aware that an interface has become unplumbed, so that the
program may delete its state that associates the (old) ifindex and the
name.

Perhaps, since there is apparently no routing socket message, we
should check on creating an interface if there is another interface
structure with the same name. It should have state down (as an
assert), and probably should be deleted when this happens, since I'd
argue this situation is a latent detection of a previous unplumb.
Re: kernel routing socket handling problems [ In reply to ]
How does this look?

Index: zebra/ChangeLog
===================================================================
RCS file: /var/cvsroot/quagga/zebra/ChangeLog,v
retrieving revision 1.8
diff -u -r1.8 ChangeLog
--- zebra/ChangeLog 6 Jan 2004 18:23:02 -0000 1.8
+++ zebra/ChangeLog 8 Jan 2004 14:17:07 -0000
@@ -1,3 +1,10 @@
+2004-01-08 Greg Troxel <gdt@fnord.ir.bbn.com>
+
+ * kernel_socket.c (kernel_read): Use sockaddr_storage in buffer
+ for reading kernel messages to ensure enough space (necessary on
+ Solaris due to sockaddr_dl being large). Thanks to Sowmini
+ Varadhan for help with this change.
+
2004-01-06 Greg Troxel <gdt@t1.ir.bbn.com>

* rtadv.c (rtadv_send_packet): Change perror to zlog_err.
Index: zebra/kernel_socket.c
===================================================================
RCS file: /var/cvsroot/quagga/zebra/kernel_socket.c,v
retrieving revision 1.14
diff -u -r1.14 kernel_socket.c
--- zebra/kernel_socket.c 6 Jan 2004 01:13:05 -0000 1.14
+++ zebra/kernel_socket.c 8 Jan 2004 14:17:07 -0000
@@ -849,10 +849,12 @@

/*
* This must be big enough for any message the kernel might send.
- * The code previously used RTAX_MAX struct sockaddrs in all cases,
- * but now that sockaddrs are variable size, this doesn't work
- * (Solaris has 244 bytes of sdl_data!). For now, add a struct
- * sockaddr_dl to the case where it is used.
+ * Rather than determining how many sockaddrs of what size might be
+ * in each particular message, just use RTAX_MAX of sockaddr_storage
+ * for each. Note that the sockaddrs must be after each message
+ * definition, or rather after whichever happens to be the largest,
+ * since the buffer needs to be big enough for a message and the
+ * sockaddrs together.
*/
union
{
@@ -860,22 +862,21 @@
struct
{
struct rt_msghdr rtm;
- struct sockaddr addr[RTAX_MAX];
+ struct sockaddr_storage addr[RTAX_MAX];
} r;

/* Interface information. */
struct
{
struct if_msghdr ifm;
- struct sockaddr_dl;
- struct sockaddr addr[RTAX_MAX-1];
+ struct sockaddr_storage addr[RTAX_MAX];
} im;

/* Interface address information. */
struct
{
struct ifa_msghdr ifa;
- struct sockaddr addr[RTAX_MAX];
+ struct sockaddr_storage addr[RTAX_MAX];
} ia;

#ifdef RTM_IFANNOUNCE
@@ -883,7 +884,7 @@
struct
{
struct if_announcemsghdr ifan;
- struct sockaddr addr[RTAX_MAX];
+ struct sockaddr_storage addr[RTAX_MAX];
} ian;
#endif /* RTM_IFANNOUNCE */

@@ -908,6 +909,10 @@

rtm = &buf.r.rtm;

+ /*
+ * Ensure that we didn't drop any data, so that processing routines
+ * can assume they have the whole message.
+ */
if (rtm->rtm_msglen != nbytes)
{
zlog_warn ("kernel_read: rtm->rtm_msglen %d, nbytes %d, type %d\n",
Re: kernel routing socket handling problems [ In reply to ]
>
> How does this look?

looks good. I just tried it, and even seems to work :-)

--Sowmini

>
> Index: zebra/ChangeLog
> ===================================================================
> RCS file: /var/cvsroot/quagga/zebra/ChangeLog,v
> retrieving revision 1.8
> diff -u -r1.8 ChangeLog
> --- zebra/ChangeLog 6 Jan 2004 18:23:02 -0000 1.8
> +++ zebra/ChangeLog 8 Jan 2004 14:17:07 -0000
> @@ -1,3 +1,10 @@
> +2004-01-08 Greg Troxel <gdt@fnord.ir.bbn.com>
> +
> + * kernel_socket.c (kernel_read): Use sockaddr_storage in buffer
> + for reading kernel messages to ensure enough space (necessary on
> + Solaris due to sockaddr_dl being large). Thanks to Sowmini
> + Varadhan for help with this change.
> +
> 2004-01-06 Greg Troxel <gdt@t1.ir.bbn.com>
>
> * rtadv.c (rtadv_send_packet): Change perror to zlog_err.
> Index: zebra/kernel_socket.c
> ===================================================================
> RCS file: /var/cvsroot/quagga/zebra/kernel_socket.c,v
> retrieving revision 1.14
> diff -u -r1.14 kernel_socket.c
> --- zebra/kernel_socket.c 6 Jan 2004 01:13:05 -0000 1.14
> +++ zebra/kernel_socket.c 8 Jan 2004 14:17:07 -0000
> @@ -849,10 +849,12 @@
>
> /*
> * This must be big enough for any message the kernel might send.
> - * The code previously used RTAX_MAX struct sockaddrs in all cases,
> - * but now that sockaddrs are variable size, this doesn't work
> - * (Solaris has 244 bytes of sdl_data!). For now, add a struct
> - * sockaddr_dl to the case where it is used.
> + * Rather than determining how many sockaddrs of what size might be
> + * in each particular message, just use RTAX_MAX of sockaddr_storage
> + * for each. Note that the sockaddrs must be after each message
> + * definition, or rather after whichever happens to be the largest,
> + * since the buffer needs to be big enough for a message and the
> + * sockaddrs together.
> */
> union
> {
> @@ -860,22 +862,21 @@
> struct
> {
> struct rt_msghdr rtm;
> - struct sockaddr addr[RTAX_MAX];
> + struct sockaddr_storage addr[RTAX_MAX];
> } r;
>
> /* Interface information. */
> struct
> {
> struct if_msghdr ifm;
> - struct sockaddr_dl;
> - struct sockaddr addr[RTAX_MAX-1];
> + struct sockaddr_storage addr[RTAX_MAX];
> } im;
>
> /* Interface address information. */
> struct
> {
> struct ifa_msghdr ifa;
> - struct sockaddr addr[RTAX_MAX];
> + struct sockaddr_storage addr[RTAX_MAX];
> } ia;
>
> #ifdef RTM_IFANNOUNCE
> @@ -883,7 +884,7 @@
> struct
> {
> struct if_announcemsghdr ifan;
> - struct sockaddr addr[RTAX_MAX];
> + struct sockaddr_storage addr[RTAX_MAX];
> } ian;
> #endif /* RTM_IFANNOUNCE */
>
> @@ -908,6 +909,10 @@
>
> rtm = &buf.r.rtm;
>
> + /*
> + * Ensure that we didn't drop any data, so that processing routines
> + * can assume they have the whole message.
> + */
> if (rtm->rtm_msglen != nbytes)
> {
> zlog_warn ("kernel_read: rtm->rtm_msglen %d, nbytes %d, type %d\n",
> _______________________________________________
> Quagga-dev mailing list
> Quagga-dev@lists.quagga.net
> http://lists.quagga.net/mailman/listinfo/quagga-dev
>
Re: kernel routing socket handling problems [ In reply to ]
looks good. I just tried it, and even seems to work :-)

I have just committed the changes, so I think this leaves only the
open issue of data structure consistency in the face of unplumbing
(and likely cardbus/etc. interface removal on other systems too).
I'm glad my rtm message length check exposed this problem so we could
fix it cleanly!
Re: kernel routing socket handling problems [ In reply to ]
Hello!

Paul, please announce if You will commit this patch.. It looks like
this patch solves some stranges with FreeBSD.

>>How does this look?
>>
>>
>
>looks good. I just tried it, and even seems to work :-)
>
>

Regards,
Boris

1 2  View All