Mailing List Archive

ripd status
Hello,

What about the other ripd-related patches? I have been working with the ripd
code for some time now - my patch is about 80% ready. It is quite a big
modification, but ripd is now able to keep more than just one route and
switch between them when necessary. Some parts of the code were totally
rewritten, so I expect that something may be broken. It would be nice to get
ripd in quagga working before my big changes. For most ripd bugs we have fixes
on this list; can they be committed into cvs before quagga 0.96.5, Paul?

There is one bug left, but I hope Sowmini will fix it soon (see
[quagga-dev 490] Re: Problems with ripd in quagga-0.96.4).

Best regards,

Krzysztof Olędzki
Re: ripd status
On Thu, 25 Dec 2003, Krzysztof Oledzki wrote:

> Hello,
>
> What about the other ripd-related patches? I have been working with
> the ripd code for some time now - my patch is about 80% ready. It is
> quite a big modification, but ripd is now able to keep more than just
> one route and switch between them when necessary. Some parts of the
> code were totally rewritten, so I expect that something may be broken.
> It would be nice to get ripd in quagga working before my big changes.
> For most ripd bugs we have fixes on this list; can they be committed
> into cvs before quagga 0.96.5, Paul?

The DISTANCE patch is the main outstanding patch, correct? It looks
sane - {i,we}'ll test it out early in the new year and incorporate it.
Ditto for further patches.

> There is one bug left, but I hope Sowmini will fix it soon (see
> [quagga-dev 490] Re: Problems with ripd in quagga-0.96.4).

Right yes.

> Best regards,
>
> Krzysztof Olędzki

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
"It runs like _x, where _x is something unsavory"
-- Prof. Romas Aleliunas, CS 435
Re: ripd status
>
> There is one bug left, but I hope Sowmini will fix it soon (see
> [quagga-dev 490] Re: Problems with ripd in quagga-0.96.4).
>

Thanks for the reminder. I'd forgotten to follow up on this one.
As it turns out, I see that [quagga-dev 428] has not been committed.

As far as I understood, the patch in [quagga-dev 428] reverted
the code to where it used to be in, say, 0.96.2.
I'll have to reproduce your problem to fix it- were you using ripv1?
Was this working with quagga 0.96.2?

--Sowmini
Re: ripd status
> From sowmini@quasimodo.East.Sun.COM Thu Dec 25 20:40:27 2003
>
> >
> > There is one bug left, but I hope Sowmini will fix it soon (see
> > [quagga-dev 490] Re: Problems with ripd in quagga-0.96.4).
> >
>

found out what the problem was. I was always closing send_sock,
whereas I should only have closed it for (!to). Here's the new patch
which worked for me on linux with ripv1. Can you please try and
confirm that this one works, so that Paul can commit it?

--Sowmini


===================================================================
RCS file: ripd/rip_interface.c,v
retrieving revision 1.13
diff -uwb -r1.13 ripd/rip_interface.c
--- ripd/rip_interface.c 2003/10/15 23:20:17 1.13
+++ ripd/rip_interface.c 2003/11/07 16:07:38
@@ -146,13 +146,18 @@
struct in_addr addr;
struct prefix_ipv4 *p;

+ if (connected != NULL)
+ {
if (if_pointopoint)
p = (struct prefix_ipv4 *) connected->destination;
else
p = (struct prefix_ipv4 *) connected->address;
-
addr = p->prefix;
-
+ }
+ else
+ {
+ addr.s_addr = INADDR_ANY;
+ }

if (setsockopt_multicast_ipv4 (sock, IP_MULTICAST_IF,
addr, 0, 0) < 0)
@@ -173,7 +178,10 @@

/* Address shoud be any address. */
from.sin_family = AF_INET;
+ if (connected)
addr = ((struct prefix_ipv4 *) connected->address)->prefix;
+ else
+ addr.s_addr = INADDR_ANY;
from.sin_addr = addr;
#ifdef HAVE_SIN_LEN
from.sin_len = sizeof (struct sockaddr_in);
@@ -182,7 +190,6 @@
if (ripd_privs.change (ZPRIVS_RAISE))
zlog_err ("rip_interface_multicast_set: could not raise privs");

- bind (sock, NULL, 0); /* unbind any previous association */
ret = bind (sock, (struct sockaddr *) & from, sizeof (struct sockaddr_in));
if (ret < 0)
{
===================================================================
RCS file: ripd/ripd.c,v
retrieving revision 1.11
diff -uwb -r1.11 ripd/ripd.c
--- ripd/ripd.c 2003/10/15 23:20:17 1.11
+++ ripd/ripd.c 2004/01/08 15:46:45
@@ -42,6 +42,11 @@
#include "ripd/ripd.h"
#include "ripd/rip_debug.h"

+/*
+ * The source address to be used when sending the packet
+ */
+struct connected *source_address;
+
extern struct zebra_privs_t ripd_privs;

/* RIP Structure. */
@@ -1237,7 +1242,7 @@
rip_send_packet (caddr_t buf, int size, struct sockaddr_in *to,
struct interface *ifp)
{
- int ret;
+ int ret, send_sock;
struct sockaddr_in sin;

/* Make destination address. */
@@ -1252,6 +1257,7 @@
{
sin.sin_port = to->sin_port;
sin.sin_addr = to->sin_addr;
+ send_sock = rip->sock;
}
else
{
@@ -1259,11 +1265,32 @@
sin.sin_port = htons (RIP_PORT_DEFAULT);
sin.sin_addr.s_addr = htonl (INADDR_RIP_GROUP);

- /* caller has set multicast interface */
+ /*
+ * we have to open a new socket for each packet because
+ * this is the most portable way to bind to a different
+ * source ipv4 address.
+ */
+ send_sock = socket(AF_INET, SOCK_DGRAM, 0);
+ if (send_sock < 0)
+ {
+ zlog_warn("could not create socket %s", strerror(errno));
+ return -1;
+ }
+
+ sockopt_broadcast (send_sock);
+ sockopt_reuseaddr (send_sock);
+ sockopt_reuseport (send_sock);
+#ifdef RIP_RECVMSG
+ setsockopt_pktinfo (send_sock);
+#endif /* RIP_RECVMSG */
+ rip_interface_multicast_set(send_sock, source_address,
+ if_is_pointopoint(ifp));
+ /* reset source address */
+ source_address = NULL;

}

- ret = sendto (rip->sock, buf, size, 0, (struct sockaddr *)&sin,
+ ret = sendto (send_sock, buf, size, 0, (struct sockaddr *)&sin,
sizeof (struct sockaddr_in));

if (IS_RIP_DEBUG_EVENT)
@@ -1273,6 +1300,8 @@
if (ret < 0)
zlog_warn ("can't send packet : %s", strerror (errno));

+ if (!to)
+ close(send_sock);
return ret;
}

@@ -1839,7 +1868,7 @@
return len;
}

-/* Make socket for RIP protocol. */
+/* Make socket for RIP protocol and bind it to the inaddr. */
int
rip_create_socket ()
{
@@ -1848,7 +1877,6 @@
struct sockaddr_in addr;
struct servent *sp;

- memset (&addr, 0, sizeof (struct sockaddr_in));

/* Set RIP port. */
sp = getservbyname ("router", "udp");
@@ -2358,8 +2386,7 @@
if (ifaddr->family != AF_INET)
continue;

- rip_interface_multicast_set(rip->sock, connected,
- if_is_pointopoint(ifp));
+ source_address = connected;
if (vsend & RIPv1)
rip_update_interface (ifp, RIPv1, route_type, ifaddr);
if (vsend & RIPv2)
@@ -2588,8 +2615,7 @@
if (p->family != AF_INET)
continue;

- rip_interface_multicast_set(rip->sock, connected,
- if_is_pointopoint(ifp));
+ source_address = connected;
if (rip_send_packet ((caddr_t) &rip_packet, sizeof (rip_packet),
to, ifp) != sizeof (rip_packet))
return -1;
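
A minimal sketch of the ownership rule behind this fix, assuming a simplified stand-in for rip_send_packet(): the long-lived socket is only borrowed, a per-packet socket is opened when there is no explicit destination, and only that per-packet socket is closed afterwards. The names send_one, shared_sock, group and src are hypothetical, and binding to port 0 is a simplification for illustration, not what the patch does.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for rip_send_packet(). "shared_sock" plays the
 * role of rip->sock; "src" is the source address a temporary socket is
 * bound to when "to" is NULL, and "group" is the destination used then.
 * Returns the sendto() result, or -1 on error.
 */
static ssize_t
send_one (int shared_sock, const struct sockaddr_in *to,
          const struct sockaddr_in *group, const struct in_addr *src,
          const void *buf, size_t len)
{
  int send_sock = shared_sock;
  int owned = 0;                /* set only when this call opens a socket */
  const struct sockaddr_in *dst = to;
  ssize_t ret;

  assert (to != NULL || (group != NULL && src != NULL));

  if (to == NULL)
    {
      struct sockaddr_in from;

      send_sock = socket (AF_INET, SOCK_DGRAM, 0);
      if (send_sock < 0)
        return -1;
      owned = 1;

      /* Bind to the wanted source address; port 0 lets the kernel pick. */
      memset (&from, 0, sizeof from);
      from.sin_family = AF_INET;
      from.sin_addr = *src;
      if (bind (send_sock, (struct sockaddr *) &from, sizeof from) < 0)
        {
          close (send_sock);
          return -1;
        }
      dst = group;
    }

  ret = sendto (send_sock, buf, len, 0,
                (const struct sockaddr *) dst, sizeof *dst);

  /* Close only what this call opened, never the long-lived socket. */
  if (owned)
    close (send_sock);
  return ret;
}
```

Tracking ownership in a separate flag, rather than re-testing "to" at the end, keeps the close decision next to the socket() call that justifies it.
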
Re: ripd status
Could you summarize the original problem that led to this proposed
change? That would perhaps be good to include in a ChangeLog entry.

+struct connected *source_address;

This is a global, and it seems to get set in rip_send_packet. I don't
follow this.

How do the sockets get cleaned up when interfaces are removed?

- bind (sock, NULL, 0); /* unbind any previous association */

I know this is a deleted line, but this raises the question of the
portability of 'unbind' - is that failing what led to this?

(At first glance, the patch also looks like it isn't whitespace-clean,
too.)

--
Greg Troxel <gdt@ir.bbn.com>
Re: ripd status
On Fri, 9 Jan 2004, Greg Troxel wrote:

> +struct connected *source_address;
>
> This is a global, and it seems to get set in rip_send_packet. I don't
> follow this.

I think this ties in with the potential cleanup that could be done by
collecting various bits of interface/subnet related information ripd
maintains and making use of the ->info field in struct interface as a
place to store ripd specific interface information, as discussed
previously.

i think sowmini's intention was to do that cleanup at some stage when
she had the time.

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
Vests are to suits as seat-belts are to cars.
Re: ripd status
> From gdt@ir.bbn.com Fri Jan 9 10:32:09 2004
>
> Could you summarize the original problem that led to this proposed
> change? That would perhaps be good to include in a ChangeLog entry.
>
> +struct connected *source_address;
>
> This is a global, and it seems to get set in rip_send_packet. I don't
> follow this.

It's a long story- the thread to look for has the subject
"Re: Problems with ripd in quagga-0.96.4" in quagga-dev

The basic issue is: the ripv2 spec requires that ripd send out
a rip response/request packet for each connected network. For mcast
packets sent out on an interface that is on multiple networks,
this is done by setting the source address appropriately. The function
that knows the source address (the one that packages the rip payload,
and knows how to do split horizon, poison reverse etc.) is several
levels higher in the stack than the actual sending function. Rather
than tow the source address through all those functions, I'd
originally tried to do it by setting the source address on rip->sock
but that proved to be non-portable.

>
> How do the sockets get cleaned up when interfaces are removed?
>
> - bind (sock, NULL, 0); /* unbind any previous association */
>
> I know this is a deleted line, but this raises the question of the
> portability of 'unbind' - is that failing what led to this?

It wasn't portable. See [quagga-dev 427].

The only portable solution was to revert back to the original
clumsy way of open/closing a socket for each packet sent.

> (At first glance, the patch also looks like it isn't whitespace-clean,
> too.)

Unfortunately, the patch in [quagga-dev 428] did not get committed,
and I was trying to recreate from an old source-base that I had lying
around. If Kryzstof can confirm that the patch fixes all of his problems,
I can try to generate a cleaner patch.

--Sowmini
Re: ripd status
The only portable solution was to revert back to the original
clumsy way of open/closing a socket for each packet sent.

This should be in the ChangeLog entry.

The basic issue is: the ripv2 spec requires that ripd send out
a rip response/request packet for each connected network. For mcast
packets sent out on an interface that is on multiple networks,
this is done by setting the source address appropriately. The function
that knows the source address (the one that packages the rip payload,
and knows how to do split horizon, poison reverse etc.) is several
levels higher in the stack than the actual sending function. Rather
than tow the source address through all those functions, I'd
originally tried to do it by setting the source address on rip->sock
but that proved to be non-portable.

This should be explained in comments in the code. It sounds like the
situation that is troublesome is an interface with multiple prefixes
configured on it?

Using a global to pass an address in several levels really seems
unclean. If the source address really is needed by the sending
function (which it sounds like), I would really prefer a global-free
solution. I know quagga is not thread-safe, but global usage like
this is hard to understand (especially without a big comment
explaining the rules) and I'm afraid it is likely to lead to trouble
with future maintenance.
Changing a few signatures to carry a source address doesn't sound like
that bad a change.

The only portable solution was to revert back to the original
clumsy way of open/closing a socket for each packet sent.

One could keep the sockets in the interface structure rather than
open/closing.

How do you ensure that these sockets don't receive rip traffic that
might then get dropped when they are closed?
(Please add a comment rather than answering here :-)

> (At first glance, the patch also looks like it isn't whitespace-clean,
> too.)

Unfortunately, the patch in [quagga-dev 428] did not get committed,
and I was trying to recreate from an old source-base that I had lying
around. If Kryzstof can confirm that the patch fixes all of his problems,
I can try to generate a cleaner patch.

OK - patches submitted for application really need to be fully clean
and against the head of CVS. Please also include a ChangeLog entry --
see HACKING at top level for my take on patch submission guidelines.
While there isn't yet established consensus about these, no negative
comments have been received either.
Re: ripd status
>
> Using a global to pass an address in several levels really seems
> unclean.

Yes, I know. See Paul's comments about keeping things in ifp->info.
I figured that the global was no worse than doing the unbind()
(which stashes the source address info in the socket internals).

> Changing a few signatures to carry a source address doesn't sound like
> that bad a change.

this gets complicated because the functions are called from several
places, only a few of which have a meaningful source address to send.

> One could keep the sockets in the interface structure rather than
> open/closing.

Yes, see Paul's comments.

> How do you ensure that these sockets don't receive rip traffic that
> might then get dropped when they are closed?
> (Please add a comment rather than answering here :-)

e.g.,
/*
* question for kunihiro: How do you ensure that these
* sockets don't receive rip traffic that might then get dropped
* when they are closed?
*/

no, seriously, that's how the code originally was written. There's
a separate socket (rip->sock) that picks up incoming traffic.

--Sowmini
Re: ripd status
> Changing a few signatures to carry a source address doesn't sound like
> that bad a change.

this gets complicated because the functions are called from several
places, only a few of which have a meaningful source address to send.

Then NULL can be passed. On a quick glance, it looks like only
rip_send_packet needs to get an additional argument; the 'connected'
structure or its addr seems to be passed down the others.
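
A rough sketch of that idea, assuming the sender takes the source address as an explicit argument and treats NULL as "no preference". The helper name resolve_source is hypothetical and does not exist in ripd; it only shows how the NULL case could collapse to INADDR_ANY without any global state.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the source address for an outgoing packet.
 * "src" is handed in by the caller instead of being read from a global;
 * NULL means the caller has no meaningful source, so use INADDR_ANY.
 */
static struct in_addr
resolve_source (const struct in_addr *src)
{
  struct in_addr addr;

  if (src != NULL)
    addr = *src;
  else
    addr.s_addr = htonl (INADDR_ANY);
  return addr;
}
```

Callers that know the connected address would pass a pointer to it; every other call site would pass NULL and get the wildcard address.
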

e.g.,
/*
* question for kunihiro: How do you ensure that these
* sockets don't receive rip traffic that might then get dropped
* when they are closed?
*/

no, seriously, that's how the code originally was written. There's
a separate socket (rip->sock) that picks up incoming traffic.

OK, but that calls for either a comment when the socket is created
that explains why it won't get traffic, or

/* XXX This socket might receive rip traffic which we will then drop. */
Re: ripd status
On Fri, 9 Jan 2004 sowmini.varadhan@sun.com wrote:

> The only portable solution was to revert back to the original
> clumsy way of open/closing a socket for each packet sent.

at which point we might as well permanently open a socket for each
subnet (for lifetime of being active on that subnet) - if it were to
simplify things. (your original argument was socket/subnet might have
scalability problems, but surely socket/message would be worse?)
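
One way the socket-per-subnet idea could look, as a sketch only: a small table maps each source address to a long-lived descriptor, and a socket is opened just once per address. All names here (subnet_sock_cache, subnet_sock_get, counting_opener) are hypothetical; the opener is injected so the example runs without touching the network, and counting_opener is a test double rather than a real socket factory.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

#define MAX_SUBNET_SOCKS 16

/* Hypothetical cache entry: one long-lived socket per source address. */
struct subnet_sock
{
  in_addr_t src;                /* source address, network byte order */
  int fd;                       /* descriptor returned by the opener */
};

struct subnet_sock_cache
{
  struct subnet_sock entry[MAX_SUBNET_SOCKS];
  size_t count;
};

/*
 * Return the socket for "src", calling "opener" only on the first
 * request for that address. Returns -1 if the opener fails or the
 * table is full.
 */
static int
subnet_sock_get (struct subnet_sock_cache *cache, in_addr_t src,
                 int (*opener) (in_addr_t))
{
  size_t i;
  int fd;

  assert (cache != NULL && opener != NULL);

  for (i = 0; i < cache->count; i++)
    if (cache->entry[i].src == src)
      return cache->entry[i].fd;

  if (cache->count == MAX_SUBNET_SOCKS)
    return -1;

  fd = opener (src);
  if (fd < 0)
    return -1;

  cache->entry[cache->count].src = src;
  cache->entry[cache->count].fd = fd;
  cache->count++;
  return fd;
}

/* Test double: hands out fake descriptors and counts how many it made. */
static int opens;

static int
counting_opener (in_addr_t src)
{
  (void) src;
  return 100 + opens++;
}
```

With this shape the number of socket creations grows with the number of configured addresses rather than with the number of packets sent. Removing an entry when an address goes away is left out of the sketch.
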

> Unfortunately, the patch in [quagga-dev 428] did not get committed,

Ah, that can be done. It's just that it wasn't very clear at that stage
what was what. You could have included this patch in your own patch and
submitted it too.

> --Sowmini

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
One seldom sees a monument to a committee.
Re: ripd status
> > The only portable solution was to revert back to the original
> > clumsy way of open/closing a socket for each packet sent.
>
> at which point we might as well permanently open a socket for each
> subnet (for lifetime of being active on that subnet) - if it were to
> simplify things. (your original argument was socket/subnet might have
> scalability problems, but surely socket/message would be worse?)

true.

> > Unfortunately, the patch in [quagga-dev 428] did not get committed,
>
> Ah, that can be done. It's just that it wasn't very clear at that stage
> what was what. You could have included this patch in your own patch and
> submitted it too.

Let's wait for Kryzstof to confirm that this patch solves all the
technical problems and then I'll submit one big patch which addresses
the comment/cosmetic/software-engineering issues.

Kryzstof?

--Sowmini
Re: ripd status
On Fri, 9 Jan 2004 sowmini.varadhan@sun.com wrote:

> Let's wait for Kryzstof to confirm that this patch solves all the
> technical problems and then I'll submit one big patch which addresses
> the comment/cosmetic/software-engineering issues.
>
> Kryzstof?

Krzysztof ;-)


Eh :) Tested.


My small testing environment:

eth0 interface with 1 address: 192.168.0.33/24
eth3 interface with 3 addresses: 192.168.200.10/24, 192.168.200.11/24, 192.168.200.12/24

-- ripd.conf begin --
router rip
network eth3

timers basic 10 45 120

redistribute connected
-- ripd.conf end --


1. RIPv2 Multicast still works:

20:07:40.416018 192.168.200.10.520 > 224.0.0.9.520: RIPv2-req 24 (DF) [ttl 1]
20:07:40.418367 192.168.200.11.520 > 224.0.0.9.520: RIPv2-req 24 (DF) [ttl 1]
20:07:40.420541 192.168.200.12.520 > 224.0.0.9.520: RIPv2-req 24 (DF) [ttl 1]

20:07:41.415871 192.168.200.10.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:07:41.418270 192.168.200.11.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:07:41.420580 192.168.200.12.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]

20:07:54.416156 192.168.200.10.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:07:54.418620 192.168.200.11.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:07:54.420922 192.168.200.12.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]

20:08:03.426351 192.168.200.10.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:08:03.428804 192.168.200.11.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]
20:08:03.431101 192.168.200.12.520 > 224.0.0.9.520: RIPv2-resp [items 1]: {192.168.0.0/255.255.255.0}(1) (DF) [ttl 1]


2. RIPv1 Broadcast does not work properly:

20:09:38.056789 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.057499 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.058045 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.058483 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.058835 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.059392 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.059819 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.060164 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:09:38.060540 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)

20:09:38.061290 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.062060 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.062705 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.063416 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.064038 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.064838 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.065504 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.066237 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:38.066909 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)

20:09:43.056739 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.057471 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.058197 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.058828 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.059538 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.060163 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.061029 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.061646 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:43.062284 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)


20:09:56.066942 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.067626 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.068353 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.069224 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.069833 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.070450 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.071164 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.071791 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:09:56.072643 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)


Please notice that there are 9 (3^2 probably) instead of 3 requests and 9
(3^2 probably - again) instead of 3 announcements. All packets come from
the primary IP address (192.168.200.10).

With 2 addresses on eth3 (192.168.200.10/24, 192.168.200.11/24) there are
4 (2^2 probably) requests and 4 announcements:

20:11:57.019394 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:11:57.020135 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:11:57.020702 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)
20:11:57.021083 192.168.200.10.520 > 192.168.200.255.520: RIPv1-req 24 (DF)


20:11:58.019318 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:11:58.020013 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:11:58.020751 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
20:11:58.021452 192.168.200.10.520 > 192.168.200.255.520: RIPv1-resp [items 1]: {192.168.0.0}(1) (DF)
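
The counts in both traces fit a per-address loop running inside another per-address loop. The sketch below is arithmetic only: it is a guess at a mechanism consistent with the traces, not a reading of the ripd source, and packets_sent is a hypothetical name.

```c
#include <assert.h>

/*
 * Illustration only: count packets emitted when sending is driven by
 * one loop over the configured addresses, or by two such loops nested.
 */
static unsigned
packets_sent (unsigned n_addrs, int nested)
{
  unsigned sent = 0;
  unsigned outer, inner;

  for (outer = 0; outer < n_addrs; outer++)
    {
      if (!nested)
        {
          sent++;               /* one packet per address */
          continue;
        }
      for (inner = 0; inner < n_addrs; inner++)
        sent++;                 /* one packet per pair of addresses */
    }
  return sent;
}
```

For 3 addresses the nested form gives 9 packets and for 2 addresses it gives 4, against 3 and 2 for the single loop.
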



Best regards,


Krzysztof Olędzki
Re: ripd status
> > Ok.. I don't know too much about linux administration, so I used
> > the sequence of commands that seemed most obvious to me.
>
> But this creates new interfaces, each one with one IP address. IMHO this
> is not the same as one interface with many addresses. So, test results
> may vary.

[Also cc-ing the list, to get some other thoughts about this.]

Ok, I'll have to test both cases. But as for the problem itself,
I spent some time thinking about it, and talking to colleagues, and
it appears that the right thing to do, when you have multiple addresses
configured on the same subnet, is to send out exactly *one* request/response
using *one* source address for the message (else the listeners'
routing tables are going to balloon up).

To do this, ripd must probably maintain some sort of "DUP ADDR" in
struct connected, and this makes the change more complex.
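
A sketch of the "one sender per subnet" rule, assuming a simplified address record rather than the real struct connected. The names conn_addr, network_of and select_senders are hypothetical; the first address seen on each prefix is picked as the sender, which is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one configured IPv4 address. */
struct conn_addr
{
  struct in_addr addr;          /* network byte order */
  unsigned prefixlen;           /* 0..32 */
};

/* Network part of an address, in host byte order. */
static uint32_t
network_of (const struct conn_addr *c)
{
  uint32_t mask;

  assert (c->prefixlen <= 32);
  mask = c->prefixlen ? ~(uint32_t) 0 << (32 - c->prefixlen) : 0;
  return ntohl (c->addr.s_addr) & mask;
}

/*
 * Mark which entries should originate RIP packets: the first address
 * seen on each distinct prefix sends, later addresses on that prefix
 * do not. Returns the number of senders.
 */
static size_t
select_senders (const struct conn_addr *addrs, size_t n, int *sends)
{
  size_t i, j, senders = 0;

  for (i = 0; i < n; i++)
    {
      sends[i] = 1;
      for (j = 0; j < i; j++)
        if (addrs[j].prefixlen == addrs[i].prefixlen
            && network_of (&addrs[j]) == network_of (&addrs[i]))
          {
            sends[i] = 0;       /* prefix already has a sender */
            break;
          }
      senders += (size_t) sends[i];
    }
  return senders;
}
```

The sketch does not cover failover, that is, promoting another address when the chosen sender goes away.
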

Stay tuned.. I'll work on the fix for us both to test.

--Sowmini
Re: ripd status
Ok, I'll have to test both cases. But as for the problem itself,
I spent some time thinking about it, and talking to colleagues, and
it appears that the right thing to do, when you have multiple addresses
configured on the same subnet, is to send out exactly *one* request/response
using *one* source address for the message (else the listeners'
routing tables are going to balloon up).

I'll toss out that when I first heard about the situation it seemed
like ripd trying to cope with a perhaps broken world. I find the word
subnet confusing above. If one uses the IPv6 term link to refer to a
medium that can exchange packets, then I think you are talking about
the situation of multiple IPv4 prefixes configured on the same link.

It would seem a bit odd in this circumstance to send multiple rip
announcements, but on the other hand I'd expect rip to require that
only packets from an address falling within a configured prefix be
accepted. So if there are some routers with prefix A and some with
prefix B, and one with A and B (call it router Z), then perhaps Z
should be sending two announcements, and the other ones will route
through Z.

Still, this seems like a very oddly configured network. But odd does
not necessarily equal "broken, we don't support".


In any case, I think it is important to keep the concept of an
interface (a device which can send and receive packets on a link)
separate from a configured prefix.
So if one had two prefixes on one link, there would be one interface
structure which had two prefixes associated with it in the data
structures.

--
Greg Troxel <gdt@ir.bbn.com>
Re: ripd status
> like ripd trying to cope with a perhaps broken world. I find the word
> subnet confusing above.

Confusing, but perfectly valid, and not broken at all.
I can have one interface with addresses:
10.0.0.1/24, 10.0.0.2/24, 10.0.0.3/24, for all sorts of reasons e.g.,
failover when one address fails, use a specific address for DNS etc.

> If one uses the IPv6 term link to refer to a
> medium that can exchange packets, then I think you are talking about
> the situation of multiple IPv4 prefixes configured on the same link.

no, I'm talking about multiple *addresses* for one interface on
the same link.

Note that this is also different from the linux concept of "secondary"
address, or bsd's "alias" address, where the secondary/alias may or
may not be on the same prefix/subnet as the "primary" address.

For example, consider a node like this (with 3 physical interfaces,
A, B, C, but the problem wrt ripd is the same)


   ------------
   |          |
   | A  B  C  |   listening router R
   ------------       |
     |  |  |          |
     |  |  |          |
   ------------------------- N


A is the "primary" address in that it is used as source addr for
outgoing packets, with the intention that source address should fail over to
the address configured on B if the interface A fails (marked deprecated).

The way ripd is currently designed (I'm not sure what the other daemons
do) it will send out 3 packets, with source A, B, C. And R will think
that there are 3 routers, A, B, C in the network, its routing
tables will quickly balloon up, and it will be doing all sorts of
gymnastics timing out and managing this ballooned up routing table.

One possibility is for zebra to notice that B and C are really duplicate
addresses on the network N, and to flag them as ZEBRA_IFA_DUP
(leaving ZEBRA_IFA_SECONDARY as designed). This would require some changes
to functions like connected_up_ipv4 (which I'm investigating currently)-
they will have to call prefix_match instead of prefix_same, and do some
work to identify the difference between an exact match and a prefix match
(so that addresses don't get configured twice etc.)
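
The distinction between an exact match and a prefix match could be sketched like this. It assumes that an exact comparison means same address and same prefix length, and that a prefix match means the candidate falls inside the existing prefix; that reading is an assumption, not something confirmed in this thread. The type and function names below are hypothetical and are not the zebra library's own.

```c
#include <assert.h>
#include <stdint.h>

enum addr_relation
{
  ADDR_UNRELATED,               /* different subnet: separate network */
  ADDR_EXACT,                   /* same address and length: already known */
  ADDR_DUP                      /* same subnet, different host address */
};

/* Hypothetical IPv4 prefix: address in host byte order plus length. */
struct v4_prefix
{
  uint32_t addr;
  unsigned len;                 /* 0..32 */
};

static uint32_t
mask_of (unsigned len)
{
  return len ? ~(uint32_t) 0 << (32 - len) : 0;
}

/* Relate a candidate address to one that is already configured. */
static enum addr_relation
classify (const struct v4_prefix *have, const struct v4_prefix *cand)
{
  assert (have->len <= 32 && cand->len <= 32);

  /* Exact match first, so a re-announced address is not treated as DUP. */
  if (have->len == cand->len && have->addr == cand->addr)
    return ADDR_EXACT;

  /* Candidate lies inside the existing prefix. */
  if (cand->len >= have->len
      && ((have->addr ^ cand->addr) & mask_of (have->len)) == 0)
    return ADDR_DUP;

  return ADDR_UNRELATED;
}
```

Checking the exact case before the containment case is what keeps an address from being configured twice while still flagging its same-subnet siblings.
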

This solution is non-trivial, because the daemons now have to recognize
when to fail-over, when to promote something from DUP to primary etc.

--Sowmini
Re: ripd status
no, I'm talking about multiple *addresses* for one interface on
the same link.

Right, so you have multiple subnets on the same link in this case,
since subnet is a prefix-based concept, not a link-based concept.
This is what I meant was potentially confusing.

Note that this is also different from the linux concept of "secondary"
address, or bsd's "alias" address, where the secondary/alias may or
may not be on the same prefix/subnet as the "primary" address.

Good points.

The way ripd is currently designed (I'm not sure what the other daemons
do) it will send out 3 packets, with source A, B, C. And R will think
that there are 3 routers, A, B, C in the network, its routing
tables will quickly balloon up, and it will be doing all sorts of
gymnastics timing out and managing this ballooned up routing table.

Before talking about how to change the code, we really need to have a
very clear understanding of what the specs say and what the correct
behavior is.

One possibility is for zebra to notice that B and C are really duplicate
addresses on the network N, and to flag them as ZEBRA_IFA_DUP
(leaving ZEBRA_IFA_SECONDARY as designed). This would require some changes

Perhaps, but we really need to understand the usages and meanings of
those flags, and fix up the docs for them. Right now things are a bit
underdocumented, but do work for the most part.
Re: ripd status
>
> The way ripd is currently designed (I'm not sure what the other daemons
> do) it will send out 3 packets, with source A, B, C. And R will think
> that there are 3 routers, A, B, C in the network, its routing
> tables will quickly balloon up, and it will be doing all sorts of
> gymnastics timing out and managing this ballooned up routing table.
>
> Before talking about how to change the code, we really need to have a
> very clear understanding of what the specs say and what the correct
> behavior is.
>

My understanding of the RIP specs (ipv4 and ipv6) is that you should
send out one packet on each connected network, i.e., on each connected
prefix/subnet. So, in the case discussed earlier (both the A/B/C case
and the one where there's one interface with addresses
{10.0.0.1/24, 10.0.0.2/24, 10.0.0.3/24}), you have *one* connection
to the 10.0.0.0/24 network and you should send out *one* packet on this
network.

Seems like ospfd does something to this effect too- Paul tells
me that it will not send packets on ZEBRA_IFA_SECONDARY networks,
which strikes me as behavior in the same spirit?

--Sowmini
Re: ripd status
My understanding of the RIP specs (ipv4 and ipv6) is that you should
send out one packet on each connected network, i.e., on each connected
prefix/subnet.

This is what I meant was confusing. By 'network', do you mean 'link'
or 'prefix'? From the 'i.e.', I gather you mean prefix.

{10.0.0.1/24, 10.0.0.2/24, 10.0.0.3/24}), you have *one* connection
to the 10.0.0.0/24 network and you should send out *one* packet on this
network.

Is your case of interest like this, or with different prefixes?

Having things marked ZEBRA_IFA_SECONDARY for (all but one of)
prefixes/addresses which are prefix_cmp-equal to another address sounds
like it might well be the right thing, but I'd like to see the
semantics for ZEBRA_IFA_SECONDARY documented.

find . -name \*.[ch]|xargs egrep IFA_SECONDARY

./lib/if.h:#define ZEBRA_IFA_SECONDARY (1 << 0)
./ospfd/ospfd.c: if (CHECK_FLAG(co->flags,ZEBRA_IFA_SECONDARY))
./zebra/interface.c: if (CHECK_FLAG (connected->flags, ZEBRA_IFA_SECONDARY))
./zebra/interface.c: SET_FLAG (ifc->flags, ZEBRA_IFA_SECONDARY);
./zebra/interface.c: SET_FLAG (ifc->flags, ZEBRA_IFA_SECONDARY);
./zebra/interface.c: if (CHECK_FLAG (ifc->flags, ZEBRA_IFA_SECONDARY))
./zebra/rt_netlink.c: SET_FLAG (flags, ZEBRA_IFA_SECONDARY);
./zebra/rt_netlink.c: if (CHECK_FLAG (ifc->flags, ZEBRA_IFA_SECONDARY))

It seems only ospfd uses this, but I only see that it gets set via
netlink or the ip_address_secondary command.

#ifdef HAVE_NETLINK
DEFUN (ip_address_secondary,
       ip_address_secondary_cmd,
       "ip address A.B.C.D/M secondary",
       "Interface Internet Protocol config commands\n"
       "Set the IP address of an interface\n"
       "IP address (e.g. 10.0.0.1/8)\n"
       "Secondary IP address\n")
{
  return ip_address_install (vty, vty->index, argv[0], NULL, NULL, 1);
}
Re: ripd status [ In reply to ]
> send out one packet on each connected network, i.e., on each connected
> prefix/subnet.
>
> This is what I meant was confusing. By 'network', do you mean 'link'
> or 'prefix'? From the 'i.e.', I gather you mean prefix.

By your definitions, Prefix.

> {10.0.0.1/24, 10.0.0.2/24, 10.0.0.3/24}), you have *one* connection
> to the 10.0.0.0/24 network and you should send out *one* packet on this
> network.
>
> Is your case of interest like this, or with different prefixes?

My case of interest is all of the above, but oleq's particular case is
like the set I indicate above.

>
> Having things marked ZEBRA_IFA_SECONDARY for (all but one of)
> prefixes/addresses which are prefix_cmp-equal to another address sounds

no, it does not.. how can ripd know the difference between the case
when a secondary address is a duplicated prefix (don't send packets)
as compared to when it is not (2 connected networks/links/prefixes- send
packet).

> It seems only ospfd uses this, but I only see that it gets set via
> netlink or the ip_address_secondary command.

I think the linux kernel also sets it, but afaik BSD does not. And of course,
there's no concept of alias/secondary on Solaris. I don't know what the
other flavors of unix do.

--Sowmini
Re: ripd status [ In reply to ]
> > Having things marked ZEBRA_IFA_SECONDARY for (all but one of)
> > prefixes/addresses which are prefix_cmp-equal to another address sounds
>
> no, it does not.. how can ripd know the difference between the case
> when a secondary address is a duplicated prefix (don't send packets)
> as compared to when it is not (2 connected networks/links/prefixes- send
> packet).

This is why the semantics of ZEBRA_IFA_SECONDARY need to be defined
precisely, or perhaps something else, and then interface and address
adding/deleting modified to respect those invariants. Clearly one can
tell the difference between 2 addrs in one prefix and 2 addrs in
different prefixes.

It remains to be seen whether this is what ZEBRA_IFA_SECONDARY really
means, though.
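
As a rough sketch of the "one can tell the difference" point (IPv4 only, and not quagga's struct prefix or prefix_cmp; every name below is made up for illustration), two addresses are in the same prefix when their prefix lengths match and their masked network parts are equal:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from dotted-quad parts. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical address/prefix-length pair. */
struct addr4 {
    uint32_t addr;   /* host byte order */
    uint8_t  plen;   /* 0..32 */
};

uint32_t prefix_mask4(uint8_t plen)
{
    if (plen == 0)
        return 0;
    if (plen >= 32)
        return 0xFFFFFFFFu;
    return 0xFFFFFFFFu << (32 - plen);
}

/* True when a and b are two addresses inside one and the same prefix. */
bool same_prefix(struct addr4 a, struct addr4 b)
{
    if (a.plen != b.plen)
        return false;
    uint32_t m = prefix_mask4(a.plen);
    return (a.addr & m) == (b.addr & m);
}

/* Number of distinct prefixes among n addresses, i.e. how many updates
 * "one packet per connected prefix" would produce for this interface. */
size_t count_distinct_prefixes(const struct addr4 *v, size_t n)
{
    assert(v != NULL || n == 0);

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++)
            seen = same_prefix(v[i], v[j]);
        if (!seen)
            count++;
    }
    return count;
}
```

For {10.0.0.1/24, 10.0.0.2/24, 10.0.0.3/24} this counts one prefix. Adding 172.16.3.33/24 makes it two. Whether a flag should carry this information, or ripd should compute it itself, is the open question.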

> I think the linux kernel also sets it, but afaik BSD does not. And of course,
> there's no concept of alias/secondary on Solaris. I don't know what the
> other flavors of unix do.

You mean on solaris you can't ifconfig 2 ip addrs on the same prefix
on an interface?
Re: ripd status [ In reply to ]
>
> I think the linux kernel also sets it, but afaik BSD does not. And of course,
> there's no concept of alias/secondary on Solaris. I don't know what the
> other flavors of unix do.
>
> You mean on solaris you can't ifconfig 2 ip addrs on the same prefix
> on an interface?

No, that would be a pretty gross omission, and one that would irk customers.

I mean that the concept of a kernel-defined "secondary" address flag
(IFA_F_SECONDARY in the netlink headers) is pretty much unique to Linux.

--Sowmini
Re: ripd status [ In reply to ]
Greg Troxel wrote:

> > Having things marked ZEBRA_IFA_SECONDARY for (all but one of)
> > prefixes/addresses which are prefix_cmp-equal to another address sounds
>
> no, it does not.. how can ripd know the difference between the case
> when a secondary address is a duplicated prefix (don't send packets)
> as compared to when it is not (2 connected networks/links/prefixes- send
> packet).
>
> This is why the semantics of ZEBRA_IFA_SECONDARY need to be defined
> precisely, or perhaps something else, and then interface and address
> adding/deleting modified to respect those invariants. Clearly one can
> tell the difference between 2 addrs in one prefix and 2 addrs in
> different prefixes.
>
> It remains to be seen whether this is what ZEBRA_IFA_SECONDARY really
> means, though.

In fact, it does not: this goes back to historical discussions about the
meaning of the term "secondary" in zebra/quagga (for example, see
[quagga-dev 72] and follow-ups). Briefly speaking, I believe the
original need emerged from Cisco compatibility, where a secondary
address is any non-primary address assigned to an interface, and each
interface may have exactly one primary address. On the other hand, a
secondary address in other contexts (Linux for sure, I believe
BSD-variants as well) means subsequent addresses of a single subnet
prefix assigned to a single interface; nonetheless, in Linux for
example, it's the kernel's exclusive role to mark addresses as
secondary. Absurdly enough, zebra/quagga attempts to *set* the secondary
flag for addresses assigned via netlink, obviously in vain... Thus, we
get a considerable inconsistency with regard to what "secondary" and
ZEBRA_IFA_SECONDARY really stand for.

How can it be changed? IMO, there are two possible paths, both
incomplete in some sense:

#1, make 'secondary' a configuration property only, in order to retain
the Cisco-like API, and refrain from any attempt to address forwarding
related issues affected by this attribute as applied by the kernel.
That's approximately what things are today, IMO.

#2, stick to a finer scheme, by which secondary is a kernel-derived
attribute, which is aimed at addressing actual forwarding issues (namely,
the tie break for which source address to use for outbound packets). Not a
bad idea, but it requires extensive modifications, as the zebra daemon must
track 'secondary' flags as they are received from the underlying kernel,
on the one hand, and on the other hand disable user intervention in the
form of the Cisco-like 'secondary' keyword. Is it worth the effort?

Note that these issues, and whatever decisions taken, may have further
effect that we aren't quite aware of: for example, how should
link-oriented protocols like RIP and OSPF treat a chain of
primary-secondaries addresses? Should they be treated as a single
"bundle" for the sake of transmitting/receiving route updates? What
happens when a primary address is deleted? (this is also tightly
hand-in-hand with the behavior of the specific underlying kernel) And so
on. In other words, we need to do careful thinking and come up with the
most generalized approach that addresses all such issues.

Still want to talk about secondaries in quagga?... ;->

Gilad


(PS: AFAICT, an "interface" has nothing to do with "link" with regard to
this particular discussion.)
Re: ripd status [ In reply to ]
>
> Still want to talk about secondaries in quagga?... ;->

yeah!

I have no familiarity with Linux, but I've been playing with it, and I see
that when I configure IP addresses using some /sbin/ip incantations
I picked up from Paul and Krzysztof, I get:

# ip -4 add
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    inet 10.0.0.33/25 brd 10.255.255.255 scope global eth1
    inet 172.16.3.33/24 scope global eth1
    inet 10.0.0.34/25 scope global secondary eth1

(why can't good old 'ifconfig -a' display these addresses? but that's
another problem)

So I was wrong in my assumption earlier that "secondary" is set
by the kernel for all non-primary addresses. It seems like "secondary"
is only set for (what I've been calling) duplicate addresses.
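
A small model of that observation (a sketch of the behavior inferred from the output above, not the kernel's code; `struct if_addr` and `mark_secondaries` are hypothetical names): walking the addresses in the order they were configured, an address is marked secondary when an earlier, non-secondary address on the same interface has the same prefix length and the same network part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from dotted-quad parts. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical per-interface address entry. */
struct if_addr {
    uint32_t addr;       /* host byte order */
    uint8_t  plen;       /* 0..32 */
    bool     secondary;  /* filled in by mark_secondaries() */
};

uint32_t netmask4(uint8_t plen)
{
    if (plen == 0)
        return 0;
    if (plen >= 32)
        return 0xFFFFFFFFu;
    return 0xFFFFFFFFu << (32 - plen);
}

/* Assumed rule: the first address in a subnet is primary, and any later
 * address in that same subnet (same mask, same network) is secondary. */
void mark_secondaries(struct if_addr *v, size_t n)
{
    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        v[i].secondary = false;
        for (size_t j = 0; j < i; j++) {
            if (!v[j].secondary &&
                v[j].plen == v[i].plen &&
                ((v[j].addr ^ v[i].addr) & netmask4(v[i].plen)) == 0) {
                v[i].secondary = true;
                break;
            }
        }
    }
}
```

Fed the three eth1 addresses in the order shown, this marks only 10.0.0.34/25, which matches the 'ip' output; 172.16.3.33/24 stays primary because it is alone in its prefix.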

What does BSD do with the interface flags(if anything at all)?

On Solaris, just adding another address (as opposed to explicitly
configuring IPMP groups) via 'ifconfig <intf> addif <....>' does
nothing remarkable:

eri0: flags=1104843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,ROUTER,IPv4> ...
        inet 10.0.0.102 netmask ffffff80 broadcast <...>
eri0:1: flags=1100843<UP,BROADCAST,RUNNING,MULTICAST,ROUTER,IPv4> ...
        inet 10.0.0.103 netmask ffffff80 broadcast <...>


> (PS: AFAICT, an "interface" has nothing to do with "link" with regard to
> this particular discussion.)

In my dictionary, yes.. To me, an interface is a physical card,
link is what Greg calls "prefix/subnet". Therefore, you can
have multiple interfaces on the same link/prefix/subnet, one
interface on multiple links/prefixes/subnets.

But that's all a matter of definition. I think we are all talking
about the same thing here.

--Sowmini
Re: ripd status [ In reply to ]
sowmini.varadhan@Sun.COM wrote:

> So I was wrong in my assumption earlier that "secondary" is set
> by the kernel for all non-primary addresses. It seems like "secondary"
> is only set for (what I've been calling) duplicate addresses.

Yes, and there's more: for example, try to delete the primary address
and see what happens...


> On Solaris, just adding another address (as opposed to explicitly
> configuring IPMP groups) via 'ifconfig <intf> addif <....>' does
> nothing remarkable:
>
> eri0: flags=1104843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,ROUTER,IPv4> ...
> inet 10.0.0.102 netmask ffffff80 broadcast <...>
> eri0:1: flags=1100843<UP,BROADCAST,RUNNING,MULTICAST,ROUTER,IPv4> ...
> inet 10.0.0.103 netmask ffffff80 broadcast <...>

So, what happens when you do 'ping 10.0.0.105'? What source address do
the packets carry? (My guess: that of the non-alias "interface".) What
happens when you remove 10.0.0.102? (My guess: you lose all subsequent
aliases.) If I'm right here -- and bear in mind that I've never used
Solaris or the like -- then the "primary" property implicitly holds
for the non-alias interface, and not for the aliases. Am I mistaken?

Gilad
