Mailing List Archive

Stale pointer crashes OSPFD under FreeBSD 5.1 Current
Hi,

Under FreeBSD 5.1 CURRENT, ospfd crashes with a "bus error"
when adding interfaces during startup.

The crash is due to a pointer into a structure, which had been
freed immediately before, being returned from a subroutine.

FreeBSD 5.1 CURRENT, unlike its predecessors, seems to overwrite
memory during "free" calls.

The problem happens in ospf_if_table_lookup (in ospf_interface.c),
where "route_unlock_node" is called immediately before
"return (struct ospf_interface *) rn->info".

The "route_unlock_node" call, by decrementing the route structure's
lock count to zero, causes the route structure to be freed.

Under FreeBSD 5.1 CURRENT, the route structure gets clobbered,
and the returned pointer is no longer the expected NULL, causing
ospfd to crash.

The attached patch saves the pointer before calling "route_unlock_node".
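
The ordering problem described above can be sketched as follows. This
is only an illustration with simplified stand-in types and hypothetical
names (node_new, node_unlock, lookup_info, nodes_freed); it is not the
attached patch and not the actual Quagga definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for illustration only; the real Quagga route
 * node has more fields than this. */
struct route_node {
    unsigned int lock;  /* reference count */
    void *info;         /* payload, e.g. a struct ospf_interface * */
};

/* Hypothetical counter so a caller can observe that a node was freed. */
static int nodes_freed = 0;

/* Allocate a node holding one reference. */
static struct route_node *node_new(void *info)
{
    struct route_node *rn = malloc(sizeof *rn);

    if (rn == NULL)
        abort();
    rn->lock = 1;
    rn->info = info;
    return rn;
}

/* Drop one reference; free the node when the count reaches zero. */
static void node_unlock(struct route_node *rn)
{
    assert(rn->lock > 0);
    if (--rn->lock == 0) {
        free(rn);
        nodes_freed++;
    }
}

/* Unsafe ordering (left as a comment because it reads freed memory):
 *
 *     node_unlock(rn);
 *     return rn->info;
 */

/* Safe ordering: copy the payload pointer out, then unlock. */
static void *lookup_info(struct route_node *rn)
{
    void *info;

    if (rn == NULL)
        return NULL;
    info = rn->info;
    node_unlock(rn);
    return info;
}
```

With the safe ordering, the value returned no longer depends on what
the allocator does to the node's memory once it has been freed.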

Regards,
Claus.
--
--------------------------------------------------------
Claus Endres | Phone: +61-3-5998 2310
Endres Consulting Pty. Ltd. | Mobile: +61-418-595 136
10 Facey Road | Fax: +61-3-5998 2540
Devon Meadows, VIC 3977 | claus@endresconsulting.com
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
Claus Endres wrote:
> Hi,
>
> Under FreeBSD 5.1 CURRENT, ospfd crashes with a "bus error"
> when adding interfaces during startup.
>
> The crash is due to a pointer into a structure, which had been
> "free"ed immediately before, being returned from subroutine.
>
> FreeBSD 5.1 CURRENT, unlike its predecessors, seems to overwrite
> memory during "free" calls.
>
> The problem happens in ospf_if_table_lookup (in ospf_interface.c),
> where "route_unlock_node" is called immediately before
> "return (struct ospf_interface *) rn->info".
>
> The "route_unlock_node", decrementing the lock count to the route
> structure to zero, causes the route structure to be free`ed.
>
> Under FreeBSD 5.1 CURRENT, the route structure gets clobbered,
> and the returned pointer is no longer the expected NULL, causing
> ospfd to crash.
>
> The attached patch saves the pointer before calling
> "route_unlock_node".

This patch seems to fix [quagga-users 828] as well. I'm not sure how
correct it is and I don't have time to look at it closely. Paul has
an opinion for sure ;).

--
Hasso Tepper
Elion Enterprises Ltd.
WAN administrator
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
On Fri, 7 Nov 2003, Claus Endres wrote:

> The problem happens in ospf_if_table_lookup (in ospf_interface.c),
> where "route_unlock_node" is called immediately before "return
> (struct ospf_interface *) rn->info".
>
> The "route_unlock_node", decrementing the lock count to the route
> structure to zero, causes the route structure to be free`ed.
>
> Under FreeBSD 5.1 CURRENT, the route structure gets clobbered, and
> the returned pointer is no longer the expected NULL, causing ospfd
> to crash.

Yum.

How's the clobber happening btw?

> The attached patch saves the pointer before calling
> "route_unlock_node".

Thanks.

> Regards,
> Claus.

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
Friction is a drag.
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
Paul Jakma wrote:

>
> How's the clobber happening btw?
>

The memory is overwritten by a pattern of all 0x0d characters.
I assume it is for debug/security purposes, as it did not
happen in FreeBSD 5.1 RELEASE, only in CURRENT.
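
To illustrate why such a fill pattern exposes stale pointers, here is
a rough sketch of a debug-style free that overwrites a block before
releasing it. The helper names (junk_fill, junk_free, is_all_junk) are
made up for this example, the caller has to pass the block size, and
this is not how the FreeBSD allocator itself is implemented.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill byte as reported in this thread; an assumption for the sketch. */
#define JUNK_BYTE 0x0d

/* Overwrite a block with JUNK_BYTE. The volatile pointer keeps the
 * compiler from dropping the stores as dead writes before free(). */
static void junk_fill(void *p, size_t size)
{
    volatile unsigned char *bytes = p;
    size_t i;

    if (p == NULL)
        return;
    for (i = 0; i < size; i++)
        bytes[i] = JUNK_BYTE;
}

/* Hypothetical debug wrapper: poison the block, then release it. */
static void junk_free(void *p, size_t size)
{
    junk_fill(p, size);
    free(p);
}

/* Return 1 if every byte of the block equals JUNK_BYTE, else 0. */
static int is_all_junk(const void *p, size_t size)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < size; i++)
        if (bytes[i] != JUNK_BYTE)
            return 0;
    return 1;
}
```

A pointer field that held NULL before the fill consists entirely of
0x0d bytes afterwards, so code that reads it through a stale pointer
sees a non-NULL garbage address instead of the NULL it expected.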

Regards,
Claus.
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
On Mon, 10 Nov 2003, Claus Endres wrote:

> The memory is overwritten by a pattern of all 0x0d characters. I
> assume it is for debug/security purposes, as it did not happen in
> FreeBSD 5.1 RELEASE, only in CURRENT.

neat. it will no doubt throw up lots of bugs in lots of programmes.
nice :)

> Regards,
> Claus.

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
warning: do not ever send email to spam@dishone.st
Fortune:
"The eleventh commandment was `Thou Shalt Compute' or `Thou Shalt Not
Compute' -- I forget which."
-- Epigrams in Programming, ACM SIGPLAN Sept. 1982
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
On Mon, Nov 10, 2003 at 06:07:20AM +0000, Paul Jakma wrote:

> > The memory is overwritten by a pattern of all 0x0d characters. I
> > assume it is for debug/security purposes, as it did not happen in
> > FreeBSD 5.1 RELEASE, only in CURRENT.
>
> neat. it will no doubt throw up lots of bugs in lots of programmes.
> nice :)

If you're worried about this kind of thing, use Valgrind on your
application (developer.kde.org/~sewardj). I was planning on running
it on quagga, but haven't found the time yet. Generally valgrind
reveals a lot of programming errors.


--L
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
On Mon, Nov 10, 2003 at 04:36:43AM -0500, Lennert Buytenhek wrote:
> On Mon, Nov 10, 2003 at 06:07:20AM +0000, Paul Jakma wrote:
>
> > > The memory is overwritten by a pattern of all 0x0d characters. I
> > > assume it is for debug/security purposes, as it did not happen in
> > > FreeBSD 5.1 RELEASE, only in CURRENT.
> >
> > neat. it will no doubt throw up lots of bugs in lots of programmes.
> > nice :)
> If you're worried about this kind of thing, use Valgrind on your
> application (developer.kde.org/~sewardj). I was planning on running
> it on quagga, but didn't find the time yet. Generally valgrind
> reveals a lot of programming errors.

Greetings,
I tried to run ospfd under valgrind (ver 2.0.0) and I can't get ospfd
to actually come up and run. I even tried using --skin=none and still
no luck (so far I have waited 20-30 minutes for ospfd to start).
Neighbors don't see this router and a tcpdump on the box shows no
activity.

In particular I am trying to get more details to help debug the issue
where ospfd crashes when certain interfaces are brought up/down:
http://lists.quagga.net/pipermail/quagga-users/2003-November/000827.html

If I have ospfd configured for interface eth2 and then I add an additional
IP to that interface (what keepalived does) and then remove the IP, poof.

I can also get the same effect if I bring up eth2:3 (or any other alias)
but don't have the 'eth2:3' configured in ospfd.conf.

From an initial look at the code, it seems like the base lib code wants
a 1 to 1 relationship between IPs and devices. But in Linux iproute2
land this is no longer the case.

As mentioned in:
http://lists.quagga.net/pipermail/quagga-users/2003-November/000986.html

We have worked around the problem (by not listing our networks that
don't have another OSPF router on them and using redistribute
connected).

But I am willing to spend time helping to debug/fix the bug that causes
ospfd to segfault. I am comfortable in gdb and with reading C code, and
I have test routers I can muck with, so I am really just looking for
some ideas/pointers on where to start looking.

Best Regards,
Steven Roberts
Re: Stale pointer crashes OSPFD under FreeBSD 5.1 Current
On Sun, Nov 30, 2003 at 08:25:56PM -0800, Steven Roberts wrote:

> > > > The memory is overwritten by a pattern of all 0x0d characters. I
> > > > assume it is for debug/security purposes, as it did not happen in
> > > > FreeBSD 5.1 RELEASE, only in CURRENT.
> > >
> > > neat. it will no doubt throw up lots of bugs in lots of programmes.
> > > nice :)
> >
> > If you're worried about this kind of thing, use Valgrind on your
> > application (developer.kde.org/~sewardj). I was planning on running
> > it on quagga, but didn't find the time yet. Generally valgrind
> > reveals a lot of programming errors.
>
> Greetings,
> I tried to run ospfd under valgrind (ver 2.0.0) and I can't get ospfd
> to actually come up and run. I even tried using --skin=none and still
> no luck (so far waited 20-30 minutes for ospfd to start). Neighbors
> don't see any this router and a tcpdump on the box shows no activity.

Have you tried strace()ing valgrind? Do you see any system calls being
called? Does valgrind or ospfd go to 100% CPU? Can you telnet into the
ospfd port?


--L