Paul,
From looking at the history of Zebra (before Quagga), I saw several
references to misalignment problems on Solaris. Looking at the patch code,
though, I see that the main system programming for BGPd and Zebra should not
have this issue.
When BGPD dies, it does not write an error to bgpd.log, nor does it dump core.
The only real "error" I get is when I run it without the -d option: it
prints "Bus Error" when it finally dies. Lacking a core file, I ran
truss on the daemon while it was running. I am also trying to get GDB to
produce some form of core dump for more tracking.
Thoughts?
From the last entries of the truss output (when BGPD crashes), I gathered
that it was an alignment issue. See below:
12192: poll(0xFFBFD758, 11, 7982) (sleeping...)
12192: fd=5 ev=POLLRDNORM rev=0
12192: fd=8 ev=POLLRDNORM rev=0
12192: fd=9 ev=POLLRDNORM rev=0
12192: fd=10 ev=POLLRDNORM rev=0
12192: fd=12 ev=POLLOUT|POLLRDNORM rev=0
12192: fd=13 ev=POLLRDNORM rev=0
12192: fd=14 ev=POLLRDNORM rev=POLLRDNORM
12192: fd=15 ev=POLLRDNORM rev=0
12192: fd=16 ev=POLLOUT|POLLRDNORM rev=0
12192: fd=17 ev=POLLRDNORM rev=0
12192: fd=18 ev=POLLOUT|POLLRDNORM rev=0
12192: 1935.5612 poll(0xFFBFD758, 11, 7982) = 1
12192: fd=5 ev=POLLRDNORM rev=0
12192: fd=8 ev=POLLRDNORM rev=0
12192: fd=9 ev=POLLRDNORM rev=0
12192: fd=10 ev=POLLRDNORM rev=0
12192: fd=12 ev=POLLOUT|POLLRDNORM rev=0
12192: fd=13 ev=POLLRDNORM rev=POLLRDNORM
12192: fd=14 ev=POLLRDNORM rev=0
12192: fd=15 ev=POLLRDNORM rev=0
12192: fd=16 ev=POLLOUT|POLLRDNORM rev=0
12192: fd=17 ev=POLLRDNORM rev=0
12192: fd=18 ev=POLLOUT|POLLRDNORM rev=0
12192: 1935.5621 getpid() = 12192 [1]
12192: 1935.5623 open("/proc/12192/usage", O_RDONLY) = 19
12192: 1935.5626 read(19, "\0\0\0\0\0\0\001\0\0 7E6".., 256) = 256
12192: 1935.5628 close(19) = 0
12192: 1935.5630 fcntl(13, F_GETFL, 0x00000000) = 2
12192: 1935.5631 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:02 GMT 2004 [ 1078357442 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5635 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF740, 16711680) = 0
12192: 1935.5637 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:02 GMT 2004 [ 1078357442 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5641 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 16711680) = 0
12192: 1935.5643 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 16711680) = 0
12192: 1935.5645 fcntl(13, F_SETFL, 0x00000082) = 0
12192: 1935.5647 read(13, "FFFFFFFFFFFFFFFFFFFFFFFF".., 19) = 19
12192: 1935.5649 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:06 GMT 2004 [ 1078357446 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5653 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 0) = 0
12192: 1935.5655 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 0) = 0
12192: 1935.5656 fcntl(13, F_SETFL, 0x00000002) = 0
12192: 1935.5658 fcntl(13, F_GETFL, 0x00000000) = 2
12192: 1935.5659 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:06 GMT 2004 [ 1078357446 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5663 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF740, 0) = 0
12192: 1935.5665 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:06 GMT 2004 [ 1078357446 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5668 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 0) = 0
12192: 1935.5670 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 0) = 0
12192: 1935.5671 fcntl(13, F_SETFL, 0x00000082) = 0
12192: 1935.5673 read(13, "\0031086F2\0 2 @0101\0 @".., 61) = 61
12192: 1935.5675 fstat64(13, 0xFFBFF648) = 0
12192: d=0x03C00000 i=50408 m=0140666 l=0 u=0 g=0 sz=0
12192: at = Mar 3 23:44:06 GMT 2004 [ 1078357446 ]
12192: mt = Mar 3 23:43:58 GMT 2004 [ 1078357438 ]
12192: ct = Mar 3 22:33:26 GMT 2004 [ 1078353206 ]
12192: bsz=8192 blks=0 fs=ufs
12192: 1935.5680 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 0) = 0
12192: 1935.5682 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 0) = 0
12192: 1935.5684 fcntl(13, F_SETFL, 0x00000002) = 0
12192: 1935.5685 time() = 1078357446
12192: 1935.5687 Incurred fault #5, FLTACCESS %pc = 0x0003E5D8
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41
12192: 1935.5690 Received signal #10, SIGBUS [default]
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41
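For what it is worth, the faulting address in the trace, 0x00200C41, is odd,
which is consistent with BUS_ADRALN: SPARC requires 2-, 4- and 8-byte loads
and stores to be naturally aligned. Below is a minimal sketch in C of how to
check an address for alignment and how to read a 32-bit network-order field
from a packet buffer without relying on an aligned load. The names is_aligned
and read_be32 are hypothetical helpers for illustration, not functions from
Quagga, and I have not confirmed which bgpd code path is actually at fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of size, 0 otherwise.
 * size must be a power of two (2, 4, 8, ...). */
static int is_aligned(uintptr_t addr, size_t size)
{
    return (addr & (uintptr_t)(size - 1)) == 0;
}

/* Reads a big-endian (network order) 32-bit value one byte at a time.
 * Byte loads have no alignment requirement, so this is safe at any
 * address, unlike casting p to uint32_t * and dereferencing it. */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

With these helpers, is_aligned(0x00200C41, 4) is false, so a 4-byte load from
that address would be expected to fault on SPARC, whereas read_be32() at the
same offset into a buffer would not.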
-----Original Message-----
From: Paul Jakma [mailto:Paul.Jakma@Sun.COM]
Sent: Thu 3/4/2004 8:18 PM
To: Gibbs, Michael
Cc: quagga-dev@lists.quagga.net
Subject: Re: [quagga-dev 952] Re: Problem with revision of Route Server patch
On Thu, 4 Mar 2004, Gibbs, Michael wrote:
> This looks like an address alignment issue,
What makes you think that?
> in which I see several references to a problem based on ipv6 code, but
> no patches suggesting the ipv4 code in quagga or Zebra has this issue
> on Solaris. Is this a known bug?
Can you be more specific? What references?
Also, you are running on Solaris, right?
> Mike Gibbs
--paulj