Mailing List Archive

Problem with revision of Route Server patch
Jose,

Looking into it further, this is not a hardware issue. The crash seems to show
up once we pass 130k routes; with only peer1 and peer2 online, everything works
without a hitch. The layout is as follows, and each peer is tagged as an
rs-client in the config:

peer1: no routes in/no routes out
peer2: 130k routes in/no routes out
peer3: 130k routes in/no routes out
peer4: 114k routes in/no routes out

Also, I do not get a core file when bgpd crashes. Is this normal, or have I
missed a configuration step to turn on core dumps?
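
On the core file question: one common reason for a missing core file on
Unix-like systems is a core size limit of zero in the environment that starts
the daemon. Below is a minimal sketch of how to check that, assuming Python is
available on the host and the script is run from the same shell that launches
bgpd. The function name core_dumps_allowed is made up for illustration, and the
check does not cover other possible causes, such as system policy for processes
that change user ID.

```python
import resource


def core_dumps_allowed(soft_limit: int) -> bool:
    """Return True if a soft RLIMIT_CORE value permits writing a core file.

    A soft limit of 0 suppresses core files; RLIM_INFINITY means no size cap.
    """
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


if __name__ == "__main__":
    # Limits are inherited by child processes, so this reflects what a
    # daemon started from this shell would see.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print(f"soft core limit: {soft}, hard core limit: {hard}")
    if not core_dumps_allowed(soft):
        print("core files are disabled in this environment; raising the "
              "limit (e.g. 'ulimit -c unlimited') before starting bgpd "
              "may help")
```

If the soft limit turns out to be non-zero, the limit is probably not the
cause and the answer lies elsewhere.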

Mike

-----Original Message-----
From: Gibbs, Michael
Sent: Wed 3/3/2004 7:58 PM
To: Jose Luis Rubio; quagga-dev@lists.quagga.net
Cc:
Subject: RE: [quagga-dev 944] Re: Revision of Route Server patch


Jose,

I didn't see much growth in memory usage from the commands below. After an
hour and 10 minutes, usage was stable at a fixed level. Six minutes after
that, bgpd just died. I ran it without -d to see what stderr or stdout would
say when it died, and it reported a Bus error when it crashed. I am running
diagnostics on the machine, but so far the hardware seems clean. Thoughts?

Mike Gibbs

-----Original Message-----
From: Jose Luis Rubio [mailto:jrubio@dit.upm.es]
Sent: Wed 3/3/2004 10:23 AM
To: Gibbs, Michael; quagga-dev@lists.quagga.net
Cc:
Subject: Re: [quagga-dev 944] Re: Revision of Route Server patch



Well, it might be a memory leak. Could you send me the output of
the commands 'show memory bgp' and 'show memory lib' at different
times, for example after 10 min, 30 min, 1 hour, 1:20...

Another thing that could be useful is the log, even though it can be quite
big after an hour and a half :-) If you don't have debug enabled, you can
enable it by adding the following lines.

log file /tmp/bgpd.debug
!
debug bgp
debug bgp events
debug bgp filters
debug bgp fsm
debug bgp keepalives
debug bgp updates

Regards,

Jose


On Wed 03 Mar 2004 06:05, Gibbs, Michael wrote:
> Has anyone had any issues with this patch applied to Quagga 0.96.4? I
> currently have this deployed in a test system with 4 neighbors. 3 are
> sending full routes (130k each) to the route server. They are all
> route-server-clients. I am then exporting those routes (all 130kx3) to the
> 4 route-server-member. After about an hour to an hour and a half, bgpd
> crashes without a useful error. Has anyone else run into this?
>
> Mike Gibbs
Re: Problem with revision of Route Server patch
Have I hit too much information yet? Ok, I have a truss output file
attached that shows the following when it crashed:

12192: 1935.5680 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 0) = 0
12192: 1935.5682 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 0) = 0
12192: 1935.5684 fcntl(13, F_SETFL, 0x00000002) = 0
12192: 1935.5685 time() = 1078357446
12192: 1935.5687 Incurred fault #5, FLTACCESS %pc = 0x0003E5D8
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41
12192: 1935.5690 Received signal #10, SIGBUS [default]
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41
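
The siginfo line in this trace already carries a hint: BUS_ADRALN is the SIGBUS
code for invalid address alignment, and the faulting address is odd. The
following is a small sketch that pulls the address out of such a line and
checks it against common word sizes. The helper names are invented for
illustration, and the regular expression assumes the truss line format shown
in the trace.

```python
import re
from typing import List, Optional

# Matches the siginfo line that truss prints for an alignment fault.
SIGINFO_RE = re.compile(
    r"siginfo:\s+SIGBUS\s+BUS_ADRALN\s+addr=(0x[0-9A-Fa-f]+)"
)


def fault_address(line: str) -> Optional[int]:
    """Extract the faulting address from a truss siginfo line, if present."""
    match = SIGINFO_RE.search(line)
    return int(match.group(1), 16) if match else None


def misaligned_for(addr: int, sizes=(2, 4, 8)) -> List[int]:
    """Return the access sizes (in bytes) for which addr is not aligned."""
    return [size for size in sizes if addr % size != 0]


line = "12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41"
addr = fault_address(line)
print(hex(addr), misaligned_for(addr))  # prints: 0x200c41 [2, 4, 8]
```

Every listed size is reported, since an odd address cannot sit on any
multi-byte boundary. That would be consistent with a multi-byte load or store
through a misaligned pointer, which strict-alignment CPUs (SPARC is one) do
not permit. The thread does not state the hardware, so treat this as a
pointer for further digging rather than a diagnosis.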

Ideas?

Mike

-----Original Message-----
From: Gibbs, Michael
Sent: Wed 3/3/2004 9:53 PM
To: Jose Luis Rubio; quagga-dev@lists.quagga.net
Cc:
Subject: Problem with revision of Route Server patch


Jose,

Looking into it further, this is not a hardware issue. The crash seems to show
up once we pass 130k routes; with only peer1 and peer2 online, everything works
without a hitch. The layout is as follows, and each peer is tagged as an
rs-client in the config:

peer1: no routes in/no routes out
peer2: 130k routes in/no routes out
peer3: 130k routes in/no routes out
peer4: 114k routes in/no routes out

Also, I do not get a core file when bgpd crashes. Is this normal, or have I
missed a configuration step to turn on core dumps?

Mike

Re: Problem with revision of Route Server patch
This looks like an address alignment issue. I see several references to such
a problem in the IPv6 code, but no patches suggesting that the IPv4 code in
Quagga or Zebra has this issue on Solaris. Is this a known bug?

Mike Gibbs

-----Original Message-----
From: Gibbs, Michael
Sent: Wed 3/3/2004 11:57 PM
To: Jose Luis Rubio; quagga-dev@lists.quagga.net
Cc:
Subject: RE: Problem with revision of Route Server patch


Have I hit too much information yet? Ok, I have a truss output file
attached that shows the following when it crashed:

12192: 1935.5680 getsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 0xFFBFF744, 0) = 0
12192: 1935.5682 setsockopt(13, SOL_SOCKET, 0x2000, 0xFFBFF748, 4, 0) = 0
12192: 1935.5684 fcntl(13, F_SETFL, 0x00000002) = 0
12192: 1935.5685 time() = 1078357446
12192: 1935.5687 Incurred fault #5, FLTACCESS %pc = 0x0003E5D8
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41
12192: 1935.5690 Received signal #10, SIGBUS [default]
12192: siginfo: SIGBUS BUS_ADRALN addr=0x00200C41

Ideas?

Mike

Re: Problem with revision of Route Server patch
On Thu, 4 Mar 2004, Gibbs, Michael wrote:

> This looks like an address alignment issue,

What makes you think that?

> in which I see several references to a problem based on ipv6 code, but
> no patches suggesting the ipv4 code in quagga or Zebra has this issue
> on Solaris. Is this a known bug?

Can you be more specific? What references?

Also, you are running on Solaris, right?

> Mike Gibbs

--paulj