Mailing List Archive

RPD coring today?
Anyone else see their RPD start to core today? Seeing something weird, unclear if it’s local to my network or otherwise but two devices at the same time seem to be having trouble, so puzzling.

Running 20.4R3.8

- jared
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: RPD coring today?
On Sat, Sep 17, 2022 at 06:21:51PM -0400, Jared Mauch via juniper-nsp wrote:
> Anyone else see their RPD start to core today? Seeing something weird, unclear if it’s local to my network or otherwise but two devices at the same time seem to be having trouble, so puzzling.
>
> Running 20.4R3.8

What does this show:

show system core-dump core-file-info /path/to/corefile
Re: RPD coring today?
On Sun, Sep 18, 2022 at 07:08, Chuck Anderson via juniper-nsp
<juniper-nsp@puck.nether.net> wrote:
>
> On Sat, Sep 17, 2022 at 06:21:51PM -0400, Jared Mauch via juniper-nsp wrote:
> > Anyone else see their RPD start to core today? Seeing something weird, unclear if it’s local to my network or otherwise but two devices at the same time seem to be having trouble, so puzzling.
> >
> > Running 20.4R3.8
>
> What does this show:
>
> show system core-dump core-file-info /path/to/corefile

gdb was removed from Junos somewhere around 16, so this unfortunately
doesn't work anymore. It was quite handy.
Re: RPD coring today?
It's in the HRTimers code, which is very interesting. It seems to have started when I rolled back which IPs were primary on an IRB for my local DHCP pool. The interfaces have VRRP on them, and it wasn't happy at all.

Seems specific to 5100.

Very odd.

Sent via RFC1925 compliant device

> On Sep 18, 2022, at 1:08 AM, Chuck Anderson via juniper-nsp <juniper-nsp@puck.nether.net> wrote:
>
> On Sat, Sep 17, 2022 at 06:21:51PM -0400, Jared Mauch via juniper-nsp wrote:
>> Anyone else see their RPD start to core today? Seeing something weird, unclear if it’s local to my network or otherwise but two devices at the same time seem to be having trouble, so puzzling.
>>
>> Running 20.4R3.8
>
> What does this show:
>
> show system core-dump core-file-info /path/to/corefile
Re: RPD coring today?
> > What does this show:
> >
> > show system core-dump core-file-info /path/to/corefile
>
> gdb was removed from junos somewhere around 16. This unfortunately
> doesn't work anymore. It was quite handy.

While it's obviously not as convenient, one can prepare a VM with the
necessary shared libraries from Junos and use gdb there. For example, here
is the backtrace of an rpd core dump from Junos 17.3R3-S11.4, opened in gdb
11.2 for FreeBSD:

(gdb) bt
#0 0x00000000c9c24b7a in __sys_thr_kill () from /root/libs/lib/libc.so.7
#1 0x00000000c9c24a64 in raise () from /root/libs/lib/libc.so.7
#2 0x00000000c9c23690 in abort () from /root/libs/lib/libc.so.7
#3 0x00000000c9c06815 in __assert () from /root/libs/lib/libc.so.7
#4 0x000000000166549f in tag_unlock_tag_label_elm ()
#5 0x0000000001661592 in tag_gw_unlock_tag_label_elm ()
#6 0x000000000153024a in rt_nexthops_free ()
#7 0x0000000001544b1a in rt_change_parms ()
#8 0x000000000155243f in rt_change_ribgroup_import ()
#9 0x0000000000a3d01f in bgp_ribgroup_change_rt ()
#10 0x0000000000a051fe in bgp_sync_cb ()
#11 0x00000000015593bf in rt_nh_change_cb ()
#12 0x00000000015537f9 in rt_nh_change_immediate_cb ()
#13 0x0000000001555cc9 in rt_nh_resolve_change ()
#14 0x0000000000a041e8 in bgp_rt_cnh_resolve_change ()
#15 0x0000000000a0581f in bgp_sync_rt_change ()
#16 0x0000000000a56401 in bgp_rt_change ()
#17 0x0000000000a58c45 in bgp_rcv_nlri ()
#18 0x0000000000a5a8da in bgp_read_v4_update ()
#19 0x00000000009df4ec in bgp_handle_update ()
#20 0x0000000000a1d6c1 in bgp_read_resp_process_internal ()
#21 0x0000000000a1db18 in bgp_read_resp_process ()
#22 0x00000000016f001c in task_job_run_common ()
#23 0x00000000016f1480 in task_job_bg_dispatch ()
#24 0x0000000001708625 in task_scheduler_internal ()
#25 0x0000000001709261 in task_scheduler ()
#26 0x00000000007223ff in main ()
(gdb)

As seen above, the shared object libraries from Junos were placed under
/root/libs, and gdbinit was configured with "set sysroot /root/libs". With
the stack frame number and the function name, it's possible to dig even
deeper using disassemblers like radare2.
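
For anyone who wants to feed those frame numbers and addresses into another
tool, here is a rough Python sketch that parses "bt" output of the shape shown
above. It is only an illustration: the names Frame, parse_backtrace and
first_program_frame are made up for this example, and it assumes the
symbol-only frame format from the backtrace above (no source file or line
information).

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    """One line of gdb 'bt' output (hypothetical helper type)."""
    index: int
    address: Optional[int]
    function: str
    library: Optional[str]  # set when gdb prints "from <shared object>"


# Assumed frame format, taken from the backtrace in this thread:
#   #6 0x000000000153024a in rt_nexthops_free ()
#   #3 0x00000000c9c06815 in __assert () from /root/libs/lib/libc.so.7
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+"
    r"(?:(?P<address>0x[0-9a-fA-F]+)\s+in\s+)?"
    r"(?P<function>\S+)\s*\(.*?\)"
    r"(?:\s+from\s+(?P<library>\S+))?"
)


def parse_backtrace(text: str) -> List[Frame]:
    """Return the frames found in text, skipping prompts and other lines."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue
        address = match.group("address")
        frames.append(Frame(
            index=int(match.group("index")),
            address=int(address, 16) if address else None,
            function=match.group("function"),
            library=match.group("library"),
        ))
    return frames


def first_program_frame(frames: List[Frame]) -> Optional[Frame]:
    """Return the innermost frame that is not inside a shared library."""
    return next((f for f in frames if f.library is None), None)


# A few frames copied from the backtrace above.
SAMPLE_BT = """(gdb) bt
#3 0x00000000c9c06815 in __assert () from /root/libs/lib/libc.so.7
#4 0x000000000166549f in tag_unlock_tag_label_elm ()
#5 0x0000000001661592 in tag_gw_unlock_tag_label_elm ()
#6 0x000000000153024a in rt_nexthops_free ()
(gdb)
"""

frame = first_program_frame(parse_backtrace(SAMPLE_BT))
if frame is not None:
    print(frame.index, frame.function, hex(frame.address))
```

Skipping the libc frames (raise, abort, __assert) lands on the first function
inside rpd itself, which is usually the address worth opening in a
disassembler first.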

Perhaps it's useful for somebody in the future.


Martin