Mailing List Archive

pxelinux doesn't answer ARP requests when it should
Hi there.

Regarding the known problem listed at
http://syslinux.zytor.com/pxe.php:

  "we should probably call the UDP receive function in the keyboard
  entry loop, so that we answer ARP requests."

Here is another illustration of that problem, a description of a
related pxelinux problem, and a workaround for both.

Configuration:
dhcp/tftp server -- 213.180.194.116 (00:1B:78:05:84:6C)
debian 4.0, kernel 2.6.18-4-686-bigmem

dhcp/tftp client (the one which boots) -- 213.180.194.120
(00:30:48:34:26:12)
supermicro x7dbr-8/x7dbr-i bios rev 1.3b
intel boot agent ge v.1.2.36

Both computers are attached to one switch.

tshark is run on a third computer attached to the same switch.
The switch is configured to copy all packets from the dhcp/tftp
server's port to the port of this third computer.

pxelinux tested: syslinux 3.51 (built with both HAVE_IDLE 0 and
HAVE_IDLE 1) and syslinux 3.31 (debian binary package 3.31-4)

Consider the following tcpdump session:

glavryba:~# tshark -i eth0 ether src 00:30:48:34:26:12 or ether dst 00:30:48:34:26:12
Capturing on eth0
0.000000 0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x4a342612
1.001464 213.180.194.116 -> 255.255.255.255 DHCP DHCP Offer - Transaction ID 0x4a342612
2.032308 0.0.0.0 -> 255.255.255.255 DHCP DHCP Request - Transaction ID 0x4a342612
2.033558 213.180.194.116 -> 255.255.255.255 DHCP DHCP ACK - Transaction ID 0x4a342612
2.034558 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120
2.034570 213.180.194.116 -> 213.180.194.120 ICMP Echo (ping) request
2.034575 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c
2.034933 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
2.035562 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
2.036066 213.180.194.120 -> 213.180.194.116 TFTP Error Code, Code: Not defined, Message: TFTP Aborted
2.036184 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
2.036808 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
2.037306 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
2.037430 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
2.042927 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
[skipped]
2.276153 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.cfg/default, Transfer type: octet
2.276657 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
2.277152 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
2.277280 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
2.277777 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
2.277903 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
2.278403 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
2.278529 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 3
2.279028 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 3
2.279154 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4 (last)
2.279652 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4
2.279658 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux-screens/welcome, Transfer type: octet
2.280280 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
2.280651 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
2.280904 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
2.281277 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
2.281527 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2 (last)
2.281900 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
7.033132 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
8.033121 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
9.033111 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
10.031477 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/k/ds386, Transfer type: octet
10.031984 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
10.032477 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
10.032604 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
10.033101 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
13.291531 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
19.882343 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
33.064094 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1

At mark 2.281900 the client has finished downloading pxelinux.0, the
default configuration and the welcome screen, has displayed the menu
and is waiting at the user prompt. About 5 seconds later (mark
7.033132) the server starts to check whether the client is alive, and
the client doesn't respond.

For several seconds the server keeps using the old, unverified MAC
address (marks 10.031984 and 10.032604). Finally the server decides
that the client is dead and stops sending data packets. At this point
the client assumes that its last acknowledgement (mark 10.033101) may
have been dropped and starts to resend it (marks 13.291531, 19.882343,
33.064094).
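
The timing of the probes can be reproduced with a very simplified
model. This is only a sketch, not the kernel's neighbour code: the
function name is made up for illustration, the 5 second delay is the
delay_first_probe_time value discussed at the end of this message, and
the count of 3 probes spaced 1 second apart is an assumption read off
the trace above (marks 7.03, 8.03, 9.03).

```python
def probe_timeline(last_confirmed, delay_first_probe=5.0, probes=3, retrans=1.0):
    """Return (probe_times, give_up_time) for a simplified probe model.

    last_confirmed    -- time at which the server last learned the client's MAC
    delay_first_probe -- seconds before the first unicast ARP probe
    probes, retrans   -- assumed number of probes and spacing between them
    """
    first = last_confirmed + delay_first_probe
    probe_times = [round(first + i * retrans, 6) for i in range(probes)]
    # One more interval after the last unanswered probe, the entry is dropped.
    give_up = round(first + probes * retrans, 6)
    return probe_times, give_up


if __name__ == "__main__":
    # The server learned the client's MAC at about 2.03 in the trace.
    times, give_up = probe_timeline(2.03)
    print("probes at:", times)
    print("server gives up at about:", give_up)
```

Under these assumptions the model puts the give-up point at about
10.03, which is where the server stops sending data in the trace.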

Consider another tcpdump session:

glavryba:~# tshark -r test | less
1 0.000000 0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x4a342612
2 1.000086 213.180.194.116 -> 255.255.255.255 DHCP DHCP Offer - Transaction ID 0x4a342612
3 2.032179 0.0.0.0 -> 255.255.255.255 DHCP DHCP Request - Transaction ID 0x4a342612
4 2.033178 213.180.194.116 -> 255.255.255.255 DHCP DHCP ACK - Transaction ID 0x4a342612
5 2.034178 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120
6 2.034188 213.180.194.116 -> 213.180.194.120 ICMP Echo (ping) request
7 2.034192 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c
8 2.034678 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
9 2.035303 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
10 2.035801 213.180.194.120 -> 213.180.194.116 TFTP Error Code, Code: Not defined, Message: TFTP Aborted
11 2.035927 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
[skipped]
99 2.279653 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux-screens/welcome, Transfer type: octet
100 2.280272 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
101 2.280772 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
102 2.280898 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
103 2.281396 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
104 2.281521 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2 (last)
105 2.282021 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
106 5.359440 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/k/ds386, Transfer type: octet
107 5.359940 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
108 5.360440 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
109 5.360566 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
110 5.361064 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
111 5.361189 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
112 5.361689 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
[skipped]
5207 7.030129 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2550
5208 7.030629 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2550
5209 7.031879 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
5210 7.032003 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2551
5211 7.032503 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2551
[skipped]
7646 8.030992 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1151
7647 8.031244 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1152
7648 8.031867 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
7649 8.037488 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1152
7650 8.037739 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1153
7651 8.038112 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1153
[skipped]
10702 9.031234 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2679
10703 9.031607 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2679
10704 9.031741 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
10705 9.031859 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2680
10706 9.032232 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2680
[skipped]
13759 10.030598 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4207
13760 10.031097 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4207
13761 10.031223 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4208
13762 10.031721 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208
13763 13.291526 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208
13764 19.882338 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208

At mark 105 (time 2.282021) the client has successfully downloaded
the welcome text, displayed it and is waiting at the user prompt. This
time I did not wait for the server to start checking the client with
ARP requests, but quickly typed the desired menu label, and the client
started to download the kernel (mark 106, time 5.359440). During the
TFTP exchange that follows, the server sends 3 ARP probes to check
that the client is alive: marks 5209, 7648, 10704.
Unfortunately the client does not respond to these ARP probes, so the
server decides that the client is dead and stops sending TFTP data
blocks (the last one is sent at mark 13761).
Again the client assumes that the server has not received its last
ACK (mark 13762) and starts to resend it (marks 13763, 13764).
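
Spotting the unanswered probes in a long capture by eye is tedious, so
here is a small sketch that does it on tshark's text output. It
assumes one packet per input line (wrapped lines must be joined first)
and the "Who has X? Tell Y" / "X is at MAC" wording shown in the traces
above. The function name and the SAMPLE list are made up for
illustration; SAMPLE reuses lines from the first trace.

```python
import re

# Patterns follow the ARP summary text seen in the traces.
WHO_HAS = re.compile(r"ARP Who has (\S+?)\?\s+Tell (\S+)")
IS_AT = re.compile(r"ARP (\S+) is at (\S+)")


def unanswered_arp(lines):
    """Map each queried IP to the number of requests still waiting for a reply."""
    pending = {}
    for line in lines:
        match = WHO_HAS.search(line)
        if match:
            target = match.group(1)
            pending[target] = pending.get(target, 0) + 1
            continue
        match = IS_AT.search(line)
        if match:
            # A reply for this IP clears every outstanding request for it.
            pending.pop(match.group(1), None)
    return pending


SAMPLE = [
    "2.034558 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120",
    "2.034575 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c",
    "7.033132 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
    "8.033121 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
    "9.033111 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
]

if __name__ == "__main__":
    for ip, count in unanswered_arp(SAMPLE).items():
        print(f"{ip}: {count} ARP request(s) never answered")
```

On these lines the server's own address is resolved, while the three
requests for the client's address stay unanswered.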

Conclusion:
1) pxelinux fails to respond to ARP requests not only
in the keyboard loop but also in the TFTP send/receive loop.
2) The Linux kernel does not treat incoming TFTP packets
as evidence that the host is alive.

Workaround: relax the server's aggressive ARP checking, which sends
the first ARP probe to a newly learned client after 5 seconds
(this seems to be the Linux 2.6 kernel default), by raising that
delay to 60 seconds:

echo 60 > /proc/sys/net/ipv4/neigh/eth0/delay_first_probe_time
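
If the setting has to be applied from a provisioning script, the same
write can be wrapped as below. This is only a sketch: the helper names
are made up, the interface name eth0 is taken from the command above,
and writing under /proc/sys needs root. The root argument exists only
so the helpers can be tried against a scratch directory.

```python
from pathlib import Path

# Default location of the per-interface IPv4 neighbour settings on Linux.
NEIGH_ROOT = "/proc/sys/net/ipv4/neigh"


def neigh_param_path(iface, name, root=NEIGH_ROOT):
    """Build the path of one neighbour-table parameter for an interface."""
    return Path(root) / iface / name


def get_neigh_param(iface, name, root=NEIGH_ROOT):
    """Read the current value as a stripped string."""
    return neigh_param_path(iface, name, root).read_text().strip()


def set_neigh_param(iface, name, value, root=NEIGH_ROOT):
    """Write a new value and return the previous one, so it can be restored."""
    old = get_neigh_param(iface, name, root)
    neigh_param_path(iface, name, root).write_text(f"{value}\n")
    return old


if __name__ == "__main__":
    path = neigh_param_path("eth0", "delay_first_probe_time")
    if path.exists():
        print("current delay_first_probe_time:",
              get_neigh_param("eth0", "delay_first_probe_time"))
    else:
        print("no such parameter file:", path)
```

Like the echo command, this changes only the running kernel; the value
is lost on reboot unless it is applied again at boot time.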


Regards,
Petya.

ps: I would like to thank all the CCed people who helped me
to carry out this little investigation.

_______________________________________________
SYSLINUX mailing list
Submissions to SYSLINUX@zytor.com
Unsubscribe or set options at:
http://www.zytor.com/mailman/listinfo/syslinux
Please do not send private replies to mailing list traffic.

Re: pxelinux doesn't answer ARP requests when it should
Petr Kohts wrote:
> Hi there.
>
> Regarding currently known problem as stated on
> http://syslinux.zytor.com/pxe.php:
> we should probably call the UDP receive function in the keyboard
> entry loop, so that we answer ARP requests.
>

... which has turned out to be impossible, because too many PXE stacks
are buggy and will wait for a packet when UDP READ is called, despite
the fact that the call is explicitly documented as nonblocking.

*Sigh.*

> Conclusion:
> 1) pxelinux does not respond to ARP packets not only
> in the keyboard loop but also in tftp send/receive loop
> 2) Linux kernel does not use incoming tftp packets
> as a last resort when checking that the host is alive.

Ironically, I first heard about this problem only the day before
yesterday. I'm not sure how to work around it, other than having
PXELINUX carry its own IP stack. THAT is already in the works,
however, since I've been working with the Etherboot team to come up with
an integrated gPXE-PXELINUX solution. When finalized, this will be a
single "pxelinux.0" image which bundles gPXE (which contains an
independent IP stack). It will also make it possible to get
files over HTTP or other TCP protocols.

We successfully demoed this at the Etherboot.org booth at LinuxWorld
this week, but it still needs additional work.

-hpa
