Oops in arp_rcv, patch
Jacek Konieczny
jajcus at bnet.pl
Wed Jul 4 16:46:54 CEST 2001
Hi,
One of my routers has rebooted a lot over the last few days. I couldn't find
the reason, as the oopses were not logged and during most of the crashes there
was no one at the console. But finally I got the oops on a serial console.
After decoding the oops and examining the kernel sources I found the problem:
it was the neigh_release() function which failed. Everywhere else in
the code its argument is protected against being NULL, but not in this
one place. Here is my patch:
===== cut ====
--- linux/net/ipv4/arp.c.orig Thu Jun 28 17:29:10 2001
+++ linux/net/ipv4/arp.c Tue Jul 3 19:37:25 2001
@@ -738,7 +738,7 @@
(addr_type == RTN_UNICAST && rt->u.dst.dev != dev &&
(IN_DEV_PROXY_ARP(in_dev) || pneigh_lookup(&arp_tbl, &tip, dev, 0)))) {
n = neigh_event_ns(&arp_tbl, sha, &sip, dev);
- neigh_release(n);
+ if (n) neigh_release(n);
if (skb->stamp.tv_sec == 0 ||
skb->pkt_type == PACKET_HOST ||
==============
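For readers who want to see the pattern outside the kernel, here is a minimal userspace sketch of the guarded release. It is not kernel code: struct entry, entry_lookup(), entry_release() and lookup_and_release() are hypothetical stand-ins, and the only assumption carried over from the patch is that the lookup can return NULL while the release helper dereferences its argument unconditionally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct neighbour; only the refcount matters. */
struct entry {
    int refcnt;
    int destroyed;
};

/* Like neigh_release(), this dereferences its argument unconditionally,
 * so passing NULL would fault on the first access. */
static void entry_release(struct entry *e)
{
    assert(e->refcnt > 0);
    if (--e->refcnt == 0)
        e->destroyed = 1;
}

/* Hypothetical lookup: takes a reference on success, returns NULL on failure. */
static struct entry *entry_lookup(struct entry *slot, int found)
{
    if (!found)
        return NULL;
    slot->refcnt++;
    return slot;
}

/* The guarded pattern from the patch: release only what the lookup returned.
 * Returns 1 if a reference was dropped, 0 if the lookup failed. */
static int lookup_and_release(struct entry *slot, int found)
{
    struct entry *e = entry_lookup(slot, found);

    if (e == NULL)
        return 0;
    entry_release(e);
    return 1;
}
```

Calling lookup_and_release() with a failing lookup simply returns 0 and leaves the reference count untouched; without the NULL check the same call would dereference a null pointer, which is what the oops below appears to show.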
The bug shows up when proxy-arp is configured. It seems the number of
entries in the ARP table matters too (on my host "ip neighb show|wc" gives
more than 1000), or it may be the number of ethernet ports (I have 10).
The buggy code seems unchanged in the 2.4.5 kernel.
Here is the decoded oops:
===============
ksymoops 2.4.1 on i686 2.2.19. Options used
-V (default)
-k /proc/ksyms (default)
-l /lib/modules/2.2.19-16/ (specified)
-o /lib/modules/2.2.19/ (default)
-m /boot/System.map (specified)
Error (expand_objects): cannot stat(/lib/ext2.o) for ext2
Error (expand_objects): cannot stat(/lib/ide-disk.o) for ide-disk
Error (expand_objects): cannot stat(/lib/ide-probe-mod.o) for ide-probe-mod
Error (expand_objects): cannot stat(/lib/ide-mod.o) for ide-mod
Error (regular_file): read_lsmod /lib/modules/2.2.19-16/ is not a regular file, ignored
Warning (map_ksym_to_module): cannot match loaded module ext2 to a unique module object. Trace may not be reliable.
Oops: 0002
CPU: 0
EIP: 0010:[<c016d089>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 00000000 ebx: 00000000 ecx: 5343e3d5 edx: 00000401
esi: c62a7430 edi: ca4d71f0 ebp: c62a7438 esp: c0211f04
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c0211000)
Stack: c0200494 c0210608 5343e3d5 c0211f20 c0211f24 cb8f0750 00017b80 5343e3d5
5143e3d5 c0148038 c7bf6640 ca4d71f0 c0200494 00000001 c023d8e4 0003c15b
c0211f60 c7bf6640 0003c15b c011a269 00000000 c0210000 c010b56a 00001000
Call Trace: [<c0148038>] [<c011a269>] [<c010b56a>] [<c010b230>] [<c01088dd>]
[<c0106000>] [<c010890
[<c010a008>] [<c0106000>] [<c0106077>] [<c0106000>] [<c0100175>]
Code: ff 4b 2c 0f 94 c0 84 c0 74 0f 83 7b 04 00 75 09 53 e8 a9 cd
>>EIP; c016d089 <arp_rcv+2c9/3d4> <=====
Trace; c0148038 <net_bh+1a0/200>
Trace; c011a269 <do_bottom_half+49/70>
Trace; c010b56a <do_IRQ+3a/3c>
Trace; c010b230 <common_interrupt+18/20>
Trace; c01088dd <cpu_idle+5d/6c>
Trace; c0106000 <get_options+0/70>
Trace; c010a008 <system_call+34/38>
Trace; c0106000 <get_options+0/70>
Trace; c0106077 <cpu_idle+7/18>
Trace; c0106000 <get_options+0/70>
Trace; c0100175 <L6+0/2>
Code; c016d089 <arp_rcv+2c9/3d4>
00000000 <_EIP>:
Code; c016d089 <arp_rcv+2c9/3d4> <=====
0: ff 4b 2c decl 0x2c(%ebx) <=====
Code; c016d08c <arp_rcv+2cc/3d4>
3: 0f 94 c0 sete %al
Code; c016d08f <arp_rcv+2cf/3d4>
6: 84 c0 test %al,%al
Code; c016d091 <arp_rcv+2d1/3d4>
8: 74 0f je 19 <_EIP+0x19> c016d0a2 <arp_rcv+2e2/3d4>
Code; c016d093 <arp_rcv+2d3/3d4>
a: 83 7b 04 00 cmpl $0x0,0x4(%ebx)
Code; c016d097 <arp_rcv+2d7/3d4>
e: 75 09 jne 19 <_EIP+0x19> c016d0a2 <arp_rcv+2e2/3d4>
Code; c016d099 <arp_rcv+2d9/3d4>
10: 53 push %ebx
Code; c016d09a <arp_rcv+2da/3d4>
11: e8 a9 cd 00 00 call cdbf <_EIP+0xcdbf> c0179e48 <unix_getname+28/7c>
Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing
1 warning and 5 errors issued. Results may not be reliable.
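
A note on reading the dump above: the register ebx is 00000000 and the faulting instruction is decl 0x2c(%ebx), so the access would be to address 0x2c, which is consistent with a NULL pointer being passed to neigh_release(). The "Oops: 0002" value is the i386 page-fault error code. The small sketch below decodes its three low bits as described in the kernel's oops-tracing documentation; the macro and function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Low bits of the i386 page-fault error code printed after "Oops:".
 * The names are illustrative, not taken from the kernel sources. */
#define PF_PROT  0x1u  /* 0 = page not present, 1 = protection fault */
#define PF_WRITE 0x2u  /* 0 = read access,      1 = write access     */
#define PF_USER  0x4u  /* 0 = kernel mode,      1 = user mode        */

static int oops_is_prot(unsigned code)  { return (code & PF_PROT) != 0; }
static int oops_is_write(unsigned code) { return (code & PF_WRITE) != 0; }
static int oops_is_user(unsigned code)  { return (code & PF_USER) != 0; }

/* Writes a one-line description into buf; returns snprintf()'s result. */
static int oops_describe(unsigned code, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s, %s, %s mode",
                    oops_is_prot(code) ? "protection fault" : "page not present",
                    oops_is_write(code) ? "write" : "read",
                    oops_is_user(code) ? "user" : "kernel");
}
```

For the code 0x0002 this yields "page not present, write, kernel mode", which fits a kernel-mode decrement through a null pointer.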