httpd.worker - kernel killer?

Jakub Bogusz qboosh at pld-linux.org
Sat, 14 Jul 2007, 13:59:35 CEST


On Sat, Jul 14, 2007 at 09:52:43AM +0200, Adam Gołębiowski wrote:
> Hi,
> 
> Our CVS suddenly stopped working yesterday. When I used 'xm cons' to get
> a console I saw the following:
> 
> 
> tty1 ep09-cvs login: Unable to handle kernel paging request at ffff880013318604 RIP:
> <ffffffff802056ac>{gr_lookup_task_ip_table+60}
[...]
> RIP: e030:[<ffffffff802056ac>] <ffffffff802056ac>{gr_lookup_task_ip_table+60}
> RSP: e02b:ffff88001d68fea0  EFLAGS: 00010286
> RAX: ffff880013318380 RBX: ffff880015567940 RCX: 0000000000007fed
> RDX: ffff880008829fa0 RSI: 00000000101f49d9 RDI: 00000000bb5f00c1
> RBP: ffff880001a623c0 R08: 0000000000000e9e R09: 0000000000005000
[...]
> Call Trace: <ffffffff80205812>{gr_attach_curr_ip+82}
>        <ffffffff8025bbc6>{sys_accept+326} <ffffffff8019f2a6>{do_sys_poll+454}
>        <ffffffff8019e480>{__pollwait+0} <ffffffff8010b08a>{system_call+134}
>        <ffffffff8010b004>{system_call+0}
> 
> Code: 39 b8 84 02 00 00 75 ea 39 b0 88 02 00 00 75 e2 66 44 39 80
> RIP <ffffffff802056ac>{gr_lookup_task_ip_table+60} RSP <ffff88001d68fea0>

Uh, what a bug. This function dereferences signal structs that have
already been freed while it walks grsec's connection table.
It affects both full and minimal grsec for Linux 2.6.16 through 2.6.18.
In 2.6.19 the code changed, so gr_del_task_from_ip_table is back in the
proper place.
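
To illustrate the ordering problem, here is a minimal user-space sketch,
not the grsec code itself. All names in it (sig_model, conn_entry,
table_add, table_del, table_lookup, table_len, thread_exit) are
hypothetical stand-ins. It assumes that each table entry holds a pointer
to a per-process signal struct and that the lookup dereferences that
pointer for every entry, so the entry has to be unlinked at the point
where the struct is freed, not earlier and not never.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's per-process signal struct. */
struct sig_model {
    int count;           /* threads still sharing this struct */
    unsigned int saddr;  /* address recorded for the process */
};

/* Hypothetical stand-in for one connection table entry. */
struct conn_entry {
    struct sig_model *sig;
    struct conn_entry *next;
};

static struct conn_entry *table_head;

static struct sig_model *sig_new(int threads, unsigned int saddr)
{
    struct sig_model *sig = malloc(sizeof(*sig));
    if (!sig)
        abort();
    sig->count = threads;
    sig->saddr = saddr;
    return sig;
}

static void table_add(struct sig_model *sig)
{
    struct conn_entry *e = malloc(sizeof(*e));
    if (!e)
        abort();
    e->sig = sig;
    e->next = table_head;
    table_head = e;
}

/* Unlink and free the entry that points at sig, if there is one. */
static void table_del(struct sig_model *sig)
{
    struct conn_entry **pp = &table_head;
    while (*pp) {
        if ((*pp)->sig == sig) {
            struct conn_entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Dereferences e->sig for every entry, so no entry may outlive its sig. */
static int table_lookup(unsigned int saddr)
{
    for (struct conn_entry *e = table_head; e; e = e->next)
        if (e->sig->saddr == saddr)
            return 1;
    return 0;
}

static size_t table_len(void)
{
    size_t n = 0;
    for (struct conn_entry *e = table_head; e; e = e->next)
        n++;
    return n;
}

/* Thread exit in the intended order: only the last thread unlinks and frees. */
static void thread_exit(struct sig_model *sig)
{
    if (--sig->count == 0) {
        table_del(sig);  /* unlink first ... */
        free(sig);       /* ... then free */
    }
    /* Otherwise other threads still share sig and the entry stays valid. */
}
```

If table_del() were called only in the branch where other threads
remain, the last exiting thread would free sig while its entry is still
linked, and the next table_lookup() would read freed memory, which is
the kind of fault shown in the oops above.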

[...]
> This is a domU running on a smp kernel (built from LINUX_2_6_16) on
> amd64, however I had the same error (also caused by httpd.worker) on a
> i686 smp. 

It could be hit by any process using AF_INET sockets (BTW, grsec doesn't
assign IPv6 addresses to processes :P).

> Any hints on how this could be debugged?

Maybe from a kernel core dump (unsupported in vanilla or PLD kernels).
Or from source code analysis, as in this case.


Anyway, the fix is attached and has been added to CVS on the proper
branches (as SOURCES/linux-2.6-grsec-wrong-deref.patch).
Please apply it on the LINUX_2_6_{16,17,18} branches for both grsec kinds.
(Not actually tested, but it should be right.)


-- 
Jakub Bogusz    http://qboosh.pl/
-------------- next part --------------
Fixes dereference of already freed signal structs on conn_table_entry traversal.
(Removal of the "tsk == sig->curr_target" comparison in the case of a
 1-element process group caused the gr_del_task_from_ip_table(tsk) hunk
 to be applied in the wrong place, where struct signal is still kept,
 not where it is freed.)
--- linux-2.6.16/kernel/signal.c.orig	2007-07-14 12:16:07.661313000 +0200
+++ linux-2.6.16/kernel/signal.c	2007-07-14 13:40:35.919325560 +0200
@@ -367,6 +367,7 @@
 	posix_cpu_timers_exit(tsk);
 	if (atomic_dec_and_test(&sig->count)) {
 		posix_cpu_timers_exit_group(tsk);
+		gr_del_task_from_ip_table(tsk);
 		tsk->signal = NULL;
 		__exit_sighand(tsk);
 		spin_unlock(&sighand->siglock);
@@ -382,7 +383,6 @@
 		}
 		if (tsk == sig->curr_target)
 			sig->curr_target = next_thread(tsk);
-		gr_del_task_from_ip_table(tsk);
 		tsk->signal = NULL;
 		/*
 		 * Accumulate here the counters for all threads but the

