Problem Description:
--------------------
Running many POSIX timers and threads causes a kernel crash.
The message is:
kernel BUG at
invalid operand: 0000 [#1]
SMP
LTT NESTING LEVEL : 0
Modules linked in: ipmi_watchdog ipmi_si ipmi_devintf ipmi_msghandler softdog
binfmt_misc video thermal processor fan button battery ac sctp uhci_hcd usbcore
ip6_tables ip_tables ipv6
CPU: 2
EIP: 0060:[
EFLAGS: 00010046 (2.6.14.7-selinux1-WR1.4aq_cgl)
EIP is at send_sigqueue+0xdf/0xf7
eax: 00000020 ebx: f6b0b368 ecx: f66115b0 edx: f6b0b368
esi: f66115b0 edi: 00000020 ebp: c04b0d9c esp: c04b0d88
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c04b0000 task=c317eaf0)
Stack: 00000000 00000092 f6602248 00000000 f6b0b3f8 c04b0dac c013fd9d f6602248
f6602250 c04b0dc8 c013fe19 00000000 00000282 f660227c c013fdd2 c30250a0
c04b0df8 c0143cd8 c30250a4 f6602248 6c9d81ba 000000eb 00000001 6c9d81ba
Call Trace:
[
[
[
[
[
[
[
[
[
[
[
[
- When a POSIX timer is deleted through sys_timer_delete(), sigqueue_free() can race with collect_signal().
- As a result, __sigqueue_free() can be called twice on the same "struct sigqueue", with obviously bad consequences.
- All threads in the same thread group share the same ->sighand, and thus the same ->sighand->siglock, so that lock can serialize the two paths.
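The double free described above can be illustrated with a small user-space replay. This is only a sketch: the toy_* names and the struct are hypothetical stand-ins, not kernel code, and the exact interleaving shown is an assumption consistent with the description (the entry is unlinked on one CPU, the unlocked list check on the other CPU then sees an empty list, and both paths end up releasing the entry).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sigqueue; not the kernel definition. */
struct toy_sigqueue {
    bool on_list;     /* models !list_empty(&q->list) */
    bool prealloc;    /* models a "preallocated by the timer" flag */
    int  free_count;  /* how many times the entry was released */
};

/* Models __sigqueue_free(): a preallocated entry is left alone. */
static void toy_release(struct toy_sigqueue *q)
{
    if (q->prealloc)
        return;
    q->free_count++;
}

/* Replays one unlucky interleaving step by step (no real threads)
 * and returns how many times the entry was released. */
static int toy_replay_race(void)
{
    struct toy_sigqueue q = { .on_list = true, .prealloc = true, .free_count = 0 };

    /* CPU A, collect_signal(): unlink the entry while holding siglock. */
    q.on_list = false;

    /* CPU B, sigqueue_free(): the unlocked check sees an empty list,
     * so nothing is unlinked; the flag is cleared and the entry released. */
    assert(q.prealloc);          /* assumed precondition of the timer path */
    if (q.on_list)
        q.on_list = false;
    q.prealloc = false;
    toy_release(&q);

    /* CPU A, collect_signal(): its release no longer sees the flag
     * and releases the same entry a second time. */
    toy_release(&q);

    return q.free_count;
}
```

Calling toy_replay_race() returns 2, which corresponds to the second __sigqueue_free() on the same entry.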
Comments on the Fix:
--------------------
- The fix takes ->sighand->siglock before checking list_empty(&q->list) in sigqueue_free(). collect_signal() is always called under ->sighand->siglock, and sigqueue_free() now takes the same lock, so the race is no longer possible.
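The locking rule above can be sketched in user space as follows. This is a model under stated assumptions, not the patch itself: a pthread mutex stands in for ->sighand->siglock, the model_* names and the struct are hypothetical, and the collecting side takes the lock itself here, whereas in the kernel the caller of collect_signal() already holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sigqueue; not the kernel definition. */
struct model_sigqueue {
    bool on_list;     /* models !list_empty(&q->list) */
    bool prealloc;    /* models a "preallocated by the timer" flag */
    int  free_count;  /* how many times the entry was released */
};

/* Stands in for ->sighand->siglock, shared by the whole thread group. */
static pthread_mutex_t model_siglock = PTHREAD_MUTEX_INITIALIZER;

/* Models __sigqueue_free(): preallocated entries are left alone.
 * The assert plays the role of a kernel BUG on a double release. */
static void model_release(struct model_sigqueue *q)
{
    if (q->prealloc)
        return;
    assert(q->free_count == 0);
    q->free_count++;
}

/* Models collect_signal(): unlink and release under the lock. */
static void model_collect_signal(struct model_sigqueue *q)
{
    pthread_mutex_lock(&model_siglock);
    if (q->on_list) {
        q->on_list = false;
        model_release(q);
    }
    pthread_mutex_unlock(&model_siglock);
}

/* Models sigqueue_free() with the fix: the list check and the unlink
 * happen under the same lock that the collecting side holds. */
static void model_sigqueue_free(struct model_sigqueue *q)
{
    pthread_mutex_lock(&model_siglock);
    if (q->on_list)
        q->on_list = false;
    pthread_mutex_unlock(&model_siglock);

    q->prealloc = false;
    model_release(q);
}
```

Whichever of the two functions wins the lock first, the entry is released exactly once: either the collecting side finishes while the flag is still set, or it no longer finds the entry queued.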
IDENTIFIER = WIND00096705PNELE14
Installation Instructions:
--------------------------
1. Copy the patch zip file to your
2. Unzip the patch file
3. Go to your
4. Run setup_linux and install the patch
5. This is a source patch, so you must rebuild the kernel for the fix to take effect.