Wind River Support Network

Fixed

LIN6-12039 : BUG: sleeping function called from invalid context at kernel/rtmutex.c:797

Created: Dec 15, 2016    Updated: Dec 3, 2018
Resolved Date: Jun 19, 2017
Found In Version: 6.0.0.23
Fix Version: 6.0.0.34
Severity: Standard
Applicable for: Wind River Linux 6
Component/s: Kernel
Architecture: IA64

Description

While the non-boot CPU cores are being taken down via take_cpu_down() in order to enter the S3 sleep state, the above BUG is sometimes triggered. It may happen on any of the non-boot CPUs, but I have never seen it happen on more than one CPU during the same take-down sequence. Here is a kernel log extract including the full BUG message plus a little context before and after.

Nov 1 05:30:20 mariner kernel: [299112.963325] PM: suspend of devices complete after 1311.321 msecs 
Nov 1 05:30:20 mariner kernel: [299112.963458] turn_disk_power_off() -- port# = 1 
Nov 1 05:30:20 mariner kernel: [299112.963462] turn_disk_power_off() -- port# = 0 
Nov 1 05:30:20 mariner kernel: [299112.963639] PM: late suspend of devices complete after 0.310 msecs 
Nov 1 05:30:20 mariner kernel: [299112.974962] dwc3-pci 0000:00:16.0: power state changed by ACPI to D3hot 
Nov 1 05:30:20 mariner kernel: [299112.975038] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI 
Nov 1 05:30:20 mariner kernel: [299112.996911] PM: noirq suspend of devices complete after 33.284 msecs 
Nov 1 05:30:20 mariner kernel: [299112.996957] ACPI: Preparing to enter system sleep state S3 
Nov 1 05:30:20 mariner kernel: [299112.997559] PM: Saving platform NVS memory 
Nov 1 05:30:20 mariner kernel: [299113.002254] Disabling non-boot CPUs ... 
Nov 1 05:30:20 mariner kernel: [299113.004011] smpboot: CPU 1 is now offline 
Nov 1 05:30:20 mariner kernel: [299113.006270] BUG: sleeping function called from invalid context at kernel/rtmutex.c:797 
Nov 1 05:30:20 mariner kernel: [299113.006272] in_atomic(): 1, irqs_disabled(): 1, pid: 37, name: migration/2 
Nov 1 05:30:20 mariner kernel: [299113.006281] Preemption disabled at:[<ffffffff8106a9ac>] smpboot_thread_fn+0x1ec/0x360 
Nov 1 05:30:20 mariner kernel: [299113.006282] 
Nov 1 05:30:20 mariner kernel: [299113.006285] CPU: 2 PID: 37 Comm: migration/2 Tainted: P O 3.10.62 #1 
Nov 1 05:30:20 mariner kernel: [299113.006287] Hardware name: Insyde explorer60 X64/6, BIOS explorer60.13.13 10/24/2016 
Nov 1 05:30:20 mariner kernel: [299113.006292] ffffea0005c61a80 ffff88017a22ba90 ffffffff8163d782 ffff88017a22baa8 
Nov 1 05:30:20 mariner kernel: [299113.006295] ffffffff8106e5ff ffff88017fd0d280 ffff88017a22bac0 ffffffff816420a0 
Nov 1 05:30:20 mariner kernel: [299113.006299] ffff88017fd0d280 ffff88017a22bb18 ffffffff81103420 ffff88017a000c00 
Nov 1 05:30:20 mariner kernel: [299113.006302] ffff88017a22bfd8 000000000000da60 ffff88014cb5db00 ffffea0005c61a80 
Nov 1 05:30:20 mariner kernel: [299113.006305] 0000000000000001 0000000000000002 00000000fffffffe ffff88017937e0c0 
Nov 1 05:30:20 mariner kernel: [299113.006308] ffff88017a22bb38 ffffffff81104bb4 ffffea0005c61a80 0000000000000001 
Nov 1 05:30:20 mariner kernel: [299113.006312] ffff88017a22bb48 ffffffff81104e3e ffff88017a22bb90 ffffffff8113785a 
Nov 1 05:30:20 mariner kernel: [299113.006315] 000000010010000e 0000000000000000 ffff88017a22bbf0 ffffea0001dc6d00 
Nov 1 05:30:20 mariner kernel: [299113.006318] 000000000000da60 0000000000000097 ffff88017a001900 ffff88017a22bba8 
Nov 1 05:30:20 mariner kernel: [299113.006321] ffffffff81137996 ffff88017a22bbf0 ffff88017a22bc68 ffffffff8163b038 
Nov 1 05:30:20 mariner kernel: [299113.006324] ffff88017a22bfd8 ffff88017a22bfd8 00000000000000ff 0000000000000006 
Nov 1 05:30:20 mariner kernel: [299113.006328] 0000000000000006 ffff88017fd0da60 ffffffff816424ae ffff88017a22bbf0 
Nov 1 05:30:20 mariner kernel: [299113.006331] ffff88017a22bbf0 ffffffff81027e5a 00000000ffffffea ffff880176032240 
Nov 1 05:30:20 mariner kernel: [299113.006334] 000000018010000f ffff880176032200 ffff88017a22bc68 ffffffff810287c0 
Nov 1 05:30:20 mariner kernel: [299113.006337] ffff88017a22bfd8 ffff8800771b4400 ffff88017fd14950 ffff88017a22bfd8 
Nov 1 05:30:20 mariner kernel: [299113.006341] ffff88017a001900 ffffea0001dc6d00 ffff88017a22bcb0 ffffffff811396a1 
Nov 1 05:30:20 mariner kernel: [299113.006344] ffffffff810172ad 000000000cd4c70a ffff88017fd09a60 0000000000000002 
Nov 1 05:30:20 mariner kernel: [299113.006347] 0000000000000018 0000000000000002 0000000000000000 ffff88017a22bcd0 
Nov 1 05:30:20 mariner kernel: [299113.006350] ffffffff810172ad 00000000fffffffa ffffffff81c2f050 ffff88017a22bce0 
Nov 1 05:30:20 mariner kernel: [299113.006353] ffffffff81631ea9 ffff88017a22bd18 ffffffff810681fe ffff8801722e5d70 
Nov 1 05:30:20 mariner kernel: [299113.006356] 0000000000000003 ffff88017a22bf01 0000000000000202 ffffffff810ab470 
Nov 1 05:30:20 mariner kernel: [299113.006360] ffff88017a22bd28 ffffffff810682be ffff88017a22bd38 ffffffff8103e593 
Nov 1 05:30:20 mariner kernel: [299113.006363] ffff88017a22bd50 ffffffff8162c017 ffff8801722e5ce0 ffff88017a22bd80 
Nov 1 05:30:20 mariner kernel: [299113.006366] ffffffff810ab50e ffff88017fd0ba60 ffff8801722e5c90 ffff88017a22bfd8 
Nov 1 05:30:20 mariner kernel: [299113.006369] ffff8801722e5ce0 ffff88017a22be48 ffffffff810ab891 ffff88017a22bfd8 
Nov 1 05:30:20 mariner kernel: [299113.006373] ffff88017fd0ba80 ffffffff810712bd 0000000000000001 ffff88017a22bed8 
Nov 1 05:30:20 mariner kernel: [299113.006376] 0000000300000001 ffff88017a22bed8 0000000000000286 ffff88017a1a36b0 
Nov 1 05:30:20 mariner kernel: [299113.006379] ffff88017a1a36b0 ffff88017a22bed0 0000000000000002 ffff88017fd0ba60 
Nov 1 05:30:20 mariner kernel: [299113.006382] 0000000000000286 ffffffff81c371a0 0000000000000286 ffff88017a22be28 
Nov 1 05:30:20 mariner kernel: [299113.006385] ffffffff816424ae ffff88017a1a36b0 ffff88017a002460 ffffffff81c371a0 
Nov 1 05:30:20 mariner kernel: [299113.006388] 0000000000000002 ffff88017a22bfd8 ffff88017a22bea8 ffffffff8106a9ac 
Nov 1 05:30:20 mariner kernel: [299113.006392] ffff88017a22bfd8 ffff88017a1a36b0 0000000000000000 ffffffff81640eb0 
Nov 1 05:30:20 mariner kernel: [299113.006395] 0000000000000001 ffff88017a121cd8 ffff88017a002460 ffffffff8106a7c0 
Nov 1 05:30:20 mariner kernel: [299113.006398] 0000000000000000 0000000000000000 ffff88017a22bf48 ffffffff81062156 
Nov 1 05:30:20 mariner kernel: [299113.006401] 0000000000000001 ffff880100000002 ffff88017a002460 000000000000024c 
Nov 1 05:30:20 mariner kernel: [299113.006404] dead4ead00004f4f ffff8801ffffffff ffffffffffffffff ffff88017a22bef0 
Nov 1 05:30:20 mariner kernel: [299113.006408] ffff88017a22bef0 ffffffff00000000 dead4ead7fc10000 ffff8801ffffffff 
Nov 1 05:30:20 mariner kernel: [299113.006411] ffffffffffffffff ffff88017a22bf20 ffff88017a22bf20 ffffffff81062090 
Nov 1 05:30:20 mariner kernel: [299113.006414] 0000000000000000 0000000000000000 ffff88017a121cd8 ffffffff816432b8 
Nov 1 05:30:20 mariner kernel: [299113.006417] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
Nov 1 05:30:20 mariner kernel: [299113.006420] ffff88017a121cd8 ffffffff81062090 0000000000000000 0000000000000000 
Nov 1 05:30:20 mariner kernel: [299113.006423] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
Nov 1 05:30:20 mariner kernel: [299113.006426] 0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff 
Nov 1 05:30:20 mariner kernel: [299113.006429] 0000000000000000 0000000000000010 0000000000000202 ffff88017a22bf58 
Nov 1 05:30:20 mariner kernel: [299113.006430] 0000000000000018 
Nov 1 05:30:20 mariner kernel: [299113.006431] Call Trace: 
Nov 1 05:30:20 mariner kernel: [299113.006438] [<ffffffff8163d782>] dump_stack+0x19/0x1b 
Nov 1 05:30:20 mariner kernel: [299113.006441] [<ffffffff8106e5ff>] __might_sleep+0xef/0x170 
Nov 1 05:30:20 mariner kernel: [299113.006445] [<ffffffff816420a0>] rt_spin_lock+0x20/0x50 
Nov 1 05:30:20 mariner kernel: [299113.006449] [<ffffffff81103420>] __free_pages_ok.part.66+0x60/0x4e0 
Nov 1 05:30:20 mariner kernel: [299113.006452] [<ffffffff81104bb4>] __free_pages+0x54/0x60 
Nov 1 05:30:20 mariner kernel: [299113.006455] [<ffffffff81104e3e>] __free_memcg_kmem_pages+0xe/0x10 
Nov 1 05:30:20 mariner kernel: [299113.006459] [<ffffffff8113785a>] __free_slab+0xba/0x1a0 
Nov 1 05:30:20 mariner kernel: [299113.006463] [<ffffffff81137996>] free_delayed+0x56/0x70 
Nov 1 05:30:20 mariner kernel: [299113.006466] [<ffffffff8163b038>] __slab_free+0x39d/0x554 
Nov 1 05:30:20 mariner kernel: [299113.006469] [<ffffffff816424ae>] ? _raw_spin_unlock_irqrestore+0x1e/0x50 
Nov 1 05:30:20 mariner kernel: [299113.006473] [<ffffffff81027e5a>] ? assign_irq_vector+0x4a/0x60 
Nov 1 05:30:20 mariner kernel: [299113.006477] [<ffffffff810287c0>] ? __ioapic_set_affinity+0x80/0xd0 
Nov 1 05:30:20 mariner kernel: [299113.006480] [<ffffffff811396a1>] kfree+0x1f1/0x210 
Nov 1 05:30:20 mariner kernel: [299113.006484] [<ffffffff810172ad>] ? intel_pmu_cpu_dying+0x6d/0x80 
Nov 1 05:30:20 mariner kernel: [299113.006487] [<ffffffff810172ad>] intel_pmu_cpu_dying+0x6d/0x80 
Nov 1 05:30:20 mariner kernel: [299113.006491] [<ffffffff81631ea9>] x86_pmu_notifier+0xbb/0xc9 
Nov 1 05:30:20 mariner kernel: [299113.006494] [<ffffffff810681fe>] notifier_call_chain+0x4e/0x70 
Nov 1 05:30:20 mariner kernel: [299113.006498] [<ffffffff810ab470>] ? cpu_stop_create+0x30/0x30 
Nov 1 05:30:20 mariner kernel: [299113.006501] [<ffffffff810682be>] __raw_notifier_call_chain+0xe/0x10 
Nov 1 05:30:20 mariner kernel: [299113.006505] [<ffffffff8103e593>] cpu_notify+0x23/0x40 
Nov 1 05:30:20 mariner kernel: [299113.006508] [<ffffffff8162c017>] take_cpu_down+0x27/0x40 
Nov 1 05:30:20 mariner kernel: [299113.006511] [<ffffffff810ab50e>] stop_machine_cpu_stop+0x9e/0xc0 
Nov 1 05:30:20 mariner kernel: [299113.006514] [<ffffffff810ab891>] cpu_stopper_thread+0xb1/0x190 
Nov 1 05:30:20 mariner kernel: [299113.006518] [<ffffffff810712bd>] ? get_parent_ip+0xd/0x50 
Nov 1 05:30:20 mariner kernel: [299113.006522] [<ffffffff816424ae>] ? _raw_spin_unlock_irqrestore+0x1e/0x50 
Nov 1 05:30:20 mariner kernel: [299113.006526] [<ffffffff8106a9ac>] smpboot_thread_fn+0x1ec/0x360 
Nov 1 05:30:20 mariner kernel: [299113.006529] [<ffffffff81640eb0>] ? schedule+0x30/0x90 
Nov 1 05:30:20 mariner kernel: [299113.006532] [<ffffffff8106a7c0>] ? lg_local_unlock+0x30/0x30 
Nov 1 05:30:20 mariner kernel: [299113.006535] [<ffffffff81062156>] kthread+0xc6/0xd0 
Nov 1 05:30:20 mariner kernel: [299113.006539] [<ffffffff81062090>] ? kthread_worker_fn+0x1b0/0x1b0 
Nov 1 05:30:20 mariner kernel: [299113.006543] [<ffffffff816432b8>] ret_from_fork+0x58/0x90 
Nov 1 05:30:20 mariner kernel: [299113.006546] [<ffffffff81062090>] ? kthread_worker_fn+0x1b0/0x1b0 
Nov 1 05:30:20 mariner kernel: [299113.006594] smpboot: CPU 2 is now offline 
Nov 1 05:30:20 mariner kernel: [299113.008750] smpboot: CPU 3 is now offline 
Nov 1 05:30:20 mariner kernel: [299113.009609] ACPI -- Sleep NOW 
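
The trace above shows kfree(), called from intel_pmu_cpu_dying(), reaching rt_spin_lock() while in_atomic() and irqs_disabled() are both 1. To read the reliable frames without the raw stack words, something like the following Python sketch may help. It assumes the log format shown above; the name parse_frames and the SAMPLE excerpt are illustrative only and are not part of any Wind River tooling. Frames marked with "?" are addresses found on the stack that the unwinder could not confirm, so they are skipped by default.

```python
import re

# One frame looks like "[<ffffffff811396a1>] kfree+0x1f1/0x210".
# A "?" before the symbol marks a frame the unwinder could not confirm.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

# Short excerpt of the log above, used only to demonstrate the parser.
SAMPLE = """\
kernel: [299113.006281] Preemption disabled at:[<ffffffff8106a9ac>] smpboot_thread_fn+0x1ec/0x360
kernel: [299113.006431] Call Trace:
kernel: [299113.006445] [<ffffffff816420a0>] rt_spin_lock+0x20/0x50
kernel: [299113.006449] [<ffffffff81103420>] __free_pages_ok.part.66+0x60/0x4e0
kernel: [299113.006469] [<ffffffff816424ae>] ? _raw_spin_unlock_irqrestore+0x1e/0x50
kernel: [299113.006480] [<ffffffff811396a1>] kfree+0x1f1/0x210
kernel: [299113.006487] [<ffffffff810172ad>] intel_pmu_cpu_dying+0x6d/0x80
"""


def parse_frames(log_text, include_unreliable=False):
    """Return (symbol, offset) pairs for frames listed after 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue  # ignore e.g. the "Preemption disabled at:" line
        match = FRAME_RE.search(line)
        if match is None:
            continue
        if match.group("unreliable") and not include_unreliable:
            continue
        frames.append((match.group("symbol"), int(match.group("offset"), 16)))
    return frames


if __name__ == "__main__":
    for symbol, offset in parse_frames(SAMPLE):
        print(f"{symbol}+{offset:#x}")
```

Each offset is a return address within the named function, so the call instruction of interest sits immediately before it.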

I have searched the current list of Defects for Linux 6 (https://knowledge.windriver.com/en-us/000_Products/000/010/010/000_Defects_for_Linux_6) but have not found this issue.

Our kernel is based on WR 6.0.0.23 with our own modifications, and built with PREEMPT_RT enabled.

Steps to Reproduce

Duplicating the problem with a non-customised WR kernel configured and running on a well-known supported reference board would take a lot of work, and I don't think it is necessary. All that's needed is to investigate the information given. A WR person who is familiar with the PREEMPT_RT patch could look at the execution path through our source code (already provided to WR), from the point where pre-emption was disabled to where we ended up in a function that “might sleep”, and spot the mistake or mis-implementation. To help resolve this issue, I have just uploaded a document that maps the execution path, plus a fully symbolic disassembly listing of a slightly different kernel, which helps greatly in tracing the execution path.

This file maps the execution path in the source code (also uploaded) from just before the point where pre-emption was disabled up to the BUG.

Compressed fully symbolic kernel disassembly listing of an unstripped rebuild of the kernel, where the absolute addresses of functions are different, but the source file names and line numbers are correct, as are the relative address offsets within each function.
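
Because only the per-function offsets carry over to the rebuilt kernel, a frame such as kfree+0x1f1/0x210 has to be re-based onto the rebuilt kernel's symbol addresses before it can be looked up in the listing. A minimal Python sketch of that arithmetic follows. It assumes an nm-style symbol table ("address type name" per line); the addresses in REBUILT_NM are invented for illustration and are not taken from the uploaded listing, and load_symbols and relocate are hypothetical helper names.

```python
# Hypothetical nm-style symbol table for the rebuilt kernel. These addresses
# are made up for the example and do NOT come from the real disassembly.
REBUILT_NM = """\
ffffffff81017300 T intel_pmu_cpu_dying
ffffffff81139a00 T kfree
"""


def load_symbols(nm_text):
    """Map symbol name -> start address from 'address type name' lines."""
    symbols = {}
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            symbols[name] = int(addr, 16)
    return symbols


def relocate(frame, symbols):
    """Translate 'symbol+0xoff/0xsize' into an address in the rebuilt kernel."""
    location = frame.split("/")[0]  # drop the trailing function size
    name, offset = location.split("+")
    return symbols[name] + int(offset, 16)


if __name__ == "__main__":
    table = load_symbols(REBUILT_NM)
    for frame in ("kfree+0x1f1/0x210", "intel_pmu_cpu_dying+0x6d/0x80"):
        print(f"{frame} -> {relocate(frame, table):#x}")
```

The resulting address is the return address in the rebuilt image, so the matching call instruction is the one just before it in the disassembly.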

'We don’t use the WR “project” environment.  At some point in history we created a project for the WR 6 source, and copied the kernel into our own source tree, customized it and merged in WR patches up to RCPL 23.  We use our own build scripts.

 I’ve uploaded the exact bzImage that was booted, and a disassembly of it.  I hope this will give you the information you need.'
