Wind River Support Network

Fixed

LIN1021-4449 : intel-socfpga-64: stratix10-svc driver deadlock/hang

Created: Sep 27, 2022    Updated: Dec 20, 2022
Resolved Date: Nov 5, 2022
Found In Version: 10.21.20.13
Fix Version: 10.21.20.15
Severity: Severe
Applicable for: Wind River Linux LTS 21
Component/s: BSP

Description

BSP: intel-socfpga-64

There is a rarely occurring hang/deadlock in the stratix10-svc driver, which we have seen mainly with the RSU driver when writing to the sysfs file "reboot_image".

Please see the attached patch for description/analysis.

Comparing Wind River Linux with the Intel upstream kernel, the problematic code appears to be part of a Wind River specific optimization:

"
commit cade82b63374564ecb9e33054f961620ee707a0b

driver: firmware: stratix10-svc: schedule thread out when there is no data received
"

Workaround

As a workaround, make the kthread poll for the stop status once a
second instead of going into an indefinite sleep.
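
The idea behind the workaround can be illustrated with a small userspace sketch. This is not the stratix10-svc code and not the fix shipped in 10.21.20.15; it only models a worker that checks a stop flag and then sleeps, with the wake-up deliberately never delivered. The names Worker, stop_requested, about_to_sleep and stops_after_missed_wakeup are hypothetical, and a Python thread stands in for the kthread.

```python
import threading


class Worker:
    """Userspace stand-in for a kernel thread; not the stratix10-svc code."""

    def __init__(self, poll_interval):
        # poll_interval=None models the indefinite sleep,
        # a number of seconds models the polling workaround.
        self.poll_interval = poll_interval
        self.stop_requested = False              # analogue of a "should stop" flag
        self.about_to_sleep = threading.Event()  # test hook marking the race window
        self._cond = threading.Condition()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self.stop_requested:
            # Race window: a stop request arriving after the check above but
            # before the wait below is not noticed until the next wake-up.
            self.about_to_sleep.set()
            with self._cond:
                self._cond.wait(timeout=self.poll_interval)

    def stop(self, join_timeout):
        """Request a stop WITHOUT notifying, to model the missed wake-up."""
        self.stop_requested = True
        self._thread.join(timeout=join_timeout)
        return not self._thread.is_alive()


def stops_after_missed_wakeup(poll_interval, join_timeout):
    """Return True if the worker exits although its wake-up was missed."""
    worker = Worker(poll_interval)
    worker.start()
    worker.about_to_sleep.wait()  # worker is now past its stop check
    return worker.stop(join_timeout)


if __name__ == "__main__":
    print("indefinite sleep:", stops_after_missed_wakeup(None, 0.5))
    print("bounded sleep   :", stops_after_missed_wakeup(0.1, 2.0))
```

With an indefinite sleep the stop request is never observed and the caller stays blocked for the full join timeout, which mirrors the client stuck in kthread_stop(). With a bounded sleep the worker re-checks the flag on the next timeout, so the stop is delayed by at most roughly one poll interval.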

Steps to Reproduce

The race window is quite narrow, so in normal use the hang is difficult
to reproduce. The following artificial method was used to trigger a hang
with the stratix10-rsu driver and a write to "reboot_image":

	- Create 100% background CPU load (e.g. "while :; do true; done &"
	  multiple times).

	- Insert a busy-looping mdelay(1000) into the kernel thread just
	  before schedule_timeout_interruptible(). This does not change the
	  program logic, only the timing.

	- Now write to "reboot_image"; the write should hang immediately.

	- In the stack traces, the client process is shown as stuck in
	  kthread_stop() and the kthread remains sleeping and scheduled out,
	  as predicted:

	# cat /proc/493/stack
	[<0>] __switch_to+0xe0/0x15c
	[<0>] kthread_stop+0x9c/0x270
	[<0>] stratix10_svc_done+0x58/0xd0
	[<0>] rsu_send_msg+0xa0/0x120
	[<0>] reboot_image_store+0x9c/0xe0
	[<0>] dev_attr_store+0x24/0x40
	[<0>] sysfs_kf_write+0x50/0x60
	[<0>] kernfs_fop_write_iter+0x124/0x1b4
	[<0>] new_sync_write+0xf0/0x190
	[<0>] vfs_write+0x21c/0x280
	[<0>] ksys_write+0x74/0x100
	[<0>] __arm64_sys_write+0x28/0x3c
	[<0>] el0_svc_common.constprop.0+0x9c/0x210
	[<0>] do_el0_svc+0x78/0xa0
	[<0>] el0_svc+0x20/0x30
	[<0>] el0_sync_handler+0x1a4/0x1b0
	[<0>] el0_sync+0x180/0x1c0

	# cat /proc/494/stack
	[<0>] __switch_to+0xe0/0x15c
	[<0>] svc_normal_to_secure_thread+0x5d8/0x1430
	[<0>] kthread+0x150/0x160
	[<0>] ret_from_fork+0x10/0x3c
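
When checking a suspected occurrence, the two traces above can be compared against the reported signature with a short script. The following is a sketch only: the helper names frame_symbols, matches_reported_hang and read_stack are hypothetical, the regular expression assumes the "[<0>] symbol+0xoff/0xsize" line format shown above, and reading /proc/<pid>/stack normally requires root.

```python
import re

# Matches lines such as "[<0>] kthread_stop+0x9c/0x270" and captures the symbol.
FRAME_RE = re.compile(
    r"^\s*\[<[0-9a-fA-Fx]+>\]\s+([\w.]+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
)


def frame_symbols(stack_text):
    """Return the symbol names found in a /proc/<pid>/stack style dump."""
    symbols = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            symbols.append(match.group(1))
    return symbols


def matches_reported_hang(client_stack, kthread_stack):
    """True if both dumps contain the frames seen in the traces above."""
    client = frame_symbols(client_stack)
    worker = frame_symbols(kthread_stack)
    return (
        "kthread_stop" in client
        and "stratix10_svc_done" in client
        and "svc_normal_to_secure_thread" in worker
    )


def read_stack(pid):
    """Read the kernel stack of a process; usually needs root privileges."""
    with open(f"/proc/{pid}/stack") as stack_file:
        return stack_file.read()
```

For the PIDs shown above, the check would be matches_reported_hang(read_stack(493), read_stack(494)). A match only indicates that the same frames are present, not that the root cause is identical.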