Fixed
Created: Oct 2, 2023
Updated: Jan 9, 2024
Resolved Date: Dec 22, 2023
Found In Version: 10.22.33.12
Fix Version: 10.22.33.14
Severity: Standard
Applicable for: Wind River Linux LTS 22
Component/s: Userspace
When the problem occurs, the test shell script never returns to the shell.
The problem occurs roughly once every few days.
bash registers an additional SIGINT handler, wait_sigint_handler(), on top of its original SIGINT handler.
At registration, the original SIGINT handler is saved in the old_sigint_handler pointer.
(See wait_for())
When SIGINT is raised by ctrl-c, wait_sigint_handler() is invoked first. It
restores the original SIGINT handler from the old_sigint_handler pointer as the primary SIGINT handler
by calling restore_sigint_handler(), then sends SIGINT to itself with kill().
As a consequence, the original handler is invoked.
(See wait_sigint_handler())
According to the customer's debugging and analysis, there is a small window in wait_for()
that causes the problem.
If SIGINT arrives during the window, the original SIGINT handler is never saved to old_sigint_handler.
So restore_sigint_handler() does nothing, and sending SIGINT to itself with kill() just invokes wait_sigint_handler()
again.
The restore_sigint_handler() and kill(SIGINT) cycle never stops. Therefore the shell script never returns
to the shell.
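The two cases described above can be modelled outside bash with a small standalone C sketch. It is not bash code: saved_handler, no_saved_handler, wait_handler, original_handler and simulate_sigint are hypothetical stand-ins for old_sigint_handler, INVALID_SIGNAL_HANDLER, wait_sigint_handler(), bash's original handler and the ctrl-c event. MAX_RESENDS is an artificial cap added only so that the demonstration terminates; the real cycle has no such cap.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_RESENDS 5   /* demo-only safety cap; bash has no such limit */

/* Sentinel meaning "no handler saved" (stand-in for INVALID_SIGNAL_HANDLER). */
static void no_saved_handler (int sig) { (void) sig; }

static void (*volatile saved_handler) (int) = no_saved_handler;
static volatile sig_atomic_t wait_calls, original_calls;

/* Install fn for SIGINT; SIGINT stays blocked while the handler runs. */
static void
install (void (*fn) (int))
{
  struct sigaction sa;

  sa.sa_handler = fn;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction (SIGINT, &sa, NULL);
}

static void
original_handler (int sig)
{
  (void) sig;
  original_calls++;
}

/* Mirrors restore_sigint_handler(): a no-op when nothing was saved. */
static void
restore_saved_handler (void)
{
  if (saved_handler != no_saved_handler)
    {
      install (saved_handler);
      saved_handler = no_saved_handler;
    }
}

/* Mirrors the else branch of wait_sigint_handler(): restore, then re-send. */
static void
wait_handler (int sig)
{
  (void) sig;
  if (++wait_calls >= MAX_RESENDS)
    return;                     /* stop the demo; the real cycle would go on */
  restore_saved_handler ();
  kill (getpid (), SIGINT);     /* pending until this handler returns */
}

/* Deliver one SIGINT; returns how many times wait_handler ran. */
int
simulate_sigint (int old_handler_saved)
{
  sigset_t sigint_only;

  sigemptyset (&sigint_only);
  sigaddset (&sigint_only, SIGINT);
  sigprocmask (SIG_UNBLOCK, &sigint_only, NULL);

  wait_calls = original_calls = 0;
  install (wait_handler);
  /* old_handler_saved == 0 models SIGINT arriving inside the window. */
  saved_handler = old_handler_saved ? original_handler : no_saved_handler;
  kill (getpid (), SIGINT);
  install (SIG_DFL);
  return wait_calls;
}

int
original_handler_calls (void)
{
  return original_calls;
}

#ifdef DEMO_MAIN
int
main (void)
{
  int n = simulate_sigint (1);
  printf ("saved:     wait=%d original=%d\n", n, original_handler_calls ());
  assert (n == 1);
  n = simulate_sigint (0);
  printf ("not saved: wait=%d original=%d\n", n, original_handler_calls ());
  assert (n == MAX_RESENDS);
  return 0;
}
#endif
```

Compiled with -DDEMO_MAIN, the sketch runs both cases. With the handler saved, wait_handler runs once and hands over to original_handler. Without it, wait_handler keeps re-entering until the demo cap is reached and original_handler never runs.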
<bash/5.1.16-r0/bash-5.1.16/jobs.c>
int
wait_for (pid, flags)
     pid_t pid;
     int flags;
{
  ...
  /* This is possibly a race condition -- should it go in stop_pipeline? */
  wait_sigint_received = child_caught_sigint = 0;
  if (job_control == 0 || (subshell_environment&SUBSHELL_COMSUB))
    {
      SigHandler *temp_sigint_handler;
      /* Small window begin. added by the reporter */
      temp_sigint_handler = set_signal_handler (SIGINT, wait_sigint_handler);
      if (temp_sigint_handler == wait_sigint_handler)
        {
#if defined (DEBUG)
          internal_warning ("wait_for: recursively setting old_sigint_handler to wait_sigint_handler: running_trap = %d", running_trap);
#endif
        }
      else
        /* Small window end. added by the reporter */
        old_sigint_handler = temp_sigint_handler;
      waiting_for_child = 0;
      if (old_sigint_handler == SIG_IGN)
        set_signal_handler (SIGINT, old_sigint_handler);
    }
  ...
}
static sighandler
wait_sigint_handler (sig)
     int sig;
{
  ...
  /* XXX - should this be interrupt_state? If it is, the shell will act
     as if it got the SIGINT interrupt. */
  if (waiting_for_child)
    wait_sigint_received = 1;
  else
    {
      set_exit_status (128+SIGINT);
      restore_sigint_handler ();
      kill (getpid (), SIGINT);
    }
  ...
}
static void
restore_sigint_handler ()
{
  if (old_sigint_handler != INVALID_SIGNAL_HANDLER)
    {
      set_signal_handler (SIGINT, old_sigint_handler);
      old_sigint_handler = INVALID_SIGNAL_HANDLER;
      waiting_for_child = 0;
    }
}
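One general way to close a window of this kind is to keep SIGINT blocked while the new handler is installed and the old one is saved. The sketch below illustrates that technique in standalone C. It is an assumption for illustration only, not the change shipped in the fix version, which this report does not describe. All names in it (install_wait_handler_blocked, saved_handler, send_sigint, demo_sigint_in_window) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static void (*volatile saved_handler) (int);
static volatile sig_atomic_t saved_when_delivered = -1;

static void original_handler (int sig) { (void) sig; }

/* Records whether the old handler was already saved when SIGINT arrived. */
static void
wait_handler (int sig)
{
  (void) sig;
  saved_when_delivered = (saved_handler == original_handler);
}

static void
set_handler (void (*fn) (int), struct sigaction *old)
{
  struct sigaction sa;

  sa.sa_handler = fn;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction (SIGINT, &sa, old);
}

/* Install wait_handler and save the previous handler with SIGINT blocked.
   in_window is a test hook that runs where the window would otherwise be. */
void
install_wait_handler_blocked (void (*in_window) (void))
{
  sigset_t sigint_only, previous_mask;
  struct sigaction old_action;

  sigemptyset (&sigint_only);
  sigaddset (&sigint_only, SIGINT);
  sigprocmask (SIG_BLOCK, &sigint_only, &previous_mask);

  set_handler (wait_handler, &old_action);
  if (in_window)
    in_window ();               /* a SIGINT sent here only becomes pending */
  saved_handler = old_action.sa_handler;

  /* Restoring the mask delivers any pending SIGINT, after the save. */
  sigprocmask (SIG_SETMASK, &previous_mask, NULL);
}

static void
send_sigint (void)
{
  kill (getpid (), SIGINT);
}

/* Returns 1 if the old handler was saved before wait_handler ran. */
int
demo_sigint_in_window (void)
{
  sigset_t sigint_only;

  sigemptyset (&sigint_only);
  sigaddset (&sigint_only, SIGINT);
  sigprocmask (SIG_UNBLOCK, &sigint_only, NULL);

  saved_handler = NULL;
  saved_when_delivered = -1;
  set_handler (original_handler, NULL);
  install_wait_handler_blocked (send_sigint);
  set_handler (SIG_DFL, NULL);
  return saved_when_delivered;
}

#ifdef DEMO_MAIN
int
main (void)
{
  assert (demo_sigint_in_window () == 1);
  return 0;
}
#endif
```

In this sketch the SIGINT sent inside the window stays pending until the mask is restored, so the handler always sees a saved old handler. Whether the same approach suits bash's own signal bookkeeping is not something this report establishes.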
The problem is hard to reproduce unless you add sleep(1) to wait_for(), as in the attached "jobs.c".
With that modification, I can reproduce the problem easily on a qemux86-64 target.
Here are the steps to reproduce the problem.
1 mkdir qemux86_64
2 cd qemux86_64
3 git clone --branch WRLINUX_10_22_LTS https://gateway.delivers.windriver.com/git/linux-lts/release/wrlinux-lts.22/WRLinux-lts-22-Core/wrlinux-x
4 ./wrlinux-x/setup.sh --machines qemux86-64 --dl-layers
5 . ./environment-setup-x86_64-wrlinuxsdk-linux
6 . ./oe-init-build-env
7 Change BB_NO_NETWORK in conf/local.conf from '1' to '0', as shown below.
BB_NO_NETWORK ?= '0'
8 Unpack and patch bash source files.
bitbake bash -c patch
9 Replace the source file below with the attached "jobs.c".
tmp-glibc/work/core2-64-wrs-linux/bash/5.1.16-r0/bash-5.1.16/jobs.c
10 Complete the platform build.
bitbake wrlinux-image-small
11 Run the QEMU target.
runqemu qemux86-64 nographic
12 Log in to the QEMU target as root.
13 Create a shell script file like the attached "test.sh".
14 chmod +x test.sh
15 Run the script.
./test.sh
16 Press ctrl-c on the keyboard and check whether the script returns to the bash shell.