Fixed

LIN1022-5510 : Bash: shell script rarely won't stop by ctrl-c

Created: Oct 2, 2023    Updated: Jan 9, 2024
Resolved Date: Dec 22, 2023
Found In Version: 10.22.33.12
Fix Version: 10.22.33.14
Severity: Standard
Applicable for: Wind River Linux LTS 22
Component/s: Userspace

Description

When the problem occurs, the test shell script does not stop on Ctrl-C and never returns to the shell.
The problem occurs about once every few days.

bash registers a SIGINT handler, wait_sigint_handler(), on top of its original SIGINT handler.
At registration, the original SIGINT handler is saved in the old_sigint_handler pointer.
(See wait_for())

When Ctrl-C raises SIGINT, wait_sigint_handler() is invoked first. wait_sigint_handler()
restores the original SIGINT handler from the old_sigint_handler pointer as the primary SIGINT handler
by calling restore_sigint_handler(), then sends SIGINT to itself with kill().
As a consequence, the original handler is invoked.
(See wait_sigint_handler())
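
The restore-and-resend sequence described above can be sketched as a small standalone C program fragment. This is only an illustration under stated assumptions, not bash code: the names original_handler, wait_handler, saved_action and demo_restore_and_resend are hypothetical, and sigaction() is used directly in place of bash's set_signal_handler() wrapper.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names; they only loosely mirror the roles in jobs.c. */
static volatile sig_atomic_t original_calls = 0;
static volatile sig_atomic_t wait_calls = 0;
static struct sigaction saved_action;   /* plays the role of old_sigint_handler */

/* Stands in for bash's original SIGINT handler. */
static void original_handler(int sig)
{
    (void)sig;
    original_calls++;
}

/* Stands in for wait_sigint_handler(): restore the saved handler,
   then send SIGINT to ourselves so the restored handler runs. */
static void wait_handler(int sig)
{
    (void)sig;
    wait_calls++;
    sigaction(SIGINT, &saved_action, NULL);
    kill(getpid(), SIGINT);
}

/* Installs original_handler, layers wait_handler on top of it, and
   delivers one SIGINT. Returns 1 when each handler ran exactly once. */
int demo_restore_and_resend(void)
{
    struct sigaction orig, waiting, before;

    original_calls = 0;
    wait_calls = 0;

    memset(&orig, 0, sizeof orig);
    orig.sa_handler = original_handler;
    sigemptyset(&orig.sa_mask);
    sigaction(SIGINT, &orig, &before);          /* remember the prior disposition */

    memset(&waiting, 0, sizeof waiting);
    waiting.sa_handler = wait_handler;
    sigemptyset(&waiting.sa_mask);
    sigaction(SIGINT, &waiting, &saved_action); /* saved_action now holds original_handler */

    raise(SIGINT);                              /* stands in for Ctrl-C */

    sigaction(SIGINT, &before, NULL);           /* put the prior disposition back */
    return original_calls == 1 && wait_calls == 1;
}
```

In a single-threaded process, demo_restore_and_resend() should return 1: SIGINT stays blocked while wait_handler runs, so the SIGINT sent by kill() is delivered to the restored handler once wait_handler returns.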

According to the customer's debugging and analysis, there is a small window in wait_for()
that causes the problem.
If SIGINT arrives during this window, wait_sigint_handler() is already installed but the original
SIGINT handler has not yet been saved to old_sigint_handler.
So restore_sigint_handler() does nothing, and sending kill(SIGINT) to itself just invokes the same
handler (wait_sigint_handler) again.
The restore_sigint_handler() and kill(SIGINT) cycle never stops. Therefore the shell script never returns
to the shell.
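
The cycle can be illustrated with a small deterministic model that uses no real signals. This is a sketch under stated assumptions: the enum, the struct and the functions restore, deliver_sigint and simulate are hypothetical names, and the limit parameter exists only so that the model terminates where the real shell would spin forever.

```c
/* Which handler is meant; INVALID models INVALID_SIGNAL_HANDLER. */
enum handler { INVALID, ORIGINAL, WAIT };

struct shell {
    enum handler installed;  /* handler that would run on SIGINT */
    enum handler saved;      /* models old_sigint_handler */
};

/* Models restore_sigint_handler(): a no-op when nothing was saved. */
static void restore(struct shell *s)
{
    if (s->saved != INVALID) {
        s->installed = s->saved;
        s->saved = INVALID;
    }
}

/* Delivers one SIGINT and follows the restore + kill(SIGINT) cycle.
   Returns the number of times the wait handler ran before the original
   handler got control, or -1 if `limit` runs were reached (the hang). */
static int deliver_sigint(struct shell *s, int limit)
{
    int runs = 0;
    while (s->installed == WAIT) {
        if (runs == limit)
            return -1;
        runs++;
        restore(s);
        /* kill(getpid(), SIGINT): the loop dispatches to s->installed again */
    }
    return runs;
}

/* Models the start of wait_for(). If sigint_inside_window is nonzero,
   SIGINT arrives after the wait handler is installed but before the
   previous handler has been stored in `saved`. */
int simulate(int sigint_inside_window, int limit)
{
    struct shell s = { ORIGINAL, INVALID };
    enum handler temp = s.installed;  /* value returned by set_signal_handler() */

    s.installed = WAIT;
    if (sigint_inside_window)
        return deliver_sigint(&s, limit);

    s.saved = temp;                   /* old_sigint_handler = temp_sigint_handler */
    return deliver_sigint(&s, limit);
}
```

With this model, simulate(0, 1000) returns 1 (the wait handler runs once and hands over to the original handler), while simulate(1, 1000) returns -1 (the wait handler keeps re-invoking itself until the limit is hit).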

<bash/5.1.16-r0/bash-5.1.16/jobs.c>

int
wait_for (pid, flags)
     pid_t pid;
     int flags;
{
  ...
  /* This is possibly a race condition -- should it go in stop_pipeline? */
  wait_sigint_received = child_caught_sigint = 0;
  if (job_control == 0 || (subshell_environment&SUBSHELL_COMSUB))
    {
      SigHandler *temp_sigint_handler;

      /* Small window begin. added by the reporter */

      temp_sigint_handler = set_signal_handler (SIGINT, wait_sigint_handler);
      if (temp_sigint_handler == wait_sigint_handler)
        {
#if defined (DEBUG)
          internal_warning ("wait_for: recursively setting old_sigint_handler to wait_sigint_handler: running_trap = %d", running_trap);
#endif
        }
      else

        /* Small window end. added by the reporter */

        old_sigint_handler = temp_sigint_handler;

      waiting_for_child = 0;
      if (old_sigint_handler == SIG_IGN)
        set_signal_handler (SIGINT, old_sigint_handler);
    }
  ...
}

static sighandler
wait_sigint_handler (sig)
     int sig;
{
  ...
  /* XXX - should this be interrupt_state? If it is, the shell will act
     as if it got the SIGINT interrupt. */
  if (waiting_for_child)
    wait_sigint_received = 1;
  else
    {
      set_exit_status (128+SIGINT);
      restore_sigint_handler ();
      kill (getpid (), SIGINT);
    }
  ...
}

static void
restore_sigint_handler ()
{
  if (old_sigint_handler != INVALID_SIGNAL_HANDLER)
    {
      set_signal_handler (SIGINT, old_sigint_handler);
      old_sigint_handler = INVALID_SIGNAL_HANDLER;
      waiting_for_child = 0;
    }
}

Steps to Reproduce

The problem is hard to reproduce unless you add sleep(1) to wait_for(), as in the attached "jobs.c".
With that modification, I reproduced the problem easily on a qemux86-64 target.

Here are the steps to reproduce the problem.

1 mkdir qemux86_64
2 cd qemux86_64
3 git clone --branch WRLINUX_10_22_LTS https://gateway.delivers.windriver.com/git/linux-lts/release/wrlinux-lts.22/WRLinux-lts-22-Core/wrlinux-x
4 ./wrlinux-x/setup.sh --machines qemux86-64 --dl-layers
5 . ./environment-setup-x86_64-wrlinuxsdk-linux
6 . ./oe-init-build-env
7 Change BB_NO_NETWORK in conf/local.conf from '1' to '0', as shown below.

    BB_NO_NETWORK ?= '0'
    
8 Unpack and patch bash source files.

    bitbake bash -c patch
    
9 Replace the source file below with the attached "jobs.c".

    tmp-glibc/work/core2-64-wrs-linux/bash/5.1.16-r0/bash-5.1.16/jobs.c

10 Complete the platform build.

    bitbake wrlinux-image-small
    
11 Run the QEMU target.

    runqemu qemux86-64 nographic
    
12 Log in to the QEMU target as root.
13 Create a shell script file like the attached "test.sh".
14 chmod +x test.sh
15 Run the script.

    ./test.sh

16 Press Ctrl-C on the keyboard and check whether the script returns to the bash shell.