Wind River Support Network

Fixed

LIN10-6000 : Non-systematic system freeze (UART/console freeze) when powering the board off/on

Created: Jun 19, 2019    Updated: Aug 15, 2019
Resolved Date: Jul 16, 2019
Found In Version: 10.17.41.1
Fix Version: 10.17.41.17
Severity: Severe
Applicable for: Wind River Linux LTS 17
Component/s: Kernel

Description

On an ARM 32-bit board (Xilinx Zynq 7035) using the Xilinx WRL9 BSP, we observe a non-systematic (intermittent) "freeze" of the system when powering the board off and on.
The system freezes at the login screen on the serial console.

We investigated with a JTAG debugger (Lauterbach) and observed what may be a race condition in the UART driver xilinx_uartps.c; the system is "looping" inside the UART IRQ handler:
- the function involved is: cdns_uart_handle_rx()
- the task involved is: irq/145-xuartps

When the system freezes, it appears to be looping in the UART RX ISR; the UART RX side is disabled, and so are the interrupts.
Please see the attachments:
- the console log around system start-up >> console log.txt
- the stack trace from JTAG (Lauterbach) >> stack.png

We noticed that the Control Register of UART1 holds the value 0x118 when the system is frozen, so we built the following mechanism as a workaround.

Inside the while () loop of the functions cdns_uart_handle_rx() and cdns_uart_suspend(), we added these lines of code:
 
        /* Read the Control Register on every pass through the loop. */
        uart1_control_reg_test = readl(port->membase + CDNS_UART_CR);
        if (uart1_control_reg_test & CDNS_UART_CR_RX_DIS) {
            /* RX is disabled in the Control Register while we are still
             * in the loop: something went wrong, so break out. */
            char buffer[70];

            snprintf(buffer, sizeof(buffer),
                     "UART lock detected in handle_rx, break\n");
            my_putstring(port, buffer);
            break;
        }
 
During testing we saw some "UART lock detected in handle_rx" traces printed.
This means that we had hit the problem.
This is only a workaround; the real root cause is not yet understood.
The question is: is it possible to be inside cdns_uart_handle_rx() while RX is disabled?

Steps to Reproduce

Build a preempt-rt Linux image with WRL9.

$ ./wrlinux-9/setup.sh --machines xilinx-zynq --distro wrlinux --kernel preempt-rt --dl-layers
...
$ bitbake wrlinux-image-glibc-std
...
Copy the image to the boot media and start the test board while connected to it over a serial connection.

The freeze was reproduced at various baud rates, after multiple restarts:
- at 9600 baud, after 54 off/on cycles
- at 19200 baud, after 12 off/on cycles
- at 38400 baud, after 37 off/on cycles
- at 115200 baud, the system seems to work fine
The issue is observed ONLY with the preempt-rt patches applied (--kernel preempt-rt).
With the preempt-rt patches removed (--kernel standard), the issue was not reproduced.
