Not to be fixed
Created: May 9, 2013
Updated: Apr 19, 2018
Resolved Date: Apr 17, 2018
Previous ID: LIN5-5725, LIN6-3076
Found In Version: 6.0
Severity: Severe
Applicable for: Wind River Linux 6
Component/s: Kernel
In the customer's nightly network test environment (the customer runs many automated scripts that perform many NIC shutdown/no shutdown operations), the following issue occurs frequently (it can be reproduced every night):
/*****************************************************************/
e1000 0000:02:01.0: eth1: Reset adapter
e1000 0000:02:01.0: eth1: e1000_reinit_safe set __E1000_RESETTING
e1000 0000:02:01.0: eth1: e1000_reinit_safe reset __E1000_RESETTING
e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
e1000 0000:02:03.0: eth3: Reset adapter
e1000 0000:02:03.0: eth3: e1000_reinit_safe set __E1000_RESETTING
------------[ cut here ]------------
WARNING: at /ers_rp/build-rp-wrl5/build/projectDir-intel-xeon-core_32-RGZ4vx/bitbake_build/tmp/work/intel_xeon_core_32-wrs-linux/linux-windriver-3.4-r0/linux/drivers/net/ethernet/intel/e1000/e1000_main.c:1461 e1000_close+0x94/0xa0()
Hardware name: VMware Virtual Platform
Modules linked in: evdev power_supply container hwmon processor coretemp ata_generic aes_i586 cryptd aesni_intel ac layer2_module x_tables ip_tables iptable_filter
Pid: 1119, comm: HSL Tainted: G W 3.4.34-WR5.0.1.0_standard #3
Call Trace:
[<c1032822>] warn_slowpath_common+0x72/0xa0
[<c14adcf4>] ? e1000_close+0x94/0xa0
[<c14adcf4>] ? e1000_close+0x94/0xa0
[<c1032872>] warn_slowpath_null+0x22/0x30
[<c14adcf4>] e1000_close+0x94/0xa0
[<c15ca871>] __dev_close_many+0x71/0xc0
[<c1038be3>] ? local_bh_enable_ip+0x43/0xa0
[<c15ca8ed>] __dev_close+0x2d/0x40
[<c15cef72>] __dev_change_flags+0x82/0x150
[<c12e8dcc>] ? security_capable+0x1c/0x30
[<c15cf0e1>] dev_change_flags+0x21/0x60
[<c16498a0>] devinet_ioctl+0x560/0x6b0
[<c133ee76>] ? __percpu_counter_add+0xa6/0xf0
[<c164a38d>] inet_ioctl+0x8d/0xb0
[<c15b822d>] sock_ioctl+0x6d/0x290
[<c15b81c0>] ? sock_fasync+0x90/0x90
[<c113e5c2>] do_vfs_ioctl+0x82/0x570
[<c15b90d4>] ? sock_map_fd+0x24/0x30
[<c15ba0e0>] ? sys_socket+0x50/0xf0
[<c15bb527>] ? sys_socketcall+0x77/0x370
[<c113eb19>] sys_ioctl+0x69/0xe0
[<c1752690>] sysenter_do_call+0x12/0x26
---[ end trace 515a98bd32c5aba3 ]---
/********************************************************/
Looking at the code of e1000_reinit_safe, which performs the reset:
/********************************************************/
static void e1000_reinit_safe(struct e1000_adapter *adapter)
{
	while (test_and_set_bit(__E1000_RESETTING, &adapter->flags))
		msleep(1);
	mutex_lock(&adapter->mutex);
	e1000_down(adapter);
	e1000_up(adapter);
	mutex_unlock(&adapter->mutex);
	clear_bit(__E1000_RESETTING, &adapter->flags);
}
/********************************************************/
We can see that after setting the __E1000_RESETTING flag, the task may sleep for a while before getting adapter->mutex. During that time another task may call e1000_close, find that the RESETTING flag is still set, and throw the warning above. If the mutex cannot be acquired promptly, the issue above may occur.
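The window described above can be modelled in user space. The following is only a minimal sketch, not driver code: the names adapter_model, MODEL_RESETTING_BIT and the model_* helpers are hypothetical stand-ins, C11 atomics replace the kernel bit operations, and the reset is split into two steps so that the close-path check can be placed between them.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for __E1000_RESETTING; not the driver's real value. */
#define MODEL_RESETTING_BIT 0

/* Hypothetical model of the part of the adapter that matters here. */
struct adapter_model {
	atomic_ulong flags;
};

static void model_adapter_init(struct adapter_model *a)
{
	atomic_init(&a->flags, 0UL);
}

/* Models test_and_set_bit(): sets the bit and returns its previous value. */
static bool model_test_and_set_bit(int nr, atomic_ulong *flags)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/* Models clear_bit(). */
static void model_clear_bit(int nr, atomic_ulong *flags)
{
	atomic_fetch_and(flags, ~(1UL << nr));
}

/* Models test_bit(). */
static bool model_test_bit(int nr, atomic_ulong *flags)
{
	return (atomic_load(flags) & (1UL << nr)) != 0;
}

/* First half of the reset path: the flag is now set, the mutex not yet taken. */
static void model_reset_begin(struct adapter_model *a)
{
	/* The driver sleeps with msleep(1) here; this single-threaded model spins. */
	while (model_test_and_set_bit(MODEL_RESETTING_BIT, &a->flags))
		;
}

/* Second half of the reset path: down/up are finished, the flag is cleared. */
static void model_reset_end(struct adapter_model *a)
{
	model_clear_bit(MODEL_RESETTING_BIT, &a->flags);
}

/* Close path: reports whether a check of the RESETTING flag would fire now. */
static bool model_close_would_warn(struct adapter_model *a)
{
	return model_test_bit(MODEL_RESETTING_BIT, &a->flags);
}
```

Calling model_close_would_warn() between model_reset_begin() and model_reset_end() returns true, which corresponds to the interleaving suspected in the log; before the reset starts and after it ends, it returns false.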
After some searching on the web, we found a similar issue in which the adapter is being closed and reset simultaneously (http://kernel.opensuse.org/cgit/kernel/commit/?id=bb9e44d0d0f45da356c39e485edacff6e14ba961), but that issue is in e1000e. After making some changes to e1000 under the guidance of that patch, the issue occurs less frequently, but it can still be reproduced.
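The exact changes made are not recorded in this report. As an illustration only, one change of this general kind is to make the close path wait until the RESETTING flag is clear before it continues. The sketch below assumes that approach and models it in user space; adapter_model, reset_ticks_left and the model_* helpers are hypothetical, and one "tick" stands for one short sleep during which the reset path makes progress.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: a flag plus a countdown until the reset finishes. */
struct adapter_model {
	atomic_bool resetting;  /* models the __E1000_RESETTING flag */
	int reset_ticks_left;   /* ticks until the modelled reset completes */
};

static void model_adapter_init(struct adapter_model *a)
{
	atomic_init(&a->resetting, false);
	a->reset_ticks_left = 0;
}

/* Reset path starts: the flag is set and stays set for `ticks` ticks. */
static void model_start_reset(struct adapter_model *a, int ticks)
{
	atomic_store(&a->resetting, true);
	a->reset_ticks_left = ticks > 0 ? ticks : 1; /* always terminates */
}

/* One modelled short sleep; the reset path advances and may finish. */
static void model_sleep_tick(struct adapter_model *a)
{
	if (a->reset_ticks_left > 0 && --a->reset_ticks_left == 0)
		atomic_store(&a->resetting, false);
}

/* Close path that waits for the reset; returns the number of ticks waited. */
static int model_close_waiting(struct adapter_model *a)
{
	int waited = 0;

	while (atomic_load(&a->resetting)) {
		model_sleep_tick(a);
		waited++;
	}
	return waited;
}

/* Reports whether a check of the RESETTING flag would fire now. */
static bool model_close_would_warn(struct adapter_model *a)
{
	return atomic_load(&a->resetting);
}
```

In this model the check no longer fires once model_close_waiting() has returned. It only covers a reset that is already in progress when close starts; a reset that begins after the wait has finished is not covered, which may be one reason the warning was still seen at a lower frequency.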
The issue is hard to reproduce on our side, but the customer hits it every night.