Fixed

LIN1019-8553 : i40e - XL710 PF - Initial pf_reset failed: -15 - BUG: kernel NULL pointer dereference

Created: Jul 11, 2022    Updated: Apr 4, 2023
Resolved Date: Jan 2, 2023
Found In Version: 10.19.45.17
Fix Version: 10.19.45.27
Severity: Standard
Applicable for: Wind River Linux LTS 19
Component/s: Kernel

Description

Problem description: XL710 PF reset fails (i40e driver)

How can the problem be reproduced: USB installation for WRL OVP 19.45

OS: Wind River Linux OVP 19.45 Update 17, intel-x86-64

Hardware: Intel(R) Xeon(R) CPU E5-2608L v3 @ 2.00GHz (family: 0x6, model: 0x3f, stepping: 0x2)

Failure log extracts:

Cannot get driver information: No such device

rmmod: ERROR: Module i40e is in use

sh[2538]: PF driver reset failed, trying to reinitialize

 

i40e: Intel(R) 40-10 Gigabit Ethernet Connection Network Driver - version 2.8.43

i40e: Copyright(c) 2013 - 2019 Intel Corporation.

i40e 0000:05:00.0: Initial pf_reset failed: -15

i40e 0000:05:00.0: previous errors forcing module to load in debug mode

 

i40e 0000:05:00.1: Initial pf_reset failed: -15

i40e 0000:05:00.1: previous errors forcing module to load in debug mode

BUG: kernel NULL pointer dereference, address: 0000000000000000

e1000e 0000:00:19.0 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

#PF: supervisor read access in kernel mode

printk: console [ttyS0]: printing thread stopped

#PF: error_code(0x0000) - not-present page

PGD 0 P4D 0

Oops: 0000 [#1] PREEMPT SMP PTI

CPU: 1 PID: 2704 Comm: rmmod Kdump: loaded Tainted: G           O      5.2.60-rt15-LTS19 #1

Hardware name: Juniper Networks, Inc. 0512       /HSW RE MX  , BIOS REH_P_MTR1_00.34.01 03/18/2022

RIP: 0010:i40e_remove+0xb1/0x390 [i40e]

Code: 83 bc 24 68 08 00 00 00 74 0d 49 8d bc 24 50 08 00 00 e8 62 bb ad d0 49 8b 84 24 90 0e 00 00 31 f6 41 0f b7 94 24 88 0e 00 00 <48> 8b 3c d0 e8 c6 e1 01 00 49 8b 84 24 00 07 00 00 a9 00 00 10 00

RSP: 0018:ffffa6854c647da0 EFLAGS: 00010246

RAX: 0000000000000000 RBX: ffffa2cbec414008 RCX: 0000000000000000

RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa2cbec414370

RBP: ffffa6854c647dd8 R08: 0000000000000000 R09: 0000000000000001

R10: ffffdf0e7fb7b540 R11: ffffa2cbef19b370 R12: ffffa2cbec414000

R13: ffffffffc05fb000 R14: ffffa2cbf367f000 R15: dead000000000100

FS:  00007f8b2c8ac740(0000) GS:ffffa2cbffa40000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: 0000000000000000 CR3: 0000001fe6fe4006 CR4: 00000000001606e0

Call Trace:

 pci_device_remove+0x3e/0xb0

 device_release_driver_internal+0xe4/0x1b0

 driver_detach+0x47/0x82

 bus_remove_driver+0x53/0xa5

 driver_unregister+0x2e/0x50

 pci_unregister_driver+0x32/0x90

 i40e_exit_module+0x10/0xed3 [i40e]

 __se_sys_delete_module+0x156/0x200

 ? exit_to_usermode_loop+0x7b/0x140

 __x64_sys_delete_module+0x16/0x20

 do_syscall_64+0x4d/0x150

 entry_SYSCALL_64_after_hwframe+0x44/0xa9

RIP: 0033:0x7f8b2c9a8867

Code: 73 01 c3 48 8b 0d 19 c6 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 c5 0b 00 f7 d8 64 89 01 48

RSP: 002b:00007ffc352bc018 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0

RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8b2c9a8867

RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055d7c6cd77c8

RBP: 00007ffc352bc068 R08: 0000000000000000 R09: 0000000000000000

R10: 00007f8b2ca18ac0 R11: 0000000000000206 R12: 00007ffc352bc230

R13: 00007ffc352bcf31 R14: 000055d7c6cd72a0 R15: 000055d7c6cd7760

Modules linked in: intel_rapl_msr iTCO_wdt iTCO_vendor_support watchdog intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel crct10dif_pclmul crct10dif_common aesni_intel aes_x86_64 glue_helper crypto_simd cryptd i2c_i801 i40e(O-) configfs igb(O) e1000e(O) lpc_ich pcc_cpufreq sch_fq_codel nfsd openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 kvm irqbypass fuse

CR2: 0000000000000000

---[ end trace 0000000000000002 ]---

printk: enabled sync mode

RIP: 0010:i40e_remove+0xb1/0x390 [i40e]

Code: 83 bc 24 68 08 00 00 00 74 0d 49 8d bc 24 50 08 00 00 e8 62 bb ad d0 49 8b 84 24 90 0e 00 00 31 f6 41 0f b7 94 24 88 0e 00 00 <48> 8b 3c d0 e8 c6 e1 01 00 49 8b 84 24 00 07 00 00 a9 00 00 10 00

RSP: 0018:ffffa6854c647da0 EFLAGS: 00010246

RAX: 0000000000000000 RBX: ffffa2cbec414008 RCX: 0000000000000000

RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa2cbec414370

RBP: ffffa6854c647dd8 R08: 0000000000000000 R09: 0000000000000001

R10: ffffdf0e7fb7b540 R11: ffffa2cbef19b370 R12: ffffa2cbec414000

R13: ffffffffc05fb000 R14: ffffa2cbf367f000 R15: dead000000000100

FS:  00007f8b2c8ac740(0000) GS:ffffa2cbffa40000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: 0000000000000000 CR3: 0000001fe6fe4006 CR4: 00000000001606e0

/etc/jmodutilsPost.sh: line 20:  2704 Killed                  rmmod i40e

sh[2538]: PF driver reset failed, trying to reinitialize

[  OK  ] Started Check for failure of daemons and alarm.

Cannot get driver information: No such device

rmmod: ERROR: Module i40e is in use

sh[2538]: PF driver reset failed, trying to reinitialize

Cannot get driver information: No such device

Workaround

NA

Steps to Reproduce

Build the LTS19 OVP host image (LTS19 RCPL0017).

Set up the project:

$ ./wrlinux-x/setup.sh --dl-layers --distros wrlinux-ovp --machines intel-x86-64 --templates feature/initramfs,feature/kdump,feature/kexec,feature/sysklogd,feature/dpdk,feature/package-management 

Source the bitbake build environment:

$ . ./environment-setup-x86_64-wrlinuxsdk-linux

$ . ./oe-init-build-env

-> Set PREFERRED_PROVIDER_virtual/kernel = "linux-yocto-rt" in conf/local.conf

-> Build images.

$ bitbake wrlinux-image-ovp-kvm

-> Deploy bzImage and rootfs.tar.bz2 to an intel-x86-64 Haswell-based board that has an Intel XL710 NIC attached.

-> Reboot LTS19 repeatedly until the following error appears in the system logs:

i40e : Initial pf_reset failed: -15

-> This issue is currently reproducible in the customer setup after multiple (50-100) reboots/power cycles.
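
Because the failure only shows up after many reboots, checking each boot's log by hand is tedious. The following is a minimal sketch of a check that could be run after every boot, assuming the kernel log has been saved to a file. `check_i40e_log` is a hypothetical helper name, not part of Wind River Linux or the i40e driver; the two search strings are taken from the failure log extracts in this report.

```shell
#!/bin/sh
# Sketch only: check_i40e_log is a hypothetical helper, not a shipped tool.
# It scans a saved kernel log for the two signatures seen in this defect:
#   1. "Initial pf_reset failed"  (PF reset failure at probe time)
#   2. "RIP: ... i40e_remove"     (oops while unloading the driver)

check_i40e_log() {
    log_file=$1

    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function usable under "set -e".
    failures=$(grep -c 'Initial pf_reset failed' "$log_file" || true)

    # grep -q only sets the exit status: 0 if at least one line matches.
    if grep -q 'RIP: .*i40e_remove' "$log_file"; then
        oops=yes
    else
        oops=no
    fi

    echo "pf_reset_failures=${failures} i40e_remove_oops=${oops}"
}
```

One possible use is to save the output of `dmesg` to a file after each reboot and pass that file to `check_i40e_log`, stopping the reboot loop once `pf_reset_failures` is non-zero.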