Fixed

LIN1025-2503 : Security Advisory - linux - CVE-2025-38242

Created: Jul 9, 2025    Updated: Sep 1, 2025
Resolved Date: Jul 15, 2025
Found In Version: 10.25.33.1
Severity: Standard
Applicable for: Wind River Linux LTS 25
Component/s: Kernel

Description

In the Linux kernel, the following vulnerability has been resolved:

mm: userfaultfd: fix race of userfaultfd_move and swap cache

This commit fixes two kinds of races, they may have different results:

Barry reported a BUG_ON in commit c50f8e6053b0, we may see the same
BUG_ON if the filemap lookup returned NULL and folio is added to swap
cache after that.

If another kind of race is triggered (folio changed after lookup) we
may see RSS counter is corrupted:

[  406.893936] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_ANONPAGES val:-1
[  406.894071] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_SHMEMPAGES val:1

Because the folio is being accounted to the wrong VMA.

I'm not sure if there will be any data corruption though, seems no.
The issues above are critical already.

On seeing a swap entry PTE, userfaultfd_move does a lockless swap cache
lookup, and tries to move the found folio to the faulting vma.  Currently,
it relies on checking the PTE value to ensure that the moved folio still
belongs to the src swap entry and that no new folio has been added to the
swap cache, which turns out to be unreliable.

While working and reviewing the swap table series with Barry, following
existing races are observed and reproduced [1]:

In the example below, move_pages_pte is moving src_pte to dst_pte, where
src_pte is a swap entry PTE holding swap entry S1, and S1 is not in the
swap cache:

CPU1                               CPU2
userfaultfd_move
  move_pages_pte()
    entry = pte_to_swp_entry(orig_src_pte);
    // Here it got entry = S1
    ... <interrupted> ...
                                   <swapin src_pte, alloc and use folio A>
                                   // folio A is a new allocated folio
                                   // and get installed into src_pte
                                   <frees swap entry S1>
                                   // src_pte now points to folio A, S1
                                   // has swap count == 0, it can be freed
                                   // by folio_swap_swap or swap
                                   // allocator's reclaim.
                                   <try to swap out another folio B>
                                   // folio B is a folio in another VMA.
                                   <put folio B to swap cache using S1 >
                                   // S1 is freed, folio B can use it
                                   // for swap out with no problem.
                                   ...
    folio = filemap_get_folio(S1)
    // Got folio B here !!!
    ... <interrupted again> ...
                                   <swapin folio B and free S1>
                                   // Now S1 is free to be used again.
                                   <swapout src_pte & folio A using S1>
                                   // Now src_pte is a swap entry PTE
                                   // holding S1 again.
    folio_trylock(folio)
    move_swap_pte
      double_pt_lock
      is_pte_pages_stable
      // Check passed because src_pte == S1
      folio_move_anon_rmap(...)
      // Moved invalid folio B here !!!

The race window is very short and requires multiple collisions of multiple
rare events, so it's very unlikely to happen, but with a deliberately
constructed reproducer and increased time window, it can be reproduced
easily.

This can be fixed by checking if the folio returned by filemap is the
valid swap cache folio after acquiring the folio lock.

Another similar race is possible: filemap_get_folio may return NULL, but
folio (A) could be swapped in and then swapped out again using the same
swap entry after the lookup.  In such a case, folio (A) may remain in the
swap cache, so it must be moved too:

CPU1                               CPU2
userfaultfd_move
  move_pages_pte()
    entry = pte_to_swp_entry(orig_src_pte);
    // Here it got entry = S1, and S1 is not in swap cache
    folio = filemap_get
---truncated---
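
The first race described above (a lockless lookup returns folio B, and the PTE check later passes because src_pte holds S1 again) can be replayed deterministically in a small userspace model. The sketch below is illustrative only and is not kernel code: the names Folio, SwapCache and replay_race are hypothetical, the interleaving is hard-coded rather than truly concurrent, and it assumes folio A sits in the swap cache under S1 after being swapped out. It only shows why a PTE-value check alone is insufficient and why rechecking the folio against the swap entry after taking the folio lock catches the stale folio.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Folio:
    name: str
    swap_entry: Optional[str] = None  # swap cache slot this folio occupies, if any
    locked: bool = False


class SwapCache:
    """Toy swap cache: maps a swap entry name to the folio stored under it."""

    def __init__(self) -> None:
        self._slots: Dict[str, Folio] = {}

    def add(self, entry: str, folio: Folio) -> None:
        self._slots[entry] = folio
        folio.swap_entry = entry

    def remove(self, entry: str) -> None:
        folio = self._slots.pop(entry)
        folio.swap_entry = None

    def lookup(self, entry: str) -> Optional[Folio]:
        return self._slots.get(entry)


def replay_race(recheck_after_lock: bool) -> str:
    """Replay the CPU1/CPU2 interleaving from the description, step by step."""
    cache = SwapCache()
    folio_a, folio_b = Folio("A"), Folio("B")

    src_pte = "S1"                 # src_pte starts as a swap entry PTE holding S1
    entry = src_pte                # CPU1: reads entry = S1, then is interrupted

    src_pte = "A"                  # CPU2: swaps in src_pte using new folio A, S1 is freed
    cache.add("S1", folio_b)       # CPU2: swaps out unrelated folio B, reusing S1

    folio = cache.lookup(entry)    # CPU1: lockless lookup returns folio B

    cache.remove("S1")             # CPU2: swaps folio B back in, S1 is free again
    cache.add("S1", folio_a)       # CPU2: swaps out folio A using S1 (assumed cached)
    src_pte = "S1"                 # src_pte is a swap entry PTE holding S1 again

    folio.locked = True            # CPU1: takes the folio lock on the stale folio
    if src_pte != entry:           # PTE-value check: passes, src_pte == S1
        return "retry"
    if recheck_after_lock and folio.swap_entry != entry:
        return "retry"             # folio B no longer belongs to S1
    return "moved " + folio.name
```

With the recheck disabled, the model moves the wrong folio; with it enabled, the stale lookup result is rejected and the caller would retry:

```python
print(replay_race(recheck_after_lock=False))
print(replay_race(recheck_after_lock=True))
```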

CREATE(Triage):(User=admin) [CVE-2025-38242 (https://nvd.nist.gov/vuln/detail/CVE-2025-38242)]