Description
In the Linux kernel, the following vulnerability has been resolved:
drm/vgem-fence: Fix potential deadlock on release
A timer that expires a vgem fence automatically in 10 seconds is now
released with timer_delete_sync() from fence->ops.release() called on last
dma_fence_put(). In some scenarios, it can run in IRQ context, which is
not safe unless TIMER_IRQSAFE is used. One potentially risky scenario was
demonstrated in Intel DRM CI trybot, BAT run on machine bat-adlp-6, while
working on new IGT subtests syncobj_timeline@stress-* as user space
replacements of some problematic test cases of a dma-fence-chain selftest
[1].
[117.004338] ================================
[117.004340] WARNING: inconsistent lock state
[117.004342] 6.17.0-rc7-CI_DRM_17270-g7644974e648c+ #1 Tainted: G S U
[117.004346] --------------------------------
[117.004347] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[117.004349] swapper/0/0 [HC1[1]:SC1[1]:HE0:SE0] takes:
[117.004352] ffff888138f86aa8 ((&fence->timer)){?.-.}-{0:0}, at: __timer_delete_sync+0x4b/0x190
[117.004361] {HARDIRQ-ON-W} state was registered at:
[117.004363] lock_acquire+0xc4/0x2e0
[117.004366] call_timer_fn+0x80/0x2a0
[117.004368] __run_timers+0x231/0x310
[117.004370] run_timer_softirq+0x76/0xe0
[117.004372] handle_softirqs+0xd4/0x4d0
[117.004375] __irq_exit_rcu+0x13f/0x160
[117.004377] irq_exit_rcu+0xe/0x20
[117.004379] sysvec_apic_timer_interrupt+0xa0/0xc0
[117.004382] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[117.004385] cpuidle_enter_state+0x12b/0x8a0
[117.004388] cpuidle_enter+0x2e/0x50
[117.004393] call_cpuidle+0x22/0x60
[117.004395] do_idle+0x1fd/0x260
[117.004398] cpu_startup_entry+0x29/0x30
[117.004401] start_secondary+0x12d/0x160
[117.004404] common_startup_64+0x13e/0x141
[117.004407] irq event stamp: 2282669
[117.004409] hardirqs last enabled at (2282668): [<ffffffff8289db71>] _raw_spin_unlock_irqrestore+0x51/0x80
[117.004414] hardirqs last disabled at (2282669): [<ffffffff82882021>] sysvec_irq_work+0x11/0xc0
[117.004419] softirqs last enabled at (2254702): [<ffffffff8289fd00>] __do_softirq+0x10/0x18
[117.004423] softirqs last disabled at (2254725): [<ffffffff813d4ddf>] __irq_exit_rcu+0x13f/0x160
[117.004426]
other info that might help us debug this:
[117.004429] Possible unsafe locking scenario:
[117.004432] CPU0
[117.004433] ----
[117.004434] lock((&fence->timer));
[117.004436] <Interrupt>
[117.004438] lock((&fence->timer));
[117.004440]
*** DEADLOCK ***
[117.004443] 1 lock held by swapper/0/0:
[117.004445] #0: ffffc90000003d50 ((&fence->timer)){?.-.}-{0:0}, at: call_timer_fn+0x7a/0x2a0
[117.004450]
stack backtrace:
[117.004453] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G S U 6.17.0-rc7-CI_DRM_17270-g7644974e648c+ #1 PREEMPT(voluntary)
[117.004455] Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
[117.004455] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
[117.004456] Call Trace:
[117.004456] <IRQ>
[117.004457] dump_stack_lvl+0x91/0xf0
[117.004460] dump_stack+0x10/0x20
[117.004461] print_usage_bug.part.0+0x260/0x360
[117.004463] mark_lock+0x76e/0x9c0
[117.004465] ? register_lock_class+0x48/0x4a0
[117.004467] __lock_acquire+0xbc3/0x2860
[117.004469] lock_acquire+0xc4/0x2e0
[117.004470] ? __timer_delete_sync+0x4b/0x190
[117.004472] ? __timer_delete_sync+0x4b/0x190
[117.004473] __timer_delete_sync+0x68/0x190
[117.004474] ? __timer_delete_sync+0x4b/0x190
[117.004475] timer_delete_sync+0x10/0x20
[117.004476] vgem_fence_release+0x19/0x30 [vgem]
[117.004478] dma_fence_release+0xc1/0x3b0
[117.004480] ? dma_fence_release+0xa1/0x3b0
[117.004481] dma_fence_chain_release+0xe7/0x130
[117.004483] dma_fence_release+0xc1/0x3b0
[117.004484] ? _raw_spin_unlock_irqrestore+0x27/0x80
[117.004485] dma_fence_chain_irq_work+0x59/0x80
[117.004487] irq_work_single+0x75/0xa0
[117.004490] irq_work_r
---truncated---