| CVE |
Vendors |
Products |
Updated |
CVSS v2 |
CVSS v3 |
In the Linux kernel, the following vulnerability has been resolved:
drm/amdkfd: Don't call mmput from MMU notifier callback
If the process is exiting, the mmput inside mmu notifier callback from
compactd or fork or numa balancing could release the last reference
of mm struct to call exit_mmap and free_pgtable, this triggers deadlock
with below backtrace.
The deadlock will leak kfd process as mmu notifier release is not called
and cause VRAM leaking.
The fix is to take mm reference mmget_non_ ...
In the Linux kernel, the following vulnerability has been resolved:
drm/amdkfd: Don't call mmput from MMU notifier callback
If the process is exiting, the mmput inside mmu notifier callback from
compactd or fork or numa balancing could release the last reference
of mm struct to call exit_mmap and free_pgtable, this triggers deadlock
with below backtrace.
The deadlock will leak kfd process as mmu notifier release is not called
and cause VRAM leaking.
The fix is to take mm reference mmget_non_zero when adding prange to the
deferred list to pair with mmput in deferred list work.
If prange split and add into pchild list, the pchild work_item.mm is not
used, so remove the mm parameter from svm_range_unmap_split and
svm_range_add_child.
The backtrace of hung task:
INFO: task python:348105 blocked for more than 64512 seconds.
Call Trace:
__schedule+0x1c3/0x550
schedule+0x46/0xb0
rwsem_down_write_slowpath+0x24b/0x4c0
unlink_anon_vmas+0xb1/0x1c0
free_pgtables+0xa9/0x130
exit_mmap+0xbc/0x1a0
mmput+0x5a/0x140
svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu]
mn_itree_invalidate+0x72/0xc0
__mmu_notifier_invalidate_range_start+0x48/0x60
try_to_unmap_one+0x10fa/0x1400
rmap_walk_anon+0x196/0x460
try_to_unmap+0xbb/0x210
migrate_page_unmap+0x54d/0x7e0
migrate_pages_batch+0x1c3/0xae0
migrate_pages_sync+0x98/0x240
migrate_pages+0x25c/0x520
compact_zone+0x29d/0x590
compact_zone_order+0xb6/0xf0
try_to_compact_pages+0xbe/0x220
__alloc_pages_direct_compact+0x96/0x1a0
__alloc_pages_slowpath+0x410/0x930
__alloc_pages_nodemask+0x3a9/0x3e0
do_huge_pmd_anonymous_page+0xd7/0x3e0
__handle_mm_fault+0x5e3/0x5f0
handle_mm_fault+0xf7/0x2e0
hmm_vma_fault.isra.0+0x4d/0xa0
walk_pmd_range.isra.0+0xa8/0x310
walk_pud_range+0x167/0x240
walk_pgd_range+0x55/0x100
__walk_page_range+0x87/0x90
walk_page_range+0xf6/0x160
hmm_range_fault+0x4f/0x90
amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu]
amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu]
init_user_pages+0xb1/0x2a0 [amdgpu]
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu]
kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu]
kfd_ioctl+0x29d/0x500 [amdgpu]
(cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7)
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
hfsplus: remove mutex_lock check in hfsplus_free_extents
Syzbot reported an issue in hfsplus filesystem:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 4400 at fs/hfsplus/extents.c:346
hfsplus_free_extents+0x700/0xad0
Call Trace:
<TASK>
hfsplus_file_truncate+0x768/0xbb0 fs/hfsplus/extents.c:606
hfsplus_write_begin+0xc2/0xd0 fs/hfsplus/inode.c:56
cont_expand_zero fs/buffer.c:2383 [inline]
cont_write_begin+0x2cf/0x8 ...
In the Linux kernel, the following vulnerability has been resolved:
hfsplus: remove mutex_lock check in hfsplus_free_extents
Syzbot reported an issue in hfsplus filesystem:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 4400 at fs/hfsplus/extents.c:346
hfsplus_free_extents+0x700/0xad0
Call Trace:
<TASK>
hfsplus_file_truncate+0x768/0xbb0 fs/hfsplus/extents.c:606
hfsplus_write_begin+0xc2/0xd0 fs/hfsplus/inode.c:56
cont_expand_zero fs/buffer.c:2383 [inline]
cont_write_begin+0x2cf/0x860 fs/buffer.c:2446
hfsplus_write_begin+0x86/0xd0 fs/hfsplus/inode.c:52
generic_cont_expand_simple+0x151/0x250 fs/buffer.c:2347
hfsplus_setattr+0x168/0x280 fs/hfsplus/inode.c:263
notify_change+0xe38/0x10f0 fs/attr.c:420
do_truncate+0x1fb/0x2e0 fs/open.c:65
do_sys_ftruncate+0x2eb/0x380 fs/open.c:193
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
To avoid deadlock, Commit 31651c607151 ("hfsplus: avoid deadlock
on file truncation") unlock extree before hfsplus_free_extents(),
and add check wheather extree is locked in hfsplus_free_extents().
However, when operations such as hfsplus_file_release,
hfsplus_setattr, hfsplus_unlink, and hfsplus_get_block are executed
concurrently in different files, it is very likely to trigger the
WARN_ON, which will lead syzbot and xfstest to consider it as an
abnormality.
The comment above this warning also describes one of the easy
triggering situations, which can easily trigger and cause
xfstest&syzbot to report errors.
[task A] [task B]
->hfsplus_file_release
->hfsplus_file_truncate
->hfs_find_init
->mutex_lock
->mutex_unlock
->hfsplus_write_begin
->hfsplus_get_block
->hfsplus_file_extend
->hfsplus_ext_read_extent
->hfs_find_init
->mutex_lock
->hfsplus_free_extents
WARN_ON(mutex_is_locked) !!!
Several threads could try to lock the shared extents tree.
And warning can be triggered in one thread when another thread
has locked the tree. This is the wrong behavior of the code and
we need to remove the warning.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
mptcp: make fallback action and fallback decision atomic
Syzkaller reported the following splat:
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inli ...
In the Linux kernel, the following vulnerability has been resolved:
mptcp: make fallback action and fallback decision atomic
Syzkaller reported the following splat:
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Modules linked in:
CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary)
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline]
RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00
RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45
RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001
RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000
FS: 00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432
tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975
tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166
tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925
tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363
ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:469 [inline]
ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975
__netif_receive_skb+0x1f/0x120 net/core/dev.c:6088
process_backlog+0x301/0x1360 net/core/dev.c:6440
__napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453
napi_poll net/core/dev.c:7517 [inline]
net_rx_action+0xb44/0x1010 net/core/dev.c:7644
handle_softirqs+0x1d0/0x770 kernel/softirq.c:579
do_softirq+0x3f/0x90 kernel/softirq.c:480
</IRQ>
<TASK>
__local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407
local_bh_enable include/linux/bottom_half.h:33 [inline]
inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524
mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985
mptcp_check_listen_stop net/mptcp/mib.h:118 [inline]
__mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000
mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066
inet_release+0xed/0x200 net/ipv4/af_inet.c:435
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487
__sock_release+0xb3/0x270 net/socket.c:649
sock_close+0x1c/0x30 net/socket.c:1439
__fput+0x402/0xb70 fs/file_table.c:465
task_work_run+0x150/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xd4
---truncated---
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
btrfs: don't take dev_replace rwsem on task already holding it
Running fstests btrfs/011 with MKFS_OPTIONS="-O rst" to force the usage of
the RAID stripe-tree, we get the following splat from lockdep:
BTRFS info (device sdd): dev_replace from /dev/sdd (devid 1) to /dev/sdb started
============================================
WARNING: possible recursive locking detected
6.11.0-rc3-btrfs-for-next #599 Not tainted
-------- ...
In the Linux kernel, the following vulnerability has been resolved:
btrfs: don't take dev_replace rwsem on task already holding it
Running fstests btrfs/011 with MKFS_OPTIONS="-O rst" to force the usage of
the RAID stripe-tree, we get the following splat from lockdep:
BTRFS info (device sdd): dev_replace from /dev/sdd (devid 1) to /dev/sdb started
============================================
WARNING: possible recursive locking detected
6.11.0-rc3-btrfs-for-next #599 Not tainted
--------------------------------------------
btrfs/2326 is trying to acquire lock:
ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250
but task is already holding lock:
ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&fs_info->dev_replace.rwsem);
lock(&fs_info->dev_replace.rwsem);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by btrfs/2326:
#0: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250
stack backtrace:
CPU: 1 UID: 0 PID: 2326 Comm: btrfs Not tainted 6.11.0-rc3-btrfs-for-next #599
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x5b/0x80
__lock_acquire+0x2798/0x69d0
? __pfx___lock_acquire+0x10/0x10
? __pfx___lock_acquire+0x10/0x10
lock_acquire+0x19d/0x4a0
? btrfs_map_block+0x39f/0x2250
? __pfx_lock_acquire+0x10/0x10
? find_held_lock+0x2d/0x110
? lock_is_held_type+0x8f/0x100
down_read+0x8e/0x440
? btrfs_map_block+0x39f/0x2250
? __pfx_down_read+0x10/0x10
? do_raw_read_unlock+0x44/0x70
? _raw_read_unlock+0x23/0x40
btrfs_map_block+0x39f/0x2250
? btrfs_dev_replace_by_ioctl+0xd69/0x1d00
? btrfs_bio_counter_inc_blocked+0xd9/0x2e0
? __kasan_slab_alloc+0x6e/0x70
? __pfx_btrfs_map_block+0x10/0x10
? __pfx_btrfs_bio_counter_inc_blocked+0x10/0x10
? kmem_cache_alloc_noprof+0x1f2/0x300
? mempool_alloc_noprof+0xed/0x2b0
btrfs_submit_chunk+0x28d/0x17e0
? __pfx_btrfs_submit_chunk+0x10/0x10
? bvec_alloc+0xd7/0x1b0
? bio_add_folio+0x171/0x270
? __pfx_bio_add_folio+0x10/0x10
? __kasan_check_read+0x20/0x20
btrfs_submit_bio+0x37/0x80
read_extent_buffer_pages+0x3df/0x6c0
btrfs_read_extent_buffer+0x13e/0x5f0
read_tree_block+0x81/0xe0
read_block_for_search+0x4bd/0x7a0
? __pfx_read_block_for_search+0x10/0x10
btrfs_search_slot+0x78d/0x2720
? __pfx_btrfs_search_slot+0x10/0x10
? lock_is_held_type+0x8f/0x100
? kasan_save_track+0x14/0x30
? __kasan_slab_alloc+0x6e/0x70
? kmem_cache_alloc_noprof+0x1f2/0x300
btrfs_get_raid_extent_offset+0x181/0x820
? __pfx_lock_acquire+0x10/0x10
? __pfx_btrfs_get_raid_extent_offset+0x10/0x10
? down_read+0x194/0x440
? __pfx_down_read+0x10/0x10
? do_raw_read_unlock+0x44/0x70
? _raw_read_unlock+0x23/0x40
btrfs_map_block+0x5b5/0x2250
? __pfx_btrfs_map_block+0x10/0x10
scrub_submit_initial_read+0x8fe/0x11b0
? __pfx_scrub_submit_initial_read+0x10/0x10
submit_initial_group_read+0x161/0x3a0
? lock_release+0x20e/0x710
? __pfx_submit_initial_group_read+0x10/0x10
? __pfx_lock_release+0x10/0x10
scrub_simple_mirror.isra.0+0x3eb/0x580
scrub_stripe+0xe4d/0x1440
? lock_release+0x20e/0x710
? __pfx_scrub_stripe+0x10/0x10
? __pfx_lock_release+0x10/0x10
? do_raw_read_unlock+0x44/0x70
? _raw_read_unlock+0x23/0x40
scrub_chunk+0x257/0x4a0
scrub_enumerate_chunks+0x64c/0xf70
? __mutex_unlock_slowpath+0x147/0x5f0
? __pfx_scrub_enumerate_chunks+0x10/0x10
? bit_wait_timeout+0xb0/0x170
? __up_read+0x189/0x700
? scrub_workers_get+0x231/0x300
? up_write+0x490/0x4f0
btrfs_scrub_dev+0x52e/0xcd0
? create_pending_snapshots+0x230/0x250
? __pfx_btrfs_scrub_dev+0x10/0x10
btrfs_dev_replace_by_ioctl+0xd69/0x1d00
? lock_acquire+0x19d/0x4a0
? __pfx_btrfs_dev_replace_by_ioctl+0x10/0x10
?
---truncated---
Show More
|
|
An issue was discovered in PyTorch v2.5 and v2.7.1. Omission of profiler.stop() can cause torch.profiler.profile (PythonTracer) to crash or hang during finalization, leading to a Denial of Service (DoS).
|
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix deadlock between rcu_tasks_trace and event_mutex.
Fix the following deadlock:
CPU A
_free_event()
perf_kprobe_destroy()
mutex_lock(&event_mutex)
perf_trace_event_unreg()
synchronize_rcu_tasks_trace()
There are several paths where _free_event() grabs event_mutex
and calls sync_rcu_tasks_trace. Above is one such case.
CPU B
bpf_prog_test_run_syscall()
rcu_read_lock_trace()
bpf_prog_run_pin_on ...
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix deadlock between rcu_tasks_trace and event_mutex.
Fix the following deadlock:
CPU A
_free_event()
perf_kprobe_destroy()
mutex_lock(&event_mutex)
perf_trace_event_unreg()
synchronize_rcu_tasks_trace()
There are several paths where _free_event() grabs event_mutex
and calls sync_rcu_tasks_trace. Above is one such case.
CPU B
bpf_prog_test_run_syscall()
rcu_read_lock_trace()
bpf_prog_run_pin_on_cpu()
bpf_prog_load()
bpf_tracing_func_proto()
trace_set_clr_event()
mutex_lock(&event_mutex)
Delegate trace_set_clr_event() to workqueue to avoid
such lock dependency.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: displayport: Fix potential deadlock
The deadlock can occur due to a recursive lock acquisition of
`cros_typec_altmode_data::mutex`.
The call chain is as follows:
1. cros_typec_altmode_work() acquires the mutex
2. typec_altmode_vdm() -> dp_altmode_vdm() ->
3. typec_altmode_exit() -> cros_typec_altmode_exit()
4. cros_typec_altmode_exit() attempts to acquire the mutex again
To prevent this, defer the `typec_altmode_e ...
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: displayport: Fix potential deadlock
The deadlock can occur due to a recursive lock acquisition of
`cros_typec_altmode_data::mutex`.
The call chain is as follows:
1. cros_typec_altmode_work() acquires the mutex
2. typec_altmode_vdm() -> dp_altmode_vdm() ->
3. typec_altmode_exit() -> cros_typec_altmode_exit()
4. cros_typec_altmode_exit() attempts to acquire the mutex again
To prevent this, defer the `typec_altmode_exit()` call by scheduling
it rather than calling it directly from within the mutex-protected
context.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the ...
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the transaction again.
Move it at the end of the abort phase after nft_gc_seq_end() is called.
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
dm snapshot: fix lockup in dm_exception_table_exit
There was reported lockup when we exit a snapshot with many exceptions.
Fix this by adding "cond_resched" to the loop that frees the exceptions.
|
In the Linux kernel, the following vulnerability has been resolved:
interconnect: Don't access req_list while it's being manipulated
The icc_lock mutex was split into separate icc_lock and icc_bw_lock
mutexes in [1] to avoid lockdep splats. However, this didn't adequately
protect access to icc_node::req_list.
The icc_set_bw() function will eventually iterate over req_list while
only holding icc_bw_lock, but req_list can be modified while only
holding icc_lock. This causes races between icc_se ...
In the Linux kernel, the following vulnerability has been resolved:
interconnect: Don't access req_list while it's being manipulated
The icc_lock mutex was split into separate icc_lock and icc_bw_lock
mutexes in [1] to avoid lockdep splats. However, this didn't adequately
protect access to icc_node::req_list.
The icc_set_bw() function will eventually iterate over req_list while
only holding icc_bw_lock, but req_list can be modified while only
holding icc_lock. This causes races between icc_set_bw(), of_icc_get(),
and icc_put().
Example A:
CPU0 CPU1
---- ----
icc_set_bw(path_a)
mutex_lock(&icc_bw_lock);
icc_put(path_b)
mutex_lock(&icc_lock);
aggregate_requests()
hlist_for_each_entry(r, ...
hlist_del(...
<r = invalid pointer>
Example B:
CPU0 CPU1
---- ----
icc_set_bw(path_a)
mutex_lock(&icc_bw_lock);
path_b = of_icc_get()
of_icc_get_by_index()
mutex_lock(&icc_lock);
path_find()
path_init()
aggregate_requests()
hlist_for_each_entry(r, ...
hlist_add_head(...
<r = invalid pointer>
Fix this by ensuring icc_bw_lock is always held before manipulating
icc_node::req_list. The additional places icc_bw_lock is held don't
perform any memory allocations, so we should still be safe from the
original lockdep splats that motivated the separate locks.
[1] commit af42269c3523 ("interconnect: Fix locking for runpm vs reclaim")
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix RELEASE_LOCKOWNER
The test on so_count in nfsd4_release_lockowner() is nonsense and
harmful. Revert to using check_for_locks(), changing that to not sleep.
First: harmful.
As is documented in the kdoc comment for nfsd4_release_lockowner(), the
test on so_count can transiently return a false positive resulting in a
return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is
clearly a protocol violation and ...
In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix RELEASE_LOCKOWNER
The test on so_count in nfsd4_release_lockowner() is nonsense and
harmful. Revert to using check_for_locks(), changing that to not sleep.
First: harmful.
As is documented in the kdoc comment for nfsd4_release_lockowner(), the
test on so_count can transiently return a false positive resulting in a
return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is
clearly a protocol violation and with the Linux NFS client it can cause
incorrect behaviour.
If RELEASE_LOCKOWNER is sent while some other thread is still
processing a LOCK request which failed because, at the time that request
was received, the given owner held a conflicting lock, then the nfsd
thread processing that LOCK request can hold a reference (conflock) to
the lock owner that causes nfsd4_release_lockowner() to return an
incorrect error.
The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
it knows that the error is impossible. It assumes the lock owner was in
fact released so it feels free to use the same lock owner identifier in
some later locking request.
When it does reuse a lock owner identifier for which a previous RELEASE
failed, it will naturally use a lock_seqid of zero. However the server,
which didn't release the lock owner, will expect a larger lock_seqid and
so will respond with NFS4ERR_BAD_SEQID.
So clearly it is harmful to allow a false positive, which testing
so_count allows.
The test is nonsense because ... well... it doesn't mean anything.
so_count is the sum of three different counts.
1/ the set of states listed on so_stateids
2/ the set of active vfs locks owned by any of those states
3/ various transient counts such as for conflicting locks.
When it is tested against '2' it is clear that one of these is the
transient reference obtained by find_lockowner_str_locked(). It is not
clear what the other one is expected to be.
In practice, the count is often 2 because there is precisely one state
on so_stateids. If there were more, this would fail.
In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
In one case, CLOSE is called before RELEASE_LOCKOWNER. That results in
all the lock states being removed, and so the lockowner being discarded
(it is removed when there are no more references which usually happens
when the lock state is discarded). When nfsd4_release_lockowner() finds
that the lock owner doesn't exist, it returns success.
The other case shows an so_count of '2' and precisely one state listed
in so_stateid. It appears that the Linux client uses a separate lock
owner for each file resulting in one lock state per lock owner, so this
test on '2' is safe. For another client it might not be safe.
So this patch changes check_for_locks() to use the (newish)
find_any_file_locked() so that it doesn't take a reference on the
nfs4_file and so never calls nfsd_file_put(), and so never sleeps. With
this check is it safe to restore the use of check_for_locks() rather
than testing so_count against the mysterious '2'.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
The sysfs sriov_numvfs_store() path acquires the device lock before the
config space access lock:
sriov_numvfs_store
device_lock # A (1) acquire device lock
sriov_configure
vfio_pci_sriov_configure # (for example)
vfio_pci_core_sriov_configure
pci_disable_sriov
sriov_disable
pci_cfg_a ...
In the Linux kernel, the following vulnerability has been resolved:
PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
The sysfs sriov_numvfs_store() path acquires the device lock before the
config space access lock:
sriov_numvfs_store
device_lock # A (1) acquire device lock
sriov_configure
vfio_pci_sriov_configure # (for example)
vfio_pci_core_sriov_configure
pci_disable_sriov
sriov_disable
pci_cfg_access_lock
pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0
Previously, pci_dev_lock() acquired the config space access lock before the
device lock:
pci_dev_lock
pci_cfg_access_lock
dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1
device_lock # A (3) wait for device lock
Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may
deadlock with sriov_numvfs_store() if the operations occur in the sequence
(1) (2) (3) (4).
Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires
the device lock before the config space access lock, the same as the
sriov_numvfs_store() path.
[bhelgaas: combined and adapted commit log from Jay Zhou's independent
subsequent posting:
https://lore.kernel.org/r/[email protected]]
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
clk: Get runtime PM before walking tree during disable_unused
Doug reported [1] the following hung task:
INFO: task swapper/0:1 blocked for more than 122 seconds.
Not tainted 5.15.149-21875-gf795ebc40eb8 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00000008
Call trace:
__switch_to+0xf4/0x1f4
__schedule+0x418/0 ...
In the Linux kernel, the following vulnerability has been resolved:
clk: Get runtime PM before walking tree during disable_unused
Doug reported [1] the following hung task:
INFO: task swapper/0:1 blocked for more than 122 seconds.
Not tainted 5.15.149-21875-gf795ebc40eb8 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00000008
Call trace:
__switch_to+0xf4/0x1f4
__schedule+0x418/0xb80
schedule+0x5c/0x10c
rpm_resume+0xe0/0x52c
rpm_resume+0x178/0x52c
__pm_runtime_resume+0x58/0x98
clk_pm_runtime_get+0x30/0xb0
clk_disable_unused_subtree+0x58/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused_subtree+0x38/0x208
clk_disable_unused+0x4c/0xe4
do_one_initcall+0xcc/0x2d8
do_initcall_level+0xa4/0x148
do_initcalls+0x5c/0x9c
do_basic_setup+0x24/0x30
kernel_init_freeable+0xec/0x164
kernel_init+0x28/0x120
ret_from_fork+0x10/0x20
INFO: task kworker/u16:0:9 blocked for more than 122 seconds.
Not tainted 5.15.149-21875-gf795ebc40eb8 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:0 state:D stack: 0 pid: 9 ppid: 2 flags:0x00000008
Workqueue: events_unbound deferred_probe_work_func
Call trace:
__switch_to+0xf4/0x1f4
__schedule+0x418/0xb80
schedule+0x5c/0x10c
schedule_preempt_disabled+0x2c/0x48
__mutex_lock+0x238/0x488
__mutex_lock_slowpath+0x1c/0x28
mutex_lock+0x50/0x74
clk_prepare_lock+0x7c/0x9c
clk_core_prepare_lock+0x20/0x44
clk_prepare+0x24/0x30
clk_bulk_prepare+0x40/0xb0
mdss_runtime_resume+0x54/0x1c8
pm_generic_runtime_resume+0x30/0x44
__genpd_runtime_resume+0x68/0x7c
genpd_runtime_resume+0x108/0x1f4
__rpm_callback+0x84/0x144
rpm_callback+0x30/0x88
rpm_resume+0x1f4/0x52c
rpm_resume+0x178/0x52c
__pm_runtime_resume+0x58/0x98
__device_attach+0xe0/0x170
device_initial_probe+0x1c/0x28
bus_probe_device+0x3c/0x9c
device_add+0x644/0x814
mipi_dsi_device_register_full+0xe4/0x170
devm_mipi_dsi_device_register_full+0x28/0x70
ti_sn_bridge_probe+0x1dc/0x2c0
auxiliary_bus_probe+0x4c/0x94
really_probe+0xcc/0x2c8
__driver_probe_device+0xa8/0x130
driver_probe_device+0x48/0x110
__device_attach_driver+0xa4/0xcc
bus_for_each_drv+0x8c/0xd8
__device_attach+0xf8/0x170
device_initial_probe+0x1c/0x28
bus_probe_device+0x3c/0x9c
deferred_probe_work_func+0x9c/0xd8
process_one_work+0x148/0x518
worker_thread+0x138/0x350
kthread+0x138/0x1e0
ret_from_fork+0x10/0x20
The first thread is walking the clk tree and calling
clk_pm_runtime_get() to power on devices required to read the clk
hardware via struct clk_ops::is_enabled(). This thread holds the clk
prepare_lock, and is trying to runtime PM resume a device, when it finds
that the device is in the process of resuming so the thread schedule()s
away waiting for the device to finish resuming before continuing. The
second thread is runtime PM resuming the same device, but the runtime
resume callback is calling clk_prepare(), trying to grab the
prepare_lock waiting on the first thread.
This is a classic ABBA deadlock. To properly fix the deadlock, we must
never runtime PM resume or suspend a device with the clk prepare_lock
held. Actually doing that is near impossible today because the global
prepare_lock would have to be dropped in the middle of the tree, the
device runtime PM resumed/suspended, and then the prepare_lock grabbed
again to ensure consistency of the clk tree topology. If anything
changes with the clk tree in the meantime, we've lost and will need to
start the operation all over again.
Luckily, most of the time we're simply incrementing or decrementing the
runtime PM count on an active device, so we don't have the chance to
schedule away with the prepare_lock held. Let's fix this immediate
problem that can be
---truncated---
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ftrace: Add cond_resched() to ftrace_graph_set_hash()
When the kernel contains a large number of functions that can be traced,
the loop in ftrace_graph_set_hash() may take a lot of time to execute.
This may trigger the softlockup watchdog.
Add cond_resched() within the loop to allow the kernel to remain
responsive even when processing a large number of functions.
This matches the cond_resched() that is used in other location ...
In the Linux kernel, the following vulnerability has been resolved:
ftrace: Add cond_resched() to ftrace_graph_set_hash()
When the kernel contains a large number of functions that can be traced,
the loop in ftrace_graph_set_hash() may take a lot of time to execute.
This may trigger the softlockup watchdog.
Add cond_resched() within the loop to allow the kernel to remain
responsive even when processing a large number of functions.
This matches the cond_resched() that is used in other locations of the
code that iterates over all functions that can be traced.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
__legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock
... or we risk stealing final mntput from sync umount - raising mnt_count
after umount(2) has verified that victim is not busy, but before it
has set MNT_SYNC_UMOUNT; in that case __legitimize_mnt() doesn't see
that it's safe to quietly undo mnt_count increment and leaves dropping
the reference to caller, where it'll be a full-blown mntput().
Check under ...
In the Linux kernel, the following vulnerability has been resolved:
__legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock
... or we risk stealing final mntput from sync umount - raising mnt_count
after umount(2) has verified that victim is not busy, but before it
has set MNT_SYNC_UMOUNT; in that case __legitimize_mnt() doesn't see
that it's safe to quietly undo mnt_count increment and leaves dropping
the reference to caller, where it'll be a full-blown mntput().
Check under mount_lock is needed; leaving the current one done before
taking that makes no sense - it's nowhere near common enough to bother
with.
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_fifo
Prevent st_lsm6dsx_read_fifo from falling in an infinite loop in case
pattern_len is equal to zero and the device FIFO is not empty.
|
|
In the Linux kernel, the following vulnerability has been resolved:
iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_tagged_fifo
Prevent st_lsm6dsx_read_tagged_fifo from falling in an infinite loop in
case pattern_len is equal to zero and the device FIFO is not empty.
|
In the Linux kernel, the following vulnerability has been resolved:
iio: light: opt3001: fix deadlock due to concurrent flag access
The threaded IRQ function in this driver is reading the flag twice: once to
lock a mutex and once to unlock it. Even though the code setting the flag
is designed to prevent it, there are subtle cases where the flag could be
true at the mutex_lock stage and false at the mutex_unlock stage. This
results in the mutex not being unlocked, resulting in a deadlock.
Fix ...
In the Linux kernel, the following vulnerability has been resolved:
iio: light: opt3001: fix deadlock due to concurrent flag access
The threaded IRQ function in this driver is reading the flag twice: once to
lock a mutex and once to unlock it. Even though the code setting the flag
is designed to prevent it, there are subtle cases where the flag could be
true at the mutex_lock stage and false at the mutex_unlock stage. This
results in the mutex not being unlocked, resulting in a deadlock.
Fix it by making the opt3001_irq() code generally more robust, reading the
flag into a variable and using the variable value at both stages.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: ucsi: displayport: Fix deadlock
This patch introduces the ucsi_con_mutex_lock / ucsi_con_mutex_unlock
functions to the UCSI driver. ucsi_con_mutex_lock ensures the connector
mutex is only locked if a connection is established and the partner pointer
is valid. This resolves a deadlock scenario where
ucsi_displayport_remove_partner holds con->mutex waiting for
dp_altmode_work to complete while dp_altmode_work attempt ...
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: ucsi: displayport: Fix deadlock
This patch introduces the ucsi_con_mutex_lock / ucsi_con_mutex_unlock
functions to the UCSI driver. ucsi_con_mutex_lock ensures the connector
mutex is only locked if a connection is established and the partner pointer
is valid. This resolves a deadlock scenario where
ucsi_displayport_remove_partner holds con->mutex waiting for
dp_altmode_work to complete while dp_altmode_work attempts to acquire it.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
netfilter: ipset: fix region locking in hash types
Region locking introduced in v5.6-rc4 contained three macros to handle
the region locks: ahash_bucket_start(), ahash_bucket_end() which gave
back the start and end hash bucket values belonging to a given region
lock and ahash_region() which should give back the region lock belonging
to a given hash bucket. The latter was incorrect which can lead to a
race condition between the ...
In the Linux kernel, the following vulnerability has been resolved:
netfilter: ipset: fix region locking in hash types
Region locking introduced in v5.6-rc4 contained three macros to handle
the region locks: ahash_bucket_start(), ahash_bucket_end() which gave
back the start and end hash bucket values belonging to a given region
lock and ahash_region() which should give back the region lock belonging
to a given hash bucket. The latter was incorrect which can lead to a
race condition between the garbage collector and adding new elements
when a hash type of set is defined with timeouts.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
Input: gpio-keys - fix a sleep while atomic with PREEMPT_RT
When enabling PREEMPT_RT, the gpio_keys_irq_timer() callback runs in
hard irq context, but the input_event() takes a spin_lock, which isn't
allowed there as it is converted to a rt_spin_lock().
[ 4054.289999] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 4054.290028] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, n ...
In the Linux kernel, the following vulnerability has been resolved:
Input: gpio-keys - fix a sleep while atomic with PREEMPT_RT
When enabling PREEMPT_RT, the gpio_keys_irq_timer() callback runs in
hard irq context, but the input_event() takes a spin_lock, which isn't
allowed there as it is converted to a rt_spin_lock().
[ 4054.289999] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 4054.290028] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
...
[ 4054.290195] __might_resched+0x13c/0x1f4
[ 4054.290209] rt_spin_lock+0x54/0x11c
[ 4054.290219] input_event+0x48/0x80
[ 4054.290230] gpio_keys_irq_timer+0x4c/0x78
[ 4054.290243] __hrtimer_run_queues+0x1a4/0x438
[ 4054.290257] hrtimer_interrupt+0xe4/0x240
[ 4054.290269] arch_timer_handler_phys+0x2c/0x44
[ 4054.290283] handle_percpu_devid_irq+0x8c/0x14c
[ 4054.290297] handle_irq_desc+0x40/0x58
[ 4054.290307] generic_handle_domain_irq+0x1c/0x28
[ 4054.290316] gic_handle_irq+0x44/0xcc
Considering the gpio_keys_irq_isr() can run in any context, e.g. it can
be threaded, it seems there's no point in requesting the timer isr to
run in hard irq context.
Relax the hrtimer not to use the hard context.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do
not have a de ...
In the Linux kernel, the following vulnerability has been resolved:
net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do
not have a deadlock anymore.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
tcp: correct handling of extreme memory squeeze
Testing with iperf3 using the "pasta" protocol splicer has revealed
a problem in the way tcp handles window advertising in extreme memory
squeeze situations.
Under memory pressure, a socket endpoint may temporarily advertise
a zero-sized window, but this is not stored as part of the socket data.
The reasoning behind this is that it is considered a temporary setting
which shouldn ...
In the Linux kernel, the following vulnerability has been resolved:
tcp: correct handling of extreme memory squeeze
Testing with iperf3 using the "pasta" protocol splicer has revealed
a problem in the way tcp handles window advertising in extreme memory
squeeze situations.
Under memory pressure, a socket endpoint may temporarily advertise
a zero-sized window, but this is not stored as part of the socket data.
The reasoning behind this is that it is considered a temporary setting
which shouldn't influence any further calculations.
However, if we happen to stall at an unfortunate value of the current
window size, the algorithm selecting a new value will consistently fail
to advertise a non-zero window once we have freed up enough memory.
This means that this side's notion of the current window size is
different from the one last advertised to the peer, causing the latter
to not send any data to resolve the sitution.
The problem occurs on the iperf3 server side, and the socket in question
is a completely regular socket with the default settings for the
fedora40 kernel. We do not use SO_PEEK or SO_RCVBUF on the socket.
The following excerpt of a logging session, with own comments added,
shows more in detail what is happening:
// tcp_v4_rcv(->)
// tcp_rcv_established(->)
[5201<->39222]: ==== Activating log @ net/ipv4/tcp_input.c/tcp_data_queue()/5257 ====
[5201<->39222]: tcp_data_queue(->)
[5201<->39222]: DROPPING skb [265600160..265665640], reason: SKB_DROP_REASON_PROTO_MEM
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 259909392->260034360 (124968), unread 5565800, qlen 85, ofoq 0]
[OFO queue: gap: 65480, len: 0]
[5201<->39222]: tcp_data_queue(<-)
[5201<->39222]: __tcp_transmit_skb(->)
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: tcp_select_window(->)
[5201<->39222]: (inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM) ? --> TRUE
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
returning 0
[5201<->39222]: tcp_select_window(<-)
[5201<->39222]: ADVERTISING WIN 0, ACK_SEQ: 265600160
[5201<->39222]: [__tcp_transmit_skb(<-)
[5201<->39222]: tcp_rcv_established(<-)
[5201<->39222]: tcp_v4_rcv(<-)
// Receive queue is at 85 buffers and we are out of memory.
// We drop the incoming buffer, although it is in sequence, and decide
// to send an advertisement with a window of zero.
// We don't update tp->rcv_wnd and tp->rcv_wup accordingly, which means
// we unconditionally shrink the window.
[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]: __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]: [new_win = 0, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]: [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]: NOT calling tcp_send_ack()
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: __tcp_cleanup_rbuf(<-)
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 260040464->260040464 (0), unread 5559696, qlen 85, ofoq 0]
returning 6104 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)
// After each read, the algorithm for calculating the new receive
// window in __tcp_cleanup_rbuf() finds it is too small to advertise
// or to update tp->rcv_wnd.
// Meanwhile, the peer thinks the window is zero, and will not send
// any more data to trigger an update from the interrupt mode side.
[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]: __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]: [new_win = 262144, win_now = 131184, 2 * win_n
---truncated---
Show More
|
A post-authentication flaw in the network two-phase commit protocol used for cross-shard transactions in MongoDB Server may lead to logical data inconsistencies under specific conditions which are not predictable and exist for a very short period of time. This error can cause the transaction coordination logic to misinterpret the transaction as committed, resulting in inconsistent state on those shards. This may lead to low integrity and availability impact.
This issue impacts MongoDB Server v8 ...
A post-authentication flaw in the network two-phase commit protocol used for cross-shard transactions in MongoDB Server may lead to logical data inconsistencies under specific conditions which are not predictable and exist for a very short period of time. This error can cause the transaction coordination logic to misinterpret the transaction as committed, resulting in inconsistent state on those shards. This may lead to low integrity and availability impact.
This issue impacts MongoDB Server v8.0 versions prior to 8.0.16, MongoDB Server v7.0 versions prior to 7.0.26 and MongoDB server v8.2 versions prior to 8.2.2.
Show More
|
|
A flaw was found in the X server's request handling. Non-zero 'bytes to ignore' in a client's request can cause the server to skip processing another client's request, potentially leading to a denial of service.
|
|
In processLaunchBrowser of CommandParamsFactory.java, there is a possible browser interaction from the lockscreen due to improper locking. This could lead to physical escalation of privilege with no additional execution privileges needed. User interaction is not needed for exploitation.
|
In the Linux kernel, the following vulnerability has been resolved:
net: hinic: avoid kernel hung in hinic_get_stats64()
When using hinic device as a bond slave device, and reading device stats
of master bond device, the kernel may hung.
The kernel panic calltrace as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4 ...
In the Linux kernel, the following vulnerability has been resolved:
net: hinic: avoid kernel hung in hinic_get_stats64()
When using hinic device as a bond slave device, and reading device stats
of master bond device, the kernel may hung.
The kernel panic calltrace as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the calltrace of task that actually caused kernel hungs as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from bond, kernel will call bond_get_stats().
It first holds the spinlock bond->stats_lock, and then call
hinic_get_stats64() to collect hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, which may schedule current task out.
And if system is under high pressure, the task cannot be woken up
immediately, which eventually triggers kernel hung panic.
Since previous patch has replaced hinic_dev.tx_stats/rx_stats with local
variable in hinic_get_stats64(), there is nothing need to be protected
by lock, so just removing down()/up() is ok.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/mdp5: Fix global state lock backoff
We need to grab the lock after the early return for !hwpipe case.
Otherwise, we could have hit contention yet still returned 0.
Fixes an issue that the new CONFIG_DRM_DEBUG_MODESET_LOCK stuff flagged
in CI:
WARNING: CPU: 0 PID: 282 at drivers/gpu/drm/drm_modeset_lock.c:296 drm_modeset_lock+0xf8/0x154
Modules linked in:
CPU: 0 PID: 282 Comm: kms_cursor_lega Tainted: G ...
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/mdp5: Fix global state lock backoff
We need to grab the lock after the early return for !hwpipe case.
Otherwise, we could have hit contention yet still returned 0.
Fixes an issue that the new CONFIG_DRM_DEBUG_MODESET_LOCK stuff flagged
in CI:
WARNING: CPU: 0 PID: 282 at drivers/gpu/drm/drm_modeset_lock.c:296 drm_modeset_lock+0xf8/0x154
Modules linked in:
CPU: 0 PID: 282 Comm: kms_cursor_lega Tainted: G W 5.19.0-rc2-15930-g875cc8bc536a #1
Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : drm_modeset_lock+0xf8/0x154
lr : drm_atomic_get_private_obj_state+0x84/0x170
sp : ffff80000cfab6a0
x29: ffff80000cfab6a0 x28: 0000000000000000 x27: ffff000083bc4d00
x26: 0000000000000038 x25: 0000000000000000 x24: ffff80000957ca58
x23: 0000000000000000 x22: ffff000081ace080 x21: 0000000000000001
x20: ffff000081acec18 x19: ffff80000cfabb80 x18: 0000000000000038
x17: 0000000000000000 x16: 0000000000000000 x15: fffffffffffea0d0
x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47
x11: ffff80000a386aa8 x10: 0000000000000029 x9 : ffff80000cfab610
x8 : 0000000000000029 x7 : 0000000000000014 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffff8000081ad904 x3 : 0000000000000029
x2 : ffff0000801db4c0 x1 : ffff80000cfabb80 x0 : ffff000081aceb58
Call trace:
drm_modeset_lock+0xf8/0x154
drm_atomic_get_private_obj_state+0x84/0x170
mdp5_get_global_state+0x54/0x6c
mdp5_pipe_release+0x2c/0xd4
mdp5_plane_atomic_check+0x2ec/0x414
drm_atomic_helper_check_planes+0xd8/0x210
drm_atomic_helper_check+0x54/0xb0
...
---[ end trace 0000000000000000 ]---
drm_modeset_lock attempting to lock a contended lock without backoff:
drm_modeset_lock+0x148/0x154
mdp5_get_global_state+0x30/0x6c
mdp5_pipe_release+0x2c/0xd4
mdp5_plane_atomic_check+0x290/0x414
drm_atomic_helper_check_planes+0xd8/0x210
drm_atomic_helper_check+0x54/0xb0
drm_atomic_check_only+0x4b0/0x8f4
drm_atomic_commit+0x68/0xe0
Patchwork: https://patchwork.freedesktop.org/patch/492701/
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
net: hibmcge: fix rtnl deadlock issue
Currently, the hibmcge netdev acquires the rtnl_lock in
pci_error_handlers.reset_prepare() and releases it in
pci_error_handlers.reset_done().
However, in the PCI framework:
pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked -
pci_dev_save_and_disable - err_handler->reset_prepare(dev);
In pci_slot_save_and_disable_locked():
list_for_each_entry(dev, &slot->bus->devices, ...
In the Linux kernel, the following vulnerability has been resolved:
net: hibmcge: fix rtnl deadlock issue
Currently, the hibmcge netdev acquires the rtnl_lock in
pci_error_handlers.reset_prepare() and releases it in
pci_error_handlers.reset_done().
However, in the PCI framework:
pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked -
pci_dev_save_and_disable - err_handler->reset_prepare(dev);
In pci_slot_save_and_disable_locked():
list_for_each_entry(dev, &slot->bus->devices, bus_list) {
if (!dev->slot || dev->slot!= slot)
continue;
pci_dev_save_and_disable(dev);
if (dev->subordinate)
pci_bus_save_and_disable_locked(dev->subordinate);
}
This will iterate through all devices under the current bus and execute
err_handler->reset_prepare(), causing two devices of the hibmcge driver
to sequentially request the rtnl_lock, leading to a deadlock.
Since the driver now executes netif_device_detach()
before the reset process, it will not concurrently with
other netdev APIs, so there is no need to hold the rtnl_lock now.
Therefore, this patch removes the rtnl_lock during the reset process and
adjusts the position of HBG_NIC_STATE_RESETTING to ensure
that multiple resets are not executed concurrently.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
media: mt9m114: Fix deadlock in get_frame_interval/set_frame_interval
Getting / Setting the frame interval using the V4L2 subdev pad ops
get_frame_interval/set_frame_interval causes a deadlock, as the
subdev state is locked in the [1] but also in the driver itself.
In [2] it's described that the caller is responsible to acquire and
release the lock in this case. Therefore, acquiring the lock in the
driver is wrong.
Remove th ...
In the Linux kernel, the following vulnerability has been resolved:
media: mt9m114: Fix deadlock in get_frame_interval/set_frame_interval
Getting / Setting the frame interval using the V4L2 subdev pad ops
get_frame_interval/set_frame_interval causes a deadlock, as the
subdev state is locked in the [1] but also in the driver itself.
In [2] it's described that the caller is responsible to acquire and
release the lock in this case. Therefore, acquiring the lock in the
driver is wrong.
Remove the lock acquisitions/releases from mt9m114_ifp_get_frame_interval()
and mt9m114_ifp_set_frame_interval().
[1] drivers/media/v4l2-core/v4l2-subdev.c - line 1129
[2] Documentation/driver-api/media/v4l2-subdev.rst
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Optimize module load time by optimizing PLT/GOT counting
When enabling CONFIG_KASAN, CONFIG_PREEMPT_VOLUNTARY_BUILD and
CONFIG_PREEMPT_VOLUNTARY at the same time, there will be soft deadlock,
the relevant logs are as follows:
rcu: INFO: rcu_sched self-detected stall on CPU
...
Call Trace:
[<900000000024f9e4>] show_stack+0x5c/0x180
[<90000000002482f4>] dump_stack_lvl+0x94/0xbc
[<9000000000224544>] rcu_dump_cpu_stack ...
In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Optimize module load time by optimizing PLT/GOT counting
When enabling CONFIG_KASAN, CONFIG_PREEMPT_VOLUNTARY_BUILD and
CONFIG_PREEMPT_VOLUNTARY at the same time, there will be soft deadlock,
the relevant logs are as follows:
rcu: INFO: rcu_sched self-detected stall on CPU
...
Call Trace:
[<900000000024f9e4>] show_stack+0x5c/0x180
[<90000000002482f4>] dump_stack_lvl+0x94/0xbc
[<9000000000224544>] rcu_dump_cpu_stacks+0x1fc/0x280
[<900000000037ac80>] rcu_sched_clock_irq+0x720/0xf88
[<9000000000396c34>] update_process_times+0xb4/0x150
[<90000000003b2474>] tick_nohz_handler+0xf4/0x250
[<9000000000397e28>] __hrtimer_run_queues+0x1d0/0x428
[<9000000000399b2c>] hrtimer_interrupt+0x214/0x538
[<9000000000253634>] constant_timer_interrupt+0x64/0x80
[<9000000000349938>] __handle_irq_event_percpu+0x78/0x1a0
[<9000000000349a78>] handle_irq_event_percpu+0x18/0x88
[<9000000000354c00>] handle_percpu_irq+0x90/0xf0
[<9000000000348c74>] handle_irq_desc+0x94/0xb8
[<9000000001012b28>] handle_cpu_irq+0x68/0xa0
[<9000000001def8c0>] handle_loongarch_irq+0x30/0x48
[<9000000001def958>] do_vint+0x80/0xd0
[<9000000000268a0c>] kasan_mem_to_shadow.part.0+0x2c/0x2a0
[<90000000006344f4>] __asan_load8+0x4c/0x120
[<900000000025c0d0>] module_frob_arch_sections+0x5c8/0x6b8
[<90000000003895f0>] load_module+0x9e0/0x2958
[<900000000038b770>] __do_sys_init_module+0x208/0x2d0
[<9000000001df0c34>] do_syscall+0x94/0x190
[<900000000024d6fc>] handle_syscall+0xbc/0x158
After analysis, this is because the slow speed of loading the amdgpu
module leads to the long time occupation of the cpu and then the soft
deadlock.
When loading a module, module_frob_arch_sections() tries to figure out
the number of PLTs/GOTs that will be needed to handle all the RELAs. It
will call the count_max_entries() to find in an out-of-order date which
counting algorithm has O(n^2) complexity.
To make it faster, we sort the relocation list by info and addend. That
way, to check for a duplicate relocation, it just needs to compare with
the previous entry. This reduces the complexity of the algorithm to O(n
log n), as done in commit d4e0340919fb ("arm64/module: Optimize module
load time by optimizing PLT counting"). This gives sinificant reduction
in module load time for modules with large number of relocations.
After applying this patch, the soft deadlock problem has been solved,
and the kernel starts normally without "Call Trace".
Using the default configuration to test some modules, the results are as
follows:
Module Size
ip_tables 36K
fat 143K
radeon 2.5MB
amdgpu 16MB
Without this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 54 1221/84
amdgpu 1411 4525/1098
With this patch:
Module Module load time (ms) Count(PLTs/GOTs)
ip_tables 18 59/6
fat 0 162/14
radeon 22 1221/84
amdgpu 45 4525/1098
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Fix lockdep warning during rmmod
The commit under the Fixes tag added a netdev_assert_locked() in
bnxt_free_ntp_fltrs(). The lock should be held during normal run-time
but the assert will be triggered (see below) during bnxt_remove_one()
which should not need the lock. The netdev is already unregistered by
then. Fix it by calling netdev_assert_locked_or_invisible() which will
not assert if the netdev is unregistere ...
In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Fix lockdep warning during rmmod
The commit under the Fixes tag added a netdev_assert_locked() in
bnxt_free_ntp_fltrs(). The lock should be held during normal run-time
but the assert will be triggered (see below) during bnxt_remove_one()
which should not need the lock. The netdev is already unregistered by
then. Fix it by calling netdev_assert_locked_or_invisible() which will
not assert if the netdev is unregistered.
WARNING: CPU: 5 PID: 2241 at ./include/net/netdev_lock.h:17 bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en]
Modules linked in: rpcrdma rdma_cm iw_cm ib_cm configfs ib_core bnxt_en(-) bridge stp llc x86_pkg_temp_thermal xfs tg3 [last unloaded: bnxt_re]
CPU: 5 UID: 0 PID: 2241 Comm: rmmod Tainted: G S W 6.16.0 #2 PREEMPT(voluntary)
Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en]
Code: 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 47 60 be ff ff ff ff 48 8d b8 28 0c 00 00 e8 d0 cf 41 c3 85 c0 0f 85 2e ff ff ff <0f> 0b e9 27 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffa92082387da0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9e5b593d8000 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffffffff83dc9a70 RDI: ffffffff83e1a1cf
RBP: ffff9e5b593d8c80 R08: 0000000000000000 R09: ffffffff8373a2b3
R10: 000000008100009f R11: 0000000000000001 R12: 0000000000000001
R13: ffffffffc01c4478 R14: dead000000000122 R15: dead000000000100
FS: 00007f3a8a52c740(0000) GS:ffff9e631ad1c000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055bb289419c8 CR3: 000000011274e001 CR4: 00000000003706f0
Call Trace:
<TASK>
bnxt_remove_one+0x57/0x180 [bnxt_en]
pci_device_remove+0x39/0xc0
device_release_driver_internal+0xa5/0x130
driver_detach+0x42/0x90
bus_remove_driver+0x61/0xc0
pci_unregister_driver+0x38/0x90
bnxt_exit+0xc/0x7d0 [bnxt_en]
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
dm: dm-crypt: Do not partially accept write BIOs with zoned targets
Read and write operations issued to a dm-crypt target may be split
according to the dm-crypt internal limits defined by the max_read_size
and max_write_size module parameters (default is 128 KB). The intent is
to improve processing time of large BIOs by splitting them into smaller
operations that can be parallelized on different CPUs.
For zoned dm-crypt targe ...
In the Linux kernel, the following vulnerability has been resolved:
dm: dm-crypt: Do not partially accept write BIOs with zoned targets
Read and write operations issued to a dm-crypt target may be split
according to the dm-crypt internal limits defined by the max_read_size
and max_write_size module parameters (default is 128 KB). The intent is
to improve processing time of large BIOs by splitting them into smaller
operations that can be parallelized on different CPUs.
For zoned dm-crypt targets, this BIO splitting is still done but without
the parallel execution to ensure that the issuing order of write
operations to the underlying devices remains sequential. However, the
splitting itself causes other problems:
1) Since dm-crypt relies on the block layer zone write plugging to
handle zone append emulation using regular write operations, the
reminder of a split write BIO will always be plugged into the target
zone write plugged. Once the on-going write BIO finishes, this
reminder BIO is unplugged and issued from the zone write plug work.
If this reminder BIO itself needs to be split, the reminder will be
re-issued and plugged again, but that causes a call to a
blk_queue_enter(), which may block if a queue freeze operation was
initiated. This results in a deadlock as DM submission still holds
BIOs that the queue freeze side is waiting for.
2) dm-crypt relies on the emulation done by the block layer using
regular write operations for processing zone append operations. This
still requires to properly return the written sector as the BIO
sector of the original BIO. However, this can be done correctly only
and only if there is a single clone BIO used for processing the
original zone append operation issued by the user. If the size of a
zone append operation is larger than dm-crypt max_write_size, then
the orginal BIO will be split and processed as a chain of regular
write operations. Such chaining result in an incorrect written sector
being returned to the zone append issuer using the original BIO
sector. This in turn results in file system data corruptions using
xfs or btrfs.
Fix this by modifying get_max_request_size() to always return the size
of the BIO to avoid it being split with dm_accpet_partial_bio() in
crypt_map(). get_max_request_size() is renamed to
get_max_request_sectors() to clarify the unit of the value returned
and its interface is changed to take a struct dm_target pointer and a
pointer to the struct bio being processed. In addition to this change,
to ensure that crypt_alloc_buffer() works correctly, set the dm-crypt
device max_hw_sectors limit to be at most
BIO_MAX_VECS << PAGE_SECTORS_SHIFT (1 MB with a 4KB page architecture).
This forces DM core to split write BIOs before passing them to
crypt_map(), and thus guaranteeing that dm-crypt can always accept an
entire write BIO without needing to split it.
This change does not have any effect on the read path of dm-crypt. Read
operations can still be split and the BIO fragments processed in
parallel. There is also no impact on the performance of the write path
given that all zone write BIOs were already processed inline instead of
in parallel.
This change also does not affect in any way regular dm-crypt block
devices.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when releasing mids
All release_mid() callers seem to hold a reference of @mid so there is
no need to call kref_put(&mid->refcount, __release_mid) under
@server->mid_lock spinlock. If they don't, then an use-after-free bug
would have occurred anyways.
By getting rid of such spinlock also fixes a potential deadlock as
shown below
CPU 0 CPU 1
----------------- ...
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when releasing mids
All release_mid() callers seem to hold a reference of @mid so there is
no need to call kref_put(&mid->refcount, __release_mid) under
@server->mid_lock spinlock. If they don't, then an use-after-free bug
would have occurred anyways.
By getting rid of such spinlock also fixes a potential deadlock as
shown below
CPU 0 CPU 1
------------------------------------------------------------------
cifs_demultiplex_thread() cifs_debug_data_proc_show()
release_mid()
spin_lock(&server->mid_lock);
spin_lock(&cifs_tcp_ses_lock)
spin_lock(&server->mid_lock)
__release_mid()
smb2_find_smb_tcon()
spin_lock(&cifs_tcp_ses_lock) *deadlock*
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid deadlock in fs reclaim with page writeback
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to
avoid races with switching of journalled data flag or inode format. This
lock can however cause a deadlock like:
CPU0 CPU1
ext4_writepages()
percpu_down_read(sbi->s_writepages_rwsem);
ext4_change_inode_journal_flag()
...
In the Linux kernel, the following vulnerability has been resolved:
ext4: avoid deadlock in fs reclaim with page writeback
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to
avoid races with switching of journalled data flag or inode format. This
lock can however cause a deadlock like:
CPU0 CPU1
ext4_writepages()
percpu_down_read(sbi->s_writepages_rwsem);
ext4_change_inode_journal_flag()
percpu_down_write(sbi->s_writepages_rwsem);
- blocks, all readers block from now on
ext4_do_writepages()
ext4_init_io_end()
kmem_cache_zalloc(io_end_cachep, GFP_KERNEL)
fs_reclaim frees dentry...
dentry_unlink_inode()
iput() - last ref =>
iput_final() - inode dirty =>
write_inode_now()...
ext4_writepages() tries to acquire sbi->s_writepages_rwsem
and blocks forever
Make sure we cannot recurse into filesystem reclaim from writeback code
to avoid the deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
md/raid10: prevent soft lockup while flush writes
Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.
Follow up soft lockup can be triggered easily with writeback test for
raid10 with ramdisks:
watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
<TASK>
call_rcu+0x16/0x20
put_ ...
In the Linux kernel, the following vulnerability has been resolved:
md/raid10: prevent soft lockup while flush writes
Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.
Follow up soft lockup can be triggered easily with writeback test for
raid10 with ramdisks:
watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
<TASK>
call_rcu+0x16/0x20
put_object+0x41/0x80
__delete_object+0x50/0x90
delete_object_full+0x2b/0x40
kmemleak_free+0x46/0xa0
slab_free_freelist_hook.constprop.0+0xed/0x1a0
kmem_cache_free+0xfd/0x300
mempool_free_slab+0x1f/0x30
mempool_free+0x3a/0x100
bio_free+0x59/0x80
bio_put+0xcf/0x2c0
free_r10bio+0xbf/0xf0
raid_end_bio_io+0x78/0xb0
one_write_done+0x8a/0xa0
raid10_end_write_request+0x1b4/0x430
bio_endio+0x175/0x320
brd_submit_bio+0x3b9/0x9b7 [brd]
__submit_bio+0x69/0xe0
submit_bio_noacct_nocheck+0x1e6/0x5a0
submit_bio_noacct+0x38c/0x7e0
flush_pending_writes+0xf0/0x240
raid10d+0xac/0x1ed0
Fix the problem by adding cond_resched() to raid10 like what raid1 did.
Note that unlimited plugged bio still need to be optimized, for example,
in the case of lots of dirty pages writeback, this will take lots of
memory and io will spend a long time in plug, hence io latency is bad.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
e1000: Move cancel_work_sync to avoid deadlock
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.
As reported by users and syzbot, a deadlock is possible in the following
scenario:
CPU 0:
- RTNL is held
- e1000_close
- e1000_down
- cancel_work_sync (cancel / wait for e1000_reset_task())
CPU 1:
- process_one_work
- e1000_reset_task
- take RTNL
T ...
In the Linux kernel, the following vulnerability has been resolved:
e1000: Move cancel_work_sync to avoid deadlock
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.
As reported by users and syzbot, a deadlock is possible in the following
scenario:
CPU 0:
- RTNL is held
- e1000_close
- e1000_down
- cancel_work_sync (cancel / wait for e1000_reset_task())
CPU 1:
- process_one_work
- e1000_reset_task
- take RTNL
To remedy this, avoid calling cancel_work_sync from e1000_down
(e1000_reset_task does nothing if the device is down anyway). Instead,
call cancel_work_sync for e1000_reset_task when the device is being
removed.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when reconnecting channels
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock: ...
In the Linux kernel, the following vulnerability has been resolved:
smb: client: fix potential deadlock when reconnecting channels
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order
and prevent the following deadlock from happening
======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc3-build2+ #1301 Tainted: G S W
------------------------------------------------------
cifsd/6055 is trying to acquire lock:
ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200
but task is already holding lock:
ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&ret_buf->chan_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_setup_session+0x81/0x4b0
cifs_get_smb_ses+0x771/0x900
cifs_mount_get_session+0x7e/0x170
cifs_mount+0x92/0x2d0
cifs_smb3_do_mount+0x161/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&ret_buf->ses_lock){+.+.}-{3:3}:
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_match_super+0x101/0x320
sget+0xab/0x270
cifs_smb3_do_mount+0x1e0/0x460
smb3_get_tree+0x55/0x90
vfs_get_tree+0x46/0x180
do_new_mount+0x1b0/0x2e0
path_mount+0x6ee/0x740
do_mount+0x98/0xe0
__do_sys_mount+0x148/0x180
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}:
check_noncircular+0x95/0xc0
check_prev_add+0x115/0x2f0
validate_chain+0x1cf/0x270
__lock_acquire+0x60e/0x780
lock_acquire.part.0+0xb4/0x1f0
_raw_spin_lock+0x2f/0x40
cifs_signal_cifsd_for_reconnect+0x134/0x200
__cifs_reconnect+0x8f/0x500
cifs_handle_standard+0x112/0x280
cifs_demultiplex_thread+0x64d/0xbc0
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
other info that might help us debug this:
Chain exists of:
&tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ret_buf->chan_lock);
lock(&ret_buf->ses_lock);
lock(&ret_buf->chan_lock);
lock(&tcp_ses->srv_lock);
*** DEADLOCK ***
3 locks held by cifsd/6055:
#0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200
#1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200
#2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
af_packet: move notifier's packet_dev_mc out of rcu critical section
Syzkaller reports the following issue:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
__mutex_lock+0x106/0xe80 kernel/locking/mutex.c:746
team_change_rx_flags+0x38/0x220 drivers/net/team/team_core.c:1781
dev_change_rx_flags net/core/dev.c:9145 [inline]
__dev_set_promiscuity+0x3f8/0x590 net/core/dev.c:9189
netif_set_pro ...
In the Linux kernel, the following vulnerability has been resolved:
af_packet: move notifier's packet_dev_mc out of rcu critical section
Syzkaller reports the following issue:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
__mutex_lock+0x106/0xe80 kernel/locking/mutex.c:746
team_change_rx_flags+0x38/0x220 drivers/net/team/team_core.c:1781
dev_change_rx_flags net/core/dev.c:9145 [inline]
__dev_set_promiscuity+0x3f8/0x590 net/core/dev.c:9189
netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286 packet_dev_mc net/packet/af_packet.c:3698 [inline]
packet_dev_mclist_delete net/packet/af_packet.c:3722 [inline]
packet_notifier+0x292/0xa60 net/packet/af_packet.c:4247
notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
call_netdevice_notifiers_extack net/core/dev.c:2214 [inline]
call_netdevice_notifiers net/core/dev.c:2228 [inline]
unregister_netdevice_many_notify+0x15d8/0x2330 net/core/dev.c:11972
rtnl_delete_link net/core/rtnetlink.c:3522 [inline]
rtnl_dellink+0x488/0x710 net/core/rtnetlink.c:3564
rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6955
netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
Calling `PACKET_ADD_MEMBERSHIP` on an ops-locked device can trigger
the `NETDEV_UNREGISTER` notifier, which may require disabling promiscuous
and/or allmulti mode. Both of these operations require acquiring
the netdev instance lock.
Move the call to `packet_dev_mc` outside of the RCU critical section.
The `mclist` modifications (add, del, flush, unregister) are protected by
the RTNL, not the RCU. The RCU only protects the `sklist` and its
associated `sks`. The delayed operation on the `mclist` entry remains
within the RTNL.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: tcpm: move tcpm_queue_vdm_unlocked to asynchronous work
A state check was previously added to tcpm_queue_vdm_unlocked to
prevent a deadlock where the DisplayPort Alt Mode driver would be
executing work and attempting to grab the tcpm_lock while the TCPM
was holding the lock and attempting to unregister the altmode, blocking
on the altmode driver's cancel_work_sync call.
Because the state check isn't protected, the ...
In the Linux kernel, the following vulnerability has been resolved:
usb: typec: tcpm: move tcpm_queue_vdm_unlocked to asynchronous work
A state check was previously added to tcpm_queue_vdm_unlocked to
prevent a deadlock where the DisplayPort Alt Mode driver would be
executing work and attempting to grab the tcpm_lock while the TCPM
was holding the lock and attempting to unregister the altmode, blocking
on the altmode driver's cancel_work_sync call.
Because the state check isn't protected, there is a small window
where the Alt Mode driver could determine that the TCPM is
in a ready state and attempt to grab the lock while the
TCPM grabs the lock and changes the TCPM state to one that
causes the deadlock. The callstack is provided below:
[110121.667392][ C7] Call trace:
[110121.667396][ C7] __switch_to+0x174/0x338
[110121.667406][ C7] __schedule+0x608/0x9f0
[110121.667414][ C7] schedule+0x7c/0xe8
[110121.667423][ C7] kernfs_drain+0xb0/0x114
[110121.667431][ C7] __kernfs_remove+0x16c/0x20c
[110121.667436][ C7] kernfs_remove_by_name_ns+0x74/0xe8
[110121.667442][ C7] sysfs_remove_group+0x84/0xe8
[110121.667450][ C7] sysfs_remove_groups+0x34/0x58
[110121.667458][ C7] device_remove_groups+0x10/0x20
[110121.667464][ C7] device_release_driver_internal+0x164/0x2e4
[110121.667475][ C7] device_release_driver+0x18/0x28
[110121.667484][ C7] bus_remove_device+0xec/0x118
[110121.667491][ C7] device_del+0x1e8/0x4ac
[110121.667498][ C7] device_unregister+0x18/0x38
[110121.667504][ C7] typec_unregister_altmode+0x30/0x44
[110121.667515][ C7] tcpm_reset_port+0xac/0x370
[110121.667523][ C7] tcpm_snk_detach+0x84/0xb8
[110121.667529][ C7] run_state_machine+0x4c0/0x1b68
[110121.667536][ C7] tcpm_state_machine_work+0x94/0xe4
[110121.667544][ C7] kthread_worker_fn+0x10c/0x244
[110121.667552][ C7] kthread+0x104/0x1d4
[110121.667557][ C7] ret_from_fork+0x10/0x20
[110121.667689][ C7] Workqueue: events dp_altmode_work
[110121.667697][ C7] Call trace:
[110121.667701][ C7] __switch_to+0x174/0x338
[110121.667710][ C7] __schedule+0x608/0x9f0
[110121.667717][ C7] schedule+0x7c/0xe8
[110121.667725][ C7] schedule_preempt_disabled+0x24/0x40
[110121.667733][ C7] __mutex_lock+0x408/0xdac
[110121.667741][ C7] __mutex_lock_slowpath+0x14/0x24
[110121.667748][ C7] mutex_lock+0x40/0xec
[110121.667757][ C7] tcpm_altmode_enter+0x78/0xb4
[110121.667764][ C7] typec_altmode_enter+0xdc/0x10c
[110121.667769][ C7] dp_altmode_work+0x68/0x164
[110121.667775][ C7] process_one_work+0x1e4/0x43c
[110121.667783][ C7] worker_thread+0x25c/0x430
[110121.667789][ C7] kthread+0x104/0x1d4
[110121.667794][ C7] ret_from_fork+0x10/0x20
Change tcpm_queue_vdm_unlocked to queue for tcpm_queue_vdm_work,
which can perform the state check while holding the TCPM lock
while the Alt Mode lock is no longer held. This requires a new
struct to hold the vdm data, altmode_vdm_event.
Show More
|