| CVE |
Vendors |
Products |
Updated |
CVSS v2 |
CVSS v3 |
In the Linux kernel, the following vulnerability has been resolved:
bpf: bpf_local_storage: Always use bpf_mem_alloc in PREEMPT_RT
In PREEMPT_RT, kmalloc(GFP_ATOMIC) is still not safe in non preemptible
context. bpf_mem_alloc must be used in PREEMPT_RT. This patch is
to enforce bpf_mem_alloc in the bpf_local_storage when CONFIG_PREEMPT_RT
is enabled.
[ 35.118559] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 35.118566] in_atomic(): 1, irqs_disable ...
In the Linux kernel, the following vulnerability has been resolved:
bpf: bpf_local_storage: Always use bpf_mem_alloc in PREEMPT_RT
In PREEMPT_RT, kmalloc(GFP_ATOMIC) is still not safe in non preemptible
context. bpf_mem_alloc must be used in PREEMPT_RT. This patch is
to enforce bpf_mem_alloc in the bpf_local_storage when CONFIG_PREEMPT_RT
is enabled.
[ 35.118559] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 35.118566] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1832, name: test_progs
[ 35.118569] preempt_count: 1, expected: 0
[ 35.118571] RCU nest depth: 1, expected: 1
[ 35.118577] INFO: lockdep is turned off.
...
[ 35.118647] __might_resched+0x433/0x5b0
[ 35.118677] rt_spin_lock+0xc3/0x290
[ 35.118700] ___slab_alloc+0x72/0xc40
[ 35.118723] __kmalloc_noprof+0x13f/0x4e0
[ 35.118732] bpf_map_kzalloc+0xe5/0x220
[ 35.118740] bpf_selem_alloc+0x1d2/0x7b0
[ 35.118755] bpf_local_storage_update+0x2fa/0x8b0
[ 35.118784] bpf_sk_storage_get_tracing+0x15a/0x1d0
[ 35.118791] bpf_prog_9a118d86fca78ebb_trace_inet_sock_set_state+0x44/0x66
[ 35.118795] bpf_trace_run3+0x222/0x400
[ 35.118820] __bpf_trace_inet_sock_set_state+0x11/0x20
[ 35.118824] trace_inet_sock_set_state+0x112/0x130
[ 35.118830] inet_sk_state_store+0x41/0x90
[ 35.118836] tcp_set_state+0x3b3/0x640
There is no need to adjust the gfp_flags passing to the
bpf_mem_cache_alloc_flags() which only honors the GFP_KERNEL.
The verifier has ensured GFP_KERNEL is passed only in sleepable context.
It has been an old issue since the first introduction of the
bpf_local_storage ~5 years ago, so this patch targets the bpf-next.
bpf_mem_alloc is needed to solve it, so the Fixes tag is set
to the commit when bpf_mem_alloc was first used in the bpf_local_storage.
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
media: uvcvideo: Fix deadlock during uvc_probe
If uvc_probe() fails, it can end up calling uvc_status_unregister() before
uvc_status_init() is called.
Fix this by checking if dev->status is NULL or not in
uvc_status_unregister().
|
In the Linux kernel, the following vulnerability has been resolved:
rhashtable: Fix potential deadlock by moving schedule_work outside lock
Move the hash table growth check and work scheduling outside the
rht lock to prevent a possible circular locking dependency.
The original implementation could trigger a lockdep warning due to
a potential deadlock scenario involving nested locks between
rhashtable bucket, rq lock, and dsq lock. By relocating the
growth check and work scheduling after relea ...
In the Linux kernel, the following vulnerability has been resolved:
rhashtable: Fix potential deadlock by moving schedule_work outside lock
Move the hash table growth check and work scheduling outside the
rht lock to prevent a possible circular locking dependency.
The original implementation could trigger a lockdep warning due to
a potential deadlock scenario involving nested locks between
rhashtable bucket, rq lock, and dsq lock. By relocating the
growth check and work scheduling after releasing the rth lock, we break
this potential deadlock chain.
This change expands the flexibility of rhashtable by removing
restrictive locking that previously limited its use in scheduler
and workqueue contexts.
Import to say that this calls rht_grow_above_75(), which reads from
struct rhashtable without holding the lock, if this is a problem, we can
move the check to the lock, and schedule the workqueue after the lock.
Modified so that atomic_inc is also moved outside of the bucket
lock along with the growth above 75% check.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
nfs_common: must not hold RCU while calling nfsd_file_put_local
Move holding the RCU from nfs_to_nfsd_file_put_local to
nfs_to_nfsd_net_put. It is the call to nfs_to->nfsd_serv_put that
requires the RCU anyway (the puts for nfsd_file and netns were
combined to avoid an extra indirect reference but that
micro-optimization isn't possible now).
This fixes xfstests generic/013 and it triggering:
"Voluntary context switch within ...
In the Linux kernel, the following vulnerability has been resolved:
nfs_common: must not hold RCU while calling nfsd_file_put_local
Move holding the RCU from nfs_to_nfsd_file_put_local to
nfs_to_nfsd_net_put. It is the call to nfs_to->nfsd_serv_put that
requires the RCU anyway (the puts for nfsd_file and netns were
combined to avoid an extra indirect reference but that
micro-optimization isn't possible now).
This fixes xfstests generic/013 and it triggering:
"Voluntary context switch within RCU read-side critical section!"
[ 143.545738] Call Trace:
[ 143.546206] <TASK>
[ 143.546625] ? show_regs+0x6d/0x80
[ 143.547267] ? __warn+0x91/0x140
[ 143.547951] ? rcu_note_context_switch+0x496/0x5d0
[ 143.548856] ? report_bug+0x193/0x1a0
[ 143.549557] ? handle_bug+0x63/0xa0
[ 143.550214] ? exc_invalid_op+0x1d/0x80
[ 143.550938] ? asm_exc_invalid_op+0x1f/0x30
[ 143.551736] ? rcu_note_context_switch+0x496/0x5d0
[ 143.552634] ? wakeup_preempt+0x62/0x70
[ 143.553358] __schedule+0xaa/0x1380
[ 143.554025] ? _raw_spin_unlock_irqrestore+0x12/0x40
[ 143.554958] ? try_to_wake_up+0x1fe/0x6b0
[ 143.555715] ? wake_up_process+0x19/0x20
[ 143.556452] schedule+0x2e/0x120
[ 143.557066] schedule_preempt_disabled+0x19/0x30
[ 143.557933] rwsem_down_read_slowpath+0x24d/0x4a0
[ 143.558818] ? xfs_efi_item_format+0x50/0xc0 [xfs]
[ 143.559894] down_read+0x4e/0xb0
[ 143.560519] xlog_cil_commit+0x1b2/0xbc0 [xfs]
[ 143.561460] ? _raw_spin_unlock+0x12/0x30
[ 143.562212] ? xfs_inode_item_precommit+0xc7/0x220 [xfs]
[ 143.563309] ? xfs_trans_run_precommits+0x69/0xd0 [xfs]
[ 143.564394] __xfs_trans_commit+0xb5/0x330 [xfs]
[ 143.565367] xfs_trans_roll+0x48/0xc0 [xfs]
[ 143.566262] xfs_defer_trans_roll+0x57/0x100 [xfs]
[ 143.567278] xfs_defer_finish_noroll+0x27a/0x490 [xfs]
[ 143.568342] xfs_defer_finish+0x1a/0x80 [xfs]
[ 143.569267] xfs_bunmapi_range+0x4d/0xb0 [xfs]
[ 143.570208] xfs_itruncate_extents_flags+0x13d/0x230 [xfs]
[ 143.571353] xfs_free_eofblocks+0x12e/0x190 [xfs]
[ 143.572359] xfs_file_release+0x12d/0x140 [xfs]
[ 143.573324] __fput+0xe8/0x2d0
[ 143.573922] __fput_sync+0x1d/0x30
[ 143.574574] nfsd_filp_close+0x33/0x60 [nfsd]
[ 143.575430] nfsd_file_free+0x96/0x150 [nfsd]
[ 143.576274] nfsd_file_put+0xf7/0x1a0 [nfsd]
[ 143.577104] nfsd_file_put_local+0x18/0x30 [nfsd]
[ 143.578070] nfs_close_local_fh+0x101/0x110 [nfs_localio]
[ 143.579079] __put_nfs_open_context+0xc9/0x180 [nfs]
[ 143.580031] nfs_file_clear_open_context+0x4a/0x60 [nfs]
[ 143.581038] nfs_file_release+0x3e/0x60 [nfs]
[ 143.581879] __fput+0xe8/0x2d0
[ 143.582464] __fput_sync+0x1d/0x30
[ 143.583108] __x64_sys_close+0x41/0x80
[ 143.583823] x64_sys_call+0x189a/0x20d0
[ 143.584552] do_syscall_64+0x64/0x170
[ 143.585240] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 143.586185] RIP: 0033:0x7f3c5153efd7
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
block: Prevent potential deadlocks in zone write plug error recovery
Zone write plugging for handling writes to zones of a zoned block
device always execute a zone report whenever a write BIO to a zone
fails. The intent of this is to ensure that the tracking of a zone write
pointer is always correct to ensure that the alignment to a zone write
pointer of write BIOs can be checked on submission and that we can
always correctly ...
In the Linux kernel, the following vulnerability has been resolved:
block: Prevent potential deadlocks in zone write plug error recovery
Zone write plugging for handling writes to zones of a zoned block
device always execute a zone report whenever a write BIO to a zone
fails. The intent of this is to ensure that the tracking of a zone write
pointer is always correct to ensure that the alignment to a zone write
pointer of write BIOs can be checked on submission and that we can
always correctly emulate zone append operations using regular write
BIOs.
However, this error recovery scheme introduces a potential deadlock if a
device queue freeze is initiated while BIOs are still plugged in a zone
write plug and one of these write operation fails. In such case, the
disk zone write plug error recovery work is scheduled and executes a
report zone. This in turn can result in a request allocation in the
underlying driver to issue the report zones command to the device. But
with the device queue freeze already started, this allocation will
block, preventing the report zone execution and the continuation of the
processing of the plugged BIOs. As plugged BIOs hold a queue usage
reference, the queue freeze itself will never complete, resulting in a
deadlock.
Avoid this problem by completely removing from the zone write plugging
code the use of report zones operations after a failed write operation,
instead relying on the device user to either execute a report zones,
reset the zone, finish the zone, or give up writing to the device (which
is a fairly common pattern for file systems which degrade to read-only
after write failures). This is not an unreasonnable requirement as all
well-behaved applications, FSes and device mapper already use report
zones to recover from write errors whenever possible by comparing the
current position of a zone write pointer with what their assumption
about the position is.
The changes to remove the automatic error recovery are as follows:
- Completely remove the error recovery work and its associated
resources (zone write plug list head, disk error list, and disk
zone_wplugs_work work struct). This also removes the functions
disk_zone_wplug_set_error() and disk_zone_wplug_clear_error().
- Change the BLK_ZONE_WPLUG_ERROR zone write plug flag into
BLK_ZONE_WPLUG_NEED_WP_UPDATE. This new flag is set for a zone write
plug whenever a write opration targetting the zone of the zone write
plug fails. This flag indicates that the zone write pointer offset is
not reliable and that it must be updated when the next report zone,
reset zone, finish zone or disk revalidation is executed.
- Modify blk_zone_write_plug_bio_endio() to set the
BLK_ZONE_WPLUG_NEED_WP_UPDATE flag for the target zone of a failed
write BIO.
- Modify the function disk_zone_wplug_set_wp_offset() to clear this
new flag, thus implementing recovery of a correct write pointer
offset with the reset (all) zone and finish zone operations.
- Modify blkdev_report_zones() to always use the disk_report_zones_cb()
callback so that disk_zone_wplug_sync_wp_offset() can be called for
any zone marked with the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag.
This implements recovery of a correct write pointer offset for zone
write plugs marked with BLK_ZONE_WPLUG_NEED_WP_UPDATE and within
the range of the report zones operation executed by the user.
- Modify blk_revalidate_seq_zone() to call
disk_zone_wplug_sync_wp_offset() for all sequential write required
zones when a zoned block device is revalidated, thus always resolving
any inconsistency between the write pointer offset of zone write
plugs and the actual write pointer position of sequential zones.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
netfilter: IDLETIMER: Fix for possible ABBA deadlock
Deletion of the last rule referencing a given idletimer may happen at
the same time as a read of its file in sysfs:
| ======================================================
| WARNING: possible circular locking dependency detected
| 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted
| ------------------------------------------------------
| iptables/3303 is trying to acqu ...
In the Linux kernel, the following vulnerability has been resolved:
netfilter: IDLETIMER: Fix for possible ABBA deadlock
Deletion of the last rule referencing a given idletimer may happen at
the same time as a read of its file in sysfs:
| ======================================================
| WARNING: possible circular locking dependency detected
| 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted
| ------------------------------------------------------
| iptables/3303 is trying to acquire lock:
| ffff8881057e04b8 (kn->active#48){++++}-{0:0}, at: __kernfs_remove+0x20
|
| but task is already holding lock:
| ffffffffa0249068 (list_mutex){+.+.}-{3:3}, at: idletimer_tg_destroy_v]
|
| which lock already depends on the new lock.
A simple reproducer is:
| #!/bin/bash
|
| while true; do
| iptables -A INPUT -i foo -j IDLETIMER --timeout 10 --label "testme"
| iptables -D INPUT -i foo -j IDLETIMER --timeout 10 --label "testme"
| done &
| while true; do
| cat /sys/class/xt_idletimer/timers/testme >/dev/null
| done
Avoid this by freeing list_mutex right after deleting the element from
the list, then continuing with the teardown.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: iso: Fix circular lock in iso_listen_bis
This fixes the circular locking dependency warning below, by
releasing the socket lock before enterning iso_listen_bis, to
avoid any potential deadlock with hdev lock.
[ 75.307983] ======================================================
[ 75.307984] WARNING: possible circular locking dependency detected
[ 75.307985] 6.12.0-rc6+ #22 Not tainted
[ 75.307987] ----------- ...
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: iso: Fix circular lock in iso_listen_bis
This fixes the circular locking dependency warning below, by
releasing the socket lock before enterning iso_listen_bis, to
avoid any potential deadlock with hdev lock.
[ 75.307983] ======================================================
[ 75.307984] WARNING: possible circular locking dependency detected
[ 75.307985] 6.12.0-rc6+ #22 Not tainted
[ 75.307987] ------------------------------------------------------
[ 75.307987] kworker/u81:2/2623 is trying to acquire lock:
[ 75.307988] ffff8fde1769da58 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO)
at: iso_connect_cfm+0x253/0x840 [bluetooth]
[ 75.308021]
but task is already holding lock:
[ 75.308022] ffff8fdd61a10078 (&hdev->lock)
at: hci_le_per_adv_report_evt+0x47/0x2f0 [bluetooth]
[ 75.308053]
which lock already depends on the new lock.
[ 75.308054]
the existing dependency chain (in reverse order) is:
[ 75.308055]
-> #1 (&hdev->lock){+.+.}-{3:3}:
[ 75.308057] __mutex_lock+0xad/0xc50
[ 75.308061] mutex_lock_nested+0x1b/0x30
[ 75.308063] iso_sock_listen+0x143/0x5c0 [bluetooth]
[ 75.308085] __sys_listen_socket+0x49/0x60
[ 75.308088] __x64_sys_listen+0x4c/0x90
[ 75.308090] x64_sys_call+0x2517/0x25f0
[ 75.308092] do_syscall_64+0x87/0x150
[ 75.308095] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 75.308098]
-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
[ 75.308100] __lock_acquire+0x155e/0x25f0
[ 75.308103] lock_acquire+0xc9/0x300
[ 75.308105] lock_sock_nested+0x32/0x90
[ 75.308107] iso_connect_cfm+0x253/0x840 [bluetooth]
[ 75.308128] hci_connect_cfm+0x6c/0x190 [bluetooth]
[ 75.308155] hci_le_per_adv_report_evt+0x27b/0x2f0 [bluetooth]
[ 75.308180] hci_le_meta_evt+0xe7/0x200 [bluetooth]
[ 75.308206] hci_event_packet+0x21f/0x5c0 [bluetooth]
[ 75.308230] hci_rx_work+0x3ae/0xb10 [bluetooth]
[ 75.308254] process_one_work+0x212/0x740
[ 75.308256] worker_thread+0x1bd/0x3a0
[ 75.308258] kthread+0xe4/0x120
[ 75.308259] ret_from_fork+0x44/0x70
[ 75.308261] ret_from_fork_asm+0x1a/0x30
[ 75.308263]
other info that might help us debug this:
[ 75.308264] Possible unsafe locking scenario:
[ 75.308264] CPU0 CPU1
[ 75.308265] ---- ----
[ 75.308265] lock(&hdev->lock);
[ 75.308267] lock(sk_lock-
AF_BLUETOOTH-BTPROTO_ISO);
[ 75.308268] lock(&hdev->lock);
[ 75.308269] lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
[ 75.308270]
*** DEADLOCK ***
[ 75.308271] 4 locks held by kworker/u81:2/2623:
[ 75.308272] #0: ffff8fdd66e52148 ((wq_completion)hci0#2){+.+.}-{0:0},
at: process_one_work+0x443/0x740
[ 75.308276] #1: ffffafb488b7fe48 ((work_completion)(&hdev->rx_work)),
at: process_one_work+0x1ce/0x740
[ 75.308280] #2: ffff8fdd61a10078 (&hdev->lock){+.+.}-{3:3}
at: hci_le_per_adv_report_evt+0x47/0x2f0 [bluetooth]
[ 75.308304] #3: ffffffffb6ba4900 (rcu_read_lock){....}-{1:2},
at: hci_connect_cfm+0x29/0x190 [bluetooth]
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: iso: Fix circular lock in iso_conn_big_sync
This fixes the circular locking dependency warning below, by reworking
iso_sock_recvmsg, to ensure that the socket lock is always released
before calling a function that locks hdev.
[ 561.670344] ======================================================
[ 561.670346] WARNING: possible circular locking dependency detected
[ 561.670349] 6.12.0-rc6+ #26 Not tainted
[ 561.67 ...
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: iso: Fix circular lock in iso_conn_big_sync
This fixes the circular locking dependency warning below, by reworking
iso_sock_recvmsg, to ensure that the socket lock is always released
before calling a function that locks hdev.
[ 561.670344] ======================================================
[ 561.670346] WARNING: possible circular locking dependency detected
[ 561.670349] 6.12.0-rc6+ #26 Not tainted
[ 561.670351] ------------------------------------------------------
[ 561.670353] iso-tester/3289 is trying to acquire lock:
[ 561.670355] ffff88811f600078 (&hdev->lock){+.+.}-{3:3},
at: iso_conn_big_sync+0x73/0x260 [bluetooth]
[ 561.670405]
but task is already holding lock:
[ 561.670407] ffff88815af58258 (sk_lock-AF_BLUETOOTH){+.+.}-{0:0},
at: iso_sock_recvmsg+0xbf/0x500 [bluetooth]
[ 561.670450]
which lock already depends on the new lock.
[ 561.670452]
the existing dependency chain (in reverse order) is:
[ 561.670453]
-> #2 (sk_lock-AF_BLUETOOTH){+.+.}-{0:0}:
[ 561.670458] lock_acquire+0x7c/0xc0
[ 561.670463] lock_sock_nested+0x3b/0xf0
[ 561.670467] bt_accept_dequeue+0x1a5/0x4d0 [bluetooth]
[ 561.670510] iso_sock_accept+0x271/0x830 [bluetooth]
[ 561.670547] do_accept+0x3dd/0x610
[ 561.670550] __sys_accept4+0xd8/0x170
[ 561.670553] __x64_sys_accept+0x74/0xc0
[ 561.670556] x64_sys_call+0x17d6/0x25f0
[ 561.670559] do_syscall_64+0x87/0x150
[ 561.670563] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 561.670567]
-> #1 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
[ 561.670571] lock_acquire+0x7c/0xc0
[ 561.670574] lock_sock_nested+0x3b/0xf0
[ 561.670577] iso_sock_listen+0x2de/0xf30 [bluetooth]
[ 561.670617] __sys_listen_socket+0xef/0x130
[ 561.670620] __x64_sys_listen+0xe1/0x190
[ 561.670623] x64_sys_call+0x2517/0x25f0
[ 561.670626] do_syscall_64+0x87/0x150
[ 561.670629] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 561.670632]
-> #0 (&hdev->lock){+.+.}-{3:3}:
[ 561.670636] __lock_acquire+0x32ad/0x6ab0
[ 561.670639] lock_acquire.part.0+0x118/0x360
[ 561.670642] lock_acquire+0x7c/0xc0
[ 561.670644] __mutex_lock+0x18d/0x12f0
[ 561.670647] mutex_lock_nested+0x1b/0x30
[ 561.670651] iso_conn_big_sync+0x73/0x260 [bluetooth]
[ 561.670687] iso_sock_recvmsg+0x3e9/0x500 [bluetooth]
[ 561.670722] sock_recvmsg+0x1d5/0x240
[ 561.670725] sock_read_iter+0x27d/0x470
[ 561.670727] vfs_read+0x9a0/0xd30
[ 561.670731] ksys_read+0x1a8/0x250
[ 561.670733] __x64_sys_read+0x72/0xc0
[ 561.670736] x64_sys_call+0x1b12/0x25f0
[ 561.670738] do_syscall_64+0x87/0x150
[ 561.670741] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 561.670744]
other info that might help us debug this:
[ 561.670745] Chain exists of:
&hdev->lock --> sk_lock-AF_BLUETOOTH-BTPROTO_ISO --> sk_lock-AF_BLUETOOTH
[ 561.670751] Possible unsafe locking scenario:
[ 561.670753] CPU0 CPU1
[ 561.670754] ---- ----
[ 561.670756] lock(sk_lock-AF_BLUETOOTH);
[ 561.670758] lock(sk_lock
AF_BLUETOOTH-BTPROTO_ISO);
[ 561.670761] lock(sk_lock-AF_BLUETOOTH);
[ 561.670764] lock(&hdev->lock);
[ 561.670767]
*** DEADLOCK ***
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
pinmux: Use sequential access to access desc->pinmux data
When two client of the same gpio call pinctrl_select_state() for the
same functionality, we are seeing NULL pointer issue while accessing
desc->mux_owner.
Let's say two processes A, B executing in pin_request() for the same pin
and process A updates the desc->mux_usecount but not yet updated the
desc->mux_owner while process B see the desc->mux_usecount which got
updat ...
In the Linux kernel, the following vulnerability has been resolved:
pinmux: Use sequential access to access desc->pinmux data
When two client of the same gpio call pinctrl_select_state() for the
same functionality, we are seeing NULL pointer issue while accessing
desc->mux_owner.
Let's say two processes A, B executing in pin_request() for the same pin
and process A updates the desc->mux_usecount but not yet updated the
desc->mux_owner while process B see the desc->mux_usecount which got
updated by A path and further executes strcmp and while accessing
desc->mux_owner it crashes with NULL pointer.
Serialize the access to mux related setting with a mutex lock.
cpu0 (process A) cpu1(process B)
pinctrl_select_state() { pinctrl_select_state() {
pin_request() { pin_request() {
...
....
} else {
desc->mux_usecount++;
desc->mux_usecount && strcmp(desc->mux_owner, owner)) {
if (desc->mux_usecount > 1)
return 0;
desc->mux_owner = owner;
} }
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ptdma: pt_core_execute_cmd() should use spinlock
The interrupt handler (pt_core_irq_handler()) of the ptdma
driver can be called from interrupt context. The code flow
in this function can lead down to pt_core_execute_cmd() which
will attempt to grab a mutex, which is not appropriate in
interrupt context and ultimately leads to a kernel panic.
The fix here changes this mutex to a spinlock, which has
been verified to resolve the ...
In the Linux kernel, the following vulnerability has been resolved:
ptdma: pt_core_execute_cmd() should use spinlock
The interrupt handler (pt_core_irq_handler()) of the ptdma
driver can be called from interrupt context. The code flow
in this function can lead down to pt_core_execute_cmd() which
will attempt to grab a mutex, which is not appropriate in
interrupt context and ultimately leads to a kernel panic.
The fix here changes this mutex to a spinlock, which has
been verified to resolve the issue.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
mm/swapfile: add cond_resched() in get_swap_pages()
The softlockup still occurs in get_swap_pages() under memory pressure. 64
CPU cores, 64GB memory, and 28 zram devices, the disksize of each zram
device is 50MB with same priority as si. Use the stress-ng tool to
increase memory pressure, causing the system to oom frequently.
The plist_for_each_entry_safe() loops in get_swap_pages() could reach tens
of thousands of times to ...
In the Linux kernel, the following vulnerability has been resolved:
mm/swapfile: add cond_resched() in get_swap_pages()
The softlockup still occurs in get_swap_pages() under memory pressure. 64
CPU cores, 64GB memory, and 28 zram devices, the disksize of each zram
device is 50MB with same priority as si. Use the stress-ng tool to
increase memory pressure, causing the system to oom frequently.
The plist_for_each_entry_safe() loops in get_swap_pages() could reach tens
of thousands of times to find available space (extreme case:
cond_resched() is not called in scan_swap_map_slots()). Let's add
cond_resched() into get_swap_pages() when failed to find available space
to avoid softlockup.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock between concurrent dio writes when low on free data space
When reserving data space for a direct IO write we can end up deadlocking
if we have multiple tasks attempting a write to the same file range, there
are multiple extents covered by that file range, we are low on available
space for data and the writes don't expand the inode's i_size.
The deadlock can happen like this:
1) We have a file with an i_si ...
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix deadlock between concurrent dio writes when low on free data space
When reserving data space for a direct IO write we can end up deadlocking
if we have multiple tasks attempting a write to the same file range, there
are multiple extents covered by that file range, we are low on available
space for data and the writes don't expand the inode's i_size.
The deadlock can happen like this:
1) We have a file with an i_size of 1M, at offset 0 it has an extent with
a size of 128K and at offset 128K it has another extent also with a
size of 128K;
2) Task A does a direct IO write against file range [0, 256K), and because
the write is within the i_size boundary, it takes the inode's lock (VFS
level) in shared mode;
3) Task A locks the file range [0, 256K) at btrfs_dio_iomap_begin(), and
then gets the extent map for the extent covering the range [0, 128K).
At btrfs_get_blocks_direct_write(), it creates an ordered extent for
that file range ([0, 128K));
4) Before returning from btrfs_dio_iomap_begin(), it unlocks the file
range [0, 256K);
5) Task A executes btrfs_dio_iomap_begin() again, this time for the file
range [128K, 256K), and locks the file range [128K, 256K);
6) Task B starts a direct IO write against file range [0, 256K) as well.
It also locks the inode in shared mode, as it's within the i_size limit,
and then tries to lock file range [0, 256K). It is able to lock the
subrange [0, 128K) but then blocks waiting for the range [128K, 256K),
as it is currently locked by task A;
7) Task A enters btrfs_get_blocks_direct_write() and tries to reserve data
space. Because we are low on available free space, it triggers the
async data reclaim task, and waits for it to reserve data space;
8) The async reclaim task decides to wait for all existing ordered extents
to complete (through btrfs_wait_ordered_roots()).
It finds the ordered extent previously created by task A for the file
range [0, 128K) and waits for it to complete;
9) The ordered extent for the file range [0, 128K) can not complete
because it blocks at btrfs_finish_ordered_io() when trying to lock the
file range [0, 128K).
This results in a deadlock, because:
- task B is holding the file range [0, 128K) locked, waiting for the
range [128K, 256K) to be unlocked by task A;
- task A is holding the file range [128K, 256K) locked and it's waiting
for the async data reclaim task to satisfy its space reservation
request;
- the async data reclaim task is waiting for ordered extent [0, 128K)
to complete, but the ordered extent can not complete because the
file range [0, 128K) is currently locked by task B, which is waiting
on task A to unlock file range [128K, 256K) and task A waiting
on the async data reclaim task.
This results in a deadlock between 4 task: task A, task B, the async
data reclaim task and the task doing ordered extent completion (a work
queue task).
This type of deadlock can sporadically be triggered by the test case
generic/300 from fstests, and results in a stack trace like the following:
[12084.033689] INFO: task kworker/u16:7:123749 blocked for more than 241 seconds.
[12084.034877] Not tainted 5.18.0-rc2-btrfs-next-115 #1
[12084.035562] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12084.036548] task:kworker/u16:7 state:D stack: 0 pid:123749 ppid: 2 flags:0x00004000
[12084.036554] Workqueue: btrfs-flush_delalloc btrfs_work_helper [btrfs]
[12084.036599] Call Trace:
[12084.036601] <TASK>
[12084.036606] __schedule+0x3cb/0xed0
[12084.036616] schedule+0x4e/0xb0
[12084.036620] btrfs_start_ordered_extent+0x109/0x1c0 [btrfs]
[12084.036651] ? prepare_to_wait_exclusive+0xc0/0xc0
[12084.036659] btrfs_run_ordered_extent_work+0x1a/0x30 [btrfs]
[12084.036688] btrfs_work_helper+0xf8/0x400 [btrfs]
[12084.0367
---truncated---
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.
Call Trace:
_raw_spin_lock_irqsave+0x32/0x40
lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
lpf ...
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.
Call Trace:
_raw_spin_lock_irqsave+0x32/0x40
lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
lpfc_do_work+0x1485/0x1d70 [lpfc]
kthread+0x112/0x130
ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP
The same CPU tries to claim the phba->port_list_lock twice.
Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softir ...
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Fix SCSI I/O completion and abort handler deadlock
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softirqentry_text_start+0xe4
irq_exit+0xf7
do_IRQ+0x7f
CPU B:
native_queued_spin_lock_slowpath+0x5b
_raw_spin_lock+0x1c
lpfc_abort_handler+0x13e
scmd_eh_abort_handler+0x85
process_one_work+0x1a7
worker_thread+0x30
kthread+0x112
ret_from_fork+0x1f
Diagram of lockup:
CPUA CPUB
---- ----
lpfc_cmd->buf_lock
phba->hbalock
lpfc_cmd->buf_lock
phba->hbalock
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
loop: implement ->free_disk
Ensure that the lo_device which is stored in the gendisk private
data is valid until the gendisk is freed. Currently the loop driver
uses a lot of effort to make sure a device is not freed when it is
still in use, but to to fix a potential deadlock this will be relaxed
a bit soon.
|
In the Linux kernel, the following vulnerability has been resolved:
media: mediatek: vcodec: prevent kernel crash when rmmod mtk-vcodec-dec.ko
If the driver support subdev mode, the parameter "dev->pm.dev" will be
NULL in mtk_vcodec_dec_remove. Kernel will crash when try to rmmod
mtk-vcodec-dec.ko.
[ 4380.702726] pc : do_raw_spin_trylock+0x4/0x80
[ 4380.707075] lr : _raw_spin_lock_irq+0x90/0x14c
[ 4380.711509] sp : ffff80000819bc10
[ 4380.714811] x29: ffff80000819bc10 x28: ffff3600c03e4000 x2 ...
In the Linux kernel, the following vulnerability has been resolved:
media: mediatek: vcodec: prevent kernel crash when rmmod mtk-vcodec-dec.ko
If the driver support subdev mode, the parameter "dev->pm.dev" will be
NULL in mtk_vcodec_dec_remove. Kernel will crash when try to rmmod
mtk-vcodec-dec.ko.
[ 4380.702726] pc : do_raw_spin_trylock+0x4/0x80
[ 4380.707075] lr : _raw_spin_lock_irq+0x90/0x14c
[ 4380.711509] sp : ffff80000819bc10
[ 4380.714811] x29: ffff80000819bc10 x28: ffff3600c03e4000 x27: 0000000000000000
[ 4380.721934] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 4380.729057] x23: ffff3600c0f34930 x22: ffffd5e923549000 x21: 0000000000000220
[ 4380.736179] x20: 0000000000000208 x19: ffffd5e9213e8ebc x18: 0000000000000020
[ 4380.743298] x17: 0000002000000000 x16: ffffd5e9213e8e90 x15: 696c346f65646976
[ 4380.750420] x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000040
[ 4380.757542] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[ 4380.764664] x8 : 0000000000000000 x7 : ffff3600c7273ae8 x6 : ffffd5e9213e8ebc
[ 4380.771786] x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000
[ 4380.778908] x2 : 0000000000000000 x1 : ffff3600c03e4000 x0 : 0000000000000208
[ 4380.786031] Call trace:
[ 4380.788465] do_raw_spin_trylock+0x4/0x80
[ 4380.792462] __pm_runtime_disable+0x2c/0x1b0
[ 4380.796723] mtk_vcodec_dec_remove+0x5c/0xa0 [mtk_vcodec_dec]
[ 4380.802466] platform_remove+0x2c/0x60
[ 4380.806204] __device_release_driver+0x194/0x250
[ 4380.810810] driver_detach+0xc8/0x15c
[ 4380.814462] bus_remove_driver+0x5c/0xb0
[ 4380.818375] driver_unregister+0x34/0x64
[ 4380.822288] platform_driver_unregister+0x18/0x24
[ 4380.826979] mtk_vcodec_dec_driver_exit+0x1c/0x888 [mtk_vcodec_dec]
[ 4380.833240] __arm64_sys_delete_module+0x190/0x224
[ 4380.838020] invoke_syscall+0x48/0x114
[ 4380.841760] el0_svc_common.constprop.0+0x60/0x11c
[ 4380.846540] do_el0_svc+0x28/0x90
[ 4380.849844] el0_svc+0x4c/0x100
[ 4380.852975] el0t_64_sync_handler+0xec/0xf0
[ 4380.857148] el0t_64_sync+0x190/0x194
[ 4380.860801] Code: 94431515 17ffffca d503201f d503245f (b9400004)
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
nvdimm: Fix firmware activation deadlock scenarios
Lockdep reports the following deadlock scenarios for CXL root device
power-management, device_prepare(), operations, and device_shutdown()
operations for 'nd_region' devices:
Chain exists of:
&nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ...
In the Linux kernel, the following vulnerability has been resolved:
nvdimm: Fix firmware activation deadlock scenarios
Lockdep reports the following deadlock scenarios for CXL root device
power-management, device_prepare(), operations, and device_shutdown()
operations for 'nd_region' devices:
Chain exists of:
&nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(system_transition_mutex);
lock(&nvdimm_bus->reconfig_mutex);
lock(system_transition_mutex);
lock(&nvdimm_region_key);
Chain exists of:
&cxl_nvdimm_bridge_key --> acpi_scan_lock --> &cxl_root_key
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&cxl_root_key);
lock(acpi_scan_lock);
lock(&cxl_root_key);
lock(&cxl_nvdimm_bridge_key);
These stem from holding nvdimm_bus_lock() over hibernate_quiet_exec()
which walks the entire system device topology taking device_lock() along
the way. The nvdimm_bus_lock() is protecting against unregistration,
multiple simultaneous ops callers, and preventing activate_show() from
racing activate_store(). For the first 2, the lock is redundant.
Unregistration already flushes all ops users, and sysfs already prevents
multiple threads to be active in an ops handler at the same time. For
the last userspace should already be waiting for its last
activate_store() to complete, and does not need activate_show() to flush
the write side, so this lock usage can be deleted in these attributes.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
tty: fix deadlock caused by calling printk() under tty_port->lock
pty_write() invokes kmalloc() which may invoke a normal printk() to print
failure message. This can cause a deadlock in the scenario reported by
syz-bot below:
CPU0 CPU1 CPU2
---- ---- ----
lock(console_owner);
...
In the Linux kernel, the following vulnerability has been resolved:
tty: fix deadlock caused by calling printk() under tty_port->lock
pty_write() invokes kmalloc() which may invoke a normal printk() to print
failure message. This can cause a deadlock in the scenario reported by
syz-bot below:
CPU0 CPU1 CPU2
---- ---- ----
lock(console_owner);
lock(&port_lock_key);
lock(&port->lock);
lock(&port_lock_key);
lock(&port->lock);
lock(console_owner);
As commit dbdda842fe96 ("printk: Add console owner and waiter logic to
load balance console writes") said, such deadlock can be prevented by
using printk_deferred() in kmalloc() (which is invoked in the section
guarded by the port->lock). But there are too many printk() on the
kmalloc() path, and kmalloc() can be called from anywhere, so changing
printk() to printk_deferred() is too complicated and inelegant.
Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so
that printk() will not be called, and this deadlock problem can be
avoided.
Syzbot reported the following lockdep error:
======================================================
WARNING: possible circular locking dependency detected
5.4.143-00237-g08ccc19a-dirty #10 Not tainted
------------------------------------------------------
syz-executor.4/29420 is trying to acquire lock:
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline]
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023
but task is already holding lock:
ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&port->lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock);
tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47
serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767
serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854
serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key);
serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870
serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126
__handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156
[...]
-> #1 (&port_lock_key){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198
<-- lock(&port_lock_key);
call_console_drivers kernel/printk/printk.c:1819 [inline]
console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504
vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner);
vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394
printk+0xba/0xed kernel/printk/printk.c:2084
register_console+0x8b3/0xc10 kernel/printk/printk.c:2829
univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681
console_init+0x49d/0x6d3 kernel/printk/printk.c:2915
start_kernel+0x5e9/0x879 init/main.c:713
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
-> #0 (console_owner){....}-{0:0}:
[...]
lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734
console_trylock_spinning kernel/printk/printk.c:1773
---truncated---
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
block: Fix potential deadlock in blk_ia_range_sysfs_show()
When being read, a sysfs attribute is already protected against removal
with the kobject node active reference counter. As a result, in
blk_ia_range_sysfs_show(), there is no need to take the queue sysfs
lock when reading the value of a range attribute. Using the queue sysfs
lock in this function creates a potential deadlock situation with the
disk removal, something t ...
In the Linux kernel, the following vulnerability has been resolved:
block: Fix potential deadlock in blk_ia_range_sysfs_show()
When being read, a sysfs attribute is already protected against removal
with the kobject node active reference counter. As a result, in
blk_ia_range_sysfs_show(), there is no need to take the queue sysfs
lock when reading the value of a range attribute. Using the queue sysfs
lock in this function creates a potential deadlock situation with the
disk removal, something that a lockdep signals with a splat when the
device is removed:
[ 760.703551] Possible unsafe locking scenario:
[ 760.703551]
[ 760.703554] CPU0 CPU1
[ 760.703556] ---- ----
[ 760.703558] lock(&q->sysfs_lock);
[ 760.703565] lock(kn->active#385);
[ 760.703573] lock(&q->sysfs_lock);
[ 760.703579] lock(kn->active#385);
[ 760.703587]
[ 760.703587] *** DEADLOCK ***
Solve this by removing the mutex_lock()/mutex_unlock() calls from
blk_ia_range_sysfs_show().
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
driver core: fix deadlock in __device_attach
In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev) // get lock dev
async_schedule_dev(__device_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__device ...
In the Linux kernel, the following vulnerability has been resolved:
driver core: fix deadlock in __device_attach
In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev) // get lock dev
async_schedule_dev(__device_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__device_attach_async_helper will get lock dev as
well, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As shown above, when it is allowed to do async probes, because of
out of memory or work limit, async work is not allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__device_attach_async_helper getting lock dev.
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
bcache: avoid journal no-space deadlock by reserving 1 journal bucket
The journal no-space deadlock was reported time to time. Such deadlock
can happen in the following situation.
When all journal buckets are fully filled by active jset with heavy
write I/O load, the cache set registration (after a reboot) will load
all active jsets and inserting them into the btree again (which is
called journal replay). If a journaled bkey ...
In the Linux kernel, the following vulnerability has been resolved:
bcache: avoid journal no-space deadlock by reserving 1 journal bucket
The journal no-space deadlock was reported time to time. Such deadlock
can happen in the following situation.
When all journal buckets are fully filled by active jset with heavy
write I/O load, the cache set registration (after a reboot) will load
all active jsets and inserting them into the btree again (which is
called journal replay). If a journaled bkey is inserted into a btree
node and results btree node split, new journal request might be
triggered. For example, the btree grows one more level after the node
split, then the root node record in cache device super block will be
upgrade by bch_journal_meta() from bch_btree_set_root(). But there is no
space in journal buckets, the journal replay has to wait for new journal
bucket to be reclaimed after at least one journal bucket replayed. This
is one example that how the journal no-space deadlock happens.
The solution to avoid the deadlock is to reserve 1 journal bucket in
run time, and only permit the reserved journal bucket to be used during
cache set registration procedure for things like journal replay. Then
the journal space will never be fully filled, there is no chance for
journal no-space deadlock to happen anymore.
This patch adds a new member "bool do_reserve" in struct journal, it is
inititalized to 0 (false) when struct journal is allocated, and set to
1 (true) by bch_journal_space_reserve() when all initialization done in
run_cache_set(). In the run time when journal_reclaim() tries to
allocate a new journal bucket, free_journal_buckets() is called to check
whether there are enough free journal buckets to use. If there is only
1 free journal bucket and journal->do_reserve is 1 (true), the last
bucket is reserved and free_journal_buckets() will return 0 to indicate
no free journal bucket. Then journal_reclaim() will give up, and try
next time to see whetheer there is free journal bucket to allocate. By
this method, there is always 1 jouranl bucket reserved in run time.
During the cache set registration, journal->do_reserve is 0 (false), so
the reserved journal bucket can be used to avoid the no-space deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
tracing: Fix sleeping function called from invalid context on RT kernel
When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the
cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the
atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel,
these locks are replaced with sleepable rt-spinlock, so the stack calltrace will
be triggered.
Fix it by raw_spi ...
In the Linux kernel, the following vulnerability has been resolved:
tracing: Fix sleeping function called from invalid context on RT kernel
When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the
cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the
atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel,
these locks are replaced with sleepable rt-spinlock, so the stack calltrace will
be triggered.
Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start
tp_printk=1" enabled.
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
[<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x60/0x8c
dump_stack+0x10/0x12
__might_resched.cold+0x11d/0x155
rt_spin_lock+0x40/0x70
trace_event_buffer_commit+0x2fa/0x4c0
? map_vsyscall+0x93/0x93
trace_event_raw_event_initcall_start+0xbe/0x110
? perf_trace_initcall_finish+0x210/0x210
? probe_sched_wakeup+0x34/0x40
? ttwu_do_wakeup+0xda/0x310
? trace_hardirqs_on+0x35/0x170
? map_vsyscall+0x93/0x93
do_one_initcall+0x217/0x3c0
? trace_event_raw_event_initcall_level+0x170/0x170
? push_cpu_stop+0x400/0x400
? cblist_init_generic+0x241/0x290
kernel_init_freeable+0x1ac/0x347
? _raw_spin_unlock_irq+0x65/0x80
? rest_init+0xf0/0xf0
kernel_init+0x1e/0x150
ret_from_fork+0x22/0x30
</TASK>
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
NFSv4: Don't hold the layoutget locks across multiple RPC calls
When doing layoutget as part of the open() compound, we have to be
careful to release the layout locks before we can call any further RPC
calls, such as setattr(). The reason is that those calls could trigger
a recall, which could deadlock.
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192e: Fix deadlock in rtllib_beacons_stop()
There is a deadlock in rtllib_beacons_stop(), which is shown
below:
(Thread 1) | (Thread 2)
| rtllib_send_beacon()
rtllib_beacons_stop() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | rtllib_send_beacon_cb()
del_timer_sync() | spin_lock_irqsave() //(2)
(wai ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192e: Fix deadlock in rtllib_beacons_stop()
There is a deadlock in rtllib_beacons_stop(), which is shown
below:
(Thread 1) | (Thread 2)
| rtllib_send_beacon()
rtllib_beacons_stop() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | rtllib_send_beacon_cb()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold ieee->beacon_lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need ieee->beacon_lock in position (2) of thread 2.
As a result, rtllib_beacons_stop() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: usb: host: Fix deadlock in oxu_bus_suspend()
There is a deadlock in oxu_bus_suspend(), which is shown below:
(Thread 1) | (Thread 2)
| timer_action()
oxu_bus_suspend() | mod_timer()
spin_lock_irq() //(1) | (wait a time)
... | oxu_watchdog()
del_timer_sync() | spin_lock_irq() //(2)
(wait timer to stop) | ...
We ho ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: usb: host: Fix deadlock in oxu_bus_suspend()
There is a deadlock in oxu_bus_suspend(), which is shown below:
(Thread 1) | (Thread 2)
| timer_action()
oxu_bus_suspend() | mod_timer()
spin_lock_irq() //(1) | (wait a time)
... | oxu_watchdog()
del_timer_sync() | spin_lock_irq() //(2)
(wait timer to stop) | ...
We hold oxu->lock in position (1) of thread 1, and use
del_timer_sync() to wait timer to stop, but timer handler
also need oxu->lock in position (2) of thread 2. As a result,
oxu_bus_suspend() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irq(), which could let timer handler to obtain
the needed lock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192bs: Fix deadlock in rtw_joinbss_event_prehandle()
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown
below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | _rtw_join_timeout_handler()
del_timer_sync() | spin_l ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192bs: Fix deadlock in rtw_joinbss_event_prehandle()
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown
below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | _rtw_join_timeout_handler()
del_timer_sync() | spin_lock_bh() //(2)
(wait timer to stop) | ...
We hold pmlmepriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need pmlmepriv->lock in position (2) of thread 2.
As a result, rtw_joinbss_event_prehandle() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_bh(), which could let timer handler to obtain
the needed lock. What`s more, we change spin_lock_bh() to
spin_lock_irq() in _rtw_join_timeout_handler() in order to
prevent deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192u: Fix deadlock in ieee80211_beacons_stop()
There is a deadlock in ieee80211_beacons_stop(), which is shown below:
(Thread 1) | (Thread 2)
| ieee80211_send_beacon()
ieee80211_beacons_stop() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | ieee80211_send_beacon_cb()
del_timer_sync() | spin_lock_irqsave() ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192u: Fix deadlock in ieee80211_beacons_stop()
There is a deadlock in ieee80211_beacons_stop(), which is shown below:
(Thread 1) | (Thread 2)
| ieee80211_send_beacon()
ieee80211_beacons_stop() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | ieee80211_send_beacon_cb()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold ieee->beacon_lock in position (1) of thread 1 and use
del_timer_sync() to wait timer to stop, but timer handler
also need ieee->beacon_lock in position (2) of thread 2.
As a result, ieee80211_beacons_stop() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: tty: serial: Fix deadlock in sa1100_set_termios()
There is a deadlock in sa1100_set_termios(), which is shown
below:
(Thread 1) | (Thread 2)
| sa1100_enable_ms()
sa1100_set_termios() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | sa1100_timeout()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: tty: serial: Fix deadlock in sa1100_set_termios()
There is a deadlock in sa1100_set_termios(), which is shown
below:
(Thread 1) | (Thread 2)
| sa1100_enable_ms()
sa1100_set_termios() | mod_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | sa1100_timeout()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold sport->port.lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need sport->port.lock in position (2) of thread 2. As a result,
sa1100_set_termios() will block forever.
This patch moves del_timer_sync() before spin_lock_irqsave()
in order to prevent the deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192eu: Fix deadlock in rtw_joinbss_event_prehandle
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | rtw_join_timeout_handler()
| _rtw_join ...
In the Linux kernel, the following vulnerability has been resolved:
drivers: staging: rtl8192eu: Fix deadlock in rtw_joinbss_event_prehandle
There is a deadlock in rtw_joinbss_event_prehandle(), which is shown below:
(Thread 1) | (Thread 2)
| _set_timer()
rtw_joinbss_event_prehandle()| mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | rtw_join_timeout_handler()
| _rtw_join_timeout_handler()
del_timer_sync() | spin_lock_bh() //(2)
(wait timer to stop) | ...
We hold pmlmepriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need pmlmepriv->lock in position (2) of thread 2.
As a result, rtw_joinbss_event_prehandle() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_bh(), which could let timer handler to obtain
the needed lock. What`s more, we change spin_lock_bh() to
spin_lock_irq() in _rtw_join_timeout_handler() in order to
prevent deadlock.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ceph: fix possible deadlock when holding Fwb to get inline_data
1, mount with wsync.
2, create a file with O_RDWR, and the request was sent to mds.0:
ceph_atomic_open()-->
ceph_mdsc_do_request(openc)
finish_open(file, dentry, ceph_open)-->
ceph_open()-->
ceph_init_file()-->
ceph_init_file_info()-->
ceph_uninline_data()-->
{
...
if ...
In the Linux kernel, the following vulnerability has been resolved:
ceph: fix possible deadlock when holding Fwb to get inline_data
1, mount with wsync.
2, create a file with O_RDWR, and the request was sent to mds.0:
ceph_atomic_open()-->
ceph_mdsc_do_request(openc)
finish_open(file, dentry, ceph_open)-->
ceph_open()-->
ceph_init_file()-->
ceph_init_file_info()-->
ceph_uninline_data()-->
{
...
if (inline_version == 1 || /* initial version, no data */
inline_version == CEPH_INLINE_NONE)
goto out_unlock;
...
}
The inline_version will be 1, which is the initial version for the
new create file. And here the ci->i_inline_version will keep with 1,
it's buggy.
3, buffer write to the file immediately:
ceph_write_iter()-->
ceph_get_caps(file, need=Fw, want=Fb, ...);
generic_perform_write()-->
a_ops->write_begin()-->
ceph_write_begin()-->
netfs_write_begin()-->
netfs_begin_read()-->
netfs_rreq_submit_slice()-->
netfs_read_from_server()-->
rreq->netfs_ops->issue_read()-->
ceph_netfs_issue_read()-->
{
...
if (ci->i_inline_version != CEPH_INLINE_NONE &&
ceph_netfs_issue_op_inline(subreq))
return;
...
}
ceph_put_cap_refs(ci, Fwb);
The ceph_netfs_issue_op_inline() will send a getattr(Fsr) request to
mds.1.
4, then the mds.1 will request the rd lock for CInode::filelock from
the auth mds.0, the mds.0 will do the CInode::filelock state transation
from excl --> sync, but it need to revoke the Fxwb caps back from the
clients.
While the kernel client has aleady held the Fwb caps and waiting for
the getattr(Fsr).
It's deadlock!
URL: https://tracker.ceph.com/issues/55377
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ath11k: Fix frames flush failure caused by deadlock
We are seeing below warnings:
kernel: [25393.301506] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421509] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421831] ath11k_pci 0000:01:00.0: dropping mgmt frame for vdev 0, is_started 0
this means ath11k fails to flush mgmt. frames because wmi_mgmt_tx_work
has no cha ...
In the Linux kernel, the following vulnerability has been resolved:
ath11k: Fix frames flush failure caused by deadlock
We are seeing below warnings:
kernel: [25393.301506] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421509] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421831] ath11k_pci 0000:01:00.0: dropping mgmt frame for vdev 0, is_started 0
this means ath11k fails to flush mgmt. frames because wmi_mgmt_tx_work
has no chance to run in 5 seconds.
By setting /proc/sys/kernel/hung_task_timeout_secs to 20 and increasing
ATH11K_FLUSH_TIMEOUT to 50 we get below warnings:
kernel: [ 120.763160] INFO: task wpa_supplicant:924 blocked for more than 20 seconds.
kernel: [ 120.763169] Not tainted 5.10.90 #12
kernel: [ 120.763177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [ 120.763186] task:wpa_supplicant state:D stack: 0 pid: 924 ppid: 1 flags:0x000043a0
kernel: [ 120.763201] Call Trace:
kernel: [ 120.763214] __schedule+0x785/0x12fa
kernel: [ 120.763224] ? lockdep_hardirqs_on_prepare+0xe2/0x1bb
kernel: [ 120.763242] schedule+0x7e/0xa1
kernel: [ 120.763253] schedule_timeout+0x98/0xfe
kernel: [ 120.763266] ? run_local_timers+0x4a/0x4a
kernel: [ 120.763291] ath11k_mac_flush_tx_complete+0x197/0x2b1 [ath11k 13c3a9bf37790f4ac8103b3decf7ab4008ac314a]
kernel: [ 120.763306] ? init_wait_entry+0x2e/0x2e
kernel: [ 120.763343] __ieee80211_flush_queues+0x167/0x21f [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763378] __ieee80211_recalc_idle+0x105/0x125 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763411] ieee80211_recalc_idle+0x14/0x27 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763441] ieee80211_free_chanctx+0x77/0xa2 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763473] __ieee80211_vif_release_channel+0x100/0x131 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763540] ieee80211_vif_release_channel+0x66/0x81 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763572] ieee80211_destroy_auth_data+0xa3/0xe6 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763612] ieee80211_mgd_deauth+0x178/0x29b [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [ 120.763654] cfg80211_mlme_deauth+0x1a8/0x22c [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763697] nl80211_deauthenticate+0xfa/0x123 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763715] genl_rcv_msg+0x392/0x3c2
kernel: [ 120.763750] ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763782] ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [ 120.763802] ? genl_rcv+0x36/0x36
kernel: [ 120.763814] netlink_rcv_skb+0x89/0xf7
kernel: [ 120.763829] genl_rcv+0x28/0x36
kernel: [ 120.763840] netlink_unicast+0x179/0x24b
kernel: [ 120.763854] netlink_sendmsg+0x393/0x401
kernel: [ 120.763872] sock_sendmsg+0x72/0x76
kernel: [ 120.763886] ____sys_sendmsg+0x170/0x1e6
kernel: [ 120.763897] ? copy_msghdr_from_user+0x7a/0xa2
kernel: [ 120.763914] ___sys_sendmsg+0x95/0xd1
kernel: [ 120.763940] __sys_sendmsg+0x85/0xbf
kernel: [ 120.763956] do_syscall_64+0x43/0x55
kernel: [ 120.763966] entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: [ 120.763977] RIP: 0033:0x79089f3fcc83
kernel: [ 120.763986] RSP: 002b:00007ffe604f0508 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
kernel: [ 120.763997] RAX: ffffffffffffffda RBX: 000059b40e987690 RCX: 000079089f3fcc83
kernel: [ 120.764006] RDX: 0000000000000000 RSI: 00007ffe604f0558 RDI: 0000000000000009
kernel: [ 120.764014] RBP: 00007ffe604f0540 R08: 0000000000000004 R09: 0000000000400000
kernel: [ 120.764023] R10: 00007ffe604f0638 R11: 0000000000000246 R12: 000059b40ea04980
kernel: [ 120.764032] R13: 00007ffe604
---truncated---
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
ubifs: Fix deadlock in concurrent rename whiteout and inode writeback
Following hung tasks:
[ 77.028764] task:kworker/u8:4 state:D stack: 0 pid: 132
[ 77.028820] Call Trace:
[ 77.029027] schedule+0x8c/0x1b0
[ 77.029067] mutex_lock+0x50/0x60
[ 77.029074] ubifs_write_inode+0x68/0x1f0 [ubifs]
[ 77.029117] __writeback_single_inode+0x43c/0x570
[ 77.029128] writeback_sb_inodes+0x259/0x740
[ 77.029148] wb ...
In the Linux kernel, the following vulnerability has been resolved:
ubifs: Fix deadlock in concurrent rename whiteout and inode writeback
Following hung tasks:
[ 77.028764] task:kworker/u8:4 state:D stack: 0 pid: 132
[ 77.028820] Call Trace:
[ 77.029027] schedule+0x8c/0x1b0
[ 77.029067] mutex_lock+0x50/0x60
[ 77.029074] ubifs_write_inode+0x68/0x1f0 [ubifs]
[ 77.029117] __writeback_single_inode+0x43c/0x570
[ 77.029128] writeback_sb_inodes+0x259/0x740
[ 77.029148] wb_writeback+0x107/0x4d0
[ 77.029163] wb_workfn+0x162/0x7b0
[ 92.390442] task:aa state:D stack: 0 pid: 1506
[ 92.390448] Call Trace:
[ 92.390458] schedule+0x8c/0x1b0
[ 92.390461] wb_wait_for_completion+0x82/0xd0
[ 92.390469] __writeback_inodes_sb_nr+0xb2/0x110
[ 92.390472] writeback_inodes_sb_nr+0x14/0x20
[ 92.390476] ubifs_budget_space+0x705/0xdd0 [ubifs]
[ 92.390503] do_rename.cold+0x7f/0x187 [ubifs]
[ 92.390549] ubifs_rename+0x8b/0x180 [ubifs]
[ 92.390571] vfs_rename+0xdb2/0x1170
[ 92.390580] do_renameat2+0x554/0x770
, are caused by concurrent rename whiteout and inode writeback processes:
rename_whiteout(Thread 1) wb_workfn(Thread2)
ubifs_rename
do_rename
lock_4_inodes (Hold ui_mutex)
ubifs_budget_space
make_free_space
shrink_liability
__writeback_inodes_sb_nr
bdi_split_work_to_wbs (Queue new wb work)
wb_do_writeback(wb work)
__writeback_single_inode
ubifs_write_inode
LOCK(ui_mutex)
↑
wb_wait_for_completion (Wait wb work) <-- deadlock!
Reproducer (Detail program in [Link]):
1. SYS_renameat2("/mp/dir/file", "/mp/dir/whiteout", RENAME_WHITEOUT)
2. Consume out of space before kernel(mdelay) doing budget for whiteout
Fix it by doing whiteout space budget before locking ubifs inodes.
BTW, it also fixes wrong goto tag 'out_release' in whiteout budget
error handling path(It should at least recover dir i_size and unlock
4 ubifs inodes).
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
powerpc/set_memory: Avoid spinlock recursion in change_page_attr()
Commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
included a spin_lock() to change_page_attr() in order to
safely perform the three step operations. But then
commit 9f7853d7609d ("powerpc/mm: Fix set_memory_*() against
concurrent accesses") modify it to use pte_update() and do
the operation safely against concurrent access.
In the meantime, M ...
In the Linux kernel, the following vulnerability has been resolved:
powerpc/set_memory: Avoid spinlock recursion in change_page_attr()
Commit 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
included a spin_lock() to change_page_attr() in order to
safely perform the three step operations. But then
commit 9f7853d7609d ("powerpc/mm: Fix set_memory_*() against
concurrent accesses") modify it to use pte_update() and do
the operation safely against concurrent access.
In the meantime, Maxime reported some spinlock recursion.
[ 15.351649] BUG: spinlock recursion on CPU#0, kworker/0:2/217
[ 15.357540] lock: init_mm+0x3c/0x420, .magic: dead4ead, .owner: kworker/0:2/217, .owner_cpu: 0
[ 15.366563] CPU: 0 PID: 217 Comm: kworker/0:2 Not tainted 5.15.0+ #523
[ 15.373350] Workqueue: events do_free_init
[ 15.377615] Call Trace:
[ 15.380232] [e4105ac0] [800946a4] do_raw_spin_lock+0xf8/0x120 (unreliable)
[ 15.387340] [e4105ae0] [8001f4ec] change_page_attr+0x40/0x1d4
[ 15.393413] [e4105b10] [801424e0] __apply_to_page_range+0x164/0x310
[ 15.400009] [e4105b60] [80169620] free_pcp_prepare+0x1e4/0x4a0
[ 15.406045] [e4105ba0] [8016c5a0] free_unref_page+0x40/0x2b8
[ 15.411979] [e4105be0] [8018724c] kasan_depopulate_vmalloc_pte+0x6c/0x94
[ 15.418989] [e4105c00] [801424e0] __apply_to_page_range+0x164/0x310
[ 15.425451] [e4105c50] [80187834] kasan_release_vmalloc+0xbc/0x134
[ 15.431898] [e4105c70] [8015f7a8] __purge_vmap_area_lazy+0x4e4/0xdd8
[ 15.438560] [e4105d30] [80160d10] _vm_unmap_aliases.part.0+0x17c/0x24c
[ 15.445283] [e4105d60] [801642d0] __vunmap+0x2f0/0x5c8
[ 15.450684] [e4105db0] [800e32d0] do_free_init+0x68/0x94
[ 15.456181] [e4105dd0] [8005d094] process_one_work+0x4bc/0x7b8
[ 15.462283] [e4105e90] [8005d614] worker_thread+0x284/0x6e8
[ 15.468227] [e4105f00] [8006aaec] kthread+0x1f0/0x210
[ 15.473489] [e4105f40] [80017148] ret_from_kernel_thread+0x14/0x1c
Remove the read / modify / write sequence to make the operation atomic
and remove the spin_lock() in change_page_attr().
To do the operation atomically, we can't use pte modification helpers
anymore. Because all platforms have different combination of bits, it
is not easy to use those bits directly. But all have the
_PAGE_KERNEL_{RO/ROX/RW/RWX} set of flags. All we need it to compare
two sets to know which bits are set or cleared.
For instance, by comparing _PAGE_KERNEL_ROX and _PAGE_KERNEL_RO you
know which bit gets cleared and which bit get set when changing exec
permission.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
NFSv4: Fix a deadlock when recovering state on a sillyrenamed file
If the file is sillyrenamed, and slated for delete on close, it is
possible for a server reboot to triggeer an open reclaim, with can again
race with the application call to close(). When that happens, the call
to put_nfs_open_context() can trigger a synchronous delegreturn call
which deadlocks because it is not marked as privileged.
Instead, ensure that the c ...
In the Linux kernel, the following vulnerability has been resolved:
NFSv4: Fix a deadlock when recovering state on a sillyrenamed file
If the file is sillyrenamed, and slated for delete on close, it is
possible for a server reboot to triggeer an open reclaim, with can again
race with the application call to close(). When that happens, the call
to put_nfs_open_context() can trigger a synchronous delegreturn call
which deadlocks because it is not marked as privileged.
Instead, ensure that the call to nfs4_inode_return_delegation_on_close()
catches the delegreturn, and schedules it asynchronously.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
This lockdep splat says it better than I could:
================================
WARNING: inconsistent lock state
6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5 ...
In the Linux kernel, the following vulnerability has been resolved:
net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
This lockdep splat says it better than I could:
================================
WARNING: inconsistent lock state
6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5c/0xc0
{IN-SOFTIRQ-W} state was registered at:
_raw_spin_lock+0x5c/0xc0
sch_direct_xmit+0x148/0x37c
__dev_queue_xmit+0x528/0x111c
ip6_finish_output2+0x5ec/0xb7c
ip6_finish_output+0x240/0x3f0
ip6_output+0x78/0x360
ndisc_send_skb+0x33c/0x85c
ndisc_send_rs+0x54/0x12c
addrconf_rs_timer+0x154/0x260
call_timer_fn+0xb8/0x3a0
__run_timers.part.0+0x214/0x26c
run_timer_softirq+0x3c/0x74
__do_softirq+0x14c/0x5d8
____do_softirq+0x10/0x20
call_on_irq_stack+0x2c/0x5c
do_softirq_own_stack+0x1c/0x30
__irq_exit_rcu+0x168/0x1a0
irq_exit_rcu+0x10/0x40
el1_interrupt+0x38/0x64
irq event stamp: 7825
hardirqs last enabled at (7825): [<ffffdf1f7200cae4>] exit_to_kernel_mode+0x34/0x130
hardirqs last disabled at (7823): [<ffffdf1f708105f0>] __do_softirq+0x550/0x5d8
softirqs last enabled at (7824): [<ffffdf1f7081050c>] __do_softirq+0x46c/0x5d8
softirqs last disabled at (7811): [<ffffdf1f708166e0>] ____do_softirq+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(_xmit_ETHER#2);
<Interrupt>
lock(_xmit_ETHER#2);
*** DEADLOCK ***
3 locks held by kworker/1:3/179:
#0: ffff3ec400004748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0
#1: ffff80000a0bbdc8 ((work_completion)(&priv->tx_onestep_tstamp)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0
#2: ffff3ec4036cd438 (&dev->tx_global_lock){+.+.}-{3:3}, at: netif_tx_lock+0x1c/0x34
Workqueue: events enetc_tx_onestep_tstamp
Call trace:
print_usage_bug.part.0+0x208/0x22c
mark_lock+0x7f0/0x8b0
__lock_acquire+0x7c4/0x1ce0
lock_acquire.part.0+0xe0/0x220
lock_acquire+0x68/0x84
_raw_spin_lock+0x5c/0xc0
netif_freeze_queues+0x5c/0xc0
netif_tx_lock+0x24/0x34
enetc_tx_onestep_tstamp+0x20/0x100
process_one_work+0x28c/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x11c
but I'll say it anyway: the enetc_tx_onestep_tstamp() work item runs in
process context, therefore with softirqs enabled (i.o.w., it can be
interrupted by a softirq). If we hold the netif_tx_lock() when there is
an interrupt, and the NET_TX softirq then gets scheduled, this will take
the netif_tx_lock() a second time and deadlock the kernel.
To solve this, use netif_tx_lock_bh(), which blocks softirqs from
running.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
syzbot reports a possible deadlock in rfcomm_sk_state_change [1].
While rfcomm_sock_connect acquires the sk lock and waits for
the rfcomm lock, rfcomm_sock_release could have the rfcomm
lock and hit a deadlock for acquiring the sk lock.
Here's a simplified flow:
rfcomm_sock_connect:
lock_sock(sk)
rfcomm_dlc_open:
rfcomm_lock()
rfcomm_sock_release:
rfcomm_so ...
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
syzbot reports a possible deadlock in rfcomm_sk_state_change [1].
While rfcomm_sock_connect acquires the sk lock and waits for
the rfcomm lock, rfcomm_sock_release could have the rfcomm
lock and hit a deadlock for acquiring the sk lock.
Here's a simplified flow:
rfcomm_sock_connect:
lock_sock(sk)
rfcomm_dlc_open:
rfcomm_lock()
rfcomm_sock_release:
rfcomm_sock_shutdown:
rfcomm_lock()
__rfcomm_dlc_close:
rfcomm_k_state_change:
lock_sock(sk)
This patch drops the sk lock before calling rfcomm_dlc_open to
avoid the possible deadlock and holds sk's reference count to
prevent use-after-free after rfcomm_dlc_open completes.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
VMCI: Use threaded irqs instead of tasklets
The vmci_dispatch_dgs() tasklet function calls vmci_read_data()
which uses wait_event() resulting in invalid sleep in an atomic
context (and therefore potentially in a deadlock).
Use threaded irqs to fix this issue and completely remove usage
of tasklets.
[ 20.264639] BUG: sleeping function called from invalid context at drivers/misc/vmw_vmci/vmci_guest.c:145
[ 20.264643] in_at ...
In the Linux kernel, the following vulnerability has been resolved:
VMCI: Use threaded irqs instead of tasklets
The vmci_dispatch_dgs() tasklet function calls vmci_read_data()
which uses wait_event() resulting in invalid sleep in an atomic
context (and therefore potentially in a deadlock).
Use threaded irqs to fix this issue and completely remove usage
of tasklets.
[ 20.264639] BUG: sleeping function called from invalid context at drivers/misc/vmw_vmci/vmci_guest.c:145
[ 20.264643] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 762, name: vmtoolsd
[ 20.264645] preempt_count: 101, expected: 0
[ 20.264646] RCU nest depth: 0, expected: 0
[ 20.264647] 1 lock held by vmtoolsd/762:
[ 20.264648] #0: ffff0000874ae440 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: vsock_connect+0x60/0x330 [vsock]
[ 20.264658] Preemption disabled at:
[ 20.264659] [<ffff80000151d7d8>] vmci_send_datagram+0x44/0xa0 [vmw_vmci]
[ 20.264665] CPU: 0 PID: 762 Comm: vmtoolsd Not tainted 5.19.0-0.rc8.20220727git39c3c396f813.60.fc37.aarch64 #1
[ 20.264667] Hardware name: VMware, Inc. VBSA/VBSA, BIOS VEFI 12/31/2020
[ 20.264668] Call trace:
[ 20.264669] dump_backtrace+0xc4/0x130
[ 20.264672] show_stack+0x24/0x80
[ 20.264673] dump_stack_lvl+0x88/0xb4
[ 20.264676] dump_stack+0x18/0x34
[ 20.264677] __might_resched+0x1a0/0x280
[ 20.264679] __might_sleep+0x58/0x90
[ 20.264681] vmci_read_data+0x74/0x120 [vmw_vmci]
[ 20.264683] vmci_dispatch_dgs+0x64/0x204 [vmw_vmci]
[ 20.264686] tasklet_action_common.constprop.0+0x13c/0x150
[ 20.264688] tasklet_action+0x40/0x50
[ 20.264689] __do_softirq+0x23c/0x6b4
[ 20.264690] __irq_exit_rcu+0x104/0x214
[ 20.264691] irq_exit_rcu+0x1c/0x50
[ 20.264693] el1_interrupt+0x38/0x6c
[ 20.264695] el1h_64_irq_handler+0x18/0x24
[ 20.264696] el1h_64_irq+0x68/0x6c
[ 20.264697] preempt_count_sub+0xa4/0xe0
[ 20.264698] _raw_spin_unlock_irqrestore+0x64/0xb0
[ 20.264701] vmci_send_datagram+0x7c/0xa0 [vmw_vmci]
[ 20.264703] vmci_datagram_dispatch+0x84/0x100 [vmw_vmci]
[ 20.264706] vmci_datagram_send+0x2c/0x40 [vmw_vmci]
[ 20.264709] vmci_transport_send_control_pkt+0xb8/0x120 [vmw_vsock_vmci_transport]
[ 20.264711] vmci_transport_connect+0x40/0x7c [vmw_vsock_vmci_transport]
[ 20.264713] vsock_connect+0x278/0x330 [vsock]
[ 20.264715] __sys_connect_file+0x8c/0xc0
[ 20.264718] __sys_connect+0x84/0xb4
[ 20.264720] __arm64_sys_connect+0x2c/0x3c
[ 20.264721] invoke_syscall+0x78/0x100
[ 20.264723] el0_svc_common.constprop.0+0x68/0x124
[ 20.264724] do_el0_svc+0x38/0x4c
[ 20.264725] el0_svc+0x60/0x180
[ 20.264726] el0t_64_sync_handler+0x11c/0x150
[ 20.264728] el0t_64_sync+0x190/0x194
Show More
|
|
In the Linux kernel, the following vulnerability has been resolved:
f2fs: initialize locks earlier in f2fs_fill_super()
syzbot is reporting lockdep warning at f2fs_handle_error() [1], for
spin_lock(&sbi->error_lock) is called before spin_lock_init() is called.
For safe locking in error handling, move initialization of locks (and
obvious structures) in f2fs_fill_super() to immediately after memory
allocation.
|
In the Linux kernel, the following vulnerability has been resolved:
ALSA: timer: Don't take register_mutex with copy_from/to_user()
The infamous mmap_lock taken in copy_from/to_user() can be often
problematic when it's called inside another mutex, as they might lead
to deadlocks.
In the case of ALSA timer code, the bad pattern is with
guard(mutex)(®ister_mutex) that covers copy_from/to_user() -- which
was mistakenly introduced at converting to guard(), and it had been
carefully worked arou ...
In the Linux kernel, the following vulnerability has been resolved:
ALSA: timer: Don't take register_mutex with copy_from/to_user()
The infamous mmap_lock taken in copy_from/to_user() can be often
problematic when it's called inside another mutex, as they might lead
to deadlocks.
In the case of ALSA timer code, the bad pattern is with
guard(mutex)(®ister_mutex) that covers copy_from/to_user() -- which
was mistakenly introduced at converting to guard(), and it had been
carefully worked around in the past.
This patch fixes those pieces simply by moving copy_from/to_user() out
of the register mutex lock again.
Show More
|
In the Linux kernel, the following vulnerability has been resolved:
Revert "arm64: dts: qcom: sdm845: Affirm IDR0.CCTW on apps_smmu"
There are reports that the pagetable walker cache coherency is not a
given across the spectrum of SDM845/850 devices, leading to lock-ups
and resets. It works fine on some devices (like the Dragonboard 845c,
but not so much on the Lenovo Yoga C630).
This unfortunately looks like a fluke in firmware development, where
likely somewhere in the vast hypervisor stack ...
In the Linux kernel, the following vulnerability has been resolved:
Revert "arm64: dts: qcom: sdm845: Affirm IDR0.CCTW on apps_smmu"
There are reports that the pagetable walker cache coherency is not a
given across the spectrum of SDM845/850 devices, leading to lock-ups
and resets. It works fine on some devices (like the Dragonboard 845c,
but not so much on the Lenovo Yoga C630).
This unfortunately looks like a fluke in firmware development, where
likely somewhere in the vast hypervisor stack, a change to accommodate
for this was only introduced after the initial software release (which
often serves as a baseline for products).
Revert the change to avoid additional guesswork around crashes.
This reverts commit 6b31a9744b8726c69bb0af290f8475a368a4b805.
Show More
|