Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths

The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit()
path is very broken.

sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct
itself. If tsk != current and tsk is not a leader, this process can exit/exec
and task_lock(tsk->group_leader) may use the already freed task_struct.

Another problem is that sys_prlimit64() can race with mt-exec which changes
->group_leader. In this case do_prlimit() may take the wrong lock, or (worse)
->group_leader may change between task_lock() and task_unlock().

Change sys_prlimit64() to take tasklist_lock when necessary. This is not
nice, but I don't see a better fix for -stable.

Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com
Fixes: 18c91bb2d872 ("prlimit: do not grab the tasklist_lock")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Oleg Nesterov and committed by Andrew Morton
a15f37a4 39f17c70

+20 -2
kernel/sys.c
···
 	struct rlimit old, new;
 	struct task_struct *tsk;
 	unsigned int checkflags = 0;
+	bool need_tasklist;
 	int ret;

 	if (old_rlim)
···
 	get_task_struct(tsk);
 	rcu_read_unlock();

-	ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
-			old_rlim ? &old : NULL);
+	need_tasklist = !same_thread_group(tsk, current);
+	if (need_tasklist) {
+		/*
+		 * Ensure we can't race with group exit or de_thread(),
+		 * so tsk->group_leader can't be freed or changed until
+		 * read_unlock(tasklist_lock) below.
+		 */
+		read_lock(&tasklist_lock);
+		if (!pid_alive(tsk))
+			ret = -ESRCH;
+	}
+
+	if (!ret) {
+		ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
+				old_rlim ? &old : NULL);
+	}
+
+	if (need_tasklist)
+		read_unlock(&tasklist_lock);

 	if (!ret && old_rlim) {
 		rlim_to_rlim64(&old, &old64);