sched: psi: fix unprivileged polling against cgroups

519fabc7aaba ("psi: remove 500ms min window size limitation for
triggers") breaks unprivileged psi polling on cgroups.

Historically, we had a privilege check for polling in the open() of a
pressure file in /proc, but were erroneously missing it for the open()
of cgroup pressure files.

When unprivileged polling was introduced in d82caa273565 ("sched/psi:
Allow unprivileged polling of N*2s period"), it needed to filter
privileges depending on the exact polling parameters, and as such
moved the CAP_SYS_RESOURCE check from the proc open() callback to
psi_trigger_create(). Both the proc files and the cgroup files go
through this during write(). This implicitly added the missing check
for privileges required for RT polling for cgroups.
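
As a rough illustration of the parameter-dependent filtering described
above, the following is a simplified userspace model, not the kernel
source. The function name and the constant are hypothetical, and the
"multiple of 2s" rule is inferred from the subject line of d82caa273565;
all other validation done by psi_trigger_create() is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: the 2s period mentioned in d82caa273565, in microseconds. */
#define MODEL_AVGS_PERIOD_US 2000000ULL

/*
 * Simplified model of the privilege filter (assumption, not kernel code):
 * a privileged writer may request any non-zero window, while an
 * unprivileged writer may only request windows that are a multiple of 2s.
 */
static bool model_psi_window_allowed(bool has_cap_sys_resource,
				     uint64_t window_us)
{
	if (window_us == 0)
		return false;
	if (has_cap_sys_resource)
		return true;
	return window_us % MODEL_AVGS_PERIOD_US == 0;
}
```

Under this model an unprivileged request for a 2000000us window passes,
while a 500000us window only passes with CAP_SYS_RESOURCE.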

When 519fabc7aaba ("psi: remove 500ms min window size limitation for
triggers") followed right after to remove further restrictions on the
RT polling window, it incorrectly assumed the cgroup privilege check
was still missing and added it to the cgroup open(), mirroring what we
used to do for proc files in the past.

As a result, unprivileged poll requests that would now be supported
get rejected when opening the cgroup pressure file for writing.

Remove the cgroup open() check. psi_trigger_create() handles it.
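
For context, a minimal userspace sketch of the affected path: opening a
cgroup pressure file for writing and registering a trigger. The helper
names are hypothetical, the cgroup path is supplied by the caller, and
the trigger string follows the "<some|full> <stall us> <window us>"
form used by the psi interface. With the open() check removed, the
open() itself is expected to succeed for an unprivileged caller, and
the privilege decision is made at write() time.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a trigger string such as
 * "some 150000 2000000". Returns the string length, or -1 if the
 * buffer is too small.
 */
static int format_psi_trigger(char *buf, size_t len, const char *kind,
			      unsigned long stall_us, unsigned long window_us)
{
	int n = snprintf(buf, len, "%s %lu %lu", kind, stall_us, window_us);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: open a pressure file read-write and write the
 * trigger. Returns an fd suitable for poll() on success, -1 on failure
 * (errno is left as set by open() or write()).
 */
static int register_psi_trigger(const char *path, const char *trigger)
{
	int fd = open(path, O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;

	/* The terminating NUL is written along with the string. */
	if (write(fd, trigger, strlen(trigger) + 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would then wait for events with poll() and POLLPRI on the
returned descriptor, and close() it to remove the trigger.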

Fixes: 519fabc7aaba ("psi: remove 500ms min window size limitation for triggers")
Reported-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: stable@vger.kernel.org # 6.5+
Link: https://lore.kernel.org/r/20231026164114.2488682-1-hannes@cmpxchg.org

Authored by Johannes Weiner and committed by Peter Zijlstra
8b39d20e eab03c23

-12
kernel/cgroup/cgroup.c
···
 	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
 }
 
-static int cgroup_pressure_open(struct kernfs_open_file *of)
-{
-	if (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
-		return -EPERM;
-
-	return 0;
-}
-
 static void cgroup_pressure_release(struct kernfs_open_file *of)
 {
 	struct cgroup_file_ctx *ctx = of->priv;
···
 	{
 		.name = "io.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
-		.open = cgroup_pressure_open,
 		.seq_show = cgroup_io_pressure_show,
 		.write = cgroup_io_pressure_write,
 		.poll = cgroup_pressure_poll,
···
 	{
 		.name = "memory.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
-		.open = cgroup_pressure_open,
 		.seq_show = cgroup_memory_pressure_show,
 		.write = cgroup_memory_pressure_write,
 		.poll = cgroup_pressure_poll,
···
 	{
 		.name = "cpu.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
-		.open = cgroup_pressure_open,
 		.seq_show = cgroup_cpu_pressure_show,
 		.write = cgroup_cpu_pressure_write,
 		.poll = cgroup_pressure_poll,
···
 	{
 		.name = "irq.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
-		.open = cgroup_pressure_open,
 		.seq_show = cgroup_irq_pressure_show,
 		.write = cgroup_irq_pressure_write,
 		.poll = cgroup_pressure_poll,