commits

tjh.dev / kernel

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Author | Commit | Message | Date

Linus Torvalds

fe15c26e

Linux 6.3-rc1 v6.3-rc1

2y ago

Linus Torvalds

596ff4a0

cpumask: re-introduce constant-sized cpumask optimizations

Commit aa47a7c215e7 ("lib/cpumask: deprecate nr_cpumask_bits") resulted
in the cpumask operations potentially becoming hugely less efficient,
because suddenly the cpumask was always considered to be variable-sized.

The optimization was then later added back in a limited form by commit
6f9c07be9d02 ("lib/cpumask: add FORCE_NR_CPUS config option"), but that
FORCE_NR_CPUS option is not useful in a generic kernel and more of a
special case for embedded situations with fixed hardware.

Instead, just re-introduce the optimization, with some changes.

Instead of depending on CPUMASK_OFFSTACK being false, and then always
using the full constant cpumask width, this introduces three different
cpumask "sizes":

- the exact size (nr_cpumask_bits) remains identical to nr_cpu_ids.

This is used for situations where we should use the exact size.

- the "small" size (small_cpumask_bits) is the NR_CPUS constant if it
fits in a single word and the bitmap operations thus end up able
to trigger the "small_const_nbits()" optimizations.

This is used for the operations that have optimized single-word
cases that get inlined, notably the bit find and scanning functions.

- the "large" size (large_cpumask_bits) is the NR_CPUS constant if it
is a sufficiently small constant that makes simple "copy" and
"clear" operations more efficient.

This is arbitrarily set at four words or less.
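
The three sizes described above can be sketched in plain C as follows. This is a user-space illustration only, not the kernel's actual definitions: the function names, the `WORD_BITS` constant, and passing the build-time limit as a parameter are all assumptions made so the selection logic can be read and tested in isolation.

```c
/* Assumption: 64-bit words, as on x86-64. */
#define WORD_BITS 64u

/* Exact size: always the runtime CPU count. */
static unsigned int exact_bits(unsigned int nr_cpu_ids)
{
	return nr_cpu_ids;
}

/*
 * "Small" size: use the build-time constant when the whole mask fits
 * in a single word, otherwise fall back to the runtime count.
 */
static unsigned int small_bits(unsigned int nr_cpus, unsigned int nr_cpu_ids)
{
	return nr_cpus <= WORD_BITS ? nr_cpus : nr_cpu_ids;
}

/*
 * "Large" size: use the build-time constant when the mask fits in
 * four words or less, otherwise fall back to the runtime count.
 */
static unsigned int large_bits(unsigned int nr_cpus, unsigned int nr_cpu_ids)
{
	return nr_cpus <= 4 * WORD_BITS ? nr_cpus : nr_cpu_ids;
}
```

With a hypothetical build limit of 256 CPUs and 8 CPUs present at runtime, the "small" size falls back to 8 while the "large" size stays at the constant 256.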

As an example of this situation, without this fixed size optimization,
cpumask_clear() will generate code like

movl nr_cpu_ids(%rip), %edx
addq $63, %rdx
shrq $3, %rdx
andl $-8, %edx
callq memset@PLT

on x86-64, because it would calculate the "exact" number of longwords
that need to be cleared.

In contrast, with this patch, using an NR_CPUS of 64 (which is quite a
reasonable value to use), the above becomes a single

movq $0,cpumask

instruction instead, because rather than working out exactly how
many CPUs the system has, it just knows that the cpumask will be a
single word and can just clear it all.
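
The difference between the two code sequences can be mirrored in a small C sketch. It assumes 64-bit words and uses hypothetical helper names; the byte-count arithmetic follows the add/shift/mask steps in the assembly quoted above, but this is not the kernel's cpumask_clear().

```c
#include <stdint.h>
#include <string.h>

/* Bytes the "exact" path clears: nbits rounded up to whole 64-bit words. */
static size_t exact_clear_bytes(unsigned int nbits)
{
	return (((size_t)nbits + 63) >> 3) & ~(size_t)7;
}

/* Variable-sized clear: the length depends on the runtime CPU count. */
static void clear_exact(uint64_t *mask, unsigned int nr_cpu_ids)
{
	memset(mask, 0, exact_clear_bytes(nr_cpu_ids));
}

/* Constant-sized clear for a one-word mask: a single store. */
static void clear_one_word(uint64_t *mask)
{
	mask[0] = 0;
}
```

When the size is a compile-time constant of one word, the compiler has no length to compute and no call to make, which is why the fixed-size variant can collapse to one store.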

Note that this does end up tightening the rules a bit from the original
version in another way: operations that set bits in the cpumask are now
limited to the actual nr_cpu_ids limit, whereas we used to do the
nr_cpumask_bits thing almost everywhere in the cpumask code.

But if you just clear bits, or scan for bits, we can use the simpler
compile-time constants.

In the process, remove 'cpumask_complement()' and 'for_each_cpu_not()'
which were not useful, and which fundamentally have to be limited to
'nr_cpu_ids'. Better remove them now than have somebody introduce use
of them later.

Of course, on x86-64 with MAXSMP there is no sane small compile-time
constant for the cpumask sizes, and we end up using the actual CPU bits,
and will generate the above kind of horrors regardless. Please don't
use MAXSMP unless you really expect to have machines with thousands of
cores.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2y ago

Linus Torvalds

f915322f

Merge tag 'v6.3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

2y ago

Linus Torvalds

7f9ec7d8

Merge tag 'x86-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Herbert Xu

660ca947

crypto: caam - Fix edesc/iv ordering mixup

2y ago

Linus Torvalds

4e9c542c

Merge tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Tom Lendacky

dd093fb0

virt/sev-guest: Return -EIO if certificate buffer is not large enough

2y ago

Taehee Yoo

8b844753

crypto: x86/aria-avx - Do not use avx2 instructions

2y ago

Linus Torvalds

1a90673e

Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

2y ago

Thomas Gleixner

0fb7fb71

genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced

2y ago

KP Singh

e02b50ca

Documentation/hw-vuln: Document the interaction between IBRS and STIBP

2y ago

Herbert Xu

eb331088

crypto: aspeed - Fix modular aspeed-acry

2y ago

Linus Torvalds

1a8d05a7

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

2y ago

Al Viro

3304f18b

Adding VFS co-maintainer

2y ago

Johan Hovold

ea9a78c3

genirq/msi: Drop dead domain name assignment

2y ago

KP Singh

6921ed90

x86/speculation: Allow enabling STIBP with legacy IBRS

2y ago

Weili Qian

ced18fd1

crypto: hisilicon/qm - fix coding style issues

2y ago

Masahiro Yamada

95207db8

Remove Intel compiler support

2y ago

Al Viro

caa82ae7

openrisc: fix livelock in uaccess

2y ago

Linus Torvalds

c9c3395d

Linux 6.2 v6.2

2y ago

Juergen Gross

ad32ab96

irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy()

2y ago

Linus Torvalds

87793476

Merge tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Weili Qian

9b4eb8f8

crypto: hisilicon/qm - update comments to match function

2y ago

Linus Torvalds

b01fe98d

Merge tag 'i2c-for-6.3-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

2y ago

Al Viro

e902e508

nios2: fix livelock in uaccess

2y ago

Linus Torvalds

925cf045

Merge tag 'x86-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Thomas Gleixner

5931e4eb

Merge branch 'irq/core' into irq/urgent

2y ago

Dave Hansen

74e19ef0

uaccess: Add speculation barrier to copy_from_user()

2y ago

Alexey Kardashevskiy

79146957

x86/amd: Cache debug register values in percpu variables

2y ago

Weili Qian

ac80056f

crypto: hisilicon/qm - change function names

2y ago

Linus Torvalds

e77d587a

mm: avoid gcc complaint about pointer casting

The migration code ends up temporarily stashing information of the wrong
type in unused fields of the newly allocated destination folio. That
all works fine, but gcc does complain about the pointer type mis-use:

mm/migrate.c: In function ‘__migrate_folio_extract’:
mm/migrate.c:1050:20: note: randstruct: casting between randomized structure pointer types (ssa): ‘struct anon_vma’ and ‘struct address_space’

1050 | *anon_vmap = (void *)dst->mapping;
| ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~

and gcc is actually right to complain since it really doesn't understand
that this is a very temporary special case where this is ok.

This could be fixed in different ways by just obfuscating the assignment
sufficiently that gcc doesn't see what is going on, but the truly
"proper C" way to do this is by explicitly using a union.

Using unions for type conversions like this is normally hugely ugly and
syntactically nasty, but this really is one of the few cases where we
want to make it clear that we're not doing type conversion, we're really
re-using the value bit-for-bit just using another type.

IOW, this should not become a common pattern, but in this one case using
that odd union is probably the best way to document to the compiler what
is conceptually going on here.
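
A minimal sketch of the union approach, assuming simplified stand-in types: the `_sketch` structures and helper names below are hypothetical and do not reproduce the layout used in mm/migrate.c. The point is only that the pointer value is stored through one union member and read back through another, with no cast between the two pointer types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two unrelated kernel structures. */
struct anon_vma_sketch { int refcount; };
struct address_space_sketch { int flags; };

/* Simplified destination object with one spare pointer field. */
struct folio_sketch {
	struct address_space_sketch *mapping;
};

/* The union documents that the same bits are reused under another type. */
union mapping_stash {
	struct address_space_sketch *mapping;
	struct anon_vma_sketch *anon_vma;
};

/* Temporarily park an anon_vma pointer in the unused mapping field. */
static void stash_anon_vma(struct folio_sketch *dst, struct anon_vma_sketch *av)
{
	union mapping_stash u = { .anon_vma = av };

	dst->mapping = u.mapping;
}

/* Recover the parked pointer and clear the field again. */
static struct anon_vma_sketch *extract_anon_vma(struct folio_sketch *dst)
{
	union mapping_stash u = { .mapping = dst->mapping };

	dst->mapping = NULL;
	return u.anon_vma;
}
```

This relies on both pointer types sharing one representation, which holds on the platforms the kernel supports.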

[ Side note: there are valid cases where we convert pointers to other
pointer types, notably the whole "folio vs page" situation, where the
types actually have fundamental commonalities.

The fact that the gcc note is limited to just randomized structures
means that we don't see equivalent warnings for those cases, but it
might also mean that we miss other cases where we do play these kinds
of dodgy games, and this kind of explicit conversion might be a good
idea. ]

I verified that at least for an allmodconfig build on x86-64, this
generates the exact same code, apart from line numbers and assembler
comment changes.

Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()")
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2y ago

Dan Carpenter

65609d32

i2c: gxp: fix an error code in probe

2y ago

Al Viro

a1179ac7

microblaze: fix livelock in uaccess

2y ago

Linus Torvalds

0097c18e

Merge tag 'timers-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Juergen Gross

f9f57da2

x86/mtrr: Revert 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case")

2y ago

Thomas Weißschuh

ce7980ae

genirq/irqdesc: Make kobj_type structures constant

2y ago

Thomas Gleixner

6f3ee0e2

Merge tag 'irqchip-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

2y ago

Linus Torvalds

1b72607d

Merge tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
"The majority of changes here are related to the general switch-over to
using arrays of generic trip point structures registered along with a
thermal zone instead of trip point callbacks (this has been done
mostly by Daniel Lezcano with some help from yours truly on the Intel
drivers front).

Apart from that and the related reorganization of code, there are some
enhancements of the existing driver and a new Mediatek Low Voltage
Thermal Sensor (LVTS) driver. The Intel powerclamp undergoes a major
rework so it will use the generic idle_inject facility for CPU idle
time injection going forward and it will take additional module
parameters for specifying the subset of CPUs to be affected by it
(work done by Srinivas Pandruvada).

Also included are assorted fixes and a whole bunch of cleanups.

Specifics:

- Rework a large bunch of drivers to use the generic thermal trip
structure and use the opportunity to do more cleanups by removing
unused functions from the OF code (Daniel Lezcano)

- Remove core header inclusion from drivers (Daniel Lezcano)

- Fix some locking issues related to the generic thermal trip rework
(Johan Hovold)

- Fix a crash when requesting the critical temperature on tegra,
which is related to the generic trip point work (Jon Hunter)

- Clean up thermal device unregistration code (Viresh Kumar)

- Fix and clean up thermal control core initialization error code
paths (Daniel Lezcano)

- Relocate the trip points handling code into a separate file (Daniel
Lezcano)

- Make the thermal core fail registration of thermal zones and
cooling devices if the thermal class has not been registered
(Rafael Wysocki)

- Add trip point initialization helper functions for ACPI-defined
trip points and modify two thermal drivers to use them (Rafael
Wysocki, Daniel Lezcano)

- Make the core thermal control code use sysfs_emit_at() instead of
scnprintf() where applicable (ye xingchen)

- Consolidate code accessing the Intel TCC (Thermal Control
Circuitry) MSRs by introducing library functions for that and
making the TCC-related code in thermal drivers use them (Zhang Rui)

- Enhance the x86_pkg_temp_thermal driver to support dynamic tjmax
changes (Zhang Rui)

- Address an "unsigned expression compared with zero" warning in the
intel_soc_dts_iosf thermal driver (Yang Li)

- Update comments regarding two functions in the Intel Menlow thermal
driver (Deming Wang)

- Use sysfs_emit_at() instead of scnprintf() in the int340x thermal
driver (ye xingchen)

- Make the intel_pch thermal driver support the Wellsburg PCH (Tim
Zimmermann)

- Modify the intel_pch and processor_thermal_device_pci thermal
drivers to use generic trip point tables instead of thermal zone trip
point callbacks (Daniel Lezcano)

- Add a production mode sysfs attribute to the int340x
thermal driver (Srinivas Pandruvada)

- Rework dynamic trip point updates handling and locking in the
int340x thermal driver (Rafael Wysocki)

- Make the int340x thermal driver use a generic trip points table
instead of thermal zone trip point callbacks (Rafael Wysocki,
Daniel Lezcano)

- Clean up and improve the int340x thermal driver (Rafael Wysocki)

- Simplify and clean up the intel_pch thermal driver (Rafael Wysocki)

- Fix the Intel powerclamp thermal driver and make it use the common
idle injection framework (Srinivas Pandruvada)

- Add two module parameters, cpumask and max_idle, to the Intel
powerclamp thermal driver to allow it to affect only a specific
subset of CPUs instead of all of them (Srinivas Pandruvada)

- Make the Intel quark_dts thermal driver use generic trip point
objects instead of its own trip point representation (Daniel
Lezcano)

- Add toctree entry for thermal documents and fix two issues in the
Intel powerclamp driver documentation (Bagas Sanjaya)

- Use strscpy() instead of strncpy() in the thermal core (Xu
Panda)

- Fix thermal_sampling_exit() (Vincent Guittot)

- Add Mediatek Low Voltage Thermal Sensor (LVTS) driver (Balsam
Chihi)

- Add r8a779g0 RCar support to the rcar_gen3 thermal driver (Geert
Uytterhoeven)

- Fix useless call to set_trips() when resuming in the rcar_gen3
thermal control driver and add interrupt support detection at init
time to it (Niklas Söderlund)

- Fix memory corruption in the hi3660 thermal driver (Yongqin Liu)

- Fix include path for libnl3 in pkg-config file for libthermal
(Vibhav Pant)

- Remove syscfg-based driver for st as the platform is not supported
any more (Alain Volmat)"

* tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (135 commits)
thermal/drivers/st: Remove syscfg based driver
thermal: Remove core header inclusion from drivers
tools/lib/thermal: Fix include path for libnl3 in pkg-config file.
thermal/drivers/hisi: Drop second sensor hi3660
thermal/drivers/rcar_gen3_thermal: Fix device initialization
thermal/drivers/rcar_gen3_thermal: Create device local ops struct
thermal/drivers/rcar_gen3_thermal: Do not call set_trips() when resuming
thermal/drivers/rcar_gen3: Add support for R-Car V4H
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779g0 support
thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver
dt-bindings: thermal: mediatek: Add LVTS thermal controllers
thermal/drivers/mediatek: Relocate driver to mediatek folder
tools/lib/thermal: Fix thermal_sampling_exit()
Documentation: powerclamp: Fix numbered lists formatting
Documentation: powerclamp: Escape wildcard in cpumask description
Documentation: admin-guide: Add toctree entry for thermal docs
thermal: intel: powerclamp: Add two module parameters
Documentation: admin-guide: Move intel_powerclamp documentation
thermal: core: Use sysfs_emit_at() instead of scnprintf()
thermal: intel: powerclamp: Fix duration module parameter
...

2y ago

Kim Phillips

8c19b6f2

KVM: x86: Propagate the AMD Automatic IBRS feature to the guest

3y ago

Weili Qian

f8de067c

crypto: hisilicon/qm - use min() instead of min_t()

2y ago

Linus Torvalds

20fdfd55

Merge tag 'mm-hotfixes-stable-2023-03-04-13-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2y ago

Wolfram Sang

4b3dfb0e

i2c: gxp: return proper error on address NACK

2y ago

Al Viro

d088af1e

ia64: fix livelock in uaccess

2y ago

Linus Torvalds

a33d946c

Merge tag 'irq-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Thomas Gleixner

d125d134

alarmtimer: Prevent starvation by small intervals and SIG_IGN

syzbot reported an RCU stall which is caused by setting up an alarmtimer
with a very small interval and ignoring the signal. The reproducer arms the
alarm timer with a relative expiry of 8ns and an interval of 9ns. That is
not a problem per se, but it becomes one when the signal is ignored: the
timer is then immediately rearmed, because there is no way to delay that
rearming to the signal delivery path. See posix_timer_fn() and commit
58229a189942 ("posix-timers: Prevent softirq starvation by small intervals
and SIG_IGN") for details.

The reproducer does not set SIG_IGN explicitly, but it sets up the timer's
signal with SIGCONT. That has the same effect as explicitly setting
SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and
the task is not ptraced.

The log clearly shows that:

[pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---

It works because the tasks are traced and therefore the signal is queued so
the tracer can see it, which delays the restart of the timer to the signal
delivery path. But then the tracer is killed:

[pid 5087] kill(-5102, SIGKILL <unfinished ...>
...
./strace-static-x86_64: Process 5107 detached

and after it's gone the stall can be observed:

syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns
[ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
...
[ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran:
[ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0:
[ 184.669821][ C0] NMI backtrace for cpu 0
[ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
...
[ 184.670036][ C0] Call Trace:
[ 184.670041][ C0] <IRQ>
[ 184.670045][ C0] alarmtimer_fired+0x327/0x670

posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffy and
artificially delaying it by shifting the next expiry out by a jiffy. That's
accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettime(2).

The comment in that function says what needs to be done and there was a fix
available for the regular userspace induced SIG_IGN mechanism, but that did
not work due to the implicit ignore for SIGCONT and similar signals. This
needs to be worked on, but for now the only available workaround is to do
exactly what posix_timer_fn() does:

Increase the interval of self-rearming timers, which have their signal
ignored, to at least a jiffy.
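
The workaround amounts to a clamp on the rearm interval. The following is a user-space sketch, not the alarmtimer code itself: the function name, the `signal_ignored` flag, and the tick rate `HZ_SKETCH` are assumptions chosen for illustration.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL
/* Assumption: 1000 ticks per second, so one tick is 1 ms. */
#define HZ_SKETCH 1000
#define TICK_NSEC_SKETCH (NSEC_PER_SEC_SKETCH / HZ_SKETCH)

/*
 * Return the interval a periodic timer should actually be rearmed with.
 * One-shot timers (interval 0) and timers whose signal is delivered are
 * left alone; only ignored-signal periodic timers are slowed down.
 */
static int64_t effective_interval_ns(int64_t interval_ns, int signal_ignored)
{
	if (interval_ns > 0 && signal_ignored &&
	    interval_ns < TICK_NSEC_SKETCH)
		return TICK_NSEC_SKETCH;
	return interval_ns;
}
```

Under these assumptions, the reproducer's 9ns interval with an ignored signal would be raised to one tick, while the same interval with a delivered signal is left unchanged.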

Interestingly this has been fixed before via commit ff86bf0c65f1
("alarmtimer: Rate limit periodic intervals") already, but that fix got
lost in a later rework.

Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com
Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx

2y ago

Linus Torvalds

ceaa837f

Linux 6.2-rc8 v6.2-rc8

2y ago

Reinette Chatre

e6cc6f17

PCI/MSI: Clarify usage of pci_msix_free_irq()

2y ago

Ingo Molnar

188a5696

genirq/affinity: Only build SMP-only helper functions on SMP kernels

3y ago

Marc Zyngier

a83bf176

Merge branch irq/bcm-l2-fixes into irq/irqchip-next

2y ago

Linus Torvalds

88af9b16

Merge tag 'acpi-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
"These fix a frequency limit issue in the ACPI processor performance
library code, fix a few issues in the ACPICA code, improve Crystal
Cove support in the ACPI PMIC driver, fix string handling in the ACPI
battery driver, add IRQ override quirks for a few more machines, fix
other assorted problems and clean up code and documentation.

Specifics:

- Drop port I/O validation for some regions to avoid AML failures due
to rejections of legitimate port I/O writes (Mario Limonciello)

- Constify acpi_get_handle() pathname argument to allow its callers
to pass const pathnames to it (Sakari Ailus)

- Prevent acpi_ns_simple_repair() from crashing in some cases when
AE_AML_NO_RETURN_VALUE should be returned (Daniil Tatianin)

- Fix typo in CDAT DSMAS struct definition (Lukas Wunner)

- Drop an unnecessary (void *) conversion from the ACPI processor
driver (Zhou jie)

- Modify the ACPI processor performance library code to use the "no
limit" frequency QoS as appropriate and adjust the intel_pstate
driver accordingly (Rafael Wysocki)

- Add support for NBFT to the ACPI table parser (Stuart Hayes)

- Introduce list of known non-PNP devices to avoid enumerating some
of them as PNP devices (Rafael Wysocki)

- Add x86 ACPI paths to the ACPI entry in MAINTAINERS to allow
scripts to report the actual maintainers information (Rafael
Wysocki)

- Add two more entries to the ACPI IRQ override quirk list (Adam
Niederer, Werner Sembach)

- Add a pmic_i2c_address entry for Intel Bay Trail Crystal Cove to
allow intel_soc_pmic_exec_mipi_pmic_seq_element() to be used with
the Bay Trail Crystal Cove PMIC OpRegion driver (Hans de Goede)

- Add comments with DSDT power OpRegion field names to the ACPI PMIC
driver (Hans de Goede)

- Fix string termination handling in the ACPI battery driver (Armin
Wolf)

- Limit error type to 32-bit width in the ACPI APEI error injection
code (Shuai Xue)

- Fix Lenovo Ideapad Z570 DMI match in the ACPI backlight driver
(Hans de Goede)

- Silence missing prototype warnings in some places in the
ACPI-related code (Ammar Faizi)

- Make kobj_type structures used in the ACPI code constant (Thomas
Weißschuh)

- Correct spelling in firmware-guide/ACPI (Randy Dunlap)

- Clarify the meaning of Explicit and Implicit in the _DSD GPIO
properties documentation (Andy Shevchenko)

- Fix some kernel-doc comments in the ACPI CPPC library code (Yang
Li)"

* tag 'acpi-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (25 commits)
ACPI: make kobj_type structures constant
Documentation: firmware-guide: gpio-properties: Clarify Explicit and Implicit
ACPICA: Fix typo in CDAT DSMAS struct definition
ACPI: resource: Do IRQ override on all TongFang GMxRGxx
ACPI: resource: Add IRQ overrides for MAINGEAR Vector Pro 2 models
ACPI: CPPC: Fix some kernel-doc comments
ACPI: video: Fix Lenovo Ideapad Z570 DMI match
Documentation: firmware-guide/ACPI: correct spelling
ACPI: PMIC: Add comments with DSDT power opregion field names
ACPI: battery: Increase maximum string length
ACPI: battery: Fix buffer overread if not NUL-terminated
ACPI: APEI: EINJ: Limit error type to 32-bit width
MAINTAINERS: Add x86 ACPI paths to the ACPI entry
ACPI: battery: Fix missing NUL-termination with large strings
ACPI: PNP: Introduce list of known non-PNP devices
ACPICA: nsrepair: handle cases without a return value correctly
ACPI: Silence missing prototype warnings
cpufreq: intel_pstate: Drop ACPI _PSS states table patching
ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
ACPI: processor: perflib: Use the "no limit" frequency QoS
...