commits

We employ a "waitboost" heuristic to detect when userspace is stalled
waiting for results from earlier execution. Under latency-sensitive work
mixed between the GPU and CPU, the GPU is typically under-utilised and so
RPS sees that low utilisation as a reason to downclock the frequency,
causing longer stalls and lower throughput. The user left waiting for
the results is not impressed.

On applying commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv
workaround") it was observed that the performance of deinterlacing h264
on Haswell dropped by 2-5x. The reason is that the natural workload was
not intense enough to trigger RPS (using HW evaluation intervals) to
upclock, and so it was depending on waitboosting for the throughput.

Commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
changes the composition of dma-resv from keeping a single write fence +
multiple read fences, to a single array of multiple write and read
fences (a maximum of one pair of write/read fences per context). The
iteration order was also changed implicitly from all-read fences then
the single write fence, to a mix of write fences followed by read
fences. It is that ordering change that exposed the fragility of
waitboosting.

Currently, a waitboost is inspected at the point of waiting on an
outstanding fence. If the GPU is backlogged such that we haven't yet
started the request we need to wait on, we force the GPU to upclock until
the completion of that request. By changing the order in which we waited
upon requests, we ended up waiting on those requests in sequence and as
such we saw that each request was already started and so not a suitable
candidate for waitboosting.

Instead of asking whether to boost each fence in turn, we can look at
whether boosting is required for the dma-resv ensemble prior to waiting
on any fence, making the heuristic more robust to the order in which
fences are stored in the dma-resv.
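
The difference between the two strategies can be sketched with a toy
model. This is not the i915 implementation: the names fake_request,
fake_wait, wait_boost_per_fence and wait_boost_ensemble are hypothetical,
and the model assumes the GPU runs requests strictly in submission order,
so that waiting on one request completes everything before it and starts
the next one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU request behind a fence. */
struct fake_request {
	bool started;
	bool completed;
};

/*
 * Toy model of waiting: the GPU executes the queue in order, so waiting
 * on queue[idx] completes every request up to idx and lets the next
 * request begin executing.
 */
static void fake_wait(struct fake_request *queue, size_t n, size_t idx)
{
	for (size_t i = 0; i <= idx && i < n; i++) {
		queue[i].started = true;
		queue[i].completed = true;
	}
	if (idx + 1 < n)
		queue[idx + 1].started = true;
}

/* Old behaviour: decide whether to boost at each individual wait. */
static int wait_boost_per_fence(struct fake_request *queue, size_t n,
				const size_t *order, size_t count)
{
	int boosts = 0;

	for (size_t i = 0; i < count; i++) {
		struct fake_request *rq = &queue[order[i]];

		if (rq->completed)
			continue;
		if (!rq->started)
			boosts++;	/* backlogged: ask for an upclock */
		fake_wait(queue, n, order[i]);
	}
	return boosts;
}

/* New behaviour: inspect the whole set once, before waiting on anything. */
static int wait_boost_ensemble(struct fake_request *queue, size_t n,
			       const size_t *order, size_t count)
{
	int boosts = 0;

	for (size_t i = 0; i < count; i++) {
		const struct fake_request *rq = &queue[order[i]];

		if (!rq->completed && !rq->started) {
			boosts = 1;	/* one boost covers the ensemble */
			break;
		}
	}
	for (size_t i = 0; i < count; i++) {
		if (!queue[order[i]].completed)
			fake_wait(queue, n, order[i]);
	}
	return boosts;
}
```

In this model, a three-request backlog in which only the first request
has started is never boosted by the per-fence check when the fences are
visited in submission order, whereas the ensemble check boosts once
regardless of the order.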

Reported-by: Thomas Voegtle <tv@lio96.de>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6284
Fixes: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Tested-by: Thomas Voegtle <tv@lio96.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/07e05518d9f6620d20cc1101ec1849203fe973f9.1657289332.git.karolina.drobnik@intel.com
(cherry picked from commit 394e2b57a989113de494c52d4683444bcb02d4e1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3y ago

Linus Torvalds

ae21fbac

Merge tag 'acpi-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

3y ago

Ben Dooks

c1f6eff3

riscv: add as-options for modules with assembly compontents

3y ago

Sai Krishna Potthuri

e1502ba4

spi: spi-cadence: Fix SPI NO Slave Select macro definition

3y ago

Gavin Shan

e923b053

KVM: selftests: Fix target thread to be migrated in rseq_test

3y ago

Linus Torvalds

f2906aa8

Linux 5.19-rc1 v5.19-rc1

3y ago

Tom Lendacky

908fc4c2

virt: sev-guest: Pass the appropriate argument type to iounmap()

3y ago

Linus Torvalds

2eccaca7

Merge tag 'gpio-fixes-for-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

3y ago

Kim Phillips

bcf16315

x86/bugs: Remove apostrophe typo

3y ago

Linus Torvalds

88084a3d

Linux 5.19-rc5 v5.19-rc5

3y ago

Adrian Hunter

498c7a54

perf tests: Stop Convert perf time to TSC test opening events twice

3y ago

Chris Wilson

a1c5a7bf

drm/i915/gt: Serialize TLB invalidates with GT resets

3y ago

Linus Torvalds

a5235996

Merge tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block

3y ago

Mario Limonciello

09073396

ACPI: CPPC: Don't require flexible address space if X86_FEATURE_CPPC is supported

3y ago

Krzysztof Kozlowski

89551fdd

riscv: dts: align gpio-key node names with dtschema

3y ago

Marc Kleine-Budde

4ceaa684

spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers

3y ago

Oliver Upton

450a5639

KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats

3y ago

Linus Torvalds

6684cf42

Merge tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

3y ago

Peter Zijlstra

28a99e95

x86/amd: Use IBPB for firmware calls

3y ago

Linus Torvalds

8ad4b6fa

Merge tag 'input-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

3y ago

Bartosz Golaszewski

7329b071

gpio: sim: fix the chip_name configfs item

3y ago

Peter Zijlstra

564d9981

um: Add missing apply_returns()

3y ago

Linus Torvalds

b8d5109f

lockref: remove unused 'lockref_get_or_lock()' function

Looking at the conditional lock acquire functions in the kernel due to
the new sparse support (see commit 4a557a5d1a61 "sparse: introduce
conditional lock acquire function attribute"), it became obvious that
the lockref code has a couple of them, but they don't match the usual
naming convention for the other ones, and their return value logic is
also reversed.

In the other very similar places, the naming pattern is '*_and_lock()'
(eg 'atomic_dec_and_lock()' and 'refcount_dec_and_lock()'), and the
function returns true when the lock is taken.

The lockref code is superficially very similar to the refcount code,
only with the special "atomic wrt the embedded lock" semantics. But
instead of the '*_and_lock()' naming it uses '*_or_lock()'.

And instead of returning true in case it took the lock, it returns true
if it *didn't* take the lock.

Now, arguably the lockref code is quite logical: it really is an "either
decrement _or_ lock" kind of situation - and the return value is about
whether the operation succeeded without any special care needed.
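
The two return conventions can be contrasted in a small single-threaded
sketch. The names demo_ref, demo_dec_and_lock and demo_put_or_lock are
hypothetical, and a plain bool stands in for the spinlock; this only
models the return values and is not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in: a count with an embedded "lock" flag. */
struct demo_ref {
	int count;
	bool locked;
};

/*
 * '*_and_lock()' convention: decrement, and return true only when the
 * count reached zero, in which case the lock is held on return.
 */
static bool demo_dec_and_lock(struct demo_ref *ref)
{
	if (ref->count > 1) {
		ref->count--;
		return false;	/* fast path, lock not taken */
	}
	ref->locked = true;
	ref->count--;
	return true;		/* caller now holds the lock */
}

/*
 * '*_or_lock()' convention: either decrement _or_ lock. Return true
 * when the decrement happened without the lock; return false when the
 * count was <= 1, leaving it untouched with the lock held.
 */
static bool demo_put_or_lock(struct demo_ref *ref)
{
	if (ref->count > 1) {
		ref->count--;
		return true;	/* done, no special care needed */
	}
	ref->locked = true;
	return false;		/* caller holds the lock, count unchanged */
}
```

Starting from a count of 2, the first call to either helper takes the
fast path and the second takes the lock, but the return values are
inverted between the two conventions.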

So despite the similarities, the differences do make some sense, and
maybe it's not worth trying to unify the different conditional locking
primitives in this area.

But while looking at this all, it did become obvious that the
'lockref_get_or_lock()' function hasn't actually had any users for
almost a decade.

The only user it ever had was the shortlived 'd_rcu_to_refcount()'
function, and it got removed and replaced with 'lockref_get_not_dead()'
back in 2013 in commits 0d98439ea3c6 ("vfs: use lockred 'dead' flag to
mark unrecoverably dead dentries") and e5c832d55588 ("vfs: fix dentry
RCU to refcounting possibly sleeping dput()")

In fact, that single use was removed less than a week after the whole
function was introduced in commit b3abd80250c1 ("lockref: add
'lockref_get_or_lock() helper") so this function has been around for a
decade, but only had a user for six days.

Let's just put this mis-designed and unused function out of its misery.

We can think about the naming and semantic oddities of the remaining
'lockref_put_or_lock()' later, but at least that function has users.

And while the naming is different and the return value doesn't match,
that function matches the whole '{atomic,refcount}_dec_and_test()'
pattern much better (ie the magic happens when the count goes down to
zero, not when it is incremented from zero).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3y ago

Arnaldo Carvalho de Melo

91d248c3

tools arch x86: Sync the msr-index.h copy with the kernel sources

3y ago

Chris Wilson

b24dcf1d

drm/i915/gt: Serialize GRDOM access between multiple engine resets

3y ago

Linus Torvalds

d945404f

Merge tag 'block-5.19-2022-07-21' of git://git.kernel.dk/linux-block

3y ago

Dylan Yudaken

934447a6

io_uring: do not recycle buffer in READV

3y ago

Li Zhengyu

3a66a087

RISC-V: kexec: Fix build error without CONFIG_KEXEC

3y ago

Vaishnav Achath

73d5fe04

spi: cadence-quadspi: Remove spi_master_put() in probe failure path

3y ago

Linux 5.19-rc8 v5.19-rc8

e0dccc3b

Linus Torvalds

certs: make system keyring depend on x509 parser

e9088629

Adam Borowski

Merge tag 'perf_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

af2c9ac2

Linus Torvalds

Merge tag 'sched_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

c2602a7c

Linus Torvalds

perf/x86/intel/lbr: Fix unchecked MSR access error on HSW

b0380e13

Kan Liang

Merge tag 'x86_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

05017fed

Linus Torvalds

sched/deadline: Fix BUG_ON condition for deboosted tasks

ddfc7103

Juri Lelli

Linux 5.19-rc7 v5.19-rc7

ff699273

Linus Torvalds

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

714b82c1

Linus Torvalds

x86/speculation: Make all RETbleed mitigations 64-bit only

b648ab48

Ben Hutchings

Merge tag 'drm-intel-fixes-2022-07-17' of git://anongit.freedesktop.org/drm/drm-intel

55ea9bd6

Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

515f7141

Linus Torvalds

clk: lan966x: Fix the lan966x clock gate register address

25c2a075

Herve Codina

lkdtm: Disable return thunks in rodata.c

efc72a66

Josh Poimboeuf

Merge tag 'perf-tools-fixes-for-v5.19-2022-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

f7f4da30

Linus Torvalds

drm/i915/ttm: fix 32b build

ced7866d

Matthew Auld

Merge tag 'spi-fix-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

301c8949

Linus Torvalds

KVM: x86: Protect the unused bits in MSR exiting flags

cf5029d5

Aaron Lewis

MAINTAINERS: add include/dt-bindings/clock to COMMON CLK FRAMEWORK

a79e69c8

Lukas Bulwahn

x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts

eb23b5ef

Pawan Gupta

Merge tag 'perf_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2b18593e

Linus Torvalds

perf trace: Fix SIGSEGV when processing syscall args

On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

#0 0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
#1 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
#2 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
#3 0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
#4 syscall__scnprintf_args <snip> at builtin-trace.c:2041
#5 0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace->syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>