commits

Pull more perf tools updates from Arnaldo Carvalho de Melo:
"Fixes:
- Fixes for 'perf bench numa'.

- Always memset source before memcpy in 'perf bench mem'.

- Quote CC and CXX for their arguments to fix build in environments
using those variables to pass more than just the compiler names.

- Fix module symbol processing, addressing regression detected via
"perf test".

- Allow multiple probes in record+script_probe_vfs_getname.sh 'perf
test' entry.

Improvements:
- Add script to autogenerate socket family name id->string table from
copy of kernel header, used so far in 'perf trace'.

- 'perf ftrace' improvements to provide similar options for this
utility so that one can go from 'perf record', 'perf trace', etc to
'perf ftrace' just by changing the name of the subcommand.

- Prefer new "sched:sched_waking" trace event when it exists in 'perf
sched' post processing.

- Update POWER9 metrics to utilize other metrics.

- Fall back to querying debuginfod if debuginfo not found locally.

Miscellaneous:
- Sync various kvm headers with kernel sources"

* tag 'perf-tools-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (40 commits)
perf ftrace: Make option description initials all capital letters
perf build-ids: Fall back to debuginfod query if debuginfo not found
perf bench numa: Remove dead code in parse_nodes_opt()
perf stat: Update POWER9 metrics to utilize other metrics
perf ftrace: Add change log
perf: ftrace: Add set_tracing_options() to set all trace options
perf ftrace: Add option --tid to filter by thread id
perf ftrace: Add option -D/--delay to delay tracing
perf: ftrace: Allow set graph depth by '--graph-opts'
perf ftrace: Add support for trace option tracing_thresh
perf ftrace: Add option 'verbose' to show more info for graph tracer
perf ftrace: Add support for tracing option 'irq-info'
perf ftrace: Add support for trace option funcgraph-irqs
perf ftrace: Add support for trace option sleep-time
perf ftrace: Add support for tracing option 'func_stack_trace'
perf tools: Add general function to parse sublevel options
perf ftrace: Add option '--inherit' to trace children processes
perf ftrace: Show trace column header
perf ftrace: Add option '-m/--buffer-size' to set per-cpu buffer size
perf ftrace: Factor out function write_tracing_file_int()
...

5y ago

Geert Uytterhoeven

0c64a0dc

sh: landisk: Add missing initialization of sh_io_port_base

5y ago

Linus Torvalds

dbf83817

Merge tag 'riscv-for-linus-5.9-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

5y ago

Jens Axboe

905e321e

Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.9

5y ago

Jens Axboe

ebf0d100

task_work: only grab task signal lock when needed

5y ago

Linus Torvalds

50f6c7db

Merge tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5y ago

Arnaldo Carvalho de Melo

492e4edb

perf ftrace: Make option description initials all capital letters

5y ago

Michael Karcher

03dd061f

sh: bring syscall_set_return_value in line with other architectures

5y ago

Linus Torvalds

e1ec517e

Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

5y ago

Tobias Klauser

40284a07

riscv: disable stack-protector for vDSO

5y ago

Guoqing Jiang

d7aaeef2

rnbd: no need to set bi_end_io in rnbd_bio_map_kern

5y ago

Dan Carpenter

e8abe1de

md-cluster: Fix potential error pointer dereference in resize_bitmaps()

5y ago

Jens Axboe

f254ac04

io_uring: enable lookup of links holding inflight files

5y ago

Linus Torvalds

1195d58f

Merge tag 'sched-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5y ago

Sebastian Andrzej Siewior

a6d996cb

x86/alternatives: Acquire pte lock with interrupts enabled

5y ago

Frank Ch. Eigler

c7a14fdc

perf build-ids: Fall back to debuginfod query if debuginfo not found

During a perf-record, use the -ldebuginfod API to query a debuginfod
server, should the debug data not be found in the usual system
locations. If successful, the usual $HOME/.debug dir is populated.

Tested with:

$ find .
.
./ctags-debuginfo-5.8-26.fc31.x86_64.rpm
./usr
./usr/lib
./usr/lib/debug
./usr/lib/debug/.build-id
./usr/lib/debug/.build-id/ca
./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d
./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d.debug
./usr/lib/debug/usr
./usr/lib/debug/usr/bin
./usr/lib/debug/usr/bin/ctags-5.8-26.fc31.x86_64.debug

$ debuginfod -F .
...

$ rm -rf ~/.debug/ ; mkdir ~/.debug

$ perf record make tags
BUILD: Doing 'make -j8' parallel build
GEN tags
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.107 MB perf.data (1483 samples) ]

$ find ~/.debug | grep ctags
/home/jolsa/.debug/usr/bin/ctags
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes

$ rm -rf ~/.debug/ ; mkdir ~/.debug

$ DEBUGINFOD_URLS=http://localhost:8002 perf record make tags
BUILD: Doing 'make -j8' parallel build
GEN tags
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (1531 samples) ]

$ find ~/.debug | grep ctag
/home/jolsa/.debug/usr/bin/ctags
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/debug
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
/home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes

Note the 'debug' file is created in the last run.

Note that currently the debuginfo data are downloaded only on record path,
we still need add this support to perf build-id/report.. and test ;-)

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5y ago

Michael Karcher

0bb605c2

sh: Add SECCOMP_FILTER

5y ago

Linus Torvalds

19b39c38

Merge branch 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

5y ago

Christoph Hellwig

f0735310

init: add an init_dup helper

5y ago

Atish Patra

635093e3

RISC-V: Fix build warning for smpboot.c

5y ago

Guoqing Jiang

735d77d4

rnbd: remove rnbd_dev_submit_io

5y ago

Junxiao Bi

e8efa9b8

md: get sysfs entry after redundancy attr group create

5y ago

Jens Axboe

a36da65c

io_uring: fail poll arm on queue proc failure

5y ago

Linus Torvalds

7f5faaaa

Merge tag 'perf-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5y ago

Libing Zhou

cc172ff3

sched/debug: Fix the alignment of the show-state debug output

5y ago

Pawan Gupta

f29dfa53

x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

5y ago

Peng Fan

a508d061

perf bench numa: Remove dead code in parse_nodes_opt()

5y ago

Michael Karcher

9d2ec8f6

sh: Rearrange blocks in entry-common.S

5y ago

Joerg Roedel

995909a4

x86/mm/64: Do not dereference non-present PGD entries

5y ago

Al Viro

ce327e1c

regset: kill user_regset_copyout{,_zero}()

5y ago

Christoph Hellwig

235e5793

init: add an init_utimes helper

5y ago

Zong Li

3843aca0

riscv: fix build warning of mm/pageattr

5y ago

Coly Li

b35fd742

block: check queue's limits.discard_granularity in __blkdev_issue_discard()

If create a loop device with a backing NVMe SSD, current loop device
driver doesn't correctly set its queue's limits.discard_granularity and
leaves it as 0. If a discard request at LBA 0 on this loop device, in
__blkdev_issue_discard() the calculated req_sects will be 0, and a zero
length discard request will trigger a BUG() panic in generic block layer
code at block/blk-mq.c:563.

[ 955.565006][ C39] ------------[ cut here ]------------
[ 955.559660][ C39] invalid opcode: 0000 [#1] SMP NOPTI
[ 955.622171][ C39] CPU: 39 PID: 248 Comm: ksoftirqd/39 Tainted: G E 5.8.0-default+ #40
[ 955.622171][ C39] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE160M-2.70]- 07/17/2020
[ 955.622175][ C39] RIP: 0010:blk_mq_end_request+0x107/0x110
[ 955.622177][ C39] Code: 48 8b 03 e9 59 ff ff ff 48 89 df 5b 5d 41 5c e9 9f ed ff ff 48 8b 35 98 3c f4 00 48 83 c7 10 48 83 c6 19 e8 cb 56 c9 ff eb cb <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 54
[ 955.622179][ C39] RSP: 0018:ffffb1288701fe28 EFLAGS: 00010202
[ 955.749277][ C39] RAX: 0000000000000001 RBX: ffff956fffba5080 RCX: 0000000000004003
[ 955.749278][ C39] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000000
[ 955.749279][ C39] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 955.749279][ C39] R10: ffffb1288701fd28 R11: 0000000000000001 R12: ffffffffa8e05160
[ 955.749280][ C39] R13: 0000000000000004 R14: 0000000000000004 R15: ffffffffa7ad3a1e
[ 955.749281][ C39] FS: 0000000000000000(0000) GS:ffff95bfbda00000(0000) knlGS:0000000000000000
[ 955.749282][ C39] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 955.749282][ C39] CR2: 00007f6f0ef766a8 CR3: 0000005a37012002 CR4: 00000000007606e0
[ 955.749283][ C39] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 955.749284][ C39] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 955.749284][ C39] PKRU: 55555554
[ 955.749285][ C39] Call Trace:
[ 955.749290][ C39] blk_done_softirq+0x99/0xc0
[ 957.550669][ C39] __do_softirq+0xd3/0x45f
[ 957.550677][ C39] ? smpboot_thread_fn+0x2f/0x1e0
[ 957.550679][ C39] ? smpboot_thread_fn+0x74/0x1e0
[ 957.550680][ C39] ? smpboot_thread_fn+0x14e/0x1e0
[ 957.550684][ C39] run_ksoftirqd+0x30/0x60
[ 957.550687][ C39] smpboot_thread_fn+0x149/0x1e0
[ 957.886225][ C39] ? sort_range+0x20/0x20
[ 957.886226][ C39] kthread+0x137/0x160
[ 957.886228][ C39] ? kthread_park+0x90/0x90
[ 957.886231][ C39] ret_from_fork+0x22/0x30
[ 959.117120][ C39] ---[ end trace 3dacdac97e2ed164 ]---

This is the procedure to reproduce the panic,
# modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
# losetup -f /dev/nvme0n1 --direct-io=on
# blkdiscard /dev/loop0 -o 0 -l 0x200

This patch fixes the issue by checking q->limits.discard_granularity in
__blkdev_issue_discard() before composing the discard bio. If the value
is 0, then prints a warning oops information and returns -EOPNOTSUPP to
the caller to indicate that this buggy device driver doesn't support
discard request.

Fixes: 9b15d109a6b2 ("block: improve discard bio alignment in __blkdev_issue_discard()")
Fixes: c52abf563049 ("loop: Better discard support for block devices")
Reported-and-suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Enzo Matsumiya <ematsumiya@suse.com>
Cc: Evan Green <evgreen@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5y ago

Jens Axboe

f59589fc

Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.9/drivers

5y ago

Jens Axboe

6d816e08

io_uring: hold 'ctx' reference around task_work queue + execute

5y ago

Linus Torvalds

eb1319af

Merge tag 'locking-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5y ago

Zhang Rui

bcfd218b

perf/x86/rapl: Add support for Intel SPR platform

5y ago

Phil Auld

a1bd0685

sched: Fix use of count for nr_running tracepoint

5y ago

Kan Liang

76d10256

x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs

An xstate size check warning is triggered on machines which support
Architectural LBRs.

XSAVE consistency problem, dumping leaves
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:649 fpu__init_system_xstate+0x4d4/0xd0e
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted intel-arch_lbr+
RIP: 0010:fpu__init_system_xstate+0x4d4/0xd0e

The xstate size check routine, init_xstate_size(), compares the size
retrieved from the hardware with the size of task->fpu, which is
calculated by the software.

The size from the hardware is the total size of the enabled xstates in
XCR0 | IA32_XSS. Architectural LBR state is a dynamic supervisor
feature, which sets the corresponding bit in the IA32_XSS at boot time.
The size from the hardware includes the size of the Architectural LBR
state.

However, a dynamic supervisor feature doesn't allocate a buffer in the
task->fpu. The size of task->fpu doesn't include the size of the
Architectural LBR state. The mismatch will trigger the warning.

Three options as below were considered to fix the issue:

- Correct the size from the hardware by subtracting the size of the
dynamic supervisor features.
The purpose of the check is to compare the size CPU told with the size
of the XSAVE buffer, which is calculated by the software. If the
software mucks with the number from hardware, it removes the value of
the check.
This option is not a good option.

- Prevent the hardware from counting the size of the dynamic supervisor
feature by temporarily removing the corresponding bits in IA32_XSS.
Two extra MSR writes are required to flip the IA32_XSS. The option is
not pretty, but it is workable. The check is only called once at early
boot time. The synchronization or context-switching doesn't need to be
worried.
This option is implemented here.

- Remove the check entirely, because the check hasn't found any real
problems. The option may be an alternative as option 2.
This option is not implemented here.

Add a new function, get_xsaves_size_no_dynamic(), which retrieves the
total size without the dynamic supervisor features from the hardware.
The size will be used to compare with the size of task->fpu.

Fixes: f0dccc9da4c0 ("x86/fpu/xstate: Support dynamic supervisor feature for LBR")
Reported-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/1595253051-75374-1-git-send-email-kan.liang@linux.intel.com