Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

perf stat: Do not delay the workload with --delay

The -D/--delay option is to delay the measure after the program starts.
But the current code goes to sleep before starting the program so the
program is delayed too. This is not the intention, let's fix it.

Before:

$ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
Events disabled
Events enabled

Performance counter stats for 'system wide':

4,326,949,337 cycles

4.007494118 seconds time elapsed

real 0m7.474s
user 0m0.356s
sys 0m0.120s

It ran the workload for 4 seconds and gave the 3 second delay. So it
should skip the first 3 second and measure the last 1 second only. But
as you can see, it delays 3 seconds and ran the workload after that for
4 seconds. So the total time (real) was 7 seconds.

After:

$ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
Events disabled
Events enabled

Performance counter stats for 'system wide':

1,063,551,013 cycles

1.002769510 seconds time elapsed

real 0m4.484s
user 0m0.385s
sys 0m0.086s

The bug was introduced when it changed enablement of system-wide events
with a command line workload. But it should've considered the initial
delay case. The code was reworked since then (in bb8bc52e7578) so I'm
afraid it won't be applied cleanly.

Fixes: d0a0a511493d2695 ("perf stat: Fix forked applications enablement of counters")
Reported-by: Kevin Nomura <nomurak@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20221212230820.901382-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

authored by

Namhyung Kim and committed by
Arnaldo Carvalho de Melo
c587e77e 5f8f9567

+17 -16
+17 -16
tools/perf/builtin-stat.c
··· 539 539 return err; 540 540 } 541 541 542 - if (stat_config.initial_delay < 0) { 543 - pr_info(EVLIST_DISABLED_MSG); 544 - return 0; 545 - } 546 - 547 - if (stat_config.initial_delay > 0) { 548 - pr_info(EVLIST_DISABLED_MSG); 549 - usleep(stat_config.initial_delay * USEC_PER_MSEC); 550 - } 551 - 552 542 /* 553 543 * We need to enable counters only if: 554 544 * - we don't have tracee (attaching to task or cpu) 555 545 * - we have initial delay configured 556 546 */ 557 - if (!target__none(&target) || stat_config.initial_delay) { 547 + if (!target__none(&target)) { 558 548 if (!all_counters_use_bpf) 559 549 evlist__enable(evsel_list); 560 - if (stat_config.initial_delay > 0) 561 - pr_info(EVLIST_ENABLED_MSG); 562 550 } 563 551 return 0; 564 552 } ··· 914 926 return err; 915 927 } 916 928 917 - err = enable_counters(); 918 - if (err) 919 - return -1; 929 + if (stat_config.initial_delay) { 930 + pr_info(EVLIST_DISABLED_MSG); 931 + } else { 932 + err = enable_counters(); 933 + if (err) 934 + return -1; 935 + } 920 936 921 937 /* Exec the command, if any */ 922 938 if (forks) 923 939 evlist__start_workload(evsel_list); 940 + 941 + if (stat_config.initial_delay > 0) { 942 + usleep(stat_config.initial_delay * USEC_PER_MSEC); 943 + err = enable_counters(); 944 + if (err) 945 + return -1; 946 + 947 + pr_info(EVLIST_ENABLED_MSG); 948 + } 924 949 925 950 t0 = rdclock(); 926 951 clock_gettime(CLOCK_MONOTONIC, &ref_time);