Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'turbostat-2026.02.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

- Add L2 statistics columns for recent Intel processors (see the worked example after this list):
L2MRPS = L2 Cache Millions of References Per Second
L2%hit = L2 Cache Hit %

- Sort work and output by cpu# rather than core#

- Minor features and fixes
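
As a worked example of the new columns (the numbers here are invented for illustration): a CPU that makes 50,000,000 L2 references during a 5-second measurement interval, 5,000,000 of which miss, would report

    L2MRPS = 50,000,000 / 5 / 1,000,000 = 10
    L2%hit = 100.0 * (50,000,000 - 5,000,000) / 50,000,000 = 90.0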

* tag 'turbostat-2026.02.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (23 commits)
tools/power turbostat: version 2026.02.14
tools/power turbostat: Fix and document --header_iterations
tools/power turbostat: Use strtoul() for iteration parsing
tools/power turbostat: Favor cpu# over core#
tools/power turbostat: Expunge logical_cpu_id
tools/power turbostat: Enhance HT enumeration
tools/power turbostat: Simplify global core_id calculation
tools/power turbostat: Unify even/odd/average counter referencing
tools/power turbostat: Allocate average counters dynamically
tools/power turbostat: Delete core_data.core_id
tools/power turbostat: Rename physical_core_id to core_id
tools/power turbostat: Cleanup package_id
tools/power turbostat: Cleanup internal use of "base_cpu"
tools/power turbostat: Add L2 cache statistics
tools/power turbostat: Remove redundant newlines from err(3) strings
tools/power turbostat: Allow more use of is_hybrid flag
tools/power turbostat: Rename "LLCkRPS" column to "LLCMRPS"
tools/power turbostat.8: Document the "--force" option
tools/power turbostat: Harden against unexpected values
tools/power turbostat: Dump hypervisor name
...

+935 -538
+15 -5
tools/power/x86/turbostat/turbostat.8
···
.PP
\fB--no-perf\fP Disable all the uses of the perf API.
.PP
\fB--interval seconds\fP overrides the default 5.0 second measurement interval.
.PP
\fB--num_iterations num\fP number of the measurement iterations.
.PP
\fB--out output_file\fP turbostat output is written to the specified output_file.
The file is truncated if it already exists, and it is created if it does not exist.
···
.PP
\fBSMI\fP The number of System Management Interrupts serviced CPU during the measurement interval. While this counter is actually per-CPU, SMI are triggered on all processors, so the number should be the same for all CPUs.
.PP
- \fBLLCkRPS\fP Last Level Cache Thousands of References Per Second. For CPUs with an L3 LLC, this is the number of references that CPU made to the L3 (and the number of misses that CPU made to it's L2). For CPUs with an L2 LLC, this is the number of references to the L2 (and the number of misses to the CPU's L1). The system summary row shows the sum for all CPUs. In both cases, the value displayed is the actual value divided by 1000 in the interest of usually fitting into 8 columns.
.PP
- \fBLLC%hit\fP Last Level Cache Hit Rate %. Hit Rate Percent = 100.0 * (References - Misses)/References. The system summary row shows the weighted average for all CPUs (100.0 * (Sum_References - Sum_Misses)/Sum_References).
.PP
\fBC1, C2, C3...\fP The number times Linux requested the C1, C2, C3 idle state during the measurement interval. The system summary line shows the sum for all CPUs. These are C-state names as exported in /sys/devices/system/cpu/cpu*/cpuidle/state*/name. While their names are generic, their attributes are processor specific. They the system description section of output shows what MWAIT sub-states they are mapped to on each system. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
- \fBC1+, C2+, C3+...\fP The idle governor idle state misprediction statistics. Inidcates the number times Linux requested the C1, C2, C3 idle state during the measurement interval, but should have requested a deeper idle state (if it exists and enabled). These statistics come from the /sys/devices/system/cpu/cpu*/cpuidle/state*/below file. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
- \fBC1-, C2-, C3-...\fP The idle governor idle state misprediction statistics. Inidcates the number times Linux requested the C1, C2, C3 idle state during the measurement interval, but should have requested a shallower idle state (if it exists and enabled). These statistics come from the /sys/devices/system/cpu/cpu*/cpuidle/state*/above file. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
\fBC1%, C2%, C3%\fP The residency percentage that Linux requested C1, C2, C3.... The system summary is the average of all CPUs in the system. Note that these are software, reflecting what was requested. The hardware counters reflect what was actually achieved. These counters are in the "pct_idle" group, which is enabled by default.
.PP
···
.PP
\fBGFX%C0\fP Percentage of time that at least one GFX compute engine is busy.
.PP
- \fBCPUGFX%\fP Percentage of time that at least one CPU is busy at the same time as at least one Graphics compute enginer is busy.
.PP
\fBPkg%pc2, Pkg%pc3, Pkg%pc6, Pkg%pc7\fP percentage residency in hardware package idle states. These numbers are from hardware residency counters.
.PP
···
If the development tree doesn't work, please contact the author via chat,
or via email with the word "turbostat" on the Subject line.

.SH FILES
.ta
.nf
···
.PP
\fB--no-perf\fP Disable all the uses of the perf API.
.PP
+ \fB--force\fP Force turbostat to run on an unsupported platform (minimal defaults).
+ .PP
\fB--interval seconds\fP overrides the default 5.0 second measurement interval.
.PP
\fB--num_iterations num\fP number of the measurement iterations.
+ .PP
+ \fB--header_iterations num\fP print header every num iterations.
.PP
\fB--out output_file\fP turbostat output is written to the specified output_file.
The file is truncated if it already exists, and it is created if it does not exist.
···
.PP
\fBSMI\fP The number of System Management Interrupts serviced CPU during the measurement interval. While this counter is actually per-CPU, SMI are triggered on all processors, so the number should be the same for all CPUs.
.PP
+ \fBLLCMRPS\fP Last Level Cache Millions of References Per Second. For CPUs with an L3 LLC, this is the number of references that CPU made to the L3 (and the number of misses that CPU made to it's L2). For CPUs with an L2 LLC, this is the number of references to the L2 (and the number of misses to the CPU's L1). The system summary row shows the sum for all CPUs. In both cases, the value displayed is the actual value divided by 1,000,000. If this value is large, then the LLC%hit column is significant. If this value is small, then the LLC%hit column is not significant.
.PP
+ \fBLLC%hit\fP Last Level Cache Hit Rate %. Hit Rate Percent = 100.0 * Hits/References. The system summary row shows the weighted average for all CPUs (100.0 * Sum_Hits/Sum_References).
+ .PP
+ \fBL2MRPS\fP Level-2 Cache Millions of References Per Second. For CPUs with an L2 LLC, this is the same as LLC references. The system summary row shows the sum for all CPUs. In both cases, the value displayed is the actual value divided by 1,000,000. If this value is large, then the L2%hit column is significant. If this value is small, then the L2%hit column is not significant.
+ .PP
+ \fBL2%hit\fP Level-2 Cache Hit Rate %. Hit Rate Percent = 100.0 * Hits/References. The system summary row shows the weighted average for all CPUs (100.0 * (Sum_Hits)/Sum_References).
.PP
\fBC1, C2, C3...\fP The number times Linux requested the C1, C2, C3 idle state during the measurement interval. The system summary line shows the sum for all CPUs. These are C-state names as exported in /sys/devices/system/cpu/cpu*/cpuidle/state*/name. While their names are generic, their attributes are processor specific. They the system description section of output shows what MWAIT sub-states they are mapped to on each system. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
+ \fBC1+, C2+, C3+...\fP The idle governor idle state misprediction statistics. Indicates the number times Linux requested the C1, C2, C3 idle state during the measurement interval, but should have requested a deeper idle state (if it exists and enabled). These statistics come from the /sys/devices/system/cpu/cpu*/cpuidle/state*/below file. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
+ \fBC1-, C2-, C3-...\fP The idle governor idle state misprediction statistics. Indicates the number times Linux requested the C1, C2, C3 idle state during the measurement interval, but should have requested a shallower idle state (if it exists and enabled). These statistics come from the /sys/devices/system/cpu/cpu*/cpuidle/state*/above file. These counters are in the "cpuidle" group, which is disabled, by default.
.PP
\fBC1%, C2%, C3%\fP The residency percentage that Linux requested C1, C2, C3.... The system summary is the average of all CPUs in the system. Note that these are software, reflecting what was requested. The hardware counters reflect what was actually achieved. These counters are in the "pct_idle" group, which is enabled by default.
.PP
···
.PP
\fBGFX%C0\fP Percentage of time that at least one GFX compute engine is busy.
.PP
+ \fBCPUGFX%\fP Percentage of time that at least one CPU is busy at the same time as at least one Graphics compute engine is busy.
.PP
\fBPkg%pc2, Pkg%pc3, Pkg%pc6, Pkg%pc7\fP percentage residency in hardware package idle states. These numbers are from hardware residency counters.
.PP
···
If the development tree doesn't work, please contact the author via chat,
or via email with the word "turbostat" on the Subject line.

+ An old turbostat binary may run on unknown hardware by using "--force",
+ but results are unsupported.
.SH FILES
.ta
.nf
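
A point worth noting in the formulas above: the summary row's hit rate is the weighted average (Sum_Hits/Sum_References), not the mean of the per-CPU percentages. A minimal C sketch of the arithmetic the man page describes (the struct and helper names here are hypothetical, not turbostat's own):

#include <stdio.h>

/* Hypothetical per-CPU sample: cache references and misses over one interval. */
struct llc_sample {
    unsigned long long references;
    unsigned long long misses;
};

static double llc_mrps(struct llc_sample s, double interval_sec)
{
    /* Millions of References Per Second, as the LLCMRPS/L2MRPS columns define it */
    return s.references / interval_sec / 1000000.0;
}

static double llc_pct_hit(struct llc_sample s)
{
    if (s.references == 0)
        return 0.0;
    /* Hit Rate Percent = 100.0 * Hits/References, with Hits = References - Misses */
    return 100.0 * (s.references - s.misses) / s.references;
}

int main(void)
{
    struct llc_sample cpu[2] = { { 90000000ULL, 9000000ULL }, { 10000000ULL, 5000000ULL } };
    struct llc_sample sum = { 0, 0 };
    double interval_sec = 5.0;
    int i;

    for (i = 0; i < 2; i++) {
        printf("cpu%d: LLCMRPS %.0f LLC%%hit %.2f\n", i, llc_mrps(cpu[i], interval_sec), llc_pct_hit(cpu[i]));
        sum.references += cpu[i].references;
        sum.misses += cpu[i].misses;
    }

    /* Summary row: sum the reference rates, but weight the hit percentage by references */
    printf("-: LLCMRPS %.0f LLC%%hit %.2f\n", llc_mrps(sum, interval_sec), llc_pct_hit(sum));
    return 0;
}

With these sample values, the naive mean of the two per-CPU hit rates would be 70%, while the weighted average correctly reports 86%, because the first CPU contributes nine times as many references.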
+920 -533
tools/power/x86/turbostat/turbostat.c
···
 * turbostat -- show CPU frequency and C-state residency
 * on modern Intel and AMD processors.
 *
- * Copyright (c) 2025 Intel Corporation.
 * Len Brown <len.brown@intel.com>
 */

···
    { 0x0, "NMI", NULL, 0, 0, 0, NULL, 0 },
    { 0x0, "CPU%c1e", NULL, 0, 0, 0, NULL, 0 },
    { 0x0, "pct_idle", NULL, 0, 0, 0, NULL, 0 },
-   { 0x0, "LLCkRPS", NULL, 0, 0, 0, NULL, 0 },
    { 0x0, "LLC%hit", NULL, 0, 0, 0, NULL, 0 },
};

/* n.b. bic_names must match the order in bic[], above */
···
    BIC_NMI,
    BIC_CPU_c1e,
    BIC_pct_idle,
-   BIC_LLC_RPS,
    BIC_LLC_HIT,
    MAX_BIC
};
···

    printf("%s:", s);

-   for (i = 0; i <= MAX_BIC; ++i) {

-       if (CPU_ISSET(i, set)) {
-           assert(i < MAX_BIC);
            printf(" %s", bic[i].name);
-       }
    }
    putchar('\n');
}
···
    SET_BIC(BIC_pct_idle, &bic_group_idle);

    BIC_INIT(&bic_group_cache);
-   SET_BIC(BIC_LLC_RPS, &bic_group_cache);
    SET_BIC(BIC_LLC_HIT, &bic_group_cache);

    BIC_INIT(&bic_group_other);
    SET_BIC(BIC_IRQ, &bic_group_other);
···
int *fd_percpu;
int *fd_instr_count_percpu;
int *fd_llc_percpu;
struct timeval interval_tv = { 5, 0 };
struct timespec interval_ts = { 5, 0 };
···
unsigned int dump_only;
unsigned int force_load;
unsigned int cpuid_has_aperf_mperf;
unsigned int has_aperf_access;
unsigned int has_epb;
unsigned int has_turbo;
···
double rapl_joule_counter_range;
unsigned int crystal_hz;
unsigned long long tsc_hz;
- int base_cpu;
unsigned int has_hwp;    /* IA32_PM_ENABLE, IA32_HWP_CAPABILITIES */
                         /* IA32_HWP_REQUEST, IA32_HWP_STATUS */
unsigned int has_hwp_notify;    /* IA32_HWP_INTERRUPT */
···
    unsigned int i;
    double freq;

-   if (get_msr(base_cpu, MSR_FSB_FREQ, &msr))
        fprintf(outf, "SLM BCLK: unknown\n");

    i = msr & 0xf;
···
    { 0, NULL },
};

static const struct platform_features *platform;

void probe_platform_features(unsigned int family, unsigned int model)
···
    exit(1);
}

/* Model specific support End */

#define TJMAX_DEFAULT 100
···

#define CPU_SUBSET_MAXCPUS 8192    /* need to use before probe... */
cpu_set_t *cpu_present_set, *cpu_possible_set, *cpu_effective_set, *cpu_allowed_set, *cpu_affinity_set, *cpu_subset;
size_t cpu_present_setsize, cpu_possible_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affinity_setsize, cpu_subset_size;
#define MAX_ADDED_THREAD_COUNTERS 24
#define MAX_ADDED_CORE_COUNTERS 8
···
    unsigned long long references;
    unsigned long long misses;
};
struct thread_data {
    struct timeval tv_begin;
    struct timeval tv_end;
···
    unsigned long long nmi_count;
    unsigned int smi_count;
    struct llc_stats llc;
    unsigned int cpu_id;
    unsigned int apic_id;
    unsigned int x2apic_id;
···
    unsigned long long counter[MAX_ADDED_THREAD_COUNTERS];
    unsigned long long perf_counter[MAX_ADDED_THREAD_COUNTERS];
    unsigned long long pmt_counter[PMT_MAX_ADDED_THREAD_COUNTERS];
- } *thread_even, *thread_odd;

struct core_data {
-   int base_cpu;
    unsigned long long c3;
    unsigned long long c6;
    unsigned long long c7;
    unsigned long long mc6_us;    /* duplicate as per-core for now, even though per module */
    unsigned int core_temp_c;
    struct rapl_counter core_energy;    /* MSR_CORE_ENERGY_STAT */
-   unsigned int core_id;
    unsigned long long core_throt_cnt;
    unsigned long long counter[MAX_ADDED_CORE_COUNTERS];
    unsigned long long perf_counter[MAX_ADDED_CORE_COUNTERS];
    unsigned long long pmt_counter[PMT_MAX_ADDED_CORE_COUNTERS];
- } *core_even, *core_odd;

struct pkg_data {
-   int base_cpu;
    unsigned long long pc2;
    unsigned long long pc3;
    unsigned long long pc6;
···
    long long sam_mc6_ms;
    unsigned int sam_mhz;
    unsigned int sam_act_mhz;
-   unsigned int package_id;
    struct rapl_counter energy_pkg;    /* MSR_PKG_ENERGY_STATUS */
    struct rapl_counter energy_dram;    /* MSR_DRAM_ENERGY_STATUS */
    struct rapl_counter energy_cores;    /* MSR_PP0_ENERGY_STATUS */
···
    unsigned long long counter[MAX_ADDED_PACKAGE_COUNTERS];
    unsigned long long perf_counter[MAX_ADDED_PACKAGE_COUNTERS];
    unsigned long long pmt_counter[PMT_MAX_ADDED_PACKAGE_COUNTERS];
- } *package_even, *package_odd;

- #define ODD_COUNTERS thread_odd, core_odd, package_odd
- #define EVEN_COUNTERS thread_even, core_even, package_even
-
- #define GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no) \
-   ((thread_base) + \
-    ((pkg_no) * \
-     topo.nodes_per_pkg * topo.cores_per_node * topo.threads_per_core) + \
-    ((node_no) * topo.cores_per_node * topo.threads_per_core) + \
-    ((core_no) * topo.threads_per_core) + \
-    (thread_no))
-
- #define GET_CORE(core_base, core_no, node_no, pkg_no) \
-   ((core_base) + \
-    ((pkg_no) * topo.nodes_per_pkg * topo.cores_per_node) + \
-    ((node_no) * topo.cores_per_node) + \
-    (core_no))

/*
 * The accumulated sum of MSR is defined as a monotonic
···
    switch (idx) {
    case IDX_PKG_ENERGY:
-       if (valid_rapl_msrs & RAPL_AMD_F17H)
            offset = MSR_PKG_ENERGY_STAT;
        else
            offset = MSR_PKG_ENERGY_STATUS;
···
    sys.added_package_counters -= free_msr_counters_(&sys.pp);
}

- struct system_summary {
-   struct thread_data threads;
-   struct core_data cores;
-   struct pkg_data packages;
- } average;

struct platform_counters {
    struct rapl_counter energy_psys;    /* MSR_PLATFORM_ENERGY_STATUS */
} platform_counters_odd, platform_counters_even;

struct cpu_topology {
-   int physical_package_id;
    int die_id;
    int l3_id;
-   int logical_cpu_id;
    int physical_node_id;
    int logical_node_id;    /* 0-based count within the package */
-   int physical_core_id;
-   int thread_id;
    int type;
    cpu_set_t *put_ids;    /* Processing Unit/Thread IDs */
} *cpus;
···
    int num_packages;
    int num_die;
    int num_cpus;
-   int num_cores;
    int allowed_packages;
    int allowed_cpus;
    int allowed_cores;
    int max_cpu_num;
-   int max_core_id;
    int max_package_id;
    int max_die_id;
    int max_l3_id;
···
    return !CPU_ISSET_S(cpu, cpu_allowed_setsize, cpu_allowed_set);
}

/*
 * run func(thread, core, package) in topology order
 * skip non-present cpus
···
int for_all_cpus(int (func) (struct thread_data *, struct core_data *, struct pkg_data *),
                 struct thread_data *thread_base, struct core_data *core_base, struct pkg_data *pkg_base)
{
-   int retval, pkg_no, core_no, thread_no, node_no;

    retval = 0;

-   for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-       for (node_no = 0; node_no < topo.nodes_per_pkg; node_no++) {
-           for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
-               for (thread_no = 0; thread_no < topo.threads_per_core; ++thread_no) {
-                   struct thread_data *t;
-                   struct core_data *c;

-                   t = GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no);

-                   if (cpu_is_not_allowed(t->cpu_id))
-                       continue;

-                   c = GET_CORE(core_base, core_no, node_no, pkg_no);

-                   retval |= func(t, c, &pkg_base[pkg_no]);
-               }
-           }
        }
    }
    return retval;
···
int is_cpu_first_thread_in_core(struct thread_data *t, struct core_data *c)
{
-   return ((int)t->cpu_id == c->base_cpu || c->base_cpu < 0);
}

int is_cpu_first_core_in_package(struct thread_data *t, struct pkg_data *p)
{
-   return ((int)t->cpu_id == p->base_cpu || p->base_cpu < 0);
}

int is_cpu_first_thread_in_package(struct thread_data *t, struct core_data *c, struct pkg_data *p)
···
static void bic_disable_perf_access(void)
{
    CLR_BIC(BIC_IPC, &bic_enabled);
-   CLR_BIC(BIC_LLC_RPS, &bic_enabled);
    CLR_BIC(BIC_LLC_HIT, &bic_enabled);
}

static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags)
···
        return cpu;

    case SCOPE_CORE:
-       return cpus[cpu].physical_core_id;

    case SCOPE_PACKAGE:
-       return cpus[cpu].physical_package_id;
    }

    __builtin_unreachable();
···
    "   sets the Thermal Control Circuit temperature in\n"
    "   degrees Celsius\n"
    "   -h, --help\n"
-   "   print this help message\n"
-   "   -v, --version\n\t\tprint version information\n\nFor more help, run \"man turbostat\"\n");
}

/*
···
    if (DO_BIC(BIC_SMI))
        outp += sprintf(outp, "%sSMI", (printed++ ? delim : ""));

-   if (DO_BIC(BIC_LLC_RPS))
-       outp += sprintf(outp, "%sLLCkRPS", (printed++ ? delim : ""));

    if (DO_BIC(BIC_LLC_HIT))
        outp += sprintf(outp, "%sLLC%%hit", (printed++ ? delim : ""));

    for (mp = sys.tp; mp; mp = mp->next)
        outp += print_name(mp->width, &printed, delim, mp->name, mp->type, mp->format);
···
}

/*
- * pct()
 *
- * If absolute value is < 1.1, return percentage
- * otherwise, return nan
 *
- * return value is appropriate for printing percentages with %f
- * while flagging some obvious erroneous values.
 */
- double pct(double d)
{

-   double abs = fabs(d);

-   if (abs < 1.10)
-       return (100.0 * d);
-   return nan("");
}

int dump_counters(PER_THREAD_PARAMS)
{
    int i;
    struct msr_counter *mp;
-   struct platform_counters *pplat_cnt = p == package_odd ? &platform_counters_odd : &platform_counters_even;

    outp += sprintf(outp, "t %p, c %p, p %p\n", t, c, p);
···
    outp += sprintf(outp, "LLC refs: %lld", t->llc.references);
    outp += sprintf(outp, "LLC miss: %lld", t->llc.misses);
-   outp += sprintf(outp, "LLC Hit%%: %.2f", pct((t->llc.references - t->llc.misses) / t->llc.references));

    for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
        outp += sprintf(outp, "tADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, t->counter[i], mp->sp->path);
···
    }

    if (c && is_cpu_first_thread_in_core(t, c)) {
-       outp += sprintf(outp, "core: %d\n", c->core_id);
        outp += sprintf(outp, "c3: %016llX\n", c->c3);
        outp += sprintf(outp, "c6: %016llX\n", c->c6);
        outp += sprintf(outp, "c7: %016llX\n", c->c7);
···
    }

    if (p && is_cpu_first_core_in_package(t, p)) {
-       outp += sprintf(outp, "package: %d\n", p->package_id);
-
        outp += sprintf(outp, "Weighted cores: %016llX\n", p->pkg_wtd_core_c0);
        outp += sprintf(outp, "Any cores: %016llX\n", p->pkg_any_core_c0);
        outp += sprintf(outp, "Any GFX: %016llX\n", p->pkg_any_gfxe_c0);
···
    actual_read_size = read(fd_llc_percpu[cpu], &r, expected_read_size);

    if (actual_read_size == -1)
-       err(-1, "%s(cpu%d,) %d,,%ld\n", __func__, cpu, fd_llc_percpu[cpu], expected_read_size);

    llc->references = r.llc.references;
    llc->misses = r.llc.misses;
    if (actual_read_size != expected_read_size)
        warn("%s: failed to read perf_data (req %zu act %zu)", __func__, expected_read_size, actual_read_size);
}

/*
···
    char *delim = "\t";
    int printed = 0;

-   if (t == &average.threads) {
        pplat_cnt = count & 1 ? &platform_counters_odd : &platform_counters_even;
        ++count;
    }
···
        return 0;

    /*if not summary line and --cpu is used */
-   if ((t != &average.threads) && (cpu_subset && !CPU_ISSET_S(t->cpu_id, cpu_subset_size, cpu_subset)))
        return 0;

    if (DO_BIC(BIC_USEC)) {
···
    tsc = t->tsc * tsc_tweak;

    /* topo columns, print blanks on 1st (average) line */
-   if (t == &average.threads) {
        if (DO_BIC(BIC_Package))
            outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
        if (DO_BIC(BIC_Die))
···
    } else {
        if (DO_BIC(BIC_Package)) {
            if (p)
-               outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->package_id);
            else
                outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
        }
···
        }
        if (DO_BIC(BIC_Core)) {
            if (c)
-               outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), c->core_id);
            else
                outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
        }
···
        outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), 1.0 / units * t->aperf / interval_float);

    if (DO_BIC(BIC_Busy))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(t->mperf / tsc));

    if (DO_BIC(BIC_Bzy_MHz)) {
        if (has_base_hz)
···
        outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), t->smi_count);

    /* LLC Stats */
-   if (DO_BIC(BIC_LLC_RPS) || DO_BIC(BIC_LLC_HIT)) {
-       if (DO_BIC(BIC_LLC_RPS))
-           outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), t->llc.references / interval_float / 1000);

-       if (DO_BIC(BIC_LLC_HIT))
-           outp += sprintf(outp, fmt8, (printed++ ? delim : ""), pct((t->llc.references - t->llc.misses) / t->llc.references));
-   }

    /* Added Thread Counters */
    for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
···
        if (mp->type == COUNTER_USEC)
            outp += print_float_value(&printed, delim, t->counter[i] / interval_float / 10000);
        else
-           outp += print_float_value(&printed, delim, pct(t->counter[i] / tsc));
        }
    }
···
        if (pp->type == COUNTER_USEC)
            outp += print_float_value(&printed, delim, t->perf_counter[i] / interval_float / 10000);
        else
-           outp += print_float_value(&printed, delim, pct(t->perf_counter[i] / tsc));
        }
    }
···
            break;

        case PMT_TYPE_XTAL_TIME:
-           value_converted = pct(value_raw / crystal_hz / interval_float);
            outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
            break;

        case PMT_TYPE_TCORE_CLOCK:
-           value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float);
            outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
        }
    }

    /* C1 */
    if (DO_BIC(BIC_CPU_c1))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(t->c1 / tsc));

    /* print per-core data only for 1st thread in core */
    if (!is_cpu_first_thread_in_core(t, c))
        goto done;

    if (DO_BIC(BIC_CPU_c3))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c3 / tsc));
    if (DO_BIC(BIC_CPU_c6))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c6 / tsc));
    if (DO_BIC(BIC_CPU_c7))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c7 / tsc));

    /* Mod%c6 */
    if (DO_BIC(BIC_Mod_c6))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->mc6_us / tsc));

    if (DO_BIC(BIC_CoreTmp))
        outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), c->core_temp_c);
···
        else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
            outp += print_decimal_value(mp->width, &printed, delim, c->counter[i]);
        else if (mp->format == FORMAT_PERCENT)
-           outp += print_float_value(&printed, delim, pct(c->counter[i] / tsc));
    }

    /* Added perf Core counters */
···
        else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
            outp += print_decimal_value(pp->width, &printed, delim, c->perf_counter[i]);
        else if (pp->format == FORMAT_PERCENT)
-           outp += print_float_value(&printed, delim, pct(c->perf_counter[i] / tsc));
    }

    /* Added PMT Core counters */
···
            break;

        case PMT_TYPE_XTAL_TIME:
-           value_converted = pct(value_raw / crystal_hz / interval_float);
            outp += print_float_value(&printed, delim, value_converted);
            break;

        case PMT_TYPE_TCORE_CLOCK:
-           value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float);
            outp += print_float_value(&printed, delim, value_converted);
        }
    }
···
    if (DO_BIC(BIC_Totl_c0))
        outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100 * p->pkg_wtd_core_c0 / tsc);    /* can exceed 100% */
    if (DO_BIC(BIC_Any_c0))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_core_c0 / tsc));
    if (DO_BIC(BIC_GFX_c0))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_gfxe_c0 / tsc));
    if (DO_BIC(BIC_CPUGFX))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_both_core_gfxe_c0 / tsc));

    if (DO_BIC(BIC_Pkgpc2))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc2 / tsc));
    if (DO_BIC(BIC_Pkgpc3))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc3 / tsc));
    if (DO_BIC(BIC_Pkgpc6))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc6 / tsc));
    if (DO_BIC(BIC_Pkgpc7))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc7 / tsc));
    if (DO_BIC(BIC_Pkgpc8))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc8 / tsc));
    if (DO_BIC(BIC_Pkgpc9))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc9 / tsc));
    if (DO_BIC(BIC_Pkgpc10))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc10 / tsc));

    if (DO_BIC(BIC_Diec6))
-       outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->die_c6 / crystal_hz / interval_float));

    if (DO_BIC(BIC_CPU_LPI)) {
        if (p->cpu_lpi >= 0)
-           outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->cpu_lpi / 1000000.0 / interval_float));
        else
            outp += sprintf(outp, "%s(neg)", (printed++ ? delim : ""));
    }
    if (DO_BIC(BIC_SYS_LPI)) {
        if (p->sys_lpi >= 0)
-           outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->sys_lpi / 1000000.0 / interval_float));
        else
            outp += sprintf(outp, "%s(neg)", (printed++ ? delim : ""));
    }
···
    if (DO_BIC(BIC_RAM_J))
        outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_JOULES, interval_float));
    if (DO_BIC(BIC_PKG__))
-       outp +=
-           sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_pkg_perf_status, RAPL_UNIT_WATTS, interval_float));
    if (DO_BIC(BIC_RAM__))
-       outp +=
-           sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_dram_perf_status, RAPL_UNIT_WATTS, interval_float));
    /* UncMHz */
    if (DO_BIC(BIC_UNCORE_MHZ))
        outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->uncore_mhz);
···
        else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
            outp += print_decimal_value(mp->width, &printed, delim, p->counter[i]);
        else if (mp->format == FORMAT_PERCENT)
-           outp += print_float_value(&printed, delim, pct(p->counter[i] / tsc));
    }

    /* Added perf Package Counters */
···
        else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
            outp += print_decimal_value(pp->width, &printed, delim, p->perf_counter[i]);
        else if (pp->format == FORMAT_PERCENT)
-           outp += print_float_value(&printed, delim, pct(p->perf_counter[i] / tsc));
    }

    /* Added PMT Package Counters */
···
            break;

        case PMT_TYPE_XTAL_TIME:
-           value_converted = pct(value_raw / crystal_hz / interval_float);
            outp += print_float_value(&printed, delim, value_converted);
            break;

        case PMT_TYPE_TCORE_CLOCK:
-           value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float);
            outp += print_float_value(&printed, delim, value_converted);
        }
    }

-   if (DO_BIC(BIC_SysWatt) && (t == &average.threads))
-       outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
-                       rapl_counter_get_value(&pplat_cnt->energy_psys, RAPL_UNIT_WATTS, interval_float));
-   if (DO_BIC(BIC_Sys_J) && (t == &average.threads))
-       outp += sprintf(outp, fmt8, (printed++ ? delim : ""),
-                       rapl_counter_get_value(&pplat_cnt->energy_psys, RAPL_UNIT_JOULES, interval_float));

done:
    if (*(outp - 1) != '\n')
···
    if ((!count || (header_iterations && !(count % header_iterations))) || !summary_only)
        print_header("\t");

-   format_counters(&average.threads, &average.cores, &average.packages);

    count++;
···
    /* check for TSC < 1 Mcycles over interval */
    if (old->tsc < (1000 * 1000))
        errx(-3, "Insanely slow TSC rate, TSC stops in idle?\n"
-            "You can disable all c-states by booting with \"idle=poll\"\n" "or just the deep ones with \"processor.max_cstate=1\"");

    old->c1 = new->c1 - old->c1;
···
    if (DO_BIC(BIC_SMI))
        old->smi_count = new->smi_count - old->smi_count;

-   if (DO_BIC(BIC_LLC_RPS))
        old->llc.references = new->llc.references - old->llc.references;

    if (DO_BIC(BIC_LLC_HIT))
        old->llc.misses = new->llc.misses - old->llc.misses;

    for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE)
···
    t->llc.references = 0;
    t->llc.misses = 0;

    c->c3 = 0;
    c->c6 = 0;
    c->c7 = 0;
···
    c->core_temp_c = 0;
    rapl_counter_clear(&c->core_energy);
    c->core_throt_cnt = 0;
-
-   t->llc.references = 0;
-   t->llc.misses = 0;

    p->pkg_wtd_core_c0 = 0;
    p->pkg_any_core_c0 = 0;
···
    /* copy un-changing apic_id's */
    if (DO_BIC(BIC_APIC))
-       average.threads.apic_id = t->apic_id;
    if (DO_BIC(BIC_X2APIC))
-       average.threads.x2apic_id = t->x2apic_id;

    /* remember first tv_begin */
-   if (average.threads.tv_begin.tv_sec == 0)
-       average.threads.tv_begin = procsysfs_tv_begin;

    /* remember last tv_end */
-   average.threads.tv_end = t->tv_end;

-   average.threads.tsc += t->tsc;
-   average.threads.aperf += t->aperf;
-   average.threads.mperf += t->mperf;
-   average.threads.c1 += t->c1;

-   average.threads.instr_count += t->instr_count;

-   average.threads.irq_count += t->irq_count;
-   average.threads.nmi_count += t->nmi_count;
-   average.threads.smi_count += t->smi_count;

-   average.threads.llc.references += t->llc.references;
-   average.threads.llc.misses += t->llc.misses;

    for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW)
            continue;
-       average.threads.counter[i] += t->counter[i];
    }

    for (i = 0, pp = sys.perf_tp; pp; i++, pp = pp->next) {
        if (pp->format == FORMAT_RAW)
            continue;
-       average.threads.perf_counter[i] += t->perf_counter[i];
    }

    for (i = 0, ppmt = sys.pmt_tp; ppmt; i++, ppmt = ppmt->next) {
-       average.threads.pmt_counter[i] += t->pmt_counter[i];
    }

    /* sum per-core values only for 1st thread in core */
    if (!is_cpu_first_thread_in_core(t, c))
        return 0;

-   average.cores.c3 += c->c3;
-   average.cores.c6 += c->c6;
-   average.cores.c7 += c->c7;
-   average.cores.mc6_us += c->mc6_us;

-   average.cores.core_temp_c = MAX(average.cores.core_temp_c, c->core_temp_c);
-   average.cores.core_throt_cnt = MAX(average.cores.core_throt_cnt, c->core_throt_cnt);

-   rapl_counter_accumulate(&average.cores.core_energy, &c->core_energy);

    for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW)
            continue;
-       average.cores.counter[i] += c->counter[i];
    }

    for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) {
        if (pp->format == FORMAT_RAW)
            continue;
-       average.cores.perf_counter[i] += c->perf_counter[i];
    }

    for (i = 0, ppmt = sys.pmt_cp; ppmt; i++, ppmt = ppmt->next) {
-       average.cores.pmt_counter[i] += c->pmt_counter[i];
    }

    /* sum per-pkg values only for 1st core in pkg */
···
        return 0;

    if (DO_BIC(BIC_Totl_c0))
-       average.packages.pkg_wtd_core_c0 += p->pkg_wtd_core_c0;
    if (DO_BIC(BIC_Any_c0))
-       average.packages.pkg_any_core_c0 += p->pkg_any_core_c0;
    if (DO_BIC(BIC_GFX_c0))
-       average.packages.pkg_any_gfxe_c0 += p->pkg_any_gfxe_c0;
    if (DO_BIC(BIC_CPUGFX))
-       average.packages.pkg_both_core_gfxe_c0 += p->pkg_both_core_gfxe_c0;

-   average.packages.pc2 += p->pc2;
    if (DO_BIC(BIC_Pkgpc3))
-       average.packages.pc3 += p->pc3;
    if (DO_BIC(BIC_Pkgpc6))
-       average.packages.pc6 += p->pc6;
    if (DO_BIC(BIC_Pkgpc7))
-       average.packages.pc7 += p->pc7;
-   average.packages.pc8 += p->pc8;
-   average.packages.pc9 += p->pc9;
-   average.packages.pc10 += p->pc10;
-   average.packages.die_c6 += p->die_c6;

-   average.packages.cpu_lpi = p->cpu_lpi;
-   average.packages.sys_lpi = p->sys_lpi;

-   rapl_counter_accumulate(&average.packages.energy_pkg, &p->energy_pkg);
-   rapl_counter_accumulate(&average.packages.energy_dram, &p->energy_dram);
-   rapl_counter_accumulate(&average.packages.energy_cores, &p->energy_cores);
-   rapl_counter_accumulate(&average.packages.energy_gfx, &p->energy_gfx);

-   average.packages.gfx_rc6_ms = p->gfx_rc6_ms;
-   average.packages.uncore_mhz = p->uncore_mhz;
-   average.packages.gfx_mhz = p->gfx_mhz;
-   average.packages.gfx_act_mhz = p->gfx_act_mhz;
-   average.packages.sam_mc6_ms = p->sam_mc6_ms;
-   average.packages.sam_mhz = p->sam_mhz;
-   average.packages.sam_act_mhz = p->sam_act_mhz;

-   average.packages.pkg_temp_c = MAX(average.packages.pkg_temp_c, p->pkg_temp_c);

-   rapl_counter_accumulate(&average.packages.rapl_pkg_perf_status, &p->rapl_pkg_perf_status);
-   rapl_counter_accumulate(&average.packages.rapl_dram_perf_status, &p->rapl_dram_perf_status);

    for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
        if ((mp->format == FORMAT_RAW) && (topo.num_packages == 0))
-           average.packages.counter[i] = p->counter[i];
        else
-           average.packages.counter[i] += p->counter[i];
    }

    for (i = 0, pp = sys.perf_pp; pp; i++, pp = pp->next) {
        if ((pp->format == FORMAT_RAW) && (topo.num_packages == 0))
-           average.packages.perf_counter[i] = p->perf_counter[i];
        else
-           average.packages.perf_counter[i] += p->perf_counter[i];
    }

    for (i = 0, ppmt = sys.pmt_pp; ppmt; i++, ppmt = ppmt->next) {
-       average.packages.pmt_counter[i] += p->pmt_counter[i];
    }

    return 0;
···
    struct perf_counter_info *pp;
    struct pmt_counter *ppmt;

-   clear_counters(&average.threads, &average.cores, &average.packages);

    for_all_cpus(sum_counters, t, c, p);

    /* Use the global time delta for the average. */
-   average.threads.tv_delta = tv_delta;

-   average.threads.tsc /= topo.allowed_cpus;
-   average.threads.aperf /= topo.allowed_cpus;
-   average.threads.mperf /= topo.allowed_cpus;
-   average.threads.instr_count /= topo.allowed_cpus;
-   average.threads.c1 /= topo.allowed_cpus;

-   if (average.threads.irq_count > 9999999)
        sums_need_wide_columns = 1;
-   if (average.threads.nmi_count > 9999999)
        sums_need_wide_columns = 1;

-   average.cores.c3 /= topo.allowed_cores;
-   average.cores.c6 /= topo.allowed_cores;
-   average.cores.c7 /= topo.allowed_cores;
-   average.cores.mc6_us /= topo.allowed_cores;

    if (DO_BIC(BIC_Totl_c0))
-       average.packages.pkg_wtd_core_c0 /= topo.allowed_packages;
    if (DO_BIC(BIC_Any_c0))
-       average.packages.pkg_any_core_c0 /= topo.allowed_packages;
    if (DO_BIC(BIC_GFX_c0))
-       average.packages.pkg_any_gfxe_c0 /= topo.allowed_packages;
    if (DO_BIC(BIC_CPUGFX))
-       average.packages.pkg_both_core_gfxe_c0 /= topo.allowed_packages;

-   average.packages.pc2 /= topo.allowed_packages;
    if (DO_BIC(BIC_Pkgpc3))
-       average.packages.pc3 /= topo.allowed_packages;
    if (DO_BIC(BIC_Pkgpc6))
-       average.packages.pc6 /= topo.allowed_packages;
    if (DO_BIC(BIC_Pkgpc7))
-       average.packages.pc7 /= topo.allowed_packages;

-   average.packages.pc8 /= topo.allowed_packages;
-   average.packages.pc9 /= topo.allowed_packages;
-   average.packages.pc10 /= topo.allowed_packages;
-   average.packages.die_c6 /= topo.allowed_packages;

    for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW)
            continue;
        if (mp->type == COUNTER_ITEMS) {
-           if (average.threads.counter[i] > 9999999)
                sums_need_wide_columns = 1;
            continue;
        }
-       average.threads.counter[i] /= topo.allowed_cpus;
    }
    for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW)
            continue;
        if (mp->type == COUNTER_ITEMS) {
-           if (average.cores.counter[i] > 9999999)
                sums_need_wide_columns = 1;
        }
-       average.cores.counter[i] /= topo.allowed_cores;
    }
    for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
        if (mp->format == FORMAT_RAW)
            continue;
        if (mp->type == COUNTER_ITEMS) {
-           if (average.packages.counter[i] > 9999999)
                sums_need_wide_columns = 1;
        }
-       average.packages.counter[i] /= topo.allowed_packages;
    }

    for (i = 0, pp = sys.perf_tp; pp; i++, pp = pp->next) {
        if (pp->format == FORMAT_RAW)
            continue;
        if (pp->type == COUNTER_ITEMS) {
-           if (average.threads.perf_counter[i] > 9999999)
                sums_need_wide_columns = 1;
            continue;
        }
-       average.threads.perf_counter[i] /= topo.allowed_cpus;
    }
    for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) {
        if (pp->format == FORMAT_RAW)
            continue;
        if (pp->type == COUNTER_ITEMS) {
-           if (average.cores.perf_counter[i] > 9999999)
                sums_need_wide_columns = 1;
        }
-       average.cores.perf_counter[i] /= topo.allowed_cores;
    }
    for (i = 0, pp = sys.perf_pp; pp; i++, pp = pp->next) {
        if (pp->format == FORMAT_RAW)
            continue;
        if (pp->type == COUNTER_ITEMS) {
-           if (average.packages.perf_counter[i] > 9999999)
                sums_need_wide_columns = 1;
        }
-       average.packages.perf_counter[i] /= topo.allowed_packages;
    }

    for (i = 0, ppmt = sys.pmt_tp; ppmt; i++, ppmt = ppmt->next) {
-       average.threads.pmt_counter[i] /= topo.allowed_cpus;
    }
    for (i = 0, ppmt = sys.pmt_cp; ppmt; i++, ppmt = ppmt->next) {
-       average.cores.pmt_counter[i] /= topo.allowed_cores;
    }
    for (i = 0, ppmt = sys.pmt_pp; ppmt; i++, ppmt = ppmt->next) {
-       average.packages.pmt_counter[i] /= topo.allowed_packages;
    }
}
···

int get_rapl_counters(int cpu, unsigned int domain, struct core_data *c, struct pkg_data *p)
{
-   struct platform_counters *pplat_cnt = p == package_odd ? &platform_counters_odd : &platform_counters_even;
    unsigned long long perf_data[NUM_RAPL_COUNTERS + 1];
    struct rapl_counter_info_t *rci;
···
/* Rapl domain enumeration helpers */
static inline int get_rapl_num_domains(void)
{
-   int num_packages = topo.max_package_id + 1;
-   int num_cores_per_package;
-   int num_cores;
-
    if (!platform->has_per_core_rapl)
-       return num_packages;

-   num_cores_per_package = topo.max_core_id + 1;
-   num_cores = num_cores_per_package * num_packages;
-
-   return num_cores;
}

static inline int get_rapl_domain_id(int cpu)
{
-   int nr_cores_per_package = topo.max_core_id + 1;
-   int rapl_core_id;
-
    if (!platform->has_per_core_rapl)
-       return cpus[cpu].physical_package_id;

-   /* Compute the system-wide unique core-id for @cpu */
-   rapl_core_id = cpus[cpu].physical_core_id;
-   rapl_core_id += cpus[cpu].physical_package_id * nr_cores_per_package;
-
-   return rapl_core_id;
}

/*
···
    get_smi_aperf_mperf(cpu, t);

-   if (DO_BIC(BIC_LLC_RPS) || DO_BIC(BIC_LLC_HIT))
        get_perf_llc_stats(cpu, &t->llc);

    if (DO_BIC(BIC_IPC))
        if (read(get_instr_count_fd(cpu), &t->instr_count, sizeof(long long)) != sizeof(long long))
···
        return -10;

    for (i = 0, pp = sys.pmt_cp; pp; i++, pp = pp->next)
-       c->pmt_counter[i] = pmt_read_counter(pp, c->core_id);

    /* collect package counters only for 1st core in package */
    if (!is_cpu_first_core_in_package(t, p))
···
    }

    if (DO_BIC(BIC_UNCORE_MHZ))
-       p->uncore_mhz = get_legacy_uncore_mhz(p->package_id);

    if (DO_BIC(BIC_GFX_rc6))
        p->gfx_rc6_ms = gfx_info[GFX_rc6].val_ull;
···
    char *path = NULL;

    if (mp->msr_num == 0) {
-       path = find_sysfs_path_by_id(mp->sp, p->package_id);
        if (path == NULL) {
-           warnx("%s: package_id %d not found", __func__, p->package_id);
            return -10;
        }
    }
···
        return -10;

    for (i = 0, pp = sys.pmt_pp; pp; i++, pp = pp->next)
-       p->pmt_counter[i] = pmt_read_counter(pp, p->package_id);

done:
    gettimeofday(&t->tv_end, (struct timezone *)NULL);
···
        return;
    }

-   get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);
    pkg_cstate_limit = pkg_cstate_limits[msr & 0xF];
}
···
    if (!platform->has_nhm_msrs || no_msr)
        return;

-   get_msr(base_cpu, MSR_PLATFORM_INFO, &msr);

-   fprintf(outf, "cpu%d: MSR_PLATFORM_INFO: 0x%08llx\n", base_cpu, msr);

    ratio = (msr >> 40) & 0xFF;
    fprintf(outf, "%d * %.1f = %.1f MHz max efficiency frequency\n", ratio, bclk, ratio * bclk);
···
    if (!platform->has_nhm_msrs || no_msr)
        return;

-   get_msr(base_cpu, MSR_IA32_POWER_CTL, &msr);
-   fprintf(outf, "cpu%d: MSR_IA32_POWER_CTL: 0x%08llx (C1E auto-promotion: %sabled)\n", base_cpu, msr, msr & 0x2 ? "EN" : "DIS");

    /* C-state Pre-wake Disable (CSTATE_PREWAKE_DISABLE) */
    if (platform->has_cst_prewake_bit)
···
    unsigned long long msr;
    unsigned int ratio;

-   get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT2, &msr);

-   fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT2: 0x%08llx\n", base_cpu, msr);

    ratio = (msr >> 8) & 0xFF;
    if (ratio)
···
    unsigned long long msr;
    unsigned int ratio;

-   get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT1, &msr);

-   fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT1: 0x%08llx\n", base_cpu, msr);

    ratio = (msr >> 56) & 0xFF;
    if (ratio)
···
    unsigned long long msr, core_counts;
    int shift;

-   get_msr(base_cpu, trl_msr_offset, &msr);
-   fprintf(outf, "cpu%d: MSR_%sTURBO_RATIO_LIMIT: 0x%08llx\n",
-           base_cpu, trl_msr_offset == MSR_SECONDARY_TURBO_RATIO_LIMIT ? "SECONDARY_" : "", msr);

    if (platform->trl_msrs & TRL_CORECOUNT) {
-       get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT1, &core_counts);
-       fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT1: 0x%08llx\n", base_cpu, core_counts);
    } else {
        core_counts = 0x0807060504030201;
    }
···
    unsigned long long msr;
    unsigned int ratio;

-   get_msr(base_cpu, MSR_ATOM_CORE_RATIOS, &msr);
-   fprintf(outf, "cpu%d: MSR_ATOM_CORE_RATIOS: 0x%08llx\n", base_cpu, msr & 0xFFFFFFFF);

    ratio = (msr >> 0) & 0x3F;
    if (ratio)
···
    if (ratio)
        fprintf(outf, "%d * %.1f = %.1f MHz base frequency\n", ratio, bclk, ratio * bclk);

-   get_msr(base_cpu, MSR_ATOM_CORE_TURBO_RATIOS, &msr);
-   fprintf(outf, "cpu%d: MSR_ATOM_CORE_TURBO_RATIOS: 0x%08llx\n", base_cpu, msr & 0xFFFFFFFF);

    ratio = (msr >> 24) & 0x3F;
    if (ratio)
···
    unsigned int cores[buckets_no];
    unsigned int ratio[buckets_no];

-   get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT, &msr);

-   fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT: 0x%08llx\n", base_cpu, msr);

    /*
     * Turbo encoding in KNL is as follows:
···
    if (!platform->has_nhm_msrs || no_msr)
        return;

-   get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);

-   fprintf(outf, "cpu%d: MSR_PKG_CST_CONFIG_CONTROL: 0x%08llx", base_cpu, msr);

    fprintf(outf, " (%s%s%s%s%slocked, pkg-cstate-limit=%d (%s)",
            (msr & SNB_C3_AUTO_UNDEMOTE) ? "UNdemote-C3, " : "",
···
{
    unsigned long long msr;

-   get_msr(base_cpu, MSR_CONFIG_TDP_NOMINAL, &msr);
-   fprintf(outf, "cpu%d: MSR_CONFIG_TDP_NOMINAL: 0x%08llx", base_cpu, msr);
    fprintf(outf, " (base_ratio=%d)\n", (unsigned int)msr & 0xFF);

-   get_msr(base_cpu, MSR_CONFIG_TDP_LEVEL_1, &msr);
-   fprintf(outf, "cpu%d: MSR_CONFIG_TDP_LEVEL_1: 0x%08llx (", base_cpu, msr);
    if (msr) {
        fprintf(outf, "PKG_MIN_PWR_LVL1=%d ", (unsigned int)(msr >> 48) & 0x7FFF);
        fprintf(outf, "PKG_MAX_PWR_LVL1=%d ", (unsigned int)(msr >> 32) & 0x7FFF);
···
    }
    fprintf(outf, ")\n");

-   get_msr(base_cpu, MSR_CONFIG_TDP_LEVEL_2, &msr);
-   fprintf(outf, "cpu%d: MSR_CONFIG_TDP_LEVEL_2: 0x%08llx (", base_cpu, msr);
    if (msr) {
        fprintf(outf, "PKG_MIN_PWR_LVL2=%d ", (unsigned int)(msr >> 48) & 0x7FFF);
        fprintf(outf, "PKG_MAX_PWR_LVL2=%d ", (unsigned int)(msr >> 32) & 0x7FFF);
···
    }
    fprintf(outf, ")\n");

-   get_msr(base_cpu, MSR_CONFIG_TDP_CONTROL, &msr);
-   fprintf(outf, "cpu%d: MSR_CONFIG_TDP_CONTROL: 0x%08llx (", base_cpu, msr);
    if ((msr) & 0x3)
        fprintf(outf, "TDP_LEVEL=%d ", (unsigned int)(msr) & 0x3);
    fprintf(outf, " lock=%d", (unsigned int)(msr >> 31) & 1);
    fprintf(outf, ")\n");

-   get_msr(base_cpu, MSR_TURBO_ACTIVATION_RATIO, &msr);
-   fprintf(outf, "cpu%d: MSR_TURBO_ACTIVATION_RATIO: 0x%08llx (", base_cpu, msr);
    fprintf(outf, "MAX_NON_TURBO_RATIO=%d", (unsigned int)(msr) & 0xFF);
    fprintf(outf, " lock=%d", (unsigned int)(msr >> 31) & 1);
    fprintf(outf, ")\n");
···
        return;

    if (platform->supported_cstates & PC3) {
-       get_msr(base_cpu, MSR_PKGC3_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }

    if (platform->supported_cstates & PC6) {
-       get_msr(base_cpu, MSR_PKGC6_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }

    if (platform->supported_cstates & PC7) {
-       get_msr(base_cpu, MSR_PKGC7_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }

    if (platform->supported_cstates & PC8) {
-       get_msr(base_cpu, MSR_PKGC8_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }

    if (platform->supported_cstates & PC9) {
-       get_msr(base_cpu, MSR_PKGC9_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }

    if (platform->supported_cstates & PC10) {
-       get_msr(base_cpu, MSR_PKGC10_IRTL, &msr);
-       fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", base_cpu, msr);
        fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
    }
}
···
    free(fd_llc_percpu);
    fd_llc_percpu = NULL;
}

void free_fd_cstate(void)
···
    cpu_affinity_set = NULL;
    cpu_affinity_setsize = 0;

-   free(thread_even);
-   free(core_even);
-   free(package_even);

-   thread_even = NULL;
-   core_even = NULL;
-   package_even = NULL;

-   free(thread_odd);
-   free(core_odd);
-   free(package_odd);

-   thread_odd = NULL;
-   core_odd = NULL;
-   package_odd = NULL;

    free(output_buffer);
    output_buffer = NULL;
···
    free_fd_percpu();
    free_fd_instr_count_percpu();
    free_fd_llc_percpu();
    free_fd_msr();
    free_fd_rapl_percpu();
    free_fd_cstate();
···
    return cpu == parse_int_file("/sys/devices/system/cpu/cpu%d/topology/core_siblings_list", cpu);
}

- int get_physical_package_id(int cpu)
{
    return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu);
}
···
    for (pkg = 0; pkg < topo.num_packages; pkg++) {
        lnode = 0;
        for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) {
-           if (cpus[cpu].physical_package_id != pkg)
                continue;
            /* find a cpu with an unset logical_node_id */
            if (cpus[cpu].logical_node_id != -1)
···
             * the logical_node_id
             */
            for (cpux = cpu; cpux <= topo.max_cpu_num; cpux++) {
-               if ((cpus[cpux].physical_package_id == pkg) && (cpus[cpux].physical_node_id == node)) {
                    cpus[cpux].logical_node_id = lnode;
                    cpu_count++;
                }
···
    char path[80];
    FILE *filep;
    int i;
-   int cpu = thiscpu->logical_cpu_id;

    for (i = 0; i <= topo.max_cpu_num; i++) {
        sprintf(path, "/sys/devices/system/cpu/cpu%d/node%i/cpulist", cpu, i);
···
    return 0;
}

- int get_thread_siblings(struct cpu_topology *thiscpu)
{
    char path[80], character;
    FILE *filep;
    unsigned long map;
    int so, shift, sib_core;
-   int cpu = thiscpu->logical_cpu_id;
    int offset = topo.max_cpu_num + 1;
    size_t size;
    int thread_id = 0;

    thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1));
-   if (thiscpu->thread_id < 0)
-       thiscpu->thread_id = thread_id++;
    if (!thiscpu->put_ids)
        return -1;
···
        if ((map >> shift) & 0x1) {
            so = shift + offset;
            sib_core = get_core_id(so);
-           if (sib_core == thiscpu->physical_core_id) {
                CPU_SET_S(so, size, thiscpu->put_ids);
-               if ((so != cpu) && (cpus[so].thread_id < 0))
-                   cpus[so].thread_id = thread_id++;
            }
        }
    }
···
                  struct core_data *core_base, struct pkg_data *pkg_base,
                  struct thread_data *thread_base2, struct core_data *core_base2, struct pkg_data *pkg_base2)
{
-   int retval, pkg_no, node_no, core_no, thread_no;

    retval = 0;

-   for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-       for (node_no = 0; node_no < topo.nodes_per_pkg; ++node_no) {
-           for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
-               for (thread_no = 0; thread_no < topo.threads_per_core; ++thread_no) {
-                   struct thread_data *t, *t2;
-                   struct core_data *c, *c2;

-                   t = GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no);

-                   if (cpu_is_not_allowed(t->cpu_id))
-                       continue;

-                   t2 = GET_THREAD(thread_base2, thread_no, core_no, node_no, pkg_no);

-                   c = GET_CORE(core_base, core_no, node_no, pkg_no);
-                   c2 = GET_CORE(core_base2, core_no, node_no, pkg_no);

-                   retval |= func(t, c, &pkg_base[pkg_no], t2, c2, &pkg_base2[pkg_no]);
-               }
-           }
        }
    }
    return retval;
···
    pos = fgets(buf, 1024, fp);
    if (!pos)
-       err(1, "%s: file read failed\n", PATH_EFFECTIVE_CPUS);

    fclose(fp);
···
    update_effective_str(startup);

    if (parse_cpu_str(cpu_effective_str, cpu_effective_set, cpu_effective_setsize))
-       err(1, "%s: cpu str malformat %s\n", PATH_EFFECTIVE_CPUS, cpu_effective_str);
}

void linux_perf_init(void);
···
void rapl_perf_init(void);
void cstate_perf_init(void);
void perf_llc_init(void);
void added_perf_counters_init(void);
void pmt_init(void);
···
    rapl_perf_init();
    cstate_perf_init();
    perf_llc_init();
    added_perf_counters_init();
    pmt_init();
    fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus, topo.allowed_cpus);
···
void set_max_cpu_num(void)
{
    FILE *filep;
-   int base_cpu;
    unsigned long dummy;
    char pathname[64];

-   base_cpu = sched_getcpu();
-   if (base_cpu < 0)
        err(1, "cannot find calling cpu ID");
-   sprintf(pathname, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings", base_cpu);

    filep = fopen_or_die(pathname, "r");
    topo.max_cpu_num = 0;
···
    return 0;
}

- int init_thread_id(int cpu)
{
-   cpus[cpu].thread_id = -1;
    return 0;
}
···
    struct stat sb;
    char pathname[32];

-   sprintf(pathname, "/dev/msr%d", base_cpu);
    return !stat(pathname, &sb);
}
···
    struct stat sb;
    char pathname[32];

-   sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
    return !stat(pathname, &sb);
}
···

free_and_exit:
    if (cap_free(caps) == -1)
-       err(-6, "cap_free\n");

    return ret;
}
···
    failed += check_for_cap_sys_rawio();

    /* test file permissions */
-   sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", base_cpu);
    if (euidaccess(pathname, R_OK)) {
        failed++;
    }
···
    else
        return;

-   get_msr(base_cpu, MSR_PLATFORM_INFO, &msr);
    base_ratio = (msr >> 8) & 0xFF;

    base_hz = base_ratio * bclk * 1000000;
···
    }
    for (i = uncore_max_id; i >= 0; --i) {
        int k, l;
-       int package_id, domain_id, cluster_id;
        char name_buf[16];

        sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/uncore%02d", i);

        if (access(path_base, R_OK))
-           err(1, "%s: %s\n", __func__, path_base);

        sprintf(path, "%s/package_id", path_base);
-       package_id = read_sysfs_int(path);

        sprintf(path, "%s/domain_id", path_base);
        domain_id = read_sysfs_int(path);
···
         */
        if BIC_IS_ENABLED
            (BIC_UNCORE_MHZ)
-           add_counter(0, path, name_buf, 0, SCOPE_PACKAGE, COUNTER_K2M, FORMAT_AVERAGE, 0, package_id);

        if (quiet)
            continue;
···
        k = read_sysfs_int(path);
        sprintf(path, "%s/max_freq_khz", path_base);
        l = read_sysfs_int(path);
-       fprintf(outf, "Uncore Frequency package%d domain%d cluster%d: %d - %d MHz ", package_id, domain_id, cluster_id, k / 1000, l / 1000);

        sprintf(path, "%s/initial_min_freq_khz", path_base);
        k = read_sysfs_int(path);
···

    for (state = 0; state < 10; ++state) {

-       sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", base_cpu, state);
        input = fopen(path, "r");
        if (input == NULL)
            continue;
···

        remove_underbar(name_buf);

-       sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/desc", base_cpu, state);
        input = fopen(path, "r");
        if (input == NULL)
            continue;
        if (!fgets(desc, sizeof(desc), input))
            err(1, "%s: failed to read file", path);

-       fprintf(outf, "cpu%d: %s: %s", base_cpu, name_buf, desc);
        fclose(input);
    }
}
···
    FILE *input;
    int turbo;

-   sprintf(path, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_driver", base_cpu);
    input = fopen(path, "r");
    if (input == NULL) {
        fprintf(outf, "NSFOD %s\n", path);
···
        err(1, "%s: failed to read file", path);
    fclose(input);

-   sprintf(path, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor", base_cpu);
    input = fopen(path, "r");
    if (input == NULL) {
        fprintf(outf, "NSFOD %s\n", path);
···
        err(1, "%s: failed to read file", path);
    fclose(input);

-   fprintf(outf, "cpu%d: cpufreq driver: %s", base_cpu, driver_buf);
-   fprintf(outf, "cpu%d: cpufreq governor: %s", base_cpu, governor_buf);

    sprintf(path, "/sys/devices/system/cpu/cpufreq/boost");
    input = fopen(path, "r");
···
    unsigned long long msr;

    if (valid_rapl_msrs & RAPL_PKG_POWER_INFO)
-       if (!get_msr(base_cpu, MSR_PKG_POWER_INFO, &msr))
            return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units;
    return get_quirk_tdp();
}
···
    CLR_BIC(BIC_RAM__, &bic_enabled);

    /* units on package 0, verify later other packages match */
-   if (get_msr(base_cpu, MSR_RAPL_POWER_UNIT, &msr))
        return;

    rapl_power_units = 1.0 / (1 << (msr & 0xF));
···
    if (!valid_rapl_msrs || no_msr)
        return;

-   if (get_msr(base_cpu, MSR_RAPL_PWR_UNIT, &msr))
        return;

    rapl_time_units = ldexp(1.0, -(msr >> 16 & 0xf));
···
        return -1;
    }

-   fprintf(outf, "cpu%d: %s: 0x%08llx (%f Watts, %f Joules, %f sec.)\n", cpu, msr_name, msr,
-           rapl_power_units, rapl_energy_units, rapl_time_units);

    if (valid_rapl_msrs & RAPL_PKG_POWER_INFO) {
···
        return -9;

    fprintf(outf, "cpu%d: MSR_VR_CURRENT_CONFIG: 0x%08llx\n", cpu, msr);
-   fprintf(outf, "cpu%d: PKG Limit #4: %f Watts (%slocked)\n",
-           cpu, ((msr >> 0) & 0x1FFF) * rapl_power_units, (msr >> 31) & 1 ? "" : "UN");
    }

    if (valid_rapl_msrs & RAPL_DRAM_POWER_INFO) {
···
    if (offset < 0)
        return;

-   ret = get_msr(base_cpu, offset, &msr_value);
    if (ret) {
        if (debug)
            fprintf(outf, "Can not read RAPL_PKG_ENERGY MSR(0x%llx)\n", (unsigned long long)offset);
···
    if (!platform->has_nhm_msrs || no_msr)
        goto guess;

-   if (get_msr(base_cpu, MSR_IA32_TEMPERATURE_TARGET, &msr))
        goto guess;

    tcc_default = (msr >> 16) & 0xFF;
···
    int bits = platform->tcc_offset_bits;
    unsigned long long enabled = 0;

-   if (bits && !get_msr(base_cpu, MSR_PLATFORM_INFO, &enabled))
        enabled = (enabled >> 30) & 1;

    if (bits && enabled) {
···
    if (no_msr)
        return;

-   if (!get_msr(base_cpu, MSR_IA32_FEAT_CTL, &msr))
        fprintf(outf, "cpu%d: MSR_IA32_FEATURE_CONTROL: 0x%08llx (%sLocked %s)\n",
-               base_cpu, msr, msr & FEAT_CTL_LOCKED ? "" : "UN-", msr & (1 << 18) ? "SGX" : "");
}

void decode_misc_enable_msr(void)
···
    if (!genuine_intel)
        return;

-   if (!get_msr(base_cpu, MSR_IA32_MISC_ENABLE, &msr))
        fprintf(outf, "cpu%d: MSR_IA32_MISC_ENABLE: 0x%08llx (%sTCC %sEIST %sMWAIT %sPREFETCH %sTURBO)\n",
-               base_cpu, msr,
                msr & MSR_IA32_MISC_ENABLE_TM1 ? "" : "No-",
                msr & MSR_IA32_MISC_ENABLE_ENHANCED_SPEEDSTEP ? "" : "No-",
                msr & MSR_IA32_MISC_ENABLE_MWAIT ? "" : "No-",
···
    if (!platform->has_msr_misc_feature_control)
        return;

-   if (!get_msr(base_cpu, MSR_MISC_FEATURE_CONTROL, &msr))
        fprintf(outf,
                "cpu%d: MSR_MISC_FEATURE_CONTROL: 0x%08llx (%sL2-Prefetch %sL2-Prefetch-pair %sL1-Prefetch %sL1-IP-Prefetch)\n",
-               base_cpu, msr, msr & (0 << 0) ? "No-" : "", msr & (1 << 0) ? "No-" : "",
-               msr & (2 << 0) ? "No-" : "", msr & (3 << 0) ? "No-" : "");
}

/*
···
    if (!platform->has_msr_misc_pwr_mgmt)
        return;

-   if (!get_msr(base_cpu, MSR_MISC_PWR_MGMT, &msr))
        fprintf(outf, "cpu%d: MSR_MISC_PWR_MGMT: 0x%08llx (%sable-EIST_Coordination %sable-EPB %sable-OOB)\n",
-               base_cpu, msr, msr & (1 << 0) ? "DIS" : "EN", msr & (1 << 1) ? "EN" : "DIS", msr & (1 << 8) ? "EN" : "DIS");
}

/*
···
    if (!platform->has_msr_c6_demotion_policy_config)
        return;

-   if (!get_msr(base_cpu, MSR_CC6_DEMOTION_POLICY_CONFIG, &msr))
-       fprintf(outf, "cpu%d: MSR_CC6_DEMOTION_POLICY_CONFIG: 0x%08llx (%sable-CC6-Demotion)\n",
-               base_cpu, msr, msr & (1 << 0) ? "EN" : "DIS");

-   if (!get_msr(base_cpu, MSR_MC6_DEMOTION_POLICY_CONFIG, &msr))
-       fprintf(outf, "cpu%d: MSR_MC6_DEMOTION_POLICY_CONFIG: 0x%08llx (%sable-MC6-Demotion)\n",
-               base_cpu, msr, msr & (1 << 0) ? "EN" : "DIS");
}

void print_dev_latency(void)
···
    if (no_perf)
        return 0;

-   fd = open_perf_counter(base_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, -1, 0);
    if (fd != -1)
        close(fd);
···
    return ret;
}

/*
- * Linux-perf manages the HW instructions-retired counter
- * by enabling when requested, and hiding rollover
 */
void linux_perf_init(void)
{
    if (access("/proc/sys/kernel/perf_event_paranoid", F_OK))
        return;

    if (BIC_IS_ENABLED(BIC_IPC) && cpuid_has_aperf_mperf) {
        fd_instr_count_percpu = calloc(topo.max_cpu_num + 1, sizeof(int));
        if (fd_instr_count_percpu == NULL)
            err(-1, "calloc fd_instr_count_percpu");
    }
-   if (BIC_IS_ENABLED(BIC_LLC_RPS)) {
        fd_llc_percpu = calloc(topo.max_cpu_num + 1, sizeof(int));
        if (fd_llc_percpu == NULL)
            err(-1, "calloc fd_llc_percpu");
    }
}
···

    domain_visited[next_domain] = 1;

-   if ((cai->flags & RAPL_COUNTER_FLAG_PLATFORM_COUNTER) && (cpu != base_cpu))
        continue;

    struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[next_domain];
···
/* Assumes msr_counter_info is populated */
static int has_amperf_access(void)
{
-   return cpuid_has_aperf_mperf && msr_counter_arch_infos[MSR_ARCH_INFO_APERF_INDEX].present &&
-       msr_counter_arch_infos[MSR_ARCH_INFO_MPERF_INDEX].present;
}

int *get_cstate_perf_group_fd(struct cstate_counter_info_t *cci, const char *group_name)
···
    if (cpu_is_not_allowed(cpu))
        continue;

-   const int core_id = cpus[cpu].physical_core_id;
-   const int pkg_id = cpus[cpu].physical_package_id;

    assert(core_id < cores_visited_elems);
    assert(pkg_id < pkg_visited_elems);
···
    if (!per_core && pkg_visited[pkg_id])
        continue;

-   const bool counter_needed = BIC_IS_ENABLED(cai->bic_number) ||
-       (soft_c1 && (cai->flags & CSTATE_COUNTER_FLAG_SOFT_C1_DEPENDENCY));
    const bool counter_supported = (platform->supported_cstates & cai->feature_mask);

    if (counter_needed && counter_supported) {
···
    for_all_cpus(print_perf_limit, ODD_COUNTERS);
}

void process_cpuid()
{
    unsigned int eax, ebx, ecx, edx;
···
    model += ((fms >> 16) & 0xf) << 4;
    ecx_flags = ecx;
    edx_flags = edx;

    if (!no_msr) {
        if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch))
···
    fputc('\n', outf);

    fprintf(outf, "CPUID(0x80000000): max_extended_levels: 0x%x\n", max_extended_level);
-   fprintf(outf, "CPUID(1): %s %s %s %s %s %s %s %s %s %s\n",
-           ecx_flags & (1 << 0) ? "SSE3" : "-",
-           ecx_flags & (1 << 3) ? "MONITOR" : "-",
-           ecx_flags & (1 << 6) ? "SMX" : "-",
-           ecx_flags & (1 << 7) ? "EIST" : "-",
-           ecx_flags & (1 << 8) ? "TM2" : "-",
-           edx_flags & (1 << 4) ? "TSC" : "-",
-           edx_flags & (1 << 5) ? "MSR" : "-",
-           edx_flags & (1 << 22) ? "ACPI-TM" : "-", edx_flags & (1 << 28) ? "HT" : "-", edx_flags & (1 << 29) ? "TM" : "-");
    }

    probe_platform_features(family, model);

    if (!(edx_flags & (1 << 5)))
        errx(1, "CPUID: no MSR");
···
    if (!quiet)
        decode_misc_enable_msr();

-   if (max_level >= 0x7 && !quiet) {
        int has_sgx;

        ecx = 0;
···

        has_sgx = ebx & (1 << 2);

-       is_hybrid = edx & (1 << 15);

-       fprintf(outf, "CPUID(7): %sSGX %sHybrid\n", has_sgx ? "" : "No-", is_hybrid ?
"" : "No-"); 9223 9224 if (has_sgx) 9225 decode_feature_control_msr(); ··· 9246 if (crystal_hz) { 9247 tsc_hz = (unsigned long long)crystal_hz *ebx_tsc / eax_crystal; 9248 if (!quiet) 9249 - fprintf(outf, "TSC: %lld MHz (%d Hz * %d / %d / 1000000)\n", 9250 - tsc_hz / 1000000, crystal_hz, ebx_tsc, eax_crystal); 9251 } 9252 } 9253 } ··· 9324 decode_misc_feature_control(); 9325 } 9326 9327 - /* perf_llc_probe 9328 * 9329 * return 1 on success, else 0 9330 */ ··· 9336 if (no_perf) 9337 return 0; 9338 9339 - fd = open_perf_counter(base_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9340 if (fd != -1) 9341 close(fd); 9342 ··· 9354 9355 if (no_perf) 9356 return; 9357 - if (!(BIC_IS_ENABLED(BIC_LLC_RPS) && BIC_IS_ENABLED(BIC_LLC_HIT))) 9358 return; 9359 9360 for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 9361 9362 if (cpu_is_not_allowed(cpu)) 9363 continue; 9364 9365 - assert(fd_llc_percpu != 0); 9366 fd_llc_percpu[cpu] = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9367 if (fd_llc_percpu[cpu] == -1) { 9368 warnx("%s: perf REFS: failed to open counter on cpu%d", __func__, cpu); 9369 free_fd_llc_percpu(); 9370 return; 9371 } 9372 - assert(fd_llc_percpu != 0); 9373 retval = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES, fd_llc_percpu[cpu], PERF_FORMAT_GROUP); 9374 if (retval == -1) { 9375 warnx("%s: perf MISS: failed to open counter on cpu%d", __func__, cpu); ··· 9377 return; 9378 } 9379 } 9380 - BIC_PRESENT(BIC_LLC_RPS); 9381 BIC_PRESENT(BIC_LLC_HIT); 9382 } 9383 9384 /* ··· 9471 return 1; 9472 else 9473 return 0; 9474 - } 9475 - 9476 - char *possible_file = "/sys/devices/system/cpu/possible"; 9477 - char possible_buf[1024]; 9478 - 9479 - int initialize_cpu_possible_set(void) 9480 - { 9481 - FILE *fp; 9482 - 9483 - fp = fopen(possible_file, "r"); 9484 - if (!fp) { 9485 - warn("open %s", possible_file); 9486 - return -1; 9487 - } 9488 - if (fread(possible_buf, sizeof(char), 1024, fp) == 0) { 9489 - warn("read %s", possible_file); 9490 - goto err; 9491 - } 9492 - if (parse_cpu_str(possible_buf, cpu_possible_set, cpu_possible_setsize)) { 9493 - warnx("%s: cpu str malformat %s\n", possible_file, cpu_effective_str); 9494 - goto err; 9495 - } 9496 - return 0; 9497 - 9498 - err: 9499 - fclose(fp); 9500 - return -1; 9501 } 9502 9503 void topology_probe(bool startup) ··· 9512 err(3, "CPU_ALLOC"); 9513 cpu_possible_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1)); 9514 CPU_ZERO_S(cpu_possible_setsize, cpu_possible_set); 9515 - initialize_cpu_possible_set(); 9516 9517 /* 9518 * Allocate and initialize cpu_effective_set ··· 9580 cpu_affinity_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1)); 9581 CPU_ZERO_S(cpu_affinity_setsize, cpu_affinity_set); 9582 9583 - for_all_proc_cpus(init_thread_id); 9584 9585 for_all_proc_cpus(set_cpu_hybrid_type); 9586 9587 /* 9588 * For online cpus 9589 - * find max_core_id, max_package_id 9590 */ 9591 for (i = 0; i <= topo.max_cpu_num; ++i) { 9592 int siblings; ··· 9597 continue; 9598 } 9599 9600 - cpus[i].logical_cpu_id = i; 9601 9602 /* get package information */ 9603 - cpus[i].physical_package_id = get_physical_package_id(i); 9604 - if (cpus[i].physical_package_id > max_package_id) 9605 - max_package_id = cpus[i].physical_package_id; 9606 9607 /* get die information */ 9608 cpus[i].die_id = get_die_id(i); ··· 9620 topo.max_node_num = cpus[i].physical_node_id; 9621 9622 /* get core information */ 9623 - cpus[i].physical_core_id = get_core_id(i); 9624 - if 
(cpus[i].physical_core_id > max_core_id) 9625 - max_core_id = cpus[i].physical_core_id; 9626 9627 /* get thread information */ 9628 - siblings = get_thread_siblings(&cpus[i]); 9629 if (siblings > max_siblings) 9630 max_siblings = siblings; 9631 - if (cpus[i].thread_id == 0) 9632 topo.num_cores++; 9633 } 9634 - topo.max_core_id = max_core_id; 9635 topo.max_package_id = max_package_id; 9636 9637 topo.cores_per_node = max_core_id + 1; 9638 if (debug > 1) ··· 9674 continue; 9675 fprintf(outf, 9676 "cpu %d pkg %d die %d l3 %d node %d lnode %d core %d thread %d\n", 9677 - i, cpus[i].physical_package_id, cpus[i].die_id, cpus[i].l3_id, 9678 - cpus[i].physical_node_id, cpus[i].logical_node_id, cpus[i].physical_core_id, cpus[i].thread_id); 9679 } 9680 9681 } 9682 9683 - void allocate_counters(struct thread_data **t, struct core_data **c, struct pkg_data **p) 9684 { 9685 int i; 9686 int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages; 9687 int num_threads = topo.threads_per_core * num_cores; 9688 9689 - *t = calloc(num_threads, sizeof(struct thread_data)); 9690 - if (*t == NULL) 9691 goto error; 9692 9693 for (i = 0; i < num_threads; i++) 9694 - (*t)[i].cpu_id = -1; 9695 9696 - *c = calloc(num_cores, sizeof(struct core_data)); 9697 - if (*c == NULL) 9698 goto error; 9699 9700 - for (i = 0; i < num_cores; i++) { 9701 - (*c)[i].core_id = -1; 9702 - (*c)[i].base_cpu = -1; 9703 - } 9704 9705 - *p = calloc(topo.num_packages, sizeof(struct pkg_data)); 9706 - if (*p == NULL) 9707 goto error; 9708 9709 - for (i = 0; i < topo.num_packages; i++) { 9710 - (*p)[i].package_id = i; 9711 - (*p)[i].base_cpu = -1; 9712 - } 9713 9714 return; 9715 error: ··· 9734 /* 9735 * init_counter() 9736 * 9737 - * set FIRST_THREAD_IN_CORE and FIRST_CORE_IN_PACKAGE 9738 */ 9739 void init_counter(struct thread_data *thread_base, struct core_data *core_base, struct pkg_data *pkg_base, int cpu_id) 9740 { 9741 - int pkg_id = cpus[cpu_id].physical_package_id; 9742 int node_id = cpus[cpu_id].logical_node_id; 9743 - int core_id = cpus[cpu_id].physical_core_id; 9744 - int thread_id = cpus[cpu_id].thread_id; 9745 struct thread_data *t; 9746 struct core_data *c; 9747 ··· 9750 if (node_id < 0) 9751 node_id = 0; 9752 9753 - t = GET_THREAD(thread_base, thread_id, core_id, node_id, pkg_id); 9754 - c = GET_CORE(core_base, core_id, node_id, pkg_id); 9755 9756 t->cpu_id = cpu_id; 9757 if (!cpu_is_not_allowed(cpu_id)) { 9758 9759 - if (c->base_cpu < 0) 9760 - c->base_cpu = t->cpu_id; 9761 - if (pkg_base[pkg_id].base_cpu < 0) 9762 - pkg_base[pkg_id].base_cpu = t->cpu_id; 9763 } 9764 - 9765 - c->core_id = core_id; 9766 - pkg_base[pkg_id].package_id = pkg_id; 9767 } 9768 9769 int initialize_counters(int cpu_id) ··· 9803 int update_topo(PER_THREAD_PARAMS) 9804 { 9805 topo.allowed_cpus++; 9806 - if ((int)t->cpu_id == c->base_cpu) 9807 topo.allowed_cores++; 9808 - if ((int)t->cpu_id == p->base_cpu) 9809 topo.allowed_packages++; 9810 9811 return 0; ··· 9824 topology_probe(startup); 9825 allocate_irq_buffers(); 9826 allocate_fd_percpu(); 9827 - allocate_counters(&thread_even, &core_even, &package_even); 9828 - allocate_counters(&thread_odd, &core_odd, &package_odd); 9829 allocate_output_buffer(); 9830 for_all_proc_cpus(initialize_counters); 9831 topology_update(); 9832 } 9833 9834 - void set_base_cpu(void) 9835 { 9836 int i; 9837 9838 for (i = 0; i < topo.max_cpu_num + 1; ++i) { 9839 if (cpu_is_not_allowed(i)) 9840 continue; 9841 - base_cpu = i; 9842 if (debug > 1) 9843 - fprintf(outf, "base_cpu = %d\n", base_cpu); 9844 return; 
9845 } 9846 err(-ENODEV, "No valid cpus found"); ··· 9872 if (!has_perf_instr_count_access()) 9873 no_perf = 1; 9874 9875 - if (BIC_IS_ENABLED(BIC_LLC_RPS) || BIC_IS_ENABLED(BIC_LLC_HIT)) 9876 if (!has_perf_llc_access()) 9877 no_perf = 1; 9878 ··· 10355 10356 if (BIC_IS_ENABLED(BIC_Diec6)) { 10357 pmt_add_counter(PMT_MTL_DC6_GUID, PMT_MTL_DC6_SEQ, "Die%c6", PMT_TYPE_XTAL_TIME, 10358 - PMT_COUNTER_MTL_DC6_LSB, PMT_COUNTER_MTL_DC6_MSB, PMT_COUNTER_MTL_DC6_OFFSET, 10359 - SCOPE_PACKAGE, FORMAT_DELTA, 0, PMT_OPEN_TRY); 10360 } 10361 10362 if (BIC_IS_ENABLED(BIC_CPU_c1e)) { ··· 10416 void turbostat_init() 10417 { 10418 setup_all_buffers(true); 10419 - set_base_cpu(); 10420 check_msr_access(); 10421 check_perf_access(); 10422 process_cpuid(); ··· 10427 rapl_perf_init(); 10428 cstate_perf_init(); 10429 perf_llc_init(); 10430 added_perf_counters_init(); 10431 pmt_init(); 10432 10433 for_all_cpus(get_cpu_type, ODD_COUNTERS); 10434 for_all_cpus(get_cpu_type, EVEN_COUNTERS); 10435 10436 - if (BIC_IS_ENABLED(BIC_IPC) && has_aperf_access && get_instr_count_fd(base_cpu) != -1) 10437 BIC_PRESENT(BIC_IPC); 10438 10439 /* ··· 10533 10534 void print_version() 10535 { 10536 - fprintf(outf, "turbostat version 2025.12.02 - Len Brown <lenb@kernel.org>\n"); 10537 } 10538 10539 #define COMMAND_LINE_SIZE 2048 ··· 11155 } 11156 11157 if (direct_path && has_guid) { 11158 - printf("%s: path and guid+seq parameters are mutually exclusive\n" 11159 - "notice: passed guid=0x%x and path=%s\n", __func__, guid, direct_path); 11160 exit(1); 11161 } 11162 ··· 11250 11251 for (state = 10; state >= 0; --state) { 11252 11253 - sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", base_cpu, state); 11254 input = fopen(path, "r"); 11255 if (input == NULL) 11256 continue; ··· 11299 11300 for (state = 10; state >= 0; --state) { 11301 11302 - sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", base_cpu, state); 11303 input = fopen(path, "r"); 11304 if (input == NULL) 11305 continue; ··· 11428 * Parse some options early, because they may make other options invalid, 11429 * like adding the MSR counter with --add and at the same time using --no-msr. 11430 */ 11431 - while ((opt = getopt_long_only(argc, argv, "+MPn:", long_options, &option_index)) != -1) { 11432 switch (opt) { 11433 case 'M': 11434 no_msr = 1; ··· 11442 } 11443 optind = 0; 11444 11445 - while ((opt = getopt_long_only(argc, argv, "+C:c:Dde:hi:Jn:o:qMST:v", long_options, &option_index)) != -1) { 11446 switch (opt) { 11447 case 'a': 11448 parse_add_command(optarg); ··· 11485 } 11486 break; 11487 case 'h': 11488 - default: 11489 help(); 11490 exit(1); 11491 case 'i': ··· 11520 /* Parsed earlier */ 11521 break; 11522 case 'n': 11523 - num_iterations = strtod(optarg, NULL); 11524 11525 - if (num_iterations <= 0) { 11526 - fprintf(outf, "iterations %d should be positive number\n", num_iterations); 11527 - exit(2); 11528 - } 11529 break; 11530 case 'N': 11531 - header_iterations = strtod(optarg, NULL); 11532 11533 - if (header_iterations <= 0) { 11534 - fprintf(outf, "iterations %d should be positive number\n", header_iterations); 11535 - exit(2); 11536 - } 11537 break; 11538 case 's': 11539 /* ··· 11554 print_version(); 11555 exit(0); 11556 break; 11557 } 11558 } 11559 }
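Background on the perf plumbing behind the cache columns above: turbostat opens the two hardware events for a cache level as a single perf group (PERF_FORMAT_GROUP), so one read(2) returns both counts sampled at the same instant, keeping the derived hit percentage self-consistent. Below is a minimal stand-alone sketch of that pattern; it is not code from this patch, and the choice of cpu 0 plus the privilege assumption (perf_event_paranoid must permit per-CPU counting) are illustrative only.

/*
 * Sketch: count LLC references and misses on one CPU the way
 * perf_llc_init()/get_perf_llc_stats() do -- two PERF_TYPE_HARDWARE
 * events opened as one group, read together in a single read(2).
 */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int perf_open(unsigned long long config, int cpu, int group_fd)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = config;
	attr.read_format = PERF_FORMAT_GROUP;

	/* pid = -1, cpu >= 0: count every task on that CPU */
	return syscall(__NR_perf_event_open, &attr, -1, cpu, group_fd, 0);
}

int main(void)
{
	struct {
		unsigned long long nr;		/* events in group: 2 */
		unsigned long long references;	/* group leader */
		unsigned long long misses;	/* group member */
	} r;
	int lead;

	lead = perf_open(PERF_COUNT_HW_CACHE_REFERENCES, 0, -1);
	if (lead == -1 || perf_open(PERF_COUNT_HW_CACHE_MISSES, 0, lead) == -1) {
		perror("perf_event_open");
		return 1;
	}

	sleep(1);	/* measurement interval */

	/* one read() returns every counter in the group, sampled together */
	if (read(lead, &r, sizeof(r)) != sizeof(r)) {
		perror("read");
		return 1;
	}
	printf("cpu0: %llu refs, %llu misses, %.2f%% hit\n", r.references, r.misses,
	       r.references ? 100.0 * (r.references - r.misses) / r.references : 0.0);
	return 0;
}

turbostat applies the same pattern per allowed CPU: the reference count divided by the measurement interval (and by 1,000,000) yields the MRPS columns, and the group snapshot feeds the pct() helper for the %hit columns.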
··· 3 * turbostat -- show CPU frequency and C-state residency 4 * on modern Intel and AMD processors. 5 * 6 + * Copyright (c) 2010 - 2026 Intel Corporation 7 * Len Brown <len.brown@intel.com> 8 */ 9 ··· 210 { 0x0, "NMI", NULL, 0, 0, 0, NULL, 0 }, 211 { 0x0, "CPU%c1e", NULL, 0, 0, 0, NULL, 0 }, 212 { 0x0, "pct_idle", NULL, 0, 0, 0, NULL, 0 }, 213 + { 0x0, "LLCMRPS", NULL, 0, 0, 0, NULL, 0 }, 214 { 0x0, "LLC%hit", NULL, 0, 0, 0, NULL, 0 }, 215 + { 0x0, "L2MRPS", NULL, 0, 0, 0, NULL, 0 }, 216 + { 0x0, "L2%hit", NULL, 0, 0, 0, NULL, 0 }, 217 }; 218 219 /* n.b. bic_names must match the order in bic[], above */
··· 281 BIC_NMI, 282 BIC_CPU_c1e, 283 BIC_pct_idle, 284 + BIC_LLC_MRPS, 285 BIC_LLC_HIT, 286 + BIC_L2_MRPS, 287 + BIC_L2_HIT, 288 MAX_BIC 289 }; 290 ··· 294 295 printf("%s:", s); 296 297 + for (i = 0; i < MAX_BIC; ++i) { 298 299 + if (CPU_ISSET(i, set)) 300 printf(" %s", bic[i].name); 301 } 302 putchar('\n'); 303 } ··· 424 SET_BIC(BIC_pct_idle, &bic_group_idle); 425 426 BIC_INIT(&bic_group_cache); 427 + SET_BIC(BIC_LLC_MRPS, &bic_group_cache); 428 SET_BIC(BIC_LLC_HIT, &bic_group_cache); 429 + SET_BIC(BIC_L2_MRPS, &bic_group_cache); 430 + SET_BIC(BIC_L2_HIT, &bic_group_cache); 431 432 BIC_INIT(&bic_group_other); 433 SET_BIC(BIC_IRQ, &bic_group_other); ··· 482 int *fd_percpu; 483 int *fd_instr_count_percpu; 484 int *fd_llc_percpu; 485 + int *fd_l2_percpu; 486 struct timeval interval_tv = { 5, 0 }; 487 struct timespec interval_ts = { 5, 0 }; 488 ··· 498 unsigned int dump_only; 499 unsigned int force_load; 500 unsigned int cpuid_has_aperf_mperf; 501 + unsigned int cpuid_has_hv; 502 unsigned int has_aperf_access; 503 unsigned int has_epb; 504 unsigned int has_turbo; ··· 528 double rapl_joule_counter_range; 529 unsigned int crystal_hz; 530 unsigned long long tsc_hz; 531 + int master_cpu; 532 unsigned int has_hwp; /* IA32_PM_ENABLE, IA32_HWP_CAPABILITIES */ 533 /* IA32_HWP_REQUEST, IA32_HWP_STATUS */ 534 unsigned int has_hwp_notify; /* IA32_HWP_INTERRUPT */ ··· 620 unsigned int i; 621 double freq; 622 623 + if (get_msr(master_cpu, MSR_FSB_FREQ, &msr)) 624 fprintf(outf, "SLM BCLK: unknown\n"); 625 626 i = msr & 0xf;
··· 1248 { 0, NULL }, 1249 }; 1250 1251 + struct { 1252 + unsigned int uniform; 1253 + unsigned int pcore; 1254 + unsigned int ecore; 1255 + unsigned int lcore; 1256 + } perf_pmu_types; 1257 + 1258 + /* 1259 + * Events are enumerated in https://github.com/intel/perfmon 1260 + * and tools/perf/pmu-events/arch/x86/.../cache.json 1261 + */ 1262 + struct perf_l2_events { 1263 + unsigned long long refs; /* L2_REQUEST.ALL */ 1264 + unsigned long long hits; /* L2_REQUEST.HIT */ 1265 + }; 1266 + 1267 + struct perf_model_support { 1268 + unsigned int vfm; 1269 + struct perf_l2_events first; 1270 + struct perf_l2_events second; 1271 + struct perf_l2_events third; 1272 + } *perf_model_support; 1273 + 1274 + /* Perf Cache Events */ 1275 + #define PCE(ext_umask, umask) (((unsigned long long) ext_umask) << 40 | umask << 8 | 0x24) 1276 + 1277 + /* 1278 + * Enumerate up to three perf CPU PMUs in a system. 1279 + * The first, second, and third columns are populated without skipping, describing 1280 + * pcore, ecore, lcore PMUs, in order, if present. (The associated PMU "type" field is 1281 + * read from sysfs in all cases.) E.g.
1282 + * 1283 + * non-hybrid: 1284 + * GNR: pcore, {}, {} 1285 + * ADL-N: ecore, {}, {} 1286 + * hybrid: 1287 + * MTL: pcore, ecore, {}% 1288 + * ARL-H: pcore, ecore, lcore 1289 + * LNL: ecore, ecore%%, {} 1290 + * 1291 + * % MTL physical lcores share architecture and PMU with ecore, and are thus not enumerated separately. 1292 + * %% LNL physical lcore is enumerated by perf as ecore 1293 + */ 1294 + static struct perf_model_support turbostat_perf_model_support[] = { 1295 + { INTEL_SAPPHIRERAPIDS_X, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, {}, {} }, 1296 + { INTEL_EMERALDRAPIDS_X, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, {}, {} }, 1297 + { INTEL_GRANITERAPIDS_X, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, {}, {} }, 1298 + { INTEL_GRANITERAPIDS_D, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, {}, {} }, 1299 + { INTEL_DIAMONDRAPIDS_X, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, {}, {} }, 1300 + 1301 + { INTEL_ATOM_GRACEMONT, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {}, {} }, /* ADL-N */ 1302 + { INTEL_ATOM_CRESTMONT_X, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {}, {} }, /* SRF */ 1303 + { INTEL_ATOM_CRESTMONT, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {}, {} }, /* GRR */ 1304 + { INTEL_ATOM_DARKMONT_X, { PCE(0x01, 0xFF), PCE(0x01, 0xBF)}, {}, {} }, /* CWF */ 1305 + 1306 + { INTEL_ALDERLAKE, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1308 + { INTEL_ALDERLAKE_L, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1309 + { INTEL_RAPTORLAKE, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1310 + { INTEL_RAPTORLAKE_P, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1311 + { INTEL_RAPTORLAKE_S, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1312 + { INTEL_METEORLAKE_L, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1313 + { INTEL_METEORLAKE, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1314 + { INTEL_ARROWLAKE_U, { PCE(0x00, 0xFF), PCE(0x00, 0xDF)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)}, {} }, 1315 + 1316 + { INTEL_LUNARLAKE_M, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x00, 0x07), PCE(0x00, 0x02)}, {} }, 1317 + { INTEL_ARROWLAKE_H, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x00, 0x07), PCE(0x00, 0x02)}, { PCE(0x00, 0x00), PCE(0x00, 0x02)} }, 1318 + { INTEL_ARROWLAKE, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x00, 0x07), PCE(0x00, 0x02)}, {} }, 1319 + 1320 + { INTEL_PANTHERLAKE_L, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x01, 0xFF), PCE(0x01, 0xBF)}, {} }, 1321 + { INTEL_WILDCATLAKE_L, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x01, 0xFF), PCE(0x01, 0xBF)}, {} }, 1322 + 1323 + { INTEL_NOVALAKE, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x01, 0xFF), PCE(0x01, 0xBF)}, {} }, 1324 + { INTEL_NOVALAKE_L, { PCE(0x00, 0xFF), PCE(0x00, 0x5F)}, { PCE(0x01, 0xFF), PCE(0x01, 0xBF)}, {} }, 1325 + 1326 + { 0, {}, {}, {} } 1327 + }; 1328 + 1329 static const struct platform_features *platform; 1330 1331 void probe_platform_features(unsigned int family, unsigned int model) ··· 1291 exit(1); 1292 } 1293 1294 + void init_perf_model_support(unsigned int family, unsigned int model) 1295 + { 1296 + int i; 1297 + 1298 + if (!genuine_intel) 1299 + return; 1300 + 1301 + for (i = 0; turbostat_perf_model_support[i].vfm; i++) { 1302 + if (VFM_FAMILY(turbostat_perf_model_support[i].vfm) == family && 
VFM_MODEL(turbostat_perf_model_support[i].vfm) == model) { 1303 + perf_model_support = &turbostat_perf_model_support[i]; 1304 + return; 1305 + } 1306 + } 1307 + } 1308 + 1309 /* Model specific support End */ 1310 1311 #define TJMAX_DEFAULT 100 ··· 1307 1308 #define CPU_SUBSET_MAXCPUS 8192 /* need to use before probe... */ 1309 cpu_set_t *cpu_present_set, *cpu_possible_set, *cpu_effective_set, *cpu_allowed_set, *cpu_affinity_set, *cpu_subset; 1310 + cpu_set_t *perf_pcore_set, *perf_ecore_set, *perf_lcore_set; 1311 size_t cpu_present_setsize, cpu_possible_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affinity_setsize, cpu_subset_size; 1312 #define MAX_ADDED_THREAD_COUNTERS 24 1313 #define MAX_ADDED_CORE_COUNTERS 8 ··· 2007 unsigned long long references; 2008 unsigned long long misses; 2009 }; 2010 + struct l2_stats { 2011 + unsigned long long references; 2012 + unsigned long long hits; 2013 + }; 2014 struct thread_data { 2015 struct timeval tv_begin; 2016 struct timeval tv_end; ··· 2020 unsigned long long nmi_count; 2021 unsigned int smi_count; 2022 struct llc_stats llc; 2023 + struct l2_stats l2; 2024 unsigned int cpu_id; 2025 unsigned int apic_id; 2026 unsigned int x2apic_id; ··· 2028 unsigned long long counter[MAX_ADDED_THREAD_COUNTERS]; 2029 unsigned long long perf_counter[MAX_ADDED_THREAD_COUNTERS]; 2030 unsigned long long pmt_counter[PMT_MAX_ADDED_THREAD_COUNTERS]; 2031 + }; 2032 2033 struct core_data { 2034 + int first_cpu; 2035 unsigned long long c3; 2036 unsigned long long c6; 2037 unsigned long long c7; 2038 unsigned long long mc6_us; /* duplicate as per-core for now, even though per module */ 2039 unsigned int core_temp_c; 2040 struct rapl_counter core_energy; /* MSR_CORE_ENERGY_STAT */ 2041 unsigned long long core_throt_cnt; 2042 unsigned long long counter[MAX_ADDED_CORE_COUNTERS]; 2043 unsigned long long perf_counter[MAX_ADDED_CORE_COUNTERS]; 2044 unsigned long long pmt_counter[PMT_MAX_ADDED_CORE_COUNTERS]; 2045 + }; 2046 2047 struct pkg_data { 2048 + int first_cpu; 2049 unsigned long long pc2; 2050 unsigned long long pc3; 2051 unsigned long long pc6; ··· 2066 long long sam_mc6_ms; 2067 unsigned int sam_mhz; 2068 unsigned int sam_act_mhz; 2069 struct rapl_counter energy_pkg; /* MSR_PKG_ENERGY_STATUS */ 2070 struct rapl_counter energy_dram; /* MSR_DRAM_ENERGY_STATUS */ 2071 struct rapl_counter energy_cores; /* MSR_PP0_ENERGY_STATUS */ ··· 2079 unsigned long long counter[MAX_ADDED_PACKAGE_COUNTERS]; 2080 unsigned long long perf_counter[MAX_ADDED_PACKAGE_COUNTERS]; 2081 unsigned long long pmt_counter[PMT_MAX_ADDED_PACKAGE_COUNTERS]; 2082 + }; 2083 2084 + #define ODD_COUNTERS odd.threads, odd.cores, odd.packages 2085 + #define EVEN_COUNTERS even.threads, even.cores, even.packages 2086 2087 /* 2088 * The accumulated sum of MSR is defined as a monotonic ··· 2135 2136 switch (idx) { 2137 case IDX_PKG_ENERGY: 2138 + if (platform->plat_rapl_msrs & RAPL_AMD_F17H) 2139 offset = MSR_PKG_ENERGY_STAT; 2140 else 2141 offset = MSR_PKG_ENERGY_STATUS; ··· 2279 sys.added_package_counters -= free_msr_counters_(&sys.pp); 2280 } 2281 2282 + struct counters { 2283 + struct thread_data *threads; 2284 + struct core_data *cores; 2285 + struct pkg_data *packages; 2286 + } average, even, odd; 2287 2288 struct platform_counters { 2289 struct rapl_counter energy_psys; /* MSR_PLATFORM_ENERGY_STATUS */ 2290 } platform_counters_odd, platform_counters_even; 2291 2292 + #define MAX_HT_ID 3 /* support SMT-4 */ 2293 + 2294 struct cpu_topology { 2295 + int cpu_id; 2296 + int core_id; /* unique within a 
package */ 2297 + int package_id; 2298 int die_id; 2299 int l3_id; 2300 int physical_node_id; 2301 int logical_node_id; /* 0-based count within the package */ 2302 + int ht_id; /* unique within a core */ 2303 + int ht_sibling_cpu_id[MAX_HT_ID + 1]; 2304 int type; 2305 cpu_set_t *put_ids; /* Processing Unit/Thread IDs */ 2306 } *cpus; ··· 2306 int num_packages; 2307 int num_die; 2308 int num_cpus; 2309 + int num_cores; /* system wide */ 2310 int allowed_packages; 2311 int allowed_cpus; 2312 int allowed_cores; 2313 int max_cpu_num; 2314 + int max_core_id; /* within a package */ 2315 int max_package_id; 2316 int max_die_id; 2317 int max_l3_id; ··· 2343 return !CPU_ISSET_S(cpu, cpu_allowed_setsize, cpu_allowed_set); 2344 } 2345 2346 + #define GLOBAL_CORE_ID(core_id, pkg_id) (core_id + pkg_id * (topo.max_core_id + 1)) 2347 /* 2348 * run func(thread, core, package) in topology order 2349 * skip non-present cpus ··· 2353 int for_all_cpus(int (func) (struct thread_data *, struct core_data *, struct pkg_data *), 2354 struct thread_data *thread_base, struct core_data *core_base, struct pkg_data *pkg_base) 2355 { 2356 + int cpu, retval; 2357 2358 retval = 0; 2359 2360 + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 2361 + struct thread_data *t; 2362 + struct core_data *c; 2363 + struct pkg_data *p; 2364 2365 + int pkg_id = cpus[cpu].package_id; 2366 2367 + if (cpu_is_not_allowed(cpu)) 2368 + continue; 2369 2370 + if (cpus[cpu].ht_id > 0) /* skip HT sibling */ 2371 + continue; 2372 2373 + t = &thread_base[cpu]; 2374 + c = &core_base[GLOBAL_CORE_ID(cpus[cpu].core_id, pkg_id)]; 2375 + p = &pkg_base[pkg_id]; 2376 + 2377 + retval |= func(t, c, p); 2378 + 2379 + /* Handle HT sibling now */ 2380 + int i; 2381 + 2382 + for (i = MAX_HT_ID; i > 0; --i) { /* ht_id 0 is self */ 2383 + if (cpus[cpu].ht_sibling_cpu_id[i] <= 0) 2384 + continue; 2385 + t = &thread_base[cpus[cpu].ht_sibling_cpu_id[i]]; 2386 + 2387 + retval |= func(t, c, p); 2388 } 2389 } 2390 return retval; ··· 2381 2382 int is_cpu_first_thread_in_core(struct thread_data *t, struct core_data *c) 2383 { 2384 + return ((int)t->cpu_id == c->first_cpu || c->first_cpu < 0); 2385 } 2386 2387 int is_cpu_first_core_in_package(struct thread_data *t, struct pkg_data *p) 2388 { 2389 + return ((int)t->cpu_id == p->first_cpu || p->first_cpu < 0); 2390 } 2391 2392 int is_cpu_first_thread_in_package(struct thread_data *t, struct core_data *c, struct pkg_data *p) ··· 2439 static void bic_disable_perf_access(void) 2440 { 2441 CLR_BIC(BIC_IPC, &bic_enabled); 2442 + CLR_BIC(BIC_LLC_MRPS, &bic_enabled); 2443 CLR_BIC(BIC_LLC_HIT, &bic_enabled); 2444 + CLR_BIC(BIC_L2_MRPS, &bic_enabled); 2445 + CLR_BIC(BIC_L2_HIT, &bic_enabled); 2446 } 2447 2448 static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags) ··· 2552 return cpu; 2553 2554 case SCOPE_CORE: 2555 + return cpus[cpu].core_id; 2556 2557 case SCOPE_PACKAGE: 2558 + return cpus[cpu].package_id; 2559 } 2560 2561 __builtin_unreachable(); ··· 2629 " sets the Thermal Control Circuit temperature in\n" 2630 " degrees Celsius\n" 2631 " -h, --help\n" 2632 + " print this help message\n -v, --version\n\t\tprint version information\n\nFor more help, run \"man turbostat\"\n"); 2633 } 2634 2635 /* ··· 2813 if (DO_BIC(BIC_SMI)) 2814 outp += sprintf(outp, "%sSMI", (printed++ ? delim : "")); 2815 2816 + if (DO_BIC(BIC_LLC_MRPS)) 2817 + outp += sprintf(outp, "%sLLCMRPS", (printed++ ? 
delim : "")); 2818 2819 if (DO_BIC(BIC_LLC_HIT)) 2820 outp += sprintf(outp, "%sLLC%%hit", (printed++ ? delim : "")); 2821 + 2822 + if (DO_BIC(BIC_L2_MRPS)) 2823 + outp += sprintf(outp, "%sL2MRPS", (printed++ ? delim : "")); 2824 + 2825 + if (DO_BIC(BIC_L2_HIT)) 2826 + outp += sprintf(outp, "%sL2%%hit", (printed++ ? delim : "")); 2827 2828 for (mp = sys.tp; mp; mp = mp->next) 2829 outp += print_name(mp->width, &printed, delim, mp->name, mp->type, mp->format); ··· 3001 } 3002 3003 /* 3004 + * pct(numerator, denominator) 3005 * 3006 + * Return sanity checked percentage (100.0 * numerator/denominotor) 3007 * 3008 + * n < 0: nan 3009 + * d <= 0: nan 3010 + * n/d > 1.1: nan 3011 */ 3012 + double pct(double numerator, double denominator) 3013 { 3014 + double retval; 3015 3016 + if (numerator < 0) 3017 + return nan(""); 3018 3019 + if (denominator <= 0) 3020 + return nan(""); 3021 + 3022 + retval = 100.0 * numerator / denominator; 3023 + 3024 + if (retval > 110.0) 3025 + return nan(""); 3026 + 3027 + return retval; 3028 } 3029 3030 int dump_counters(PER_THREAD_PARAMS) 3031 { 3032 int i; 3033 struct msr_counter *mp; 3034 + struct platform_counters *pplat_cnt = p == odd.packages ? &platform_counters_odd : &platform_counters_even; 3035 3036 outp += sprintf(outp, "t %p, c %p, p %p\n", t, c, p); 3037 ··· 3046 3047 outp += sprintf(outp, "LLC refs: %lld", t->llc.references); 3048 outp += sprintf(outp, "LLC miss: %lld", t->llc.misses); 3049 + outp += sprintf(outp, "LLC Hit%%: %.2f", pct((t->llc.references - t->llc.misses), t->llc.references)); 3050 + 3051 + outp += sprintf(outp, "L2 refs: %lld", t->l2.references); 3052 + outp += sprintf(outp, "L2 hits: %lld", t->l2.hits); 3053 + outp += sprintf(outp, "L2 Hit%%: %.2f", pct(t->l2.hits, t->l2.references)); 3054 3055 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) { 3056 outp += sprintf(outp, "tADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, t->counter[i], mp->sp->path); ··· 3054 } 3055 3056 if (c && is_cpu_first_thread_in_core(t, c)) { 3057 + outp += sprintf(outp, "core: %d\n", cpus[t->cpu_id].core_id); 3058 outp += sprintf(outp, "c3: %016llX\n", c->c3); 3059 outp += sprintf(outp, "c6: %016llX\n", c->c6); 3060 outp += sprintf(outp, "c7: %016llX\n", c->c7); ··· 3074 } 3075 3076 if (p && is_cpu_first_core_in_package(t, p)) { 3077 outp += sprintf(outp, "Weighted cores: %016llX\n", p->pkg_wtd_core_c0); 3078 outp += sprintf(outp, "Any cores: %016llX\n", p->pkg_any_core_c0); 3079 outp += sprintf(outp, "Any GFX: %016llX\n", p->pkg_any_gfxe_c0); ··· 3141 actual_read_size = read(fd_llc_percpu[cpu], &r, expected_read_size); 3142 3143 if (actual_read_size == -1) 3144 + err(-1, "%s(cpu%d,) %d,,%ld", __func__, cpu, fd_llc_percpu[cpu], expected_read_size); 3145 3146 llc->references = r.llc.references; 3147 llc->misses = r.llc.misses; 3148 if (actual_read_size != expected_read_size) 3149 warn("%s: failed to read perf_data (req %zu act %zu)", __func__, expected_read_size, actual_read_size); 3150 + } 3151 + 3152 + void get_perf_l2_stats(int cpu, struct l2_stats *l2) 3153 + { 3154 + struct read_format { 3155 + unsigned long long num_read; 3156 + struct l2_stats l2; 3157 + } r; 3158 + const ssize_t expected_read_size = sizeof(r); 3159 + ssize_t actual_read_size; 3160 + 3161 + actual_read_size = read(fd_l2_percpu[cpu], &r, expected_read_size); 3162 + 3163 + if (actual_read_size == -1) 3164 + err(-1, "%s(cpu%d,) %d,,%ld", __func__, cpu, fd_l2_percpu[cpu], expected_read_size); 3165 + 3166 + l2->references = r.l2.references; 3167 + l2->hits = r.l2.hits; 3168 + if 
(actual_read_size != expected_read_size) 3169 + warn("%s: cpu%d: failed to read(%d) perf_data (req %zu act %zu)", __func__, cpu, fd_l2_percpu[cpu], expected_read_size, actual_read_size); 3170 } 3171 3172 /* ··· 3167 char *delim = "\t"; 3168 int printed = 0; 3169 3170 + if (t == average.threads) { 3171 pplat_cnt = count & 1 ? &platform_counters_odd : &platform_counters_even; 3172 ++count; 3173 } ··· 3181 return 0; 3182 3183 /*if not summary line and --cpu is used */ 3184 + if ((t != average.threads) && (cpu_subset && !CPU_ISSET_S(t->cpu_id, cpu_subset_size, cpu_subset))) 3185 return 0; 3186 3187 if (DO_BIC(BIC_USEC)) { ··· 3201 tsc = t->tsc * tsc_tweak; 3202 3203 /* topo columns, print blanks on 1st (average) line */ 3204 + if (t == average.threads) { 3205 if (DO_BIC(BIC_Package)) 3206 outp += sprintf(outp, "%s-", (printed++ ? delim : "")); 3207 if (DO_BIC(BIC_Die)) ··· 3221 } else { 3222 if (DO_BIC(BIC_Package)) { 3223 if (p) 3224 + outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), cpus[t->cpu_id].package_id); 3225 else 3226 outp += sprintf(outp, "%s-", (printed++ ? delim : "")); 3227 } ··· 3245 } 3246 if (DO_BIC(BIC_Core)) { 3247 if (c) 3248 + outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), cpus[t->cpu_id].core_id); 3249 else 3250 outp += sprintf(outp, "%s-", (printed++ ? delim : "")); 3251 } ··· 3261 outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), 1.0 / units * t->aperf / interval_float); 3262 3263 if (DO_BIC(BIC_Busy)) 3264 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(t->mperf, tsc)); 3265 3266 if (DO_BIC(BIC_Bzy_MHz)) { 3267 if (has_base_hz) ··· 3297 outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), t->smi_count); 3298 3299 /* LLC Stats */ 3300 + if (DO_BIC(BIC_LLC_MRPS)) 3301 + outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), t->llc.references / interval_float / 1000000); 3302 3303 + if (DO_BIC(BIC_LLC_HIT)) 3304 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), pct((t->llc.references - t->llc.misses), t->llc.references)); 3305 + 3306 + /* L2 Stats */ 3307 + if (DO_BIC(BIC_L2_MRPS)) 3308 + outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), t->l2.references / interval_float / 1000000); 3309 + 3310 + if (DO_BIC(BIC_L2_HIT)) 3311 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), pct(t->l2.hits, t->l2.references)); 3312 3313 /* Added Thread Counters */ 3314 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) { ··· 3315 if (mp->type == COUNTER_USEC) 3316 outp += print_float_value(&printed, delim, t->counter[i] / interval_float / 10000); 3317 else 3318 + outp += print_float_value(&printed, delim, pct(t->counter[i], tsc)); 3319 } 3320 } 3321 ··· 3329 if (pp->type == COUNTER_USEC) 3330 outp += print_float_value(&printed, delim, t->perf_counter[i] / interval_float / 10000); 3331 else 3332 + outp += print_float_value(&printed, delim, pct(t->perf_counter[i], tsc)); 3333 } 3334 } 3335 ··· 3343 break; 3344 3345 case PMT_TYPE_XTAL_TIME: 3346 + value_converted = pct(value_raw / crystal_hz, interval_float); 3347 outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted); 3348 break; 3349 3350 case PMT_TYPE_TCORE_CLOCK: 3351 + value_converted = pct(value_raw / tcore_clock_freq_hz, interval_float); 3352 outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted); 3353 } 3354 } 3355 3356 /* C1 */ 3357 if (DO_BIC(BIC_CPU_c1)) 3358 + outp += sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), pct(t->c1, tsc)); 3359 3360 /* print per-core data only for 1st thread in core */ 3361 if (!is_cpu_first_thread_in_core(t, c)) 3362 goto done; 3363 3364 if (DO_BIC(BIC_CPU_c3)) 3365 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c3, tsc)); 3366 if (DO_BIC(BIC_CPU_c6)) 3367 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c6, tsc)); 3368 if (DO_BIC(BIC_CPU_c7)) 3369 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c7, tsc)); 3370 3371 /* Mod%c6 */ 3372 if (DO_BIC(BIC_Mod_c6)) 3373 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->mc6_us, tsc)); 3374 3375 if (DO_BIC(BIC_CoreTmp)) 3376 outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), c->core_temp_c); ··· 3386 else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3387 outp += print_decimal_value(mp->width, &printed, delim, c->counter[i]); 3388 else if (mp->format == FORMAT_PERCENT) 3389 + outp += print_float_value(&printed, delim, pct(c->counter[i], tsc)); 3390 } 3391 3392 /* Added perf Core counters */ ··· 3396 else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3397 outp += print_decimal_value(pp->width, &printed, delim, c->perf_counter[i]); 3398 else if (pp->format == FORMAT_PERCENT) 3399 + outp += print_float_value(&printed, delim, pct(c->perf_counter[i], tsc)); 3400 } 3401 3402 /* Added PMT Core counters */ ··· 3409 break; 3410 3411 case PMT_TYPE_XTAL_TIME: 3412 + value_converted = pct(value_raw / crystal_hz, interval_float); 3413 outp += print_float_value(&printed, delim, value_converted); 3414 break; 3415 3416 case PMT_TYPE_TCORE_CLOCK: 3417 + value_converted = pct(value_raw / tcore_clock_freq_hz, interval_float); 3418 outp += print_float_value(&printed, delim, value_converted); 3419 } 3420 } ··· 3470 if (DO_BIC(BIC_Totl_c0)) 3471 outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100 * p->pkg_wtd_core_c0 / tsc); /* can exceed 100% */ 3472 if (DO_BIC(BIC_Any_c0)) 3473 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_core_c0, tsc)); 3474 if (DO_BIC(BIC_GFX_c0)) 3475 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_gfxe_c0, tsc)); 3476 if (DO_BIC(BIC_CPUGFX)) 3477 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_both_core_gfxe_c0, tsc)); 3478 3479 if (DO_BIC(BIC_Pkgpc2)) 3480 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc2, tsc)); 3481 if (DO_BIC(BIC_Pkgpc3)) 3482 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc3, tsc)); 3483 if (DO_BIC(BIC_Pkgpc6)) 3484 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc6, tsc)); 3485 if (DO_BIC(BIC_Pkgpc7)) 3486 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc7, tsc)); 3487 if (DO_BIC(BIC_Pkgpc8)) 3488 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc8, tsc)); 3489 if (DO_BIC(BIC_Pkgpc9)) 3490 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc9, tsc)); 3491 if (DO_BIC(BIC_Pkgpc10)) 3492 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc10, tsc)); 3493 3494 if (DO_BIC(BIC_Diec6)) 3495 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->die_c6 / crystal_hz, interval_float)); 3496 3497 if (DO_BIC(BIC_CPU_LPI)) { 3498 if (p->cpu_lpi >= 0) 3499 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->cpu_lpi / 1000000.0, interval_float)); 3500 else 3501 outp += sprintf(outp, "%s(neg)", (printed++ ? 
delim : "")); 3502 } 3503 if (DO_BIC(BIC_SYS_LPI)) { 3504 if (p->sys_lpi >= 0) 3505 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->sys_lpi / 1000000.0, interval_float)); 3506 else 3507 outp += sprintf(outp, "%s(neg)", (printed++ ? delim : "")); 3508 } ··· 3524 if (DO_BIC(BIC_RAM_J)) 3525 outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_JOULES, interval_float)); 3526 if (DO_BIC(BIC_PKG__)) 3527 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_pkg_perf_status, RAPL_UNIT_WATTS, interval_float)); 3528 if (DO_BIC(BIC_RAM__)) 3529 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_dram_perf_status, RAPL_UNIT_WATTS, interval_float)); 3530 /* UncMHz */ 3531 if (DO_BIC(BIC_UNCORE_MHZ)) 3532 outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->uncore_mhz); ··· 3542 else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3543 outp += print_decimal_value(mp->width, &printed, delim, p->counter[i]); 3544 else if (mp->format == FORMAT_PERCENT) 3545 + outp += print_float_value(&printed, delim, pct(p->counter[i], tsc)); 3546 } 3547 3548 /* Added perf Package Counters */ ··· 3554 else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3555 outp += print_decimal_value(pp->width, &printed, delim, p->perf_counter[i]); 3556 else if (pp->format == FORMAT_PERCENT) 3557 + outp += print_float_value(&printed, delim, pct(p->perf_counter[i], tsc)); 3558 } 3559 3560 /* Added PMT Package Counters */ ··· 3567 break; 3568 3569 case PMT_TYPE_XTAL_TIME: 3570 + value_converted = pct(value_raw / crystal_hz, interval_float); 3571 outp += print_float_value(&printed, delim, value_converted); 3572 break; 3573 3574 case PMT_TYPE_TCORE_CLOCK: 3575 + value_converted = pct(value_raw / tcore_clock_freq_hz, interval_float); 3576 outp += print_float_value(&printed, delim, value_converted); 3577 } 3578 } 3579 3580 + if (DO_BIC(BIC_SysWatt) && (t == average.threads)) 3581 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&pplat_cnt->energy_psys, RAPL_UNIT_WATTS, interval_float)); 3582 + if (DO_BIC(BIC_Sys_J) && (t == average.threads)) 3583 + outp += sprintf(outp, fmt8, (printed++ ? 
delim : ""), rapl_counter_get_value(&pplat_cnt->energy_psys, RAPL_UNIT_JOULES, interval_float)); 3584 3585 done: 3586 if (*(outp - 1) != '\n') ··· 3620 if ((!count || (header_iterations && !(count % header_iterations))) || !summary_only) 3621 print_header("\t"); 3622 3623 + format_counters(average.threads, average.cores, average.packages); 3624 3625 count++; 3626 ··· 3795 /* check for TSC < 1 Mcycles over interval */ 3796 if (old->tsc < (1000 * 1000)) 3797 errx(-3, "Insanely slow TSC rate, TSC stops in idle?\n" 3798 + "You can disable all c-states by booting with \"idle=poll\"\nor just the deep ones with \"processor.max_cstate=1\""); 3799 3800 old->c1 = new->c1 - old->c1; 3801 ··· 3846 if (DO_BIC(BIC_SMI)) 3847 old->smi_count = new->smi_count - old->smi_count; 3848 3849 + if (DO_BIC(BIC_LLC_MRPS) || DO_BIC(BIC_LLC_HIT)) 3850 old->llc.references = new->llc.references - old->llc.references; 3851 3852 if (DO_BIC(BIC_LLC_HIT)) 3853 old->llc.misses = new->llc.misses - old->llc.misses; 3854 + 3855 + if (DO_BIC(BIC_L2_MRPS) || DO_BIC(BIC_L2_HIT)) 3856 + old->l2.references = new->l2.references - old->l2.references; 3857 + 3858 + if (DO_BIC(BIC_L2_HIT)) 3859 + old->l2.hits = new->l2.hits - old->l2.hits; 3860 3861 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) { 3862 if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) ··· 3932 t->llc.references = 0; 3933 t->llc.misses = 0; 3934 3935 + t->l2.references = 0; 3936 + t->l2.hits = 0; 3937 + 3938 c->c3 = 0; 3939 c->c6 = 0; 3940 c->c7 = 0; ··· 3939 c->core_temp_c = 0; 3940 rapl_counter_clear(&c->core_energy); 3941 c->core_throt_cnt = 0; 3942 3943 p->pkg_wtd_core_c0 = 0; 3944 p->pkg_any_core_c0 = 0; ··· 4018 4019 /* copy un-changing apic_id's */ 4020 if (DO_BIC(BIC_APIC)) 4021 + average.threads->apic_id = t->apic_id; 4022 if (DO_BIC(BIC_X2APIC)) 4023 + average.threads->x2apic_id = t->x2apic_id; 4024 4025 /* remember first tv_begin */ 4026 + if (average.threads->tv_begin.tv_sec == 0) 4027 + average.threads->tv_begin = procsysfs_tv_begin; 4028 4029 /* remember last tv_end */ 4030 + average.threads->tv_end = t->tv_end; 4031 4032 + average.threads->tsc += t->tsc; 4033 + average.threads->aperf += t->aperf; 4034 + average.threads->mperf += t->mperf; 4035 + average.threads->c1 += t->c1; 4036 4037 + average.threads->instr_count += t->instr_count; 4038 4039 + average.threads->irq_count += t->irq_count; 4040 + average.threads->nmi_count += t->nmi_count; 4041 + average.threads->smi_count += t->smi_count; 4042 4043 + average.threads->llc.references += t->llc.references; 4044 + average.threads->llc.misses += t->llc.misses; 4045 + 4046 + average.threads->l2.references += t->l2.references; 4047 + average.threads->l2.hits += t->l2.hits; 4048 4049 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) { 4050 if (mp->format == FORMAT_RAW) 4051 continue; 4052 + average.threads->counter[i] += t->counter[i]; 4053 } 4054 4055 for (i = 0, pp = sys.perf_tp; pp; i++, pp = pp->next) { 4056 if (pp->format == FORMAT_RAW) 4057 continue; 4058 + average.threads->perf_counter[i] += t->perf_counter[i]; 4059 } 4060 4061 for (i = 0, ppmt = sys.pmt_tp; ppmt; i++, ppmt = ppmt->next) { 4062 + average.threads->pmt_counter[i] += t->pmt_counter[i]; 4063 } 4064 4065 /* sum per-core values only for 1st thread in core */ 4066 if (!is_cpu_first_thread_in_core(t, c)) 4067 return 0; 4068 4069 + average.cores->c3 += c->c3; 4070 + average.cores->c6 += c->c6; 4071 + average.cores->c7 += c->c7; 4072 + average.cores->mc6_us += c->mc6_us; 4073 4074 + average.cores->core_temp_c = 
MAX(average.cores->core_temp_c, c->core_temp_c); 4075 + average.cores->core_throt_cnt = MAX(average.cores->core_throt_cnt, c->core_throt_cnt); 4076 4077 + rapl_counter_accumulate(&average.cores->core_energy, &c->core_energy); 4078 4079 for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) { 4080 if (mp->format == FORMAT_RAW) 4081 continue; 4082 + average.cores->counter[i] += c->counter[i]; 4083 } 4084 4085 for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) { 4086 if (pp->format == FORMAT_RAW) 4087 continue; 4088 + average.cores->perf_counter[i] += c->perf_counter[i]; 4089 } 4090 4091 for (i = 0, ppmt = sys.pmt_cp; ppmt; i++, ppmt = ppmt->next) { 4092 + average.cores->pmt_counter[i] += c->pmt_counter[i]; 4093 } 4094 4095 /* sum per-pkg values only for 1st core in pkg */ ··· 4094 return 0; 4095 4096 if (DO_BIC(BIC_Totl_c0)) 4097 + average.packages->pkg_wtd_core_c0 += p->pkg_wtd_core_c0; 4098 if (DO_BIC(BIC_Any_c0)) 4099 + average.packages->pkg_any_core_c0 += p->pkg_any_core_c0; 4100 if (DO_BIC(BIC_GFX_c0)) 4101 + average.packages->pkg_any_gfxe_c0 += p->pkg_any_gfxe_c0; 4102 if (DO_BIC(BIC_CPUGFX)) 4103 + average.packages->pkg_both_core_gfxe_c0 += p->pkg_both_core_gfxe_c0; 4104 4105 + average.packages->pc2 += p->pc2; 4106 if (DO_BIC(BIC_Pkgpc3)) 4107 + average.packages->pc3 += p->pc3; 4108 if (DO_BIC(BIC_Pkgpc6)) 4109 + average.packages->pc6 += p->pc6; 4110 if (DO_BIC(BIC_Pkgpc7)) 4111 + average.packages->pc7 += p->pc7; 4112 + average.packages->pc8 += p->pc8; 4113 + average.packages->pc9 += p->pc9; 4114 + average.packages->pc10 += p->pc10; 4115 + average.packages->die_c6 += p->die_c6; 4116 4117 + average.packages->cpu_lpi = p->cpu_lpi; 4118 + average.packages->sys_lpi = p->sys_lpi; 4119 4120 + rapl_counter_accumulate(&average.packages->energy_pkg, &p->energy_pkg); 4121 + rapl_counter_accumulate(&average.packages->energy_dram, &p->energy_dram); 4122 + rapl_counter_accumulate(&average.packages->energy_cores, &p->energy_cores); 4123 + rapl_counter_accumulate(&average.packages->energy_gfx, &p->energy_gfx); 4124 4125 + average.packages->gfx_rc6_ms = p->gfx_rc6_ms; 4126 + average.packages->uncore_mhz = p->uncore_mhz; 4127 + average.packages->gfx_mhz = p->gfx_mhz; 4128 + average.packages->gfx_act_mhz = p->gfx_act_mhz; 4129 + average.packages->sam_mc6_ms = p->sam_mc6_ms; 4130 + average.packages->sam_mhz = p->sam_mhz; 4131 + average.packages->sam_act_mhz = p->sam_act_mhz; 4132 4133 + average.packages->pkg_temp_c = MAX(average.packages->pkg_temp_c, p->pkg_temp_c); 4134 4135 + rapl_counter_accumulate(&average.packages->rapl_pkg_perf_status, &p->rapl_pkg_perf_status); 4136 + rapl_counter_accumulate(&average.packages->rapl_dram_perf_status, &p->rapl_dram_perf_status); 4137 4138 for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) { 4139 if ((mp->format == FORMAT_RAW) && (topo.num_packages == 0)) 4140 + average.packages->counter[i] = p->counter[i]; 4141 else 4142 + average.packages->counter[i] += p->counter[i]; 4143 } 4144 4145 for (i = 0, pp = sys.perf_pp; pp; i++, pp = pp->next) { 4146 if ((pp->format == FORMAT_RAW) && (topo.num_packages == 0)) 4147 + average.packages->perf_counter[i] = p->perf_counter[i]; 4148 else 4149 + average.packages->perf_counter[i] += p->perf_counter[i]; 4150 } 4151 4152 for (i = 0, ppmt = sys.pmt_pp; ppmt; i++, ppmt = ppmt->next) { 4153 + average.packages->pmt_counter[i] += p->pmt_counter[i]; 4154 } 4155 4156 return 0; ··· 4167 struct perf_counter_info *pp; 4168 struct pmt_counter *ppmt; 4169 4170 + clear_counters(average.threads, average.cores, average.packages); 4171 4172 
for_all_cpus(sum_counters, t, c, p); 4173 4174 /* Use the global time delta for the average. */ 4175 + average.threads->tv_delta = tv_delta; 4176 4177 + average.threads->tsc /= topo.allowed_cpus; 4178 + average.threads->aperf /= topo.allowed_cpus; 4179 + average.threads->mperf /= topo.allowed_cpus; 4180 + average.threads->instr_count /= topo.allowed_cpus; 4181 + average.threads->c1 /= topo.allowed_cpus; 4182 4183 + if (average.threads->irq_count > 9999999) 4184 sums_need_wide_columns = 1; 4185 + if (average.threads->nmi_count > 9999999) 4186 sums_need_wide_columns = 1; 4187 4188 + average.cores->c3 /= topo.allowed_cores; 4189 + average.cores->c6 /= topo.allowed_cores; 4190 + average.cores->c7 /= topo.allowed_cores; 4191 + average.cores->mc6_us /= topo.allowed_cores; 4192 4193 if (DO_BIC(BIC_Totl_c0)) 4194 + average.packages->pkg_wtd_core_c0 /= topo.allowed_packages; 4195 if (DO_BIC(BIC_Any_c0)) 4196 + average.packages->pkg_any_core_c0 /= topo.allowed_packages; 4197 if (DO_BIC(BIC_GFX_c0)) 4198 + average.packages->pkg_any_gfxe_c0 /= topo.allowed_packages; 4199 if (DO_BIC(BIC_CPUGFX)) 4200 + average.packages->pkg_both_core_gfxe_c0 /= topo.allowed_packages; 4201 4202 + average.packages->pc2 /= topo.allowed_packages; 4203 if (DO_BIC(BIC_Pkgpc3)) 4204 + average.packages->pc3 /= topo.allowed_packages; 4205 if (DO_BIC(BIC_Pkgpc6)) 4206 + average.packages->pc6 /= topo.allowed_packages; 4207 if (DO_BIC(BIC_Pkgpc7)) 4208 + average.packages->pc7 /= topo.allowed_packages; 4209 4210 + average.packages->pc8 /= topo.allowed_packages; 4211 + average.packages->pc9 /= topo.allowed_packages; 4212 + average.packages->pc10 /= topo.allowed_packages; 4213 + average.packages->die_c6 /= topo.allowed_packages; 4214 4215 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) { 4216 if (mp->format == FORMAT_RAW) 4217 continue; 4218 if (mp->type == COUNTER_ITEMS) { 4219 + if (average.threads->counter[i] > 9999999) 4220 sums_need_wide_columns = 1; 4221 continue; 4222 } 4223 + average.threads->counter[i] /= topo.allowed_cpus; 4224 } 4225 for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) { 4226 if (mp->format == FORMAT_RAW) 4227 continue; 4228 if (mp->type == COUNTER_ITEMS) { 4229 + if (average.cores->counter[i] > 9999999) 4230 sums_need_wide_columns = 1; 4231 } 4232 + average.cores->counter[i] /= topo.allowed_cores; 4233 } 4234 for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) { 4235 if (mp->format == FORMAT_RAW) 4236 continue; 4237 if (mp->type == COUNTER_ITEMS) { 4238 + if (average.packages->counter[i] > 9999999) 4239 sums_need_wide_columns = 1; 4240 } 4241 + average.packages->counter[i] /= topo.allowed_packages; 4242 } 4243 4244 for (i = 0, pp = sys.perf_tp; pp; i++, pp = pp->next) { 4245 if (pp->format == FORMAT_RAW) 4246 continue; 4247 if (pp->type == COUNTER_ITEMS) { 4248 + if (average.threads->perf_counter[i] > 9999999) 4249 sums_need_wide_columns = 1; 4250 continue; 4251 } 4252 + average.threads->perf_counter[i] /= topo.allowed_cpus; 4253 } 4254 for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) { 4255 if (pp->format == FORMAT_RAW) 4256 continue; 4257 if (pp->type == COUNTER_ITEMS) { 4258 + if (average.cores->perf_counter[i] > 9999999) 4259 sums_need_wide_columns = 1; 4260 } 4261 + average.cores->perf_counter[i] /= topo.allowed_cores; 4262 } 4263 for (i = 0, pp = sys.perf_pp; pp; i++, pp = pp->next) { 4264 if (pp->format == FORMAT_RAW) 4265 continue; 4266 if (pp->type == COUNTER_ITEMS) { 4267 + if (average.packages->perf_counter[i] > 9999999) 4268 sums_need_wide_columns = 1; 4269 } 4270 + 
average.packages->perf_counter[i] /= topo.allowed_packages; 4271 } 4272 4273 for (i = 0, ppmt = sys.pmt_tp; ppmt; i++, ppmt = ppmt->next) { 4274 + average.threads->pmt_counter[i] /= topo.allowed_cpus; 4275 } 4276 for (i = 0, ppmt = sys.pmt_cp; ppmt; i++, ppmt = ppmt->next) { 4277 + average.cores->pmt_counter[i] /= topo.allowed_cores; 4278 } 4279 for (i = 0, ppmt = sys.pmt_pp; ppmt; i++, ppmt = ppmt->next) { 4280 + average.packages->pmt_counter[i] /= topo.allowed_packages; 4281 } 4282 } 4283
··· 4645 4646 int get_rapl_counters(int cpu, unsigned int domain, struct core_data *c, struct pkg_data *p) 4647 { 4648 + struct platform_counters *pplat_cnt = p == odd.packages ? &platform_counters_odd : &platform_counters_even; 4649 unsigned long long perf_data[NUM_RAPL_COUNTERS + 1]; 4650 struct rapl_counter_info_t *rci; 4651 ··· 5002 /* Rapl domain enumeration helpers */ 5003 static inline int get_rapl_num_domains(void) 5004 { 5005 if (!platform->has_per_core_rapl) 5006 + return topo.num_packages; 5007 5008 + return topo.num_cores; 5009 } 5010 5011 static inline int get_rapl_domain_id(int cpu) 5012 { 5013 if (!platform->has_per_core_rapl) 5014 + return cpus[cpu].package_id; 5015 5016 + return GLOBAL_CORE_ID(cpus[cpu].core_id, cpus[cpu].package_id); 5017 } 5018 5019 /*
··· 5058 5059 get_smi_aperf_mperf(cpu, t); 5060 5061 + if (DO_BIC(BIC_LLC_MRPS) || DO_BIC(BIC_LLC_HIT)) 5062 get_perf_llc_stats(cpu, &t->llc); 5063 + 5064 + if (DO_BIC(BIC_L2_MRPS) || DO_BIC(BIC_L2_HIT)) 5065 + get_perf_l2_stats(cpu, &t->l2); 5066 5067 if (DO_BIC(BIC_IPC)) 5068 if (read(get_instr_count_fd(cpu), &t->instr_count, sizeof(long long)) != sizeof(long long)) ··· 5125 return -10; 5126 5127 for (i = 0, pp = sys.pmt_cp; pp; i++, pp = pp->next) 5128 + c->pmt_counter[i] = pmt_read_counter(pp, cpus[t->cpu_id].core_id); 5129 5130 /* collect package counters only for 1st core in package */ 5131 if (!is_cpu_first_core_in_package(t, p)) ··· 5166 } 5167 5168 if (DO_BIC(BIC_UNCORE_MHZ)) 5169 + p->uncore_mhz = get_legacy_uncore_mhz(cpus[t->cpu_id].package_id); 5170 5171 if (DO_BIC(BIC_GFX_rc6)) 5172 p->gfx_rc6_ms = gfx_info[GFX_rc6].val_ull; ··· 5190 char *path = NULL; 5191 5192 if (mp->msr_num == 0) { 5193 + path = find_sysfs_path_by_id(mp->sp, cpus[t->cpu_id].package_id); 5194 if (path == NULL) { 5195 + warnx("%s: package_id %d not found", __func__, cpus[t->cpu_id].package_id); 5196 return -10; 5197 } 5198 } ··· 5204 return -10; 5205 5206 for (i = 0, pp = sys.pmt_pp; pp; i++, pp = pp->next) 5207 + p->pmt_counter[i] = pmt_read_counter(pp, cpus[t->cpu_id].package_id); 5208 5209 done: 5210 gettimeofday(&t->tv_end, (struct timezone *)NULL); ··· 5293 return; 5294 } 5295 5296 + get_msr(master_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr); 5297 pkg_cstate_limit = pkg_cstate_limits[msr & 0xF]; 5298 } 5299 ··· 5305 if (!platform->has_nhm_msrs || no_msr) 5306 return; 5307 5308 + get_msr(master_cpu, MSR_PLATFORM_INFO, &msr); 5309 5310 + fprintf(outf, "cpu%d: MSR_PLATFORM_INFO: 0x%08llx\n", master_cpu, msr); 5311 5312 ratio = (msr >> 40) & 0xFF; 5313 fprintf(outf, "%d * %.1f = %.1f MHz max efficiency frequency\n", ratio, bclk, ratio * bclk); ··· 5323 if (!platform->has_nhm_msrs || no_msr) 5324 return; 5325 5326 + get_msr(master_cpu, MSR_IA32_POWER_CTL, &msr); 5327 + fprintf(outf, "cpu%d: MSR_IA32_POWER_CTL: 0x%08llx (C1E auto-promotion: %sabled)\n", master_cpu, msr, msr & 0x2 ? 
"EN" : "DIS"); 5328 5329 /* C-state Pre-wake Disable (CSTATE_PREWAKE_DISABLE) */ 5330 if (platform->has_cst_prewake_bit) ··· 5338 unsigned long long msr; 5339 unsigned int ratio; 5340 5341 + get_msr(master_cpu, MSR_TURBO_RATIO_LIMIT2, &msr); 5342 5343 + fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT2: 0x%08llx\n", master_cpu, msr); 5344 5345 ratio = (msr >> 8) & 0xFF; 5346 if (ratio) ··· 5357 unsigned long long msr; 5358 unsigned int ratio; 5359 5360 + get_msr(master_cpu, MSR_TURBO_RATIO_LIMIT1, &msr); 5361 5362 + fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT1: 0x%08llx\n", master_cpu, msr); 5363 5364 ratio = (msr >> 56) & 0xFF; 5365 if (ratio) ··· 5400 unsigned long long msr, core_counts; 5401 int shift; 5402 5403 + get_msr(master_cpu, trl_msr_offset, &msr); 5404 + fprintf(outf, "cpu%d: MSR_%sTURBO_RATIO_LIMIT: 0x%08llx\n", master_cpu, trl_msr_offset == MSR_SECONDARY_TURBO_RATIO_LIMIT ? "SECONDARY_" : "", msr); 5405 5406 if (platform->trl_msrs & TRL_CORECOUNT) { 5407 + get_msr(master_cpu, MSR_TURBO_RATIO_LIMIT1, &core_counts); 5408 + fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT1: 0x%08llx\n", master_cpu, core_counts); 5409 } else { 5410 core_counts = 0x0807060504030201; 5411 } ··· 5428 unsigned long long msr; 5429 unsigned int ratio; 5430 5431 + get_msr(master_cpu, MSR_ATOM_CORE_RATIOS, &msr); 5432 + fprintf(outf, "cpu%d: MSR_ATOM_CORE_RATIOS: 0x%08llx\n", master_cpu, msr & 0xFFFFFFFF); 5433 5434 ratio = (msr >> 0) & 0x3F; 5435 if (ratio) ··· 5443 if (ratio) 5444 fprintf(outf, "%d * %.1f = %.1f MHz base frequency\n", ratio, bclk, ratio * bclk); 5445 5446 + get_msr(master_cpu, MSR_ATOM_CORE_TURBO_RATIOS, &msr); 5447 + fprintf(outf, "cpu%d: MSR_ATOM_CORE_TURBO_RATIOS: 0x%08llx\n", master_cpu, msr & 0xFFFFFFFF); 5448 5449 ratio = (msr >> 24) & 0x3F; 5450 if (ratio) ··· 5473 unsigned int cores[buckets_no]; 5474 unsigned int ratio[buckets_no]; 5475 5476 + get_msr(master_cpu, MSR_TURBO_RATIO_LIMIT, &msr); 5477 5478 + fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT: 0x%08llx\n", master_cpu, msr); 5479 5480 /* 5481 * Turbo encoding in KNL is as follows: ··· 5525 if (!platform->has_nhm_msrs || no_msr) 5526 return; 5527 5528 + get_msr(master_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr); 5529 5530 + fprintf(outf, "cpu%d: MSR_PKG_CST_CONFIG_CONTROL: 0x%08llx", master_cpu, msr); 5531 5532 fprintf(outf, " (%s%s%s%s%slocked, pkg-cstate-limit=%d (%s)", 5533 (msr & SNB_C3_AUTO_UNDEMOTE) ? 
"UNdemote-C3, " : "", ··· 5550 { 5551 unsigned long long msr; 5552 5553 + get_msr(master_cpu, MSR_CONFIG_TDP_NOMINAL, &msr); 5554 + fprintf(outf, "cpu%d: MSR_CONFIG_TDP_NOMINAL: 0x%08llx", master_cpu, msr); 5555 fprintf(outf, " (base_ratio=%d)\n", (unsigned int)msr & 0xFF); 5556 5557 + get_msr(master_cpu, MSR_CONFIG_TDP_LEVEL_1, &msr); 5558 + fprintf(outf, "cpu%d: MSR_CONFIG_TDP_LEVEL_1: 0x%08llx (", master_cpu, msr); 5559 if (msr) { 5560 fprintf(outf, "PKG_MIN_PWR_LVL1=%d ", (unsigned int)(msr >> 48) & 0x7FFF); 5561 fprintf(outf, "PKG_MAX_PWR_LVL1=%d ", (unsigned int)(msr >> 32) & 0x7FFF); ··· 5564 } 5565 fprintf(outf, ")\n"); 5566 5567 + get_msr(master_cpu, MSR_CONFIG_TDP_LEVEL_2, &msr); 5568 + fprintf(outf, "cpu%d: MSR_CONFIG_TDP_LEVEL_2: 0x%08llx (", master_cpu, msr); 5569 if (msr) { 5570 fprintf(outf, "PKG_MIN_PWR_LVL2=%d ", (unsigned int)(msr >> 48) & 0x7FFF); 5571 fprintf(outf, "PKG_MAX_PWR_LVL2=%d ", (unsigned int)(msr >> 32) & 0x7FFF); ··· 5574 } 5575 fprintf(outf, ")\n"); 5576 5577 + get_msr(master_cpu, MSR_CONFIG_TDP_CONTROL, &msr); 5578 + fprintf(outf, "cpu%d: MSR_CONFIG_TDP_CONTROL: 0x%08llx (", master_cpu, msr); 5579 if ((msr) & 0x3) 5580 fprintf(outf, "TDP_LEVEL=%d ", (unsigned int)(msr) & 0x3); 5581 fprintf(outf, " lock=%d", (unsigned int)(msr >> 31) & 1); 5582 fprintf(outf, ")\n"); 5583 5584 + get_msr(master_cpu, MSR_TURBO_ACTIVATION_RATIO, &msr); 5585 + fprintf(outf, "cpu%d: MSR_TURBO_ACTIVATION_RATIO: 0x%08llx (", master_cpu, msr); 5586 fprintf(outf, "MAX_NON_TURBO_RATIO=%d", (unsigned int)(msr) & 0xFF); 5587 fprintf(outf, " lock=%d", (unsigned int)(msr >> 31) & 1); 5588 fprintf(outf, ")\n"); ··· 5598 return; 5599 5600 if (platform->supported_cstates & PC3) { 5601 + get_msr(master_cpu, MSR_PKGC3_IRTL, &msr); 5602 + fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", master_cpu, msr); 5603 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5604 } 5605 5606 if (platform->supported_cstates & PC6) { 5607 + get_msr(master_cpu, MSR_PKGC6_IRTL, &msr); 5608 + fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", master_cpu, msr); 5609 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5610 } 5611 5612 if (platform->supported_cstates & PC7) { 5613 + get_msr(master_cpu, MSR_PKGC7_IRTL, &msr); 5614 + fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", master_cpu, msr); 5615 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5616 } 5617 5618 if (platform->supported_cstates & PC8) { 5619 + get_msr(master_cpu, MSR_PKGC8_IRTL, &msr); 5620 + fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", master_cpu, msr); 5621 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5622 } 5623 5624 if (platform->supported_cstates & PC9) { 5625 + get_msr(master_cpu, MSR_PKGC9_IRTL, &msr); 5626 + fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", master_cpu, msr); 5627 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5628 } 5629 5630 if (platform->supported_cstates & PC10) { 5631 + get_msr(master_cpu, MSR_PKGC10_IRTL, &msr); 5632 + fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", master_cpu, msr); 5633 fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? 
"" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]); 5634 } 5635 } ··· 5676 5677 free(fd_llc_percpu); 5678 fd_llc_percpu = NULL; 5679 + 5680 + BIC_NOT_PRESENT(BIC_LLC_MRPS); 5681 + BIC_NOT_PRESENT(BIC_LLC_HIT); 5682 + } 5683 + 5684 + void free_fd_l2_percpu(void) 5685 + { 5686 + if (!fd_l2_percpu) 5687 + return; 5688 + 5689 + for (int i = 0; i < topo.max_cpu_num + 1; ++i) { 5690 + if (fd_l2_percpu[i] != 0) 5691 + close(fd_l2_percpu[i]); 5692 + } 5693 + 5694 + free(fd_l2_percpu); 5695 + fd_l2_percpu = NULL; 5696 + 5697 + BIC_NOT_PRESENT(BIC_L2_MRPS); 5698 + BIC_NOT_PRESENT(BIC_L2_HIT); 5699 } 5700 5701 void free_fd_cstate(void) ··· 5780 cpu_affinity_set = NULL; 5781 cpu_affinity_setsize = 0; 5782 5783 + if (perf_pcore_set) { 5784 + CPU_FREE(perf_pcore_set); 5785 + perf_pcore_set = NULL; 5786 + } 5787 5788 + if (perf_ecore_set) { 5789 + CPU_FREE(perf_ecore_set); 5790 + perf_ecore_set = NULL; 5791 + } 5792 5793 + if (perf_lcore_set) { 5794 + CPU_FREE(perf_lcore_set); 5795 + perf_lcore_set = NULL; 5796 + } 5797 5798 + free(even.threads); 5799 + free(even.cores); 5800 + free(even.packages); 5801 + 5802 + even.threads = NULL; 5803 + even.cores = NULL; 5804 + even.packages = NULL; 5805 + 5806 + free(odd.threads); 5807 + free(odd.cores); 5808 + free(odd.packages); 5809 + 5810 + odd.threads = NULL; 5811 + odd.cores = NULL; 5812 + odd.packages = NULL; 5813 5814 free(output_buffer); 5815 output_buffer = NULL; ··· 5803 free_fd_percpu(); 5804 free_fd_instr_count_percpu(); 5805 free_fd_llc_percpu(); 5806 + free_fd_l2_percpu(); 5807 free_fd_msr(); 5808 free_fd_rapl_percpu(); 5809 free_fd_cstate(); ··· 5852 return cpu == parse_int_file("/sys/devices/system/cpu/cpu%d/topology/core_siblings_list", cpu); 5853 } 5854 5855 + int get_package_id(int cpu) 5856 { 5857 return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu); 5858 } ··· 5885 for (pkg = 0; pkg < topo.num_packages; pkg++) { 5886 lnode = 0; 5887 for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 5888 + if (cpus[cpu].package_id != pkg) 5889 continue; 5890 /* find a cpu with an unset logical_node_id */ 5891 if (cpus[cpu].logical_node_id != -1) ··· 5898 * the logical_node_id 5899 */ 5900 for (cpux = cpu; cpux <= topo.max_cpu_num; cpux++) { 5901 + if ((cpus[cpux].package_id == pkg) && (cpus[cpux].physical_node_id == node)) { 5902 cpus[cpux].logical_node_id = lnode; 5903 cpu_count++; 5904 } ··· 5917 char path[80]; 5918 FILE *filep; 5919 int i; 5920 + int cpu = thiscpu->cpu_id; 5921 5922 for (i = 0; i <= topo.max_cpu_num; i++) { 5923 sprintf(path, "/sys/devices/system/cpu/cpu%d/node%i/cpulist", cpu, i); ··· 5986 return 0; 5987 } 5988 5989 + int set_thread_siblings(struct cpu_topology *thiscpu) 5990 { 5991 char path[80], character; 5992 FILE *filep; 5993 unsigned long map; 5994 int so, shift, sib_core; 5995 + int cpu = thiscpu->cpu_id; 5996 int offset = topo.max_cpu_num + 1; 5997 size_t size; 5998 int thread_id = 0; 5999 6000 thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1)); 6001 + if (thiscpu->ht_id < 0) 6002 + thiscpu->ht_id = thread_id++; 6003 if (!thiscpu->put_ids) 6004 return -1; 6005 ··· 6021 if ((map >> shift) & 0x1) { 6022 so = shift + offset; 6023 sib_core = get_core_id(so); 6024 + if (sib_core == thiscpu->core_id) { 6025 CPU_SET_S(so, size, thiscpu->put_ids); 6026 + if ((so != cpu) && (cpus[so].ht_id < 0)) { 6027 + cpus[so].ht_id = thread_id; 6028 + cpus[cpu].ht_sibling_cpu_id[thread_id] = so; 6029 + if (debug) 6030 + fprintf(stderr, "%s: cpu%d.ht_sibling_cpu_id[%d] = %d\n", __func__, cpu, thread_id, so); 
6031 + thread_id += 1;
6032 + }
6033 }
6034 }
6035 }
···
6045 struct core_data *core_base, struct pkg_data *pkg_base,
6046 struct thread_data *thread_base2, struct core_data *core_base2, struct pkg_data *pkg_base2)
6047 {
6048 + int cpu, retval;
6049
6050 retval = 0;
6051
6052 + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) {
6053 + struct thread_data *t, *t2;
6054 + struct core_data *c, *c2;
6055 + struct pkg_data *p, *p2;
6056
6057 + if (cpu_is_not_allowed(cpu))
6058 + continue;
6059
6060 + if (cpus[cpu].ht_id > 0) /* skip HT sibling */
6061 + continue;
6062
6063 + t = &thread_base[cpu];
6064 + t2 = &thread_base2[cpu];
6065 + c = &core_base[GLOBAL_CORE_ID(cpus[cpu].core_id, cpus[cpu].package_id)];
6066 + c2 = &core_base2[GLOBAL_CORE_ID(cpus[cpu].core_id, cpus[cpu].package_id)];
6067 + p = &pkg_base[cpus[cpu].package_id];
6068 + p2 = &pkg_base2[cpus[cpu].package_id];
6069
6070 + retval |= func(t, c, p, t2, c2, p2);
6071
6072 + /* Handle HT sibling now */
6073 + int i;
6074 +
6075 + for (i = MAX_HT_ID; i > 0; --i) { /* ht_id 0 is self */
6076 + if (cpus[cpu].ht_sibling_cpu_id[i] <= 0)
6077 + continue;
6078 + t = &thread_base[cpus[cpu].ht_sibling_cpu_id[i]];
6079 + t2 = &thread_base2[cpus[cpu].ht_sibling_cpu_id[i]];
6080 +
6081 + retval |= func(t, c, p, t2, c2, p2);
6082 }
6083 }
6084 return retval;
···
6125
6126 pos = fgets(buf, 1024, fp);
6127 if (!pos)
6128 + err(1, "%s: file read failed", PATH_EFFECTIVE_CPUS);
6129
6130 fclose(fp);
6131
···
6142 update_effective_str(startup);
6143
6144 if (parse_cpu_str(cpu_effective_str, cpu_effective_set, cpu_effective_setsize))
6145 + errx(1, "%s: malformed cpu str %s", PATH_EFFECTIVE_CPUS, cpu_effective_str);
6146 }
6147
6148 void linux_perf_init(void);
···
6150 void rapl_perf_init(void);
6151 void cstate_perf_init(void);
6152 void perf_llc_init(void);
6153 + void perf_l2_init(void);
6154 void added_perf_counters_init(void);
6155 void pmt_init(void);
6156
···
6162 rapl_perf_init();
6163 cstate_perf_init();
6164 perf_llc_init();
6165 + perf_l2_init();
6166 added_perf_counters_init();
6167 pmt_init();
6168 fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus, topo.allowed_cpus);
···
6170 void set_max_cpu_num(void)
6171 {
6172 FILE *filep;
6173 + int current_cpu;
6174 unsigned long dummy;
6175 char pathname[64];
6176
6177 + current_cpu = sched_getcpu();
6178 + if (current_cpu < 0)
6179 err(1, "cannot find calling cpu ID");
6180 + sprintf(pathname, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings", current_cpu);
6181
6182 filep = fopen_or_die(pathname, "r");
6183 topo.max_cpu_num = 0;
···
6205 return 0;
6206 }
6207
6208 + int clear_ht_id(int cpu)
6209 {
6210 + int i;
6211 +
6212 + cpus[cpu].ht_id = -1;
6213 + for (i = 0; i <= MAX_HT_ID; ++i)
6214 + cpus[cpu].ht_sibling_cpu_id[i] = -1;
6215 return 0;
6216 }
6217
···
6740 struct stat sb;
6741 char pathname[32];
6742
6743 + sprintf(pathname, "/dev/msr%d", master_cpu);
6744 return !stat(pathname, &sb);
6745 }
6746
···
6749 struct stat sb;
6750 char pathname[32];
6751
6752 + sprintf(pathname, "/dev/cpu/%d/msr", master_cpu);
6753 return !stat(pathname, &sb);
6754 }
6755
···
6809
6810 free_and_exit:
6811 if (cap_free(caps) == -1)
6812 + err(-6, "cap_free");
6813
6814 return ret;
6815 }
···
6826 failed += check_for_cap_sys_rawio();
6827
6828 /* test file permissions */
6829 + sprintf(pathname, use_android_msr_path ?
"/dev/msr%d" : "/dev/cpu/%d/msr", master_cpu); 6830 if (euidaccess(pathname, R_OK)) { 6831 failed++; 6832 } ··· 6855 else 6856 return; 6857 6858 + get_msr(master_cpu, MSR_PLATFORM_INFO, &msr); 6859 base_ratio = (msr >> 8) & 0xFF; 6860 6861 base_hz = base_ratio * bclk * 1000000; ··· 7006 } 7007 for (i = uncore_max_id; i >= 0; --i) { 7008 int k, l; 7009 + int unc_pkg_id, domain_id, cluster_id; 7010 char name_buf[16]; 7011 7012 sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/uncore%02d", i); 7013 7014 if (access(path_base, R_OK)) 7015 + err(1, "%s: %s", __func__, path_base); 7016 7017 sprintf(path, "%s/package_id", path_base); 7018 + unc_pkg_id = read_sysfs_int(path); 7019 7020 sprintf(path, "%s/domain_id", path_base); 7021 domain_id = read_sysfs_int(path); ··· 7038 */ 7039 if BIC_IS_ENABLED 7040 (BIC_UNCORE_MHZ) 7041 + add_counter(0, path, name_buf, 0, SCOPE_PACKAGE, COUNTER_K2M, FORMAT_AVERAGE, 0, unc_pkg_id); 7042 7043 if (quiet) 7044 continue; ··· 7047 k = read_sysfs_int(path); 7048 sprintf(path, "%s/max_freq_khz", path_base); 7049 l = read_sysfs_int(path); 7050 + fprintf(outf, "Uncore Frequency package%d domain%d cluster%d: %d - %d MHz ", unc_pkg_id, domain_id, cluster_id, k / 1000, l / 1000); 7051 7052 sprintf(path, "%s/initial_min_freq_khz", path_base); 7053 k = read_sysfs_int(path); ··· 7202 7203 for (state = 0; state < 10; ++state) { 7204 7205 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", master_cpu, state); 7206 input = fopen(path, "r"); 7207 if (input == NULL) 7208 continue; ··· 7218 7219 remove_underbar(name_buf); 7220 7221 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/desc", master_cpu, state); 7222 input = fopen(path, "r"); 7223 if (input == NULL) 7224 continue; 7225 if (!fgets(desc, sizeof(desc), input)) 7226 err(1, "%s: failed to read file", path); 7227 7228 + fprintf(outf, "cpu%d: %s: %s", master_cpu, name_buf, desc); 7229 fclose(input); 7230 } 7231 } ··· 7238 FILE *input; 7239 int turbo; 7240 7241 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_driver", master_cpu); 7242 input = fopen(path, "r"); 7243 if (input == NULL) { 7244 fprintf(outf, "NSFOD %s\n", path); ··· 7248 err(1, "%s: failed to read file", path); 7249 fclose(input); 7250 7251 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor", master_cpu); 7252 input = fopen(path, "r"); 7253 if (input == NULL) { 7254 fprintf(outf, "NSFOD %s\n", path); ··· 7258 err(1, "%s: failed to read file", path); 7259 fclose(input); 7260 7261 + fprintf(outf, "cpu%d: cpufreq driver: %s", master_cpu, driver_buf); 7262 + fprintf(outf, "cpu%d: cpufreq governor: %s", master_cpu, governor_buf); 7263 7264 sprintf(path, "/sys/devices/system/cpu/cpufreq/boost"); 7265 input = fopen(path, "r"); ··· 7521 unsigned long long msr; 7522 7523 if (valid_rapl_msrs & RAPL_PKG_POWER_INFO) 7524 + if (!get_msr(master_cpu, MSR_PKG_POWER_INFO, &msr)) 7525 return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units; 7526 return get_quirk_tdp(); 7527 } ··· 7560 CLR_BIC(BIC_RAM__, &bic_enabled); 7561 7562 /* units on package 0, verify later other packages match */ 7563 + if (get_msr(master_cpu, MSR_RAPL_POWER_UNIT, &msr)) 7564 return; 7565 7566 rapl_power_units = 1.0 / (1 << (msr & 0xF)); ··· 7608 if (!valid_rapl_msrs || no_msr) 7609 return; 7610 7611 + if (get_msr(master_cpu, MSR_RAPL_PWR_UNIT, &msr)) 7612 return; 7613 7614 rapl_time_units = ldexp(1.0, -(msr >> 16 & 0xf)); ··· 7817 return -1; 7818 } 7819 7820 + fprintf(outf, "cpu%d: %s: 0x%08llx (%f Watts, %f Joules, 
%f sec.)\n", cpu, msr_name, msr, rapl_power_units, rapl_energy_units, rapl_time_units);
7821
7822 if (valid_rapl_msrs & RAPL_PKG_POWER_INFO) {
7823
···
7850 return -9;
7851
7852 fprintf(outf, "cpu%d: MSR_VR_CURRENT_CONFIG: 0x%08llx\n", cpu, msr);
7853 + fprintf(outf, "cpu%d: PKG Limit #4: %f Watts (%slocked)\n", cpu, ((msr >> 0) & 0x1FFF) * rapl_power_units, (msr >> 31) & 1 ? "" : "UN");
7854 }
7855
7856 if (valid_rapl_msrs & RAPL_DRAM_POWER_INFO) {
···
7919 if (offset < 0)
7920 return;
7921
7922 + ret = get_msr(master_cpu, offset, &msr_value);
7923 if (ret) {
7924 if (debug)
7925 fprintf(outf, "Can not read RAPL_PKG_ENERGY MSR(0x%llx)\n", (unsigned long long)offset);
···
8004 if (!platform->has_nhm_msrs || no_msr)
8005 goto guess;
8006
8007 + if (get_msr(master_cpu, MSR_IA32_TEMPERATURE_TARGET, &msr))
8008 goto guess;
8009
8010 tcc_default = (msr >> 16) & 0xFF;
···
8013 int bits = platform->tcc_offset_bits;
8014 unsigned long long enabled = 0;
8015
8016 + if (bits && !get_msr(master_cpu, MSR_PLATFORM_INFO, &enabled))
8017 enabled = (enabled >> 30) & 1;
8018
8019 if (bits && enabled) {
···
8148 if (no_msr)
8149 return;
8150
8151 + if (quiet)
8152 + return;
8153 +
8154 + if (!get_msr(master_cpu, MSR_IA32_FEAT_CTL, &msr))
8155 fprintf(outf, "cpu%d: MSR_IA32_FEATURE_CONTROL: 0x%08llx (%sLocked %s)\n",
8156 + master_cpu, msr, msr & FEAT_CTL_LOCKED ? "" : "UN-", msr & (1 << 18) ? "SGX" : "");
8157 }
8158
8159 void decode_misc_enable_msr(void)
···
8163 if (!genuine_intel)
8164 return;
8165
8166 + if (!get_msr(master_cpu, MSR_IA32_MISC_ENABLE, &msr))
8167 fprintf(outf, "cpu%d: MSR_IA32_MISC_ENABLE: 0x%08llx (%sTCC %sEIST %sMWAIT %sPREFETCH %sTURBO)\n",
8168 + master_cpu, msr,
8169 msr & MSR_IA32_MISC_ENABLE_TM1 ? "" : "No-",
8170 msr & MSR_IA32_MISC_ENABLE_ENHANCED_SPEEDSTEP ? "" : "No-",
8171 msr & MSR_IA32_MISC_ENABLE_MWAIT ? "" : "No-",
···
8182 if (!platform->has_msr_misc_feature_control)
8183 return;
8184
8185 + if (!get_msr(master_cpu, MSR_MISC_FEATURE_CONTROL, &msr))
8186 fprintf(outf,
8187 "cpu%d: MSR_MISC_FEATURE_CONTROL: 0x%08llx (%sL2-Prefetch %sL2-Prefetch-pair %sL1-Prefetch %sL1-IP-Prefetch)\n",
8188 + master_cpu, msr, msr & (1 << 0) ? "No-" : "", msr & (1 << 1) ? "No-" : "", msr & (1 << 2) ? "No-" : "", msr & (1 << 3) ? "No-" : "");
8189 }
8190
8191 /*
···
8206 if (!platform->has_msr_misc_pwr_mgmt)
8207 return;
8208
8209 + if (!get_msr(master_cpu, MSR_MISC_PWR_MGMT, &msr))
8210 fprintf(outf, "cpu%d: MSR_MISC_PWR_MGMT: 0x%08llx (%sable-EIST_Coordination %sable-EPB %sable-OOB)\n",
8211 + master_cpu, msr, msr & (1 << 0) ? "DIS" : "EN", msr & (1 << 1) ? "EN" : "DIS", msr & (1 << 8) ? "EN" : "DIS");
8212 }
8213
8214 /*
···
8227 if (!platform->has_msr_c6_demotion_policy_config)
8228 return;
8229
8230 + if (!get_msr(master_cpu, MSR_CC6_DEMOTION_POLICY_CONFIG, &msr))
8231 + fprintf(outf, "cpu%d: MSR_CC6_DEMOTION_POLICY_CONFIG: 0x%08llx (%sable-CC6-Demotion)\n", master_cpu, msr, msr & (1 << 0) ? "EN" : "DIS");
8232
8233 + if (!get_msr(master_cpu, MSR_MC6_DEMOTION_POLICY_CONFIG, &msr))
8234 + fprintf(outf, "cpu%d: MSR_MC6_DEMOTION_POLICY_CONFIG: 0x%08llx (%sable-MC6-Demotion)\n", master_cpu, msr, msr & (1 << 0) ?
"EN" : "DIS"); 8235 } 8236 8237 void print_dev_latency(void) ··· 8268 if (no_perf) 8269 return 0; 8270 8271 + fd = open_perf_counter(master_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, -1, 0); 8272 if (fd != -1) 8273 close(fd); 8274 ··· 8321 return ret; 8322 } 8323 8324 + char cpuset_buf[1024]; 8325 + int initialize_cpu_set_from_sysfs(cpu_set_t *cpu_set, char *sysfs_path, char *sysfs_file) 8326 + { 8327 + FILE *fp; 8328 + char path[128]; 8329 + 8330 + if (snprintf(path, 128, "%s/%s", sysfs_path, sysfs_file) > 128) 8331 + err(-1, "%s %s", sysfs_path, sysfs_file); 8332 + 8333 + fp = fopen(path, "r"); 8334 + if (!fp) { 8335 + warn("open %s", path); 8336 + return -1; 8337 + } 8338 + if (fread(cpuset_buf, sizeof(char), 1024, fp) == 0) { 8339 + warn("read %s", sysfs_path); 8340 + goto err; 8341 + } 8342 + if (parse_cpu_str(cpuset_buf, cpu_set, cpu_possible_setsize)) { 8343 + warnx("%s: cpu str malformat %s\n", sysfs_path, cpu_effective_str); 8344 + goto err; 8345 + } 8346 + return 0; 8347 + 8348 + err: 8349 + fclose(fp); 8350 + return -1; 8351 + } 8352 + 8353 + void print_cpu_set(char *s, cpu_set_t *set) 8354 + { 8355 + int i; 8356 + 8357 + assert(MAX_BIC < CPU_SETSIZE); 8358 + 8359 + printf("%s:", s); 8360 + 8361 + for (i = 0; i <= topo.max_cpu_num; ++i) 8362 + if (CPU_ISSET(i, set)) 8363 + printf(" %d", i); 8364 + putchar('\n'); 8365 + } 8366 + 8367 + void linux_perf_init_hybrid_cpus(void) 8368 + { 8369 + char *perf_cpu_pcore_path = "/sys/devices/cpu_core"; 8370 + char *perf_cpu_ecore_path = "/sys/devices/cpu_atom"; 8371 + char *perf_cpu_lcore_path = "/sys/devices/cpu_lowpower"; 8372 + char path[128]; 8373 + 8374 + if (!access(perf_cpu_pcore_path, F_OK)) { 8375 + perf_pcore_set = CPU_ALLOC((topo.max_cpu_num + 1)); 8376 + if (perf_pcore_set == NULL) 8377 + err(3, "CPU_ALLOC"); 8378 + CPU_ZERO_S(cpu_possible_setsize, perf_pcore_set); 8379 + initialize_cpu_set_from_sysfs(perf_pcore_set, perf_cpu_pcore_path, "cpus"); 8380 + if (debug) 8381 + print_cpu_set("perf pcores", perf_pcore_set); 8382 + sprintf(path, "%s/%s", perf_cpu_pcore_path, "type"); 8383 + perf_pmu_types.pcore = snapshot_sysfs_counter(path); 8384 + } 8385 + 8386 + if (!access(perf_cpu_ecore_path, F_OK)) { 8387 + perf_ecore_set = CPU_ALLOC((topo.max_cpu_num + 1)); 8388 + if (perf_ecore_set == NULL) 8389 + err(3, "CPU_ALLOC"); 8390 + CPU_ZERO_S(cpu_possible_setsize, perf_ecore_set); 8391 + initialize_cpu_set_from_sysfs(perf_ecore_set, perf_cpu_ecore_path, "cpus"); 8392 + if (debug) 8393 + print_cpu_set("perf ecores", perf_ecore_set); 8394 + sprintf(path, "%s/%s", perf_cpu_ecore_path, "type"); 8395 + perf_pmu_types.ecore = snapshot_sysfs_counter(path); 8396 + } 8397 + 8398 + if (!access(perf_cpu_lcore_path, F_OK)) { 8399 + perf_lcore_set = CPU_ALLOC((topo.max_cpu_num + 1)); 8400 + if (perf_lcore_set == NULL) 8401 + err(3, "CPU_ALLOC"); 8402 + CPU_ZERO_S(cpu_possible_setsize, perf_lcore_set); 8403 + initialize_cpu_set_from_sysfs(perf_lcore_set, perf_cpu_lcore_path, "cpus"); 8404 + if (debug) 8405 + print_cpu_set("perf lcores", perf_lcore_set); 8406 + sprintf(path, "%s/%s", perf_cpu_lcore_path, "type"); 8407 + perf_pmu_types.lcore = snapshot_sysfs_counter(path); 8408 + } 8409 + } 8410 + 8411 /* 8412 + * Linux-perf related initialization 8413 */ 8414 void linux_perf_init(void) 8415 { 8416 + char path[128]; 8417 + char *perf_cpu_path = "/sys/devices/cpu"; 8418 + 8419 if (access("/proc/sys/kernel/perf_event_paranoid", F_OK)) 8420 return; 8421 + 8422 + if (!access(perf_cpu_path, F_OK)) { 8423 + sprintf(path, "%s/%s", perf_cpu_path, 
"type"); 8424 + perf_pmu_types.uniform = snapshot_sysfs_counter(path); 8425 + } else { 8426 + linux_perf_init_hybrid_cpus(); 8427 + } 8428 8429 if (BIC_IS_ENABLED(BIC_IPC) && cpuid_has_aperf_mperf) { 8430 fd_instr_count_percpu = calloc(topo.max_cpu_num + 1, sizeof(int)); 8431 if (fd_instr_count_percpu == NULL) 8432 err(-1, "calloc fd_instr_count_percpu"); 8433 } 8434 + if (BIC_IS_ENABLED(BIC_LLC_MRPS) || BIC_IS_ENABLED(BIC_LLC_HIT)) { 8435 fd_llc_percpu = calloc(topo.max_cpu_num + 1, sizeof(int)); 8436 if (fd_llc_percpu == NULL) 8437 err(-1, "calloc fd_llc_percpu"); 8438 + } 8439 + if (BIC_IS_ENABLED(BIC_L2_MRPS) || BIC_IS_ENABLED(BIC_L2_HIT)) { 8440 + fd_l2_percpu = calloc(topo.max_cpu_num + 1, sizeof(int)); 8441 + if (fd_l2_percpu == NULL) 8442 + err(-1, "calloc fd_l2_percpu"); 8443 } 8444 } 8445 ··· 8397 8398 domain_visited[next_domain] = 1; 8399 8400 + if ((cai->flags & RAPL_COUNTER_FLAG_PLATFORM_COUNTER) && (cpu != master_cpu)) 8401 continue; 8402 8403 struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[next_domain]; ··· 8450 /* Assumes msr_counter_info is populated */ 8451 static int has_amperf_access(void) 8452 { 8453 + return cpuid_has_aperf_mperf && msr_counter_arch_infos[MSR_ARCH_INFO_APERF_INDEX].present && msr_counter_arch_infos[MSR_ARCH_INFO_MPERF_INDEX].present; 8454 } 8455 8456 int *get_cstate_perf_group_fd(struct cstate_counter_info_t *cci, const char *group_name) ··· 8647 if (cpu_is_not_allowed(cpu)) 8648 continue; 8649 8650 + const int core_id = cpus[cpu].core_id; 8651 + const int pkg_id = cpus[cpu].package_id; 8652 8653 assert(core_id < cores_visited_elems); 8654 assert(pkg_id < pkg_visited_elems); ··· 8662 if (!per_core && pkg_visited[pkg_id]) 8663 continue; 8664 8665 + const bool counter_needed = BIC_IS_ENABLED(cai->bic_number) || (soft_c1 && (cai->flags & CSTATE_COUNTER_FLAG_SOFT_C1_DEPENDENCY)); 8666 const bool counter_supported = (platform->supported_cstates & cai->feature_mask); 8667 8668 if (counter_needed && counter_supported) { ··· 8772 for_all_cpus(print_perf_limit, ODD_COUNTERS); 8773 } 8774 8775 + void dump_word_chars(unsigned int word) 8776 + { 8777 + int i; 8778 + 8779 + for (i = 0; i < 4; ++i) 8780 + fprintf(outf, "%c", (word >> (i * 8)) & 0xFF); 8781 + } 8782 + 8783 + void dump_cpuid_hypervisor(void) 8784 + { 8785 + unsigned int ebx = 0; 8786 + unsigned int ecx = 0; 8787 + unsigned int edx = 0; 8788 + 8789 + __cpuid(0x40000000, max_extended_level, ebx, ecx, edx); 8790 + 8791 + fprintf(outf, "Hypervisor: "); 8792 + dump_word_chars(ebx); 8793 + dump_word_chars(ecx); 8794 + dump_word_chars(edx); 8795 + fprintf(outf, "\n"); 8796 + } 8797 + 8798 void process_cpuid() 8799 { 8800 unsigned int eax, ebx, ecx, edx; ··· 8803 model += ((fms >> 16) & 0xf) << 4; 8804 ecx_flags = ecx; 8805 edx_flags = edx; 8806 + cpuid_has_hv = ecx_flags & (1 << 31); 8807 8808 if (!no_msr) { 8809 if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch)) ··· 8826 fputc('\n', outf); 8827 8828 fprintf(outf, "CPUID(0x80000000): max_extended_levels: 0x%x\n", max_extended_level); 8829 + fprintf(outf, "CPUID(1): %sSSE3 %sMONITOR %sSMX %sEIST %sTM2 %sHV %sTSC %sMSR %sACPI-TM %sHT %sTM\n", 8830 + ecx_flags & (1 << 0) ? "" : "No-", 8831 + ecx_flags & (1 << 3) ? "" : "No-", 8832 + ecx_flags & (1 << 6) ? "" : "No-", 8833 + ecx_flags & (1 << 7) ? "" : "No-", 8834 + ecx_flags & (1 << 8) ? "" : "No-", 8835 + cpuid_has_hv ? "" : "No-", 8836 + edx_flags & (1 << 4) ? "" : "No-", 8837 + edx_flags & (1 << 5) ? "" : "No-", 8838 + edx_flags & (1 << 22) ? "" : "No-", edx_flags & (1 << 28) ? 
"" : "No-", edx_flags & (1 << 29) ? "" : "No-"); 8839 } 8840 + if (!quiet && cpuid_has_hv) 8841 + dump_cpuid_hypervisor(); 8842 8843 probe_platform_features(family, model); 8844 + init_perf_model_support(family, model); 8845 8846 if (!(edx_flags & (1 << 5))) 8847 errx(1, "CPUID: no MSR"); ··· 8887 if (!quiet) 8888 decode_misc_enable_msr(); 8889 8890 + if (max_level >= 0x7) { 8891 int has_sgx; 8892 8893 ecx = 0; ··· 8896 8897 has_sgx = ebx & (1 << 2); 8898 8899 + is_hybrid = !!(edx & (1 << 15)); 8900 8901 + if (!quiet) 8902 + fprintf(outf, "CPUID(7): %sSGX %sHybrid\n", has_sgx ? "" : "No-", is_hybrid ? "" : "No-"); 8903 8904 if (has_sgx) 8905 decode_feature_control_msr(); ··· 8924 if (crystal_hz) { 8925 tsc_hz = (unsigned long long)crystal_hz *ebx_tsc / eax_crystal; 8926 if (!quiet) 8927 + fprintf(outf, "TSC: %lld MHz (%d Hz * %d / %d / 1000000)\n", tsc_hz / 1000000, crystal_hz, ebx_tsc, eax_crystal); 8928 } 8929 } 8930 } ··· 9003 decode_misc_feature_control(); 9004 } 9005 9006 + /* 9007 + * has_perf_llc_access() 9008 * 9009 * return 1 on success, else 0 9010 */ ··· 9014 if (no_perf) 9015 return 0; 9016 9017 + fd = open_perf_counter(master_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9018 if (fd != -1) 9019 close(fd); 9020 ··· 9032 9033 if (no_perf) 9034 return; 9035 + if (!(BIC_IS_ENABLED(BIC_LLC_MRPS) || BIC_IS_ENABLED(BIC_LLC_HIT))) 9036 return; 9037 + 9038 + assert(fd_llc_percpu != 0); 9039 9040 for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 9041 9042 if (cpu_is_not_allowed(cpu)) 9043 continue; 9044 9045 fd_llc_percpu[cpu] = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9046 if (fd_llc_percpu[cpu] == -1) { 9047 warnx("%s: perf REFS: failed to open counter on cpu%d", __func__, cpu); 9048 free_fd_llc_percpu(); 9049 return; 9050 } 9051 retval = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES, fd_llc_percpu[cpu], PERF_FORMAT_GROUP); 9052 if (retval == -1) { 9053 warnx("%s: perf MISS: failed to open counter on cpu%d", __func__, cpu); ··· 9055 return; 9056 } 9057 } 9058 + BIC_PRESENT(BIC_LLC_MRPS); 9059 BIC_PRESENT(BIC_LLC_HIT); 9060 + } 9061 + 9062 + void perf_l2_init(void) 9063 + { 9064 + int cpu; 9065 + int retval; 9066 + 9067 + if (no_perf) 9068 + return; 9069 + if (!(BIC_IS_ENABLED(BIC_L2_MRPS) || BIC_IS_ENABLED(BIC_L2_HIT))) 9070 + return; 9071 + if (perf_model_support == NULL) 9072 + return; 9073 + 9074 + assert(fd_l2_percpu != 0); 9075 + 9076 + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 9077 + 9078 + if (cpu_is_not_allowed(cpu)) 9079 + continue; 9080 + 9081 + if (!is_hybrid) { 9082 + fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.refs, -1, PERF_FORMAT_GROUP); 9083 + if (fd_l2_percpu[cpu] == -1) { 9084 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.refs); 9085 + free_fd_l2_percpu(); 9086 + return; 9087 + } 9088 + retval = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP); 9089 + if (retval == -1) { 9090 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.hits); 9091 + free_fd_l2_percpu(); 9092 + return; 9093 + } 9094 + continue; 9095 + } 9096 + if (perf_pcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_pcore_set)) { 9097 + fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.refs, -1, 
PERF_FORMAT_GROUP); 9098 + if (fd_l2_percpu[cpu] == -1) { 9099 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.refs); 9100 + free_fd_l2_percpu(); 9101 + return; 9102 + } 9103 + retval = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP); 9104 + if (retval == -1) { 9105 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.hits); 9106 + free_fd_l2_percpu(); 9107 + return; 9108 + } 9109 + } else if (perf_ecore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_ecore_set)) { 9110 + fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.refs, -1, PERF_FORMAT_GROUP); 9111 + if (fd_l2_percpu[cpu] == -1) { 9112 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.refs); 9113 + free_fd_l2_percpu(); 9114 + return; 9115 + } 9116 + retval = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP); 9117 + if (retval == -1) { 9118 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.hits); 9119 + free_fd_l2_percpu(); 9120 + return; 9121 + } 9122 + } else if (perf_lcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_lcore_set)) { 9123 + fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.refs, -1, PERF_FORMAT_GROUP); 9124 + if (fd_l2_percpu[cpu] == -1) { 9125 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.refs); 9126 + free_fd_l2_percpu(); 9127 + return; 9128 + } 9129 + retval = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP); 9130 + if (retval == -1) { 9131 + err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.hits); 9132 + free_fd_l2_percpu(); 9133 + return; 9134 + } 9135 + } else 9136 + err(-1, "%s: cpu%d: type %d", __func__, cpu, cpus[cpu].type); 9137 + } 9138 + BIC_PRESENT(BIC_L2_MRPS); 9139 + BIC_PRESENT(BIC_L2_HIT); 9140 } 9141 9142 /* ··· 9069 return 1; 9070 else 9071 return 0; 9072 } 9073 9074 void topology_probe(bool startup) ··· 9137 err(3, "CPU_ALLOC"); 9138 cpu_possible_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1)); 9139 CPU_ZERO_S(cpu_possible_setsize, cpu_possible_set); 9140 + initialize_cpu_set_from_sysfs(cpu_possible_set, "/sys/devices/system/cpu", "possible"); 9141 9142 /* 9143 * Allocate and initialize cpu_effective_set ··· 9205 cpu_affinity_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1)); 9206 CPU_ZERO_S(cpu_affinity_setsize, cpu_affinity_set); 9207 9208 + for_all_proc_cpus(clear_ht_id); 9209 9210 for_all_proc_cpus(set_cpu_hybrid_type); 9211 9212 /* 9213 * For online cpus 9214 + * find max_core_id, max_package_id, num_cores (per system) 9215 */ 9216 for (i = 0; i <= topo.max_cpu_num; ++i) { 9217 int siblings; ··· 9222 continue; 9223 } 9224 9225 + cpus[i].cpu_id = i; 9226 9227 /* get package information */ 9228 + cpus[i].package_id = get_package_id(i); 9229 + if (cpus[i].package_id > max_package_id) 9230 + max_package_id = cpus[i].package_id; 9231 9232 /* get die information */ 9233 cpus[i].die_id = get_die_id(i); ··· 9245 topo.max_node_num = cpus[i].physical_node_id; 9246 9247 /* get core information */ 9248 + cpus[i].core_id = get_core_id(i); 9249 + if (cpus[i].core_id > max_core_id) 9250 + 
max_core_id = cpus[i].core_id; 9251 9252 /* get thread information */ 9253 + siblings = set_thread_siblings(&cpus[i]); 9254 if (siblings > max_siblings) 9255 max_siblings = siblings; 9256 + if (cpus[i].ht_id == 0) 9257 topo.num_cores++; 9258 } 9259 + topo.max_core_id = max_core_id; /* within a package */ 9260 topo.max_package_id = max_package_id; 9261 + topo.num_cores = (max_core_id + 1) * topo.num_packages; /* per system */ 9262 9263 topo.cores_per_node = max_core_id + 1; 9264 if (debug > 1) ··· 9298 continue; 9299 fprintf(outf, 9300 "cpu %d pkg %d die %d l3 %d node %d lnode %d core %d thread %d\n", 9301 + i, cpus[i].package_id, cpus[i].die_id, cpus[i].l3_id, 9302 + cpus[i].physical_node_id, cpus[i].logical_node_id, cpus[i].core_id, cpus[i].ht_id); 9303 } 9304 9305 } 9306 9307 + void allocate_counters_1(struct counters *counters) 9308 + { 9309 + counters->threads = calloc(1, sizeof(struct thread_data)); 9310 + if (counters->threads == NULL) 9311 + goto error; 9312 + 9313 + counters->cores = calloc(1, sizeof(struct core_data)); 9314 + if (counters->cores == NULL) 9315 + goto error; 9316 + 9317 + counters->packages = calloc(1, sizeof(struct pkg_data)); 9318 + if (counters->packages == NULL) 9319 + goto error; 9320 + 9321 + return; 9322 + error: 9323 + err(1, "calloc counters_1"); 9324 + } 9325 + 9326 + void allocate_counters(struct counters *counters) 9327 { 9328 int i; 9329 int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages; 9330 int num_threads = topo.threads_per_core * num_cores; 9331 9332 + counters->threads = calloc(num_threads, sizeof(struct thread_data)); 9333 + if (counters->threads == NULL) 9334 goto error; 9335 9336 for (i = 0; i < num_threads; i++) 9337 + (counters->threads)[i].cpu_id = -1; 9338 9339 + counters->cores = calloc(num_cores, sizeof(struct core_data)); 9340 + if (counters->cores == NULL) 9341 goto error; 9342 9343 + for (i = 0; i < num_cores; i++) 9344 + (counters->cores)[i].first_cpu = -1; 9345 9346 + counters->packages = calloc(topo.num_packages, sizeof(struct pkg_data)); 9347 + if (counters->packages == NULL) 9348 goto error; 9349 9350 + for (i = 0; i < topo.num_packages; i++) 9351 + (counters->packages)[i].first_cpu = -1; 9352 9353 return; 9354 error: ··· 9343 /* 9344 * init_counter() 9345 * 9346 + * set t->cpu_id, FIRST_THREAD_IN_CORE and FIRST_CORE_IN_PACKAGE 9347 */ 9348 void init_counter(struct thread_data *thread_base, struct core_data *core_base, struct pkg_data *pkg_base, int cpu_id) 9349 { 9350 + int pkg_id = cpus[cpu_id].package_id; 9351 int node_id = cpus[cpu_id].logical_node_id; 9352 + int core_id = cpus[cpu_id].core_id; 9353 struct thread_data *t; 9354 struct core_data *c; 9355 ··· 9360 if (node_id < 0) 9361 node_id = 0; 9362 9363 + t = &thread_base[cpu_id]; 9364 + c = &core_base[GLOBAL_CORE_ID(core_id, pkg_id)]; 9365 9366 t->cpu_id = cpu_id; 9367 if (!cpu_is_not_allowed(cpu_id)) { 9368 9369 + if (c->first_cpu < 0) 9370 + c->first_cpu = t->cpu_id; 9371 + if (pkg_base[pkg_id].first_cpu < 0) 9372 + pkg_base[pkg_id].first_cpu = t->cpu_id; 9373 } 9374 } 9375 9376 int initialize_counters(int cpu_id) ··· 9416 int update_topo(PER_THREAD_PARAMS) 9417 { 9418 topo.allowed_cpus++; 9419 + if ((int)t->cpu_id == c->first_cpu) 9420 topo.allowed_cores++; 9421 + if ((int)t->cpu_id == p->first_cpu) 9422 topo.allowed_packages++; 9423 9424 return 0; ··· 9437 topology_probe(startup); 9438 allocate_irq_buffers(); 9439 allocate_fd_percpu(); 9440 + allocate_counters_1(&average); 9441 + allocate_counters(&even); 9442 + allocate_counters(&odd); 9443 
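+ /*
+ * Note: "even" and "odd" are full per-cpu snapshot sets that alternate
+ * between measurement intervals; "average" holds a single dynamically
+ * allocated entry per topology level for the summary row.
+ */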
allocate_output_buffer(); 9444 for_all_proc_cpus(initialize_counters); 9445 topology_update(); 9446 } 9447 9448 + void set_master_cpu(void) 9449 { 9450 int i; 9451 9452 for (i = 0; i < topo.max_cpu_num + 1; ++i) { 9453 if (cpu_is_not_allowed(i)) 9454 continue; 9455 + master_cpu = i; 9456 if (debug > 1) 9457 + fprintf(outf, "master_cpu = %d\n", master_cpu); 9458 return; 9459 } 9460 err(-ENODEV, "No valid cpus found"); ··· 9484 if (!has_perf_instr_count_access()) 9485 no_perf = 1; 9486 9487 + if (BIC_IS_ENABLED(BIC_LLC_MRPS) || BIC_IS_ENABLED(BIC_LLC_HIT)) 9488 if (!has_perf_llc_access()) 9489 no_perf = 1; 9490 ··· 9967 9968 if (BIC_IS_ENABLED(BIC_Diec6)) { 9969 pmt_add_counter(PMT_MTL_DC6_GUID, PMT_MTL_DC6_SEQ, "Die%c6", PMT_TYPE_XTAL_TIME, 9970 + PMT_COUNTER_MTL_DC6_LSB, PMT_COUNTER_MTL_DC6_MSB, PMT_COUNTER_MTL_DC6_OFFSET, SCOPE_PACKAGE, FORMAT_DELTA, 0, PMT_OPEN_TRY); 9971 } 9972 9973 if (BIC_IS_ENABLED(BIC_CPU_c1e)) { ··· 10029 void turbostat_init() 10030 { 10031 setup_all_buffers(true); 10032 + set_master_cpu(); 10033 check_msr_access(); 10034 check_perf_access(); 10035 process_cpuid(); ··· 10040 rapl_perf_init(); 10041 cstate_perf_init(); 10042 perf_llc_init(); 10043 + perf_l2_init(); 10044 added_perf_counters_init(); 10045 pmt_init(); 10046 10047 for_all_cpus(get_cpu_type, ODD_COUNTERS); 10048 for_all_cpus(get_cpu_type, EVEN_COUNTERS); 10049 10050 + if (BIC_IS_ENABLED(BIC_IPC) && has_aperf_access && get_instr_count_fd(master_cpu) != -1) 10051 BIC_PRESENT(BIC_IPC); 10052 10053 /* ··· 10145 10146 void print_version() 10147 { 10148 + fprintf(outf, "turbostat version 2026.02.14 - Len Brown <lenb@kernel.org>\n"); 10149 } 10150 10151 #define COMMAND_LINE_SIZE 2048 ··· 10767 } 10768 10769 if (direct_path && has_guid) { 10770 + printf("%s: path and guid+seq parameters are mutually exclusive\nnotice: passed guid=0x%x and path=%s\n", __func__, guid, direct_path); 10771 exit(1); 10772 } 10773 ··· 10863 10864 for (state = 10; state >= 0; --state) { 10865 10866 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", master_cpu, state); 10867 input = fopen(path, "r"); 10868 if (input == NULL) 10869 continue; ··· 10912 10913 for (state = 10; state >= 0; --state) { 10914 10915 + sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/name", master_cpu, state); 10916 input = fopen(path, "r"); 10917 if (input == NULL) 10918 continue; ··· 11041 * Parse some options early, because they may make other options invalid, 11042 * like adding the MSR counter with --add and at the same time using --no-msr. 
11043 */ 11044 + while ((opt = getopt_long_only(argc, argv, "+:MP", long_options, &option_index)) != -1) { 11045 switch (opt) { 11046 case 'M': 11047 no_msr = 1; ··· 11055 } 11056 optind = 0; 11057 11058 + while ((opt = getopt_long_only(argc, argv, "+C:c:Dde:hi:Jn:N:o:qMST:v", long_options, &option_index)) != -1) { 11059 switch (opt) { 11060 case 'a': 11061 parse_add_command(optarg); ··· 11098 } 11099 break; 11100 case 'h': 11101 help(); 11102 exit(1); 11103 case 'i': ··· 11134 /* Parsed earlier */ 11135 break; 11136 case 'n': 11137 + num_iterations = strtoul(optarg, NULL, 0); 11138 + errno = 0; 11139 11140 + if (errno || num_iterations == 0) 11141 + errx(-1, "invalid iteration count: %s", optarg); 11142 break; 11143 case 'N': 11144 + header_iterations = strtoul(optarg, NULL, 0); 11145 + errno = 0; 11146 11147 + if (errno || header_iterations == 0) 11148 + errx(-1, "invalid header iteration count: %s", optarg); 11149 break; 11150 case 's': 11151 /* ··· 11170 print_version(); 11171 exit(0); 11172 break; 11173 + default: 11174 + help(); 11175 + exit(1); 11176 } 11177 } 11178 }