Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (30 commits)
arm: perf: Fix ARCH=arm build with GCC
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
Documentation: arm64: Document the PMU event counting threshold feature
arm64: perf: Add support for event counting threshold
arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
perf/arm_dmc620: Remove duplicate format attribute #defines
arm: pmu: Share user ABI format mechanism with SPE
arm64: perf: Include threshold control fields in PMEVTYPER mask
arm: perf: Convert remaining fields to use GENMASK
arm: perf: Use GENMASK for PMMIR fields
arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
arm: perf: Remove inlines from arm_pmuv3.c
drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax
drivers/perf: Remove usage of the deprecated ida_simple_xx() API
...

+1374 -318
+94
Documentation/admin-guide/perf/dwc_pcie_pmu.rst
···
1 + ======================================================================
2 + Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
3 + ======================================================================
4 +
5 + DesignWare Cores (DWC) PCIe PMU
6 + ===============================
7 +
8 + The PMU is a PCIe configuration space register block provided by each PCIe Root
9 + Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
10 + injection, and Statistics).
11 +
12 + As the name indicates, the RAS DES capability supports system-level
13 + debugging, AER error injection, and collection of statistics. To facilitate
14 + collection of statistics, the Synopsys DesignWare Cores PCIe controller
15 + provides the following two features:
16 +
17 + - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
18 +   time spent in each low-power LTSSM state) and
19 + - one 32-bit counter for Event Counting (error and non-error events for
20 +   a specified lane)
21 +
22 + Note: There is no interrupt for counter overflow.
23 +
24 + Time Based Analysis
25 + -------------------
26 +
27 + Using this feature you can obtain information regarding RX/TX data
28 + throughput and the time the controller spends in each low-power LTSSM state.
29 + The PMU measures data in two categories:
30 +
31 + - Group#0: Percentage of time the controller stays in LTSSM states.
32 + - Group#1: Amount of data processed (units of 16 bytes).
33 +
34 + Lane Event Counters
35 + -------------------
36 +
37 + Using this feature you can obtain error and non-error information for a
38 + specific lane from the controller. The PMU event is selected by all of:
39 +
40 + - Group i
41 + - Event j within the Group i
42 + - Lane k
43 +
44 + Some of the events only exist for specific configurations.
45 +
46 + DesignWare Cores (DWC) PCIe PMU Driver
47 + ======================================
48 +
49 + This driver adds a PMU device for each PCIe Root Port, named after the BDF of
50 + the Root Port. For example, for
51 +
52 +   30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
53 +
54 + the PMU device name for this Root Port is dwc_rootport_3018.
55 +
56 + The DWC PCIe PMU driver registers a perf PMU driver, which provides a
57 + description of the available events and configuration options in sysfs; see
58 + /sys/bus/event_source/devices/dwc_rootport_{bdf}.
59 +
60 + The "format" directory describes the format of the config fields of the
61 + perf_event_attr structure. The "events" directory provides configuration
62 + templates for all documented events. For example,
63 + "Rx_PCIe_TLP_Data_Payload" is equivalent to "eventid=0x22,type=0x1".
64 +
65 + The "perf list" command lists the available events from sysfs, e.g.::
66 +
67 +   $# perf list | grep dwc_rootport
68 +   <...>
69 +   dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/        [Kernel PMU event]
70 +   <...>
71 +   dwc_rootport_3018/rx_memory_read,lane=?/           [Kernel PMU event]
72 +
73 + Time Based Analysis Event Usage
74 + -------------------------------
75 +
76 + Example usage of counting PCIe RX TLP data payload (units of bytes)::
77 +
78 +   $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
79 +
80 + The average RX/TX bandwidth can be calculated using the following formula:
81 +
82 +   PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
83 +   PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window
84 +
85 + Lane Event Usage
86 + ----------------
87 +
88 + Each lane has the same event set, and to avoid generating a list of hundreds
89 + of events, the user needs to specify the lane ID explicitly, e.g.::
90 +
91 +   $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
92 +
93 + The driver does not support sampling, therefore "perf record" will not
94 + work. Per-task (without "-a") perf sessions are not supported.
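The BDF-to-name mapping and the bandwidth formula above can be sketched in a few lines of Python. This is an illustration of what the document describes, not code from the driver, and the helper names are ours:

```python
# Illustration only: how the example BDF 30:03.0 yields "dwc_rootport_3018",
# and the average-bandwidth formula quoted above. Helper names are ours.

def dwc_pmu_name(bus: int, dev: int, fn: int) -> str:
    # BDF packed as bus[15:8], device[7:3], function[2:0], printed in hex
    bdf = (bus << 8) | (dev << 3) | fn
    return f"dwc_rootport_{bdf:x}"

def pcie_rx_bandwidth(rx_tlp_payload_bytes: float, window_seconds: float) -> float:
    # PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
    return rx_tlp_payload_bytes / window_seconds

print(dwc_pmu_name(0x30, 0x03, 0x0))   # dwc_rootport_3018
```

Running this against the document's example Root Port 30:03.0 reproduces the PMU device name dwc_rootport_3018.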
+37 -8
Documentation/admin-guide/perf/imx-ddr.rst
···
13 13 interrupt is raised. If any other counter overflows, it continues counting, and
14 14 no interrupt is raised.
15 15
16 - The "format" directory describes format of the config (event ID) and config1
17 - (AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
16 + The "format" directory describes the format of the config (event ID) and config1/2
17 + (AXI filter setting) fields of the perf_event_attr structure; see /sys/bus/event_source/
18 18 devices/imx8_ddr0/format/. The "events" directory describes the events types
19 19 hardware supported that can be used with perf tool, see /sys/bus/event_source/
20 20 devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
···
28 28 AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
29 29 to count reading or writing matches filter setting. Filter setting is various
30 30 from different DRAM controller implementations, which is distinguished by quirks
31 - in the driver. You also can dump info from userspace, filter in "caps" directory
32 - indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates
33 - whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and
34 - value 1 for supported.
31 + in the driver. You can also dump this info from userspace: the "caps" directory
32 + shows the type of AXI filter (filter, enhanced_filter and super_filter); value
33 + 0 means unsupported and value 1 means supported.
35 34
36 - * With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0).
35 + * With DDR_CAP_AXI_ID_FILTER quirk (filter: 1, enhanced_filter: 0, super_filter: 0).
37 36 Filter is defined with two configuration parts:
38 37 --AXI_ID defines AxID matching value.
39 38 --AXI_MASKING defines which bits of AxID are meaningful for the matching.
···
64 65
65 66 perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
66 67
67 - * With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1).
68 + * With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk (filter: 1, enhanced_filter: 1, super_filter: 0).
68 69 This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
69 70 counting the number of bytes (as opposed to the number of bursts) from DDR
70 71 read and write transactions concurrently with another set of data counters.
72 +
73 + * With DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk (filter: 0, enhanced_filter: 0, super_filter: 1).
74 + The previous AXI ID filters have a limitation: because the filter is shared
75 + between counters, they cannot filter different IDs at the same time. This
76 + quirk extends the AXI ID filter in two ways. First, counters 1-3 each have
77 + their own filter, so several IDs can be filtered concurrently. Second,
78 + counters 1-3 support AXI PORT and CHANNEL selection, choosing between the
79 + address channel and the data channel.
80 +
81 + The filter is defined with 2 configuration registers per counter 1-3:
82 + --Counter N MASK COMP register - including AXI_ID and AXI_MASKING.
83 + --Counter N MUX CNTL register - including AXI CHANNEL and AXI PORT.
84 +
85 + - 0: address channel
86 + - 1: data channel
87 +
88 + For the PMU in the DDR subsystem only a single port0 exists, so axi_port is
89 + reserved and should be 0.
90 +
91 + .. code-block:: bash
92 +
93 +     perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
94 +     perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
95 +
96 + .. note::
97 +
98 +     axi_channel is inverted in userspace and reverted automatically by the
99 +     driver, so users do not need to specify axi_channel when monitoring the
100 +     data channel of DDR transactions, since the data channel is usually the
101 +     more meaningful one.
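The masked ID match that the doc describes (AXI_MASKING defines which AxID bits are meaningful for the matching) can be sketched as follows. This is our reading of the documentation text, not code taken from the driver:

```python
# Sketch of the AXI ID filter match semantics: AXI_MASKING selects the
# meaningful bits of AxID, and a transaction is counted when those bits
# of the observed AxID agree with AXI_ID. Our model of the doc text only.

def axid_matches(axid: int, axi_id: int, axi_masking: int) -> bool:
    return (axid & axi_masking) == (axi_id & axi_masking)

# With all 16 mask bits meaningful, only the exact ID matches:
print(axid_matches(0x12, axi_id=0x12, axi_masking=0xFFFF))  # True
print(axid_matches(0x13, axi_id=0x12, axi_masking=0xFFFF))  # False
```

Clearing mask bits widens the match, which is how a range of ARIDs/AWIDs can be counted with one filter setting.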
+1
Documentation/admin-guide/perf/index.rst
··· 19 19 arm_dsu_pmu 20 20 thunderx2-pmu 21 21 alibaba_pmu 22 + dwc_pcie_pmu 22 23 nvidia-pmu 23 24 meson-ddr-pmu 24 25 cxl
+72
Documentation/arch/arm64/perf.rst
···
164 164 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
165 165 .. _tools/lib/perf/tests/test-evsel.c:
166 166 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
167 +
168 + Event Counting Threshold
169 + ========================
170 +
171 + Overview
172 + --------
173 +
174 + FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
175 + events whose count meets a specified threshold condition. For example, if
176 + threshold_compare is set to 2 ('Greater than or equal') and the
177 + threshold is set to 2, then the PMU counter will now only increment
178 + when an event would previously have incremented the PMU counter by 2 or
179 + more on a single processor cycle.
180 +
181 + To increment by 1 after passing the threshold condition instead of by the
182 + number of events on that cycle, add the 'threshold_count' option to the
183 + command line.
184 +
185 + How-to
186 + ------
187 +
188 + These are the parameters for controlling the feature:
189 +
190 + .. list-table::
191 +    :header-rows: 1
192 +
193 +    * - Parameter
194 +      - Description
195 +    * - threshold
196 +      - Value to threshold the event by. A value of 0 means that
197 +        thresholding is disabled and the other parameters have no effect.
198 +    * - threshold_compare
199 +      - | Comparison function to use, with the following values supported:
200 +        |
201 +        | 0: Not-equal
202 +        | 1: Equals
203 +        | 2: Greater-than-or-equal
204 +        | 3: Less-than
205 +    * - threshold_count
206 +      - If this is set, count by 1 after passing the threshold condition
207 +        instead of by the value of the event on this cycle.
208 +
209 + The threshold, threshold_compare and threshold_count values can be
210 + provided per event, for example:
211 +
212 + .. code-block:: sh
213 +
214 +    perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
215 +       -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
216 +
217 + In this example the stall_slot event will count by 2 or more on every
218 + cycle where 2 or more stalls happen. And dtlb_walk will count by 1 on
219 + every cycle where the number of dtlb walks was less than 10.
220 +
221 + The maximum supported threshold value can be read from the caps of each
222 + PMU, for example:
223 +
224 + .. code-block:: sh
225 +
226 +    cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
227 +
228 +    0x000000ff
229 +
230 + If a value higher than this is given, then opening the event will result
231 + in an error. The highest possible maximum is 4095, as the config field
232 + for threshold is limited to 12 bits, and the perf tool will refuse to
233 + parse higher values.
234 +
235 + If the PMU doesn't support FEAT_PMUv3_TH, then threshold_max will read
236 + 0, and attempting to set a threshold value will also result in an error.
237 + threshold_max will also read as 0 on aarch32 guests, even if the host
238 + is running on hardware with the feature.
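The counting behaviour described above can be modelled in a few lines. This is our model of the documented semantics for illustration, not kernel or hardware code:

```python
# A small model of FEAT_PMUv3_TH counting as documented: per cycle, the
# raw event count is compared against the threshold; qualifying cycles
# add the raw count, or 1 if threshold_count is set. threshold=0
# disables thresholding entirely. Illustration only.

COMPARE = {
    0: lambda n, t: n != t,  # not-equal
    1: lambda n, t: n == t,  # equals
    2: lambda n, t: n >= t,  # greater-than-or-equal
    3: lambda n, t: n < t,   # less-than
}

def counter_total(events_per_cycle, threshold, compare, threshold_count=False):
    total = 0
    for n in events_per_cycle:
        if threshold == 0:
            total += n                          # thresholding disabled
        elif COMPARE[compare](n, threshold):
            total += 1 if threshold_count else n
    return total

# stall_slot/threshold=2,threshold_compare=2/: only cycles with >= 2 stalls count
print(counter_total([0, 1, 2, 3], threshold=2, compare=2))                         # 5
# dtlb_walk/threshold=10,threshold_compare=3,threshold_count/: +1 per qualifying cycle
print(counter_total([0, 1, 2, 3], threshold=10, compare=3, threshold_count=True))  # 4
```

The two printed cases correspond to the stall_slot and dtlb_walk events in the perf stat example above.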
+3
Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
··· 27 27 - fsl,imx8mq-ddr-pmu 28 28 - fsl,imx8mp-ddr-pmu 29 29 - const: fsl,imx8m-ddr-pmu 30 + - items: 31 + - const: fsl,imx8dxl-ddr-pmu 32 + - const: fsl,imx8-ddr-pmu 30 33 31 34 reg: 32 35 maxItems: 1
+7
MAINTAINERS
··· 21090 21090 S: Maintained 21091 21091 F: drivers/mmc/host/dw_mmc* 21092 21092 21093 + SYNOPSYS DESIGNWARE PCIE PMU DRIVER 21094 + M: Shuai Xue <xueshuai@linux.alibaba.com> 21095 + M: Jing Zhang <renyu.zj@linux.alibaba.com> 21096 + S: Supported 21097 + F: Documentation/admin-guide/perf/dwc_pcie_pmu.rst 21098 + F: drivers/perf/dwc_pcie_pmu.c 21099 + 21093 21100 SYNOPSYS HSDK RESET CONTROLLER DRIVER 21094 21101 M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> 21095 21102 S: Supported
+5 -23
arch/arm/kernel/perf_event_v6.c
··· 268 268 269 269 static void armv6pmu_enable_event(struct perf_event *event) 270 270 { 271 - unsigned long val, mask, evt, flags; 272 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 271 + unsigned long val, mask, evt; 273 272 struct hw_perf_event *hwc = &event->hw; 274 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 275 273 int idx = hwc->idx; 276 274 277 275 if (ARMV6_CYCLE_COUNTER == idx) { ··· 292 294 * Mask out the current event and set the counter to count the event 293 295 * that we're interested in. 294 296 */ 295 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 296 297 val = armv6_pmcr_read(); 297 298 val &= ~mask; 298 299 val |= evt; 299 300 armv6_pmcr_write(val); 300 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 301 301 } 302 302 303 303 static irqreturn_t ··· 358 362 359 363 static void armv6pmu_start(struct arm_pmu *cpu_pmu) 360 364 { 361 - unsigned long flags, val; 362 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 365 + unsigned long val; 363 366 364 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 365 367 val = armv6_pmcr_read(); 366 368 val |= ARMV6_PMCR_ENABLE; 367 369 armv6_pmcr_write(val); 368 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 369 370 } 370 371 371 372 static void armv6pmu_stop(struct arm_pmu *cpu_pmu) 372 373 { 373 - unsigned long flags, val; 374 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 374 + unsigned long val; 375 375 376 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 377 376 val = armv6_pmcr_read(); 378 377 val &= ~ARMV6_PMCR_ENABLE; 379 378 armv6_pmcr_write(val); 380 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 381 379 } 382 380 383 381 static int ··· 409 419 410 420 static void armv6pmu_disable_event(struct perf_event *event) 411 421 { 412 - unsigned long val, mask, evt, flags; 413 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 422 + unsigned long val, mask, evt; 414 423 struct hw_perf_event *hwc = &event->hw; 415 - 
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 416 424 int idx = hwc->idx; 417 425 418 426 if (ARMV6_CYCLE_COUNTER == idx) { ··· 432 444 * of ETM bus signal assertion cycles. The external reporting should 433 445 * be disabled and so this should never increment. 434 446 */ 435 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 436 447 val = armv6_pmcr_read(); 437 448 val &= ~mask; 438 449 val |= evt; 439 450 armv6_pmcr_write(val); 440 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 441 451 } 442 452 443 453 static void armv6mpcore_pmu_disable_event(struct perf_event *event) 444 454 { 445 - unsigned long val, mask, flags, evt = 0; 446 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 455 + unsigned long val, mask, evt = 0; 447 456 struct hw_perf_event *hwc = &event->hw; 448 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 449 457 int idx = hwc->idx; 450 458 451 459 if (ARMV6_CYCLE_COUNTER == idx) { ··· 459 475 * Unlike UP ARMv6, we don't have a way of stopping the counters. We 460 476 * simply disable the interrupt reporting. 461 477 */ 462 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 463 478 val = armv6_pmcr_read(); 464 479 val &= ~mask; 465 480 val |= evt; 466 481 armv6_pmcr_write(val); 467 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 468 482 } 469 483 470 484 static int armv6_map_event(struct perf_event *event)
+4 -46
arch/arm/kernel/perf_event_v7.c
··· 870 870 871 871 static void armv7pmu_enable_event(struct perf_event *event) 872 872 { 873 - unsigned long flags; 874 873 struct hw_perf_event *hwc = &event->hw; 875 874 struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 876 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 877 875 int idx = hwc->idx; 878 876 879 877 if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) { ··· 884 886 * Enable counter and interrupt, and set the counter to count 885 887 * the event that we're interested in. 886 888 */ 887 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 888 889 889 890 /* 890 891 * Disable counter ··· 907 910 * Enable counter 908 911 */ 909 912 armv7_pmnc_enable_counter(idx); 910 - 911 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 912 913 } 913 914 914 915 static void armv7pmu_disable_event(struct perf_event *event) 915 916 { 916 - unsigned long flags; 917 917 struct hw_perf_event *hwc = &event->hw; 918 918 struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 919 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 920 919 int idx = hwc->idx; 921 920 922 921 if (!armv7_pmnc_counter_valid(cpu_pmu, idx)) { ··· 924 931 /* 925 932 * Disable counter and interrupt 926 933 */ 927 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 928 934 929 935 /* 930 936 * Disable counter ··· 934 942 * Disable interrupt for this counter 935 943 */ 936 944 armv7_pmnc_disable_intens(idx); 937 - 938 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 939 945 } 940 946 941 947 static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu) ··· 999 1009 1000 1010 static void armv7pmu_start(struct arm_pmu *cpu_pmu) 1001 1011 { 1002 - unsigned long flags; 1003 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1004 - 1005 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1006 1012 /* Enable all counters */ 1007 1013 armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E); 1008 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1009 1014 } 
1010 1015 1011 1016 static void armv7pmu_stop(struct arm_pmu *cpu_pmu) 1012 1017 { 1013 - unsigned long flags; 1014 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1015 - 1016 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1017 1018 /* Disable all counters */ 1018 1019 armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E); 1019 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1020 1020 } 1021 1021 1022 1022 static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc, ··· 1052 1072 { 1053 1073 unsigned long config_base = 0; 1054 1074 1055 - if (attr->exclude_idle) 1056 - return -EPERM; 1075 + if (attr->exclude_idle) { 1076 + pr_debug("ARM performance counters do not support mode exclusion\n"); 1077 + return -EOPNOTSUPP; 1078 + } 1057 1079 if (attr->exclude_user) 1058 1080 config_base |= ARMV7_EXCLUDE_USER; 1059 1081 if (attr->exclude_kernel) ··· 1474 1492 1475 1493 static void krait_pmu_disable_event(struct perf_event *event) 1476 1494 { 1477 - unsigned long flags; 1478 1495 struct hw_perf_event *hwc = &event->hw; 1479 1496 int idx = hwc->idx; 1480 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 1481 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1482 1497 1483 1498 /* Disable counter and interrupt */ 1484 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1485 1499 1486 1500 /* Disable counter */ 1487 1501 armv7_pmnc_disable_counter(idx); ··· 1490 1512 1491 1513 /* Disable interrupt for this counter */ 1492 1514 armv7_pmnc_disable_intens(idx); 1493 - 1494 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1495 1515 } 1496 1516 1497 1517 static void krait_pmu_enable_event(struct perf_event *event) 1498 1518 { 1499 - unsigned long flags; 1500 1519 struct hw_perf_event *hwc = &event->hw; 1501 1520 int idx = hwc->idx; 1502 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 1503 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1504 1521 1505 1522 /* 1506 1523 * Enable counter and interrupt, and 
set the counter to count 1507 1524 * the event that we're interested in. 1508 1525 */ 1509 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1510 1526 1511 1527 /* Disable counter */ 1512 1528 armv7_pmnc_disable_counter(idx); ··· 1520 1548 1521 1549 /* Enable counter */ 1522 1550 armv7_pmnc_enable_counter(idx); 1523 - 1524 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1525 1551 } 1526 1552 1527 1553 static void krait_pmu_reset(void *info) ··· 1795 1825 1796 1826 static void scorpion_pmu_disable_event(struct perf_event *event) 1797 1827 { 1798 - unsigned long flags; 1799 1828 struct hw_perf_event *hwc = &event->hw; 1800 1829 int idx = hwc->idx; 1801 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 1802 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1803 1830 1804 1831 /* Disable counter and interrupt */ 1805 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1806 1832 1807 1833 /* Disable counter */ 1808 1834 armv7_pmnc_disable_counter(idx); ··· 1811 1845 1812 1846 /* Disable interrupt for this counter */ 1813 1847 armv7_pmnc_disable_intens(idx); 1814 - 1815 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1816 1848 } 1817 1849 1818 1850 static void scorpion_pmu_enable_event(struct perf_event *event) 1819 1851 { 1820 - unsigned long flags; 1821 1852 struct hw_perf_event *hwc = &event->hw; 1822 1853 int idx = hwc->idx; 1823 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 1824 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 1825 1854 1826 1855 /* 1827 1856 * Enable counter and interrupt, and set the counter to count 1828 1857 * the event that we're interested in. 
1829 1858 */ 1830 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 1831 1859 1832 1860 /* Disable counter */ 1833 1861 armv7_pmnc_disable_counter(idx); ··· 1841 1881 1842 1882 /* Enable counter */ 1843 1883 armv7_pmnc_enable_counter(idx); 1844 - 1845 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 1846 1884 } 1847 1885 1848 1886 static void scorpion_pmu_reset(void *info)
+8 -36
arch/arm/kernel/perf_event_xscale.c
··· 203 203 204 204 static void xscale1pmu_enable_event(struct perf_event *event) 205 205 { 206 - unsigned long val, mask, evt, flags; 207 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 206 + unsigned long val, mask, evt; 208 207 struct hw_perf_event *hwc = &event->hw; 209 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 210 208 int idx = hwc->idx; 211 209 212 210 switch (idx) { ··· 227 229 return; 228 230 } 229 231 230 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 231 232 val = xscale1pmu_read_pmnc(); 232 233 val &= ~mask; 233 234 val |= evt; 234 235 xscale1pmu_write_pmnc(val); 235 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 236 236 } 237 237 238 238 static void xscale1pmu_disable_event(struct perf_event *event) 239 239 { 240 - unsigned long val, mask, evt, flags; 241 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 240 + unsigned long val, mask, evt; 242 241 struct hw_perf_event *hwc = &event->hw; 243 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 244 242 int idx = hwc->idx; 245 243 246 244 switch (idx) { ··· 257 263 return; 258 264 } 259 265 260 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 261 266 val = xscale1pmu_read_pmnc(); 262 267 val &= ~mask; 263 268 val |= evt; 264 269 xscale1pmu_write_pmnc(val); 265 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 266 270 } 267 271 268 272 static int ··· 292 300 293 301 static void xscale1pmu_start(struct arm_pmu *cpu_pmu) 294 302 { 295 - unsigned long flags, val; 296 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 303 + unsigned long val; 297 304 298 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 299 305 val = xscale1pmu_read_pmnc(); 300 306 val |= XSCALE_PMU_ENABLE; 301 307 xscale1pmu_write_pmnc(val); 302 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 303 308 } 304 309 305 310 static void xscale1pmu_stop(struct arm_pmu *cpu_pmu) 306 311 { 307 - unsigned long flags, val; 308 - struct pmu_hw_events *events = 
this_cpu_ptr(cpu_pmu->hw_events); 312 + unsigned long val; 309 313 310 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 311 314 val = xscale1pmu_read_pmnc(); 312 315 val &= ~XSCALE_PMU_ENABLE; 313 316 xscale1pmu_write_pmnc(val); 314 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 315 317 } 316 318 317 319 static inline u64 xscale1pmu_read_counter(struct perf_event *event) ··· 535 549 536 550 static void xscale2pmu_enable_event(struct perf_event *event) 537 551 { 538 - unsigned long flags, ien, evtsel; 539 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 552 + unsigned long ien, evtsel; 540 553 struct hw_perf_event *hwc = &event->hw; 541 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 542 554 int idx = hwc->idx; 543 555 544 556 ien = xscale2pmu_read_int_enable(); ··· 571 587 return; 572 588 } 573 589 574 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 575 590 xscale2pmu_write_event_select(evtsel); 576 591 xscale2pmu_write_int_enable(ien); 577 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 578 592 } 579 593 580 594 static void xscale2pmu_disable_event(struct perf_event *event) 581 595 { 582 - unsigned long flags, ien, evtsel, of_flags; 583 - struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 596 + unsigned long ien, evtsel, of_flags; 584 597 struct hw_perf_event *hwc = &event->hw; 585 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 586 598 int idx = hwc->idx; 587 599 588 600 ien = xscale2pmu_read_int_enable(); ··· 618 638 return; 619 639 } 620 640 621 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 622 641 xscale2pmu_write_event_select(evtsel); 623 642 xscale2pmu_write_int_enable(ien); 624 643 xscale2pmu_write_overflow_flags(of_flags); 625 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 626 644 } 627 645 628 646 static int ··· 641 663 642 664 static void xscale2pmu_start(struct arm_pmu *cpu_pmu) 643 665 { 644 - unsigned long flags, val; 645 - struct pmu_hw_events *events = 
this_cpu_ptr(cpu_pmu->hw_events); 666 + unsigned long val; 646 667 647 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 648 668 val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; 649 669 val |= XSCALE_PMU_ENABLE; 650 670 xscale2pmu_write_pmnc(val); 651 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 652 671 } 653 672 654 673 static void xscale2pmu_stop(struct arm_pmu *cpu_pmu) 655 674 { 656 - unsigned long flags, val; 657 - struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 675 + unsigned long val; 658 676 659 - raw_spin_lock_irqsave(&events->pmu_lock, flags); 660 677 val = xscale2pmu_read_pmnc(); 661 678 val &= ~XSCALE_PMU_ENABLE; 662 679 xscale2pmu_write_pmnc(val); 663 - raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 664 680 } 665 681 666 682 static inline u64 xscale2pmu_read_counter(struct perf_event *event)
+3 -5
arch/arm64/kvm/pmu-emul.c
··· 267 267 268 268 u64 kvm_pmu_valid_counter_mask(struct kvm_vcpu *vcpu) 269 269 { 270 - u64 val = kvm_vcpu_read_pmcr(vcpu) >> ARMV8_PMU_PMCR_N_SHIFT; 270 + u64 val = FIELD_GET(ARMV8_PMU_PMCR_N, kvm_vcpu_read_pmcr(vcpu)); 271 271 272 - val &= ARMV8_PMU_PMCR_N_MASK; 273 272 if (val == 0) 274 273 return BIT(ARMV8_PMU_CYCLE_IDX); 275 274 else ··· 1135 1136 */ 1136 1137 u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu) 1137 1138 { 1138 - u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0) & 1139 - ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT); 1139 + u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0); 1140 1140 1141 - return pmcr | ((u64)vcpu->kvm->arch.pmcr_n << ARMV8_PMU_PMCR_N_SHIFT); 1141 + return u64_replace_bits(pmcr, vcpu->kvm->arch.pmcr_n, ARMV8_PMU_PMCR_N); 1142 1142 }
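The hunk above replaces open-coded shift-and-mask arithmetic with FIELD_GET() and u64_replace_bits(). Their behaviour can be illustrated with Python stand-ins; we assume here that ARMV8_PMU_PMCR_N is GENMASK(15, 11), i.e. the PMCR_EL0.N field at bits [15:11]:

```python
# Python stand-ins (not the kernel macros) for the bitfield helpers used
# in the hunk above. Assumption: ARMV8_PMU_PMCR_N = GENMASK(15, 11).

def genmask(h: int, l: int) -> int:
    return ((1 << (h - l + 1)) - 1) << l

ARMV8_PMU_PMCR_N = genmask(15, 11)

def field_get(mask: int, reg: int) -> int:
    # FIELD_GET(): mask out the field and shift it down to bit 0
    shift = (mask & -mask).bit_length() - 1   # index of lowest set mask bit
    return (reg & mask) >> shift

def u64_replace_bits(reg: int, val: int, mask: int) -> int:
    # u64_replace_bits(): overwrite only the masked field with val
    shift = (mask & -mask).bit_length() - 1
    return (reg & ~mask) | ((val << shift) & mask)

pmcr = u64_replace_bits(0, 6, ARMV8_PMU_PMCR_N)   # advertise N=6 counters
print(field_get(ARMV8_PMU_PMCR_N, pmcr))          # 6
```

Compared with the old `(pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK` form, the helpers derive the shift from the mask, so a single definition serves both extraction and insertion.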
+2 -2
arch/arm64/kvm/sys_regs.c
··· 877 877 u64 pmcr, val; 878 878 879 879 pmcr = kvm_vcpu_read_pmcr(vcpu); 880 - val = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; 880 + val = FIELD_GET(ARMV8_PMU_PMCR_N, pmcr); 881 881 if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) { 882 882 kvm_inject_undefined(vcpu); 883 883 return false; ··· 1143 1143 static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, 1144 1144 u64 val) 1145 1145 { 1146 - u8 new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; 1146 + u8 new_n = FIELD_GET(ARMV8_PMU_PMCR_N, val); 1147 1147 struct kvm *kvm = vcpu->kvm; 1148 1148 1149 1149 mutex_lock(&kvm->arch.config_lock);
-2
drivers/infiniband/hw/erdma/erdma_hw.h
··· 11 11 #include <linux/types.h> 12 12 13 13 /* PCIe device related definition. */ 14 - #define PCI_VENDOR_ID_ALIBABA 0x1ded 15 - 16 14 #define ERDMA_PCI_WIDTH 64 17 15 #define ERDMA_FUNC_BAR 0 18 16 #define ERDMA_MISX_BAR 2
+12
drivers/pci/access.c
··· 598 598 return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val); 599 599 } 600 600 EXPORT_SYMBOL(pci_write_config_dword); 601 + 602 + void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos, 603 + u32 clear, u32 set) 604 + { 605 + u32 val; 606 + 607 + pci_read_config_dword(dev, pos, &val); 608 + val &= ~clear; 609 + val |= set; 610 + pci_write_config_dword(dev, pos, val); 611 + } 612 + EXPORT_SYMBOL(pci_clear_and_set_config_dword);
+30 -35
drivers/pci/pcie/aspm.c
··· 426 426 } 427 427 } 428 428 429 - static void pci_clear_and_set_dword(struct pci_dev *pdev, int pos, 430 - u32 clear, u32 set) 431 - { 432 - u32 val; 433 - 434 - pci_read_config_dword(pdev, pos, &val); 435 - val &= ~clear; 436 - val |= set; 437 - pci_write_config_dword(pdev, pos, val); 438 - } 439 - 440 429 /* Calculate L1.2 PM substate timing parameters */ 441 430 static void aspm_calc_l12_info(struct pcie_link_state *link, 442 431 u32 parent_l1ss_cap, u32 child_l1ss_cap) ··· 490 501 cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK; 491 502 492 503 if (pl1_2_enables || cl1_2_enables) { 493 - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 494 - PCI_L1SS_CTL1_L1_2_MASK, 0); 495 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 496 - PCI_L1SS_CTL1_L1_2_MASK, 0); 504 + pci_clear_and_set_config_dword(child, 505 + child->l1ss + PCI_L1SS_CTL1, 506 + PCI_L1SS_CTL1_L1_2_MASK, 0); 507 + pci_clear_and_set_config_dword(parent, 508 + parent->l1ss + PCI_L1SS_CTL1, 509 + PCI_L1SS_CTL1_L1_2_MASK, 0); 497 510 } 498 511 499 512 /* Program T_POWER_ON times in both ports */ ··· 503 512 pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2); 504 513 505 514 /* Program Common_Mode_Restore_Time in upstream device */ 506 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 507 - PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); 515 + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 516 + PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); 508 517 509 518 /* Program LTR_L1.2_THRESHOLD time in both ports */ 510 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 511 - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | 512 - PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); 513 - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 514 - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | 515 - PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); 519 + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 520 + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | 521 + 
PCI_L1SS_CTL1_LTR_L12_TH_SCALE, 522 + ctl1); 523 + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, 524 + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | 525 + PCI_L1SS_CTL1_LTR_L12_TH_SCALE, 526 + ctl1); 516 527 517 528 if (pl1_2_enables || cl1_2_enables) { 518 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 0, 519 - pl1_2_enables); 520 - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 0, 521 - cl1_2_enables); 529 + pci_clear_and_set_config_dword(parent, 530 + parent->l1ss + PCI_L1SS_CTL1, 0, 531 + pl1_2_enables); 532 + pci_clear_and_set_config_dword(child, 533 + child->l1ss + PCI_L1SS_CTL1, 0, 534 + cl1_2_enables); 522 535 } 523 536 } 524 537 ··· 682 687 */ 683 688 684 689 /* Disable all L1 substates */ 685 - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 686 - PCI_L1SS_CTL1_L1SS_MASK, 0); 687 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 688 - PCI_L1SS_CTL1_L1SS_MASK, 0); 690 + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, 691 + PCI_L1SS_CTL1_L1SS_MASK, 0); 692 + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 693 + PCI_L1SS_CTL1_L1SS_MASK, 0); 689 694 /* 690 695 * If needed, disable L1, and it gets enabled later 691 696 * in pcie_config_aspm_link(). ··· 708 713 val |= PCI_L1SS_CTL1_PCIPM_L1_2; 709 714 710 715 /* Enable what we need to enable */ 711 - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 712 - PCI_L1SS_CTL1_L1SS_MASK, val); 713 - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 714 - PCI_L1SS_CTL1_L1SS_MASK, val); 716 + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 717 + PCI_L1SS_CTL1_L1SS_MASK, val); 718 + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, 719 + PCI_L1SS_CTL1_L1SS_MASK, val); 715 720 } 716 721 717 722 static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
+7
drivers/perf/Kconfig
··· 217 217 Enable perf support for Marvell DDR Performance monitoring 218 218 event on CN10K platform. 219 219 220 + config DWC_PCIE_PMU 221 + tristate "Synopsys DesignWare PCIe PMU" 222 + depends on PCI 223 + help 224 + Enable perf support for Synopsys DesignWare PCIe PMU Performance 225 + monitoring event on platform including the Alibaba Yitian 710. 226 + 220 227 source "drivers/perf/arm_cspmu/Kconfig" 221 228 222 229 source "drivers/perf/amlogic/Kconfig"
+1
drivers/perf/Makefile
··· 23 23 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o 24 24 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o 25 25 obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o 26 + obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o 26 27 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/ 27 28 obj-$(CONFIG_MESON_DDR_PMU) += amlogic/ 28 29 obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
+4 -2
drivers/perf/apple_m1_cpu_pmu.c
··· 524 524 { 525 525 unsigned long config_base = 0; 526 526 527 - if (!attr->exclude_guest) 528 - return -EINVAL; 527 + if (!attr->exclude_guest) { 528 + pr_debug("ARM performance counters do not support mode exclusion\n"); 529 + return -EOPNOTSUPP; 530 + } 529 531 if (!attr->exclude_kernel) 530 532 config_base |= M1_PMU_CFG_COUNT_KERNEL; 531 533 if (!attr->exclude_user)
+1 -1
drivers/perf/arm-cmn.c
··· 811 811 #define CMN_EVENT_HNF_OCC(_model, _name, _event) \ 812 812 CMN_EVENT_HN_OCC(_model, hnf_##_name, CMN_TYPE_HNF, _event) 813 813 #define CMN_EVENT_HNF_CLS(_model, _name, _event) \ 814 - CMN_EVENT_HN_CLS(_model, hnf_##_name, CMN_TYPE_HNS, _event) 814 + CMN_EVENT_HN_CLS(_model, hnf_##_name, CMN_TYPE_HNF, _event) 815 815 #define CMN_EVENT_HNF_SNT(_model, _name, _event) \ 816 816 CMN_EVENT_HN_SNT(_model, hnf_##_name, CMN_TYPE_HNF, _event) 817 817
+3 -3
drivers/perf/arm_dsu_pmu.c
··· 371 371 return __dsu_pmu_get_reset_overflow(); 372 372 } 373 373 374 - /** 374 + /* 375 375 * dsu_pmu_set_event_period: Set the period for the counter. 376 376 * 377 377 * All DSU PMU event counters, except the cycle counter are 32bit ··· 602 602 return dsu_pmu; 603 603 } 604 604 605 - /** 605 + /* 606 606 * dsu_pmu_dt_get_cpus: Get the list of CPUs in the cluster 607 607 * from device tree. 608 608 */ ··· 632 632 return 0; 633 633 } 634 634 635 - /** 635 + /* 636 636 * dsu_pmu_acpi_get_cpus: Get the list of CPUs in the cluster 637 637 * from ACPI. 638 638 */
+5 -7
drivers/perf/arm_pmu.c
··· 445 445 { 446 446 struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 447 447 struct hw_perf_event *hwc = &event->hw; 448 - int mapping; 448 + int mapping, ret; 449 449 450 450 hwc->flags = 0; 451 451 mapping = armpmu->map_event(event); ··· 470 470 /* 471 471 * Check whether we need to exclude the counter from certain modes. 472 472 */ 473 - if (armpmu->set_event_filter && 474 - armpmu->set_event_filter(hwc, &event->attr)) { 475 - pr_debug("ARM performance counters do not support " 476 - "mode exclusion\n"); 477 - return -EOPNOTSUPP; 473 + if (armpmu->set_event_filter) { 474 + ret = armpmu->set_event_filter(hwc, &event->attr); 475 + if (ret) 476 + return ret; 478 477 } 479 478 480 479 /* ··· 892 893 struct pmu_hw_events *events; 893 894 894 895 events = per_cpu_ptr(pmu->hw_events, cpu); 895 - raw_spin_lock_init(&events->pmu_lock); 896 896 events->percpu_pmu = pmu; 897 897 } 898 898
+159 -81
drivers/perf/arm_pmuv3.c
··· 15 15 #include <clocksource/arm_arch_timer.h> 16 16 17 17 #include <linux/acpi.h> 18 + #include <linux/bitfield.h> 18 19 #include <linux/clocksource.h> 19 20 #include <linux/of.h> 20 21 #include <linux/perf/arm_pmu.h> ··· 170 169 PMU_EVENT_ATTR_ID(name, armv8pmu_events_sysfs_show, config) 171 170 172 171 static struct attribute *armv8_pmuv3_event_attrs[] = { 173 - ARMV8_EVENT_ATTR(sw_incr, ARMV8_PMUV3_PERFCTR_SW_INCR), 172 + /* 173 + * Don't expose the sw_incr event in /sys. It's not usable as writes to 174 + * PMSWINC_EL0 will trap as PMUSERENR.{SW,EN}=={0,0} and event rotation 175 + * means we don't have a fixed event<->counter relationship regardless. 176 + */ 174 177 ARMV8_EVENT_ATTR(l1i_cache_refill, ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL), 175 178 ARMV8_EVENT_ATTR(l1i_tlb_refill, ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL), 176 179 ARMV8_EVENT_ATTR(l1d_cache_refill, ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL), ··· 299 294 .is_visible = armv8pmu_event_attr_is_visible, 300 295 }; 301 296 302 - PMU_FORMAT_ATTR(event, "config:0-15"); 303 - PMU_FORMAT_ATTR(long, "config1:0"); 304 - PMU_FORMAT_ATTR(rdpmc, "config1:1"); 297 + /* User ABI */ 298 + #define ATTR_CFG_FLD_event_CFG config 299 + #define ATTR_CFG_FLD_event_LO 0 300 + #define ATTR_CFG_FLD_event_HI 15 301 + #define ATTR_CFG_FLD_long_CFG config1 302 + #define ATTR_CFG_FLD_long_LO 0 303 + #define ATTR_CFG_FLD_long_HI 0 304 + #define ATTR_CFG_FLD_rdpmc_CFG config1 305 + #define ATTR_CFG_FLD_rdpmc_LO 1 306 + #define ATTR_CFG_FLD_rdpmc_HI 1 307 + #define ATTR_CFG_FLD_threshold_count_CFG config1 /* PMEVTYPER.TC[0] */ 308 + #define ATTR_CFG_FLD_threshold_count_LO 2 309 + #define ATTR_CFG_FLD_threshold_count_HI 2 310 + #define ATTR_CFG_FLD_threshold_compare_CFG config1 /* PMEVTYPER.TC[2:1] */ 311 + #define ATTR_CFG_FLD_threshold_compare_LO 3 312 + #define ATTR_CFG_FLD_threshold_compare_HI 4 313 + #define ATTR_CFG_FLD_threshold_CFG config1 /* PMEVTYPER.TH */ 314 + #define ATTR_CFG_FLD_threshold_LO 5 315 + #define 
ATTR_CFG_FLD_threshold_HI 16 316 + 317 + GEN_PMU_FORMAT_ATTR(event); 318 + GEN_PMU_FORMAT_ATTR(long); 319 + GEN_PMU_FORMAT_ATTR(rdpmc); 320 + GEN_PMU_FORMAT_ATTR(threshold_count); 321 + GEN_PMU_FORMAT_ATTR(threshold_compare); 322 + GEN_PMU_FORMAT_ATTR(threshold); 305 323 306 324 static int sysctl_perf_user_access __read_mostly; 307 325 308 - static inline bool armv8pmu_event_is_64bit(struct perf_event *event) 326 + static bool armv8pmu_event_is_64bit(struct perf_event *event) 309 327 { 310 - return event->attr.config1 & 0x1; 328 + return ATTR_CFG_GET_FLD(&event->attr, long); 311 329 } 312 330 313 - static inline bool armv8pmu_event_want_user_access(struct perf_event *event) 331 + static bool armv8pmu_event_want_user_access(struct perf_event *event) 314 332 { 315 - return event->attr.config1 & 0x2; 333 + return ATTR_CFG_GET_FLD(&event->attr, rdpmc); 334 + } 335 + 336 + static u8 armv8pmu_event_threshold_control(struct perf_event_attr *attr) 337 + { 338 + u8 th_compare = ATTR_CFG_GET_FLD(attr, threshold_compare); 339 + u8 th_count = ATTR_CFG_GET_FLD(attr, threshold_count); 340 + 341 + /* 342 + * The count bit is always the bottom bit of the full control field, and 343 + * the comparison is the upper two bits, but it's not explicitly 344 + * labelled in the Arm ARM. For the Perf interface we split it into two 345 + * fields, so reconstruct it here. 
346 + */ 347 + return (th_compare << 1) | th_count; 316 348 } 317 349 318 350 static struct attribute *armv8_pmuv3_format_attrs[] = { 319 351 &format_attr_event.attr, 320 352 &format_attr_long.attr, 321 353 &format_attr_rdpmc.attr, 354 + &format_attr_threshold.attr, 355 + &format_attr_threshold_compare.attr, 356 + &format_attr_threshold_count.attr, 322 357 NULL, 323 358 }; 324 359 ··· 372 327 { 373 328 struct pmu *pmu = dev_get_drvdata(dev); 374 329 struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); 375 - u32 slots = cpu_pmu->reg_pmmir & ARMV8_PMU_SLOTS_MASK; 330 + u32 slots = FIELD_GET(ARMV8_PMU_SLOTS, cpu_pmu->reg_pmmir); 376 331 377 332 return sysfs_emit(page, "0x%08x\n", slots); 378 333 } ··· 384 339 { 385 340 struct pmu *pmu = dev_get_drvdata(dev); 386 341 struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); 387 - u32 bus_slots = (cpu_pmu->reg_pmmir >> ARMV8_PMU_BUS_SLOTS_SHIFT) 388 - & ARMV8_PMU_BUS_SLOTS_MASK; 342 + u32 bus_slots = FIELD_GET(ARMV8_PMU_BUS_SLOTS, cpu_pmu->reg_pmmir); 389 343 390 344 return sysfs_emit(page, "0x%08x\n", bus_slots); 391 345 } ··· 396 352 { 397 353 struct pmu *pmu = dev_get_drvdata(dev); 398 354 struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); 399 - u32 bus_width = (cpu_pmu->reg_pmmir >> ARMV8_PMU_BUS_WIDTH_SHIFT) 400 - & ARMV8_PMU_BUS_WIDTH_MASK; 355 + u32 bus_width = FIELD_GET(ARMV8_PMU_BUS_WIDTH, cpu_pmu->reg_pmmir); 401 356 u32 val = 0; 402 357 403 358 /* Encoded as Log2(number of bytes), plus one */ ··· 408 365 409 366 static DEVICE_ATTR_RO(bus_width); 410 367 368 + static u32 threshold_max(struct arm_pmu *cpu_pmu) 369 + { 370 + /* 371 + * PMMIR.THWIDTH is readable and non-zero on aarch32, but it would be 372 + * impossible to write the threshold in the upper 32 bits of PMEVTYPER. 373 + */ 374 + if (IS_ENABLED(CONFIG_ARM)) 375 + return 0; 376 + 377 + /* 378 + * The largest value that can be written to PMEVTYPER<n>_EL0.TH is 379 + * (2 ^ PMMIR.THWIDTH) - 1. 
380 + */ 381 + return (1 << FIELD_GET(ARMV8_PMU_THWIDTH, cpu_pmu->reg_pmmir)) - 1; 382 + } 383 + 384 + static ssize_t threshold_max_show(struct device *dev, 385 + struct device_attribute *attr, char *page) 386 + { 387 + struct pmu *pmu = dev_get_drvdata(dev); 388 + struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu); 389 + 390 + return sysfs_emit(page, "0x%08x\n", threshold_max(cpu_pmu)); 391 + } 392 + 393 + static DEVICE_ATTR_RO(threshold_max); 394 + 411 395 static struct attribute *armv8_pmuv3_caps_attrs[] = { 412 396 &dev_attr_slots.attr, 413 397 &dev_attr_bus_slots.attr, 414 398 &dev_attr_bus_width.attr, 399 + &dev_attr_threshold_max.attr, 415 400 NULL, 416 401 }; 417 402 ··· 468 397 return (IS_ENABLED(CONFIG_ARM64) && is_pmuv3p5(cpu_pmu->pmuver)); 469 398 } 470 399 471 - static inline bool armv8pmu_event_has_user_read(struct perf_event *event) 400 + static bool armv8pmu_event_has_user_read(struct perf_event *event) 472 401 { 473 402 return event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT; 474 403 } ··· 478 407 * except when we have allocated the 64bit cycle counter (for CPU 479 408 * cycles event) or when user space counter access is enabled. 
480 409 */ 481 - static inline bool armv8pmu_event_is_chained(struct perf_event *event) 410 + static bool armv8pmu_event_is_chained(struct perf_event *event) 482 411 { 483 412 int idx = event->hw.idx; 484 413 struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); ··· 499 428 #define ARMV8_IDX_TO_COUNTER(x) \ 500 429 (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK) 501 430 502 - static inline u64 armv8pmu_pmcr_read(void) 431 + static u64 armv8pmu_pmcr_read(void) 503 432 { 504 433 return read_pmcr(); 505 434 } 506 435 507 - static inline void armv8pmu_pmcr_write(u64 val) 436 + static void armv8pmu_pmcr_write(u64 val) 508 437 { 509 438 val &= ARMV8_PMU_PMCR_MASK; 510 439 isb(); 511 440 write_pmcr(val); 512 441 } 513 442 514 - static inline int armv8pmu_has_overflowed(u32 pmovsr) 443 + static int armv8pmu_has_overflowed(u32 pmovsr) 515 444 { 516 445 return pmovsr & ARMV8_PMU_OVERFLOWED_MASK; 517 446 } 518 447 519 - static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) 448 + static int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) 520 449 { 521 450 return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); 522 451 } 523 452 524 - static inline u64 armv8pmu_read_evcntr(int idx) 453 + static u64 armv8pmu_read_evcntr(int idx) 525 454 { 526 455 u32 counter = ARMV8_IDX_TO_COUNTER(idx); 527 456 528 457 return read_pmevcntrn(counter); 529 458 } 530 459 531 - static inline u64 armv8pmu_read_hw_counter(struct perf_event *event) 460 + static u64 armv8pmu_read_hw_counter(struct perf_event *event) 532 461 { 533 462 int idx = event->hw.idx; 534 463 u64 val = armv8pmu_read_evcntr(idx); ··· 590 519 return armv8pmu_unbias_long_counter(event, value); 591 520 } 592 521 593 - static inline void armv8pmu_write_evcntr(int idx, u64 value) 522 + static void armv8pmu_write_evcntr(int idx, u64 value) 594 523 { 595 524 u32 counter = ARMV8_IDX_TO_COUNTER(idx); 596 525 597 526 write_pmevcntrn(counter, value); 598 527 } 599 528 600 - static inline void armv8pmu_write_hw_counter(struct 
perf_event *event, 529 + static void armv8pmu_write_hw_counter(struct perf_event *event, 601 530 u64 value) 602 531 { 603 532 int idx = event->hw.idx; ··· 623 552 armv8pmu_write_hw_counter(event, value); 624 553 } 625 554 626 - static inline void armv8pmu_write_evtype(int idx, u32 val) 555 + static void armv8pmu_write_evtype(int idx, unsigned long val) 627 556 { 628 557 u32 counter = ARMV8_IDX_TO_COUNTER(idx); 558 + unsigned long mask = ARMV8_PMU_EVTYPE_EVENT | 559 + ARMV8_PMU_INCLUDE_EL2 | 560 + ARMV8_PMU_EXCLUDE_EL0 | 561 + ARMV8_PMU_EXCLUDE_EL1; 629 562 630 - val &= ARMV8_PMU_EVTYPE_MASK; 563 + if (IS_ENABLED(CONFIG_ARM64)) 564 + mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH; 565 + 566 + val &= mask; 631 567 write_pmevtypern(counter, val); 632 568 } 633 569 634 - static inline void armv8pmu_write_event_type(struct perf_event *event) 570 + static void armv8pmu_write_event_type(struct perf_event *event) 635 571 { 636 572 struct hw_perf_event *hwc = &event->hw; 637 573 int idx = hwc->idx; ··· 672 594 return mask; 673 595 } 674 596 675 - static inline void armv8pmu_enable_counter(u32 mask) 597 + static void armv8pmu_enable_counter(u32 mask) 676 598 { 677 599 /* 678 600 * Make sure event configuration register writes are visible before we ··· 682 604 write_pmcntenset(mask); 683 605 } 684 606 685 - static inline void armv8pmu_enable_event_counter(struct perf_event *event) 607 + static void armv8pmu_enable_event_counter(struct perf_event *event) 686 608 { 687 609 struct perf_event_attr *attr = &event->attr; 688 610 u32 mask = armv8pmu_event_cnten_mask(event); ··· 694 616 armv8pmu_enable_counter(mask); 695 617 } 696 618 697 - static inline void armv8pmu_disable_counter(u32 mask) 619 + static void armv8pmu_disable_counter(u32 mask) 698 620 { 699 621 write_pmcntenclr(mask); 700 622 /* ··· 704 626 isb(); 705 627 } 706 628 707 - static inline void armv8pmu_disable_event_counter(struct perf_event *event) 629 + static void armv8pmu_disable_event_counter(struct perf_event 
*event) 708 630 { 709 631 struct perf_event_attr *attr = &event->attr; 710 632 u32 mask = armv8pmu_event_cnten_mask(event); ··· 716 638 armv8pmu_disable_counter(mask); 717 639 } 718 640 719 - static inline void armv8pmu_enable_intens(u32 mask) 641 + static void armv8pmu_enable_intens(u32 mask) 720 642 { 721 643 write_pmintenset(mask); 722 644 } 723 645 724 - static inline void armv8pmu_enable_event_irq(struct perf_event *event) 646 + static void armv8pmu_enable_event_irq(struct perf_event *event) 725 647 { 726 648 u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); 727 649 armv8pmu_enable_intens(BIT(counter)); 728 650 } 729 651 730 - static inline void armv8pmu_disable_intens(u32 mask) 652 + static void armv8pmu_disable_intens(u32 mask) 731 653 { 732 654 write_pmintenclr(mask); 733 655 isb(); ··· 736 658 isb(); 737 659 } 738 660 739 - static inline void armv8pmu_disable_event_irq(struct perf_event *event) 661 + static void armv8pmu_disable_event_irq(struct perf_event *event) 740 662 { 741 663 u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); 742 664 armv8pmu_disable_intens(BIT(counter)); 743 665 } 744 666 745 - static inline u32 armv8pmu_getreset_flags(void) 667 + static u32 armv8pmu_getreset_flags(void) 746 668 { 747 669 u32 value; 748 670 ··· 750 672 value = read_pmovsclr(); 751 673 752 674 /* Write to clear flags */ 753 - value &= ARMV8_PMU_OVSR_MASK; 675 + value &= ARMV8_PMU_OVERFLOWED_MASK; 754 676 write_pmovsclr(value); 755 677 756 678 return value; ··· 992 914 struct perf_event_attr *attr) 993 915 { 994 916 unsigned long config_base = 0; 917 + struct perf_event *perf_event = container_of(attr, struct perf_event, 918 + attr); 919 + struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu); 920 + u32 th; 995 921 996 - if (attr->exclude_idle) 997 - return -EPERM; 922 + if (attr->exclude_idle) { 923 + pr_debug("ARM performance counters do not support mode exclusion\n"); 924 + return -EOPNOTSUPP; 925 + } 998 926 999 927 /* 1000 928 * If we're running in hyp mode, 
then we *are* the hypervisor. ··· 1028 944 1029 945 if (attr->exclude_user) 1030 946 config_base |= ARMV8_PMU_EXCLUDE_EL0; 947 + 948 + /* 949 + * If FEAT_PMUv3_TH isn't implemented, then THWIDTH (threshold_max) will 950 + * be 0 and will also trigger this check, preventing it from being used. 951 + */ 952 + th = ATTR_CFG_GET_FLD(attr, threshold); 953 + if (th > threshold_max(cpu_pmu)) { 954 + pr_debug("PMU event threshold exceeds max value\n"); 955 + return -EINVAL; 956 + } 957 + 958 + if (IS_ENABLED(CONFIG_ARM64) && th) { 959 + config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TH, th); 960 + config_base |= FIELD_PREP(ARMV8_PMU_EVTYPE_TC, 961 + armv8pmu_event_threshold_control(attr)); 962 + } 1031 963 1032 964 /* 1033 965 * Install the filter into config_base as this is used to ··· 1207 1107 probe->present = true; 1208 1108 1209 1109 /* Read the nb of CNTx counters supported from PMNC */ 1210 - cpu_pmu->num_events = (armv8pmu_pmcr_read() >> ARMV8_PMU_PMCR_N_SHIFT) 1211 - & ARMV8_PMU_PMCR_N_MASK; 1110 + cpu_pmu->num_events = FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read()); 1212 1111 1213 1112 /* Add the CPU cycles counter */ 1214 1113 cpu_pmu->num_events += 1; ··· 1320 1221 return armv8_pmu_init(cpu_pmu, #name, armv8_pmuv3_map_event); \ 1321 1222 } 1322 1223 1224 + #define PMUV3_INIT_MAP_EVENT(name, map_event) \ 1225 + static int name##_pmu_init(struct arm_pmu *cpu_pmu) \ 1226 + { \ 1227 + return armv8_pmu_init(cpu_pmu, #name, map_event); \ 1228 + } 1229 + 1323 1230 PMUV3_INIT_SIMPLE(armv8_pmuv3) 1324 1231 1325 1232 PMUV3_INIT_SIMPLE(armv8_cortex_a34) ··· 1352 1247 PMUV3_INIT_SIMPLE(armv8_nvidia_carmel) 1353 1248 PMUV3_INIT_SIMPLE(armv8_nvidia_denver) 1354 1249 1355 - static int armv8_a35_pmu_init(struct arm_pmu *cpu_pmu) 1356 - { 1357 - return armv8_pmu_init(cpu_pmu, "armv8_cortex_a35", armv8_a53_map_event); 1358 - } 1359 - 1360 - static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu) 1361 - { 1362 - return armv8_pmu_init(cpu_pmu, "armv8_cortex_a53", 
armv8_a53_map_event); 1363 - } 1364 - 1365 - static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu) 1366 - { 1367 - return armv8_pmu_init(cpu_pmu, "armv8_cortex_a57", armv8_a57_map_event); 1368 - } 1369 - 1370 - static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu) 1371 - { 1372 - return armv8_pmu_init(cpu_pmu, "armv8_cortex_a72", armv8_a57_map_event); 1373 - } 1374 - 1375 - static int armv8_a73_pmu_init(struct arm_pmu *cpu_pmu) 1376 - { 1377 - return armv8_pmu_init(cpu_pmu, "armv8_cortex_a73", armv8_a73_map_event); 1378 - } 1379 - 1380 - static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu) 1381 - { 1382 - return armv8_pmu_init(cpu_pmu, "armv8_cavium_thunder", armv8_thunder_map_event); 1383 - } 1384 - 1385 - static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu) 1386 - { 1387 - return armv8_pmu_init(cpu_pmu, "armv8_brcm_vulcan", armv8_vulcan_map_event); 1388 - } 1250 + PMUV3_INIT_MAP_EVENT(armv8_cortex_a35, armv8_a53_map_event) 1251 + PMUV3_INIT_MAP_EVENT(armv8_cortex_a53, armv8_a53_map_event) 1252 + PMUV3_INIT_MAP_EVENT(armv8_cortex_a57, armv8_a57_map_event) 1253 + PMUV3_INIT_MAP_EVENT(armv8_cortex_a72, armv8_a57_map_event) 1254 + PMUV3_INIT_MAP_EVENT(armv8_cortex_a73, armv8_a73_map_event) 1255 + PMUV3_INIT_MAP_EVENT(armv8_cavium_thunder, armv8_thunder_map_event) 1256 + PMUV3_INIT_MAP_EVENT(armv8_brcm_vulcan, armv8_vulcan_map_event) 1389 1257 1390 1258 static const struct of_device_id armv8_pmu_of_device_ids[] = { 1391 1259 {.compatible = "arm,armv8-pmuv3", .data = armv8_pmuv3_pmu_init}, 1392 1260 {.compatible = "arm,cortex-a34-pmu", .data = armv8_cortex_a34_pmu_init}, 1393 - {.compatible = "arm,cortex-a35-pmu", .data = armv8_a35_pmu_init}, 1394 - {.compatible = "arm,cortex-a53-pmu", .data = armv8_a53_pmu_init}, 1261 + {.compatible = "arm,cortex-a35-pmu", .data = armv8_cortex_a35_pmu_init}, 1262 + {.compatible = "arm,cortex-a53-pmu", .data = armv8_cortex_a53_pmu_init}, 1395 1263 {.compatible = "arm,cortex-a55-pmu", .data = armv8_cortex_a55_pmu_init}, 
1396 - {.compatible = "arm,cortex-a57-pmu", .data = armv8_a57_pmu_init}, 1264 + {.compatible = "arm,cortex-a57-pmu", .data = armv8_cortex_a57_pmu_init}, 1397 1265 {.compatible = "arm,cortex-a65-pmu", .data = armv8_cortex_a65_pmu_init}, 1398 - {.compatible = "arm,cortex-a72-pmu", .data = armv8_a72_pmu_init}, 1399 - {.compatible = "arm,cortex-a73-pmu", .data = armv8_a73_pmu_init}, 1266 + {.compatible = "arm,cortex-a72-pmu", .data = armv8_cortex_a72_pmu_init}, 1267 + {.compatible = "arm,cortex-a73-pmu", .data = armv8_cortex_a73_pmu_init}, 1400 1268 {.compatible = "arm,cortex-a75-pmu", .data = armv8_cortex_a75_pmu_init}, 1401 1269 {.compatible = "arm,cortex-a76-pmu", .data = armv8_cortex_a76_pmu_init}, 1402 1270 {.compatible = "arm,cortex-a77-pmu", .data = armv8_cortex_a77_pmu_init}, ··· 1387 1309 {.compatible = "arm,neoverse-n1-pmu", .data = armv8_neoverse_n1_pmu_init}, 1388 1310 {.compatible = "arm,neoverse-n2-pmu", .data = armv9_neoverse_n2_pmu_init}, 1389 1311 {.compatible = "arm,neoverse-v1-pmu", .data = armv8_neoverse_v1_pmu_init}, 1390 - {.compatible = "cavium,thunder-pmu", .data = armv8_thunder_pmu_init}, 1391 - {.compatible = "brcm,vulcan-pmu", .data = armv8_vulcan_pmu_init}, 1312 + {.compatible = "cavium,thunder-pmu", .data = armv8_cavium_thunder_pmu_init}, 1313 + {.compatible = "brcm,vulcan-pmu", .data = armv8_brcm_vulcan_pmu_init}, 1392 1314 {.compatible = "nvidia,carmel-pmu", .data = armv8_nvidia_carmel_pmu_init}, 1393 1315 {.compatible = "nvidia,denver-pmu", .data = armv8_nvidia_denver_pmu_init}, 1394 1316 {},
-22
drivers/perf/arm_spe_pmu.c
··· 206 206 #define ATTR_CFG_FLD_inv_event_filter_LO 0 207 207 #define ATTR_CFG_FLD_inv_event_filter_HI 63 208 208 209 - /* Why does everything I do descend into this? */ 210 - #define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 211 - (lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi 212 - 213 - #define _GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 214 - __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) 215 - 216 - #define GEN_PMU_FORMAT_ATTR(name) \ 217 - PMU_FORMAT_ATTR(name, \ 218 - _GEN_PMU_FORMAT_ATTR(ATTR_CFG_FLD_##name##_CFG, \ 219 - ATTR_CFG_FLD_##name##_LO, \ 220 - ATTR_CFG_FLD_##name##_HI)) 221 - 222 - #define _ATTR_CFG_GET_FLD(attr, cfg, lo, hi) \ 223 - ((((attr)->cfg) >> lo) & GENMASK(hi - lo, 0)) 224 - 225 - #define ATTR_CFG_GET_FLD(attr, name) \ 226 - _ATTR_CFG_GET_FLD(attr, \ 227 - ATTR_CFG_FLD_##name##_CFG, \ 228 - ATTR_CFG_FLD_##name##_LO, \ 229 - ATTR_CFG_FLD_##name##_HI) 230 - 231 209 GEN_PMU_FORMAT_ATTR(ts_enable); 232 210 GEN_PMU_FORMAT_ATTR(pa_enable); 233 211 GEN_PMU_FORMAT_ATTR(pct_enable);
+792
drivers/perf/dwc_pcie_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Synopsys DesignWare PCIe PMU driver 4 + * 5 + * Copyright (C) 2021-2023 Alibaba Inc. 6 + */ 7 + 8 + #include <linux/bitfield.h> 9 + #include <linux/bitops.h> 10 + #include <linux/cpuhotplug.h> 11 + #include <linux/cpumask.h> 12 + #include <linux/device.h> 13 + #include <linux/errno.h> 14 + #include <linux/kernel.h> 15 + #include <linux/list.h> 16 + #include <linux/perf_event.h> 17 + #include <linux/pci.h> 18 + #include <linux/platform_device.h> 19 + #include <linux/smp.h> 20 + #include <linux/sysfs.h> 21 + #include <linux/types.h> 22 + 23 + #define DWC_PCIE_VSEC_RAS_DES_ID 0x02 24 + #define DWC_PCIE_EVENT_CNT_CTL 0x8 25 + 26 + /* 27 + * Event Counter Data Select includes two parts: 28 + * - 27-24: Group number(4-bit: 0..0x7) 29 + * - 23-16: Event number(8-bit: 0..0x13) within the Group 30 + * 31 + * Put them together as in TRM. 32 + */ 33 + #define DWC_PCIE_CNT_EVENT_SEL GENMASK(27, 16) 34 + #define DWC_PCIE_CNT_LANE_SEL GENMASK(11, 8) 35 + #define DWC_PCIE_CNT_STATUS BIT(7) 36 + #define DWC_PCIE_CNT_ENABLE GENMASK(4, 2) 37 + #define DWC_PCIE_PER_EVENT_OFF 0x1 38 + #define DWC_PCIE_PER_EVENT_ON 0x3 39 + #define DWC_PCIE_EVENT_CLEAR GENMASK(1, 0) 40 + #define DWC_PCIE_EVENT_PER_CLEAR 0x1 41 + 42 + #define DWC_PCIE_EVENT_CNT_DATA 0xC 43 + 44 + #define DWC_PCIE_TIME_BASED_ANAL_CTL 0x10 45 + #define DWC_PCIE_TIME_BASED_REPORT_SEL GENMASK(31, 24) 46 + #define DWC_PCIE_TIME_BASED_DURATION_SEL GENMASK(15, 8) 47 + #define DWC_PCIE_DURATION_MANUAL_CTL 0x0 48 + #define DWC_PCIE_DURATION_1MS 0x1 49 + #define DWC_PCIE_DURATION_10MS 0x2 50 + #define DWC_PCIE_DURATION_100MS 0x3 51 + #define DWC_PCIE_DURATION_1S 0x4 52 + #define DWC_PCIE_DURATION_2S 0x5 53 + #define DWC_PCIE_DURATION_4S 0x6 54 + #define DWC_PCIE_DURATION_4US 0xFF 55 + #define DWC_PCIE_TIME_BASED_TIMER_START BIT(0) 56 + #define DWC_PCIE_TIME_BASED_CNT_ENABLE 0x1 57 + 58 + #define DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW 0x14 59 + #define 
DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH 0x18 60 + 61 + /* Event attributes */ 62 + #define DWC_PCIE_CONFIG_EVENTID GENMASK(15, 0) 63 + #define DWC_PCIE_CONFIG_TYPE GENMASK(19, 16) 64 + #define DWC_PCIE_CONFIG_LANE GENMASK(27, 20) 65 + 66 + #define DWC_PCIE_EVENT_ID(event) FIELD_GET(DWC_PCIE_CONFIG_EVENTID, (event)->attr.config) 67 + #define DWC_PCIE_EVENT_TYPE(event) FIELD_GET(DWC_PCIE_CONFIG_TYPE, (event)->attr.config) 68 + #define DWC_PCIE_EVENT_LANE(event) FIELD_GET(DWC_PCIE_CONFIG_LANE, (event)->attr.config) 69 + 70 + enum dwc_pcie_event_type { 71 + DWC_PCIE_TIME_BASE_EVENT, 72 + DWC_PCIE_LANE_EVENT, 73 + DWC_PCIE_EVENT_TYPE_MAX, 74 + }; 75 + 76 + #define DWC_PCIE_LANE_EVENT_MAX_PERIOD GENMASK_ULL(31, 0) 77 + #define DWC_PCIE_MAX_PERIOD GENMASK_ULL(63, 0) 78 + 79 + struct dwc_pcie_pmu { 80 + struct pmu pmu; 81 + struct pci_dev *pdev; /* Root Port device */ 82 + u16 ras_des_offset; 83 + u32 nr_lanes; 84 + 85 + struct list_head pmu_node; 86 + struct hlist_node cpuhp_node; 87 + struct perf_event *event[DWC_PCIE_EVENT_TYPE_MAX]; 88 + int on_cpu; 89 + }; 90 + 91 + #define to_dwc_pcie_pmu(p) (container_of(p, struct dwc_pcie_pmu, pmu)) 92 + 93 + static int dwc_pcie_pmu_hp_state; 94 + static struct list_head dwc_pcie_dev_info_head = 95 + LIST_HEAD_INIT(dwc_pcie_dev_info_head); 96 + static bool notify; 97 + 98 + struct dwc_pcie_dev_info { 99 + struct platform_device *plat_dev; 100 + struct pci_dev *pdev; 101 + struct list_head dev_node; 102 + }; 103 + 104 + struct dwc_pcie_vendor_id { 105 + int vendor_id; 106 + }; 107 + 108 + static const struct dwc_pcie_vendor_id dwc_pcie_vendor_ids[] = { 109 + {.vendor_id = PCI_VENDOR_ID_ALIBABA }, 110 + {} /* terminator */ 111 + }; 112 + 113 + static ssize_t cpumask_show(struct device *dev, 114 + struct device_attribute *attr, 115 + char *buf) 116 + { 117 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(dev_get_drvdata(dev)); 118 + 119 + return cpumap_print_to_pagebuf(true, buf, cpumask_of(pcie_pmu->on_cpu)); 120 + } 121 + static 
DEVICE_ATTR_RO(cpumask); 122 + 123 + static struct attribute *dwc_pcie_pmu_cpumask_attrs[] = { 124 + &dev_attr_cpumask.attr, 125 + NULL 126 + }; 127 + 128 + static struct attribute_group dwc_pcie_cpumask_attr_group = { 129 + .attrs = dwc_pcie_pmu_cpumask_attrs, 130 + }; 131 + 132 + struct dwc_pcie_format_attr { 133 + struct device_attribute attr; 134 + u64 field; 135 + int config; 136 + }; 137 + 138 + PMU_FORMAT_ATTR(eventid, "config:0-15"); 139 + PMU_FORMAT_ATTR(type, "config:16-19"); 140 + PMU_FORMAT_ATTR(lane, "config:20-27"); 141 + 142 + static struct attribute *dwc_pcie_format_attrs[] = { 143 + &format_attr_type.attr, 144 + &format_attr_eventid.attr, 145 + &format_attr_lane.attr, 146 + NULL, 147 + }; 148 + 149 + static struct attribute_group dwc_pcie_format_attrs_group = { 150 + .name = "format", 151 + .attrs = dwc_pcie_format_attrs, 152 + }; 153 + 154 + struct dwc_pcie_event_attr { 155 + struct device_attribute attr; 156 + enum dwc_pcie_event_type type; 157 + u16 eventid; 158 + u8 lane; 159 + }; 160 + 161 + static ssize_t dwc_pcie_event_show(struct device *dev, 162 + struct device_attribute *attr, char *buf) 163 + { 164 + struct dwc_pcie_event_attr *eattr; 165 + 166 + eattr = container_of(attr, typeof(*eattr), attr); 167 + 168 + if (eattr->type == DWC_PCIE_LANE_EVENT) 169 + return sysfs_emit(buf, "eventid=0x%x,type=0x%x,lane=?\n", 170 + eattr->eventid, eattr->type); 171 + else if (eattr->type == DWC_PCIE_TIME_BASE_EVENT) 172 + return sysfs_emit(buf, "eventid=0x%x,type=0x%x\n", 173 + eattr->eventid, eattr->type); 174 + 175 + return 0; 176 + } 177 + 178 + #define DWC_PCIE_EVENT_ATTR(_name, _type, _eventid, _lane) \ 179 + (&((struct dwc_pcie_event_attr[]) {{ \ 180 + .attr = __ATTR(_name, 0444, dwc_pcie_event_show, NULL), \ 181 + .type = _type, \ 182 + .eventid = _eventid, \ 183 + .lane = _lane, \ 184 + }})[0].attr.attr) 185 + 186 + #define DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(_name, _eventid) \ 187 + DWC_PCIE_EVENT_ATTR(_name, DWC_PCIE_TIME_BASE_EVENT, _eventid, 0) 
188 + #define DWC_PCIE_PMU_LANE_EVENT_ATTR(_name, _eventid) \ 189 + DWC_PCIE_EVENT_ATTR(_name, DWC_PCIE_LANE_EVENT, _eventid, 0) 190 + 191 + static struct attribute *dwc_pcie_pmu_time_event_attrs[] = { 192 + /* Group #0 */ 193 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(one_cycle, 0x00), 194 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(TX_L0S, 0x01), 195 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(RX_L0S, 0x02), 196 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L0, 0x03), 197 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1, 0x04), 198 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_1, 0x05), 199 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_2, 0x06), 200 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(CFG_RCVRY, 0x07), 201 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(TX_RX_L0S, 0x08), 202 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(L1_AUX, 0x09), 203 + 204 + /* Group #1 */ 205 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Tx_PCIe_TLP_Data_Payload, 0x20), 206 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Rx_PCIe_TLP_Data_Payload, 0x21), 207 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Tx_CCIX_TLP_Data_Payload, 0x22), 208 + DWC_PCIE_PMU_TIME_BASE_EVENT_ATTR(Rx_CCIX_TLP_Data_Payload, 0x23), 209 + 210 + /* 211 + * Leave it to the user to specify the lane ID to avoid generating 212 + * a list of hundreds of events. 
213 + */ 214 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_ack_dllp, 0x600), 215 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_update_fc_dllp, 0x601), 216 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_ack_dllp, 0x602), 217 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_update_fc_dllp, 0x603), 218 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_nulified_tlp, 0x604), 219 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_nulified_tlp, 0x605), 220 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_duplicate_tl, 0x606), 221 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_memory_write, 0x700), 222 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_memory_read, 0x701), 223 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_configuration_write, 0x702), 224 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_configuration_read, 0x703), 225 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_io_write, 0x704), 226 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_io_read, 0x705), 227 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_completion_without_data, 0x706), 228 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_completion_with_data, 0x707), 229 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_message_tlp, 0x708), 230 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_atomic, 0x709), 231 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_tlp_with_prefix, 0x70A), 232 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_memory_write, 0x70B), 233 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_memory_read, 0x70C), 234 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_io_write, 0x70F), 235 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_io_read, 0x710), 236 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_completion_without_data, 0x711), 237 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_completion_with_data, 0x712), 238 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_message_tlp, 0x713), 239 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_atomic, 0x714), 240 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_tlp_with_prefix, 0x715), 241 + DWC_PCIE_PMU_LANE_EVENT_ATTR(tx_ccix_tlp, 0x716), 242 + DWC_PCIE_PMU_LANE_EVENT_ATTR(rx_ccix_tlp, 0x717), 243 + NULL 244 + }; 245 + 246 + static const struct attribute_group dwc_pcie_event_attrs_group = { 247 + .name = "events", 248 + .attrs = dwc_pcie_pmu_time_event_attrs, 249 + }; 250 + 251 + static const struct attribute_group 
*dwc_pcie_attr_groups[] = { 252 + &dwc_pcie_event_attrs_group, 253 + &dwc_pcie_format_attrs_group, 254 + &dwc_pcie_cpumask_attr_group, 255 + NULL 256 + }; 257 + 258 + static void dwc_pcie_pmu_lane_event_enable(struct dwc_pcie_pmu *pcie_pmu, 259 + bool enable) 260 + { 261 + struct pci_dev *pdev = pcie_pmu->pdev; 262 + u16 ras_des_offset = pcie_pmu->ras_des_offset; 263 + 264 + if (enable) 265 + pci_clear_and_set_config_dword(pdev, 266 + ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 267 + DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 268 + else 269 + pci_clear_and_set_config_dword(pdev, 270 + ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 271 + DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF); 272 + } 273 + 274 + static void dwc_pcie_pmu_time_based_event_enable(struct dwc_pcie_pmu *pcie_pmu, 275 + bool enable) 276 + { 277 + struct pci_dev *pdev = pcie_pmu->pdev; 278 + u16 ras_des_offset = pcie_pmu->ras_des_offset; 279 + 280 + pci_clear_and_set_config_dword(pdev, 281 + ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_CTL, 282 + DWC_PCIE_TIME_BASED_TIMER_START, enable); 283 + } 284 + 285 + static u64 dwc_pcie_pmu_read_lane_event_counter(struct perf_event *event) 286 + { 287 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 288 + struct pci_dev *pdev = pcie_pmu->pdev; 289 + u16 ras_des_offset = pcie_pmu->ras_des_offset; 290 + u32 val; 291 + 292 + pci_read_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_DATA, &val); 293 + 294 + return val; 295 + } 296 + 297 + static u64 dwc_pcie_pmu_read_time_based_counter(struct perf_event *event) 298 + { 299 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 300 + struct pci_dev *pdev = pcie_pmu->pdev; 301 + int event_id = DWC_PCIE_EVENT_ID(event); 302 + u16 ras_des_offset = pcie_pmu->ras_des_offset; 303 + u32 lo, hi, ss; 304 + u64 val; 305 + 306 + /* 307 + * The 64-bit value of the data counter is spread across two 308 + * registers that are not synchronized. 
In order to read them 309 + * atomically, ensure that the high 32 bits match before and after 310 + * reading the low 32 bits. 311 + */ 312 + pci_read_config_dword(pdev, 313 + ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, &hi); 314 + do { 315 + /* snapshot the high 32 bits */ 316 + ss = hi; 317 + 318 + pci_read_config_dword( 319 + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW, 320 + &lo); 321 + pci_read_config_dword( 322 + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, 323 + &hi); 324 + } while (hi != ss); 325 + 326 + val = ((u64)hi << 32) | lo; 327 + /* 328 + * The Group#1 event measures the amount of data processed in 16-byte 329 + * units. Simplify the end-user interface by multiplying the counter 330 + * at the point of read. 331 + */ 332 + if (event_id >= 0x20 && event_id <= 0x23) 333 + val *= 16; 334 + 335 + return val; 336 + } 337 + 338 + static void dwc_pcie_pmu_event_update(struct perf_event *event) 339 + { 340 + struct hw_perf_event *hwc = &event->hw; 341 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 342 + u64 delta, prev, now = 0; 343 + 344 + do { 345 + prev = local64_read(&hwc->prev_count); 346 + 347 + if (type == DWC_PCIE_LANE_EVENT) 348 + now = dwc_pcie_pmu_read_lane_event_counter(event); 349 + else if (type == DWC_PCIE_TIME_BASE_EVENT) 350 + now = dwc_pcie_pmu_read_time_based_counter(event); 351 + 352 + } while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev); 353 + 354 + delta = (now - prev) & DWC_PCIE_MAX_PERIOD; 355 + /* 32-bit counter for Lane Event Counting */ 356 + if (type == DWC_PCIE_LANE_EVENT) 357 + delta &= DWC_PCIE_LANE_EVENT_MAX_PERIOD; 358 + 359 + local64_add(delta, &event->count); 360 + } 361 + 362 + static int dwc_pcie_pmu_event_init(struct perf_event *event) 363 + { 364 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 365 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 366 + struct perf_event *sibling; 367 + u32 lane; 368 + 369 + if 
(event->attr.type != event->pmu->type) 370 + return -ENOENT; 371 + 372 + /* We don't support sampling */ 373 + if (is_sampling_event(event)) 374 + return -EINVAL; 375 + 376 + /* We cannot support task bound events */ 377 + if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) 378 + return -EINVAL; 379 + 380 + if (event->group_leader != event && 381 + !is_software_event(event->group_leader)) 382 + return -EINVAL; 383 + 384 + for_each_sibling_event(sibling, event->group_leader) { 385 + if (sibling->pmu != event->pmu && !is_software_event(sibling)) 386 + return -EINVAL; 387 + } 388 + 389 + if (type < 0 || type >= DWC_PCIE_EVENT_TYPE_MAX) 390 + return -EINVAL; 391 + 392 + if (type == DWC_PCIE_LANE_EVENT) { 393 + lane = DWC_PCIE_EVENT_LANE(event); 394 + if (lane < 0 || lane >= pcie_pmu->nr_lanes) 395 + return -EINVAL; 396 + } 397 + 398 + event->cpu = pcie_pmu->on_cpu; 399 + 400 + return 0; 401 + } 402 + 403 + static void dwc_pcie_pmu_event_start(struct perf_event *event, int flags) 404 + { 405 + struct hw_perf_event *hwc = &event->hw; 406 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 407 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 408 + 409 + hwc->state = 0; 410 + local64_set(&hwc->prev_count, 0); 411 + 412 + if (type == DWC_PCIE_LANE_EVENT) 413 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, true); 414 + else if (type == DWC_PCIE_TIME_BASE_EVENT) 415 + dwc_pcie_pmu_time_based_event_enable(pcie_pmu, true); 416 + } 417 + 418 + static void dwc_pcie_pmu_event_stop(struct perf_event *event, int flags) 419 + { 420 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 421 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 422 + struct hw_perf_event *hwc = &event->hw; 423 + 424 + if (event->hw.state & PERF_HES_STOPPED) 425 + return; 426 + 427 + if (type == DWC_PCIE_LANE_EVENT) 428 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, false); 429 + else if (type == DWC_PCIE_TIME_BASE_EVENT) 430 + 
dwc_pcie_pmu_time_based_event_enable(pcie_pmu, false); 431 + 432 + dwc_pcie_pmu_event_update(event); 433 + hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 434 + } 435 + 436 + static int dwc_pcie_pmu_event_add(struct perf_event *event, int flags) 437 + { 438 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 439 + struct pci_dev *pdev = pcie_pmu->pdev; 440 + struct hw_perf_event *hwc = &event->hw; 441 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 442 + int event_id = DWC_PCIE_EVENT_ID(event); 443 + int lane = DWC_PCIE_EVENT_LANE(event); 444 + u16 ras_des_offset = pcie_pmu->ras_des_offset; 445 + u32 ctrl; 446 + 447 + /* one counter for each type and it is in use */ 448 + if (pcie_pmu->event[type]) 449 + return -ENOSPC; 450 + 451 + pcie_pmu->event[type] = event; 452 + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 453 + 454 + if (type == DWC_PCIE_LANE_EVENT) { 455 + /* EVENT_COUNTER_DATA_REG needs clear manually */ 456 + ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 457 + FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | 458 + FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF) | 459 + FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR); 460 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 461 + ctrl); 462 + } else if (type == DWC_PCIE_TIME_BASE_EVENT) { 463 + /* 464 + * TIME_BASED_ANAL_DATA_REG is a 64 bit register, we can safely 465 + * use it with any manually controlled duration. And it is 466 + * cleared when next measurement starts. 
467 + */ 468 + ctrl = FIELD_PREP(DWC_PCIE_TIME_BASED_REPORT_SEL, event_id) | 469 + FIELD_PREP(DWC_PCIE_TIME_BASED_DURATION_SEL, 470 + DWC_PCIE_DURATION_MANUAL_CTL) | 471 + DWC_PCIE_TIME_BASED_CNT_ENABLE; 472 + pci_write_config_dword( 473 + pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_CTL, ctrl); 474 + } 475 + 476 + if (flags & PERF_EF_START) 477 + dwc_pcie_pmu_event_start(event, PERF_EF_RELOAD); 478 + 479 + perf_event_update_userpage(event); 480 + 481 + return 0; 482 + } 483 + 484 + static void dwc_pcie_pmu_event_del(struct perf_event *event, int flags) 485 + { 486 + struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 487 + enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 488 + 489 + dwc_pcie_pmu_event_stop(event, flags | PERF_EF_UPDATE); 490 + perf_event_update_userpage(event); 491 + pcie_pmu->event[type] = NULL; 492 + } 493 + 494 + static void dwc_pcie_pmu_remove_cpuhp_instance(void *hotplug_node) 495 + { 496 + cpuhp_state_remove_instance_nocalls(dwc_pcie_pmu_hp_state, hotplug_node); 497 + } 498 + 499 + /* 500 + * Find the bound DES capability device info of a PCI device. 501 + * @pdev: The PCI device.
502 + */ 503 + static struct dwc_pcie_dev_info *dwc_pcie_find_dev_info(struct pci_dev *pdev) 504 + { 505 + struct dwc_pcie_dev_info *dev_info; 506 + 507 + list_for_each_entry(dev_info, &dwc_pcie_dev_info_head, dev_node) 508 + if (dev_info->pdev == pdev) 509 + return dev_info; 510 + 511 + return NULL; 512 + } 513 + 514 + static void dwc_pcie_unregister_pmu(void *data) 515 + { 516 + struct dwc_pcie_pmu *pcie_pmu = data; 517 + 518 + perf_pmu_unregister(&pcie_pmu->pmu); 519 + } 520 + 521 + static bool dwc_pcie_match_des_cap(struct pci_dev *pdev) 522 + { 523 + const struct dwc_pcie_vendor_id *vid; 524 + u16 vsec = 0; 525 + u32 val; 526 + 527 + if (!pci_is_pcie(pdev) || !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)) 528 + return false; 529 + 530 + for (vid = dwc_pcie_vendor_ids; vid->vendor_id; vid++) { 531 + vsec = pci_find_vsec_capability(pdev, vid->vendor_id, 532 + DWC_PCIE_VSEC_RAS_DES_ID); 533 + if (vsec) 534 + break; 535 + } 536 + if (!vsec) 537 + return false; 538 + 539 + pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val); 540 + if (PCI_VNDR_HEADER_REV(val) != 0x04) 541 + return false; 542 + 543 + pci_dbg(pdev, 544 + "Detected PCIe Vendor-Specific Extended Capability RAS DES\n"); 545 + return true; 546 + } 547 + 548 + static void dwc_pcie_unregister_dev(struct dwc_pcie_dev_info *dev_info) 549 + { 550 + platform_device_unregister(dev_info->plat_dev); 551 + list_del(&dev_info->dev_node); 552 + kfree(dev_info); 553 + } 554 + 555 + static int dwc_pcie_register_dev(struct pci_dev *pdev) 556 + { 557 + struct platform_device *plat_dev; 558 + struct dwc_pcie_dev_info *dev_info; 559 + u32 bdf; 560 + 561 + bdf = PCI_DEVID(pdev->bus->number, pdev->devfn); 562 + plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf, 563 + pdev, sizeof(*pdev)); 564 + 565 + if (IS_ERR(plat_dev)) 566 + return PTR_ERR(plat_dev); 567 + 568 + dev_info = kzalloc(sizeof(*dev_info), GFP_KERNEL); 569 + if (!dev_info) 570 + return -ENOMEM; 571 + 572 + /* Cache platform device to 
handle pci device hotplug */ 573 + dev_info->plat_dev = plat_dev; 574 + dev_info->pdev = pdev; 575 + list_add(&dev_info->dev_node, &dwc_pcie_dev_info_head); 576 + 577 + return 0; 578 + } 579 + 580 + static int dwc_pcie_pmu_notifier(struct notifier_block *nb, 581 + unsigned long action, void *data) 582 + { 583 + struct device *dev = data; 584 + struct pci_dev *pdev = to_pci_dev(dev); 585 + struct dwc_pcie_dev_info *dev_info; 586 + 587 + switch (action) { 588 + case BUS_NOTIFY_ADD_DEVICE: 589 + if (!dwc_pcie_match_des_cap(pdev)) 590 + return NOTIFY_DONE; 591 + if (dwc_pcie_register_dev(pdev)) 592 + return NOTIFY_BAD; 593 + break; 594 + case BUS_NOTIFY_DEL_DEVICE: 595 + dev_info = dwc_pcie_find_dev_info(pdev); 596 + if (!dev_info) 597 + return NOTIFY_DONE; 598 + dwc_pcie_unregister_dev(dev_info); 599 + break; 600 + } 601 + 602 + return NOTIFY_OK; 603 + } 604 + 605 + static struct notifier_block dwc_pcie_pmu_nb = { 606 + .notifier_call = dwc_pcie_pmu_notifier, 607 + }; 608 + 609 + static int dwc_pcie_pmu_probe(struct platform_device *plat_dev) 610 + { 611 + struct pci_dev *pdev = plat_dev->dev.platform_data; 612 + struct dwc_pcie_pmu *pcie_pmu; 613 + char *name; 614 + u32 bdf, val; 615 + u16 vsec; 616 + int ret; 617 + 618 + vsec = pci_find_vsec_capability(pdev, pdev->vendor, 619 + DWC_PCIE_VSEC_RAS_DES_ID); 620 + pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val); 621 + bdf = PCI_DEVID(pdev->bus->number, pdev->devfn); 622 + name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf); 623 + if (!name) 624 + return -ENOMEM; 625 + 626 + pcie_pmu = devm_kzalloc(&plat_dev->dev, sizeof(*pcie_pmu), GFP_KERNEL); 627 + if (!pcie_pmu) 628 + return -ENOMEM; 629 + 630 + pcie_pmu->pdev = pdev; 631 + pcie_pmu->ras_des_offset = vsec; 632 + pcie_pmu->nr_lanes = pcie_get_width_cap(pdev); 633 + pcie_pmu->on_cpu = -1; 634 + pcie_pmu->pmu = (struct pmu){ 635 + .name = name, 636 + .parent = &pdev->dev, 637 + .module = THIS_MODULE, 638 + .attr_groups = 
dwc_pcie_attr_groups, 639 + .capabilities = PERF_PMU_CAP_NO_EXCLUDE, 640 + .task_ctx_nr = perf_invalid_context, 641 + .event_init = dwc_pcie_pmu_event_init, 642 + .add = dwc_pcie_pmu_event_add, 643 + .del = dwc_pcie_pmu_event_del, 644 + .start = dwc_pcie_pmu_event_start, 645 + .stop = dwc_pcie_pmu_event_stop, 646 + .read = dwc_pcie_pmu_event_update, 647 + }; 648 + 649 + /* Add this instance to the list used by the offline callback */ 650 + ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state, 651 + &pcie_pmu->cpuhp_node); 652 + if (ret) { 653 + pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf); 654 + return ret; 655 + } 656 + 657 + /* Unwind when platform driver removes */ 658 + ret = devm_add_action_or_reset(&plat_dev->dev, 659 + dwc_pcie_pmu_remove_cpuhp_instance, 660 + &pcie_pmu->cpuhp_node); 661 + if (ret) 662 + return ret; 663 + 664 + ret = perf_pmu_register(&pcie_pmu->pmu, name, -1); 665 + if (ret) { 666 + pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf); 667 + return ret; 668 + } 669 + ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu, 670 + pcie_pmu); 671 + if (ret) 672 + return ret; 673 + 674 + return 0; 675 + } 676 + 677 + static int dwc_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_node) 678 + { 679 + struct dwc_pcie_pmu *pcie_pmu; 680 + 681 + pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node); 682 + if (pcie_pmu->on_cpu == -1) 683 + pcie_pmu->on_cpu = cpumask_local_spread( 684 + 0, dev_to_node(&pcie_pmu->pdev->dev)); 685 + 686 + return 0; 687 + } 688 + 689 + static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_node) 690 + { 691 + struct dwc_pcie_pmu *pcie_pmu; 692 + struct pci_dev *pdev; 693 + int node; 694 + cpumask_t mask; 695 + unsigned int target; 696 + 697 + pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node); 698 + /* Nothing to do if this CPU doesn't own the PMU */ 699 + if (cpu != pcie_pmu->on_cpu) 700 + return 0; 701 + 
702 + pcie_pmu->on_cpu = -1; 703 + pdev = pcie_pmu->pdev; 704 + node = dev_to_node(&pdev->dev); 705 + if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) && 706 + cpumask_andnot(&mask, &mask, cpumask_of(cpu))) 707 + target = cpumask_any(&mask); 708 + else 709 + target = cpumask_any_but(cpu_online_mask, cpu); 710 + 711 + if (target >= nr_cpu_ids) { 712 + pci_err(pdev, "There is no CPU to set\n"); 713 + return 0; 714 + } 715 + 716 + /* This PMU does NOT support interrupt, just migrate context. */ 717 + perf_pmu_migrate_context(&pcie_pmu->pmu, cpu, target); 718 + pcie_pmu->on_cpu = target; 719 + 720 + return 0; 721 + } 722 + 723 + static struct platform_driver dwc_pcie_pmu_driver = { 724 + .probe = dwc_pcie_pmu_probe, 725 + .driver = {.name = "dwc_pcie_pmu",}, 726 + }; 727 + 728 + static int __init dwc_pcie_pmu_init(void) 729 + { 730 + struct pci_dev *pdev = NULL; 731 + bool found = false; 732 + int ret; 733 + 734 + for_each_pci_dev(pdev) { 735 + if (!dwc_pcie_match_des_cap(pdev)) 736 + continue; 737 + 738 + ret = dwc_pcie_register_dev(pdev); 739 + if (ret) { 740 + pci_dev_put(pdev); 741 + return ret; 742 + } 743 + 744 + found = true; 745 + } 746 + if (!found) 747 + return -ENODEV; 748 + 749 + ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, 750 + "perf/dwc_pcie_pmu:online", 751 + dwc_pcie_pmu_online_cpu, 752 + dwc_pcie_pmu_offline_cpu); 753 + if (ret < 0) 754 + return ret; 755 + 756 + dwc_pcie_pmu_hp_state = ret; 757 + 758 + ret = platform_driver_register(&dwc_pcie_pmu_driver); 759 + if (ret) 760 + goto platform_driver_register_err; 761 + 762 + ret = bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb); 763 + if (ret) 764 + goto platform_driver_register_err; 765 + notify = true; 766 + 767 + return 0; 768 + 769 + platform_driver_register_err: 770 + cpuhp_remove_multi_state(dwc_pcie_pmu_hp_state); 771 + 772 + return ret; 773 + } 774 + 775 + static void __exit dwc_pcie_pmu_exit(void) 776 + { 777 + struct dwc_pcie_dev_info *dev_info, *tmp; 778 + 779 + if 
(notify) 780 + bus_unregister_notifier(&pci_bus_type, &dwc_pcie_pmu_nb); 781 + list_for_each_entry_safe(dev_info, tmp, &dwc_pcie_dev_info_head, dev_node) 782 + dwc_pcie_unregister_dev(dev_info); 783 + platform_driver_unregister(&dwc_pcie_pmu_driver); 784 + cpuhp_remove_multi_state(dwc_pcie_pmu_hp_state); 785 + } 786 + 787 + module_init(dwc_pcie_pmu_init); 788 + module_exit(dwc_pcie_pmu_exit); 789 + 790 + MODULE_DESCRIPTION("PMU driver for DesignWare Cores PCI Express Controller"); 791 + MODULE_AUTHOR("Shuai Xue <xueshuai@linux.alibaba.com>"); 792 + MODULE_LICENSE("GPL v2");
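`dwc_pcie_pmu_read_time_based_counter()` above reads a 64-bit counter spread across two unsynchronized 32-bit registers by re-reading the high half until it is stable. A minimal user-space sketch of that retry loop (the simulated `reg_lo`/`reg_hi` registers and `read_reg()` helper are stand-ins for config-space reads, not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Simulated halves of a 64-bit counter; a real device would update
 * these concurrently with the reader. */
static uint32_t reg_lo = 0x89abcdef, reg_hi = 0x01234567;

/* Stand-in for pci_read_config_dword() in this sketch. */
static uint32_t read_reg(const uint32_t *reg)
{
	return *reg;
}

/*
 * Read a 64-bit value whose 32-bit halves cannot be latched together:
 * snapshot the high word, read low then high again, and retry until
 * the high word is unchanged, so lo and hi belong to the same value.
 */
static uint64_t read_split_counter64(void)
{
	uint32_t lo, hi, ss;

	hi = read_reg(&reg_hi);
	do {
		ss = hi;		/* snapshot the high 32 bits */
		lo = read_reg(&reg_lo);
		hi = read_reg(&reg_hi);
	} while (hi != ss);	/* high half moved: lo may be torn, retry */

	return ((uint64_t)hi << 32) | lo;
}
```

The same hi/lo/hi pattern appears wherever a wide counter is exposed as two narrower registers without a latch.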
+45
drivers/perf/fsl_imx8_ddr_perf.c
··· 19 19 #define COUNTER_READ 0x20 20 20 21 21 #define COUNTER_DPCR1 0x30 22 + #define COUNTER_MUX_CNTL 0x50 23 + #define COUNTER_MASK_COMP 0x54 22 24 23 25 #define CNTL_OVER 0x1 24 26 #define CNTL_CLEAR 0x2 ··· 33 31 #define CNTL_CP_MASK (0xFF << CNTL_CP_SHIFT) 34 32 #define CNTL_CSV_SHIFT 24 35 33 #define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT) 34 + 35 + #define READ_PORT_SHIFT 0 36 + #define READ_PORT_MASK (0x7 << READ_PORT_SHIFT) 37 + #define READ_CHANNEL_REVERT 0x00000008 /* bit 3 for read channel select */ 38 + #define WRITE_PORT_SHIFT 8 39 + #define WRITE_PORT_MASK (0x7 << WRITE_PORT_SHIFT) 40 + #define WRITE_CHANNEL_REVERT 0x00000800 /* bit 11 for write channel select */ 36 41 37 42 #define EVENT_CYCLES_ID 0 38 43 #define EVENT_CYCLES_COUNTER 0 ··· 59 50 /* DDR Perf hardware feature */ 60 51 #define DDR_CAP_AXI_ID_FILTER 0x1 /* support AXI ID filter */ 61 52 #define DDR_CAP_AXI_ID_FILTER_ENHANCED 0x3 /* support enhanced AXI ID filter */ 53 + #define DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER 0x4 /* support AXI ID PORT CHANNEL filter */ 62 54 63 55 struct fsl_ddr_devtype_data { 64 56 unsigned int quirks; /* quirks needed for different DDR Perf core */ ··· 92 82 .identifier = "i.MX8MP", 93 83 }; 94 84 85 + static const struct fsl_ddr_devtype_data imx8dxl_devtype_data = { 86 + .quirks = DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER, 87 + .identifier = "i.MX8DXL", 88 + }; 89 + 95 90 static const struct of_device_id imx_ddr_pmu_dt_ids[] = { 96 91 { .compatible = "fsl,imx8-ddr-pmu", .data = &imx8_devtype_data}, 97 92 { .compatible = "fsl,imx8m-ddr-pmu", .data = &imx8m_devtype_data}, ··· 104 89 { .compatible = "fsl,imx8mm-ddr-pmu", .data = &imx8mm_devtype_data}, 105 90 { .compatible = "fsl,imx8mn-ddr-pmu", .data = &imx8mn_devtype_data}, 106 91 { .compatible = "fsl,imx8mp-ddr-pmu", .data = &imx8mp_devtype_data}, 92 + { .compatible = "fsl,imx8dxl-ddr-pmu", .data = &imx8dxl_devtype_data}, 107 93 { /* sentinel */ } 108 94 }; 109 95 MODULE_DEVICE_TABLE(of, imx_ddr_pmu_dt_ids); ··· 160 
144 enum ddr_perf_filter_capabilities { 161 145 PERF_CAP_AXI_ID_FILTER = 0, 162 146 PERF_CAP_AXI_ID_FILTER_ENHANCED, 147 + PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER, 163 148 PERF_CAP_AXI_ID_FEAT_MAX, 164 149 }; 165 150 ··· 174 157 case PERF_CAP_AXI_ID_FILTER_ENHANCED: 175 158 quirks &= DDR_CAP_AXI_ID_FILTER_ENHANCED; 176 159 return quirks == DDR_CAP_AXI_ID_FILTER_ENHANCED; 160 + case PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER: 161 + return !!(quirks & DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER); 177 162 default: 178 163 WARN(1, "unknown filter cap %d\n", cap); 179 164 } ··· 206 187 static struct attribute *ddr_perf_filter_cap_attr[] = { 207 188 PERF_FILTER_EXT_ATTR_ENTRY(filter, PERF_CAP_AXI_ID_FILTER), 208 189 PERF_FILTER_EXT_ATTR_ENTRY(enhanced_filter, PERF_CAP_AXI_ID_FILTER_ENHANCED), 190 + PERF_FILTER_EXT_ATTR_ENTRY(super_filter, PERF_CAP_AXI_ID_PORT_CHANNEL_FILTER), 209 191 NULL, 210 192 }; 211 193 ··· 292 272 PMU_FORMAT_ATTR(event, "config:0-7"); 293 273 PMU_FORMAT_ATTR(axi_id, "config1:0-15"); 294 274 PMU_FORMAT_ATTR(axi_mask, "config1:16-31"); 275 + PMU_FORMAT_ATTR(axi_port, "config2:0-2"); 276 + PMU_FORMAT_ATTR(axi_channel, "config2:3-3"); 295 277 296 278 static struct attribute *ddr_perf_format_attrs[] = { 297 279 &format_attr_event.attr, 298 280 &format_attr_axi_id.attr, 299 281 &format_attr_axi_mask.attr, 282 + &format_attr_axi_port.attr, 283 + &format_attr_axi_channel.attr, 300 284 NULL, 301 285 }; 302 286 ··· 554 530 int counter; 555 531 int cfg = event->attr.config; 556 532 int cfg1 = event->attr.config1; 533 + int cfg2 = event->attr.config2; 557 534 558 535 if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER) { 559 536 int i; ··· 576 551 if (counter < 0) { 577 552 dev_dbg(pmu->dev, "There are not enough counters\n"); 578 553 return -EOPNOTSUPP; 554 + } 555 + 556 + if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER) { 557 + if (ddr_perf_is_filtered(event)) { 558 + /* revert axi id masking(axi_mask) value */ 559 + cfg1 ^= AXI_MASKING_REVERT; 560 + 
writel(cfg1, pmu->base + COUNTER_MASK_COMP + ((counter - 1) << 4)); 561 + 562 + if (cfg == 0x41) { 563 + /* revert axi read channel(axi_channel) value */ 564 + cfg2 ^= READ_CHANNEL_REVERT; 565 + cfg2 |= FIELD_PREP(READ_PORT_MASK, cfg2); 566 + } else { 567 + /* revert axi write channel(axi_channel) value */ 568 + cfg2 ^= WRITE_CHANNEL_REVERT; 569 + cfg2 |= FIELD_PREP(WRITE_PORT_MASK, cfg2); 570 + } 571 + 572 + writel(cfg2, pmu->base + COUNTER_MUX_CNTL + ((counter - 1) << 4)); 573 + } 579 574 } 580 575 581 576 pmu->events[counter] = event;
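The new `axi_port`/`axi_channel` format attributes place the port select in `config2` bits 2:0 and the channel select in bit 3, matching the `PMU_FORMAT_ATTR` strings above. A sketch of how a user-supplied `config2` decomposes (the helper names are illustrative, not driver code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit layout taken from the format strings above:
 *   axi_port    -> config2:0-2
 *   axi_channel -> config2:3
 */
#define AXI_PORT_MASK	0x7u		/* bits 2:0 */
#define AXI_CHANNEL_BIT	(1u << 3)	/* bit 3 */

static unsigned int axi_port(uint32_t config2)
{
	return config2 & AXI_PORT_MASK;
}

static unsigned int axi_channel(uint32_t config2)
{
	return !!(config2 & AXI_CHANNEL_BIT);
}
```

perf passes these through unchanged (`perf stat -e ddr0/...,axi_port=3,axi_channel=1/`), so the driver only has to mask and shift them into the mux control register.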
+3 -3
drivers/perf/fsl_imx9_ddr_perf.c
··· 617 617 618 618 platform_set_drvdata(pdev, pmu); 619 619 620 - pmu->id = ida_simple_get(&ddr_ida, 0, 0, GFP_KERNEL); 620 + pmu->id = ida_alloc(&ddr_ida, GFP_KERNEL); 621 621 name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME "%d", pmu->id); 622 622 if (!name) { 623 623 ret = -ENOMEM; ··· 674 674 cpuhp_remove_multi_state(pmu->cpuhp_state); 675 675 cpuhp_state_err: 676 676 format_string_err: 677 - ida_simple_remove(&ddr_ida, pmu->id); 677 + ida_free(&ddr_ida, pmu->id); 678 678 dev_warn(&pdev->dev, "i.MX9 DDR Perf PMU failed (%d), disabled\n", ret); 679 679 return ret; 680 680 } ··· 688 688 689 689 perf_pmu_unregister(&pmu->pmu); 690 690 691 - ida_simple_remove(&ddr_ida, pmu->id); 691 + ida_free(&ddr_ida, pmu->id); 692 692 693 693 return 0; 694 694 }
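`ida_simple_get(&ddr_ida, 0, 0, GFP_KERNEL)` and its replacement `ida_alloc(&ddr_ida, GFP_KERNEL)` both hand back the smallest unused ID starting from 0; the new call just drops the redundant range arguments. A toy model of those semantics (a fixed-size bitmap, nothing like the kernel's xarray-backed implementation):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IDS 64
static uint64_t ida_bits;	/* bit n set => ID n in use */

/* Hand back the smallest unused ID, as ida_alloc() does; -1 if full. */
static int toy_ida_alloc(void)
{
	for (int id = 0; id < MAX_IDS; id++) {
		if (!(ida_bits & (1ULL << id))) {
			ida_bits |= 1ULL << id;
			return id;
		}
	}
	return -1;
}

static void toy_ida_free(int id)
{
	ida_bits &= ~(1ULL << id);
}
```

Because freed IDs are reused lowest-first, a re-probed PMU gets back a small index for its `DDR_PERF_DEV_NAME "%d"` name rather than a monotonically growing one.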
+2 -2
drivers/perf/hisilicon/hisi_uncore_uc_pmu.c
··· 383 383 HISI_PMU_EVENT_ATTR(cpu_rd, 0x10), 384 384 HISI_PMU_EVENT_ATTR(cpu_rd64, 0x17), 385 385 HISI_PMU_EVENT_ATTR(cpu_rs64, 0x19), 386 - HISI_PMU_EVENT_ATTR(cpu_mru, 0x1a), 387 - HISI_PMU_EVENT_ATTR(cycles, 0x9c), 386 + HISI_PMU_EVENT_ATTR(cpu_mru, 0x1c), 387 + HISI_PMU_EVENT_ATTR(cycles, 0x95), 388 388 HISI_PMU_EVENT_ATTR(spipe_hit, 0xb3), 389 389 HISI_PMU_EVENT_ATTR(hpipe_hit, 0xdb), 390 390 HISI_PMU_EVENT_ATTR(cring_rxdat_cnt, 0xfa),
+2
include/linux/pci.h
··· 1239 1239 int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val); 1240 1240 int pci_write_config_word(const struct pci_dev *dev, int where, u16 val); 1241 1241 int pci_write_config_dword(const struct pci_dev *dev, int where, u32 val); 1242 + void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos, 1243 + u32 clear, u32 set); 1242 1244 1243 1245 int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val); 1244 1246 int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val);
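The helper promoted into `pci.h` performs a read-modify-write on one config dword, which is how the DWC PMU driver flips `DWC_PCIE_CNT_ENABLE` without disturbing the rest of the control register. The core update, sketched over a plain value rather than real config space:

```c
#include <assert.h>
#include <stdint.h>

/*
 * The read-modify-write step behind pci_clear_and_set_config_dword():
 * drop the bits in 'clear', then OR in 'set', leaving all other bits
 * of the dword untouched.
 */
static uint32_t clear_and_set_dword(uint32_t val, uint32_t clear, uint32_t set)
{
	val &= ~clear;
	val |= set;
	return val;
}
```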
+2
include/linux/pci_ids.h
··· 2605 2605 #define PCI_VENDOR_ID_TEKRAM 0x1de1 2606 2606 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29 2607 2607 2608 + #define PCI_VENDOR_ID_ALIBABA 0x1ded 2609 + 2608 2610 #define PCI_VENDOR_ID_TEHUTI 0x1fc9 2609 2611 #define PCI_DEVICE_ID_TEHUTI_3009 0x3009 2610 2612 #define PCI_DEVICE_ID_TEHUTI_3010 0x3010
+22 -6
include/linux/perf/arm_pmu.h
··· 60 60 DECLARE_BITMAP(used_mask, ARMPMU_MAX_HWEVENTS); 61 61 62 62 /* 63 - * Hardware lock to serialize accesses to PMU registers. Needed for the 64 - * read/modify/write sequences. 65 - */ 66 - raw_spinlock_t pmu_lock; 67 - 68 - /* 69 63 * When using percpu IRQs, we need a percpu dev_id. Place it here as we 70 64 * already have to allocate this struct per cpu. 71 65 */ ··· 182 188 183 189 #define ARMV8_SPE_PDEV_NAME "arm,spe-v1" 184 190 #define ARMV8_TRBE_PDEV_NAME "arm,trbe" 191 + 192 + /* Why does everything I do descend into this? */ 193 + #define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 194 + (lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi 195 + 196 + #define _GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 197 + __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) 198 + 199 + #define GEN_PMU_FORMAT_ATTR(name) \ 200 + PMU_FORMAT_ATTR(name, \ 201 + _GEN_PMU_FORMAT_ATTR(ATTR_CFG_FLD_##name##_CFG, \ 202 + ATTR_CFG_FLD_##name##_LO, \ 203 + ATTR_CFG_FLD_##name##_HI)) 204 + 205 + #define _ATTR_CFG_GET_FLD(attr, cfg, lo, hi) \ 206 + ((((attr)->cfg) >> lo) & GENMASK_ULL(hi - lo, 0)) 207 + 208 + #define ATTR_CFG_GET_FLD(attr, name) \ 209 + _ATTR_CFG_GET_FLD(attr, \ 210 + ATTR_CFG_FLD_##name##_CFG, \ 211 + ATTR_CFG_FLD_##name##_LO, \ 212 + ATTR_CFG_FLD_##name##_HI) 185 213 186 214 #endif /* __ARM_PMU_H__ */
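`GEN_PMU_FORMAT_ATTR()` builds the sysfs format string ("cfg:lo-hi", or "cfg:lo" for single-bit fields) from three `ATTR_CFG_FLD_*` tokens, and `ATTR_CFG_GET_FLD()` pulls the field back out of the config word. A user-space replica of the two-level expansion (the trailing `"\n"` handling of the original is omitted for brevity; the `th` field definition is hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Two-level expansion as in GEN_PMU_FORMAT_ATTR(): the outer macro
 * expands the FLD_* tokens, the inner one stringizes the results.
 */
#define __GEN_FMT(cfg, lo, hi) \
	((lo) == (hi) ? #cfg ":" #lo : #cfg ":" #lo "-" #hi)
#define _GEN_FMT(cfg, lo, hi)	__GEN_FMT(cfg, lo, hi)
#define GEN_FMT(name) \
	_GEN_FMT(FLD_##name##_CFG, FLD_##name##_LO, FLD_##name##_HI)

/* Hypothetical field definition: "th" occupies config1 bits 16:5. */
#define FLD_th_CFG	config1
#define FLD_th_LO	5
#define FLD_th_HI	16

/* Field extraction in the style of ATTR_CFG_GET_FLD(). */
static uint64_t get_fld(uint64_t cfg, int lo, int hi)
{
	return (cfg >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}
```

The indirection through `_GEN_FMT` matters: it forces the `FLD_*` macros to expand before `#` stringizes them, so the result is `"config1:5-16"` rather than the literal token names.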
+20 -14
include/linux/perf/arm_pmuv3.h
··· 215 215 #define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ 216 216 #define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ 217 217 #define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */ 218 - #define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */ 219 - #define ARMV8_PMU_PMCR_N_MASK 0x1f 220 - #define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */ 218 + #define ARMV8_PMU_PMCR_N GENMASK(15, 11) /* Number of counters supported */ 219 + /* Mask for writable bits */ 220 + #define ARMV8_PMU_PMCR_MASK (ARMV8_PMU_PMCR_E | ARMV8_PMU_PMCR_P | \ 221 + ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_D | \ 222 + ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \ 223 + ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP) 221 224 222 225 /* 223 226 * PMOVSR: counters overflow flag status reg 224 227 */ 225 - #define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */ 226 - #define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK 228 + #define ARMV8_PMU_OVSR_P GENMASK(30, 0) 229 + #define ARMV8_PMU_OVSR_C BIT(31) 230 + /* Mask for writable bits is both P and C fields */ 231 + #define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C) 227 232 228 233 /* 229 234 * PMXEVTYPER: Event selection reg 230 235 */ 231 - #define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */ 232 - #define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */ 236 + #define ARMV8_PMU_EVTYPE_EVENT GENMASK(15, 0) /* Mask for EVENT bits */ 237 + #define ARMV8_PMU_EVTYPE_TH GENMASK_ULL(43, 32) /* arm64 only */ 238 + #define ARMV8_PMU_EVTYPE_TC GENMASK_ULL(63, 61) /* arm64 only */ 233 239 234 240 /* 235 241 * Event filters for PMUv3 ··· 250 244 /* 251 245 * PMUSERENR: user enable reg 252 246 */ 253 - #define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */ 254 247 #define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */ 255 248 #define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */ 256 249 
#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */ 257 250 #define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */ 251 + /* Mask for writable bits */ 252 + #define ARMV8_PMU_USERENR_MASK (ARMV8_PMU_USERENR_EN | ARMV8_PMU_USERENR_SW | \ 253 + ARMV8_PMU_USERENR_CR | ARMV8_PMU_USERENR_ER) 258 254 259 255 /* PMMIR_EL1.SLOTS mask */ 260 - #define ARMV8_PMU_SLOTS_MASK 0xff 261 - 262 - #define ARMV8_PMU_BUS_SLOTS_SHIFT 8 263 - #define ARMV8_PMU_BUS_SLOTS_MASK 0xff 264 - #define ARMV8_PMU_BUS_WIDTH_SHIFT 16 265 - #define ARMV8_PMU_BUS_WIDTH_MASK 0xf 256 + #define ARMV8_PMU_SLOTS GENMASK(7, 0) 257 + #define ARMV8_PMU_BUS_SLOTS GENMASK(15, 8) 258 + #define ARMV8_PMU_BUS_WIDTH GENMASK(19, 16) 259 + #define ARMV8_PMU_THWIDTH GENMASK(23, 20) 266 260 267 261 /* 268 262 * This code is really good
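The conversion above replaces shift/mask pairs like `ARMV8_PMU_PMCR_N_SHIFT`/`_N_MASK` with a single `GENMASK(15, 11)` definition that `FIELD_GET()` can extract from directly. Simplified 32-bit re-implementations of those two macros (sketches, not the `linux/bits.h`/`linux/bitfield.h` originals; `__builtin_ctz` is a GCC/Clang builtin that recovers the shift from the mask):

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous mask covering bits h..l inclusive. */
#define GENMASK_U32(h, l) \
	((~0u >> (31 - (h))) & (~0u << (l)))

/* Mask out the field, then shift it down to bit 0. */
#define FIELD_GET_U32(mask, reg) \
	(((reg) & (mask)) >> __builtin_ctz(mask))

#define PMCR_N	GENMASK_U32(15, 11)	/* same field as ARMV8_PMU_PMCR_N */

static unsigned int pmcr_n(uint32_t pmcr)
{
	return FIELD_GET_U32(PMCR_N, pmcr);
}
```

Encoding position and width in one mask removes a whole class of mismatched shift/mask bugs, which is the point of the GENMASK conversions in this series.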
+26 -17
tools/include/perf/arm_pmuv3.h
··· 218 218 #define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ 219 219 #define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ 220 220 #define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */ 221 - #define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */ 222 - #define ARMV8_PMU_PMCR_N_MASK 0x1f 223 - #define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */ 221 + #define ARMV8_PMU_PMCR_N GENMASK(15, 11) /* Number of counters supported */ 222 + /* Mask for writable bits */ 223 + #define ARMV8_PMU_PMCR_MASK (ARMV8_PMU_PMCR_E | ARMV8_PMU_PMCR_P | \ 224 + ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_D | \ 225 + ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \ 226 + ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP) 224 227 225 228 /* 226 229 * PMOVSR: counters overflow flag status reg 227 230 */ 228 - #define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */ 229 - #define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK 231 + #define ARMV8_PMU_OVSR_P GENMASK(30, 0) 232 + #define ARMV8_PMU_OVSR_C BIT(31) 233 + /* Mask for writable bits is both P and C fields */ 234 + #define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C) 230 235 231 236 /* 232 237 * PMXEVTYPER: Event selection reg 233 238 */ 234 - #define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */ 235 - #define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */ 239 + #define ARMV8_PMU_EVTYPE_EVENT GENMASK(15, 0) /* Mask for EVENT bits */ 240 + #define ARMV8_PMU_EVTYPE_TH GENMASK_ULL(43, 32) /* arm64 only */ 241 + #define ARMV8_PMU_EVTYPE_TC GENMASK_ULL(63, 61) /* arm64 only */ 236 242 237 243 /* 238 244 * Event filters for PMUv3 239 245 */ 240 - #define ARMV8_PMU_EXCLUDE_EL1 (1U << 31) 241 - #define ARMV8_PMU_EXCLUDE_EL0 (1U << 30) 242 - #define ARMV8_PMU_INCLUDE_EL2 (1U << 27) 246 + #define ARMV8_PMU_EXCLUDE_EL1 (1U << 31) 247 + #define ARMV8_PMU_EXCLUDE_EL0 (1U << 30) 248 + #define ARMV8_PMU_EXCLUDE_NS_EL1 (1U << 29) 249 + #define ARMV8_PMU_EXCLUDE_NS_EL0 (1U << 28) 250 + 
#define ARMV8_PMU_INCLUDE_EL2 (1U << 27) 251 + #define ARMV8_PMU_EXCLUDE_EL3 (1U << 26) 243 252 244 253 /* 245 254 * PMUSERENR: user enable reg 246 255 */ 247 - #define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */ 248 256 #define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */ 249 257 #define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */ 250 258 #define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */ 251 259 #define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */ 260 + /* Mask for writable bits */ 261 + #define ARMV8_PMU_USERENR_MASK (ARMV8_PMU_USERENR_EN | ARMV8_PMU_USERENR_SW | \ 262 + ARMV8_PMU_USERENR_CR | ARMV8_PMU_USERENR_ER) 252 263 253 264 /* PMMIR_EL1.SLOTS mask */ 254 - #define ARMV8_PMU_SLOTS_MASK 0xff 255 - 256 - #define ARMV8_PMU_BUS_SLOTS_SHIFT 8 257 - #define ARMV8_PMU_BUS_SLOTS_MASK 0xff 258 - #define ARMV8_PMU_BUS_WIDTH_SHIFT 16 259 - #define ARMV8_PMU_BUS_WIDTH_MASK 0xf 265 + #define ARMV8_PMU_SLOTS GENMASK(7, 0) 266 + #define ARMV8_PMU_BUS_SLOTS GENMASK(15, 8) 267 + #define ARMV8_PMU_BUS_WIDTH GENMASK(19, 16) 268 + #define ARMV8_PMU_THWIDTH GENMASK(23, 20) 260 269 261 270 /* 262 271 * This code is really good
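Masks that cover bits above 31, such as the PMEVTYPER threshold field TH at bits 43:32, have to be built with the 64-bit `GENMASK_ULL()` variant: a plain `GENMASK()` would shift a 32-bit-wide type past its width on 32-bit builds. A hedged sketch of the 64-bit form (simplified, not the `linux/bits.h` macro):

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit contiguous mask covering bits h..l inclusive. */
#define GENMASK_ULL64(h, l) \
	((~0ULL >> (63 - (h))) & (~0ULL << (l)))

/* Extract the threshold field TH (bits 43:32) from a PMEVTYPER value. */
static uint64_t evtype_th(uint64_t evtyper)
{
	return (evtyper & GENMASK_ULL64(43, 32)) >> 32;
}
```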
+2 -3
tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
··· 42 42 43 43 static uint64_t get_pmcr_n(uint64_t pmcr) 44 44 { 45 - return (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK; 45 + return FIELD_GET(ARMV8_PMU_PMCR_N, pmcr); 46 46 } 47 47 48 48 static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n) 49 49 { 50 - *pmcr = *pmcr & ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT); 51 - *pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT); 50 + u64p_replace_bits((__u64 *) pmcr, pmcr_n, ARMV8_PMU_PMCR_N); 52 51 } 53 52 54 53 static uint64_t get_counters_mask(uint64_t n)