Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (29 commits)
perf/dwc_pcie: Fix use of uninitialized variable
Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU
Documentation: hisi-pmu: Fix of minor format error
drivers/perf: hisi: Add support for L3C PMU v3
drivers/perf: hisi: Refactor the event configuration of L3C PMU
drivers/perf: hisi: Extend the field of tt_core
drivers/perf: hisi: Extract the event filter check of L3C PMU
drivers/perf: hisi: Simplify the probe process of each L3C PMU version
drivers/perf: hisi: Export hisi_uncore_pmu_isr()
drivers/perf: hisi: Relax the event ID check in the framework
perf: Fujitsu: Add the Uncore PMU driver
perf/arm-cmn: Fix CMN S3 DTM offset
perf: arm_spe: Prevent overflow in PERF_IDX2OFF()
coresight: trbe: Prevent overflow in PERF_IDX2OFF()
MAINTAINERS: Remove myself from HiSilicon PMU maintainers
drivers/perf: hisi: Add support for HiSilicon MN PMU driver
drivers/perf: hisi: Add support for HiSilicon NoC PMU
perf: arm_pmuv3: Factor out PMCCNTR_EL0 use conditions
arm64/boot: Enable EL2 requirements for SPE_FEAT_FDS
arm64/boot: Factor out a macro to check SPE version
...

+2399 -174
+2 -2
Documentation/admin-guide/perf/dwc_pcie_pmu.rst
··· 16 16 17 17 - one 64-bit counter for Time Based Analysis (RX/TX data throughput and 18 18 time spent in each low-power LTSSM state) and 19 - - one 32-bit counter for Event Counting (error and non-error events for 20 - a specified lane) 19 + - one 32-bit counter per event for Event Counting (error and non-error 20 + events for a specified lane) 21 21 22 22 Note: There is no interrupt for counter overflow. 23 23
+110
Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0-only 2 + 3 + ================================================ 4 + Fujitsu Uncore Performance Monitoring Unit (PMU) 5 + ================================================ 6 + 7 + This driver supports the Uncore MAC PMUs and the Uncore PCI PMUs found 8 + in Fujitsu chips. 9 + Each MAC PMU on these chips is exposed as an uncore perf PMU with device name 10 + mac_iod<iod>_mac<mac>_ch<ch>. 11 + Each PCI PMU on these chips is exposed as an uncore perf PMU with device name 12 + pci_iod<iod>_pci<pci>. 13 + 14 + The driver provides a description of its available events and configuration 15 + options in sysfs, see /sys/bus/event_source/devices/mac_iod<iod>_mac<mac>_ch<ch>/ 16 + and /sys/bus/event_source/devices/pci_iod<iod>_pci<pci>/. 17 + This driver exports: 18 + - formats, used by perf user space and other tools to configure events 19 + - events, used by perf user space and other tools to create events 20 + symbolically, e.g.: 21 + perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls 22 + perf stat -a -e pci_iod0_pci0/event=0x24/ ls 23 + - cpumask, used by perf user space and other tools to know on which CPUs 24 + to open the events 25 + 26 + This driver supports the following events for MAC: 27 + - cycles 28 + This event counts MAC cycles at MAC frequency. 29 + - read-count 30 + This event counts the number of read requests to MAC. 31 + - read-count-request 32 + This event counts the number of read requests including retry to MAC. 33 + - read-count-return 34 + This event counts the number of responses to read requests to MAC. 35 + - read-count-request-pftgt 36 + This event counts the number of read requests including retry with PFTGT 37 + flag. 38 + - read-count-request-normal 39 + This event counts the number of read requests including retry without PFTGT 40 + flag. 41 + - read-count-return-pftgt-hit 42 + This event counts the number of responses to read requests which hit the 43 + PFTGT buffer. 
44 + - read-count-return-pftgt-miss 45 + This event counts the number of responses to read requests which miss the 46 + PFTGT buffer. 47 + - read-wait 48 + This event counts outstanding read requests issued by DDR memory controller 49 + per cycle. 50 + - write-count 51 + This event counts the number of write requests to MAC (including zero write, 52 + full write, partial write, write cancel). 53 + - write-count-write 54 + This event counts the number of full write requests to MAC (not including 55 + zero write). 56 + - write-count-pwrite 57 + This event counts the number of partial write requests to MAC. 58 + - memory-read-count 59 + This event counts the number of read requests from MAC to memory. 60 + - memory-write-count 61 + This event counts the number of full write requests from MAC to memory. 62 + - memory-pwrite-count 63 + This event counts the number of partial write requests from MAC to memory. 64 + - ea-mac 65 + This event counts energy consumption of MAC. 66 + - ea-memory 67 + This event counts energy consumption of memory. 68 + - ea-memory-mac-write 69 + This event counts the number of write requests from MAC to memory. 70 + - ea-ha 71 + This event counts energy consumption of HA. 72 + 73 + 'ea' is the abbreviation for 'Energy Analyzer'. 74 + 75 + Examples for use with perf:: 76 + 77 + perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls 78 + 79 + And, this driver supports the following events for PCI: 80 + - pci-port0-cycles 81 + This event counts PCI cycles at PCI frequency in port0. 82 + - pci-port0-read-count 83 + This event counts read transactions for data transfer in port0. 84 + - pci-port0-read-count-bus 85 + This event counts read transactions for bus usage in port0. 86 + - pci-port0-write-count 87 + This event counts write transactions for data transfer in port0. 88 + - pci-port0-write-count-bus 89 + This event counts write transactions for bus usage in port0. 90 + - pci-port1-cycles 91 + This event counts PCI cycles at PCI frequency in port1. 
92 + - pci-port1-read-count 93 + This event counts read transactions for data transfer in port1. 94 + - pci-port1-read-count-bus 95 + This event counts read transactions for bus usage in port1. 96 + - pci-port1-write-count 97 + This event counts write transactions for data transfer in port1. 98 + - pci-port1-write-count-bus 99 + This event counts write transactions for bus usage in port1. 100 + - ea-pci 101 + This event counts energy consumption of PCI. 102 + 103 + 'ea' is the abbreviation for 'Energy Analyzer'. 104 + 105 + Examples for use with perf:: 106 + 107 + perf stat -e pci_iod0_pci0/ea-pci/ ls 108 + 109 + Given that these are uncore PMUs, the driver does not support sampling; 110 + therefore "perf record" will not work. Per-task perf sessions are not supported.
+47 -2
Documentation/admin-guide/perf/hisi-pmu.rst
··· 18 18 Each device PMU has separate registers for event counting, control and 19 19 interrupt, and the PMU driver shall register perf PMU drivers like L3C, 20 20 HHA and DDRC etc. The available events and configuration options shall 21 - be described in the sysfs, see: 21 + be described in the sysfs, see:: 22 22 23 - /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>. 23 + /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}> 24 + 24 25 The "perf list" command shall list the available events from sysfs. 25 26 26 27 Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU ··· 112 111 - 2'b10: count the events which sent to the uring (non-MATA) channel; 113 112 - 2'b00: default value, count the events which sent to the both uring and 114 113 uring_ext channel; 114 + 115 + 6. ch: NoC PMU supports filtering the event counts of a certain transaction 116 + channel with this option. The currently supported channels are as follows: 117 + 118 + - 3'b010: Request channel 119 + - 3'b100: Snoop channel 120 + - 3'b110: Response channel 121 + - 3'b111: Data channel 122 + 123 + 7. tt_en: NoC PMU supports counting only transactions that have tracetag set 124 + if this option is set. See the 2nd list for more information about tracetag. 125 + 126 + For HiSilicon uncore PMU v3, whose identifier is 0x40, some uncore PMUs are 127 + further divided into parts for finer granularity of tracing. Each part has its 128 + own dedicated PMU, and all such PMUs together cover the monitoring job of events 129 + on a particular uncore device. Such PMUs are described in sysfs with a slightly 130 + changed name format:: 131 + 132 + /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}> 133 + 134 + Z is the sub-id, indicating a different PMU for each part of the hardware device. 135 + 136 + Usage of most PMUs with different sub-ids is identical. Specifically, the L3C PMU 137 + provides an ``ext`` option to allow exploration of even finer-grained statistics 138 + of the L3C PMU. The L3C PMU driver uses it as a hint of the destination when 139 + delivering the perf command to hardware: 140 + 141 + - ext=0: Default; can be used with event names. 142 + - ext=1 and ext=2: Must be used with event codes; event names are not supported. 143 + 144 + An example perf command could be:: 145 + 146 + $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5 147 + 148 + or:: 149 + 150 + $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5 151 + 152 + As above, ``hisi_sccl0_l3c1_0`` identifies the PMU of Super CPU CLuster 0, L3 153 + cache 1, pipe 0. 154 + 155 + The first command counts on the first part of the L3C since ``ext=0`` is implied 156 + by default. The second command counts on another part of the L3C with the 157 + event ``0x1``. 115 158 116 159 Users could configure IDs to count data coming from a specific CCL/ICL, by setting 117 160 srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting
+1
Documentation/admin-guide/perf/index.rst
··· 29 29 cxl 30 30 ampere_cspmu 31 31 mrvl-pem-pmu 32 + fujitsu_uncore_pmu
+11
Documentation/arch/arm64/booting.rst
··· 466 466 - HDFGWTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1. 467 467 - HDFGWTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1. 468 468 469 + For CPUs with SPE data source filtering (FEAT_SPE_FDS): 470 + 471 + - If EL3 is present: 472 + 473 + - MDCR_EL3.EnPMS3 (bit 42) must be initialised to 0b1. 474 + 475 + - If the kernel is entered at EL1 and EL2 is present: 476 + 477 + - HDFGRTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1. 478 + - HDFGWTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1. 479 + 469 480 For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS): 470 481 471 482 - If the kernel is entered at EL1 and EL2 is present:
+1
Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
··· 33 33 - items: 34 34 - enum: 35 35 - fsl,imx91-ddr-pmu 36 + - fsl,imx94-ddr-pmu 36 37 - fsl,imx95-ddr-pmu 37 38 - const: fsl,imx93-ddr-pmu 38 39
+3 -1
MAINTAINERS
··· 9744 9744 9745 9745 FREESCALE IMX DDR PMU DRIVER 9746 9746 M: Frank Li <Frank.li@nxp.com> 9747 + M: Xu Yang <xu.yang_2@nxp.com> 9747 9748 L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) 9748 9749 S: Maintained 9749 9750 F: Documentation/admin-guide/perf/imx-ddr.rst 9750 9751 F: Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml 9751 9752 F: drivers/perf/fsl_imx8_ddr_perf.c 9753 + F: drivers/perf/fsl_imx9_ddr_perf.c 9754 + F: tools/perf/pmu-events/arch/arm64/freescale/ 9752 9755 9753 9756 FREESCALE IMX I2C DRIVER 9754 9757 M: Oleksij Rempel <o.rempel@pengutronix.de> ··· 11062 11059 F: drivers/net/ethernet/hisilicon/ 11063 11060 11064 11061 HISILICON PMU DRIVER 11065 - M: Yicong Yang <yangyicong@hisilicon.com> 11066 11062 M: Jonathan Cameron <jonathan.cameron@huawei.com> 11067 11063 S: Supported 11068 11064 W: http://www.hisilicon.com
+22 -6
arch/arm64/include/asm/el2_setup.h
··· 91 91 msr cntvoff_el2, xzr // Clear virtual offset 92 92 .endm 93 93 94 + /* Branch to skip_label if SPE version is less than given version */ 95 + .macro __spe_vers_imp skip_label, version, tmp 96 + mrs \tmp, id_aa64dfr0_el1 97 + ubfx \tmp, \tmp, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 98 + cmp \tmp, \version 99 + b.lt \skip_label 100 + .endm 101 + 94 102 .macro __init_el2_debug 95 103 mrs x1, id_aa64dfr0_el1 96 104 ubfx x0, x1, #ID_AA64DFR0_EL1_PMUVer_SHIFT, #4 ··· 111 103 csel x2, xzr, x0, eq // all PMU counters from EL1 112 104 113 105 /* Statistical profiling */ 114 - ubfx x0, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 115 - cbz x0, .Lskip_spe_\@ // Skip if SPE not present 106 + __spe_vers_imp .Lskip_spe_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x0 // Skip if SPE not present 116 107 117 108 mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2, 118 109 and x0, x0, #(1 << PMBIDR_EL1_P_SHIFT) ··· 270 263 271 264 mov x0, xzr 272 265 mov x2, xzr 273 - mrs x1, id_aa64dfr0_el1 274 - ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4 275 - cmp x1, #3 276 - b.lt .Lskip_spe_fgt_\@ 266 + /* If SPEv1p2 is implemented, */ 267 + __spe_vers_imp .Lskip_spe_fgt_\@, #ID_AA64DFR0_EL1_PMSVer_V1P2, x1 277 268 /* Disable PMSNEVFR_EL1 read and write traps */ 278 269 orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK 279 270 orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK ··· 392 387 orr x0, x0, #HDFGRTR2_EL2_nPMICFILTR_EL0 393 388 orr x0, x0, #HDFGRTR2_EL2_nPMUACR_EL1 394 389 .Lskip_pmuv3p9_\@: 390 + /* If SPE is implemented, */ 391 + __spe_vers_imp .Lskip_spefds_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x1 392 + /* we can read PMSIDR and */ 393 + mrs_s x1, SYS_PMSIDR_EL1 394 + and x1, x1, #PMSIDR_EL1_FDS 395 + /* if FEAT_SPE_FDS is implemented, */ 396 + cbz x1, .Lskip_spefds_\@ 397 + /* disable traps of PMSDSFR to EL2. */ 398 + orr x0, x0, #HDFGRTR2_EL2_nPMSDSFR_EL1 399 + 400 + .Lskip_spefds_\@: 395 401 msr_s SYS_HDFGRTR2_EL2, x0 396 402 msr_s SYS_HDFGWTR2_EL2, x0 397 403 msr_s SYS_HFGRTR2_EL2, xzr
-9
arch/arm64/include/asm/sysreg.h
··· 344 344 #define SYS_PAR_EL1_ATTR GENMASK_ULL(63, 56) 345 345 #define SYS_PAR_EL1_F0_RES0 (GENMASK_ULL(6, 1) | GENMASK_ULL(55, 52)) 346 346 347 - /*** Statistical Profiling Extension ***/ 348 - #define PMSEVFR_EL1_RES0_IMP \ 349 - (GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\ 350 - BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0)) 351 - #define PMSEVFR_EL1_RES0_V1P1 \ 352 - (PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11))) 353 - #define PMSEVFR_EL1_RES0_V1P2 \ 354 - (PMSEVFR_EL1_RES0_V1P1 & ~BIT_ULL(6)) 355 - 356 347 /* Buffer error reporting */ 357 348 #define PMBSR_EL1_FAULT_FSC_SHIFT PMBSR_EL1_MSS_SHIFT 358 349 #define PMBSR_EL1_FAULT_FSC_MASK PMBSR_EL1_MSS_MASK
+11 -2
arch/arm64/tools/sysreg
··· 2994 2994 EndSysreg 2995 2995 2996 2996 Sysreg PMSFCR_EL1 3 0 9 9 4 2997 - Res0 63:19 2997 + Res0 63:53 2998 + Field 52 SIMDm 2999 + Field 51 FPm 3000 + Field 50 STm 3001 + Field 49 LDm 3002 + Field 48 Bm 3003 + Res0 47:21 3004 + Field 20 SIMD 3005 + Field 19 FP 2998 3006 Field 18 ST 2999 3007 Field 17 LD 3000 3008 Field 16 B 3001 - Res0 15:4 3009 + Res0 15:5 3010 + Field 4 FDS 3002 3011 Field 3 FnE 3003 3012 Field 2 FL 3004 3013 Field 1 FT
+2 -1
drivers/hwtracing/coresight/coresight-trbe.c
··· 23 23 #include "coresight-self-hosted-trace.h" 24 24 #include "coresight-trbe.h" 25 25 26 - #define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT)) 26 + #define PERF_IDX2OFF(idx, buf) \ 27 + ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT)) 27 28 28 29 /* 29 30 * A padding packet that will help the user space tools
+9
drivers/perf/Kconfig
··· 178 178 can give information about memory throughput and other related 179 179 events. 180 180 181 + config FUJITSU_UNCORE_PMU 182 + tristate "Fujitsu Uncore PMU" 183 + depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT) 184 + help 185 + Provides support for the Uncore performance monitor unit (PMU) 186 + in Fujitsu processors. 187 + Adds the Uncore PMU into the perf events subsystem for 188 + monitoring Uncore events. 189 + 181 190 config QCOM_L2_PMU 182 191 bool "Qualcomm Technologies L2-cache PMU" 183 192 depends on ARCH_QCOM && ARM64 && ACPI
+1
drivers/perf/Makefile
··· 13 13 obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o 14 14 obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o 15 15 obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o 16 + obj-$(CONFIG_FUJITSU_UNCORE_PMU) += fujitsu_uncore_pmu.o 16 17 obj-$(CONFIG_HISI_PMU) += hisilicon/ 17 18 obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o 18 19 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
+1 -1
drivers/perf/arm-ccn.c
··· 565 565 566 566 static ktime_t arm_ccn_pmu_timer_period(void) 567 567 { 568 - return ns_to_ktime((u64)arm_ccn_pmu_poll_period_us * 1000); 568 + return us_to_ktime((u64)arm_ccn_pmu_poll_period_us); 569 569 } 570 570 571 571
+6 -3
drivers/perf/arm-cmn.c
··· 65 65 /* PMU registers occupy the 3rd 4KB page of each node's region */ 66 66 #define CMN_PMU_OFFSET 0x2000 67 67 /* ...except when they don't :( */ 68 - #define CMN_S3_DTM_OFFSET 0xa000 68 + #define CMN_S3_R1_DTM_OFFSET 0xa000 69 69 #define CMN_S3_PMU_OFFSET 0xd900 70 70 71 71 /* For most nodes, this is all there is */ ··· 233 233 REV_CMN700_R1P0, 234 234 REV_CMN700_R2P0, 235 235 REV_CMN700_R3P0, 236 + REV_CMNS3_R0P0 = 0, 237 + REV_CMNS3_R0P1, 238 + REV_CMNS3_R1P0, 236 239 REV_CI700_R0P0 = 0, 237 240 REV_CI700_R1P0, 238 241 REV_CI700_R2P0, ··· 428 425 static int arm_cmn_pmu_offset(const struct arm_cmn *cmn, const struct arm_cmn_node *dn) 429 426 { 430 427 if (cmn->part == PART_CMN_S3) { 431 - if (dn->type == CMN_TYPE_XP) 432 - return CMN_S3_DTM_OFFSET; 428 + if (cmn->rev >= REV_CMNS3_R1P0 && dn->type == CMN_TYPE_XP) 429 + return CMN_S3_R1_DTM_OFFSET; 433 430 return CMN_S3_PMU_OFFSET; 434 431 } 435 432 return CMN_PMU_OFFSET;
+27 -2
drivers/perf/arm_pmuv3.c
··· 978 978 return -EAGAIN; 979 979 } 980 980 981 + static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc, 982 + struct perf_event *event) 983 + { 984 + struct hw_perf_event *hwc = &event->hw; 985 + unsigned long evtype = hwc->config_base & ARMV8_PMU_EVTYPE_EVENT; 986 + 987 + if (evtype != ARMV8_PMUV3_PERFCTR_CPU_CYCLES) 988 + return false; 989 + 990 + /* 991 + * A CPU_CYCLES event with threshold counting cannot use PMCCNTR_EL0 992 + * since it lacks threshold support. 993 + */ 994 + if (armv8pmu_event_get_threshold(&event->attr)) 995 + return false; 996 + 997 + /* 998 + * PMCCNTR_EL0 is not affected by BRBE controls like BRBCR_ELx.FZP. 999 + * So don't use it for branch events. 1000 + */ 1001 + if (has_branch_stack(event)) 1002 + return false; 1003 + 1004 + return true; 1005 + } 1006 + 981 1007 static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc, 982 1008 struct perf_event *event) 983 1009 { ··· 1012 986 unsigned long evtype = hwc->config_base & ARMV8_PMU_EVTYPE_EVENT; 1013 987 1014 988 /* Always prefer to place a cycle counter into the cycle counter. */ 1015 - if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) && 1016 - !armv8pmu_event_get_threshold(&event->attr) && !has_branch_stack(event)) { 989 + if (armv8pmu_can_use_pmccntr(cpuc, event)) { 1017 990 if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask)) 1018 991 return ARMV8_PMU_CYCLE_IDX; 1019 992 else if (armv8pmu_event_is_64bit(event) &&
+95 -19
drivers/perf/arm_spe_pmu.c
··· 86 86 #define SPE_PMU_FEAT_ERND (1UL << 5) 87 87 #define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6) 88 88 #define SPE_PMU_FEAT_DISCARD (1UL << 7) 89 + #define SPE_PMU_FEAT_EFT (1UL << 8) 89 90 #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) 90 91 u64 features; 91 92 93 + u64 pmsevfr_res0; 92 94 u16 max_record_sz; 93 95 u16 align; 94 96 struct perf_output_handle __percpu *handle; ··· 99 97 #define to_spe_pmu(p) (container_of(p, struct arm_spe_pmu, pmu)) 100 98 101 99 /* Convert a free-running index from perf into an SPE buffer offset */ 102 - #define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT)) 100 + #define PERF_IDX2OFF(idx, buf) \ 101 + ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT)) 103 102 104 103 /* Keep track of our dynamic hotplug state */ 105 104 static enum cpuhp_state arm_spe_pmu_online; ··· 118 115 SPE_PMU_CAP_FEAT_MAX, 119 116 SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX, 120 117 SPE_PMU_CAP_MIN_IVAL, 118 + SPE_PMU_CAP_EVENT_FILTER, 121 119 }; 122 120 123 121 static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = { ··· 126 122 [SPE_PMU_CAP_ERND] = SPE_PMU_FEAT_ERND, 127 123 }; 128 124 129 - static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) 125 + static u64 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) 130 126 { 131 127 if (cap < SPE_PMU_CAP_FEAT_MAX) 132 128 return !!(spe_pmu->features & arm_spe_pmu_feat_caps[cap]); ··· 136 132 return spe_pmu->counter_sz; 137 133 case SPE_PMU_CAP_MIN_IVAL: 138 134 return spe_pmu->min_period; 135 + case SPE_PMU_CAP_EVENT_FILTER: 136 + return ~spe_pmu->pmsevfr_res0; 139 137 default: 140 138 WARN(1, "unknown cap %d\n", cap); 141 139 } ··· 154 148 container_of(attr, struct dev_ext_attribute, attr); 155 149 int cap = (long)ea->var; 156 150 157 - return sysfs_emit(buf, "%u\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 151 + return sysfs_emit(buf, "%llu\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 152 + } 153 + 154 + static ssize_t arm_spe_pmu_cap_show_hex(struct device *dev, 155 + 
struct device_attribute *attr, 156 + char *buf) 157 + { 158 + struct arm_spe_pmu *spe_pmu = dev_get_drvdata(dev); 159 + struct dev_ext_attribute *ea = 160 + container_of(attr, struct dev_ext_attribute, attr); 161 + int cap = (long)ea->var; 162 + 163 + return sysfs_emit(buf, "0x%llx\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 158 164 } 159 165 160 166 #define SPE_EXT_ATTR_ENTRY(_name, _func, _var) \ ··· 176 158 177 159 #define SPE_CAP_EXT_ATTR_ENTRY(_name, _var) \ 178 160 SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show, _var) 161 + #define SPE_CAP_EXT_ATTR_ENTRY_HEX(_name, _var) \ 162 + SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show_hex, _var) 179 163 180 164 static struct attribute *arm_spe_pmu_cap_attr[] = { 181 165 SPE_CAP_EXT_ATTR_ENTRY(arch_inst, SPE_PMU_CAP_ARCH_INST), 182 166 SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND), 183 167 SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ), 184 168 SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL), 169 + SPE_CAP_EXT_ATTR_ENTRY_HEX(event_filter, SPE_PMU_CAP_EVENT_FILTER), 185 170 NULL, 186 171 }; 187 172 ··· 218 197 #define ATTR_CFG_FLD_discard_CFG config /* PMBLIMITR_EL1.FM = DISCARD */ 219 198 #define ATTR_CFG_FLD_discard_LO 35 220 199 #define ATTR_CFG_FLD_discard_HI 35 200 + #define ATTR_CFG_FLD_branch_filter_mask_CFG config /* PMSFCR_EL1.Bm */ 201 + #define ATTR_CFG_FLD_branch_filter_mask_LO 36 202 + #define ATTR_CFG_FLD_branch_filter_mask_HI 36 203 + #define ATTR_CFG_FLD_load_filter_mask_CFG config /* PMSFCR_EL1.LDm */ 204 + #define ATTR_CFG_FLD_load_filter_mask_LO 37 205 + #define ATTR_CFG_FLD_load_filter_mask_HI 37 206 + #define ATTR_CFG_FLD_store_filter_mask_CFG config /* PMSFCR_EL1.STm */ 207 + #define ATTR_CFG_FLD_store_filter_mask_LO 38 208 + #define ATTR_CFG_FLD_store_filter_mask_HI 38 209 + #define ATTR_CFG_FLD_simd_filter_CFG config /* PMSFCR_EL1.SIMD */ 210 + #define ATTR_CFG_FLD_simd_filter_LO 39 211 + #define ATTR_CFG_FLD_simd_filter_HI 39 212 + #define ATTR_CFG_FLD_simd_filter_mask_CFG config 
/* PMSFCR_EL1.SIMDm */ 213 + #define ATTR_CFG_FLD_simd_filter_mask_LO 40 214 + #define ATTR_CFG_FLD_simd_filter_mask_HI 40 215 + #define ATTR_CFG_FLD_float_filter_CFG config /* PMSFCR_EL1.FP */ 216 + #define ATTR_CFG_FLD_float_filter_LO 41 217 + #define ATTR_CFG_FLD_float_filter_HI 41 218 + #define ATTR_CFG_FLD_float_filter_mask_CFG config /* PMSFCR_EL1.FPm */ 219 + #define ATTR_CFG_FLD_float_filter_mask_LO 42 220 + #define ATTR_CFG_FLD_float_filter_mask_HI 42 221 221 222 222 #define ATTR_CFG_FLD_event_filter_CFG config1 /* PMSEVFR_EL1 */ 223 223 #define ATTR_CFG_FLD_event_filter_LO 0 ··· 257 215 GEN_PMU_FORMAT_ATTR(pct_enable); 258 216 GEN_PMU_FORMAT_ATTR(jitter); 259 217 GEN_PMU_FORMAT_ATTR(branch_filter); 218 + GEN_PMU_FORMAT_ATTR(branch_filter_mask); 260 219 GEN_PMU_FORMAT_ATTR(load_filter); 220 + GEN_PMU_FORMAT_ATTR(load_filter_mask); 261 221 GEN_PMU_FORMAT_ATTR(store_filter); 222 + GEN_PMU_FORMAT_ATTR(store_filter_mask); 223 + GEN_PMU_FORMAT_ATTR(simd_filter); 224 + GEN_PMU_FORMAT_ATTR(simd_filter_mask); 225 + GEN_PMU_FORMAT_ATTR(float_filter); 226 + GEN_PMU_FORMAT_ATTR(float_filter_mask); 262 227 GEN_PMU_FORMAT_ATTR(event_filter); 263 228 GEN_PMU_FORMAT_ATTR(inv_event_filter); 264 229 GEN_PMU_FORMAT_ATTR(min_latency); ··· 277 228 &format_attr_pct_enable.attr, 278 229 &format_attr_jitter.attr, 279 230 &format_attr_branch_filter.attr, 231 + &format_attr_branch_filter_mask.attr, 280 232 &format_attr_load_filter.attr, 233 + &format_attr_load_filter_mask.attr, 281 234 &format_attr_store_filter.attr, 235 + &format_attr_store_filter_mask.attr, 236 + &format_attr_simd_filter.attr, 237 + &format_attr_simd_filter_mask.attr, 238 + &format_attr_float_filter.attr, 239 + &format_attr_float_filter_mask.attr, 282 240 &format_attr_event_filter.attr, 283 241 &format_attr_inv_event_filter.attr, 284 242 &format_attr_min_latency.attr, ··· 304 248 return 0; 305 249 306 250 if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT)) 251 + 
return 0; 252 + 253 + if ((attr == &format_attr_branch_filter_mask.attr || 254 + attr == &format_attr_load_filter_mask.attr || 255 + attr == &format_attr_store_filter_mask.attr || 256 + attr == &format_attr_simd_filter.attr || 257 + attr == &format_attr_simd_filter_mask.attr || 258 + attr == &format_attr_float_filter.attr || 259 + attr == &format_attr_float_filter_mask.attr) && 260 + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) 307 261 return 0; 308 262 309 263 return attr->mode; ··· 411 345 u64 reg = 0; 412 346 413 347 reg |= FIELD_PREP(PMSFCR_EL1_LD, ATTR_CFG_GET_FLD(attr, load_filter)); 348 + reg |= FIELD_PREP(PMSFCR_EL1_LDm, ATTR_CFG_GET_FLD(attr, load_filter_mask)); 414 349 reg |= FIELD_PREP(PMSFCR_EL1_ST, ATTR_CFG_GET_FLD(attr, store_filter)); 350 + reg |= FIELD_PREP(PMSFCR_EL1_STm, ATTR_CFG_GET_FLD(attr, store_filter_mask)); 415 351 reg |= FIELD_PREP(PMSFCR_EL1_B, ATTR_CFG_GET_FLD(attr, branch_filter)); 352 + reg |= FIELD_PREP(PMSFCR_EL1_Bm, ATTR_CFG_GET_FLD(attr, branch_filter_mask)); 353 + reg |= FIELD_PREP(PMSFCR_EL1_SIMD, ATTR_CFG_GET_FLD(attr, simd_filter)); 354 + reg |= FIELD_PREP(PMSFCR_EL1_SIMDm, ATTR_CFG_GET_FLD(attr, simd_filter_mask)); 355 + reg |= FIELD_PREP(PMSFCR_EL1_FP, ATTR_CFG_GET_FLD(attr, float_filter)); 356 + reg |= FIELD_PREP(PMSFCR_EL1_FPm, ATTR_CFG_GET_FLD(attr, float_filter_mask)); 416 357 417 358 if (reg) 418 359 reg |= PMSFCR_EL1_FT; ··· 770 697 return IRQ_HANDLED; 771 698 } 772 699 773 - static u64 arm_spe_pmsevfr_res0(u16 pmsver) 774 - { 775 - switch (pmsver) { 776 - case ID_AA64DFR0_EL1_PMSVer_IMP: 777 - return PMSEVFR_EL1_RES0_IMP; 778 - case ID_AA64DFR0_EL1_PMSVer_V1P1: 779 - return PMSEVFR_EL1_RES0_V1P1; 780 - case ID_AA64DFR0_EL1_PMSVer_V1P2: 781 - /* Return the highest version we support in default */ 782 - default: 783 - return PMSEVFR_EL1_RES0_V1P2; 784 - } 785 - } 786 - 787 700 /* Perf callbacks */ 788 701 static int arm_spe_pmu_event_init(struct perf_event *event) 789 702 { ··· 785 726 !cpumask_test_cpu(event->cpu, 
&spe_pmu->supported_cpus)) 786 727 return -ENOENT; 787 728 788 - if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) 729 + if (arm_spe_event_to_pmsevfr(event) & spe_pmu->pmsevfr_res0) 789 730 return -EOPNOTSUPP; 790 731 791 - if (arm_spe_event_to_pmsnevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) 732 + if (arm_spe_event_to_pmsnevfr(event) & spe_pmu->pmsevfr_res0) 792 733 return -EOPNOTSUPP; 793 734 794 735 if (attr->exclude_idle) ··· 819 760 820 761 if ((FIELD_GET(PMSFCR_EL1_FL, reg)) && 821 762 !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT)) 763 + return -EOPNOTSUPP; 764 + 765 + if ((FIELD_GET(PMSFCR_EL1_LDm, reg) || 766 + FIELD_GET(PMSFCR_EL1_STm, reg) || 767 + FIELD_GET(PMSFCR_EL1_Bm, reg) || 768 + FIELD_GET(PMSFCR_EL1_SIMD, reg) || 769 + FIELD_GET(PMSFCR_EL1_SIMDm, reg) || 770 + FIELD_GET(PMSFCR_EL1_FP, reg) || 771 + FIELD_GET(PMSFCR_EL1_FPm, reg)) && 772 + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) 822 773 return -EOPNOTSUPP; 823 774 824 775 if (ATTR_CFG_GET_FLD(&event->attr, discard) && ··· 1122 1053 if (spe_pmu->pmsver >= ID_AA64DFR0_EL1_PMSVer_V1P2) 1123 1054 spe_pmu->features |= SPE_PMU_FEAT_DISCARD; 1124 1055 1056 + if (FIELD_GET(PMSIDR_EL1_EFT, reg)) 1057 + spe_pmu->features |= SPE_PMU_FEAT_EFT; 1058 + 1125 1059 /* This field has a spaced out encoding, so just use a look-up */ 1126 1060 fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg); 1127 1061 switch (fld) { ··· 1178 1106 case PMSIDR_EL1_COUNTSIZE_16_BIT_SAT: 1179 1107 spe_pmu->counter_sz = 16; 1180 1108 } 1109 + 1110 + /* Write all 1s and then read back. Unsupported filter bits are RAZ/WI. */ 1111 + write_sysreg_s(U64_MAX, SYS_PMSEVFR_EL1); 1112 + spe_pmu->pmsevfr_res0 = ~read_sysreg_s(SYS_PMSEVFR_EL1); 1181 1113 1182 1114 dev_info(dev, 1183 1115 "probed SPEv1.%d for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
+130 -31
drivers/perf/dwc_pcie_pmu.c
··· 39 39 #define DWC_PCIE_EVENT_CLEAR GENMASK(1, 0) 40 40 #define DWC_PCIE_EVENT_PER_CLEAR 0x1 41 41 42 + /* Event Selection Field has two subfields */ 43 + #define DWC_PCIE_CNT_EVENT_SEL_GROUP GENMASK(11, 8) 44 + #define DWC_PCIE_CNT_EVENT_SEL_EVID GENMASK(7, 0) 45 + 42 46 #define DWC_PCIE_EVENT_CNT_DATA 0xC 43 47 44 48 #define DWC_PCIE_TIME_BASED_ANAL_CTL 0x10 ··· 77 73 DWC_PCIE_EVENT_TYPE_MAX, 78 74 }; 79 75 76 + #define DWC_PCIE_LANE_GROUP_6 6 77 + #define DWC_PCIE_LANE_GROUP_7 7 78 + #define DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP 256 79 + 80 80 #define DWC_PCIE_LANE_EVENT_MAX_PERIOD GENMASK_ULL(31, 0) 81 81 #define DWC_PCIE_MAX_PERIOD GENMASK_ULL(63, 0) 82 82 ··· 90 82 u16 ras_des_offset; 91 83 u32 nr_lanes; 92 84 85 + /* Groups #6 and #7 */ 86 + DECLARE_BITMAP(lane_events, 2 * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP); 87 + struct perf_event *time_based_event; 88 + 93 89 struct hlist_node cpuhp_node; 94 - struct perf_event *event[DWC_PCIE_EVENT_TYPE_MAX]; 95 90 int on_cpu; 96 91 }; 97 92 ··· 257 246 }; 258 247 259 248 static void dwc_pcie_pmu_lane_event_enable(struct dwc_pcie_pmu *pcie_pmu, 249 + struct perf_event *event, 260 250 bool enable) 261 251 { 262 252 struct pci_dev *pdev = pcie_pmu->pdev; 263 253 u16 ras_des_offset = pcie_pmu->ras_des_offset; 254 + int event_id = DWC_PCIE_EVENT_ID(event); 255 + int lane = DWC_PCIE_EVENT_LANE(event); 256 + u32 ctrl; 257 + 258 + ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 259 + FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | 260 + FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR); 264 261 265 262 if (enable) 266 - pci_clear_and_set_config_dword(pdev, 267 - ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 268 - DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 263 + ctrl |= FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 269 264 else 270 - pci_clear_and_set_config_dword(pdev, 271 - ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 272 - DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF); 265 + ctrl |= FIELD_PREP(DWC_PCIE_CNT_ENABLE, 
DWC_PCIE_PER_EVENT_OFF); 266 + 267 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 268 + ctrl); 273 269 } 274 270 275 271 static void dwc_pcie_pmu_time_based_event_enable(struct dwc_pcie_pmu *pcie_pmu, ··· 294 276 { 295 277 struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 296 278 struct pci_dev *pdev = pcie_pmu->pdev; 279 + int event_id = DWC_PCIE_EVENT_ID(event); 280 + int lane = DWC_PCIE_EVENT_LANE(event); 297 281 u16 ras_des_offset = pcie_pmu->ras_des_offset; 298 - u32 val; 282 + u32 val, ctrl; 299 283 284 + ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 285 + FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | 286 + FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 287 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 288 + ctrl); 300 289 pci_read_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_DATA, &val); 290 + 291 + ctrl |= FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR); 292 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 293 + ctrl); 301 294 302 295 return val; 303 296 } ··· 358 329 { 359 330 struct hw_perf_event *hwc = &event->hw; 360 331 enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 361 - u64 delta, prev, now = 0; 332 + u64 delta, prev, now; 333 + 334 + if (type == DWC_PCIE_LANE_EVENT) { 335 + now = dwc_pcie_pmu_read_lane_event_counter(event) & 336 + DWC_PCIE_LANE_EVENT_MAX_PERIOD; 337 + local64_add(now, &event->count); 338 + return; 339 + } 362 340 363 341 do { 364 342 prev = local64_read(&hwc->prev_count); 365 - 366 - if (type == DWC_PCIE_LANE_EVENT) 367 - now = dwc_pcie_pmu_read_lane_event_counter(event); 368 - else if (type == DWC_PCIE_TIME_BASE_EVENT) 369 - now = dwc_pcie_pmu_read_time_based_counter(event); 343 + now = dwc_pcie_pmu_read_time_based_counter(event); 370 344 371 345 } while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev); 372 346 373 347 delta = (now - prev) & DWC_PCIE_MAX_PERIOD; 374 - /* 32-bit counter for Lane 
Event Counting */ 375 - if (type == DWC_PCIE_LANE_EVENT) 376 - delta &= DWC_PCIE_LANE_EVENT_MAX_PERIOD; 377 - 378 348 local64_add(delta, &event->count); 349 + } 350 + 351 + static int dwc_pcie_pmu_validate_add_lane_event(struct perf_event *event, 352 + unsigned long val_lane_events[]) 353 + { 354 + int event_id, event_nr, group; 355 + 356 + event_id = DWC_PCIE_EVENT_ID(event); 357 + event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 358 + group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id); 359 + 360 + if (group != DWC_PCIE_LANE_GROUP_6 && group != DWC_PCIE_LANE_GROUP_7) 361 + return -EINVAL; 362 + 363 + group -= DWC_PCIE_LANE_GROUP_6; 364 + 365 + if (test_and_set_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 366 + val_lane_events)) 367 + return -EINVAL; 368 + 369 + return 0; 370 + } 371 + 372 + static int dwc_pcie_pmu_validate_group(struct perf_event *event) 373 + { 374 + struct perf_event *sibling, *leader = event->group_leader; 375 + DECLARE_BITMAP(val_lane_events, 2 * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP); 376 + bool time_event = false; 377 + int type; 378 + 379 + type = DWC_PCIE_EVENT_TYPE(leader); 380 + if (type == DWC_PCIE_TIME_BASE_EVENT) 381 + time_event = true; 382 + else 383 + if (dwc_pcie_pmu_validate_add_lane_event(leader, val_lane_events)) 384 + return -ENOSPC; 385 + 386 + for_each_sibling_event(sibling, leader) { 387 + type = DWC_PCIE_EVENT_TYPE(sibling); 388 + if (type == DWC_PCIE_TIME_BASE_EVENT) { 389 + if (time_event) 390 + return -ENOSPC; 391 + 392 + time_event = true; 393 + continue; 394 + } 395 + 396 + if (dwc_pcie_pmu_validate_add_lane_event(sibling, val_lane_events)) 397 + return -ENOSPC; 398 + } 399 + 400 + return 0; 379 401 } 380 402 381 403 static int dwc_pcie_pmu_event_init(struct perf_event *event) ··· 447 367 if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) 448 368 return -EINVAL; 449 369 450 - if (event->group_leader != event && 451 - !is_software_event(event->group_leader)) 452 - return 
-EINVAL; 453 - 454 370 for_each_sibling_event(sibling, event->group_leader) { 455 371 if (sibling->pmu != event->pmu && !is_software_event(sibling)) 456 372 return -EINVAL; ··· 460 384 if (lane < 0 || lane >= pcie_pmu->nr_lanes) 461 385 return -EINVAL; 462 386 } 387 + 388 + if (dwc_pcie_pmu_validate_group(event)) 389 + return -ENOSPC; 463 390 464 391 event->cpu = pcie_pmu->on_cpu; 465 392 ··· 479 400 local64_set(&hwc->prev_count, 0); 480 401 481 402 if (type == DWC_PCIE_LANE_EVENT) 482 - dwc_pcie_pmu_lane_event_enable(pcie_pmu, true); 403 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, event, true); 483 404 else if (type == DWC_PCIE_TIME_BASE_EVENT) 484 405 dwc_pcie_pmu_time_based_event_enable(pcie_pmu, true); 485 406 } ··· 493 414 if (event->hw.state & PERF_HES_STOPPED) 494 415 return; 495 416 417 + dwc_pcie_pmu_event_update(event); 418 + 496 419 if (type == DWC_PCIE_LANE_EVENT) 497 - dwc_pcie_pmu_lane_event_enable(pcie_pmu, false); 420 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, event, false); 498 421 else if (type == DWC_PCIE_TIME_BASE_EVENT) 499 422 dwc_pcie_pmu_time_based_event_enable(pcie_pmu, false); 500 423 501 - dwc_pcie_pmu_event_update(event); 502 424 hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 503 425 } 504 426 ··· 514 434 u16 ras_des_offset = pcie_pmu->ras_des_offset; 515 435 u32 ctrl; 516 436 517 - /* one counter for each type and it is in use */ 518 - if (pcie_pmu->event[type]) 519 - return -ENOSPC; 520 - 521 - pcie_pmu->event[type] = event; 522 437 hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 523 438 524 439 if (type == DWC_PCIE_LANE_EVENT) { 440 + int event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 441 + int group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id) - 442 + DWC_PCIE_LANE_GROUP_6; 443 + 444 + if (test_and_set_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 445 + pcie_pmu->lane_events)) 446 + return -ENOSPC; 447 + 525 448 /* EVENT_COUNTER_DATA_REG needs clear manually */ 526 449 ctrl = 
FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 527 450 FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | ··· 533 450 pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 534 451 ctrl); 535 452 } else if (type == DWC_PCIE_TIME_BASE_EVENT) { 453 + if (pcie_pmu->time_based_event) 454 + return -ENOSPC; 455 + 456 + pcie_pmu->time_based_event = event; 457 + 536 458 /* 537 459 * TIME_BASED_ANAL_DATA_REG is a 64 bit register, we can safely 538 460 * use it with any manually controlled duration. And it is ··· 566 478 567 479 dwc_pcie_pmu_event_stop(event, flags | PERF_EF_UPDATE); 568 480 perf_event_update_userpage(event); 569 - pcie_pmu->event[type] = NULL; 481 + 482 + if (type == DWC_PCIE_TIME_BASE_EVENT) { 483 + pcie_pmu->time_based_event = NULL; 484 + } else { 485 + int event_id = DWC_PCIE_EVENT_ID(event); 486 + int event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 487 + int group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id) - 488 + DWC_PCIE_LANE_GROUP_6; 489 + 490 + clear_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 491 + pcie_pmu->lane_events); 492 + } 570 493 } 571 494 572 495 static void dwc_pcie_pmu_remove_cpuhp_instance(void *hotplug_node)
+6
drivers/perf/fsl_imx9_ddr_perf.c
··· 104 104 .filter_ver = DDR_PERF_AXI_FILTER_V1 105 105 }; 106 106 107 + static const struct imx_ddr_devtype_data imx94_devtype_data = { 108 + .identifier = "imx94", 109 + .filter_ver = DDR_PERF_AXI_FILTER_V2 110 + }; 111 + 107 112 static const struct imx_ddr_devtype_data imx95_devtype_data = { 108 113 .identifier = "imx95", 109 114 .filter_ver = DDR_PERF_AXI_FILTER_V2 ··· 127 122 static const struct of_device_id imx_ddr_pmu_dt_ids[] = { 128 123 { .compatible = "fsl,imx91-ddr-pmu", .data = &imx91_devtype_data }, 129 124 { .compatible = "fsl,imx93-ddr-pmu", .data = &imx93_devtype_data }, 125 + { .compatible = "fsl,imx94-ddr-pmu", .data = &imx94_devtype_data }, 130 126 { .compatible = "fsl,imx95-ddr-pmu", .data = &imx95_devtype_data }, 131 127 { /* sentinel */ } 132 128 };
+613
drivers/perf/fujitsu_uncore_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Driver for the Uncore PMUs in Fujitsu chips. 4 + * 5 + * See Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst for more details. 6 + * 7 + * Copyright (c) 2025 Fujitsu. All rights reserved. 8 + */ 9 + 10 + #include <linux/acpi.h> 11 + #include <linux/bitfield.h> 12 + #include <linux/bitops.h> 13 + #include <linux/interrupt.h> 14 + #include <linux/io.h> 15 + #include <linux/list.h> 16 + #include <linux/mod_devicetable.h> 17 + #include <linux/module.h> 18 + #include <linux/perf_event.h> 19 + #include <linux/platform_device.h> 20 + 21 + /* Number of counters on each PMU */ 22 + #define MAC_NUM_COUNTERS 8 23 + #define PCI_NUM_COUNTERS 8 24 + /* Mask for the event type field within perf_event_attr.config and EVTYPE reg */ 25 + #define UNCORE_EVTYPE_MASK 0xFF 26 + 27 + /* Perfmon registers */ 28 + #define PM_EVCNTR(__cntr) (0x000 + (__cntr) * 8) 29 + #define PM_CNTCTL(__cntr) (0x100 + (__cntr) * 8) 30 + #define PM_CNTCTL_RESET 0 31 + #define PM_EVTYPE(__cntr) (0x200 + (__cntr) * 8) 32 + #define PM_EVTYPE_EVSEL(__val) FIELD_GET(UNCORE_EVTYPE_MASK, __val) 33 + #define PM_CR 0x400 34 + #define PM_CR_RESET BIT(1) 35 + #define PM_CR_ENABLE BIT(0) 36 + #define PM_CNTENSET 0x410 37 + #define PM_CNTENSET_IDX(__cntr) BIT(__cntr) 38 + #define PM_CNTENCLR 0x418 39 + #define PM_CNTENCLR_IDX(__cntr) BIT(__cntr) 40 + #define PM_CNTENCLR_RESET 0xFF 41 + #define PM_INTENSET 0x420 42 + #define PM_INTENSET_IDX(__cntr) BIT(__cntr) 43 + #define PM_INTENCLR 0x428 44 + #define PM_INTENCLR_IDX(__cntr) BIT(__cntr) 45 + #define PM_INTENCLR_RESET 0xFF 46 + #define PM_OVSR 0x440 47 + #define PM_OVSR_OVSRCLR_RESET 0xFF 48 + 49 + enum fujitsu_uncore_pmu { 50 + FUJITSU_UNCORE_PMU_MAC = 1, 51 + FUJITSU_UNCORE_PMU_PCI = 2, 52 + }; 53 + 54 + struct uncore_pmu { 55 + int num_counters; 56 + struct pmu pmu; 57 + struct hlist_node node; 58 + void __iomem *regs; 59 + struct perf_event **events; 60 + unsigned long *used_mask; 61 + int cpu; 62 + 
int irq; 63 + struct device *dev; 64 + }; 65 + 66 + #define to_uncore_pmu(p) (container_of(p, struct uncore_pmu, pmu)) 67 + 68 + static int uncore_pmu_cpuhp_state; 69 + 70 + static void fujitsu_uncore_counter_start(struct perf_event *event) 71 + { 72 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 73 + int idx = event->hw.idx; 74 + 75 + /* Initialize the hardware counter and reset prev_count*/ 76 + local64_set(&event->hw.prev_count, 0); 77 + writeq_relaxed(0, uncorepmu->regs + PM_EVCNTR(idx)); 78 + 79 + /* Set the event type */ 80 + writeq_relaxed(PM_EVTYPE_EVSEL(event->attr.config), uncorepmu->regs + PM_EVTYPE(idx)); 81 + 82 + /* Enable interrupt generation by this counter */ 83 + writeq_relaxed(PM_INTENSET_IDX(idx), uncorepmu->regs + PM_INTENSET); 84 + 85 + /* Finally, enable the counter */ 86 + writeq_relaxed(PM_CNTCTL_RESET, uncorepmu->regs + PM_CNTCTL(idx)); 87 + writeq_relaxed(PM_CNTENSET_IDX(idx), uncorepmu->regs + PM_CNTENSET); 88 + } 89 + 90 + static void fujitsu_uncore_counter_stop(struct perf_event *event) 91 + { 92 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 93 + int idx = event->hw.idx; 94 + 95 + /* Disable the counter */ 96 + writeq_relaxed(PM_CNTENCLR_IDX(idx), uncorepmu->regs + PM_CNTENCLR); 97 + 98 + /* Disable interrupt generation by this counter */ 99 + writeq_relaxed(PM_INTENCLR_IDX(idx), uncorepmu->regs + PM_INTENCLR); 100 + } 101 + 102 + static void fujitsu_uncore_counter_update(struct perf_event *event) 103 + { 104 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 105 + int idx = event->hw.idx; 106 + u64 prev, new; 107 + 108 + do { 109 + prev = local64_read(&event->hw.prev_count); 110 + new = readq_relaxed(uncorepmu->regs + PM_EVCNTR(idx)); 111 + } while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev); 112 + 113 + local64_add(new - prev, &event->count); 114 + } 115 + 116 + static inline void fujitsu_uncore_init(struct uncore_pmu *uncorepmu) 117 + { 118 + int i; 119 + 120 + 
writeq_relaxed(PM_CR_RESET, uncorepmu->regs + PM_CR); 121 + 122 + writeq_relaxed(PM_CNTENCLR_RESET, uncorepmu->regs + PM_CNTENCLR); 123 + writeq_relaxed(PM_INTENCLR_RESET, uncorepmu->regs + PM_INTENCLR); 124 + writeq_relaxed(PM_OVSR_OVSRCLR_RESET, uncorepmu->regs + PM_OVSR); 125 + 126 + for (i = 0; i < uncorepmu->num_counters; ++i) { 127 + writeq_relaxed(PM_CNTCTL_RESET, uncorepmu->regs + PM_CNTCTL(i)); 128 + writeq_relaxed(PM_EVTYPE_EVSEL(0), uncorepmu->regs + PM_EVTYPE(i)); 129 + } 130 + writeq_relaxed(PM_CR_ENABLE, uncorepmu->regs + PM_CR); 131 + } 132 + 133 + static irqreturn_t fujitsu_uncore_handle_irq(int irq_num, void *data) 134 + { 135 + struct uncore_pmu *uncorepmu = data; 136 + /* Read the overflow status register */ 137 + long status = readq_relaxed(uncorepmu->regs + PM_OVSR); 138 + int idx; 139 + 140 + if (status == 0) 141 + return IRQ_NONE; 142 + 143 + /* Clear the bits we read on the overflow status register */ 144 + writeq_relaxed(status, uncorepmu->regs + PM_OVSR); 145 + 146 + for_each_set_bit(idx, &status, uncorepmu->num_counters) { 147 + struct perf_event *event; 148 + 149 + event = uncorepmu->events[idx]; 150 + if (!event) 151 + continue; 152 + 153 + fujitsu_uncore_counter_update(event); 154 + } 155 + 156 + return IRQ_HANDLED; 157 + } 158 + 159 + static void fujitsu_uncore_pmu_enable(struct pmu *pmu) 160 + { 161 + writeq_relaxed(PM_CR_ENABLE, to_uncore_pmu(pmu)->regs + PM_CR); 162 + } 163 + 164 + static void fujitsu_uncore_pmu_disable(struct pmu *pmu) 165 + { 166 + writeq_relaxed(0, to_uncore_pmu(pmu)->regs + PM_CR); 167 + } 168 + 169 + static bool fujitsu_uncore_validate_event_group(struct perf_event *event) 170 + { 171 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 172 + struct perf_event *leader = event->group_leader; 173 + struct perf_event *sibling; 174 + int counters = 1; 175 + 176 + if (leader == event) 177 + return true; 178 + 179 + if (leader->pmu == event->pmu) 180 + counters++; 181 + 182 + for_each_sibling_event(sibling, 
leader) { 183 + if (sibling->pmu == event->pmu) 184 + counters++; 185 + } 186 + 187 + /* 188 + * If the group requires more counters than the HW has, it 189 + * cannot ever be scheduled. 190 + */ 191 + return counters <= uncorepmu->num_counters; 192 + } 193 + 194 + static int fujitsu_uncore_event_init(struct perf_event *event) 195 + { 196 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 197 + struct hw_perf_event *hwc = &event->hw; 198 + 199 + /* Is the event for this PMU? */ 200 + if (event->attr.type != event->pmu->type) 201 + return -ENOENT; 202 + 203 + /* 204 + * Sampling not supported since these events are not 205 + * core-attributable. 206 + */ 207 + if (is_sampling_event(event)) 208 + return -EINVAL; 209 + 210 + /* 211 + * Task mode not available, we run the counters as socket counters, 212 + * not attributable to any CPU and therefore cannot attribute per-task. 213 + */ 214 + if (event->cpu < 0) 215 + return -EINVAL; 216 + 217 + /* Validate the group */ 218 + if (!fujitsu_uncore_validate_event_group(event)) 219 + return -EINVAL; 220 + 221 + hwc->idx = -1; 222 + 223 + event->cpu = uncorepmu->cpu; 224 + 225 + return 0; 226 + } 227 + 228 + static void fujitsu_uncore_event_start(struct perf_event *event, int flags) 229 + { 230 + struct hw_perf_event *hwc = &event->hw; 231 + 232 + hwc->state = 0; 233 + fujitsu_uncore_counter_start(event); 234 + } 235 + 236 + static void fujitsu_uncore_event_stop(struct perf_event *event, int flags) 237 + { 238 + struct hw_perf_event *hwc = &event->hw; 239 + 240 + if (hwc->state & PERF_HES_STOPPED) 241 + return; 242 + 243 + fujitsu_uncore_counter_stop(event); 244 + if (flags & PERF_EF_UPDATE) 245 + fujitsu_uncore_counter_update(event); 246 + hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 247 + } 248 + 249 + static int fujitsu_uncore_event_add(struct perf_event *event, int flags) 250 + { 251 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 252 + struct hw_perf_event *hwc = &event->hw; 253 + int idx; 254 
+ 255 + /* Try to allocate a counter. */ 256 + idx = bitmap_find_free_region(uncorepmu->used_mask, uncorepmu->num_counters, 0); 257 + if (idx < 0) 258 + /* The counters are all in use. */ 259 + return -EAGAIN; 260 + 261 + hwc->idx = idx; 262 + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 263 + uncorepmu->events[idx] = event; 264 + 265 + if (flags & PERF_EF_START) 266 + fujitsu_uncore_event_start(event, 0); 267 + 268 + /* Propagate changes to the userspace mapping. */ 269 + perf_event_update_userpage(event); 270 + 271 + return 0; 272 + } 273 + 274 + static void fujitsu_uncore_event_del(struct perf_event *event, int flags) 275 + { 276 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 277 + struct hw_perf_event *hwc = &event->hw; 278 + 279 + /* Stop and clean up */ 280 + fujitsu_uncore_event_stop(event, flags | PERF_EF_UPDATE); 281 + uncorepmu->events[hwc->idx] = NULL; 282 + bitmap_release_region(uncorepmu->used_mask, hwc->idx, 0); 283 + 284 + /* Propagate changes to the userspace mapping. 
*/ 285 + perf_event_update_userpage(event); 286 + } 287 + 288 + static void fujitsu_uncore_event_read(struct perf_event *event) 289 + { 290 + fujitsu_uncore_counter_update(event); 291 + } 292 + 293 + #define UNCORE_PMU_FORMAT_ATTR(_name, _config) \ 294 + (&((struct dev_ext_attribute[]) { \ 295 + { .attr = __ATTR(_name, 0444, device_show_string, NULL), \ 296 + .var = (void *)_config, } \ 297 + })[0].attr.attr) 298 + 299 + static struct attribute *fujitsu_uncore_pmu_formats[] = { 300 + UNCORE_PMU_FORMAT_ATTR(event, "config:0-7"), 301 + NULL 302 + }; 303 + 304 + static const struct attribute_group fujitsu_uncore_pmu_format_group = { 305 + .name = "format", 306 + .attrs = fujitsu_uncore_pmu_formats, 307 + }; 308 + 309 + static ssize_t fujitsu_uncore_pmu_event_show(struct device *dev, 310 + struct device_attribute *attr, char *page) 311 + { 312 + struct perf_pmu_events_attr *pmu_attr; 313 + 314 + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); 315 + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); 316 + } 317 + 318 + #define MAC_EVENT_ATTR(_name, _id) \ 319 + PMU_EVENT_ATTR_ID(_name, fujitsu_uncore_pmu_event_show, _id) 320 + 321 + static struct attribute *fujitsu_uncore_mac_pmu_events[] = { 322 + MAC_EVENT_ATTR(cycles, 0x00), 323 + MAC_EVENT_ATTR(read-count, 0x10), 324 + MAC_EVENT_ATTR(read-count-request, 0x11), 325 + MAC_EVENT_ATTR(read-count-return, 0x12), 326 + MAC_EVENT_ATTR(read-count-request-pftgt, 0x13), 327 + MAC_EVENT_ATTR(read-count-request-normal, 0x14), 328 + MAC_EVENT_ATTR(read-count-return-pftgt-hit, 0x15), 329 + MAC_EVENT_ATTR(read-count-return-pftgt-miss, 0x16), 330 + MAC_EVENT_ATTR(read-wait, 0x17), 331 + MAC_EVENT_ATTR(write-count, 0x20), 332 + MAC_EVENT_ATTR(write-count-write, 0x21), 333 + MAC_EVENT_ATTR(write-count-pwrite, 0x22), 334 + MAC_EVENT_ATTR(memory-read-count, 0x40), 335 + MAC_EVENT_ATTR(memory-write-count, 0x50), 336 + MAC_EVENT_ATTR(memory-pwrite-count, 0x60), 337 + MAC_EVENT_ATTR(ea-mac, 0x80), 338 + 
MAC_EVENT_ATTR(ea-memory, 0x90), 339 + MAC_EVENT_ATTR(ea-memory-mac-write, 0x92), 340 + MAC_EVENT_ATTR(ea-ha, 0xa0), 341 + NULL 342 + }; 343 + 344 + #define PCI_EVENT_ATTR(_name, _id) \ 345 + PMU_EVENT_ATTR_ID(_name, fujitsu_uncore_pmu_event_show, _id) 346 + 347 + static struct attribute *fujitsu_uncore_pci_pmu_events[] = { 348 + PCI_EVENT_ATTR(pci-port0-cycles, 0x00), 349 + PCI_EVENT_ATTR(pci-port0-read-count, 0x10), 350 + PCI_EVENT_ATTR(pci-port0-read-count-bus, 0x14), 351 + PCI_EVENT_ATTR(pci-port0-write-count, 0x20), 352 + PCI_EVENT_ATTR(pci-port0-write-count-bus, 0x24), 353 + PCI_EVENT_ATTR(pci-port1-cycles, 0x40), 354 + PCI_EVENT_ATTR(pci-port1-read-count, 0x50), 355 + PCI_EVENT_ATTR(pci-port1-read-count-bus, 0x54), 356 + PCI_EVENT_ATTR(pci-port1-write-count, 0x60), 357 + PCI_EVENT_ATTR(pci-port1-write-count-bus, 0x64), 358 + PCI_EVENT_ATTR(ea-pci, 0x80), 359 + NULL 360 + }; 361 + 362 + static const struct attribute_group fujitsu_uncore_mac_pmu_events_group = { 363 + .name = "events", 364 + .attrs = fujitsu_uncore_mac_pmu_events, 365 + }; 366 + 367 + static const struct attribute_group fujitsu_uncore_pci_pmu_events_group = { 368 + .name = "events", 369 + .attrs = fujitsu_uncore_pci_pmu_events, 370 + }; 371 + 372 + static ssize_t cpumask_show(struct device *dev, 373 + struct device_attribute *attr, char *buf) 374 + { 375 + struct uncore_pmu *uncorepmu = to_uncore_pmu(dev_get_drvdata(dev)); 376 + 377 + return cpumap_print_to_pagebuf(true, buf, cpumask_of(uncorepmu->cpu)); 378 + } 379 + static DEVICE_ATTR_RO(cpumask); 380 + 381 + static struct attribute *fujitsu_uncore_pmu_cpumask_attrs[] = { 382 + &dev_attr_cpumask.attr, 383 + NULL 384 + }; 385 + 386 + static const struct attribute_group fujitsu_uncore_pmu_cpumask_attr_group = { 387 + .attrs = fujitsu_uncore_pmu_cpumask_attrs, 388 + }; 389 + 390 + static const struct attribute_group *fujitsu_uncore_mac_pmu_attr_grps[] = { 391 + &fujitsu_uncore_pmu_format_group, 392 + &fujitsu_uncore_mac_pmu_events_group, 393 + 
&fujitsu_uncore_pmu_cpumask_attr_group, 394 + NULL 395 + }; 396 + 397 + static const struct attribute_group *fujitsu_uncore_pci_pmu_attr_grps[] = { 398 + &fujitsu_uncore_pmu_format_group, 399 + &fujitsu_uncore_pci_pmu_events_group, 400 + &fujitsu_uncore_pmu_cpumask_attr_group, 401 + NULL 402 + }; 403 + 404 + static void fujitsu_uncore_pmu_migrate(struct uncore_pmu *uncorepmu, unsigned int cpu) 405 + { 406 + perf_pmu_migrate_context(&uncorepmu->pmu, uncorepmu->cpu, cpu); 407 + irq_set_affinity(uncorepmu->irq, cpumask_of(cpu)); 408 + uncorepmu->cpu = cpu; 409 + } 410 + 411 + static int fujitsu_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_node) 412 + { 413 + struct uncore_pmu *uncorepmu; 414 + int node; 415 + 416 + uncorepmu = hlist_entry_safe(cpuhp_node, struct uncore_pmu, node); 417 + node = dev_to_node(uncorepmu->dev); 418 + if (cpu_to_node(uncorepmu->cpu) != node && cpu_to_node(cpu) == node) 419 + fujitsu_uncore_pmu_migrate(uncorepmu, cpu); 420 + 421 + return 0; 422 + } 423 + 424 + static int fujitsu_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_node) 425 + { 426 + struct uncore_pmu *uncorepmu; 427 + unsigned int target; 428 + int node; 429 + 430 + uncorepmu = hlist_entry_safe(cpuhp_node, struct uncore_pmu, node); 431 + if (cpu != uncorepmu->cpu) 432 + return 0; 433 + 434 + node = dev_to_node(uncorepmu->dev); 435 + target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu); 436 + if (target >= nr_cpu_ids) 437 + target = cpumask_any_but(cpu_online_mask, cpu); 438 + 439 + if (target < nr_cpu_ids) 440 + fujitsu_uncore_pmu_migrate(uncorepmu, target); 441 + 442 + return 0; 443 + } 444 + 445 + static int fujitsu_uncore_pmu_probe(struct platform_device *pdev) 446 + { 447 + struct device *dev = &pdev->dev; 448 + unsigned long device_type = (unsigned long)device_get_match_data(dev); 449 + const struct attribute_group **attr_groups; 450 + struct uncore_pmu *uncorepmu; 451 + struct resource *memrc; 452 + size_t 
alloc_size; 453 + char *name; 454 + int ret; 455 + int irq; 456 + u64 uid; 457 + 458 + ret = acpi_dev_uid_to_integer(ACPI_COMPANION(dev), &uid); 459 + if (ret) 460 + return dev_err_probe(dev, ret, "unable to read ACPI uid\n"); 461 + 462 + uncorepmu = devm_kzalloc(dev, sizeof(*uncorepmu), GFP_KERNEL); 463 + if (!uncorepmu) 464 + return -ENOMEM; 465 + uncorepmu->dev = dev; 466 + uncorepmu->cpu = cpumask_local_spread(0, dev_to_node(dev)); 467 + platform_set_drvdata(pdev, uncorepmu); 468 + 469 + switch (device_type) { 470 + case FUJITSU_UNCORE_PMU_MAC: 471 + uncorepmu->num_counters = MAC_NUM_COUNTERS; 472 + attr_groups = fujitsu_uncore_mac_pmu_attr_grps; 473 + name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu", 474 + (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF); 475 + break; 476 + case FUJITSU_UNCORE_PMU_PCI: 477 + uncorepmu->num_counters = PCI_NUM_COUNTERS; 478 + attr_groups = fujitsu_uncore_pci_pmu_attr_grps; 479 + name = devm_kasprintf(dev, GFP_KERNEL, "pci_iod%llu_pci%llu", 480 + (uid >> 4) & 0xF, uid & 0xF); 481 + break; 482 + default: 483 + return dev_err_probe(dev, -EINVAL, "illegal device type: %lu\n", device_type); 484 + } 485 + if (!name) 486 + return -ENOMEM; 487 + 488 + uncorepmu->pmu = (struct pmu) { 489 + .parent = dev, 490 + .task_ctx_nr = perf_invalid_context, 491 + 492 + .attr_groups = attr_groups, 493 + 494 + .pmu_enable = fujitsu_uncore_pmu_enable, 495 + .pmu_disable = fujitsu_uncore_pmu_disable, 496 + .event_init = fujitsu_uncore_event_init, 497 + .add = fujitsu_uncore_event_add, 498 + .del = fujitsu_uncore_event_del, 499 + .start = fujitsu_uncore_event_start, 500 + .stop = fujitsu_uncore_event_stop, 501 + .read = fujitsu_uncore_event_read, 502 + 503 + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, 504 + }; 505 + 506 + alloc_size = sizeof(uncorepmu->events[0]) * uncorepmu->num_counters; 507 + uncorepmu->events = devm_kzalloc(dev, alloc_size, GFP_KERNEL); 508 + if (!uncorepmu->events) 509 + return -ENOMEM; 510 
+ 511 + alloc_size = sizeof(uncorepmu->used_mask[0]) * BITS_TO_LONGS(uncorepmu->num_counters); 512 + uncorepmu->used_mask = devm_kzalloc(dev, alloc_size, GFP_KERNEL); 513 + if (!uncorepmu->used_mask) 514 + return -ENOMEM; 515 + 516 + uncorepmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc); 517 + if (IS_ERR(uncorepmu->regs)) 518 + return PTR_ERR(uncorepmu->regs); 519 + 520 + fujitsu_uncore_init(uncorepmu); 521 + 522 + irq = platform_get_irq(pdev, 0); 523 + if (irq < 0) 524 + return irq; 525 + 526 + ret = devm_request_irq(dev, irq, fujitsu_uncore_handle_irq, 527 + IRQF_NOBALANCING | IRQF_NO_THREAD, 528 + name, uncorepmu); 529 + if (ret) 530 + return dev_err_probe(dev, ret, "Failed to request IRQ:%d\n", irq); 531 + 532 + ret = irq_set_affinity(irq, cpumask_of(uncorepmu->cpu)); 533 + if (ret) 534 + return dev_err_probe(dev, ret, "Failed to set irq affinity:%d\n", irq); 535 + 536 + uncorepmu->irq = irq; 537 + 538 + /* Add this instance to the list used by the offline callback */ 539 + ret = cpuhp_state_add_instance(uncore_pmu_cpuhp_state, &uncorepmu->node); 540 + if (ret) 541 + return dev_err_probe(dev, ret, "Error registering hotplug"); 542 + 543 + ret = perf_pmu_register(&uncorepmu->pmu, name, -1); 544 + if (ret < 0) { 545 + cpuhp_state_remove_instance_nocalls(uncore_pmu_cpuhp_state, &uncorepmu->node); 546 + return dev_err_probe(dev, ret, "Failed to register %s PMU\n", name); 547 + } 548 + 549 + dev_dbg(dev, "Registered %s, type: %d\n", name, uncorepmu->pmu.type); 550 + 551 + return 0; 552 + } 553 + 554 + static void fujitsu_uncore_pmu_remove(struct platform_device *pdev) 555 + { 556 + struct uncore_pmu *uncorepmu = platform_get_drvdata(pdev); 557 + 558 + writeq_relaxed(0, uncorepmu->regs + PM_CR); 559 + 560 + perf_pmu_unregister(&uncorepmu->pmu); 561 + cpuhp_state_remove_instance_nocalls(uncore_pmu_cpuhp_state, &uncorepmu->node); 562 + } 563 + 564 + static const struct acpi_device_id fujitsu_uncore_pmu_acpi_match[] = { 565 + { "FUJI200C", 
FUJITSU_UNCORE_PMU_MAC }, 566 + { "FUJI200D", FUJITSU_UNCORE_PMU_PCI }, 567 + { } 568 + }; 569 + MODULE_DEVICE_TABLE(acpi, fujitsu_uncore_pmu_acpi_match); 570 + 571 + static struct platform_driver fujitsu_uncore_pmu_driver = { 572 + .driver = { 573 + .name = "fujitsu-uncore-pmu", 574 + .acpi_match_table = fujitsu_uncore_pmu_acpi_match, 575 + .suppress_bind_attrs = true, 576 + }, 577 + .probe = fujitsu_uncore_pmu_probe, 578 + .remove = fujitsu_uncore_pmu_remove, 579 + }; 580 + 581 + static int __init fujitsu_uncore_pmu_init(void) 582 + { 583 + int ret; 584 + 585 + /* Install a hook to update the reader CPU in case it goes offline */ 586 + ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, 587 + "perf/fujitsu/uncore:online", 588 + fujitsu_uncore_pmu_online_cpu, 589 + fujitsu_uncore_pmu_offline_cpu); 590 + if (ret < 0) 591 + return ret; 592 + 593 + uncore_pmu_cpuhp_state = ret; 594 + 595 + ret = platform_driver_register(&fujitsu_uncore_pmu_driver); 596 + if (ret) 597 + cpuhp_remove_multi_state(uncore_pmu_cpuhp_state); 598 + 599 + return ret; 600 + } 601 + 602 + static void __exit fujitsu_uncore_pmu_exit(void) 603 + { 604 + platform_driver_unregister(&fujitsu_uncore_pmu_driver); 605 + cpuhp_remove_multi_state(uncore_pmu_cpuhp_state); 606 + } 607 + 608 + module_init(fujitsu_uncore_pmu_init); 609 + module_exit(fujitsu_uncore_pmu_exit); 610 + 611 + MODULE_AUTHOR("Koichi Okuno <fj2767dz@fujitsu.com>"); 612 + MODULE_DESCRIPTION("Fujitsu Uncore PMU driver"); 613 + MODULE_LICENSE("GPL");
+2 -1
drivers/perf/hisilicon/Makefile
··· 1 1 # SPDX-License-Identifier: GPL-2.0-only 2 2 obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o \ 3 3 hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o hisi_uncore_sllc_pmu.o \ 4 - hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o hisi_uncore_uc_pmu.o 4 + hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o hisi_uncore_uc_pmu.o \ 5 + hisi_uncore_noc_pmu.o hisi_uncore_mn_pmu.o 5 6 6 7 obj-$(CONFIG_HISI_PCIE_PMU) += hisi_pcie_pmu.o 7 8 obj-$(CONFIG_HNS3_PMU) += hns3_pmu.o
+438 -90
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
··· 39 39 40 40 /* L3C has 8-counters */ 41 41 #define L3C_NR_COUNTERS 0x8 42 + #define L3C_MAX_EXT 2 42 43 43 44 #define L3C_PERF_CTRL_EN 0x10000 44 45 #define L3C_TRACETAG_EN BIT(31) ··· 56 55 #define L3C_V1_NR_EVENTS 0x59 57 56 #define L3C_V2_NR_EVENTS 0xFF 58 57 59 - HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config1, 7, 0); 58 + HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16); 60 59 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8); 61 60 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11); 62 61 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16); 62 + HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0); 63 + 64 + struct hisi_l3c_pmu { 65 + struct hisi_pmu l3c_pmu; 66 + 67 + /* MMIO and IRQ resources for extension events */ 68 + void __iomem *ext_base[L3C_MAX_EXT]; 69 + int ext_irq[L3C_MAX_EXT]; 70 + int ext_num; 71 + }; 72 + 73 + #define to_hisi_l3c_pmu(_l3c_pmu) \ 74 + container_of(_l3c_pmu, struct hisi_l3c_pmu, l3c_pmu) 75 + 76 + /* 77 + * The hardware counter idx used in counter enable/disable, 78 + * interrupt enable/disable and status check, etc. 79 + */ 80 + #define L3C_HW_IDX(_cntr_idx) ((_cntr_idx) % L3C_NR_COUNTERS) 81 + 82 + /* Range of ext counters in used mask. 
*/ 83 + #define L3C_CNTR_EXT_L(_ext) (((_ext) + 1) * L3C_NR_COUNTERS) 84 + #define L3C_CNTR_EXT_H(_ext) (((_ext) + 2) * L3C_NR_COUNTERS) 85 + 86 + struct hisi_l3c_pmu_ext { 87 + bool support_ext; 88 + }; 89 + 90 + static bool support_ext(struct hisi_l3c_pmu *pmu) 91 + { 92 + struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private; 93 + 94 + return l3c_pmu_ext->support_ext; 95 + } 96 + 97 + static int hisi_l3c_pmu_get_event_idx(struct perf_event *event) 98 + { 99 + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 100 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 101 + unsigned long *used_mask = l3c_pmu->pmu_events.used_mask; 102 + int ext = hisi_get_ext(event); 103 + int idx; 104 + 105 + /* 106 + * For an L3C PMU that supports extension events, we can monitor 107 + * maximum 2 * num_counters to 3 * num_counters events, depending on 108 + * the number of ext regions supported by hardware. Thus use bit 109 + * [0, num_counters - 1] for normal events and bit 110 + * [ext * num_counters, (ext + 1) * num_counters - 1] for extension 111 + * events. The idx allocation will keep unchanged for normal events and 112 + * we can also use the idx to distinguish whether it's an extension 113 + * event or not. 114 + * 115 + * Since normal events and extension events locates on the different 116 + * address space, save the base address to the event->hw.event_base. 
117 + */ 118 + if (ext && !support_ext(hisi_l3c_pmu)) 119 + return -EOPNOTSUPP; 120 + 121 + if (ext) 122 + event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1]; 123 + else 124 + event->hw.event_base = (unsigned long)l3c_pmu->base; 125 + 126 + ext -= 1; 127 + idx = find_next_zero_bit(used_mask, L3C_CNTR_EXT_H(ext), L3C_CNTR_EXT_L(ext)); 128 + 129 + if (idx >= L3C_CNTR_EXT_H(ext)) 130 + return -EAGAIN; 131 + 132 + set_bit(idx, used_mask); 133 + 134 + return idx; 135 + } 136 + 137 + static u32 hisi_l3c_pmu_event_readl(struct hw_perf_event *hwc, u32 reg) 138 + { 139 + return readl((void __iomem *)hwc->event_base + reg); 140 + } 141 + 142 + static void hisi_l3c_pmu_event_writel(struct hw_perf_event *hwc, u32 reg, u32 val) 143 + { 144 + writel(val, (void __iomem *)hwc->event_base + reg); 145 + } 146 + 147 + static u64 hisi_l3c_pmu_event_readq(struct hw_perf_event *hwc, u32 reg) 148 + { 149 + return readq((void __iomem *)hwc->event_base + reg); 150 + } 151 + 152 + static void hisi_l3c_pmu_event_writeq(struct hw_perf_event *hwc, u32 reg, u64 val) 153 + { 154 + writeq(val, (void __iomem *)hwc->event_base + reg); 155 + } 63 156 64 157 static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event) 65 158 { 66 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 159 + struct hw_perf_event *hwc = &event->hw; 67 160 u32 tt_req = hisi_get_tt_req(event); 68 161 69 162 if (tt_req) { 70 163 u32 val; 71 164 72 165 /* Set request-type for tracetag */ 73 - val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); 166 + val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL); 74 167 val |= tt_req << L3C_TRACETAG_REQ_SHIFT; 75 168 val |= L3C_TRACETAG_REQ_EN; 76 - writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); 169 + hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val); 77 170 78 171 /* Enable request-tracetag statistics */ 79 - val = readl(l3c_pmu->base + L3C_PERF_CTRL); 172 + val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL); 80 173 val |= L3C_TRACETAG_EN; 81 - 
writel(val, l3c_pmu->base + L3C_PERF_CTRL); 174 + hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val); 82 175 } 83 176 } 84 177 85 178 static void hisi_l3c_pmu_clear_req_tracetag(struct perf_event *event) 86 179 { 87 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 180 + struct hw_perf_event *hwc = &event->hw; 88 181 u32 tt_req = hisi_get_tt_req(event); 89 182 90 183 if (tt_req) { 91 184 u32 val; 92 185 93 186 /* Clear request-type */ 94 - val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); 187 + val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL); 95 188 val &= ~(tt_req << L3C_TRACETAG_REQ_SHIFT); 96 189 val &= ~L3C_TRACETAG_REQ_EN; 97 - writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); 190 + hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val); 98 191 99 192 /* Disable request-tracetag statistics */ 100 - val = readl(l3c_pmu->base + L3C_PERF_CTRL); 193 + val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL); 101 194 val &= ~L3C_TRACETAG_EN; 102 - writel(val, l3c_pmu->base + L3C_PERF_CTRL); 195 + hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val); 103 196 } 104 197 } 105 198 106 199 static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg) 107 200 { 108 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 109 201 struct hw_perf_event *hwc = &event->hw; 110 202 u32 reg, reg_idx, shift, val; 111 - int idx = hwc->idx; 203 + int idx = L3C_HW_IDX(hwc->idx); 112 204 113 205 /* 114 206 * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1). 
··· 214 120 reg_idx = idx % 4; 215 121 shift = 8 * reg_idx; 216 122 217 - val = readl(l3c_pmu->base + reg); 123 + val = hisi_l3c_pmu_event_readl(hwc, reg); 218 124 val &= ~(L3C_DATSRC_MASK << shift); 219 125 val |= ds_cfg << shift; 220 - writel(val, l3c_pmu->base + reg); 126 + hisi_l3c_pmu_event_writel(hwc, reg, val); 221 127 } 222 128 223 129 static void hisi_l3c_pmu_config_ds(struct perf_event *event) 224 130 { 225 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 131 + struct hw_perf_event *hwc = &event->hw; 226 132 u32 ds_cfg = hisi_get_datasrc_cfg(event); 227 133 u32 ds_skt = hisi_get_datasrc_skt(event); 228 134 ··· 232 138 if (ds_skt) { 233 139 u32 val; 234 140 235 - val = readl(l3c_pmu->base + L3C_DATSRC_CTRL); 141 + val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL); 236 142 val |= L3C_DATSRC_SKT_EN; 237 - writel(val, l3c_pmu->base + L3C_DATSRC_CTRL); 143 + hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val); 238 144 } 239 145 } 240 146 241 147 static void hisi_l3c_pmu_clear_ds(struct perf_event *event) 242 148 { 243 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 149 + struct hw_perf_event *hwc = &event->hw; 244 150 u32 ds_cfg = hisi_get_datasrc_cfg(event); 245 151 u32 ds_skt = hisi_get_datasrc_skt(event); 246 152 ··· 250 156 if (ds_skt) { 251 157 u32 val; 252 158 253 - val = readl(l3c_pmu->base + L3C_DATSRC_CTRL); 159 + val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL); 254 160 val &= ~L3C_DATSRC_SKT_EN; 255 - writel(val, l3c_pmu->base + L3C_DATSRC_CTRL); 161 + hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val); 256 162 } 257 163 } 258 164 259 165 static void hisi_l3c_pmu_config_core_tracetag(struct perf_event *event) 260 166 { 261 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 167 + struct hw_perf_event *hwc = &event->hw; 262 168 u32 core = hisi_get_tt_core(event); 263 169 264 170 if (core) { 265 171 u32 val; 266 172 267 173 /* Config and enable core information */ 268 - writel(core, l3c_pmu->base + L3C_CORE_CTRL); 269 - val = 
readl(l3c_pmu->base + L3C_PERF_CTRL); 174 + hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, core); 175 + val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL); 270 176 val |= L3C_CORE_EN; 271 - writel(val, l3c_pmu->base + L3C_PERF_CTRL); 177 + hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val); 272 178 273 179 /* Enable core-tracetag statistics */ 274 - val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); 180 + val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL); 275 181 val |= L3C_TRACETAG_CORE_EN; 276 - writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); 182 + hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val); 277 183 } 278 184 } 279 185 280 186 static void hisi_l3c_pmu_clear_core_tracetag(struct perf_event *event) 281 187 { 282 - struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 188 + struct hw_perf_event *hwc = &event->hw; 283 189 u32 core = hisi_get_tt_core(event); 284 190 285 191 if (core) { 286 192 u32 val; 287 193 288 194 /* Clear core information */ 289 - writel(L3C_COER_NONE, l3c_pmu->base + L3C_CORE_CTRL); 290 - val = readl(l3c_pmu->base + L3C_PERF_CTRL); 195 + hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, L3C_COER_NONE); 196 + val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL); 291 197 val &= ~L3C_CORE_EN; 292 - writel(val, l3c_pmu->base + L3C_PERF_CTRL); 198 + hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val); 293 199 294 200 /* Disable core-tracetag statistics */ 295 - val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL); 201 + val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL); 296 202 val &= ~L3C_TRACETAG_CORE_EN; 297 - writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL); 203 + hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val); 298 204 } 205 + } 206 + 207 + static bool hisi_l3c_pmu_have_filter(struct perf_event *event) 208 + { 209 + return hisi_get_tt_req(event) || hisi_get_tt_core(event) || 210 + hisi_get_datasrc_cfg(event) || hisi_get_datasrc_skt(event); 299 211 } 300 212 301 213 static void hisi_l3c_pmu_enable_filter(struct perf_event *event) 302 214 
{ 303 - if (event->attr.config1 != 0x0) { 215 + if (hisi_l3c_pmu_have_filter(event)) { 304 216 hisi_l3c_pmu_config_req_tracetag(event); 305 217 hisi_l3c_pmu_config_core_tracetag(event); 306 218 hisi_l3c_pmu_config_ds(event); ··· 315 215 316 216 static void hisi_l3c_pmu_disable_filter(struct perf_event *event) 317 217 { 318 - if (event->attr.config1 != 0x0) { 218 + if (hisi_l3c_pmu_have_filter(event)) { 319 219 hisi_l3c_pmu_clear_ds(event); 320 220 hisi_l3c_pmu_clear_core_tracetag(event); 321 221 hisi_l3c_pmu_clear_req_tracetag(event); 322 222 } 223 + } 224 + 225 + static int hisi_l3c_pmu_check_filter(struct perf_event *event) 226 + { 227 + struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu); 228 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 229 + int ext = hisi_get_ext(event); 230 + 231 + if (ext < 0 || ext > hisi_l3c_pmu->ext_num) 232 + return -EINVAL; 233 + 234 + return 0; 323 235 } 324 236 325 237 /* ··· 339 227 */ 340 228 static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx) 341 229 { 342 - return (L3C_CNTR0_LOWER + (cntr_idx * 8)); 230 + return L3C_CNTR0_LOWER + L3C_HW_IDX(cntr_idx) * 8; 343 231 } 344 232 345 233 static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu, 346 234 struct hw_perf_event *hwc) 347 235 { 348 - return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx)); 236 + return hisi_l3c_pmu_event_readq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx)); 349 237 } 350 238 351 239 static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu, 352 240 struct hw_perf_event *hwc, u64 val) 353 241 { 354 - writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx)); 242 + hisi_l3c_pmu_event_writeq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx), val); 355 243 } 356 244 357 245 static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx, 358 246 u32 type) 359 247 { 248 + struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw; 360 249 u32 reg, reg_idx, shift, val; 250 + 251 + idx = 
L3C_HW_IDX(idx); 361 252 362 253 /* 363 254 * Select the appropriate event select register(L3C_EVENT_TYPE0/1). ··· 374 259 shift = 8 * reg_idx; 375 260 376 261 /* Write event code to L3C_EVENT_TYPEx Register */ 377 - val = readl(l3c_pmu->base + reg); 262 + val = hisi_l3c_pmu_event_readl(hwc, reg); 378 263 val &= ~(L3C_EVTYPE_NONE << shift); 379 - val |= (type << shift); 380 - writel(val, l3c_pmu->base + reg); 264 + val |= type << shift; 265 + hisi_l3c_pmu_event_writel(hwc, reg, val); 381 266 } 382 267 383 268 static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu) 384 269 { 270 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 271 + unsigned long *used_mask = l3c_pmu->pmu_events.used_mask; 272 + unsigned long used_cntr = find_first_bit(used_mask, l3c_pmu->num_counters); 385 273 u32 val; 274 + int i; 386 275 387 276 /* 388 - * Set perf_enable bit in L3C_PERF_CTRL register to start counting 389 - * for all enabled counters. 277 + * Check if any counter belongs to the normal range (instead of ext 278 + * range). If so, enable it. 390 279 */ 391 - val = readl(l3c_pmu->base + L3C_PERF_CTRL); 392 - val |= L3C_PERF_CTRL_EN; 393 - writel(val, l3c_pmu->base + L3C_PERF_CTRL); 280 + if (used_cntr < L3C_NR_COUNTERS) { 281 + val = readl(l3c_pmu->base + L3C_PERF_CTRL); 282 + val |= L3C_PERF_CTRL_EN; 283 + writel(val, l3c_pmu->base + L3C_PERF_CTRL); 284 + } 285 + 286 + /* If not, do enable it on ext ranges. */ 287 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) { 288 + /* Find used counter in this ext range, skip the range if not. 
*/ 289 + used_cntr = find_next_bit(used_mask, L3C_CNTR_EXT_H(i), L3C_CNTR_EXT_L(i)); 290 + if (used_cntr >= L3C_CNTR_EXT_H(i)) 291 + continue; 292 + 293 + val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL); 294 + val |= L3C_PERF_CTRL_EN; 295 + writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL); 296 + } 394 297 } 395 298 396 299 static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu) 397 300 { 301 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 302 + unsigned long *used_mask = l3c_pmu->pmu_events.used_mask; 303 + unsigned long used_cntr = find_first_bit(used_mask, l3c_pmu->num_counters); 398 304 u32 val; 305 + int i; 399 306 400 307 /* 401 - * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting 402 - * for all enabled counters. 308 + * Check if any counter belongs to the normal range (instead of ext 309 + * range). If so, stop it. 403 310 */ 404 - val = readl(l3c_pmu->base + L3C_PERF_CTRL); 405 - val &= ~(L3C_PERF_CTRL_EN); 406 - writel(val, l3c_pmu->base + L3C_PERF_CTRL); 311 + if (used_cntr < L3C_NR_COUNTERS) { 312 + val = readl(l3c_pmu->base + L3C_PERF_CTRL); 313 + val &= ~L3C_PERF_CTRL_EN; 314 + writel(val, l3c_pmu->base + L3C_PERF_CTRL); 315 + } 316 + 317 + /* If not, do stop it on ext ranges. */ 318 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) { 319 + /* Find used counter in this ext range, skip the range if not. 
*/ 320 + used_cntr = find_next_bit(used_mask, L3C_CNTR_EXT_H(i), L3C_CNTR_EXT_L(i)); 321 + if (used_cntr >= L3C_CNTR_EXT_H(i)) 322 + continue; 323 + 324 + val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL); 325 + val &= ~L3C_PERF_CTRL_EN; 326 + writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL); 327 + } 407 328 } 408 329 409 330 static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu, ··· 448 297 u32 val; 449 298 450 299 /* Enable counter index in L3C_EVENT_CTRL register */ 451 - val = readl(l3c_pmu->base + L3C_EVENT_CTRL); 452 - val |= (1 << hwc->idx); 453 - writel(val, l3c_pmu->base + L3C_EVENT_CTRL); 300 + val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL); 301 + val |= 1 << L3C_HW_IDX(hwc->idx); 302 + hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val); 454 303 } 455 304 456 305 static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu, ··· 459 308 u32 val; 460 309 461 310 /* Clear counter index in L3C_EVENT_CTRL register */ 462 - val = readl(l3c_pmu->base + L3C_EVENT_CTRL); 463 - val &= ~(1 << hwc->idx); 464 - writel(val, l3c_pmu->base + L3C_EVENT_CTRL); 311 + val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL); 312 + val &= ~(1 << L3C_HW_IDX(hwc->idx)); 313 + hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val); 465 314 } 466 315 467 316 static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu, ··· 469 318 { 470 319 u32 val; 471 320 472 - val = readl(l3c_pmu->base + L3C_INT_MASK); 321 + val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK); 473 322 /* Write 0 to enable interrupt */ 474 - val &= ~(1 << hwc->idx); 475 - writel(val, l3c_pmu->base + L3C_INT_MASK); 323 + val &= ~(1 << L3C_HW_IDX(hwc->idx)); 324 + hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val); 476 325 } 477 326 478 327 static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu, ··· 480 329 { 481 330 u32 val; 482 331 483 - val = readl(l3c_pmu->base + L3C_INT_MASK); 332 + val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK); 484 333 /* Write 1 to mask 
interrupt */ 485 - val |= (1 << hwc->idx); 486 - writel(val, l3c_pmu->base + L3C_INT_MASK); 334 + val |= 1 << L3C_HW_IDX(hwc->idx); 335 + hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val); 487 336 } 488 337 489 338 static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu) 490 339 { 491 - return readl(l3c_pmu->base + L3C_INT_STATUS); 340 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 341 + u32 ext_int, status, status_ext = 0; 342 + int i; 343 + 344 + status = readl(l3c_pmu->base + L3C_INT_STATUS); 345 + 346 + if (!support_ext(hisi_l3c_pmu)) 347 + return status; 348 + 349 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) { 350 + ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS); 351 + status_ext |= ext_int << (L3C_NR_COUNTERS * i); 352 + } 353 + 354 + return status | (status_ext << L3C_NR_COUNTERS); 492 355 } 493 356 494 357 static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx) 495 358 { 496 - writel(1 << idx, l3c_pmu->base + L3C_INT_CLEAR); 497 - } 359 + struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw; 498 360 499 - static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = { 500 - { "HISI0213", }, 501 - { "HISI0214", }, 502 - {} 503 - }; 504 - MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match); 361 + hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << L3C_HW_IDX(idx)); 362 + } 505 363 506 364 static int hisi_l3c_pmu_init_data(struct platform_device *pdev, 507 365 struct hisi_pmu *l3c_pmu) ··· 531 371 return -EINVAL; 532 372 } 533 373 374 + l3c_pmu->dev_info = device_get_match_data(&pdev->dev); 375 + if (!l3c_pmu->dev_info) 376 + return -ENODEV; 377 + 534 378 l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0); 535 379 if (IS_ERR(l3c_pmu->base)) { 536 380 dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n"); ··· 542 378 } 543 379 544 380 l3c_pmu->identifier = readl(l3c_pmu->base + L3C_VERSION); 381 + 382 + return 0; 383 + } 384 + 385 + static int hisi_l3c_pmu_init_ext(struct 
hisi_pmu *l3c_pmu, struct platform_device *pdev) 386 + { 387 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 388 + int ret, irq, ext_num, i; 389 + char *irqname; 390 + 391 + /* HiSilicon L3C PMU supporting ext should have more than 1 irq resources. */ 392 + ext_num = platform_irq_count(pdev); 393 + if (ext_num < L3C_MAX_EXT) 394 + return -ENODEV; 395 + 396 + /* 397 + * The number of ext supported equals the number of irq - 1, since one 398 + * of the irqs belongs to the normal part of PMU. 399 + */ 400 + hisi_l3c_pmu->ext_num = ext_num - 1; 401 + 402 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) { 403 + hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1); 404 + if (IS_ERR(hisi_l3c_pmu->ext_base[i])) 405 + return PTR_ERR(hisi_l3c_pmu->ext_base[i]); 406 + 407 + irq = platform_get_irq(pdev, i + 1); 408 + if (irq < 0) 409 + return irq; 410 + 411 + irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d", 412 + dev_name(&pdev->dev), i + 1); 413 + if (!irqname) 414 + return -ENOMEM; 415 + 416 + ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr, 417 + IRQF_NOBALANCING | IRQF_NO_THREAD, 418 + irqname, l3c_pmu); 419 + if (ret < 0) 420 + return dev_err_probe(&pdev->dev, ret, 421 + "Fail to request EXT IRQ: %d.\n", irq); 422 + 423 + hisi_l3c_pmu->ext_irq[i] = irq; 424 + } 545 425 546 426 return 0; 547 427 } ··· 602 394 603 395 static struct attribute *hisi_l3c_pmu_v2_format_attr[] = { 604 396 HISI_PMU_FORMAT_ATTR(event, "config:0-7"), 605 - HISI_PMU_FORMAT_ATTR(tt_core, "config1:0-7"), 397 + HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"), 606 398 HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"), 607 399 HISI_PMU_FORMAT_ATTR(datasrc_cfg, "config1:11-15"), 608 400 HISI_PMU_FORMAT_ATTR(datasrc_skt, "config1:16"), ··· 612 404 static const struct attribute_group hisi_l3c_pmu_v2_format_group = { 613 405 .name = "format", 614 406 .attrs = hisi_l3c_pmu_v2_format_attr, 407 + }; 408 + 409 + static struct attribute 
*hisi_l3c_pmu_v3_format_attr[] = { 410 + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), 411 + HISI_PMU_FORMAT_ATTR(ext, "config:16-17"), 412 + HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"), 413 + HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"), 414 + NULL 415 + }; 416 + 417 + static const struct attribute_group hisi_l3c_pmu_v3_format_group = { 418 + .name = "format", 419 + .attrs = hisi_l3c_pmu_v3_format_attr, 615 420 }; 616 421 617 422 static struct attribute *hisi_l3c_pmu_v1_events_attr[] = { ··· 662 441 .attrs = hisi_l3c_pmu_v2_events_attr, 663 442 }; 664 443 444 + static struct attribute *hisi_l3c_pmu_v3_events_attr[] = { 445 + HISI_PMU_EVENT_ATTR(rd_spipe, 0x18), 446 + HISI_PMU_EVENT_ATTR(rd_hit_spipe, 0x19), 447 + HISI_PMU_EVENT_ATTR(wr_spipe, 0x1a), 448 + HISI_PMU_EVENT_ATTR(wr_hit_spipe, 0x1b), 449 + HISI_PMU_EVENT_ATTR(io_rd_spipe, 0x1c), 450 + HISI_PMU_EVENT_ATTR(io_rd_hit_spipe, 0x1d), 451 + HISI_PMU_EVENT_ATTR(io_wr_spipe, 0x1e), 452 + HISI_PMU_EVENT_ATTR(io_wr_hit_spipe, 0x1f), 453 + HISI_PMU_EVENT_ATTR(cycles, 0x7f), 454 + HISI_PMU_EVENT_ATTR(l3c_ref, 0xbc), 455 + HISI_PMU_EVENT_ATTR(l3c2ring, 0xbd), 456 + NULL 457 + }; 458 + 459 + static const struct attribute_group hisi_l3c_pmu_v3_events_group = { 460 + .name = "events", 461 + .attrs = hisi_l3c_pmu_v3_events_attr, 462 + }; 463 + 665 464 static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = { 666 465 &hisi_l3c_pmu_v1_format_group, 667 466 &hisi_l3c_pmu_v1_events_group, ··· 698 457 NULL 699 458 }; 700 459 460 + static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = { 461 + &hisi_l3c_pmu_v3_format_group, 462 + &hisi_l3c_pmu_v3_events_group, 463 + &hisi_pmu_cpumask_attr_group, 464 + &hisi_pmu_identifier_group, 465 + NULL 466 + }; 467 + 468 + static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = { 469 + .support_ext = true, 470 + }; 471 + 472 + static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = { 473 + .support_ext = false, 474 + }; 475 + 476 + static const struct 
hisi_pmu_dev_info hisi_l3c_pmu_v1 = { 477 + .attr_groups = hisi_l3c_pmu_v1_attr_groups, 478 + .counter_bits = 48, 479 + .check_event = L3C_V1_NR_EVENTS, 480 + .private = &hisi_l3c_pmu_not_support_ext, 481 + }; 482 + 483 + static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = { 484 + .attr_groups = hisi_l3c_pmu_v2_attr_groups, 485 + .counter_bits = 64, 486 + .check_event = L3C_V2_NR_EVENTS, 487 + .private = &hisi_l3c_pmu_not_support_ext, 488 + }; 489 + 490 + static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = { 491 + .attr_groups = hisi_l3c_pmu_v3_attr_groups, 492 + .counter_bits = 64, 493 + .check_event = L3C_V2_NR_EVENTS, 494 + .private = &hisi_l3c_pmu_support_ext, 495 + }; 496 + 701 497 static const struct hisi_uncore_ops hisi_uncore_l3c_ops = { 702 498 .write_evtype = hisi_l3c_pmu_write_evtype, 703 - .get_event_idx = hisi_uncore_pmu_get_event_idx, 499 + .get_event_idx = hisi_l3c_pmu_get_event_idx, 704 500 .start_counters = hisi_l3c_pmu_start_counters, 705 501 .stop_counters = hisi_l3c_pmu_stop_counters, 706 502 .enable_counter = hisi_l3c_pmu_enable_counter, ··· 750 472 .clear_int_status = hisi_l3c_pmu_clear_int_status, 751 473 .enable_filter = hisi_l3c_pmu_enable_filter, 752 474 .disable_filter = hisi_l3c_pmu_disable_filter, 475 + .check_filter = hisi_l3c_pmu_check_filter, 753 476 }; 754 477 755 478 static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev, 756 479 struct hisi_pmu *l3c_pmu) 757 480 { 481 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 482 + struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext; 758 483 int ret; 759 484 760 485 ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu); ··· 768 487 if (ret) 769 488 return ret; 770 489 771 - if (l3c_pmu->identifier >= HISI_PMU_V2) { 772 - l3c_pmu->counter_bits = 64; 773 - l3c_pmu->check_event = L3C_V2_NR_EVENTS; 774 - l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v2_attr_groups; 775 - } else { 776 - l3c_pmu->counter_bits = 48; 777 - l3c_pmu->check_event = L3C_V1_NR_EVENTS; 778 - 
l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v1_attr_groups; 779 - } 780 - 490 + l3c_pmu->pmu_events.attr_groups = l3c_pmu->dev_info->attr_groups; 491 + l3c_pmu->counter_bits = l3c_pmu->dev_info->counter_bits; 492 + l3c_pmu->check_event = l3c_pmu->dev_info->check_event; 781 493 l3c_pmu->num_counters = L3C_NR_COUNTERS; 782 494 l3c_pmu->ops = &hisi_uncore_l3c_ops; 783 495 l3c_pmu->dev = &pdev->dev; 784 496 l3c_pmu->on_cpu = -1; 497 + 498 + l3c_pmu_dev_ext = l3c_pmu->dev_info->private; 499 + if (l3c_pmu_dev_ext->support_ext) { 500 + ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev); 501 + if (ret) 502 + return ret; 503 + /* 504 + * The extension events have their own counters with the 505 + * same number of the normal events counters. So we can 506 + * have at maximum num_counters * ext events monitored. 507 + */ 508 + l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS; 509 + } 785 510 786 511 return 0; 787 512 } 788 513 789 514 static int hisi_l3c_pmu_probe(struct platform_device *pdev) 790 515 { 516 + struct hisi_l3c_pmu *hisi_l3c_pmu; 791 517 struct hisi_pmu *l3c_pmu; 792 518 char *name; 793 519 int ret; 794 520 795 - l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL); 796 - if (!l3c_pmu) 521 + hisi_l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*hisi_l3c_pmu), GFP_KERNEL); 522 + if (!hisi_l3c_pmu) 797 523 return -ENOMEM; 798 524 525 + l3c_pmu = &hisi_l3c_pmu->l3c_pmu; 799 526 platform_set_drvdata(pdev, l3c_pmu); 800 527 801 528 ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu); 802 529 if (ret) 803 530 return ret; 804 531 805 - name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d", 806 - l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id); 532 + if (l3c_pmu->topo.sub_id >= 0) 533 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d_%d", 534 + l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id, 535 + l3c_pmu->topo.sub_id); 536 + else 537 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d", 538 + l3c_pmu->topo.sccl_id, 
l3c_pmu->topo.ccl_id); 807 539 if (!name) 808 540 return -ENOMEM; 809 541 ··· 848 554 &l3c_pmu->node); 849 555 } 850 556 557 + static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = { 558 + { "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 }, 559 + { "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 }, 560 + { "HISI0215", (kernel_ulong_t)&hisi_l3c_pmu_v3 }, 561 + {} 562 + }; 563 + MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match); 564 + 851 565 static struct platform_driver hisi_l3c_pmu_driver = { 852 566 .driver = { 853 567 .name = "hisi_l3c_pmu", ··· 866 564 .remove = hisi_l3c_pmu_remove, 867 565 }; 868 566 567 + static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) 568 + { 569 + struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node); 570 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 571 + int ret, i; 572 + 573 + ret = hisi_uncore_pmu_online_cpu(cpu, node); 574 + if (ret) 575 + return ret; 576 + 577 + /* Avoid L3C pmu not supporting ext from ext irq migrating. */ 578 + if (!support_ext(hisi_l3c_pmu)) 579 + return 0; 580 + 581 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) 582 + WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i], 583 + cpumask_of(l3c_pmu->on_cpu))); 584 + 585 + return 0; 586 + } 587 + 588 + static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) 589 + { 590 + struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node); 591 + struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu); 592 + int ret, i; 593 + 594 + ret = hisi_uncore_pmu_offline_cpu(cpu, node); 595 + if (ret) 596 + return ret; 597 + 598 + /* If failed to find any available CPU, skip irq migration. */ 599 + if (l3c_pmu->on_cpu < 0) 600 + return 0; 601 + 602 + /* Avoid L3C pmu not supporting ext from ext irq migrating. 
*/ 603 + if (!support_ext(hisi_l3c_pmu)) 604 + return 0; 605 + 606 + for (i = 0; i < hisi_l3c_pmu->ext_num; i++) 607 + WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i], 608 + cpumask_of(l3c_pmu->on_cpu))); 609 + 610 + return 0; 611 + } 612 + 869 613 static int __init hisi_l3c_pmu_module_init(void) 870 614 { 871 615 int ret; 872 616 873 617 ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 874 618 "AP_PERF_ARM_HISI_L3_ONLINE", 875 - hisi_uncore_pmu_online_cpu, 876 - hisi_uncore_pmu_offline_cpu); 619 + hisi_l3c_pmu_online_cpu, 620 + hisi_l3c_pmu_offline_cpu); 877 621 if (ret) { 878 622 pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret); 879 623 return ret;
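For reference, the flat counter index space the v3 ext support introduces above can be modelled in plain C. This is only a sketch: it assumes L3C_NR_COUNTERS is 8 and that ext block i occupies bitmap bits [(i+1)*8, (i+2)*8) (the real L3C_CNTR_EXT_L/H and L3C_HW_IDX macro definitions fall outside this hunk), and the helper names are illustrative, not the driver's.

```c
#include <assert.h>

/* Assumed layout (the real L3C_CNTR_EXT_* macros are outside this hunk):
 * normal block -> bits [0, 8), ext block i -> bits [(i+1)*8, (i+2)*8). */
#define NR_COUNTERS   8
#define CNTR_EXT_L(i) (((i) + 1) * NR_COUNTERS)
#define CNTR_EXT_H(i) (CNTR_EXT_L(i) + NR_COUNTERS)
/* Map a flat index back to a per-block hardware index, as L3C_HW_IDX()
 * does before any register offset is computed. */
#define HW_IDX(idx)   ((idx) % NR_COUNTERS)

/* Claim the first free flat index in ext block `ext`; -1 means the block
 * is full (where the driver returns -EAGAIN). */
static int alloc_ext_idx(unsigned long *used, int ext)
{
	int idx;

	for (idx = CNTR_EXT_L(ext); idx < CNTR_EXT_H(ext); idx++) {
		if (!(*used & (1UL << idx))) {
			*used |= 1UL << idx;
			return idx;
		}
	}
	return -1;
}
```

Under this layout, flat index 9 is hardware counter 1 of ext block 0, which is why hisi_l3c_pmu_get_counter_offset() and the event_base helpers apply the per-block index rather than the flat one.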
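Similarly, the way hisi_l3c_pmu_get_int_status() folds the per-block interrupt status words into a single value matching that flat index space can be sketched as follows (again assuming 8 counters per block; the function name here is made up for illustration):

```c
#include <assert.h>

#define NR_COUNTERS 8

/* Fold the normal block's interrupt status and each ext block's status
 * into one word: the normal block lands in bits [0, 8) and ext block i
 * in bits [(i + 1) * 8, (i + 2) * 8), so a set bit's position is
 * directly the flat counter index that overflowed. */
static unsigned long merge_int_status(unsigned int normal,
				      const unsigned int *ext, int ext_num)
{
	unsigned long status_ext = 0;
	int i;

	for (i = 0; i < ext_num; i++)
		status_ext |= (unsigned long)ext[i] << (NR_COUNTERS * i);

	return normal | (status_ext << NR_COUNTERS);
}
```

This keeps the generic hisi_uncore_pmu_isr() loop unchanged: it iterates over set bits and hands each flat index to clear_int_status(), which converts back with the per-block index before writing L3C_INT_CLEAR.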
+411
drivers/perf/hisilicon/hisi_uncore_mn_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * HiSilicon SoC MN uncore Hardware event counters support 4 + * 5 + * Copyright (c) 2025 HiSilicon Technologies Co., Ltd. 6 + */ 7 + #include <linux/cpuhotplug.h> 8 + #include <linux/interrupt.h> 9 + #include <linux/iopoll.h> 10 + #include <linux/irq.h> 11 + #include <linux/list.h> 12 + #include <linux/mod_devicetable.h> 13 + #include <linux/property.h> 14 + 15 + #include "hisi_uncore_pmu.h" 16 + 17 + /* Dynamic CPU hotplug state used by MN PMU */ 18 + static enum cpuhp_state hisi_mn_pmu_online; 19 + 20 + /* MN register definition */ 21 + #define HISI_MN_DYNAMIC_CTRL_REG 0x400 22 + #define HISI_MN_DYNAMIC_CTRL_EN BIT(0) 23 + #define HISI_MN_PERF_CTRL_REG 0x408 24 + #define HISI_MN_PERF_CTRL_EN BIT(6) 25 + #define HISI_MN_INT_MASK_REG 0x800 26 + #define HISI_MN_INT_STATUS_REG 0x808 27 + #define HISI_MN_INT_CLEAR_REG 0x80C 28 + #define HISI_MN_EVENT_CTRL_REG 0x1C00 29 + #define HISI_MN_VERSION_REG 0x1C04 30 + #define HISI_MN_EVTYPE0_REG 0x1d00 31 + #define HISI_MN_EVTYPE_MASK GENMASK(7, 0) 32 + #define HISI_MN_CNTR0_REG 0x1e00 33 + #define HISI_MN_EVTYPE_REGn(evtype0, n) ((evtype0) + (n) * 4) 34 + #define HISI_MN_CNTR_REGn(cntr0, n) ((cntr0) + (n) * 8) 35 + 36 + #define HISI_MN_NR_COUNTERS 4 37 + #define HISI_MN_TIMEOUT_US 500U 38 + 39 + struct hisi_mn_pmu_regs { 40 + u32 version; 41 + u32 dyn_ctrl; 42 + u32 perf_ctrl; 43 + u32 int_mask; 44 + u32 int_clear; 45 + u32 int_status; 46 + u32 event_ctrl; 47 + u32 event_type0; 48 + u32 event_cntr0; 49 + }; 50 + 51 + /* 52 + * Each event request takes a certain amount of time to complete. If 53 + * we are counting a latency-related event, we need to wait for all the 54 + * requests to complete. Otherwise, the counter value is slightly larger.
55 + */ 56 + static void hisi_mn_pmu_counter_flush(struct hisi_pmu *mn_pmu) 57 + { 58 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 59 + int ret; 60 + u32 val; 61 + 62 + val = readl(mn_pmu->base + reg_info->dyn_ctrl); 63 + val |= HISI_MN_DYNAMIC_CTRL_EN; 64 + writel(val, mn_pmu->base + reg_info->dyn_ctrl); 65 + 66 + ret = readl_poll_timeout_atomic(mn_pmu->base + reg_info->dyn_ctrl, 67 + val, !(val & HISI_MN_DYNAMIC_CTRL_EN), 68 + 1, HISI_MN_TIMEOUT_US); 69 + if (ret) 70 + dev_warn(mn_pmu->dev, "Counter flush timeout\n"); 71 + } 72 + 73 + static u64 hisi_mn_pmu_read_counter(struct hisi_pmu *mn_pmu, 74 + struct hw_perf_event *hwc) 75 + { 76 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 77 + 78 + return readq(mn_pmu->base + HISI_MN_CNTR_REGn(reg_info->event_cntr0, hwc->idx)); 79 + } 80 + 81 + static void hisi_mn_pmu_write_counter(struct hisi_pmu *mn_pmu, 82 + struct hw_perf_event *hwc, u64 val) 83 + { 84 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 85 + 86 + writeq(val, mn_pmu->base + HISI_MN_CNTR_REGn(reg_info->event_cntr0, hwc->idx)); 87 + } 88 + 89 + static void hisi_mn_pmu_write_evtype(struct hisi_pmu *mn_pmu, int idx, u32 type) 90 + { 91 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 92 + u32 val; 93 + 94 + /* 95 + * Select the appropriate event select register. 96 + * There are 2 32-bit event select registers for the 97 + * 8 hardware counters, each event code is 8-bit wide. 
98 + */ 99 + val = readl(mn_pmu->base + HISI_MN_EVTYPE_REGn(reg_info->event_type0, idx / 4)); 100 + val &= ~(HISI_MN_EVTYPE_MASK << HISI_PMU_EVTYPE_SHIFT(idx)); 101 + val |= (type << HISI_PMU_EVTYPE_SHIFT(idx)); 102 + writel(val, mn_pmu->base + HISI_MN_EVTYPE_REGn(reg_info->event_type0, idx / 4)); 103 + } 104 + 105 + static void hisi_mn_pmu_start_counters(struct hisi_pmu *mn_pmu) 106 + { 107 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 108 + u32 val; 109 + 110 + val = readl(mn_pmu->base + reg_info->perf_ctrl); 111 + val |= HISI_MN_PERF_CTRL_EN; 112 + writel(val, mn_pmu->base + reg_info->perf_ctrl); 113 + } 114 + 115 + static void hisi_mn_pmu_stop_counters(struct hisi_pmu *mn_pmu) 116 + { 117 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 118 + u32 val; 119 + 120 + val = readl(mn_pmu->base + reg_info->perf_ctrl); 121 + val &= ~HISI_MN_PERF_CTRL_EN; 122 + writel(val, mn_pmu->base + reg_info->perf_ctrl); 123 + 124 + hisi_mn_pmu_counter_flush(mn_pmu); 125 + } 126 + 127 + static void hisi_mn_pmu_enable_counter(struct hisi_pmu *mn_pmu, 128 + struct hw_perf_event *hwc) 129 + { 130 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 131 + u32 val; 132 + 133 + val = readl(mn_pmu->base + reg_info->event_ctrl); 134 + val |= BIT(hwc->idx); 135 + writel(val, mn_pmu->base + reg_info->event_ctrl); 136 + } 137 + 138 + static void hisi_mn_pmu_disable_counter(struct hisi_pmu *mn_pmu, 139 + struct hw_perf_event *hwc) 140 + { 141 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 142 + u32 val; 143 + 144 + val = readl(mn_pmu->base + reg_info->event_ctrl); 145 + val &= ~BIT(hwc->idx); 146 + writel(val, mn_pmu->base + reg_info->event_ctrl); 147 + } 148 + 149 + static void hisi_mn_pmu_enable_counter_int(struct hisi_pmu *mn_pmu, 150 + struct hw_perf_event *hwc) 151 + { 152 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 153 + u32 val; 154 + 155 + val = readl(mn_pmu->base + reg_info->int_mask); 156 + val &= 
~BIT(hwc->idx); 157 + writel(val, mn_pmu->base + reg_info->int_mask); 158 + } 159 + 160 + static void hisi_mn_pmu_disable_counter_int(struct hisi_pmu *mn_pmu, 161 + struct hw_perf_event *hwc) 162 + { 163 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 164 + u32 val; 165 + 166 + val = readl(mn_pmu->base + reg_info->int_mask); 167 + val |= BIT(hwc->idx); 168 + writel(val, mn_pmu->base + reg_info->int_mask); 169 + } 170 + 171 + static u32 hisi_mn_pmu_get_int_status(struct hisi_pmu *mn_pmu) 172 + { 173 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 174 + 175 + return readl(mn_pmu->base + reg_info->int_status); 176 + } 177 + 178 + static void hisi_mn_pmu_clear_int_status(struct hisi_pmu *mn_pmu, int idx) 179 + { 180 + struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private; 181 + 182 + writel(BIT(idx), mn_pmu->base + reg_info->int_clear); 183 + } 184 + 185 + static struct attribute *hisi_mn_pmu_format_attr[] = { 186 + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), 187 + NULL 188 + }; 189 + 190 + static const struct attribute_group hisi_mn_pmu_format_group = { 191 + .name = "format", 192 + .attrs = hisi_mn_pmu_format_attr, 193 + }; 194 + 195 + static struct attribute *hisi_mn_pmu_events_attr[] = { 196 + HISI_PMU_EVENT_ATTR(req_eobarrier_num, 0x00), 197 + HISI_PMU_EVENT_ATTR(req_ecbarrier_num, 0x01), 198 + HISI_PMU_EVENT_ATTR(req_dvmop_num, 0x02), 199 + HISI_PMU_EVENT_ATTR(req_dvmsync_num, 0x03), 200 + HISI_PMU_EVENT_ATTR(req_retry_num, 0x04), 201 + HISI_PMU_EVENT_ATTR(req_writenosnp_num, 0x05), 202 + HISI_PMU_EVENT_ATTR(req_readnosnp_num, 0x06), 203 + HISI_PMU_EVENT_ATTR(snp_dvm_num, 0x07), 204 + HISI_PMU_EVENT_ATTR(snp_dvmsync_num, 0x08), 205 + HISI_PMU_EVENT_ATTR(l3t_req_dvm_num, 0x09), 206 + HISI_PMU_EVENT_ATTR(l3t_req_dvmsync_num, 0x0A), 207 + HISI_PMU_EVENT_ATTR(mn_req_dvm_num, 0x0B), 208 + HISI_PMU_EVENT_ATTR(mn_req_dvmsync_num, 0x0C), 209 + HISI_PMU_EVENT_ATTR(pa_req_dvm_num, 0x0D), 210 + HISI_PMU_EVENT_ATTR(pa_req_dvmsync_num, 
0x0E), 211 + HISI_PMU_EVENT_ATTR(snp_dvm_latency, 0x80), 212 + HISI_PMU_EVENT_ATTR(snp_dvmsync_latency, 0x81), 213 + HISI_PMU_EVENT_ATTR(l3t_req_dvm_latency, 0x82), 214 + HISI_PMU_EVENT_ATTR(l3t_req_dvmsync_latency, 0x83), 215 + HISI_PMU_EVENT_ATTR(mn_req_dvm_latency, 0x84), 216 + HISI_PMU_EVENT_ATTR(mn_req_dvmsync_latency, 0x85), 217 + HISI_PMU_EVENT_ATTR(pa_req_dvm_latency, 0x86), 218 + HISI_PMU_EVENT_ATTR(pa_req_dvmsync_latency, 0x87), 219 + NULL 220 + }; 221 + 222 + static const struct attribute_group hisi_mn_pmu_events_group = { 223 + .name = "events", 224 + .attrs = hisi_mn_pmu_events_attr, 225 + }; 226 + 227 + static const struct attribute_group *hisi_mn_pmu_attr_groups[] = { 228 + &hisi_mn_pmu_format_group, 229 + &hisi_mn_pmu_events_group, 230 + &hisi_pmu_cpumask_attr_group, 231 + &hisi_pmu_identifier_group, 232 + NULL 233 + }; 234 + 235 + static const struct hisi_uncore_ops hisi_uncore_mn_ops = { 236 + .write_evtype = hisi_mn_pmu_write_evtype, 237 + .get_event_idx = hisi_uncore_pmu_get_event_idx, 238 + .start_counters = hisi_mn_pmu_start_counters, 239 + .stop_counters = hisi_mn_pmu_stop_counters, 240 + .enable_counter = hisi_mn_pmu_enable_counter, 241 + .disable_counter = hisi_mn_pmu_disable_counter, 242 + .enable_counter_int = hisi_mn_pmu_enable_counter_int, 243 + .disable_counter_int = hisi_mn_pmu_disable_counter_int, 244 + .write_counter = hisi_mn_pmu_write_counter, 245 + .read_counter = hisi_mn_pmu_read_counter, 246 + .get_int_status = hisi_mn_pmu_get_int_status, 247 + .clear_int_status = hisi_mn_pmu_clear_int_status, 248 + }; 249 + 250 + static int hisi_mn_pmu_dev_init(struct platform_device *pdev, 251 + struct hisi_pmu *mn_pmu) 252 + { 253 + struct hisi_mn_pmu_regs *reg_info; 254 + int ret; 255 + 256 + hisi_uncore_pmu_init_topology(mn_pmu, &pdev->dev); 257 + 258 + if (mn_pmu->topo.scl_id < 0) 259 + return dev_err_probe(&pdev->dev, -EINVAL, 260 + "Failed to read MN scl id\n"); 261 + 262 + if (mn_pmu->topo.index_id < 0) 263 + return 
dev_err_probe(&pdev->dev, -EINVAL, 264 + "Failed to read MN index id\n"); 265 + 266 + mn_pmu->base = devm_platform_ioremap_resource(pdev, 0); 267 + if (IS_ERR(mn_pmu->base)) 268 + return dev_err_probe(&pdev->dev, PTR_ERR(mn_pmu->base), 269 + "Failed to ioremap resource\n"); 270 + 271 + ret = hisi_uncore_pmu_init_irq(mn_pmu, pdev); 272 + if (ret) 273 + return ret; 274 + 275 + mn_pmu->dev_info = device_get_match_data(&pdev->dev); 276 + if (!mn_pmu->dev_info) 277 + return -ENODEV; 278 + 279 + mn_pmu->pmu_events.attr_groups = mn_pmu->dev_info->attr_groups; 280 + mn_pmu->counter_bits = mn_pmu->dev_info->counter_bits; 281 + mn_pmu->check_event = mn_pmu->dev_info->check_event; 282 + mn_pmu->num_counters = HISI_MN_NR_COUNTERS; 283 + mn_pmu->ops = &hisi_uncore_mn_ops; 284 + mn_pmu->dev = &pdev->dev; 285 + mn_pmu->on_cpu = -1; 286 + 287 + reg_info = mn_pmu->dev_info->private; 288 + mn_pmu->identifier = readl(mn_pmu->base + reg_info->version); 289 + 290 + return 0; 291 + } 292 + 293 + static void hisi_mn_pmu_remove_cpuhp(void *hotplug_node) 294 + { 295 + cpuhp_state_remove_instance_nocalls(hisi_mn_pmu_online, hotplug_node); 296 + } 297 + 298 + static void hisi_mn_pmu_unregister(void *pmu) 299 + { 300 + perf_pmu_unregister(pmu); 301 + } 302 + 303 + static int hisi_mn_pmu_probe(struct platform_device *pdev) 304 + { 305 + struct hisi_pmu *mn_pmu; 306 + char *name; 307 + int ret; 308 + 309 + mn_pmu = devm_kzalloc(&pdev->dev, sizeof(*mn_pmu), GFP_KERNEL); 310 + if (!mn_pmu) 311 + return -ENOMEM; 312 + 313 + platform_set_drvdata(pdev, mn_pmu); 314 + 315 + ret = hisi_mn_pmu_dev_init(pdev, mn_pmu); 316 + if (ret) 317 + return ret; 318 + 319 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_scl%d_mn%d", 320 + mn_pmu->topo.scl_id, mn_pmu->topo.index_id); 321 + if (!name) 322 + return -ENOMEM; 323 + 324 + ret = cpuhp_state_add_instance(hisi_mn_pmu_online, &mn_pmu->node); 325 + if (ret) 326 + return dev_err_probe(&pdev->dev, ret, "Failed to register cpu hotplug\n"); 327 + 328 + ret = 
devm_add_action_or_reset(&pdev->dev, hisi_mn_pmu_remove_cpuhp, &mn_pmu->node); 329 + if (ret) 330 + return ret; 331 + 332 + hisi_pmu_init(mn_pmu, THIS_MODULE); 333 + 334 + ret = perf_pmu_register(&mn_pmu->pmu, name, -1); 335 + if (ret) 336 + return dev_err_probe(mn_pmu->dev, ret, "Failed to register MN PMU\n"); 337 + 338 + return devm_add_action_or_reset(&pdev->dev, hisi_mn_pmu_unregister, &mn_pmu->pmu); 339 + } 340 + 341 + static struct hisi_mn_pmu_regs hisi_mn_v1_pmu_regs = { 342 + .version = HISI_MN_VERSION_REG, 343 + .dyn_ctrl = HISI_MN_DYNAMIC_CTRL_REG, 344 + .perf_ctrl = HISI_MN_PERF_CTRL_REG, 345 + .int_mask = HISI_MN_INT_MASK_REG, 346 + .int_clear = HISI_MN_INT_CLEAR_REG, 347 + .int_status = HISI_MN_INT_STATUS_REG, 348 + .event_ctrl = HISI_MN_EVENT_CTRL_REG, 349 + .event_type0 = HISI_MN_EVTYPE0_REG, 350 + .event_cntr0 = HISI_MN_CNTR0_REG, 351 + }; 352 + 353 + static const struct hisi_pmu_dev_info hisi_mn_v1 = { 354 + .attr_groups = hisi_mn_pmu_attr_groups, 355 + .counter_bits = 48, 356 + .check_event = HISI_MN_EVTYPE_MASK, 357 + .private = &hisi_mn_v1_pmu_regs, 358 + }; 359 + 360 + static const struct acpi_device_id hisi_mn_pmu_acpi_match[] = { 361 + { "HISI0222", (kernel_ulong_t) &hisi_mn_v1 }, 362 + { } 363 + }; 364 + MODULE_DEVICE_TABLE(acpi, hisi_mn_pmu_acpi_match); 365 + 366 + static struct platform_driver hisi_mn_pmu_driver = { 367 + .driver = { 368 + .name = "hisi_mn_pmu", 369 + .acpi_match_table = hisi_mn_pmu_acpi_match, 370 + /* 371 + * We have not worked out a safe bind/unbind process. 372 + * Forcefully unbinding during sampling will lead to a 373 + * kernel panic, so this is not supported yet.
374 + */ 375 + .suppress_bind_attrs = true, 376 + }, 377 + .probe = hisi_mn_pmu_probe, 378 + }; 379 + 380 + static int __init hisi_mn_pmu_module_init(void) 381 + { 382 + int ret; 383 + 384 + ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "perf/hisi/mn:online", 385 + hisi_uncore_pmu_online_cpu, 386 + hisi_uncore_pmu_offline_cpu); 387 + if (ret < 0) { 388 + pr_err("hisi_mn_pmu: Failed to setup MN PMU hotplug: %d\n", ret); 389 + return ret; 390 + } 391 + hisi_mn_pmu_online = ret; 392 + 393 + ret = platform_driver_register(&hisi_mn_pmu_driver); 394 + if (ret) 395 + cpuhp_remove_multi_state(hisi_mn_pmu_online); 396 + 397 + return ret; 398 + } 399 + module_init(hisi_mn_pmu_module_init); 400 + 401 + static void __exit hisi_mn_pmu_module_exit(void) 402 + { 403 + platform_driver_unregister(&hisi_mn_pmu_driver); 404 + cpuhp_remove_multi_state(hisi_mn_pmu_online); 405 + } 406 + module_exit(hisi_mn_pmu_module_exit); 407 + 408 + MODULE_IMPORT_NS("HISI_PMU"); 409 + MODULE_DESCRIPTION("HiSilicon SoC MN uncore PMU driver"); 410 + MODULE_LICENSE("GPL"); 411 + MODULE_AUTHOR("Junhao He <hejunhao3@huawei.com>");
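The MN driver's enable/disable interrupt-control pair is a plain read-modify-write on the INT_MASK register: a set bit masks that counter's overflow interrupt, a cleared bit unmasks it. A standalone sketch of that pattern (plain C, with an ordinary variable standing in for the MMIO register; all names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone model of the MN PMU interrupt-mask pattern: bit <idx> of
 * the mask register masks counter <idx>'s overflow interrupt, so
 * enabling the interrupt clears the bit and disabling it sets the bit.
 * fake_int_mask is an in-memory stand-in for the real register.
 */
static uint32_t fake_int_mask;

static void counter_int_enable(int idx)
{
	fake_int_mask &= ~(UINT32_C(1) << idx);	/* unmask: clear the bit */
}

static void counter_int_disable(int idx)
{
	fake_int_mask |= UINT32_C(1) << idx;	/* mask: set the bit */
}
```

In the driver itself the same sequence is done with readl()/writel() against `mn_pmu->base + reg_info->int_mask`, so only the targeted counter's bit changes on each update.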
+443
drivers/perf/hisilicon/hisi_uncore_noc_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Driver for HiSilicon Uncore NoC (Network on Chip) PMU device 4 + * 5 + * Copyright (c) 2025 HiSilicon Technologies Co., Ltd. 6 + * Author: Yicong Yang <yangyicong@hisilicon.com> 7 + */ 8 + #include <linux/bitops.h> 9 + #include <linux/cpuhotplug.h> 10 + #include <linux/device.h> 11 + #include <linux/io.h> 12 + #include <linux/mod_devicetable.h> 13 + #include <linux/module.h> 14 + #include <linux/platform_device.h> 15 + #include <linux/property.h> 16 + #include <linux/sysfs.h> 17 + 18 + #include "hisi_uncore_pmu.h" 19 + 20 + #define NOC_PMU_VERSION 0x1e00 21 + #define NOC_PMU_GLOBAL_CTRL 0x1e04 22 + #define NOC_PMU_GLOBAL_CTRL_PMU_EN BIT(0) 23 + #define NOC_PMU_GLOBAL_CTRL_TT_EN BIT(1) 24 + #define NOC_PMU_CNT_INFO 0x1e08 25 + #define NOC_PMU_CNT_INFO_OVERFLOW(n) BIT(n) 26 + #define NOC_PMU_EVENT_CTRL0 0x1e20 27 + #define NOC_PMU_EVENT_CTRL_TYPE GENMASK(4, 0) 28 + /* 29 + * Note channel of 0x0 will reset the counter value, so don't do it before 30 + * we read out the counter. 31 + */ 32 + #define NOC_PMU_EVENT_CTRL_CHANNEL GENMASK(10, 8) 33 + #define NOC_PMU_EVENT_CTRL_EN BIT(11) 34 + #define NOC_PMU_EVENT_COUNTER0 0x1e80 35 + 36 + #define NOC_PMU_NR_COUNTERS 4 37 + #define NOC_PMU_CH_DEFAULT 0x7 38 + 39 + #define NOC_PMU_EVENT_CTRLn(ctrl0, n) ((ctrl0) + 4 * (n)) 40 + #define NOC_PMU_EVENT_CNTRn(cntr0, n) ((cntr0) + 8 * (n)) 41 + 42 + HISI_PMU_EVENT_ATTR_EXTRACTOR(ch, config1, 2, 0); 43 + HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_en, config1, 3, 3); 44 + 45 + /* Dynamic CPU hotplug state used by this PMU driver */ 46 + static enum cpuhp_state hisi_noc_pmu_cpuhp_state; 47 + 48 + struct hisi_noc_pmu_regs { 49 + u32 version; 50 + u32 pmu_ctrl; 51 + u32 event_ctrl0; 52 + u32 event_cntr0; 53 + u32 overflow_status; 54 + }; 55 + 56 + /* 57 + * Tracetag filtering is not per event and all the events should keep 58 + * the consistence. 
the consistence. Return true if the new comer matches the 59 + * tracetag filtering configuration of the currently scheduled events. 60 + */ 61 + static bool hisi_noc_pmu_check_global_filter(struct perf_event *curr, 62 + struct perf_event *new) 63 + { 64 + return hisi_get_tt_en(curr) == hisi_get_tt_en(new); 65 + } 66 + 67 + static void hisi_noc_pmu_write_evtype(struct hisi_pmu *noc_pmu, int idx, u32 type) 68 + { 69 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 70 + u32 reg; 71 + 72 + reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, idx)); 73 + reg &= ~NOC_PMU_EVENT_CTRL_TYPE; 74 + reg |= FIELD_PREP(NOC_PMU_EVENT_CTRL_TYPE, type); 75 + writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, idx)); 76 + } 77 + 78 + static int hisi_noc_pmu_get_event_idx(struct perf_event *event) 79 + { 80 + struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu); 81 + struct hisi_pmu_hwevents *pmu_events = &noc_pmu->pmu_events; 82 + int cur_idx; 83 + 84 + cur_idx = find_first_bit(pmu_events->used_mask, noc_pmu->num_counters); 85 + if (cur_idx != noc_pmu->num_counters && 86 + !hisi_noc_pmu_check_global_filter(pmu_events->hw_events[cur_idx], event)) 87 + return -EAGAIN; 88 + 89 + return hisi_uncore_pmu_get_event_idx(event); 90 + } 91 + 92 + static u64 hisi_noc_pmu_read_counter(struct hisi_pmu *noc_pmu, 93 + struct hw_perf_event *hwc) 94 + { 95 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 96 + 97 + return readq(noc_pmu->base + NOC_PMU_EVENT_CNTRn(reg_info->event_cntr0, hwc->idx)); 98 + } 99 + 100 + static void hisi_noc_pmu_write_counter(struct hisi_pmu *noc_pmu, 101 + struct hw_perf_event *hwc, u64 val) 102 + { 103 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 104 + 105 + writeq(val, noc_pmu->base + NOC_PMU_EVENT_CNTRn(reg_info->event_cntr0, hwc->idx)); 106 + } 107 + 108 + static void hisi_noc_pmu_enable_counter(struct hisi_pmu *noc_pmu, 109 + struct hw_perf_event *hwc) 110 + { 111 + struct
hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 112 + u32 reg; 113 + 114 + reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 115 + reg |= NOC_PMU_EVENT_CTRL_EN; 116 + writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 117 + } 118 + 119 + static void hisi_noc_pmu_disable_counter(struct hisi_pmu *noc_pmu, 120 + struct hw_perf_event *hwc) 121 + { 122 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 123 + u32 reg; 124 + 125 + reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 126 + reg &= ~NOC_PMU_EVENT_CTRL_EN; 127 + writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 128 + } 129 + 130 + static void hisi_noc_pmu_enable_counter_int(struct hisi_pmu *noc_pmu, 131 + struct hw_perf_event *hwc) 132 + { 133 + /* Interrupts are not supported, so this is a stub. */ 134 + } 135 + 136 + static void hisi_noc_pmu_disable_counter_int(struct hisi_pmu *noc_pmu, 137 + struct hw_perf_event *hwc) 138 + { 139 + } 140 + 141 + static void hisi_noc_pmu_start_counters(struct hisi_pmu *noc_pmu) 142 + { 143 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 144 + u32 reg; 145 + 146 + reg = readl(noc_pmu->base + reg_info->pmu_ctrl); 147 + reg |= NOC_PMU_GLOBAL_CTRL_PMU_EN; 148 + writel(reg, noc_pmu->base + reg_info->pmu_ctrl); 149 + } 150 + 151 + static void hisi_noc_pmu_stop_counters(struct hisi_pmu *noc_pmu) 152 + { 153 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 154 + u32 reg; 155 + 156 + reg = readl(noc_pmu->base + reg_info->pmu_ctrl); 157 + reg &= ~NOC_PMU_GLOBAL_CTRL_PMU_EN; 158 + writel(reg, noc_pmu->base + reg_info->pmu_ctrl); 159 + } 160 + 161 + static u32 hisi_noc_pmu_get_int_status(struct hisi_pmu *noc_pmu) 162 + { 163 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 164 + 165 + return readl(noc_pmu->base + reg_info->overflow_status); 166 + } 167 + 168 + static void
hisi_noc_pmu_clear_int_status(struct hisi_pmu *noc_pmu, int idx) 169 + { 170 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 171 + u32 reg; 172 + 173 + reg = readl(noc_pmu->base + reg_info->overflow_status); 174 + reg &= ~NOC_PMU_CNT_INFO_OVERFLOW(idx); 175 + writel(reg, noc_pmu->base + reg_info->overflow_status); 176 + } 177 + 178 + static void hisi_noc_pmu_enable_filter(struct perf_event *event) 179 + { 180 + struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu); 181 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 182 + struct hw_perf_event *hwc = &event->hw; 183 + u32 tt_en = hisi_get_tt_en(event); 184 + u32 ch = hisi_get_ch(event); 185 + u32 reg; 186 + 187 + if (!ch) 188 + ch = NOC_PMU_CH_DEFAULT; 189 + 190 + reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 191 + reg &= ~NOC_PMU_EVENT_CTRL_CHANNEL; 192 + reg |= FIELD_PREP(NOC_PMU_EVENT_CTRL_CHANNEL, ch); 193 + writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx)); 194 + 195 + /* 196 + * Since tracetag filter applies to all the counters, don't touch it 197 + * if user doesn't specify it explicitly. 198 + */ 199 + if (tt_en) { 200 + reg = readl(noc_pmu->base + reg_info->pmu_ctrl); 201 + reg |= NOC_PMU_GLOBAL_CTRL_TT_EN; 202 + writel(reg, noc_pmu->base + reg_info->pmu_ctrl); 203 + } 204 + } 205 + 206 + static void hisi_noc_pmu_disable_filter(struct perf_event *event) 207 + { 208 + struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu); 209 + struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private; 210 + u32 tt_en = hisi_get_tt_en(event); 211 + u32 reg; 212 + 213 + /* 214 + * If we're not the last counter, don't touch the global tracetag 215 + * configuration. 
216 + */ 217 + if (bitmap_weight(noc_pmu->pmu_events.used_mask, noc_pmu->num_counters) > 1) 218 + return; 219 + 220 + if (tt_en) { 221 + reg = readl(noc_pmu->base + reg_info->pmu_ctrl); 222 + reg &= ~NOC_PMU_GLOBAL_CTRL_TT_EN; 223 + writel(reg, noc_pmu->base + reg_info->pmu_ctrl); 224 + } 225 + } 226 + 227 + static const struct hisi_uncore_ops hisi_uncore_noc_ops = { 228 + .write_evtype = hisi_noc_pmu_write_evtype, 229 + .get_event_idx = hisi_noc_pmu_get_event_idx, 230 + .read_counter = hisi_noc_pmu_read_counter, 231 + .write_counter = hisi_noc_pmu_write_counter, 232 + .enable_counter = hisi_noc_pmu_enable_counter, 233 + .disable_counter = hisi_noc_pmu_disable_counter, 234 + .enable_counter_int = hisi_noc_pmu_enable_counter_int, 235 + .disable_counter_int = hisi_noc_pmu_disable_counter_int, 236 + .start_counters = hisi_noc_pmu_start_counters, 237 + .stop_counters = hisi_noc_pmu_stop_counters, 238 + .get_int_status = hisi_noc_pmu_get_int_status, 239 + .clear_int_status = hisi_noc_pmu_clear_int_status, 240 + .enable_filter = hisi_noc_pmu_enable_filter, 241 + .disable_filter = hisi_noc_pmu_disable_filter, 242 + }; 243 + 244 + static struct attribute *hisi_noc_pmu_format_attrs[] = { 245 + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), 246 + HISI_PMU_FORMAT_ATTR(ch, "config1:0-2"), 247 + HISI_PMU_FORMAT_ATTR(tt_en, "config1:3"), 248 + NULL 249 + }; 250 + 251 + static const struct attribute_group hisi_noc_pmu_format_group = { 252 + .name = "format", 253 + .attrs = hisi_noc_pmu_format_attrs, 254 + }; 255 + 256 + static struct attribute *hisi_noc_pmu_events_attrs[] = { 257 + HISI_PMU_EVENT_ATTR(cycles, 0x0e), 258 + /* Flux on/off the ring */ 259 + HISI_PMU_EVENT_ATTR(ingress_flow_sum, 0x1a), 260 + HISI_PMU_EVENT_ATTR(egress_flow_sum, 0x17), 261 + /* Buffer full duration on/off the ring */ 262 + HISI_PMU_EVENT_ATTR(ingress_buf_full, 0x19), 263 + HISI_PMU_EVENT_ATTR(egress_buf_full, 0x12), 264 + /* Failure packets count on/off the ring */ 265 + 
HISI_PMU_EVENT_ATTR(cw_ingress_fail, 0x01), 266 + HISI_PMU_EVENT_ATTR(cc_ingress_fail, 0x09), 267 + HISI_PMU_EVENT_ATTR(cw_egress_fail, 0x03), 268 + HISI_PMU_EVENT_ATTR(cc_egress_fail, 0x0b), 269 + /* Flux of the ring */ 270 + HISI_PMU_EVENT_ATTR(cw_main_flow_sum, 0x05), 271 + HISI_PMU_EVENT_ATTR(cc_main_flow_sum, 0x0d), 272 + NULL 273 + }; 274 + 275 + static const struct attribute_group hisi_noc_pmu_events_group = { 276 + .name = "events", 277 + .attrs = hisi_noc_pmu_events_attrs, 278 + }; 279 + 280 + static const struct attribute_group *hisi_noc_pmu_attr_groups[] = { 281 + &hisi_noc_pmu_format_group, 282 + &hisi_noc_pmu_events_group, 283 + &hisi_pmu_cpumask_attr_group, 284 + &hisi_pmu_identifier_group, 285 + NULL 286 + }; 287 + 288 + static int hisi_noc_pmu_dev_init(struct platform_device *pdev, struct hisi_pmu *noc_pmu) 289 + { 290 + struct hisi_noc_pmu_regs *reg_info; 291 + 292 + hisi_uncore_pmu_init_topology(noc_pmu, &pdev->dev); 293 + 294 + if (noc_pmu->topo.scl_id < 0) 295 + return dev_err_probe(&pdev->dev, -EINVAL, "failed to get scl-id\n"); 296 + 297 + if (noc_pmu->topo.index_id < 0) 298 + return dev_err_probe(&pdev->dev, -EINVAL, "failed to get idx-id\n"); 299 + 300 + if (noc_pmu->topo.sub_id < 0) 301 + return dev_err_probe(&pdev->dev, -EINVAL, "failed to get sub-id\n"); 302 + 303 + noc_pmu->base = devm_platform_ioremap_resource(pdev, 0); 304 + if (IS_ERR(noc_pmu->base)) 305 + return dev_err_probe(&pdev->dev, PTR_ERR(noc_pmu->base), 306 + "fail to remap io memory\n"); 307 + 308 + noc_pmu->dev_info = device_get_match_data(&pdev->dev); 309 + if (!noc_pmu->dev_info) 310 + return -ENODEV; 311 + 312 + noc_pmu->pmu_events.attr_groups = noc_pmu->dev_info->attr_groups; 313 + noc_pmu->counter_bits = noc_pmu->dev_info->counter_bits; 314 + noc_pmu->check_event = noc_pmu->dev_info->check_event; 315 + noc_pmu->num_counters = NOC_PMU_NR_COUNTERS; 316 + noc_pmu->ops = &hisi_uncore_noc_ops; 317 + noc_pmu->dev = &pdev->dev; 318 + noc_pmu->on_cpu = -1; 319 + 320 + reg_info 
= noc_pmu->dev_info->private; 321 + noc_pmu->identifier = readl(noc_pmu->base + reg_info->version); 322 + 323 + return 0; 324 + } 325 + 326 + static void hisi_noc_pmu_remove_cpuhp_instance(void *hotplug_node) 327 + { 328 + cpuhp_state_remove_instance_nocalls(hisi_noc_pmu_cpuhp_state, hotplug_node); 329 + } 330 + 331 + static void hisi_noc_pmu_unregister_pmu(void *pmu) 332 + { 333 + perf_pmu_unregister(pmu); 334 + } 335 + 336 + static int hisi_noc_pmu_probe(struct platform_device *pdev) 337 + { 338 + struct device *dev = &pdev->dev; 339 + struct hisi_pmu *noc_pmu; 340 + char *name; 341 + int ret; 342 + 343 + noc_pmu = devm_kzalloc(dev, sizeof(*noc_pmu), GFP_KERNEL); 344 + if (!noc_pmu) 345 + return -ENOMEM; 346 + 347 + /* 348 + * HiSilicon Uncore PMU framework needs to get common hisi_pmu device 349 + * from device's drvdata. 350 + */ 351 + platform_set_drvdata(pdev, noc_pmu); 352 + 353 + ret = hisi_noc_pmu_dev_init(pdev, noc_pmu); 354 + if (ret) 355 + return ret; 356 + 357 + ret = cpuhp_state_add_instance(hisi_noc_pmu_cpuhp_state, &noc_pmu->node); 358 + if (ret) 359 + return dev_err_probe(dev, ret, "Fail to register cpuhp instance\n"); 360 + 361 + ret = devm_add_action_or_reset(dev, hisi_noc_pmu_remove_cpuhp_instance, 362 + &noc_pmu->node); 363 + if (ret) 364 + return ret; 365 + 366 + hisi_pmu_init(noc_pmu, THIS_MODULE); 367 + 368 + name = devm_kasprintf(dev, GFP_KERNEL, "hisi_scl%d_noc%d_%d", 369 + noc_pmu->topo.scl_id, noc_pmu->topo.index_id, 370 + noc_pmu->topo.sub_id); 371 + if (!name) 372 + return -ENOMEM; 373 + 374 + ret = perf_pmu_register(&noc_pmu->pmu, name, -1); 375 + if (ret) 376 + return dev_err_probe(dev, ret, "Fail to register PMU\n"); 377 + 378 + return devm_add_action_or_reset(dev, hisi_noc_pmu_unregister_pmu, 379 + &noc_pmu->pmu); 380 + } 381 + 382 + static struct hisi_noc_pmu_regs hisi_noc_v1_pmu_regs = { 383 + .version = NOC_PMU_VERSION, 384 + .pmu_ctrl = NOC_PMU_GLOBAL_CTRL, 385 + .event_ctrl0 = NOC_PMU_EVENT_CTRL0, 386 + .event_cntr0 = 
NOC_PMU_EVENT_COUNTER0, 387 + .overflow_status = NOC_PMU_CNT_INFO, 388 + }; 389 + 390 + static const struct hisi_pmu_dev_info hisi_noc_v1 = { 391 + .attr_groups = hisi_noc_pmu_attr_groups, 392 + .counter_bits = 64, 393 + .check_event = NOC_PMU_EVENT_CTRL_TYPE, 394 + .private = &hisi_noc_v1_pmu_regs, 395 + }; 396 + 397 + static const struct acpi_device_id hisi_noc_pmu_ids[] = { 398 + { "HISI04E0", (kernel_ulong_t) &hisi_noc_v1 }, 399 + { } 400 + }; 401 + MODULE_DEVICE_TABLE(acpi, hisi_noc_pmu_ids); 402 + 403 + static struct platform_driver hisi_noc_pmu_driver = { 404 + .driver = { 405 + .name = "hisi_noc_pmu", 406 + .acpi_match_table = hisi_noc_pmu_ids, 407 + .suppress_bind_attrs = true, 408 + }, 409 + .probe = hisi_noc_pmu_probe, 410 + }; 411 + 412 + static int __init hisi_noc_pmu_module_init(void) 413 + { 414 + int ret; 415 + 416 + ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "perf/hisi/noc:online", 417 + hisi_uncore_pmu_online_cpu, 418 + hisi_uncore_pmu_offline_cpu); 419 + if (ret < 0) { 420 + pr_err("hisi_noc_pmu: Fail to setup cpuhp callbacks, ret = %d\n", ret); 421 + return ret; 422 + } 423 + hisi_noc_pmu_cpuhp_state = ret; 424 + 425 + ret = platform_driver_register(&hisi_noc_pmu_driver); 426 + if (ret) 427 + cpuhp_remove_multi_state(hisi_noc_pmu_cpuhp_state); 428 + 429 + return ret; 430 + } 431 + module_init(hisi_noc_pmu_module_init); 432 + 433 + static void __exit hisi_noc_pmu_module_exit(void) 434 + { 435 + platform_driver_unregister(&hisi_noc_pmu_driver); 436 + cpuhp_remove_multi_state(hisi_noc_pmu_cpuhp_state); 437 + } 438 + module_exit(hisi_noc_pmu_module_exit); 439 + 440 + MODULE_IMPORT_NS("HISI_PMU"); 441 + MODULE_DESCRIPTION("HiSilicon SoC Uncore NoC PMU driver"); 442 + MODULE_LICENSE("GPL"); 443 + MODULE_AUTHOR("Yicong Yang <yangyicong@hisilicon.com>");
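The NoC driver packs three fields into each NOC_PMU_EVENT_CTRLn register: the event type in bits [4:0], the channel in bits [10:8], and the counter enable in bit 11, with a channel of 0 replaced by the 0x7 default in hisi_noc_pmu_enable_filter(). A standalone sketch of that layout using plain shifts in place of the kernel's GENMASK()/FIELD_PREP() helpers (the pack function is illustrative, not part of the driver):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the NOC_PMU_EVENT_CTRLn field layout:
 * bits [4:0] event type, bits [10:8] channel, bit 11 counter enable.
 * A channel of 0 is replaced by the 0x7 default, mirroring
 * hisi_noc_pmu_enable_filter(). Plain C model, not the kernel code.
 */
#define EVT_TYPE_MASK	0x1fu			/* GENMASK(4, 0) */
#define EVT_CH_SHIFT	8
#define EVT_CH_MASK	(0x7u << EVT_CH_SHIFT)	/* GENMASK(10, 8) */
#define EVT_EN		(1u << 11)		/* BIT(11) */
#define CH_DEFAULT	0x7u

static uint32_t evt_ctrl_pack(uint32_t type, uint32_t ch, int enable)
{
	uint32_t reg = type & EVT_TYPE_MASK;

	if (!ch)
		ch = CH_DEFAULT;	/* ch == 0 would reset the counter */
	reg |= (ch << EVT_CH_SHIFT) & EVT_CH_MASK;
	if (enable)
		reg |= EVT_EN;
	return reg;
}
```

This also shows why the driver never writes a channel of 0x0 before the counter has been read out: as the comment above NOC_PMU_EVENT_CTRL_CHANNEL notes, a zero channel resets the counter value.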
+3 -2
drivers/perf/hisilicon/hisi_uncore_pmu.c
··· 149 149 clear_bit(idx, hisi_pmu->pmu_events.used_mask); 150 150 } 151 151 152 - static irqreturn_t hisi_uncore_pmu_isr(int irq, void *data) 152 + irqreturn_t hisi_uncore_pmu_isr(int irq, void *data) 153 153 { 154 154 struct hisi_pmu *hisi_pmu = data; 155 155 struct perf_event *event; ··· 178 178 179 179 return IRQ_HANDLED; 180 180 } 181 + EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_isr, "HISI_PMU"); 181 182 182 183 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu, 183 184 struct platform_device *pdev) ··· 235 234 return -EINVAL; 236 235 237 236 hisi_pmu = to_hisi_pmu(event->pmu); 238 - if (event->attr.config > hisi_pmu->check_event) 237 + if ((event->attr.config & HISI_EVENTID_MASK) > hisi_pmu->check_event) 239 238 return -EINVAL; 240 239 241 240 if (hisi_pmu->on_cpu == -1)
+4 -2
drivers/perf/hisilicon/hisi_uncore_pmu.h
··· 24 24 #define pr_fmt(fmt) "hisi_pmu: " fmt 25 25 26 26 #define HISI_PMU_V2 0x30 27 - #define HISI_MAX_COUNTERS 0x10 27 + #define HISI_MAX_COUNTERS 0x18 28 28 #define to_hisi_pmu(p) (container_of(p, struct hisi_pmu, pmu)) 29 29 30 30 #define HISI_PMU_ATTR(_name, _func, _config) \ ··· 43 43 return FIELD_GET(GENMASK_ULL(hi, lo), event->attr.config); \ 44 44 } 45 45 46 - #define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff) 46 + #define HISI_EVENTID_MASK GENMASK(7, 0) 47 + #define HISI_GET_EVENTID(ev) ((ev)->hw.config_base & HISI_EVENTID_MASK) 47 48 48 49 #define HISI_PMU_EVTYPE_BITS 8 49 50 #define HISI_PMU_EVTYPE_SHIFT(idx) ((idx) % 4 * HISI_PMU_EVTYPE_BITS) ··· 165 164 ssize_t hisi_uncore_pmu_identifier_attr_show(struct device *dev, 166 165 struct device_attribute *attr, 167 166 char *page); 167 + irqreturn_t hisi_uncore_pmu_isr(int irq, void *data); 168 168 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu, 169 169 struct platform_device *pdev); 170 170 void hisi_uncore_pmu_init_topology(struct hisi_pmu *hisi_pmu, struct device *dev);
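The header change above introduces HISI_EVENTID_MASK so that hisi_uncore_pmu_event_init() compares only config[7:0] against check_event, instead of the whole config word. A minimal standalone model of the relaxed check (hypothetical helper name, not the kernel function):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the relaxed event ID check: only the low 8 bits of
 * attr.config (HISI_EVENTID_MASK = GENMASK(7, 0)) carry the event ID,
 * so filter fields packed into the upper config bits no longer trip
 * the range check. Plain C sketch, not the kernel code itself.
 */
#define EVENTID_MASK	0xffu

static int event_id_in_range(uint64_t config, uint32_t check_event)
{
	return (config & EVENTID_MASK) <= check_event;
}
```

With the old `config > check_event` test, a config carrying filter bits above bit 7 would be rejected with -EINVAL even though its event ID was valid; masking first keeps only the ID in the comparison.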