Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'for-next/perf' into for-next/core

- Support for additional PMU topologies on HiSilicon platforms
- Support for CCN-512 interconnect PMU
- Support for AXI ID filtering in the IMX8 DDR PMU
- Support for the CCPI2 uncore PMU in ThunderX2
- Driver cleanup to use devm_platform_ioremap_resource()

* for-next/perf:
drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
perf/imx_ddr: Dump AXI ID filter info to userspace
docs/perf: Add AXI ID filter capabilities information
perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus
perf/imx_ddr: Add enhanced AXI ID filter support
bindings: perf: imx-ddr: Add new compatible string
docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk
arm64: perf: Simplify the ARMv8 PMUv3 event attributes
drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.
Documentation: perf: Update documentation for ThunderX2 PMU uncore driver
Documentation: Add documentation for CCN-512 DTS binding
perf: arm-ccn: Enable stats for CCN-512 interconnect
perf/smmuv3: use devm_platform_ioremap_resource() to simplify code
perf/arm-cci: use devm_platform_ioremap_resource() to simplify code
perf/arm-ccn: use devm_platform_ioremap_resource() to simplify code
perf: xgene: use devm_platform_ioremap_resource() to simplify code
perf: hisi: use devm_platform_ioremap_resource() to simplify code

+457 -226
+12 -3
Documentation/admin-guide/perf/imx-ddr.rst
···
  (AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
  devices/imx8_ddr0/format/. The "events" directory describes the events types
  hardware supported that can be used with perf tool, see /sys/bus/event_source/
- devices/imx8_ddr0/events/.
+ devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
+ in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/.
  e.g.::
        perf stat -a -e imx8_ddr0/cycles/ cmd
        perf stat -a -e imx8_ddr0/read/,imx8_ddr0/write/ cmd
···
  AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
  to count reading or writing matches filter setting. Filter setting is various
  from different DRAM controller implementations, which is distinguished by quirks
- in the driver.
+ in the driver. You also can dump info from userspace, filter in "caps" directory
+ indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates
+ whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and
+ value 1 for supported.
 
- * With DDR_CAP_AXI_ID_FILTER quirk.
+ * With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0).
  Filter is defined with two configuration parts:
  --AXI_ID defines AxID matching value.
  --AXI_MASKING defines which bits of AxID are meaningful for the matching.
···
  axi_id to monitor a specific id, rather than having to specify axi_mask.
  e.g.::
        perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
+
+ * With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1).
+ This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
+ counting the number of bytes (as opposed to the number of bursts) from DDR
+ read and write transactions concurrently with another set of data counters.
+11 -9
Documentation/admin-guide/perf/thunderx2-pmu.rst
···
  =============================================================
 
  The ThunderX2 SoC PMU consists of independent, system-wide, per-socket
- PMUs such as the Level 3 Cache (L3C) and DDR4 Memory Controller (DMC).
+ PMUs such as the Level 3 Cache (L3C), DDR4 Memory Controller (DMC) and
+ Cavium Coherent Processor Interconnect (CCPI2).
 
  The DMC has 8 interleaved channels and the L3C has 16 interleaved tiles.
  Events are counted for the default channel (i.e. channel 0) and prorated
  to the total number of channels/tiles.
 
- The DMC and L3C support up to 4 counters. Counters are independently
- programmable and can be started and stopped individually. Each counter
- can be set to a different event. Counters are 32-bit and do not support
- an overflow interrupt; they are read every 2 seconds.
+ The DMC and L3C support up to 4 counters, while the CCPI2 supports up to 8
+ counters. Counters are independently programmable to different events and
+ can be started and stopped individually. None of the counters support an
+ overflow interrupt. DMC and L3C counters are 32-bit and read every 2 seconds.
+ The CCPI2 counters are 64-bit and assumed not to overflow in normal operation.
 
  PMU UNCORE (perf) driver:
 
  The thunderx2_pmu driver registers per-socket perf PMUs for the DMC and
- L3C devices. Each PMU can be used to count up to 4 events
- simultaneously. The PMUs provide a description of their available events
- and configuration options under sysfs, see
- /sys/devices/uncore_<l3c_S/dmc_S/>; S is the socket id.
+ L3C devices. Each PMU can be used to count up to 4 (DMC/L3C) or up to 8
+ (CCPI2) events simultaneously. The PMUs provide a description of their
+ available events and configuration options under sysfs, see
+ /sys/devices/uncore_<l3c_S/dmc_S/ccpi2_S/>; S is the socket id.
 
  The driver does not support sampling, therefore "perf record" will not
  work. Per-task perf sessions are also not supported.
+1
Documentation/devicetree/bindings/perf/arm-ccn.txt
···
        "arm,ccn-502"
        "arm,ccn-504"
        "arm,ccn-508"
+       "arm,ccn-512"
 
  - reg: (standard registers property) physical address and size
    (16MB) of the configuration registers block
+1
Documentation/devicetree/bindings/perf/fsl-imx-ddr.txt
···
  - compatible: should be one of:
        "fsl,imx8-ddr-pmu"
        "fsl,imx8m-ddr-pmu"
+       "fsl,imx8mp-ddr-pmu"
 
  - reg: physical address and size
 
+66 -125
arch/arm64/kernel/perf_event.c
···
 	return sprintf(page, "event=0x%03llx\n", pmu_attr->id);
 }
 
-#define ARMV8_EVENT_ATTR(name, config)						\
-	PMU_EVENT_ATTR(name, armv8_event_attr_##name,				\
-		       config, armv8pmu_events_sysfs_show)
-
-ARMV8_EVENT_ATTR(sw_incr, ARMV8_PMUV3_PERFCTR_SW_INCR);
-ARMV8_EVENT_ATTR(l1i_cache_refill, ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL);
-ARMV8_EVENT_ATTR(l1i_tlb_refill, ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL);
-ARMV8_EVENT_ATTR(l1d_cache_refill, ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL);
-ARMV8_EVENT_ATTR(l1d_cache, ARMV8_PMUV3_PERFCTR_L1D_CACHE);
-ARMV8_EVENT_ATTR(l1d_tlb_refill, ARMV8_PMUV3_PERFCTR_L1D_TLB_REFILL);
-ARMV8_EVENT_ATTR(ld_retired, ARMV8_PMUV3_PERFCTR_LD_RETIRED);
-ARMV8_EVENT_ATTR(st_retired, ARMV8_PMUV3_PERFCTR_ST_RETIRED);
-ARMV8_EVENT_ATTR(inst_retired, ARMV8_PMUV3_PERFCTR_INST_RETIRED);
-ARMV8_EVENT_ATTR(exc_taken, ARMV8_PMUV3_PERFCTR_EXC_TAKEN);
-ARMV8_EVENT_ATTR(exc_return, ARMV8_PMUV3_PERFCTR_EXC_RETURN);
-ARMV8_EVENT_ATTR(cid_write_retired, ARMV8_PMUV3_PERFCTR_CID_WRITE_RETIRED);
-ARMV8_EVENT_ATTR(pc_write_retired, ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED);
-ARMV8_EVENT_ATTR(br_immed_retired, ARMV8_PMUV3_PERFCTR_BR_IMMED_RETIRED);
-ARMV8_EVENT_ATTR(br_return_retired, ARMV8_PMUV3_PERFCTR_BR_RETURN_RETIRED);
-ARMV8_EVENT_ATTR(unaligned_ldst_retired, ARMV8_PMUV3_PERFCTR_UNALIGNED_LDST_RETIRED);
-ARMV8_EVENT_ATTR(br_mis_pred, ARMV8_PMUV3_PERFCTR_BR_MIS_PRED);
-ARMV8_EVENT_ATTR(cpu_cycles, ARMV8_PMUV3_PERFCTR_CPU_CYCLES);
-ARMV8_EVENT_ATTR(br_pred, ARMV8_PMUV3_PERFCTR_BR_PRED);
-ARMV8_EVENT_ATTR(mem_access, ARMV8_PMUV3_PERFCTR_MEM_ACCESS);
-ARMV8_EVENT_ATTR(l1i_cache, ARMV8_PMUV3_PERFCTR_L1I_CACHE);
-ARMV8_EVENT_ATTR(l1d_cache_wb, ARMV8_PMUV3_PERFCTR_L1D_CACHE_WB);
-ARMV8_EVENT_ATTR(l2d_cache, ARMV8_PMUV3_PERFCTR_L2D_CACHE);
-ARMV8_EVENT_ATTR(l2d_cache_refill, ARMV8_PMUV3_PERFCTR_L2D_CACHE_REFILL);
-ARMV8_EVENT_ATTR(l2d_cache_wb, ARMV8_PMUV3_PERFCTR_L2D_CACHE_WB);
-ARMV8_EVENT_ATTR(bus_access, ARMV8_PMUV3_PERFCTR_BUS_ACCESS);
-ARMV8_EVENT_ATTR(memory_error, ARMV8_PMUV3_PERFCTR_MEMORY_ERROR);
-ARMV8_EVENT_ATTR(inst_spec, ARMV8_PMUV3_PERFCTR_INST_SPEC);
-ARMV8_EVENT_ATTR(ttbr_write_retired, ARMV8_PMUV3_PERFCTR_TTBR_WRITE_RETIRED);
-ARMV8_EVENT_ATTR(bus_cycles, ARMV8_PMUV3_PERFCTR_BUS_CYCLES);
-/* Don't expose the chain event in /sys, since it's useless in isolation */
-ARMV8_EVENT_ATTR(l1d_cache_allocate, ARMV8_PMUV3_PERFCTR_L1D_CACHE_ALLOCATE);
-ARMV8_EVENT_ATTR(l2d_cache_allocate, ARMV8_PMUV3_PERFCTR_L2D_CACHE_ALLOCATE);
-ARMV8_EVENT_ATTR(br_retired, ARMV8_PMUV3_PERFCTR_BR_RETIRED);
-ARMV8_EVENT_ATTR(br_mis_pred_retired, ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED);
-ARMV8_EVENT_ATTR(stall_frontend, ARMV8_PMUV3_PERFCTR_STALL_FRONTEND);
-ARMV8_EVENT_ATTR(stall_backend, ARMV8_PMUV3_PERFCTR_STALL_BACKEND);
-ARMV8_EVENT_ATTR(l1d_tlb, ARMV8_PMUV3_PERFCTR_L1D_TLB);
-ARMV8_EVENT_ATTR(l1i_tlb, ARMV8_PMUV3_PERFCTR_L1I_TLB);
-ARMV8_EVENT_ATTR(l2i_cache, ARMV8_PMUV3_PERFCTR_L2I_CACHE);
-ARMV8_EVENT_ATTR(l2i_cache_refill, ARMV8_PMUV3_PERFCTR_L2I_CACHE_REFILL);
-ARMV8_EVENT_ATTR(l3d_cache_allocate, ARMV8_PMUV3_PERFCTR_L3D_CACHE_ALLOCATE);
-ARMV8_EVENT_ATTR(l3d_cache_refill, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL);
-ARMV8_EVENT_ATTR(l3d_cache, ARMV8_PMUV3_PERFCTR_L3D_CACHE);
-ARMV8_EVENT_ATTR(l3d_cache_wb, ARMV8_PMUV3_PERFCTR_L3D_CACHE_WB);
-ARMV8_EVENT_ATTR(l2d_tlb_refill, ARMV8_PMUV3_PERFCTR_L2D_TLB_REFILL);
-ARMV8_EVENT_ATTR(l2i_tlb_refill, ARMV8_PMUV3_PERFCTR_L2I_TLB_REFILL);
-ARMV8_EVENT_ATTR(l2d_tlb, ARMV8_PMUV3_PERFCTR_L2D_TLB);
-ARMV8_EVENT_ATTR(l2i_tlb, ARMV8_PMUV3_PERFCTR_L2I_TLB);
-ARMV8_EVENT_ATTR(remote_access, ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS);
-ARMV8_EVENT_ATTR(ll_cache, ARMV8_PMUV3_PERFCTR_LL_CACHE);
-ARMV8_EVENT_ATTR(ll_cache_miss, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS);
-ARMV8_EVENT_ATTR(dtlb_walk, ARMV8_PMUV3_PERFCTR_DTLB_WALK);
-ARMV8_EVENT_ATTR(itlb_walk, ARMV8_PMUV3_PERFCTR_ITLB_WALK);
-ARMV8_EVENT_ATTR(ll_cache_rd, ARMV8_PMUV3_PERFCTR_LL_CACHE_RD);
-ARMV8_EVENT_ATTR(ll_cache_miss_rd, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD);
-ARMV8_EVENT_ATTR(remote_access_rd, ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS_RD);
-ARMV8_EVENT_ATTR(sample_pop, ARMV8_SPE_PERFCTR_SAMPLE_POP);
-ARMV8_EVENT_ATTR(sample_feed, ARMV8_SPE_PERFCTR_SAMPLE_FEED);
-ARMV8_EVENT_ATTR(sample_filtrate, ARMV8_SPE_PERFCTR_SAMPLE_FILTRATE);
-ARMV8_EVENT_ATTR(sample_collision, ARMV8_SPE_PERFCTR_SAMPLE_COLLISION);
+#define ARMV8_EVENT_ATTR(name, config)						\
+	(&((struct perf_pmu_events_attr) {					\
+		.attr = __ATTR(name, 0444, armv8pmu_events_sysfs_show, NULL),	\
+		.id = config,							\
+	}).attr.attr)
 
 static struct attribute *armv8_pmuv3_event_attrs[] = {
-	&armv8_event_attr_sw_incr.attr.attr,
-	&armv8_event_attr_l1i_cache_refill.attr.attr,
-	&armv8_event_attr_l1i_tlb_refill.attr.attr,
-	&armv8_event_attr_l1d_cache_refill.attr.attr,
-	&armv8_event_attr_l1d_cache.attr.attr,
-	&armv8_event_attr_l1d_tlb_refill.attr.attr,
-	&armv8_event_attr_ld_retired.attr.attr,
-	&armv8_event_attr_st_retired.attr.attr,
-	&armv8_event_attr_inst_retired.attr.attr,
-	&armv8_event_attr_exc_taken.attr.attr,
-	&armv8_event_attr_exc_return.attr.attr,
-	&armv8_event_attr_cid_write_retired.attr.attr,
-	&armv8_event_attr_pc_write_retired.attr.attr,
-	&armv8_event_attr_br_immed_retired.attr.attr,
-	&armv8_event_attr_br_return_retired.attr.attr,
-	&armv8_event_attr_unaligned_ldst_retired.attr.attr,
-	&armv8_event_attr_br_mis_pred.attr.attr,
-	&armv8_event_attr_cpu_cycles.attr.attr,
-	&armv8_event_attr_br_pred.attr.attr,
-	&armv8_event_attr_mem_access.attr.attr,
-	&armv8_event_attr_l1i_cache.attr.attr,
-	&armv8_event_attr_l1d_cache_wb.attr.attr,
-	&armv8_event_attr_l2d_cache.attr.attr,
-	&armv8_event_attr_l2d_cache_refill.attr.attr,
-	&armv8_event_attr_l2d_cache_wb.attr.attr,
-	&armv8_event_attr_bus_access.attr.attr,
-	&armv8_event_attr_memory_error.attr.attr,
-	&armv8_event_attr_inst_spec.attr.attr,
-	&armv8_event_attr_ttbr_write_retired.attr.attr,
-	&armv8_event_attr_bus_cycles.attr.attr,
-	&armv8_event_attr_l1d_cache_allocate.attr.attr,
-	&armv8_event_attr_l2d_cache_allocate.attr.attr,
-	&armv8_event_attr_br_retired.attr.attr,
-	&armv8_event_attr_br_mis_pred_retired.attr.attr,
-	&armv8_event_attr_stall_frontend.attr.attr,
-	&armv8_event_attr_stall_backend.attr.attr,
-	&armv8_event_attr_l1d_tlb.attr.attr,
-	&armv8_event_attr_l1i_tlb.attr.attr,
-	&armv8_event_attr_l2i_cache.attr.attr,
-	&armv8_event_attr_l2i_cache_refill.attr.attr,
-	&armv8_event_attr_l3d_cache_allocate.attr.attr,
-	&armv8_event_attr_l3d_cache_refill.attr.attr,
-	&armv8_event_attr_l3d_cache.attr.attr,
-	&armv8_event_attr_l3d_cache_wb.attr.attr,
-	&armv8_event_attr_l2d_tlb_refill.attr.attr,
-	&armv8_event_attr_l2i_tlb_refill.attr.attr,
-	&armv8_event_attr_l2d_tlb.attr.attr,
-	&armv8_event_attr_l2i_tlb.attr.attr,
-	&armv8_event_attr_remote_access.attr.attr,
-	&armv8_event_attr_ll_cache.attr.attr,
-	&armv8_event_attr_ll_cache_miss.attr.attr,
-	&armv8_event_attr_dtlb_walk.attr.attr,
-	&armv8_event_attr_itlb_walk.attr.attr,
-	&armv8_event_attr_ll_cache_rd.attr.attr,
-	&armv8_event_attr_ll_cache_miss_rd.attr.attr,
-	&armv8_event_attr_remote_access_rd.attr.attr,
-	&armv8_event_attr_sample_pop.attr.attr,
-	&armv8_event_attr_sample_feed.attr.attr,
-	&armv8_event_attr_sample_filtrate.attr.attr,
-	&armv8_event_attr_sample_collision.attr.attr,
+	ARMV8_EVENT_ATTR(sw_incr, ARMV8_PMUV3_PERFCTR_SW_INCR),
+	ARMV8_EVENT_ATTR(l1i_cache_refill, ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL),
+	ARMV8_EVENT_ATTR(l1i_tlb_refill, ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL),
+	ARMV8_EVENT_ATTR(l1d_cache_refill, ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL),
+	ARMV8_EVENT_ATTR(l1d_cache, ARMV8_PMUV3_PERFCTR_L1D_CACHE),
+	ARMV8_EVENT_ATTR(l1d_tlb_refill, ARMV8_PMUV3_PERFCTR_L1D_TLB_REFILL),
+	ARMV8_EVENT_ATTR(ld_retired, ARMV8_PMUV3_PERFCTR_LD_RETIRED),
+	ARMV8_EVENT_ATTR(st_retired, ARMV8_PMUV3_PERFCTR_ST_RETIRED),
+	ARMV8_EVENT_ATTR(inst_retired, ARMV8_PMUV3_PERFCTR_INST_RETIRED),
+	ARMV8_EVENT_ATTR(exc_taken, ARMV8_PMUV3_PERFCTR_EXC_TAKEN),
+	ARMV8_EVENT_ATTR(exc_return, ARMV8_PMUV3_PERFCTR_EXC_RETURN),
+	ARMV8_EVENT_ATTR(cid_write_retired, ARMV8_PMUV3_PERFCTR_CID_WRITE_RETIRED),
+	ARMV8_EVENT_ATTR(pc_write_retired, ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED),
+	ARMV8_EVENT_ATTR(br_immed_retired, ARMV8_PMUV3_PERFCTR_BR_IMMED_RETIRED),
+	ARMV8_EVENT_ATTR(br_return_retired, ARMV8_PMUV3_PERFCTR_BR_RETURN_RETIRED),
+	ARMV8_EVENT_ATTR(unaligned_ldst_retired, ARMV8_PMUV3_PERFCTR_UNALIGNED_LDST_RETIRED),
+	ARMV8_EVENT_ATTR(br_mis_pred, ARMV8_PMUV3_PERFCTR_BR_MIS_PRED),
+	ARMV8_EVENT_ATTR(cpu_cycles, ARMV8_PMUV3_PERFCTR_CPU_CYCLES),
+	ARMV8_EVENT_ATTR(br_pred, ARMV8_PMUV3_PERFCTR_BR_PRED),
+	ARMV8_EVENT_ATTR(mem_access, ARMV8_PMUV3_PERFCTR_MEM_ACCESS),
+	ARMV8_EVENT_ATTR(l1i_cache, ARMV8_PMUV3_PERFCTR_L1I_CACHE),
+	ARMV8_EVENT_ATTR(l1d_cache_wb, ARMV8_PMUV3_PERFCTR_L1D_CACHE_WB),
+	ARMV8_EVENT_ATTR(l2d_cache, ARMV8_PMUV3_PERFCTR_L2D_CACHE),
+	ARMV8_EVENT_ATTR(l2d_cache_refill, ARMV8_PMUV3_PERFCTR_L2D_CACHE_REFILL),
+	ARMV8_EVENT_ATTR(l2d_cache_wb, ARMV8_PMUV3_PERFCTR_L2D_CACHE_WB),
+	ARMV8_EVENT_ATTR(bus_access, ARMV8_PMUV3_PERFCTR_BUS_ACCESS),
+	ARMV8_EVENT_ATTR(memory_error, ARMV8_PMUV3_PERFCTR_MEMORY_ERROR),
+	ARMV8_EVENT_ATTR(inst_spec, ARMV8_PMUV3_PERFCTR_INST_SPEC),
+	ARMV8_EVENT_ATTR(ttbr_write_retired, ARMV8_PMUV3_PERFCTR_TTBR_WRITE_RETIRED),
+	ARMV8_EVENT_ATTR(bus_cycles, ARMV8_PMUV3_PERFCTR_BUS_CYCLES),
+	/* Don't expose the chain event in /sys, since it's useless in isolation */
+	ARMV8_EVENT_ATTR(l1d_cache_allocate, ARMV8_PMUV3_PERFCTR_L1D_CACHE_ALLOCATE),
+	ARMV8_EVENT_ATTR(l2d_cache_allocate, ARMV8_PMUV3_PERFCTR_L2D_CACHE_ALLOCATE),
+	ARMV8_EVENT_ATTR(br_retired, ARMV8_PMUV3_PERFCTR_BR_RETIRED),
+	ARMV8_EVENT_ATTR(br_mis_pred_retired, ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED),
+	ARMV8_EVENT_ATTR(stall_frontend, ARMV8_PMUV3_PERFCTR_STALL_FRONTEND),
+	ARMV8_EVENT_ATTR(stall_backend, ARMV8_PMUV3_PERFCTR_STALL_BACKEND),
+	ARMV8_EVENT_ATTR(l1d_tlb, ARMV8_PMUV3_PERFCTR_L1D_TLB),
+	ARMV8_EVENT_ATTR(l1i_tlb, ARMV8_PMUV3_PERFCTR_L1I_TLB),
+	ARMV8_EVENT_ATTR(l2i_cache, ARMV8_PMUV3_PERFCTR_L2I_CACHE),
+	ARMV8_EVENT_ATTR(l2i_cache_refill, ARMV8_PMUV3_PERFCTR_L2I_CACHE_REFILL),
+	ARMV8_EVENT_ATTR(l3d_cache_allocate, ARMV8_PMUV3_PERFCTR_L3D_CACHE_ALLOCATE),
+	ARMV8_EVENT_ATTR(l3d_cache_refill, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL),
+	ARMV8_EVENT_ATTR(l3d_cache, ARMV8_PMUV3_PERFCTR_L3D_CACHE),
+	ARMV8_EVENT_ATTR(l3d_cache_wb, ARMV8_PMUV3_PERFCTR_L3D_CACHE_WB),
+	ARMV8_EVENT_ATTR(l2d_tlb_refill, ARMV8_PMUV3_PERFCTR_L2D_TLB_REFILL),
+	ARMV8_EVENT_ATTR(l2i_tlb_refill, ARMV8_PMUV3_PERFCTR_L2I_TLB_REFILL),
+	ARMV8_EVENT_ATTR(l2d_tlb, ARMV8_PMUV3_PERFCTR_L2D_TLB),
+	ARMV8_EVENT_ATTR(l2i_tlb, ARMV8_PMUV3_PERFCTR_L2I_TLB),
+	ARMV8_EVENT_ATTR(remote_access, ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS),
+	ARMV8_EVENT_ATTR(ll_cache, ARMV8_PMUV3_PERFCTR_LL_CACHE),
+	ARMV8_EVENT_ATTR(ll_cache_miss, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS),
+	ARMV8_EVENT_ATTR(dtlb_walk, ARMV8_PMUV3_PERFCTR_DTLB_WALK),
+	ARMV8_EVENT_ATTR(itlb_walk, ARMV8_PMUV3_PERFCTR_ITLB_WALK),
+	ARMV8_EVENT_ATTR(ll_cache_rd, ARMV8_PMUV3_PERFCTR_LL_CACHE_RD),
+	ARMV8_EVENT_ATTR(ll_cache_miss_rd, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD),
+	ARMV8_EVENT_ATTR(remote_access_rd, ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS_RD),
+	ARMV8_EVENT_ATTR(sample_pop, ARMV8_SPE_PERFCTR_SAMPLE_POP),
+	ARMV8_EVENT_ATTR(sample_feed, ARMV8_SPE_PERFCTR_SAMPLE_FEED),
+	ARMV8_EVENT_ATTR(sample_filtrate, ARMV8_SPE_PERFCTR_SAMPLE_FILTRATE),
+	ARMV8_EVENT_ATTR(sample_collision, ARMV8_SPE_PERFCTR_SAMPLE_COLLISION),
 	NULL,
 };
 
+1 -3
drivers/perf/arm-cci.c
···
 
 static int cci_pmu_probe(struct platform_device *pdev)
 {
-	struct resource *res;
 	struct cci_pmu *cci_pmu;
 	int i, ret, irq;
···
 	if (IS_ERR(cci_pmu))
 		return PTR_ERR(cci_pmu);
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	cci_pmu->base = devm_ioremap_resource(&pdev->dev, res);
+	cci_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(cci_pmu->base))
 		return -ENOMEM;
 
+2 -2
drivers/perf/arm-ccn.c
···
 	ccn->dev = &pdev->dev;
 	platform_set_drvdata(pdev, ccn);
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	ccn->base = devm_ioremap_resource(ccn->dev, res);
+	ccn->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(ccn->base))
 		return PTR_ERR(ccn->base);
···
 static const struct of_device_id arm_ccn_match[] = {
 	{ .compatible = "arm,ccn-502", },
 	{ .compatible = "arm,ccn-504", },
+	{ .compatible = "arm,ccn-512", },
 	{},
 };
 MODULE_DEVICE_TABLE(of, arm_ccn_match);
+2 -3
drivers/perf/arm_smmuv3_pmu.c
···
 static int smmu_pmu_probe(struct platform_device *pdev)
 {
 	struct smmu_pmu *smmu_pmu;
-	struct resource *res_0, *res_1;
+	struct resource *res_0;
 	u32 cfgr, reg_size;
 	u64 ceid_64[2];
 	int irq, err;
···
 
 	/* Determine if page 1 is present */
 	if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
-		res_1 = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-		smmu_pmu->reloc_base = devm_ioremap_resource(dev, res_1);
+		smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
 		if (IS_ERR(smmu_pmu->reloc_base))
 			return PTR_ERR(smmu_pmu->reloc_base);
 	} else {
+103 -21
drivers/perf/fsl_imx8_ddr_perf.c
···
 static DEFINE_IDA(ddr_ida);
 
 /* DDR Perf hardware feature */
-#define DDR_CAP_AXI_ID_FILTER		0x1	/* support AXI ID filter */
+#define DDR_CAP_AXI_ID_FILTER			0x1	/* support AXI ID filter */
+#define DDR_CAP_AXI_ID_FILTER_ENHANCED		0x3	/* support enhanced AXI ID filter */
 
 struct fsl_ddr_devtype_data {
 	unsigned int quirks; /* quirks needed for different DDR Perf core */
···
 	.quirks = DDR_CAP_AXI_ID_FILTER,
 };
 
+static const struct fsl_ddr_devtype_data imx8mp_devtype_data = {
+	.quirks = DDR_CAP_AXI_ID_FILTER_ENHANCED,
+};
+
 static const struct of_device_id imx_ddr_pmu_dt_ids[] = {
 	{ .compatible = "fsl,imx8-ddr-pmu", .data = &imx8_devtype_data},
 	{ .compatible = "fsl,imx8m-ddr-pmu", .data = &imx8m_devtype_data},
+	{ .compatible = "fsl,imx8mp-ddr-pmu", .data = &imx8mp_devtype_data},
 	{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, imx_ddr_pmu_dt_ids);
···
 	const struct fsl_ddr_devtype_data *devtype_data;
 	int irq;
 	int id;
+};
+
+enum ddr_perf_filter_capabilities {
+	PERF_CAP_AXI_ID_FILTER = 0,
+	PERF_CAP_AXI_ID_FILTER_ENHANCED,
+	PERF_CAP_AXI_ID_FEAT_MAX,
+};
+
+static u32 ddr_perf_filter_cap_get(struct ddr_pmu *pmu, int cap)
+{
+	u32 quirks = pmu->devtype_data->quirks;
+
+	switch (cap) {
+	case PERF_CAP_AXI_ID_FILTER:
+		return !!(quirks & DDR_CAP_AXI_ID_FILTER);
+	case PERF_CAP_AXI_ID_FILTER_ENHANCED:
+		quirks &= DDR_CAP_AXI_ID_FILTER_ENHANCED;
+		return quirks == DDR_CAP_AXI_ID_FILTER_ENHANCED;
+	default:
+		WARN(1, "unknown filter cap %d\n", cap);
+	}
+
+	return 0;
+}
+
+static ssize_t ddr_perf_filter_cap_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct ddr_pmu *pmu = dev_get_drvdata(dev);
+	struct dev_ext_attribute *ea =
+		container_of(attr, struct dev_ext_attribute, attr);
+	int cap = (long)ea->var;
+
+	return snprintf(buf, PAGE_SIZE, "%u\n",
+			ddr_perf_filter_cap_get(pmu, cap));
+}
+
+#define PERF_EXT_ATTR_ENTRY(_name, _func, _var)				\
+	(&((struct dev_ext_attribute) {					\
+		__ATTR(_name, 0444, _func, NULL), (void *)_var		\
+	}).attr.attr)
+
+#define PERF_FILTER_EXT_ATTR_ENTRY(_name, _var)				\
+	PERF_EXT_ATTR_ENTRY(_name, ddr_perf_filter_cap_show, _var)
+
+static struct attribute *ddr_perf_filter_cap_attr[] = {
+	PERF_FILTER_EXT_ATTR_ENTRY(filter, PERF_CAP_AXI_ID_FILTER),
+	PERF_FILTER_EXT_ATTR_ENTRY(enhanced_filter, PERF_CAP_AXI_ID_FILTER_ENHANCED),
+	NULL,
+};
+
+static struct attribute_group ddr_perf_filter_cap_attr_group = {
+	.name = "caps",
+	.attrs = ddr_perf_filter_cap_attr,
 };
 
 static ssize_t ddr_perf_cpumask_show(struct device *dev,
···
 	&ddr_perf_events_attr_group,
 	&ddr_perf_format_attr_group,
 	&ddr_perf_cpumask_attr_group,
+	&ddr_perf_filter_cap_attr_group,
 	NULL,
 };
+
+static bool ddr_perf_is_filtered(struct perf_event *event)
+{
+	return event->attr.config == 0x41 || event->attr.config == 0x42;
+}
+
+static u32 ddr_perf_filter_val(struct perf_event *event)
+{
+	return event->attr.config1;
+}
+
+static bool ddr_perf_filters_compatible(struct perf_event *a,
+					struct perf_event *b)
+{
+	if (!ddr_perf_is_filtered(a))
+		return true;
+	if (!ddr_perf_is_filtered(b))
+		return true;
+	return ddr_perf_filter_val(a) == ddr_perf_filter_val(b);
+}
+
+static bool ddr_perf_is_enhanced_filtered(struct perf_event *event)
+{
+	unsigned int filt;
+	struct ddr_pmu *pmu = to_ddr_pmu(event->pmu);
+
+	filt = pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED;
+	return (filt == DDR_CAP_AXI_ID_FILTER_ENHANCED) &&
+	       ddr_perf_is_filtered(event);
+}
 
 static u32 ddr_perf_alloc_counter(struct ddr_pmu *pmu, int event)
 {
···
 
 static u32 ddr_perf_read_counter(struct ddr_pmu *pmu, int counter)
 {
-	return readl_relaxed(pmu->base + COUNTER_READ + counter * 4);
-}
+	struct perf_event *event = pmu->events[counter];
+	void __iomem *base = pmu->base;
 
-static bool ddr_perf_is_filtered(struct perf_event *event)
-{
-	return event->attr.config == 0x41 || event->attr.config == 0x42;
-}
-
-static u32 ddr_perf_filter_val(struct perf_event *event)
-{
-	return event->attr.config1;
-}
-
-static bool ddr_perf_filters_compatible(struct perf_event *a,
-					struct perf_event *b)
-{
-	if (!ddr_perf_is_filtered(a))
-		return true;
-	if (!ddr_perf_is_filtered(b))
-		return true;
-	return ddr_perf_filter_val(a) == ddr_perf_filter_val(b);
+	/*
+	 * return bytes instead of bursts from ddr transaction for
+	 * axid-read and axid-write event if PMU core supports enhanced
+	 * filter.
+	 */
+	base += ddr_perf_is_enhanced_filtered(event) ? COUNTER_DPCR1 :
+		COUNTER_READ;
+	return readl_relaxed(base + counter * 4);
 }
 
 static int ddr_perf_event_init(struct perf_event *event)
+1 -4
drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
···
 static int hisi_ddrc_pmu_init_data(struct platform_device *pdev,
 				   struct hisi_pmu *ddrc_pmu)
 {
-	struct resource *res;
-
 	/*
 	 * Use the SCCL_ID and DDRC channel ID to identify the
 	 * DDRC PMU, while SCCL_ID is in MPIDR[aff2].
···
 	/* DDRC PMUs only share the same SCCL */
 	ddrc_pmu->ccl_id = -1;
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	ddrc_pmu->base = devm_ioremap_resource(&pdev->dev, res);
+	ddrc_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(ddrc_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for ddrc_pmu resource\n");
 		return PTR_ERR(ddrc_pmu->base);
+1 -3
drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
···
 				  struct hisi_pmu *hha_pmu)
 {
 	unsigned long long id;
-	struct resource *res;
 	acpi_status status;
 
 	status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
···
 	/* HHA PMUs only share the same SCCL */
 	hha_pmu->ccl_id = -1;
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	hha_pmu->base = devm_ioremap_resource(&pdev->dev, res);
+	hha_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(hha_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for hha_pmu resource\n");
 		return PTR_ERR(hha_pmu->base);
+1 -3
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
···
 				  struct hisi_pmu *l3c_pmu)
 {
 	unsigned long long id;
-	struct resource *res;
 	acpi_status status;
 
 	status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
···
 		return -EINVAL;
 	}
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	l3c_pmu->base = devm_ioremap_resource(&pdev->dev, res);
+	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(l3c_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
 		return PTR_ERR(l3c_pmu->base);
+17 -7
drivers/perf/hisilicon/hisi_uncore_pmu.c
···
 #include <linux/errno.h>
 #include <linux/interrupt.h>
 
+#include <asm/cputype.h>
 #include <asm/local64.h>
 
 #include "hisi_uncore_pmu.h"
···
 
 /*
  * Read Super CPU cluster and CPU cluster ID from MPIDR_EL1.
- * If multi-threading is supported, CCL_ID is the low 3-bits in MPIDR[Aff2]
- * and SCCL_ID is the upper 5-bits of Aff2 field; if not, SCCL_ID
+ * If multi-threading is supported, On Huawei Kunpeng 920 SoC whose cpu
+ * core is tsv110, CCL_ID is the low 3-bits in MPIDR[Aff2] and SCCL_ID
+ * is the upper 5-bits of Aff2 field; while for other cpu types, SCCL_ID
+ * is in MPIDR[Aff3] and CCL_ID is in MPIDR[Aff2], if not, SCCL_ID
  * is in MPIDR[Aff2] and CCL_ID is in MPIDR[Aff1].
  */
 static void hisi_read_sccl_and_ccl_id(int *sccl_id, int *ccl_id)
···
 	u64 mpidr = read_cpuid_mpidr();
 
 	if (mpidr & MPIDR_MT_BITMASK) {
-		int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+		if (read_cpuid_part_number() == HISI_CPU_PART_TSV110) {
+			int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2);
 
-		if (sccl_id)
-			*sccl_id = aff2 >> 3;
-		if (ccl_id)
-			*ccl_id = aff2 & 0x7;
+			if (sccl_id)
+				*sccl_id = aff2 >> 3;
+			if (ccl_id)
+				*ccl_id = aff2 & 0x7;
+		} else {
+			if (sccl_id)
+				*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 3);
+			if (ccl_id)
+				*ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+		}
 	} else {
 		if (sccl_id)
 			*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+234 -33
drivers/perf/thunderx2_pmu.c
···
  * they need to be sampled before overflow(i.e, at every 2 seconds).
  */
 
-#define TX2_PMU_MAX_COUNTERS		4
+#define TX2_PMU_DMC_L3C_MAX_COUNTERS	4
+#define TX2_PMU_CCPI2_MAX_COUNTERS	8
+#define TX2_PMU_MAX_COUNTERS		TX2_PMU_CCPI2_MAX_COUNTERS
+
+
 #define TX2_PMU_DMC_CHANNELS		8
 #define TX2_PMU_L3_TILES		16
 
 #define TX2_PMU_HRTIMER_INTERVAL	(2 * NSEC_PER_SEC)
-#define GET_EVENTID(ev)			((ev->hw.config) & 0x1f)
-#define GET_COUNTERID(ev)		((ev->hw.idx) & 0x3)
+#define GET_EVENTID(ev, mask)		((ev->hw.config) & mask)
+#define GET_COUNTERID(ev, mask)		((ev->hw.idx) & mask)
 /* 1 byte per counter(4 counters).
  * Event id is encoded in bits [5:1] of a byte,
  */
 #define DMC_EVENT_CFG(idx, val)		((val) << (((idx) * 8) + 1))
 
+/* bits[3:0] to select counters, are indexed from 8 to 15. */
+#define CCPI2_COUNTER_OFFSET		8
+
 #define L3C_COUNTER_CTL			0xA8
 #define L3C_COUNTER_DATA		0xAC
 #define DMC_COUNTER_CTL			0x234
 #define DMC_COUNTER_DATA		0x240
+
+#define CCPI2_PERF_CTL			0x108
+#define CCPI2_COUNTER_CTL		0x10C
+#define CCPI2_COUNTER_SEL		0x12c
+#define CCPI2_COUNTER_DATA_L		0x130
+#define CCPI2_COUNTER_DATA_H		0x134
 
 /* L3C event IDs */
 #define L3_EVENT_READ_REQ		0xD
···
 #define DMC_EVENT_READ_TXNS		0xF
 #define DMC_EVENT_MAX			0x10
 
+#define CCPI2_EVENT_REQ_PKT_SENT	0x3D
+#define CCPI2_EVENT_SNOOP_PKT_SENT	0x65
+#define CCPI2_EVENT_DATA_PKT_SENT	0x105
+#define CCPI2_EVENT_GIC_PKT_SENT	0x12D
+#define CCPI2_EVENT_MAX			0x200
+
+#define CCPI2_PERF_CTL_ENABLE		BIT(0)
+#define CCPI2_PERF_CTL_START		BIT(1)
+#define CCPI2_PERF_CTL_RESET		BIT(4)
+#define CCPI2_EVENT_LEVEL_RISING_EDGE	BIT(10)
+#define CCPI2_EVENT_TYPE_EDGE_SENSITIVE	BIT(11)
+
 enum tx2_uncore_type {
 	PMU_TYPE_L3C,
 	PMU_TYPE_DMC,
+	PMU_TYPE_CCPI2,
 	PMU_TYPE_INVALID,
 };
 
 /*
- * pmu on each socket has 2 uncore devices(dmc and l3c),
- * each device has 4 counters.
+ * Each socket has 3 uncore devices associated with a PMU. The DMC and
+ * L3C have 4 32-bit counters and the CCPI2 has 8 64-bit counters.
  */
 struct tx2_uncore_pmu {
 	struct hlist_node hpnode;
···
 	int node;
 	int cpu;
 	u32 max_counters;
+	u32 counters_mask;
 	u32 prorate_factor;
 	u32 max_events;
+	u32 events_mask;
 	u64 hrtimer_interval;
 	void __iomem *base;
 	DECLARE_BITMAP(active_counters, TX2_PMU_MAX_COUNTERS);
···
 	struct hrtimer hrtimer;
 	const struct attribute_group **attr_groups;
 	enum tx2_uncore_type type;
+	enum hrtimer_restart (*hrtimer_callback)(struct hrtimer *cb);
 	void (*init_cntr_base)(struct perf_event *event,
 			struct tx2_uncore_pmu *tx2_pmu);
 	void (*stop_event)(struct perf_event *event);
···
 	return container_of(pmu, struct tx2_uncore_pmu, pmu);
 }
 
-PMU_FORMAT_ATTR(event,	"config:0-4");
+#define TX2_PMU_FORMAT_ATTR(_var, _name, _format)			\
+static ssize_t								\
+__tx2_pmu_##_var##_show(struct device *dev,				\
+			       struct device_attribute *attr,		\
+			       char *page)				\
+{									\
+	BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE);			\
+	return sprintf(page, _format "\n");				\
+}									\
+									\
+static struct device_attribute format_attr_##_var =			\
+	__ATTR(_name, 0444, __tx2_pmu_##_var##_show, NULL)
+
+TX2_PMU_FORMAT_ATTR(event, event, "config:0-4");
+TX2_PMU_FORMAT_ATTR(event_ccpi2, event, "config:0-9");
 
 static struct attribute *l3c_pmu_format_attrs[] = {
 	&format_attr_event.attr,
···
 	NULL,
 };
 
+static struct attribute *ccpi2_pmu_format_attrs[] = {
+	&format_attr_event_ccpi2.attr,
+	NULL,
+};
+
 static const struct attribute_group l3c_pmu_format_attr_group = {
 	.name = "format",
 	.attrs = l3c_pmu_format_attrs,
···
 static const struct attribute_group dmc_pmu_format_attr_group = {
 	.name = "format",
 	.attrs = dmc_pmu_format_attrs,
+};
+
+static const struct attribute_group ccpi2_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = ccpi2_pmu_format_attrs,
 };
 
 /*
···
 	NULL,
 };
 
+TX2_EVENT_ATTR(req_pktsent, CCPI2_EVENT_REQ_PKT_SENT);
+TX2_EVENT_ATTR(snoop_pktsent, CCPI2_EVENT_SNOOP_PKT_SENT);
+TX2_EVENT_ATTR(data_pktsent, CCPI2_EVENT_DATA_PKT_SENT);
+TX2_EVENT_ATTR(gic_pktsent, CCPI2_EVENT_GIC_PKT_SENT);
+
+static struct attribute *ccpi2_pmu_events_attrs[] = {
+	&tx2_pmu_event_attr_req_pktsent.attr.attr,
+	&tx2_pmu_event_attr_snoop_pktsent.attr.attr,
+	&tx2_pmu_event_attr_data_pktsent.attr.attr,
+	&tx2_pmu_event_attr_gic_pktsent.attr.attr,
+	NULL,
+};
+
 static const struct attribute_group l3c_pmu_events_attr_group = {
 	.name = "events",
 	.attrs = l3c_pmu_events_attrs,
···
 static const struct attribute_group dmc_pmu_events_attr_group = {
 	.name = "events",
 	.attrs = dmc_pmu_events_attrs,
+};
+
+static const struct attribute_group ccpi2_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = ccpi2_pmu_events_attrs,
 };
 
 /*
···
 	NULL
 };
 
+static const struct attribute_group *ccpi2_pmu_attr_groups[] = {
+	&ccpi2_pmu_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&ccpi2_pmu_events_attr_group,
+	NULL
+};
+
 static inline u32 reg_readl(unsigned long addr)
 {
 	return readl((void __iomem *)addr);
···
 		struct tx2_uncore_pmu *tx2_pmu)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	u32 cmask;
+
+	tx2_pmu = pmu_to_tx2_pmu(event->pmu);
+	cmask = tx2_pmu->counters_mask;
 
 	/* counter ctrl/data reg offset at 8 */
328 254 hwc->config_base = (unsigned long)tx2_pmu->base 329 - + L3C_COUNTER_CTL + (8 * GET_COUNTERID(event)); 255 + + L3C_COUNTER_CTL + (8 * GET_COUNTERID(event, cmask)); 330 256 hwc->event_base = (unsigned long)tx2_pmu->base 331 - + L3C_COUNTER_DATA + (8 * GET_COUNTERID(event)); 257 + + L3C_COUNTER_DATA + (8 * GET_COUNTERID(event, cmask)); 332 258 } 333 259 334 260 static void init_cntr_base_dmc(struct perf_event *event, 335 261 struct tx2_uncore_pmu *tx2_pmu) 336 262 { 337 263 struct hw_perf_event *hwc = &event->hw; 264 + u32 cmask; 265 + 266 + tx2_pmu = pmu_to_tx2_pmu(event->pmu); 267 + cmask = tx2_pmu->counters_mask; 338 268 339 269 hwc->config_base = (unsigned long)tx2_pmu->base 340 270 + DMC_COUNTER_CTL; 341 271 /* counter data reg offset at 0xc */ 342 272 hwc->event_base = (unsigned long)tx2_pmu->base 343 - + DMC_COUNTER_DATA + (0xc * GET_COUNTERID(event)); 273 + + DMC_COUNTER_DATA + (0xc * GET_COUNTERID(event, cmask)); 274 + } 275 + 276 + static void init_cntr_base_ccpi2(struct perf_event *event, 277 + struct tx2_uncore_pmu *tx2_pmu) 278 + { 279 + struct hw_perf_event *hwc = &event->hw; 280 + u32 cmask; 281 + 282 + cmask = tx2_pmu->counters_mask; 283 + 284 + hwc->config_base = (unsigned long)tx2_pmu->base 285 + + CCPI2_COUNTER_CTL + (4 * GET_COUNTERID(event, cmask)); 286 + hwc->event_base = (unsigned long)tx2_pmu->base; 344 287 } 345 288 346 289 static void uncore_start_event_l3c(struct perf_event *event, int flags) 347 290 { 348 - u32 val; 291 + u32 val, emask; 349 292 struct hw_perf_event *hwc = &event->hw; 293 + struct tx2_uncore_pmu *tx2_pmu; 294 + 295 + tx2_pmu = pmu_to_tx2_pmu(event->pmu); 296 + emask = tx2_pmu->events_mask; 350 297 351 298 /* event id encoded in bits [07:03] */ 352 - val = GET_EVENTID(event) << 3; 299 + val = GET_EVENTID(event, emask) << 3; 353 300 reg_writel(val, hwc->config_base); 354 301 local64_set(&hwc->prev_count, 0); 355 302 reg_writel(0, hwc->event_base); ··· 387 284 388 285 static void uncore_start_event_dmc(struct 
perf_event *event, int flags) 389 286 { 390 - u32 val; 287 + u32 val, cmask, emask; 391 288 struct hw_perf_event *hwc = &event->hw; 392 - int idx = GET_COUNTERID(event); 393 - int event_id = GET_EVENTID(event); 289 + struct tx2_uncore_pmu *tx2_pmu; 290 + int idx, event_id; 291 + 292 + tx2_pmu = pmu_to_tx2_pmu(event->pmu); 293 + cmask = tx2_pmu->counters_mask; 294 + emask = tx2_pmu->events_mask; 295 + 296 + idx = GET_COUNTERID(event, cmask); 297 + event_id = GET_EVENTID(event, emask); 394 298 395 299 /* enable and start counters. 396 300 * 8 bits for each counter, bits[05:01] of a counter to set event type. ··· 412 302 413 303 static void uncore_stop_event_dmc(struct perf_event *event) 414 304 { 415 - u32 val; 305 + u32 val, cmask; 416 306 struct hw_perf_event *hwc = &event->hw; 417 - int idx = GET_COUNTERID(event); 307 + struct tx2_uncore_pmu *tx2_pmu; 308 + int idx; 309 + 310 + tx2_pmu = pmu_to_tx2_pmu(event->pmu); 311 + cmask = tx2_pmu->counters_mask; 312 + idx = GET_COUNTERID(event, cmask); 418 313 419 314 /* clear event type(bits[05:01]) to stop counter */ 420 315 val = reg_readl(hwc->config_base); ··· 427 312 reg_writel(val, hwc->config_base); 428 313 } 429 314 315 + static void uncore_start_event_ccpi2(struct perf_event *event, int flags) 316 + { 317 + u32 emask; 318 + struct hw_perf_event *hwc = &event->hw; 319 + struct tx2_uncore_pmu *tx2_pmu; 320 + 321 + tx2_pmu = pmu_to_tx2_pmu(event->pmu); 322 + emask = tx2_pmu->events_mask; 323 + 324 + /* Bit [09:00] to set event id. 325 + * Bits [10], set level to rising edge. 326 + * Bits [11], set type to edge sensitive. 
327 + */ 328 + reg_writel((CCPI2_EVENT_TYPE_EDGE_SENSITIVE | 329 + CCPI2_EVENT_LEVEL_RISING_EDGE | 330 + GET_EVENTID(event, emask)), hwc->config_base); 331 + 332 + /* reset[4], enable[0] and start[1] counters */ 333 + reg_writel(CCPI2_PERF_CTL_RESET | 334 + CCPI2_PERF_CTL_START | 335 + CCPI2_PERF_CTL_ENABLE, 336 + hwc->event_base + CCPI2_PERF_CTL); 337 + local64_set(&event->hw.prev_count, 0ULL); 338 + } 339 + 340 + static void uncore_stop_event_ccpi2(struct perf_event *event) 341 + { 342 + struct hw_perf_event *hwc = &event->hw; 343 + 344 + /* disable and stop counter */ 345 + reg_writel(0, hwc->event_base + CCPI2_PERF_CTL); 346 + } 347 + 430 348 static void tx2_uncore_event_update(struct perf_event *event) 431 349 { 432 - s64 prev, delta, new = 0; 350 + u64 prev, delta, new = 0; 433 351 struct hw_perf_event *hwc = &event->hw; 434 352 struct tx2_uncore_pmu *tx2_pmu; 435 353 enum tx2_uncore_type type; 436 354 u32 prorate_factor; 355 + u32 cmask, emask; 437 356 438 357 tx2_pmu = pmu_to_tx2_pmu(event->pmu); 439 358 type = tx2_pmu->type; 359 + cmask = tx2_pmu->counters_mask; 360 + emask = tx2_pmu->events_mask; 440 361 prorate_factor = tx2_pmu->prorate_factor; 441 - 442 - new = reg_readl(hwc->event_base); 443 - prev = local64_xchg(&hwc->prev_count, new); 444 - 445 - /* handles rollover of 32 bit counter */ 446 - delta = (u32)(((1UL << 32) - prev) + new); 362 + if (type == PMU_TYPE_CCPI2) { 363 + reg_writel(CCPI2_COUNTER_OFFSET + 364 + GET_COUNTERID(event, cmask), 365 + hwc->event_base + CCPI2_COUNTER_SEL); 366 + new = reg_readl(hwc->event_base + CCPI2_COUNTER_DATA_H); 367 + new = (new << 32) + 368 + reg_readl(hwc->event_base + CCPI2_COUNTER_DATA_L); 369 + prev = local64_xchg(&hwc->prev_count, new); 370 + delta = new - prev; 371 + } else { 372 + new = reg_readl(hwc->event_base); 373 + prev = local64_xchg(&hwc->prev_count, new); 374 + /* handles rollover of 32 bit counter */ 375 + delta = (u32)(((1UL << 32) - prev) + new); 376 + } 447 377 448 378 /* DMC event 
data_transfers granularity is 16 Bytes, convert it to 64 */ 449 379 if (type == PMU_TYPE_DMC && 450 - GET_EVENTID(event) == DMC_EVENT_DATA_TRANSFERS) 380 + GET_EVENTID(event, emask) == DMC_EVENT_DATA_TRANSFERS) 451 381 delta = delta/4; 452 382 453 383 /* L3C and DMC has 16 and 8 interleave channels respectively. ··· 511 351 } devices[] = { 512 352 {"CAV901D", PMU_TYPE_L3C}, 513 353 {"CAV901F", PMU_TYPE_DMC}, 354 + {"CAV901E", PMU_TYPE_CCPI2}, 514 355 {"", PMU_TYPE_INVALID} 515 356 }; 516 357 ··· 541 380 * Make sure the group of events can be scheduled at once 542 381 * on the PMU. 543 382 */ 544 - static bool tx2_uncore_validate_event_group(struct perf_event *event) 383 + static bool tx2_uncore_validate_event_group(struct perf_event *event, 384 + int max_counters) 545 385 { 546 386 struct perf_event *sibling, *leader = event->group_leader; 547 387 int counters = 0; ··· 565 403 * If the group requires more counters than the HW has, 566 404 * it cannot ever be scheduled. 567 405 */ 568 - return counters <= TX2_PMU_MAX_COUNTERS; 406 + return counters <= max_counters; 569 407 } 570 408 571 409 ··· 601 439 hwc->config = event->attr.config; 602 440 603 441 /* Validate the group */ 604 - if (!tx2_uncore_validate_event_group(event)) 442 + if (!tx2_uncore_validate_event_group(event, tx2_pmu->max_counters)) 605 443 return -EINVAL; 606 444 607 445 return 0; ··· 617 455 618 456 tx2_pmu->start_event(event, flags); 619 457 perf_event_update_userpage(event); 458 + 459 + /* No hrtimer needed for CCPI2, 64-bit counters */ 460 + if (!tx2_pmu->hrtimer_callback) 461 + return; 620 462 621 463 /* Start timer for first event */ 622 464 if (bitmap_weight(tx2_pmu->active_counters, ··· 676 510 { 677 511 struct tx2_uncore_pmu *tx2_pmu = pmu_to_tx2_pmu(event->pmu); 678 512 struct hw_perf_event *hwc = &event->hw; 513 + u32 cmask; 679 514 515 + cmask = tx2_pmu->counters_mask; 680 516 tx2_uncore_event_stop(event, PERF_EF_UPDATE); 681 517 682 518 /* clear the assigned counter */ 683 - 
free_counter(tx2_pmu, GET_COUNTERID(event)); 519 + free_counter(tx2_pmu, GET_COUNTERID(event, cmask)); 684 520 685 521 perf_event_update_userpage(event); 686 522 tx2_pmu->events[hwc->idx] = NULL; 687 523 hwc->idx = -1; 524 + 525 + if (!tx2_pmu->hrtimer_callback) 526 + return; 527 + 528 + if (bitmap_empty(tx2_pmu->active_counters, tx2_pmu->max_counters)) 529 + hrtimer_cancel(&tx2_pmu->hrtimer); 688 530 } 689 531 690 532 static void tx2_uncore_event_read(struct perf_event *event) ··· 754 580 cpu_online_mask); 755 581 756 582 tx2_pmu->cpu = cpu; 757 - hrtimer_init(&tx2_pmu->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); 758 - tx2_pmu->hrtimer.function = tx2_hrtimer_callback; 583 + 584 + if (tx2_pmu->hrtimer_callback) { 585 + hrtimer_init(&tx2_pmu->hrtimer, 586 + CLOCK_MONOTONIC, HRTIMER_MODE_REL); 587 + tx2_pmu->hrtimer.function = tx2_pmu->hrtimer_callback; 588 + } 759 589 760 590 ret = tx2_uncore_pmu_register(tx2_pmu); 761 591 if (ret) { ··· 831 653 832 654 switch (tx2_pmu->type) { 833 655 case PMU_TYPE_L3C: 834 - tx2_pmu->max_counters = TX2_PMU_MAX_COUNTERS; 656 + tx2_pmu->max_counters = TX2_PMU_DMC_L3C_MAX_COUNTERS; 657 + tx2_pmu->counters_mask = 0x3; 835 658 tx2_pmu->prorate_factor = TX2_PMU_L3_TILES; 836 659 tx2_pmu->max_events = L3_EVENT_MAX; 660 + tx2_pmu->events_mask = 0x1f; 837 661 tx2_pmu->hrtimer_interval = TX2_PMU_HRTIMER_INTERVAL; 662 + tx2_pmu->hrtimer_callback = tx2_hrtimer_callback; 838 663 tx2_pmu->attr_groups = l3c_pmu_attr_groups; 839 664 tx2_pmu->name = devm_kasprintf(dev, GFP_KERNEL, 840 665 "uncore_l3c_%d", tx2_pmu->node); ··· 846 665 tx2_pmu->stop_event = uncore_stop_event_l3c; 847 666 break; 848 667 case PMU_TYPE_DMC: 849 - tx2_pmu->max_counters = TX2_PMU_MAX_COUNTERS; 668 + tx2_pmu->max_counters = TX2_PMU_DMC_L3C_MAX_COUNTERS; 669 + tx2_pmu->counters_mask = 0x3; 850 670 tx2_pmu->prorate_factor = TX2_PMU_DMC_CHANNELS; 851 671 tx2_pmu->max_events = DMC_EVENT_MAX; 672 + tx2_pmu->events_mask = 0x1f; 852 673 tx2_pmu->hrtimer_interval = 
TX2_PMU_HRTIMER_INTERVAL; 674 + tx2_pmu->hrtimer_callback = tx2_hrtimer_callback; 853 675 tx2_pmu->attr_groups = dmc_pmu_attr_groups; 854 676 tx2_pmu->name = devm_kasprintf(dev, GFP_KERNEL, 855 677 "uncore_dmc_%d", tx2_pmu->node); 856 678 tx2_pmu->init_cntr_base = init_cntr_base_dmc; 857 679 tx2_pmu->start_event = uncore_start_event_dmc; 858 680 tx2_pmu->stop_event = uncore_stop_event_dmc; 681 + break; 682 + case PMU_TYPE_CCPI2: 683 + /* CCPI2 has 8 counters */ 684 + tx2_pmu->max_counters = TX2_PMU_CCPI2_MAX_COUNTERS; 685 + tx2_pmu->counters_mask = 0x7; 686 + tx2_pmu->prorate_factor = 1; 687 + tx2_pmu->max_events = CCPI2_EVENT_MAX; 688 + tx2_pmu->events_mask = 0x1ff; 689 + tx2_pmu->attr_groups = ccpi2_pmu_attr_groups; 690 + tx2_pmu->name = devm_kasprintf(dev, GFP_KERNEL, 691 + "uncore_ccpi2_%d", tx2_pmu->node); 692 + tx2_pmu->init_cntr_base = init_cntr_base_ccpi2; 693 + tx2_pmu->start_event = uncore_start_event_ccpi2; 694 + tx2_pmu->stop_event = uncore_stop_event_ccpi2; 695 + tx2_pmu->hrtimer_callback = NULL; 859 696 break; 860 697 case PMU_TYPE_INVALID: 861 698 devm_kfree(dev, tx2_pmu); ··· 943 744 if (cpu != tx2_pmu->cpu) 944 745 return 0; 945 746 946 - hrtimer_cancel(&tx2_pmu->hrtimer); 747 + if (tx2_pmu->hrtimer_callback) 748 + hrtimer_cancel(&tx2_pmu->hrtimer); 749 + 947 750 cpumask_copy(&cpu_online_mask_temp, cpu_online_mask); 948 751 cpumask_clear_cpu(cpu, &cpu_online_mask_temp); 949 752 new_cpu = cpumask_any_and(
+4 -10
drivers/perf/xgene_pmu.c
··· 1282 1282 struct platform_device *pdev) 1283 1283 { 1284 1284 void __iomem *csw_csr, *mcba_csr, *mcbb_csr; 1285 - struct resource *res; 1286 1285 unsigned int reg; 1287 1286 1288 - res = platform_get_resource(pdev, IORESOURCE_MEM, 1); 1289 - csw_csr = devm_ioremap_resource(&pdev->dev, res); 1287 + csw_csr = devm_platform_ioremap_resource(pdev, 1); 1290 1288 if (IS_ERR(csw_csr)) { 1291 1289 dev_err(&pdev->dev, "ioremap failed for CSW CSR resource\n"); 1292 1290 return PTR_ERR(csw_csr); 1293 1291 } 1294 1292 1295 - res = platform_get_resource(pdev, IORESOURCE_MEM, 2); 1296 - mcba_csr = devm_ioremap_resource(&pdev->dev, res); 1293 + mcba_csr = devm_platform_ioremap_resource(pdev, 2); 1297 1294 if (IS_ERR(mcba_csr)) { 1298 1295 dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n"); 1299 1296 return PTR_ERR(mcba_csr); 1300 1297 } 1301 1298 1302 - res = platform_get_resource(pdev, IORESOURCE_MEM, 3); 1303 - mcbb_csr = devm_ioremap_resource(&pdev->dev, res); 1299 + mcbb_csr = devm_platform_ioremap_resource(pdev, 3); 1304 1300 if (IS_ERR(mcbb_csr)) { 1305 1301 dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n"); 1306 1302 return PTR_ERR(mcbb_csr); ··· 1328 1332 struct platform_device *pdev) 1329 1333 { 1330 1334 void __iomem *csw_csr; 1331 - struct resource *res; 1332 1335 unsigned int reg; 1333 1336 u32 mcb0routing; 1334 1337 u32 mcb1routing; 1335 1338 1336 - res = platform_get_resource(pdev, IORESOURCE_MEM, 1); 1337 - csw_csr = devm_ioremap_resource(&pdev->dev, res); 1339 + csw_csr = devm_platform_ioremap_resource(pdev, 1); 1338 1340 if (IS_ERR(csw_csr)) { 1339 1341 dev_err(&pdev->dev, "ioremap failed for CSW CSR resource\n"); 1340 1342 return PTR_ERR(csw_csr);