Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc/powernv: Add debugfs interface for imc-mode and imc-command

The In-Memory Collection (IMC) counter PMU driver controls the ucode's
execution state. At system boot, the IMC perf driver pauses the ucode.
The ucode state is changed to "running" only when any of the nest units
are monitored or profiled using the perf tool.

Nest units support only a limited set of hardware counters, and the
ucode is always programmed in "production" ("accumulation") mode. This
mode is configured to provide key performance metric data for most of
the nest units.

But the ucode also supports other modes, which are used for "debug" to
drill down into specific nest units. For example, when switched to
"powerbus" debug mode, the ucode dynamically reconfigures the nest
counters to target only "powerbus" related events in the hardware
counters. This allows the IMC nest unit to focus on powerbus related
transactions in the system in more detail. At this point, production
mode events may or may not be counted.

IMC nest counters have both in-band (ucode access) and out-of-band
access. Since not all nest counter configurations are supported by the
ucode, out-of-band tools are used to characterize other nest counter
configurations.

This patch provides an interface via "debugfs" to enable switching of
ucode modes in the system. To switch the ucode mode, one has to first
pause the microcode (via the "imc_cmd" file), and then write the target
mode value to the "imc_mode" file.
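The two-step procedure above (pause, then set the mode) could be driven
from user space roughly as follows. This is only a sketch: the helper
names write_u64_attr() and imc_switch_mode() are hypothetical, the file
paths are passed in by the caller, and the numeric values for the pause
command and the target mode are not given in this commit, so they are
parameters rather than constants.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write one 64-bit value to an attribute file as zero-padded hex text. */
static int write_u64_attr(const char *path, uint64_t val)
{
	FILE *f = fopen(path, "w");
	int rc;

	if (!f)
		return -1;
	rc = (fprintf(f, "0x%016" PRIx64 "\n", val) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/*
 * Switch the ucode mode for one node: write the pause command first,
 * and only then write the new mode. pause_cmd and new_mode are
 * caller-supplied assumptions; the commit does not list real values.
 */
int imc_switch_mode(const char *cmd_path, const char *mode_path,
		    uint64_t pause_cmd, uint64_t new_mode)
{
	assert(cmd_path && mode_path);

	if (write_u64_attr(cmd_path, pause_cmd))
		return -1;
	return write_u64_attr(mode_path, new_mode);
}
```

The mode write is skipped if the pause write fails, so a mode change is
never attempted while the ucode may still be running.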

Proposed Approach:

In the proposed approach, the function (export_imc_mode_and_cmd) which
creates the debugfs interface for imc mode and command is implemented
in opal-imc.c. Thus we can use imc_get_mem_addr() to get the homer
base address for each chip.

The interface to expose imc mode and command is required only if we
have nest PMU units registered. Employing the existing data structures
to track whether any nest units are registered would require extending
data from the perf side to opal-imc.c. Instead, an integer is
introduced to hold that information by counting successful nest unit
registrations. The debugfs interface is removed if that count is zero.
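The counting approach can be illustrated outside the kernel with a
small stand-alone sketch. All names here (struct debug_iface,
register_units() and the two sample callbacks) are hypothetical
stand-ins, not kernel APIs; the point is only the shape of the logic:
create the interface up front, count successes, tear down on zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "the debugfs directory exists". */
struct debug_iface {
	bool present;
};

/* Hypothetical registration callback: returns 0 on success. */
typedef int (*register_fn)(int unit);

/* Sample callbacks used to exercise the sketch. */
int reg_always_fails(int unit) { (void)unit; return -1; }
int reg_even_only(int unit) { return (unit % 2 == 0) ? 0 : -1; }

/*
 * Try to register every unit and count the successes. If nothing was
 * registered, the debug interface has no purpose and is removed.
 */
int register_units(struct debug_iface *dbg, const int *units, size_t n,
		   register_fn reg)
{
	int count = 0;

	for (size_t i = 0; i < n; i++)
		if (reg(units[i]) == 0)
			count++;

	if (count == 0)
		dbg->present = false;

	return count;
}
```

A plain counter keeps the decision local to the registration loop, so
no state has to be shared with the code that owns the units.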

Example for the interface:

$ ls /sys/kernel/debug/imc
imc_cmd_0 imc_cmd_8 imc_mode_0 imc_mode_8
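The file names in the listing above are a fixed prefix followed by the
node id (0 and 8 in this example). A minimal user-space sketch of that
naming scheme, using a hypothetical helper imc_file_name() and
snprintf() so that an oversized id is reported instead of overflowing
the buffer:

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<prefix>_<nid>" into buf. Returns 0 on success and -1 if the
 * name does not fit. imc_file_name() is a hypothetical helper, not
 * part of the patch.
 */
int imc_file_name(char *buf, size_t len, const char *prefix, int nid)
{
	int n = snprintf(buf, len, "%s_%d", prefix, nid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a 16-byte buffer, imc_file_name(buf, sizeof(buf), "imc_mode", 8)
produces "imc_mode_8".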

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Authored by Anju T Sudhakar and committed by Michael Ellerman
684d9840 8b4e6dea

+84
+7
arch/powerpc/include/asm/imc-pmu.h
···
 #define THREAD_IMC_ENABLE	0x8000000000000000ULL
 
 /*
+ * For debugfs interface for imc-mode and imc-command
+ */
+#define IMC_CNTL_BLK_OFFSET	0x3FC00
+#define IMC_CNTL_BLK_CMD_OFFSET	8
+#define IMC_CNTL_BLK_MODE_OFFSET	32
+
+/*
  * Structure to hold memory address information for imc units.
  */
 struct imc_mem_info {
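To make the three offsets above concrete: the patch adds the control
block offset to a per-chip base address, then adds the command or mode
offset on top. The sketch below reproduces only that arithmetic in user
space. The helper names imc_cmd_addr() and imc_mode_addr() are
hypothetical, and the base address in the usage note is an arbitrary
illustrative value.

```c
#include <assert.h>
#include <stdint.h>

#define IMC_CNTL_BLK_OFFSET		0x3FC00
#define IMC_CNTL_BLK_CMD_OFFSET		8
#define IMC_CNTL_BLK_MODE_OFFSET	32

/* Address of the command word: base + control block + 8. */
uint64_t imc_cmd_addr(uint64_t vbase, uint32_t cb_offset)
{
	return vbase + cb_offset + IMC_CNTL_BLK_CMD_OFFSET;
}

/* Address of the mode word: base + control block + 32. */
uint64_t imc_mode_addr(uint64_t vbase, uint32_t cb_offset)
{
	return vbase + cb_offset + IMC_CNTL_BLK_MODE_OFFSET;
}
```

For an illustrative base of 0x100000 and the default control block
offset, the command word is at 0x13FC08 and the mode word at 0x13FC20.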
+77
arch/powerpc/platforms/powernv/opal-imc.c
···
 #include <asm/io.h>
 #include <asm/imc-pmu.h>
 #include <asm/cputhreads.h>
+#include <asm/debugfs.h>
+
+static struct dentry *imc_debugfs_parent;
+
+/* Helpers to export imc command and mode via debugfs */
+static int imc_mem_get(void *data, u64 *val)
+{
+	*val = cpu_to_be64(*(u64 *)data);
+	return 0;
+}
+
+static int imc_mem_set(void *data, u64 val)
+{
+	*(u64 *)data = cpu_to_be64(val);
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(fops_imc_x64, imc_mem_get, imc_mem_set, "0x%016llx\n");
+
+static struct dentry *imc_debugfs_create_x64(const char *name, umode_t mode,
+					     struct dentry *parent, u64 *value)
+{
+	return debugfs_create_file_unsafe(name, mode, parent,
+					  value, &fops_imc_x64);
+}
+
+/*
+ * export_imc_mode_and_cmd: Create a debugfs interface
+ *                     for imc_cmd and imc_mode
+ *                     for each node in the system.
+ *  imc_mode and imc_cmd can be changed by echo into
+ *  this interface.
+ */
+static void export_imc_mode_and_cmd(struct device_node *node,
+				    struct imc_pmu *pmu_ptr)
+{
+	static u64 loc, *imc_mode_addr, *imc_cmd_addr;
+	int chip = 0, nid;
+	char mode[16], cmd[16];
+	u32 cb_offset;
+
+	imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root);
+
+	/*
+	 * Return here, either because 'imc' directory already exists,
+	 * Or failed to create a new one.
+	 */
+	if (!imc_debugfs_parent)
+		return;
+
+	if (of_property_read_u32(node, "cb_offset", &cb_offset))
+		cb_offset = IMC_CNTL_BLK_OFFSET;
+
+	for_each_node(nid) {
+		loc = (u64)(pmu_ptr->mem_info[chip].vbase) + cb_offset;
+		imc_mode_addr = (u64 *)(loc + IMC_CNTL_BLK_MODE_OFFSET);
+		sprintf(mode, "imc_mode_%d", nid);
+		if (!imc_debugfs_create_x64(mode, 0600, imc_debugfs_parent,
+					    imc_mode_addr))
+			goto err;
+
+		imc_cmd_addr = (u64 *)(loc + IMC_CNTL_BLK_CMD_OFFSET);
+		sprintf(cmd, "imc_cmd_%d", nid);
+		if (!imc_debugfs_create_x64(cmd, 0600, imc_debugfs_parent,
+					    imc_cmd_addr))
+			goto err;
+		chip++;
+	}
+	return;
+
+err:
+	debugfs_remove_recursive(imc_debugfs_parent);
+}
 
 /*
  * imc_get_mem_addr_nest: Function to get nest counter memory region
···
 	}
 
 	pmu_ptr->imc_counter_mmaped = true;
+	export_imc_mode_and_cmd(node, pmu_ptr);
 	kfree(base_addr_arr);
 	kfree(chipid_arr);
 	return 0;
···
 				pmu_count++;
 		}
 	}
+
+	/* If none of the nest units are registered, remove debugfs interface */
+	if (pmu_count == 0)
+		debugfs_remove_recursive(imc_debugfs_parent);
 
 	return 0;
 }
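The get and set helpers in the diff above both pass the value through
cpu_to_be64(), which suggests that the mode and command words are kept
in big-endian byte order in memory. Using the same conversion in both
directions works because a byte swap is its own inverse. The sketch
below shows the equivalent round trip in portable user-space C.
load_be64() and store_be64() are hypothetical helpers written for this
illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian 64-bit value, independent of host byte order. */
uint64_t load_be64(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/* Store a 64-bit value in big-endian order, most significant byte first. */
void store_be64(void *p, uint64_t v)
{
	unsigned char *b = p;

	for (int i = 7; i >= 0; i--) {
		b[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}
```

Storing 0x0102030405060708 places byte 0x01 first and 0x08 last, and
loading the same buffer returns the original value on any host.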