Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux into next/soc

From Nicolas Pitre:

This is the first MCPM backend submission for VExpress running on RTSM
aka Fast Models implementing the big.LITTLE system architecture. This
enables SMP secondary boot as well as CPU hotplug on this platform.

A big prerequisite for this support is the CCI driver from Lorenzo, included in this pull request.

Also included are Rob Herring's set_auxcr/get_auxcr helpers, which allow for nicer code.

Signed-off-by: Olof Johansson <olof@lixom.net>

* 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux:
ARM: vexpress: Select multi-cluster SMP operation if required
ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation
ARM: vexpress: introduce DCSCB support
ARM: introduce common set_auxcr/get_auxcr functions
drivers/bus: arm-cci: function to enable CCI ports from early boot code
drivers: bus: add ARM CCI support

+1132 -14
+172
Documentation/devicetree/bindings/arm/cci.txt
··· 1 + ======================================================= 2 + ARM CCI cache coherent interconnect binding description 3 + ======================================================= 4 + 5 + ARM multi-cluster systems maintain inter-cluster coherency through a 6 + cache coherent interconnect (CCI) that is capable of monitoring bus 7 + transactions and managing coherency, TLB invalidations and memory barriers. 8 + 9 + It allows snooping and distributed virtual memory message broadcast across 10 + clusters, through a memory-mapped interface, with a global control register 11 + space and multiple sets of interface control registers, one per slave 12 + interface. 13 + 14 + Bindings for the CCI node follow the ePAPR standard, available from: 15 + 16 + www.power.org/documentation/epapr-version-1-1/ 17 + 18 + with the addition of the bindings described in this document, which are 19 + specific to ARM. 20 + 21 + * CCI interconnect node 22 + 23 + Description: Describes a CCI cache coherent interconnect component 24 + 25 + Node name must be "cci". 26 + Node's parent must be the root node /, and the address space visible 27 + through the CCI interconnect is the same as the one seen from the 28 + root node (ie from the CPUs' perspective as per the DT standard). 29 + Every CCI node has to define the following properties: 30 + 31 + - compatible 32 + Usage: required 33 + Value type: <string> 34 + Definition: must be set to 35 + "arm,cci-400" 36 + 37 + - reg 38 + Usage: required 39 + Value type: <prop-encoded-array> 40 + Definition: A standard property. Specifies the base physical 41 + address of CCI control registers common to all 42 + interfaces. 43 + 44 + - ranges: 45 + Usage: required 46 + Value type: <prop-encoded-array> 47 + Definition: A standard property. Follows the rules in the ePAPR for 48 + hierarchical bus addressing. CCI interface 49 + addresses refer to the parent node addressing 50 + scheme to declare their register bases. 
51 + 52 + CCI interconnect node can define the following child nodes: 53 + 54 + - CCI control interface nodes 55 + 56 + Node name must be "slave-if". 57 + Parent node must be CCI interconnect node. 58 + 59 + A CCI control interface node must contain the following 60 + properties: 61 + 62 + - compatible 63 + Usage: required 64 + Value type: <string> 65 + Definition: must be set to 66 + "arm,cci-400-ctrl-if" 67 + 68 + - interface-type: 69 + Usage: required 70 + Value type: <string> 71 + Definition: must be set to one of {"ace", "ace-lite"} 72 + depending on the interface type the node 73 + represents. 74 + 75 + - reg: 76 + Usage: required 77 + Value type: <prop-encoded-array> 78 + Definition: the base address and size of the 79 + corresponding interface programming 80 + registers. 81 + 82 + * CCI interconnect bus masters 83 + 84 + Description: masters in the device tree connected to a CCI port 85 + (inclusive of CPUs and their cpu nodes). 86 + 87 + A CCI interconnect bus master node must contain the following 88 + properties: 89 + 90 + - cci-control-port: 91 + Usage: required 92 + Value type: <phandle> 93 + Definition: a phandle containing the CCI control interface node 94 + the master is connected to. 
95 + 96 + Example: 97 + 98 + cpus { 99 + #size-cells = <0>; 100 + #address-cells = <1>; 101 + 102 + CPU0: cpu@0 { 103 + device_type = "cpu"; 104 + compatible = "arm,cortex-a15"; 105 + cci-control-port = <&cci_control1>; 106 + reg = <0x0>; 107 + }; 108 + 109 + CPU1: cpu@1 { 110 + device_type = "cpu"; 111 + compatible = "arm,cortex-a15"; 112 + cci-control-port = <&cci_control1>; 113 + reg = <0x1>; 114 + }; 115 + 116 + CPU2: cpu@100 { 117 + device_type = "cpu"; 118 + compatible = "arm,cortex-a7"; 119 + cci-control-port = <&cci_control2>; 120 + reg = <0x100>; 121 + }; 122 + 123 + CPU3: cpu@101 { 124 + device_type = "cpu"; 125 + compatible = "arm,cortex-a7"; 126 + cci-control-port = <&cci_control2>; 127 + reg = <0x101>; 128 + }; 129 + 130 + }; 131 + 132 + dma0: dma@3000000 { 133 + compatible = "arm,pl330", "arm,primecell"; 134 + cci-control-port = <&cci_control0>; 135 + reg = <0x0 0x3000000 0x0 0x1000>; 136 + interrupts = <10>; 137 + #dma-cells = <1>; 138 + #dma-channels = <8>; 139 + #dma-requests = <32>; 140 + }; 141 + 142 + cci@2c090000 { 143 + compatible = "arm,cci-400"; 144 + #address-cells = <1>; 145 + #size-cells = <1>; 146 + reg = <0x0 0x2c090000 0 0x1000>; 147 + ranges = <0x0 0x0 0x2c090000 0x6000>; 148 + 149 + cci_control0: slave-if@1000 { 150 + compatible = "arm,cci-400-ctrl-if"; 151 + interface-type = "ace-lite"; 152 + reg = <0x1000 0x1000>; 153 + }; 154 + 155 + cci_control1: slave-if@4000 { 156 + compatible = "arm,cci-400-ctrl-if"; 157 + interface-type = "ace"; 158 + reg = <0x4000 0x1000>; 159 + }; 160 + 161 + cci_control2: slave-if@5000 { 162 + compatible = "arm,cci-400-ctrl-if"; 163 + interface-type = "ace"; 164 + reg = <0x5000 0x1000>; 165 + }; 166 + }; 167 + 168 + This CCI node corresponds to a CCI component whose control registers sit 169 + at address 0x000000002c090000. 170 + CCI slave interface @0x000000002c091000 is connected to dma controller dma0. 
171 + CCI slave interface @0x000000002c094000 is connected to CPUs {CPU0, CPU1}; 172 + CCI slave interface @0x000000002c095000 is connected to CPUs {CPU2, CPU3};
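The binding above ties each bus master to one CCI slave interface via the cci-control-port phandle. As a rough illustration of the look-up a driver performs against this binding, here is a user-space C sketch; the struct names and the port_index() helper are invented for illustration only (the real kernel code resolves the phandle with of_parse_phandle() and compares device-tree node pointers):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for a device-tree node: each slave-if node is
 * identified here by a plain pointer, much as of_parse_phandle() would
 * return a node pointer in the kernel. */
struct dt_node { const char *name; };

enum port_type { PORT_ACE, PORT_ACE_LITE };

struct port {
    const struct dt_node *dn;   /* slave-if node this port describes */
    enum port_type type;
};

/* Resolve a master's cci-control-port target to a port index by a linear
 * scan over the known slave interfaces, rejecting type mismatches. */
static int port_index(const struct dt_node *phandle_target,
                      const struct port *ports, size_t n,
                      enum port_type type)
{
    for (size_t i = 0; i < n; i++)
        if (ports[i].type == type && ports[i].dn == phandle_target)
            return (int)i;
    return -1;  /* no such port: look-up failure */
}
```

Note that the type check matters: in the driver, CPU ports (ACE) and device ports (ACE-Lite) are looked up through separate paths.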
+19
Documentation/devicetree/bindings/arm/rtsm-dcscb.txt
··· 1 + ARM Dual Cluster System Configuration Block 2 + ------------------------------------------- 3 + 4 + The Dual Cluster System Configuration Block (DCSCB) provides basic 5 + functionality for controlling clocks, resets and configuration pins in 6 + the Dual Cluster System implemented by the Real-Time System Model (RTSM). 7 + 8 + Required properties: 9 + 10 + - compatible : should be "arm,rtsm,dcscb" 11 + 12 + - reg : physical base address and the size of the registers window 13 + 14 + Example: 15 + 16 + dcscb@60000000 { 17 + compatible = "arm,rtsm,dcscb"; 18 + reg = <0x60000000 0x1000>; 19 + };
+14
arch/arm/include/asm/cp15.h
··· 61 61 isb(); 62 62 } 63 63 64 + static inline unsigned int get_auxcr(void) 65 + { 66 + unsigned int val; 67 + asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val)); 68 + return val; 69 + } 70 + 71 + static inline void set_auxcr(unsigned int val) 72 + { 73 + asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR" 74 + : : "r" (val)); 75 + isb(); 76 + } 77 + 64 78 #ifndef CONFIG_SMP 65 79 extern void adjust_cr(unsigned long mask, unsigned long set); 66 80 #endif
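The new helpers make read-modify-write sequences on the Auxiliary Control Register straightforward. Since mrc/mcr cannot execute in user space, the sketch below only models the bit manipulation that the DCSCB backend later performs with these helpers (clearing the ACTLR "SMP" bit, bit 6 on Cortex-A15/A7); the *_model names and the global variable are illustrative stand-ins, not kernel code:

```c
#include <assert.h>

/* ACTLR "SMP" coherency bit on Cortex-A15/A7. */
#define ACTLR_SMP (1u << 6)

static unsigned int actlr;  /* stands in for the real cp15 register */

static unsigned int get_auxcr_model(void) { return actlr; }
static void set_auxcr_model(unsigned int val) { actlr = val; }

/* Disable local coherency, mirroring what dcscb_power_down() does with
 * set_auxcr(get_auxcr() & ~(1 << 6)) before powering a CPU off. */
static void clear_smp_bit(void)
{
    set_auxcr_model(get_auxcr_model() & ~ACTLR_SMP);
}
```

The point of the helpers is exactly this pattern: read, mask, write back, with the isb() in set_auxcr() ensuring the change takes effect before subsequent instructions.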
+9
arch/arm/mach-vexpress/Kconfig
··· 57 57 config ARCH_VEXPRESS_CA9X4 58 58 bool "Versatile Express Cortex-A9x4 tile" 59 59 60 + config ARCH_VEXPRESS_DCSCB 61 + bool "Dual Cluster System Control Block (DCSCB) support" 62 + depends on MCPM 63 + select ARM_CCI 64 + help 65 + Support for the Dual Cluster System Configuration Block (DCSCB). 66 + This is needed to provide CPU and cluster power management 67 + on RTSM implementing big.LITTLE. 68 + 60 69 endmenu
+1
arch/arm/mach-vexpress/Makefile
··· 6 6 7 7 obj-y := v2m.o 8 8 obj-$(CONFIG_ARCH_VEXPRESS_CA9X4) += ct-ca9x4.o 9 + obj-$(CONFIG_ARCH_VEXPRESS_DCSCB) += dcscb.o dcscb_setup.o 9 10 obj-$(CONFIG_SMP) += platsmp.o 10 11 obj-$(CONFIG_HOTPLUG_CPU) += hotplug.o
+2
arch/arm/mach-vexpress/core.h
··· 6 6 7 7 void vexpress_dt_smp_map_io(void); 8 8 9 + bool vexpress_smp_init_ops(void); 10 + 9 11 extern struct smp_operations vexpress_smp_ops; 10 12 11 13 extern void vexpress_cpu_die(unsigned int cpu);
+253
arch/arm/mach-vexpress/dcscb.c
··· 1 + /* 2 + * arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Configuration Block 3 + * 4 + * Created by: Nicolas Pitre, May 2012 5 + * Copyright: (C) 2012-2013 Linaro Limited 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License version 2 as 9 + * published by the Free Software Foundation. 10 + */ 11 + 12 + #include <linux/init.h> 13 + #include <linux/kernel.h> 14 + #include <linux/io.h> 15 + #include <linux/spinlock.h> 16 + #include <linux/errno.h> 17 + #include <linux/of_address.h> 18 + #include <linux/vexpress.h> 19 + #include <linux/arm-cci.h> 20 + 21 + #include <asm/mcpm.h> 22 + #include <asm/proc-fns.h> 23 + #include <asm/cacheflush.h> 24 + #include <asm/cputype.h> 25 + #include <asm/cp15.h> 26 + 27 + 28 + #define RST_HOLD0 0x0 29 + #define RST_HOLD1 0x4 30 + #define SYS_SWRESET 0x8 31 + #define RST_STAT0 0xc 32 + #define RST_STAT1 0x10 33 + #define EAG_CFG_R 0x20 34 + #define EAG_CFG_W 0x24 35 + #define KFC_CFG_R 0x28 36 + #define KFC_CFG_W 0x2c 37 + #define DCS_CFG_R 0x30 38 + 39 + /* 40 + * We can't use regular spinlocks. In the switcher case, it is possible 41 + * for an outbound CPU to call power_down() while its inbound counterpart 42 + * is already live using the same logical CPU number which trips lockdep 43 + * debugging. 
44 + */ 45 + static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED; 46 + 47 + static void __iomem *dcscb_base; 48 + static int dcscb_use_count[4][2]; 49 + static int dcscb_allcpus_mask[2]; 50 + 51 + static int dcscb_power_up(unsigned int cpu, unsigned int cluster) 52 + { 53 + unsigned int rst_hold, cpumask = (1 << cpu); 54 + unsigned int all_mask = dcscb_allcpus_mask[cluster]; 55 + 56 + pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); 57 + if (cpu >= 4 || cluster >= 2) 58 + return -EINVAL; 59 + 60 + /* 61 + * Since this is called with IRQs enabled, and no arch_spin_lock_irq 62 + * variant exists, we need to disable IRQs manually here. 63 + */ 64 + local_irq_disable(); 65 + arch_spin_lock(&dcscb_lock); 66 + 67 + dcscb_use_count[cpu][cluster]++; 68 + if (dcscb_use_count[cpu][cluster] == 1) { 69 + rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4); 70 + if (rst_hold & (1 << 8)) { 71 + /* remove cluster reset and add individual CPU's reset */ 72 + rst_hold &= ~(1 << 8); 73 + rst_hold |= all_mask; 74 + } 75 + rst_hold &= ~(cpumask | (cpumask << 4)); 76 + writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4); 77 + } else if (dcscb_use_count[cpu][cluster] != 2) { 78 + /* 79 + * The only possible values are: 80 + * 0 = CPU down 81 + * 1 = CPU (still) up 82 + * 2 = CPU requested to be up before it had a chance 83 + * to actually make itself down. 84 + * Any other value is a bug. 
85 + */ 86 + BUG(); 87 + } 88 + 89 + arch_spin_unlock(&dcscb_lock); 90 + local_irq_enable(); 91 + 92 + return 0; 93 + } 94 + 95 + static void dcscb_power_down(void) 96 + { 97 + unsigned int mpidr, cpu, cluster, rst_hold, cpumask, all_mask; 98 + bool last_man = false, skip_wfi = false; 99 + 100 + mpidr = read_cpuid_mpidr(); 101 + cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); 102 + cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); 103 + cpumask = (1 << cpu); 104 + all_mask = dcscb_allcpus_mask[cluster]; 105 + 106 + pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); 107 + BUG_ON(cpu >= 4 || cluster >= 2); 108 + 109 + __mcpm_cpu_going_down(cpu, cluster); 110 + 111 + arch_spin_lock(&dcscb_lock); 112 + BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP); 113 + dcscb_use_count[cpu][cluster]--; 114 + if (dcscb_use_count[cpu][cluster] == 0) { 115 + rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4); 116 + rst_hold |= cpumask; 117 + if (((rst_hold | (rst_hold >> 4)) & all_mask) == all_mask) { 118 + rst_hold |= (1 << 8); 119 + last_man = true; 120 + } 121 + writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4); 122 + } else if (dcscb_use_count[cpu][cluster] == 1) { 123 + /* 124 + * A power_up request went ahead of us. 125 + * Even if we do not want to shut this CPU down, 126 + * the caller expects a certain state as if the WFI 127 + * was aborted. So let's continue with cache cleaning. 128 + */ 129 + skip_wfi = true; 130 + } else 131 + BUG(); 132 + 133 + if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) { 134 + arch_spin_unlock(&dcscb_lock); 135 + 136 + /* 137 + * Flush all cache levels for this cluster. 138 + * 139 + * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need 140 + * a preliminary flush here for those CPUs. At least, that's 141 + * the theory -- without the extra flush, Linux explodes on 142 + * RTSM (to be investigated). 
143 + */ 144 + flush_cache_all(); 145 + set_cr(get_cr() & ~CR_C); 146 + flush_cache_all(); 147 + 148 + /* 149 + * This is a harmless no-op. On platforms with a real 150 + * outer cache this might either be needed or not, 151 + * depending on where the outer cache sits. 152 + */ 153 + outer_flush_all(); 154 + 155 + /* Disable local coherency by clearing the ACTLR "SMP" bit: */ 156 + set_auxcr(get_auxcr() & ~(1 << 6)); 157 + 158 + /* 159 + * Disable cluster-level coherency by masking 160 + * incoming snoops and DVM messages: 161 + */ 162 + cci_disable_port_by_cpu(mpidr); 163 + 164 + __mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN); 165 + } else { 166 + arch_spin_unlock(&dcscb_lock); 167 + 168 + /* 169 + * Flush the local CPU cache. 170 + * 171 + * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need 172 + * a preliminary flush here for those CPUs. At least, that's 173 + * the theory -- without the extra flush, Linux explodes on 174 + * RTSM (to be investigated). 175 + */ 176 + flush_cache_louis(); 177 + set_cr(get_cr() & ~CR_C); 178 + flush_cache_louis(); 179 + 180 + /* Disable local coherency by clearing the ACTLR "SMP" bit: */ 181 + set_auxcr(get_auxcr() & ~(1 << 6)); 182 + } 183 + 184 + __mcpm_cpu_down(cpu, cluster); 185 + 186 + /* Now we are prepared for power-down, do it: */ 187 + dsb(); 188 + if (!skip_wfi) 189 + wfi(); 190 + 191 + /* Not dead at this point? Let our caller cope. 
*/ 192 + } 193 + 194 + static const struct mcpm_platform_ops dcscb_power_ops = { 195 + .power_up = dcscb_power_up, 196 + .power_down = dcscb_power_down, 197 + }; 198 + 199 + static void __init dcscb_usage_count_init(void) 200 + { 201 + unsigned int mpidr, cpu, cluster; 202 + 203 + mpidr = read_cpuid_mpidr(); 204 + cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0); 205 + cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); 206 + 207 + pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster); 208 + BUG_ON(cpu >= 4 || cluster >= 2); 209 + dcscb_use_count[cpu][cluster] = 1; 210 + } 211 + 212 + extern void dcscb_power_up_setup(unsigned int affinity_level); 213 + 214 + static int __init dcscb_init(void) 215 + { 216 + struct device_node *node; 217 + unsigned int cfg; 218 + int ret; 219 + 220 + if (!cci_probed()) 221 + return -ENODEV; 222 + 223 + node = of_find_compatible_node(NULL, NULL, "arm,rtsm,dcscb"); 224 + if (!node) 225 + return -ENODEV; 226 + dcscb_base = of_iomap(node, 0); 227 + if (!dcscb_base) 228 + return -EADDRNOTAVAIL; 229 + cfg = readl_relaxed(dcscb_base + DCS_CFG_R); 230 + dcscb_allcpus_mask[0] = (1 << (((cfg >> 16) >> (0 << 2)) & 0xf)) - 1; 231 + dcscb_allcpus_mask[1] = (1 << (((cfg >> 16) >> (1 << 2)) & 0xf)) - 1; 232 + dcscb_usage_count_init(); 233 + 234 + ret = mcpm_platform_register(&dcscb_power_ops); 235 + if (!ret) 236 + ret = mcpm_sync_init(dcscb_power_up_setup); 237 + if (ret) { 238 + iounmap(dcscb_base); 239 + return ret; 240 + } 241 + 242 + pr_info("VExpress DCSCB support installed\n"); 243 + 244 + /* 245 + * Future entries into the kernel can now go 246 + * through the cluster entry vectors. 247 + */ 248 + vexpress_flags_set(virt_to_phys(mcpm_entry_point)); 249 + 250 + return 0; 251 + } 252 + 253 + early_initcall(dcscb_init);
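Two pieces of arithmetic in dcscb.c are easy to misread, so here is a small user-space model of them. The bit layout is inferred from the code above (DCS_CFG_R carries per-cluster CPU counts in nibbles starting at bit 16; RST_HOLDx holds per-CPU reset bits in [3:0] mirrored in [7:4], and bit 8 is the cluster-wide reset), and the function names are illustrative:

```c
#include <assert.h>

/* Mask of all CPUs in a cluster, as dcscb_init() derives it from the
 * DCS_CFG_R register: nibble (cfg >> 16) >> (cluster * 4) is the CPU
 * count, so a count of n yields mask (1 << n) - 1. */
static unsigned int allcpus_mask(unsigned int cfg, unsigned int cluster)
{
    return (1u << (((cfg >> 16) >> (cluster << 2)) & 0xf)) - 1;
}

/* Last-man check from dcscb_power_down(): after adding this CPU's reset
 * bit, the cluster may be shut down only if every CPU is held in reset
 * via either the low nibble or its mirrored copy in bits [7:4]. */
static int is_last_man(unsigned int rst_hold, unsigned int cpu,
                       unsigned int all_mask)
{
    rst_hold |= 1u << cpu;
    return ((rst_hold | (rst_hold >> 4)) & all_mask) == all_mask;
}
```

For example, a DCS_CFG_R value of 0x00320000 describes a 2+3 CPU configuration, and a CPU counts as held even if only its mirrored reset-request bit is set.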
+38
arch/arm/mach-vexpress/dcscb_setup.S
··· 1 + /* 2 + * arch/arm/mach-vexpress/dcscb_setup.S 3 + * 4 + * Created by: Dave Martin, 2012-06-22 5 + * Copyright: (C) 2012-2013 Linaro Limited 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License version 2 as 9 + * published by the Free Software Foundation. 10 + */ 11 + 12 + #include <linux/linkage.h> 13 + 14 + 15 + ENTRY(dcscb_power_up_setup) 16 + 17 + cmp r0, #0 @ check affinity level 18 + beq 2f 19 + 20 + /* 21 + * Enable cluster-level coherency, in preparation for turning on the MMU. 22 + * The ACTLR SMP bit does not need to be set here, because cpu_resume() 23 + * already restores that. 24 + * 25 + * A15/A7 may not require explicit L2 invalidation on reset, depending 26 + * on hardware integration decisions. 27 + * For now, this code assumes that L2 is either already invalidated, 28 + * or invalidation is not required. 29 + */ 30 + 31 + b cci_enable_port_for_self 32 + 33 + 2: @ Implementation-specific local CPU setup operations should go here, 34 + @ if any. In this case, there is nothing to do. 35 + 36 + bx lr 37 + 38 + ENDPROC(dcscb_power_up_setup)
+20
arch/arm/mach-vexpress/platsmp.c
··· 12 12 #include <linux/errno.h> 13 13 #include <linux/smp.h> 14 14 #include <linux/io.h> 15 + #include <linux/of.h> 15 16 #include <linux/of_fdt.h> 16 17 #include <linux/vexpress.h> 17 18 19 + #include <asm/mcpm.h> 18 20 #include <asm/smp_scu.h> 19 21 #include <asm/mach/map.h> 20 22 ··· 205 203 .cpu_die = vexpress_cpu_die, 206 204 #endif 207 205 }; 206 + 207 + bool __init vexpress_smp_init_ops(void) 208 + { 209 + #ifdef CONFIG_MCPM 210 + /* 211 + * The best way to detect a multi-cluster configuration at the moment 212 + * is to look for the presence of a CCI in the system. 213 + * Override the default vexpress_smp_ops if so. 214 + */ 215 + struct device_node *node; 216 + node = of_find_compatible_node(NULL, NULL, "arm,cci-400"); 217 + if (node && of_device_is_available(node)) { 218 + mcpm_smp_set_ops(); 219 + return true; 220 + } 221 + #endif 222 + return false; 223 + }
+1
arch/arm/mach-vexpress/v2m.c
··· 456 456 DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express") 457 457 .dt_compat = v2m_dt_match, 458 458 .smp = smp_ops(vexpress_smp_ops), 459 + .smp_init = smp_init_ops(vexpress_smp_init_ops), 459 460 .map_io = v2m_dt_map_io, 460 461 .init_early = v2m_dt_init_early, 461 462 .init_irq = irqchip_init,
+7
drivers/bus/Kconfig
··· 26 26 27 27 help 28 28 Driver to enable OMAP interconnect error handling driver. 29 + 30 + config ARM_CCI 31 + bool "ARM CCI driver support" 32 + depends on ARM 33 + help 34 + Driver supporting the CCI cache coherent interconnect for ARM 35 + platforms. 29 36 endmenu
+2
drivers/bus/Makefile
··· 7 7 8 8 # Interconnect bus driver for OMAP SoCs. 9 9 obj-$(CONFIG_OMAP_INTERCONNECT) += omap_l3_smx.o omap_l3_noc.o 10 + # CCI cache coherent interconnect for ARM platforms 11 + obj-$(CONFIG_ARM_CCI) += arm-cci.o
+533
drivers/bus/arm-cci.c
··· 1 + /* 2 + * CCI cache coherent interconnect driver 3 + * 4 + * Copyright (C) 2013 ARM Ltd. 5 + * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License version 2 as 9 + * published by the Free Software Foundation. 10 + * 11 + * This program is distributed "as is" WITHOUT ANY WARRANTY of any 12 + * kind, whether express or implied; without even the implied warranty 13 + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 + * GNU General Public License for more details. 15 + */ 16 + 17 + #include <linux/arm-cci.h> 18 + #include <linux/io.h> 19 + #include <linux/module.h> 20 + #include <linux/of_address.h> 21 + #include <linux/slab.h> 22 + 23 + #include <asm/cacheflush.h> 24 + #include <asm/smp_plat.h> 25 + 26 + #define CCI_PORT_CTRL 0x0 27 + #define CCI_CTRL_STATUS 0xc 28 + 29 + #define CCI_ENABLE_SNOOP_REQ 0x1 30 + #define CCI_ENABLE_DVM_REQ 0x2 31 + #define CCI_ENABLE_REQ (CCI_ENABLE_SNOOP_REQ | CCI_ENABLE_DVM_REQ) 32 + 33 + struct cci_nb_ports { 34 + unsigned int nb_ace; 35 + unsigned int nb_ace_lite; 36 + }; 37 + 38 + enum cci_ace_port_type { 39 + ACE_INVALID_PORT = 0x0, 40 + ACE_PORT, 41 + ACE_LITE_PORT, 42 + }; 43 + 44 + struct cci_ace_port { 45 + void __iomem *base; 46 + unsigned long phys; 47 + enum cci_ace_port_type type; 48 + struct device_node *dn; 49 + }; 50 + 51 + static struct cci_ace_port *ports; 52 + static unsigned int nb_cci_ports; 53 + 54 + static void __iomem *cci_ctrl_base; 55 + static unsigned long cci_ctrl_phys; 56 + 57 + struct cpu_port { 58 + u64 mpidr; 59 + u32 port; 60 + }; 61 + 62 + /* 63 + * Use the port MSB as valid flag, shift can be made dynamic 64 + * by computing number of bits required for port indexes. 
65 + * Code disabling CCI cpu ports runs with D-cache invalidated 66 + * and SCTLR bit clear so data accesses must be kept to a minimum 67 + * to improve performance; for now shift is left static to 68 + * avoid one more data access while disabling the CCI port. 69 + */ 70 + #define PORT_VALID_SHIFT 31 71 + #define PORT_VALID (0x1 << PORT_VALID_SHIFT) 72 + 73 + static inline void init_cpu_port(struct cpu_port *port, u32 index, u64 mpidr) 74 + { 75 + port->port = PORT_VALID | index; 76 + port->mpidr = mpidr; 77 + } 78 + 79 + static inline bool cpu_port_is_valid(struct cpu_port *port) 80 + { 81 + return !!(port->port & PORT_VALID); 82 + } 83 + 84 + static inline bool cpu_port_match(struct cpu_port *port, u64 mpidr) 85 + { 86 + return port->mpidr == (mpidr & MPIDR_HWID_BITMASK); 87 + } 88 + 89 + static struct cpu_port cpu_port[NR_CPUS]; 90 + 91 + /** 92 + * __cci_ace_get_port - Function to retrieve the port index connected to 93 + * a cpu or device. 94 + * 95 + * @dn: device node of the device to look-up 96 + * @type: port type 97 + * 98 + * Return value: 99 + * - CCI port index if success 100 + * - -ENODEV if failure 101 + */ 102 + static int __cci_ace_get_port(struct device_node *dn, int type) 103 + { 104 + int i; 105 + bool ace_match; 106 + struct device_node *cci_portn; 107 + 108 + cci_portn = of_parse_phandle(dn, "cci-control-port", 0); 109 + for (i = 0; i < nb_cci_ports; i++) { 110 + ace_match = ports[i].type == type; 111 + if (ace_match && cci_portn == ports[i].dn) 112 + return i; 113 + } 114 + return -ENODEV; 115 + } 116 + 117 + int cci_ace_get_port(struct device_node *dn) 118 + { 119 + return __cci_ace_get_port(dn, ACE_LITE_PORT); 120 + } 121 + EXPORT_SYMBOL_GPL(cci_ace_get_port); 122 + 123 + static void __init cci_ace_init_ports(void) 124 + { 125 + int port, ac, cpu; 126 + u64 hwid; 127 + const u32 *cell; 128 + struct device_node *cpun, *cpus; 129 + 130 + cpus = of_find_node_by_path("/cpus"); 131 + if (WARN(!cpus, "Missing cpus node, bailing out\n")) 132 + 
return; 133 + 134 + if (WARN_ON(of_property_read_u32(cpus, "#address-cells", &ac))) 135 + ac = of_n_addr_cells(cpus); 136 + 137 + /* 138 + * Port index look-up speeds up the function disabling ports by CPU, 139 + * since the logical to port index mapping is done once and does 140 + * not change after system boot. 141 + * The stashed index array is initialized for all possible CPUs 142 + * at probe time. 143 + */ 144 + for_each_child_of_node(cpus, cpun) { 145 + if (of_node_cmp(cpun->type, "cpu")) 146 + continue; 147 + cell = of_get_property(cpun, "reg", NULL); 148 + if (WARN(!cell, "%s: missing reg property\n", cpun->full_name)) 149 + continue; 150 + 151 + hwid = of_read_number(cell, ac); 152 + cpu = get_logical_index(hwid & MPIDR_HWID_BITMASK); 153 + 154 + if (cpu < 0 || !cpu_possible(cpu)) 155 + continue; 156 + port = __cci_ace_get_port(cpun, ACE_PORT); 157 + if (port < 0) 158 + continue; 159 + 160 + init_cpu_port(&cpu_port[cpu], port, cpu_logical_map(cpu)); 161 + } 162 + 163 + for_each_possible_cpu(cpu) { 164 + WARN(!cpu_port_is_valid(&cpu_port[cpu]), 165 + "CPU %u does not have an associated CCI port\n", 166 + cpu); 167 + } 168 + } 169 + /* 170 + * Functions to enable/disable a CCI interconnect slave port 171 + * 172 + * They are called by low-level power management code to disable slave 173 + * interface snoops and DVM broadcasts. 174 + * Since they may execute with cache data allocation disabled and 175 + * after the caches have been cleaned and invalidated, the functions 176 + * provide no explicit locking: with the D-cache disabled, normal 177 + * cacheable kernel locks based on ldrex/strex may not work. 178 + * Locking has to be provided by BSP implementations to ensure proper 179 + * operation. 
180 + */ 181 + 182 + /** 183 + * cci_port_control() - function to control a CCI port 184 + * 185 + * @port: index of the port to setup 186 + * @enable: if true enables the port, if false disables it 187 + */ 188 + static void notrace cci_port_control(unsigned int port, bool enable) 189 + { 190 + void __iomem *base = ports[port].base; 191 + 192 + writel_relaxed(enable ? CCI_ENABLE_REQ : 0, base + CCI_PORT_CTRL); 193 + /* 194 + * This function is called from power down procedures 195 + * and must not execute any instruction that might 196 + * cause the processor to be put in a quiescent state 197 + * (eg wfi). Hence, cpu_relax() can not be added to this 198 + * read loop to optimize power, since it might hide possibly 199 + * disruptive operations. 200 + */ 201 + while (readl_relaxed(cci_ctrl_base + CCI_CTRL_STATUS) & 0x1) 202 + ; 203 + } 204 + 205 + /** 206 + * cci_disable_port_by_cpu() - function to disable a CCI port by CPU 207 + * reference 208 + * 209 + * @mpidr: mpidr of the CPU whose CCI port should be disabled 210 + * 211 + * Disabling a CCI port for a CPU implies disabling the CCI port 212 + * controlling that CPU cluster. Code disabling CPU CCI ports 213 + * must make sure that the CPU running the code is the last active CPU 214 + * in the cluster ie all other CPUs are quiescent in a low power state. 
215 + * 216 + * Return: 217 + * 0 on success 218 + * -ENODEV on port look-up failure 219 + */ 220 + int notrace cci_disable_port_by_cpu(u64 mpidr) 221 + { 222 + int cpu; 223 + bool is_valid; 224 + for (cpu = 0; cpu < nr_cpu_ids; cpu++) { 225 + is_valid = cpu_port_is_valid(&cpu_port[cpu]); 226 + if (is_valid && cpu_port_match(&cpu_port[cpu], mpidr)) { 227 + cci_port_control(cpu_port[cpu].port, false); 228 + return 0; 229 + } 230 + } 231 + return -ENODEV; 232 + } 233 + EXPORT_SYMBOL_GPL(cci_disable_port_by_cpu); 234 + 235 + /** 236 + * cci_enable_port_for_self() - enable a CCI port for calling CPU 237 + * 238 + * Enabling a CCI port for the calling CPU implies enabling the CCI 239 + * port controlling that CPU's cluster. Caller must make sure that the 240 + * CPU running the code is the first active CPU in the cluster and all 241 + * other CPUs are quiescent in a low power state or waiting for this CPU 242 + * to complete the CCI initialization. 243 + * 244 + * Because this is called when the MMU is still off and with no stack, 245 + * the code must be position independent and ideally rely on callee 246 + * clobbered registers only. To achieve this we must code this function 247 + * entirely in assembler. 248 + * 249 + * On success this returns with the proper CCI port enabled. In case of 250 + * any failure this never returns as the inability to enable the CCI is 251 + * fatal and there is no possible recovery at this stage. 
252 + */ 253 + asmlinkage void __naked cci_enable_port_for_self(void) 254 + { 255 + asm volatile ("\n" 256 + 257 + " mrc p15, 0, r0, c0, c0, 5 @ get MPIDR value \n" 258 + " and r0, r0, #"__stringify(MPIDR_HWID_BITMASK)" \n" 259 + " adr r1, 5f \n" 260 + " ldr r2, [r1] \n" 261 + " add r1, r1, r2 @ &cpu_port \n" 262 + " add ip, r1, %[sizeof_cpu_port] \n" 263 + 264 + /* Loop over the cpu_port array looking for a matching MPIDR */ 265 + "1: ldr r2, [r1, %[offsetof_cpu_port_mpidr_lsb]] \n" 266 + " cmp r2, r0 @ compare MPIDR \n" 267 + " bne 2f \n" 268 + 269 + /* Found a match, now test port validity */ 270 + " ldr r3, [r1, %[offsetof_cpu_port_port]] \n" 271 + " tst r3, #"__stringify(PORT_VALID)" \n" 272 + " bne 3f \n" 273 + 274 + /* no match, loop with the next cpu_port entry */ 275 + "2: add r1, r1, %[sizeof_struct_cpu_port] \n" 276 + " cmp r1, ip @ done? \n" 277 + " blo 1b \n" 278 + 279 + /* CCI port not found -- cheaply try to stall this CPU */ 280 + "cci_port_not_found: \n" 281 + " wfi \n" 282 + " wfe \n" 283 + " b cci_port_not_found \n" 284 + 285 + /* Use matched port index to look up the corresponding ports entry */ 286 + "3: bic r3, r3, #"__stringify(PORT_VALID)" \n" 287 + " adr r0, 6f \n" 288 + " ldmia r0, {r1, r2} \n" 289 + " sub r1, r1, r0 @ virt - phys \n" 290 + " ldr r0, [r0, r2] @ *(&ports) \n" 291 + " mov r2, %[sizeof_struct_ace_port] \n" 292 + " mla r0, r2, r3, r0 @ &ports[index] \n" 293 + " sub r0, r0, r1 @ virt_to_phys() \n" 294 + 295 + /* Enable the CCI port */ 296 + " ldr r0, [r0, %[offsetof_port_phys]] \n" 297 + " mov r3, #"__stringify(CCI_ENABLE_REQ)" \n" 298 + " str r3, [r0, #"__stringify(CCI_PORT_CTRL)"] \n" 299 + 300 + /* poll the status reg for completion */ 301 + " adr r1, 7f \n" 302 + " ldr r0, [r1] \n" 303 + " ldr r0, [r0, r1] @ cci_ctrl_base \n" 304 + "4: ldr r1, [r0, #"__stringify(CCI_CTRL_STATUS)"] \n" 305 + " tst r1, #1 \n" 306 + " bne 4b \n" 307 + 308 + " mov r0, #0 \n" 309 + " bx lr \n" 310 + 311 + " .align 2 \n" 312 + "5: .word cpu_port 
- . \n" 313 + "6: .word . \n" 314 + " .word ports - 6b \n" 315 + "7: .word cci_ctrl_phys - . \n" 316 + : : 317 + [sizeof_cpu_port] "i" (sizeof(cpu_port)), 318 + #ifndef __ARMEB__ 319 + [offsetof_cpu_port_mpidr_lsb] "i" (offsetof(struct cpu_port, mpidr)), 320 + #else 321 + [offsetof_cpu_port_mpidr_lsb] "i" (offsetof(struct cpu_port, mpidr)+4), 322 + #endif 323 + [offsetof_cpu_port_port] "i" (offsetof(struct cpu_port, port)), 324 + [sizeof_struct_cpu_port] "i" (sizeof(struct cpu_port)), 325 + [sizeof_struct_ace_port] "i" (sizeof(struct cci_ace_port)), 326 + [offsetof_port_phys] "i" (offsetof(struct cci_ace_port, phys)) ); 327 + 328 + unreachable(); 329 + } 330 + 331 + /** 332 + * __cci_control_port_by_device() - function to control a CCI port by device 333 + * reference 334 + * 335 + * @dn: device node pointer of the device whose CCI port should be 336 + * controlled 337 + * @enable: if true enables the port, if false disables it 338 + * 339 + * Return: 340 + * 0 on success 341 + * -ENODEV on port look-up failure 342 + */ 343 + int notrace __cci_control_port_by_device(struct device_node *dn, bool enable) 344 + { 345 + int port; 346 + 347 + if (!dn) 348 + return -ENODEV; 349 + 350 + port = __cci_ace_get_port(dn, ACE_LITE_PORT); 351 + if (WARN_ONCE(port < 0, "node %s ACE lite port look-up failure\n", 352 + dn->full_name)) 353 + return -ENODEV; 354 + cci_port_control(port, enable); 355 + return 0; 356 + } 357 + EXPORT_SYMBOL_GPL(__cci_control_port_by_device); 358 + 359 + /** 360 + * __cci_control_port_by_index() - function to control a CCI port by port index 361 + * 362 + * @port: port index previously retrieved with cci_ace_get_port() 363 + * @enable: if true enables the port, if false disables it 364 + * 365 + * Return: 366 + * 0 on success 367 + * -ENODEV on port index out of range 368 + * -EPERM if operation carried out on an ACE PORT 369 + */ 370 + int notrace __cci_control_port_by_index(u32 port, bool enable) 371 + { 372 + if (port >= nb_cci_ports || 
ports[port].type == ACE_INVALID_PORT) 373 + return -ENODEV; 374 + /* 375 + * CCI control for ports connected to CPUS is extremely fragile 376 + * and must be made to go through a specific and controlled 377 + * interface (ie cci_disable_port_by_cpu(); control by general purpose 378 + * indexing is therefore disabled for ACE ports. 379 + */ 380 + if (ports[port].type == ACE_PORT) 381 + return -EPERM; 382 + 383 + cci_port_control(port, enable); 384 + return 0; 385 + } 386 + EXPORT_SYMBOL_GPL(__cci_control_port_by_index); 387 + 388 + static const struct cci_nb_ports cci400_ports = { 389 + .nb_ace = 2, 390 + .nb_ace_lite = 3 391 + }; 392 + 393 + static const struct of_device_id arm_cci_matches[] = { 394 + {.compatible = "arm,cci-400", .data = &cci400_ports }, 395 + {}, 396 + }; 397 + 398 + static const struct of_device_id arm_cci_ctrl_if_matches[] = { 399 + {.compatible = "arm,cci-400-ctrl-if", }, 400 + {}, 401 + }; 402 + 403 + static int __init cci_probe(void) 404 + { 405 + struct cci_nb_ports const *cci_config; 406 + int ret, i, nb_ace = 0, nb_ace_lite = 0; 407 + struct device_node *np, *cp; 408 + struct resource res; 409 + const char *match_str; 410 + bool is_ace; 411 + 412 + np = of_find_matching_node(NULL, arm_cci_matches); 413 + if (!np) 414 + return -ENODEV; 415 + 416 + cci_config = of_match_node(arm_cci_matches, np)->data; 417 + if (!cci_config) 418 + return -ENODEV; 419 + 420 + nb_cci_ports = cci_config->nb_ace + cci_config->nb_ace_lite; 421 + 422 + ports = kcalloc(sizeof(*ports), nb_cci_ports, GFP_KERNEL); 423 + if (!ports) 424 + return -ENOMEM; 425 + 426 + ret = of_address_to_resource(np, 0, &res); 427 + if (!ret) { 428 + cci_ctrl_base = ioremap(res.start, resource_size(&res)); 429 + cci_ctrl_phys = res.start; 430 + } 431 + if (ret || !cci_ctrl_base) { 432 + WARN(1, "unable to ioremap CCI ctrl\n"); 433 + ret = -ENXIO; 434 + goto memalloc_err; 435 + } 436 + 437 + for_each_child_of_node(np, cp) { 438 + if (!of_match_node(arm_cci_ctrl_if_matches, cp)) 439 + 
			continue;
440 +
441 + 		i = nb_ace + nb_ace_lite;
442 +
443 + 		if (i >= nb_cci_ports)
444 + 			break;
445 +
446 + 		if (of_property_read_string(cp, "interface-type",
447 + 					&match_str)) {
448 + 			WARN(1, "node %s missing interface-type property\n",
449 + 				  cp->full_name);
450 + 			continue;
451 + 		}
452 + 		is_ace = strcmp(match_str, "ace") == 0;
453 + 		if (!is_ace && strcmp(match_str, "ace-lite")) {
454 + 			WARN(1, "node %s containing invalid interface-type property, skipping it\n",
455 + 					cp->full_name);
456 + 			continue;
457 + 		}
458 +
459 + 		ret = of_address_to_resource(cp, 0, &res);
460 + 		if (!ret) {
461 + 			ports[i].base = ioremap(res.start, resource_size(&res));
462 + 			ports[i].phys = res.start;
463 + 		}
464 + 		if (ret || !ports[i].base) {
465 + 			WARN(1, "unable to ioremap CCI port %d\n", i);
466 + 			continue;
467 + 		}
468 +
469 + 		if (is_ace) {
470 + 			if (WARN_ON(nb_ace >= cci_config->nb_ace))
471 + 				continue;
472 + 			ports[i].type = ACE_PORT;
473 + 			++nb_ace;
474 + 		} else {
475 + 			if (WARN_ON(nb_ace_lite >= cci_config->nb_ace_lite))
476 + 				continue;
477 + 			ports[i].type = ACE_LITE_PORT;
478 + 			++nb_ace_lite;
479 + 		}
480 + 		ports[i].dn = cp;
481 + 	}
482 +
483 + 	/* initialize a stashed array of ACE ports to speed-up look-up */
484 + 	cci_ace_init_ports();
485 +
486 + 	/*
487 + 	 * Multi-cluster systems may need this data when non-coherent, during
488 + 	 * cluster power-up/power-down. Make sure it reaches main memory.
489 + 	 */
490 + 	sync_cache_w(&cci_ctrl_base);
491 + 	sync_cache_w(&cci_ctrl_phys);
492 + 	sync_cache_w(&ports);
493 + 	sync_cache_w(&cpu_port);
494 + 	__sync_cache_range_w(ports, sizeof(*ports) * nb_cci_ports);
495 + 	pr_info("ARM CCI driver probed\n");
496 + 	return 0;
497 +
498 + memalloc_err:
499 +
500 + 	kfree(ports);
501 + 	return ret;
502 + }
503 +
504 + static int cci_init_status = -EAGAIN;
505 + static DEFINE_MUTEX(cci_probing);
506 +
507 + static int __init cci_init(void)
508 + {
509 + 	if (cci_init_status != -EAGAIN)
510 + 		return cci_init_status;
511 +
512 + 	mutex_lock(&cci_probing);
513 + 	if (cci_init_status == -EAGAIN)
514 + 		cci_init_status = cci_probe();
515 + 	mutex_unlock(&cci_probing);
516 + 	return cci_init_status;
517 + }
518 +
519 + /*
520 +  * To sort out the ordering of early init calls, a helper function is
521 +  * provided to check whether the CCI driver has been initialized. If it
522 +  * has not, the helper calls the init function that probes the driver
523 +  * and updates the return value.
524 +  */
525 + bool __init cci_probed(void)
526 + {
527 + 	return cci_init() == 0;
528 + }
529 + EXPORT_SYMBOL_GPL(cci_probed);
530 +
531 + early_initcall(cci_init);
532 + MODULE_LICENSE("GPL");
533 + MODULE_DESCRIPTION("ARM CCI support");
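The by-index entry point above layers three checks before touching hardware: bounds, port validity, and the ACE-port lockout. A compact userspace model can make that order of checks easier to follow. This is a sketch under stated assumptions, not the driver itself: `port_type[]` and `control_port_by_index()` are hypothetical stand-ins mirroring `__cci_control_port_by_index()`'s logic, with the final `cci_port_control()` call elided.

```c
#include <assert.h>
#include <errno.h>	/* ENODEV, EPERM */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins mirroring the driver's checks. */
enum cci_ace_port_type { ACE_INVALID_PORT, ACE_PORT, ACE_LITE_PORT };

static enum cci_ace_port_type port_type[5];	/* models ports[].type */
static uint32_t nb_cci_ports = 5;

static int control_port_by_index(uint32_t port, bool enable)
{
	(void)enable;
	/* index out of range, or a slave interface that never probed */
	if (port >= nb_cci_ports || port_type[port] == ACE_INVALID_PORT)
		return -ENODEV;
	/* CPU-facing ACE ports must use cci_disable_port_by_cpu() instead */
	if (port_type[port] == ACE_PORT)
		return -EPERM;
	return 0;	/* the driver would call cci_port_control() here */
}
```

Note the ordering: an out-of-range index reports -ENODEV rather than -EPERM, since the type of a nonexistent port cannot be inspected.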
-14
drivers/cpuidle/cpuidle-calxeda.c
···
37  37  extern void highbank_set_cpu_jump(int cpu, void *jump_addr);
38  38  extern void *scu_base_addr;
39  39
40    - static inline unsigned int get_auxcr(void)
41    - {
42    - 	unsigned int val;
43    - 	asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val) : : "cc");
44    - 	return val;
45    - }
46    -
47    - static inline void set_auxcr(unsigned int val)
48    - {
49    - 	asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR"
50    - 	  : : "r" (val) : "cc");
51    - 	isb();
52    - }
53    -
54  40  static noinline void calxeda_idle_restore(void)
55  41  {
56  42  	set_cr(get_cr() | CR_C);
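The hunk above drops calxeda's private copies in favour of the common `set_auxcr()`/`get_auxcr()` from Rob Herring's patch in this series. The usual calling pattern is a read-modify-write of the Auxiliary Control Register. The sketch below models that pattern in portable C; `fake_auxcr`, its initial value, and `clear_auxcr_bit()` are illustrative assumptions standing in for the real cp15 register accessors shown in the deleted lines.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the cp15 Auxiliary Control Register; on real
 * hardware get_auxcr()/set_auxcr() wrap the mrc/mcr p15 accesses above. */
static uint32_t fake_auxcr = 0x41;

static uint32_t get_auxcr(void) { return fake_auxcr; }
static void set_auxcr(uint32_t val) { fake_auxcr = val; }

/* Read-modify-write, e.g. clearing one control bit before power-down. */
static void clear_auxcr_bit(uint32_t bit)
{
	set_auxcr(get_auxcr() & ~(UINT32_C(1) << bit));
}
```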
+61
include/linux/arm-cci.h
···
 1 + /*
 2 +  * CCI cache coherent interconnect support
 3 +  *
 4 +  * Copyright (C) 2013 ARM Ltd.
 5 +  *
 6 +  * This program is free software; you can redistribute it and/or modify
 7 +  * it under the terms of the GNU General Public License as published by
 8 +  * the Free Software Foundation; either version 2 of the License, or
 9 +  * (at your option) any later version.
10 +  *
11 +  * This program is distributed in the hope that it will be useful,
12 +  * but WITHOUT ANY WARRANTY; without even the implied warranty of
13 +  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14 +  * GNU General Public License for more details.
15 +  *
16 +  * You should have received a copy of the GNU General Public License
17 +  * along with this program; if not, write to the Free Software
18 +  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
19 +  */
20 +
21 + #ifndef __LINUX_ARM_CCI_H
22 + #define __LINUX_ARM_CCI_H
23 +
24 + #include <linux/errno.h>
25 + #include <linux/types.h>
26 +
27 + struct device_node;
28 +
29 + #ifdef CONFIG_ARM_CCI
30 + extern bool cci_probed(void);
31 + extern int cci_ace_get_port(struct device_node *dn);
32 + extern int cci_disable_port_by_cpu(u64 mpidr);
33 + extern int __cci_control_port_by_device(struct device_node *dn, bool enable);
34 + extern int __cci_control_port_by_index(u32 port, bool enable);
35 + #else
36 + static inline bool cci_probed(void) { return false; }
37 + static inline int cci_ace_get_port(struct device_node *dn)
38 + {
39 + 	return -ENODEV;
40 + }
41 + static inline int cci_disable_port_by_cpu(u64 mpidr) { return -ENODEV; }
42 + static inline int __cci_control_port_by_device(struct device_node *dn,
43 + 					       bool enable)
44 + {
45 + 	return -ENODEV;
46 + }
47 + static inline int __cci_control_port_by_index(u32 port, bool enable)
48 + {
49 + 	return -ENODEV;
50 + }
51 + #endif
52 + #define cci_disable_port_by_device(dev) \
53 + 	__cci_control_port_by_device(dev, false)
54 + #define cci_enable_port_by_device(dev) \
55 + 	__cci_control_port_by_device(dev, true)
56 + #define cci_disable_port_by_index(dev) \
57 + 	__cci_control_port_by_index(dev, false)
58 + #define cci_enable_port_by_index(dev) \
59 + 	__cci_control_port_by_index(dev, true)
60 +
61 + #endif
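With the `!CONFIG_ARM_CCI` stubs in this header, platform code can call the CCI helpers unconditionally and just observe the failure codes when the driver is compiled out. The userspace model below mirrors that stub branch; the stub bodies follow the header, while the surrounding typedef and error-code handling are assumptions needed to compile outside the kernel.

```c
#include <assert.h>
#include <errno.h>	/* ENODEV */
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of the !CONFIG_ARM_CCI stub branch of the header. */
struct device_node;
typedef unsigned long long u64;	/* stands in for the kernel's u64 */

static inline bool cci_probed(void) { return false; }
static inline int cci_disable_port_by_cpu(u64 mpidr)
{
	(void)mpidr;
	return -ENODEV;
}
static inline int __cci_control_port_by_device(struct device_node *dn,
					       bool enable)
{
	(void)dn; (void)enable;
	return -ENODEV;
}

/* Convenience wrappers, as defined at the bottom of the header. */
#define cci_disable_port_by_device(dev) \
	__cci_control_port_by_device(dev, false)
#define cci_enable_port_by_device(dev) \
	__cci_control_port_by_device(dev, true)
```

A caller such as an MCPM backend would typically gate its coherency setup on `cci_probed()` and treat -ENODEV from the port helpers as "no CCI present".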