Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

- Convert pseries & powernv to use MSI IRQ domains.

- Rework the pseries CPU numbering so that CPUs that are removed, and
later re-added, are given a CPU number on the same node as
previously, when possible.

- Add support for a new more flexible device-tree format for specifying
NUMA distances.

- Convert powerpc to GENERIC_PTDUMP.

- Retire sbc8548 and sbc8641d board support.

- Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard,
Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas,
Fangrui Song, Finn Thain, Gautham R. Shenoy, Hari Bathini, Joel
Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas
Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan
Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R.
Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan
Jiabing, Xiongwei Song, and Zheng Yongjun.

* tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits)
powerpc/bug: Cast to unsigned long before passing to inline asm
powerpc/ptdump: Fix generic ptdump for 64-bit
KVM: PPC: Fix clearing never mapped TCEs in realmode
powerpc/pseries/iommu: Rename "direct window" to "dma window"
powerpc/pseries/iommu: Make use of DDW for indirect mapping
powerpc/pseries/iommu: Find existing DDW with given property name
powerpc/pseries/iommu: Update remove_dma_window() to accept property name
powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper
powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
powerpc/pseries/iommu: Allow DDW windows starting at 0x00
powerpc/pseries/iommu: Add ddw_list_new_entry() helper
powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
powerpc/kernel/iommu: Add new iommu_table_in_use() helper
powerpc/pseries/iommu: Replace hard-coded page shift
powerpc/numa: Update cpu_cpu_map on CPU online/offline
powerpc/numa: Print debug statements only when required
powerpc/numa: convert printk to pr_xxx
powerpc/numa: Drop dbg in favour of pr_debug
powerpc/smp: Enable CACHE domain for shared processor
powerpc/smp: Update cpu_core_map on all PowerPc systems
...

+2824 -2627
+105
Documentation/powerpc/associativity.rst
···
+ ============================
+ NUMA resource associativity
+ ============================
+
+ Associativity represents the groupings of the various platform resources into
+ domains of substantially similar mean performance relative to resources outside
+ of that domain. Resource subsets of a given domain that exhibit better
+ performance relative to each other than to other resource subsets are
+ represented as members of a sub-grouping domain. This performance
+ characteristic is presented in terms of NUMA node distance within the Linux kernel.
+ From the platform view, these groups are also referred to as domains.
+
+ The PAPR interface currently supports different ways of communicating these
+ resource grouping details to the OS. These are referred to as Form 0, Form 1
+ and Form 2 associativity grouping. Form 0 is the oldest format and is now
+ considered deprecated.
+
+ The hypervisor indicates the type/form of associativity used via the
+ "ibm,architecture-vec-5" property. Bit 0 of byte 5 in the
+ "ibm,architecture-vec-5" property indicates usage of Form 0 or Form 1.
+ A value of 1 indicates the usage of Form 1 associativity. For Form 2
+ associativity, bit 2 of byte 5 in the "ibm,architecture-vec-5" property is used.
+
+ Form 0
+ ------
+ Form 0 associativity supports only two NUMA distances (LOCAL and REMOTE).
+
+ Form 1
+ ------
+ With Form 1, a combination of the ibm,associativity-reference-points and
+ ibm,associativity device tree properties is used to determine the NUMA
+ distance between resource groups/domains.
+
+ The "ibm,associativity" property contains a list of one or more numbers
+ (domainIDs) representing the resource's platform grouping domains.
+
+ The "ibm,associativity-reference-points" property contains a list of one or
+ more numbers (domainID indexes) that represent the 1-based ordinals in the
+ associativity lists.
+ The list of domainID indexes represents an increasing hierarchy of resource grouping.
+
+ ex:
+ { primary domainID index, secondary domainID index, tertiary domainID index.. }
+
+ The Linux kernel uses the domainID at the primary domainID index as the NUMA node id.
+ The Linux kernel computes the NUMA distance between two domains by recursively
+ comparing whether they belong to the same higher-level domains. For a mismatch
+ at each higher level of the resource group, the kernel doubles the NUMA
+ distance between the domains being compared.
+
+ Form 2
+ -------
+ The Form 2 associativity format adds separate device tree properties
+ representing the NUMA node distance, thereby making the node distance
+ computation flexible. Form 2 also allows flexible primary domain numbering.
+ With the NUMA distance computation now detached from the index value in the
+ "ibm,associativity-reference-points" property, Form 2 allows a large number of
+ primary domain ids at the same domainID index, representing resource groups of
+ different performance/latency characteristics.
+
+ The hypervisor indicates the usage of Form 2 associativity using bit 2 of
+ byte 5 in the "ibm,architecture-vec-5" property.
+
+ The "ibm,numa-lookup-index-table" property contains a list of one or more
+ numbers representing the domainIDs present in the system. The offset of a
+ domainID in this property is used as an index while computing NUMA distance
+ information via "ibm,numa-distance-table".
+
+ prop-encoded-array: The number N of the domainIDs encoded as with encode-int,
+ followed by N domainIDs encoded as with encode-int.
+
+ For ex:
+ "ibm,numa-lookup-index-table" = {4, 0, 8, 250, 252}. The offset of domainID 8
+ (2) is used when computing the distance of domain 8 from other domains present
+ in the system. For the rest of this document, this offset will be referred to
+ as the domain distance offset.
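The Form 1 doubling rule described above can be sketched in a few lines of Python (an illustrative model added for this write-up, not kernel code; the function name and argument encodings are invented for the example):

```python
# Illustrative model of the Form 1 NUMA distance rule (not kernel code).
# LOCAL_DISTANCE follows the Linux convention of 10 for "same node".
LOCAL_DISTANCE = 10

def form1_distance(assoc_a, assoc_b, ref_points):
    """assoc_a/assoc_b: "ibm,associativity" domainID lists for two resources;
    ref_points: 1-based domainID indexes, primary first."""
    distance = LOCAL_DISTANCE
    for idx in ref_points:
        if assoc_a[idx - 1] == assoc_b[idx - 1]:
            break          # same domain at this level: stop doubling
        distance *= 2      # mismatch: one more hierarchy level apart
    return distance
```

Two resources that share every reference-point domain stay at LOCAL_DISTANCE (10); each additional level of mismatch doubles the result (20, 40, ...).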
+
+ The "ibm,numa-distance-table" property contains a list of one or more numbers
+ representing the NUMA distance between resource groups/domains present in the
+ system.
+
+ prop-encoded-array: The number N of the distance values encoded as with
+ encode-int, followed by N distance values encoded as with encode-bytes. The
+ maximum distance value we can encode is 255. The number N must be equal to
+ the square of m, where m is the number of domainIDs in the
+ numa-lookup-index-table.
+
+ For ex:
+ ibm,numa-lookup-index-table = <3 0 8 40>;
+ ibm,numa-distance-table = <9>, /bits/ 8 < 10 20 80 20 10 160 80 160 10>;
+
+ ::
+
+       |  0    8   40
+     --|-------------
+       |
+     0 | 10   20   80
+       |
+     8 | 20   10  160
+       |
+     40| 80  160   10
+
+ A possible "ibm,associativity" property for resources in nodes 0, 8 and 40:
+
+ { 3, 6, 7, 0 }
+ { 3, 6, 9, 8 }
+ { 3, 6, 7, 40 }
+
+ With "ibm,associativity-reference-points" { 0x3 }
+
+ "ibm,numa-lookup-index-table" helps in having a compact representation of the
+ distance matrix. Since domainIDs can be sparse, the matrix of distances can
+ also be effectively sparse. With "ibm,numa-lookup-index-table" we can achieve
+ a compact representation of distance information.
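The Form 2 lookup described above reduces to two index lookups into the distance table; here is a minimal sketch using the example tables from the text (illustrative Python added for this write-up, not kernel code):

```python
# Illustrative model of the Form 2 NUMA distance lookup (not kernel code).
def form2_distance(index_table, distance_table, dom_a, dom_b):
    """index_table: domainIDs from "ibm,numa-lookup-index-table";
    distance_table: N*N distance bytes from "ibm,numa-distance-table"."""
    n = len(index_table)
    ia = index_table.index(dom_a)  # domain distance offset of dom_a
    ib = index_table.index(dom_b)
    return distance_table[ia * n + ib]

# The example matrix for domains 0, 8 and 40 from the text:
index_table = [0, 8, 40]
distance_table = [10, 20, 80,
                  20, 10, 160,
                  80, 160, 10]
```

For instance, `form2_distance(index_table, distance_table, 8, 40)` returns 160, matching row 8 / column 40 of the matrix above.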
+1
Documentation/powerpc/index.rst
···
  .. toctree::
     :maxdepth: 1

+    associativity
     booting
     bootwrapper
     cpu_families
-1
MAINTAINERS
···
  F: drivers/media/usb/em28xx/

  EMBEDDED LINUX
- M: Paul Gortmaker <paul.gortmaker@windriver.com>
  M: Matt Mackall <mpm@selenic.com>
  M: David Woodhouse <dwmw2@infradead.org>
  L: linux-embedded@vger.kernel.org
+2
arch/powerpc/Kconfig
···
  select ARCH_HAS_COPY_MC if PPC64
  select ARCH_HAS_DEBUG_VIRTUAL
  select ARCH_HAS_DEBUG_VM_PGTABLE
+ select ARCH_HAS_DEBUG_WX if STRICT_KERNEL_RWX
  select ARCH_HAS_DEVMEM_IS_ALLOWED
  select ARCH_HAS_DMA_MAP_DIRECT if PPC_PSERIES
  select ARCH_HAS_ELF_RANDOMIZE
···
  select GENERIC_IRQ_SHOW
  select GENERIC_IRQ_SHOW_LEVEL
  select GENERIC_PCI_IOMAP if PCI
+ select GENERIC_PTDUMP
  select GENERIC_SMP_IDLE_THREAD
  select GENERIC_TIME_VSYSCALL
  select GENERIC_VDSO_TIME_NS
-30
arch/powerpc/Kconfig.debug
···

  If you are unsure, say N.

- config PPC_PTDUMP
-     bool "Export kernel pagetable layout to userspace via debugfs"
-     depends on DEBUG_KERNEL && DEBUG_FS
-     help
-       This option exports the state of the kernel pagetables to a
-       debugfs file. This is only useful for kernel developers who are
-       working in architecture specific areas of the kernel - probably
-       not a good idea to enable this feature in a production kernel.
-
-       If you are unsure, say N.
-
- config PPC_DEBUG_WX
-     bool "Warn on W+X mappings at boot"
-     depends on PPC_PTDUMP && STRICT_KERNEL_RWX
-     help
-       Generate a warning if any W+X mappings are found at boot.
-
-       This is useful for discovering cases where the kernel is leaving
-       W+X mappings after applying NX, as such mappings are a security risk.
-
-       Note that even if the check fails, your kernel is possibly
-       still fine, as W+X mappings are not a security hole in
-       themselves, what they do is that they make the exploitation
-       of other unfixed kernel bugs easier.
-
-       There is no runtime or memory usage effect of this option
-       once the kernel has booted up - it's a one time check.
-
-       If in doubt, say "Y".
-
  config PPC_FAST_ENDIAN_SWITCH
      bool "Deprecated fast endian-switch syscall"
      depends on DEBUG_KERNEL && PPC_BOOK3S_64
+3 -1
arch/powerpc/Makefile
···
  LDFLAGS_vmlinux-y := -Bstatic
  LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
+ LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) += -z notext
  LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)

  ifdef CONFIG_PPC64
···

  PHONY += install
  install:
-     $(Q)$(MAKE) $(build)=$(boot) install
+     sh -x $(srctree)/$(boot)/install.sh "$(KERNELRELEASE)" vmlinux \
+         System.map "$(INSTALL_PATH)"

  archclean:
      $(Q)$(MAKE) $(clean)=$(boot)
-11
arch/powerpc/boot/Makefile
···
  image-$(CONFIG_TQM8548) += cuImage.tqm8548
  image-$(CONFIG_TQM8555) += cuImage.tqm8555
  image-$(CONFIG_TQM8560) += cuImage.tqm8560
- image-$(CONFIG_SBC8548) += cuImage.sbc8548
  image-$(CONFIG_KSI8560) += cuImage.ksi8560

  # Board ports in arch/powerpc/platform/86xx/Kconfig
···
      $(Q)rm -f $@; ln $< $@
  $(obj)/zImage.initrd: $(addprefix $(obj)/, $(initrd-y))
      $(Q)rm -f $@; ln $< $@
-
- # Only install the vmlinux
- install: $(CONFIGURE) $(addprefix $(obj)/, $(image-y))
-     sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux System.map "$(INSTALL_PATH)"
-
- # Install the vmlinux and other built boot targets.
- zInstall: $(CONFIGURE) $(addprefix $(obj)/, $(image-y))
-     sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux System.map "$(INSTALL_PATH)" $^
-
- PHONY += install zInstall

  # anything not in $(targets)
  clean-files += $(image-) $(initrd-) cuImage.* dtbImage.* treeImage.* \
-176
arch/powerpc/boot/dts/fsl/sbc8641d.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8641D Device Tree Source 4 - * 5 - * Copyright 2008 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - * 9 - * Based largely on the mpc8641_hpcn.dts by Freescale Semiconductor Inc. 10 - */ 11 - 12 - /include/ "mpc8641si-pre.dtsi" 13 - 14 - / { 15 - model = "SBC8641D"; 16 - compatible = "wind,sbc8641"; 17 - 18 - memory { 19 - device_type = "memory"; 20 - reg = <0x00000000 0x20000000>; // 512M at 0x0 21 - }; 22 - 23 - lbc: localbus@f8005000 { 24 - reg = <0xf8005000 0x1000>; 25 - 26 - ranges = <0 0 0xff000000 0x01000000 // 16MB Boot flash 27 - 1 0 0xf0000000 0x00010000 // 64KB EEPROM 28 - 2 0 0xf1000000 0x00100000 // EPLD (1MB) 29 - 3 0 0xe0000000 0x04000000 // 64MB LB SDRAM (CS3) 30 - 4 0 0xe4000000 0x04000000 // 64MB LB SDRAM (CS4) 31 - 6 0 0xf4000000 0x00100000 // LCD display (1MB) 32 - 7 0 0xe8000000 0x04000000>; // 64MB OneNAND 33 - 34 - flash@0,0 { 35 - compatible = "cfi-flash"; 36 - reg = <0 0 0x01000000>; 37 - bank-width = <2>; 38 - device-width = <2>; 39 - #address-cells = <1>; 40 - #size-cells = <1>; 41 - partition@0 { 42 - label = "dtb"; 43 - reg = <0x00000000 0x00100000>; 44 - read-only; 45 - }; 46 - partition@300000 { 47 - label = "kernel"; 48 - reg = <0x00100000 0x00400000>; 49 - read-only; 50 - }; 51 - partition@400000 { 52 - label = "fs"; 53 - reg = <0x00500000 0x00a00000>; 54 - }; 55 - partition@700000 { 56 - label = "firmware"; 57 - reg = <0x00f00000 0x00100000>; 58 - read-only; 59 - }; 60 - }; 61 - 62 - epld@2,0 { 63 - compatible = "wrs,epld-localbus"; 64 - #address-cells = <2>; 65 - #size-cells = <1>; 66 - reg = <2 0 0x100000>; 67 - ranges = <0 0 5 0 1 // User switches 68 - 1 0 5 1 1 // Board ID/Rev 69 - 3 0 5 3 1>; // LEDs 70 - }; 71 - }; 72 - 73 - soc: soc@f8000000 { 74 - ranges = <0x00000000 0xf8000000 0x00100000>; 75 - 76 - enet0: ethernet@24000 { 77 - tbi-handle = <&tbi0>; 78 - phy-handle = <&phy0>; 79 - phy-connection-type = 
"rgmii-id"; 80 - }; 81 - 82 - mdio@24520 { 83 - phy0: ethernet-phy@1f { 84 - reg = <0x1f>; 85 - }; 86 - phy1: ethernet-phy@0 { 87 - reg = <0>; 88 - }; 89 - phy2: ethernet-phy@1 { 90 - reg = <1>; 91 - }; 92 - phy3: ethernet-phy@2 { 93 - reg = <2>; 94 - }; 95 - tbi0: tbi-phy@11 { 96 - reg = <0x11>; 97 - device_type = "tbi-phy"; 98 - }; 99 - }; 100 - 101 - enet1: ethernet@25000 { 102 - tbi-handle = <&tbi1>; 103 - phy-handle = <&phy1>; 104 - phy-connection-type = "rgmii-id"; 105 - }; 106 - 107 - mdio@25520 { 108 - tbi1: tbi-phy@11 { 109 - reg = <0x11>; 110 - device_type = "tbi-phy"; 111 - }; 112 - }; 113 - 114 - enet2: ethernet@26000 { 115 - tbi-handle = <&tbi2>; 116 - phy-handle = <&phy2>; 117 - phy-connection-type = "rgmii-id"; 118 - }; 119 - 120 - mdio@26520 { 121 - tbi2: tbi-phy@11 { 122 - reg = <0x11>; 123 - device_type = "tbi-phy"; 124 - }; 125 - }; 126 - 127 - enet3: ethernet@27000 { 128 - tbi-handle = <&tbi3>; 129 - phy-handle = <&phy3>; 130 - phy-connection-type = "rgmii-id"; 131 - }; 132 - 133 - mdio@27520 { 134 - tbi3: tbi-phy@11 { 135 - reg = <0x11>; 136 - device_type = "tbi-phy"; 137 - }; 138 - }; 139 - }; 140 - 141 - pci0: pcie@f8008000 { 142 - reg = <0xf8008000 0x1000>; 143 - ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x20000000 144 - 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00100000>; 145 - interrupt-map-mask = <0xff00 0 0 7>; 146 - 147 - pcie@0 { 148 - ranges = <0x02000000 0x0 0x80000000 149 - 0x02000000 0x0 0x80000000 150 - 0x0 0x20000000 151 - 152 - 0x01000000 0x0 0x00000000 153 - 0x01000000 0x0 0x00000000 154 - 0x0 0x00100000>; 155 - }; 156 - 157 - }; 158 - 159 - pci1: pcie@f8009000 { 160 - reg = <0xf8009000 0x1000>; 161 - ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x20000000 162 - 0x01000000 0x0 0x00000000 0xe3000000 0x0 0x00100000>; 163 - 164 - pcie@0 { 165 - ranges = <0x02000000 0x0 0xa0000000 166 - 0x02000000 0x0 0xa0000000 167 - 0x0 0x20000000 168 - 169 - 0x01000000 0x0 0x00000000 170 - 0x01000000 0x0 0x00000000 171 - 0x0 
0x00100000>; 172 - }; 173 - }; 174 - }; 175 - 176 - /include/ "mpc8641si-post.dtsi"
+12
arch/powerpc/boot/dts/microwatt.dts
···
          fifo-size = <16>;
          interrupts = <0x10 0x1>;
      };
+
+     ethernet@8020000 {
+         compatible = "litex,liteeth";
+         reg = <0x8021000 0x100
+                0x8020800 0x100
+                0x8030000 0x2000>;
+         reg-names = "mac", "mido", "buffer";
+         litex,rx-slots = <2>;
+         litex,tx-slots = <2>;
+         litex,slot-size = <0x800>;
+         interrupts = <0x11 0x1>;
+     };
  };

  chosen {
-111
arch/powerpc/boot/dts/sbc8548-altflash.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Configured for booting off the alternate (64MB SODIMM) flash. 6 - * Requires switching JP12 jumpers and changing SW2.8 setting. 7 - * 8 - * Copyright 2013 Wind River Systems Inc. 9 - * 10 - * Paul Gortmaker (see MAINTAINERS for contact information) 11 - */ 12 - 13 - 14 - /dts-v1/; 15 - 16 - /include/ "sbc8548-pre.dtsi" 17 - 18 - /{ 19 - localbus@e0000000 { 20 - #address-cells = <2>; 21 - #size-cells = <1>; 22 - compatible = "simple-bus"; 23 - reg = <0xe0000000 0x5000>; 24 - interrupt-parent = <&mpic>; 25 - 26 - ranges = <0x0 0x0 0xfc000000 0x04000000 /*64MB Flash*/ 27 - 0x3 0x0 0xf0000000 0x04000000 /*64MB SDRAM*/ 28 - 0x4 0x0 0xf4000000 0x04000000 /*64MB SDRAM*/ 29 - 0x5 0x0 0xf8000000 0x00b10000 /* EPLD */ 30 - 0x6 0x0 0xef800000 0x00800000>; /*8MB Flash*/ 31 - 32 - flash@0,0 { 33 - #address-cells = <1>; 34 - #size-cells = <1>; 35 - reg = <0x0 0x0 0x04000000>; 36 - compatible = "intel,JS28F128", "cfi-flash"; 37 - bank-width = <4>; 38 - device-width = <1>; 39 - partition@0 { 40 - label = "space"; 41 - /* FC000000 -> FFEFFFFF */ 42 - reg = <0x00000000 0x03f00000>; 43 - }; 44 - partition@3f00000 { 45 - label = "bootloader"; 46 - /* FFF00000 -> FFFFFFFF */ 47 - reg = <0x03f00000 0x00100000>; 48 - read-only; 49 - }; 50 - }; 51 - 52 - 53 - epld@5,0 { 54 - compatible = "wrs,epld-localbus"; 55 - #address-cells = <2>; 56 - #size-cells = <1>; 57 - reg = <0x5 0x0 0x00b10000>; 58 - ranges = < 59 - 0x0 0x0 0x5 0x000000 0x1fff /* LED */ 60 - 0x1 0x0 0x5 0x100000 0x1fff /* Switches */ 61 - 0x3 0x0 0x5 0x300000 0x1fff /* HW Rev. 
*/ 62 - 0xb 0x0 0x5 0xb00000 0x1fff /* EEPROM */ 63 - >; 64 - 65 - led@0,0 { 66 - compatible = "led"; 67 - reg = <0x0 0x0 0x1fff>; 68 - }; 69 - 70 - switches@1,0 { 71 - compatible = "switches"; 72 - reg = <0x1 0x0 0x1fff>; 73 - }; 74 - 75 - hw-rev@3,0 { 76 - compatible = "hw-rev"; 77 - reg = <0x3 0x0 0x1fff>; 78 - }; 79 - 80 - eeprom@b,0 { 81 - compatible = "eeprom"; 82 - reg = <0xb 0 0x1fff>; 83 - }; 84 - 85 - }; 86 - 87 - alt-flash@6,0 { 88 - #address-cells = <1>; 89 - #size-cells = <1>; 90 - compatible = "intel,JS28F640", "cfi-flash"; 91 - reg = <0x6 0x0 0x800000>; 92 - bank-width = <1>; 93 - device-width = <1>; 94 - partition@0 { 95 - label = "space"; 96 - /* EF800000 -> EFF9FFFF */ 97 - reg = <0x00000000 0x007a0000>; 98 - }; 99 - partition@7a0000 { 100 - label = "bootloader"; 101 - /* EFFA0000 -> EFFFFFFF */ 102 - reg = <0x007a0000 0x00060000>; 103 - read-only; 104 - }; 105 - }; 106 - 107 - 108 - }; 109 - }; 110 - 111 - /include/ "sbc8548-post.dtsi"
-289
arch/powerpc/boot/dts/sbc8548-post.dtsi
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Copyright 2007 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - */ 9 - 10 - /{ 11 - soc8548@e0000000 { 12 - #address-cells = <1>; 13 - #size-cells = <1>; 14 - device_type = "soc"; 15 - ranges = <0x00000000 0xe0000000 0x00100000>; 16 - bus-frequency = <0>; 17 - compatible = "simple-bus"; 18 - 19 - ecm-law@0 { 20 - compatible = "fsl,ecm-law"; 21 - reg = <0x0 0x1000>; 22 - fsl,num-laws = <10>; 23 - }; 24 - 25 - ecm@1000 { 26 - compatible = "fsl,mpc8548-ecm", "fsl,ecm"; 27 - reg = <0x1000 0x1000>; 28 - interrupts = <17 2>; 29 - interrupt-parent = <&mpic>; 30 - }; 31 - 32 - memory-controller@2000 { 33 - compatible = "fsl,mpc8548-memory-controller"; 34 - reg = <0x2000 0x1000>; 35 - interrupt-parent = <&mpic>; 36 - interrupts = <0x12 0x2>; 37 - }; 38 - 39 - L2: l2-cache-controller@20000 { 40 - compatible = "fsl,mpc8548-l2-cache-controller"; 41 - reg = <0x20000 0x1000>; 42 - cache-line-size = <0x20>; // 32 bytes 43 - cache-size = <0x80000>; // L2, 512K 44 - interrupt-parent = <&mpic>; 45 - interrupts = <0x10 0x2>; 46 - }; 47 - 48 - i2c@3000 { 49 - #address-cells = <1>; 50 - #size-cells = <0>; 51 - cell-index = <0>; 52 - compatible = "fsl-i2c"; 53 - reg = <0x3000 0x100>; 54 - interrupts = <0x2b 0x2>; 55 - interrupt-parent = <&mpic>; 56 - dfsrr; 57 - }; 58 - 59 - i2c@3100 { 60 - #address-cells = <1>; 61 - #size-cells = <0>; 62 - cell-index = <1>; 63 - compatible = "fsl-i2c"; 64 - reg = <0x3100 0x100>; 65 - interrupts = <0x2b 0x2>; 66 - interrupt-parent = <&mpic>; 67 - dfsrr; 68 - }; 69 - 70 - dma@21300 { 71 - #address-cells = <1>; 72 - #size-cells = <1>; 73 - compatible = "fsl,mpc8548-dma", "fsl,eloplus-dma"; 74 - reg = <0x21300 0x4>; 75 - ranges = <0x0 0x21100 0x200>; 76 - cell-index = <0>; 77 - dma-channel@0 { 78 - compatible = "fsl,mpc8548-dma-channel", 79 - "fsl,eloplus-dma-channel"; 80 - reg = <0x0 0x80>; 81 - 
cell-index = <0>; 82 - interrupt-parent = <&mpic>; 83 - interrupts = <20 2>; 84 - }; 85 - dma-channel@80 { 86 - compatible = "fsl,mpc8548-dma-channel", 87 - "fsl,eloplus-dma-channel"; 88 - reg = <0x80 0x80>; 89 - cell-index = <1>; 90 - interrupt-parent = <&mpic>; 91 - interrupts = <21 2>; 92 - }; 93 - dma-channel@100 { 94 - compatible = "fsl,mpc8548-dma-channel", 95 - "fsl,eloplus-dma-channel"; 96 - reg = <0x100 0x80>; 97 - cell-index = <2>; 98 - interrupt-parent = <&mpic>; 99 - interrupts = <22 2>; 100 - }; 101 - dma-channel@180 { 102 - compatible = "fsl,mpc8548-dma-channel", 103 - "fsl,eloplus-dma-channel"; 104 - reg = <0x180 0x80>; 105 - cell-index = <3>; 106 - interrupt-parent = <&mpic>; 107 - interrupts = <23 2>; 108 - }; 109 - }; 110 - 111 - enet0: ethernet@24000 { 112 - #address-cells = <1>; 113 - #size-cells = <1>; 114 - cell-index = <0>; 115 - device_type = "network"; 116 - model = "eTSEC"; 117 - compatible = "gianfar"; 118 - reg = <0x24000 0x1000>; 119 - ranges = <0x0 0x24000 0x1000>; 120 - local-mac-address = [ 00 00 00 00 00 00 ]; 121 - interrupts = <0x1d 0x2 0x1e 0x2 0x22 0x2>; 122 - interrupt-parent = <&mpic>; 123 - tbi-handle = <&tbi0>; 124 - phy-handle = <&phy0>; 125 - 126 - mdio@520 { 127 - #address-cells = <1>; 128 - #size-cells = <0>; 129 - compatible = "fsl,gianfar-mdio"; 130 - reg = <0x520 0x20>; 131 - 132 - phy0: ethernet-phy@19 { 133 - interrupt-parent = <&mpic>; 134 - interrupts = <0x6 0x1>; 135 - reg = <0x19>; 136 - }; 137 - phy1: ethernet-phy@1a { 138 - interrupt-parent = <&mpic>; 139 - interrupts = <0x7 0x1>; 140 - reg = <0x1a>; 141 - }; 142 - tbi0: tbi-phy@11 { 143 - reg = <0x11>; 144 - device_type = "tbi-phy"; 145 - }; 146 - }; 147 - }; 148 - 149 - enet1: ethernet@25000 { 150 - #address-cells = <1>; 151 - #size-cells = <1>; 152 - cell-index = <1>; 153 - device_type = "network"; 154 - model = "eTSEC"; 155 - compatible = "gianfar"; 156 - reg = <0x25000 0x1000>; 157 - ranges = <0x0 0x25000 0x1000>; 158 - local-mac-address = [ 00 00 00 00 
00 00 ]; 159 - interrupts = <0x23 0x2 0x24 0x2 0x28 0x2>; 160 - interrupt-parent = <&mpic>; 161 - tbi-handle = <&tbi1>; 162 - phy-handle = <&phy1>; 163 - 164 - mdio@520 { 165 - #address-cells = <1>; 166 - #size-cells = <0>; 167 - compatible = "fsl,gianfar-tbi"; 168 - reg = <0x520 0x20>; 169 - 170 - tbi1: tbi-phy@11 { 171 - reg = <0x11>; 172 - device_type = "tbi-phy"; 173 - }; 174 - }; 175 - }; 176 - 177 - serial0: serial@4500 { 178 - cell-index = <0>; 179 - device_type = "serial"; 180 - compatible = "fsl,ns16550", "ns16550"; 181 - reg = <0x4500 0x100>; // reg base, size 182 - clock-frequency = <0>; // should we fill in in uboot? 183 - interrupts = <0x2a 0x2>; 184 - interrupt-parent = <&mpic>; 185 - }; 186 - 187 - serial1: serial@4600 { 188 - cell-index = <1>; 189 - device_type = "serial"; 190 - compatible = "fsl,ns16550", "ns16550"; 191 - reg = <0x4600 0x100>; // reg base, size 192 - clock-frequency = <0>; // should we fill in in uboot? 193 - interrupts = <0x2a 0x2>; 194 - interrupt-parent = <&mpic>; 195 - }; 196 - 197 - global-utilities@e0000 { //global utilities reg 198 - compatible = "fsl,mpc8548-guts"; 199 - reg = <0xe0000 0x1000>; 200 - fsl,has-rstcr; 201 - }; 202 - 203 - crypto@30000 { 204 - compatible = "fsl,sec2.1", "fsl,sec2.0"; 205 - reg = <0x30000 0x10000>; 206 - interrupts = <45 2>; 207 - interrupt-parent = <&mpic>; 208 - fsl,num-channels = <4>; 209 - fsl,channel-fifo-len = <24>; 210 - fsl,exec-units-mask = <0xfe>; 211 - fsl,descriptor-types-mask = <0x12b0ebf>; 212 - }; 213 - 214 - mpic: pic@40000 { 215 - interrupt-controller; 216 - #address-cells = <0>; 217 - #interrupt-cells = <2>; 218 - reg = <0x40000 0x40000>; 219 - compatible = "chrp,open-pic"; 220 - device_type = "open-pic"; 221 - }; 222 - }; 223 - 224 - pci0: pci@e0008000 { 225 - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; 226 - interrupt-map = < 227 - /* IDSEL 0x01 (PCI-X slot) @66MHz */ 228 - 0x0800 0x0 0x0 0x1 &mpic 0x2 0x1 229 - 0x0800 0x0 0x0 0x2 &mpic 0x3 0x1 230 - 0x0800 0x0 0x0 0x3 &mpic 
0x4 0x1 231 - 0x0800 0x0 0x0 0x4 &mpic 0x1 0x1 232 - 233 - /* IDSEL 0x11 (PCI, 3.3V 32bit) @33MHz */ 234 - 0x8800 0x0 0x0 0x1 &mpic 0x2 0x1 235 - 0x8800 0x0 0x0 0x2 &mpic 0x3 0x1 236 - 0x8800 0x0 0x0 0x3 &mpic 0x4 0x1 237 - 0x8800 0x0 0x0 0x4 &mpic 0x1 0x1>; 238 - 239 - interrupt-parent = <&mpic>; 240 - interrupts = <0x18 0x2>; 241 - bus-range = <0 0>; 242 - ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x10000000 243 - 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00800000>; 244 - clock-frequency = <66000000>; 245 - #interrupt-cells = <1>; 246 - #size-cells = <2>; 247 - #address-cells = <3>; 248 - reg = <0xe0008000 0x1000>; 249 - compatible = "fsl,mpc8540-pcix", "fsl,mpc8540-pci"; 250 - device_type = "pci"; 251 - }; 252 - 253 - pci1: pcie@e000a000 { 254 - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; 255 - interrupt-map = < 256 - 257 - /* IDSEL 0x0 (PEX) */ 258 - 0x0000 0x0 0x0 0x1 &mpic 0x0 0x1 259 - 0x0000 0x0 0x0 0x2 &mpic 0x1 0x1 260 - 0x0000 0x0 0x0 0x3 &mpic 0x2 0x1 261 - 0x0000 0x0 0x0 0x4 &mpic 0x3 0x1>; 262 - 263 - interrupt-parent = <&mpic>; 264 - interrupts = <0x1a 0x2>; 265 - bus-range = <0x0 0xff>; 266 - ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x10000000 267 - 0x01000000 0x0 0x00000000 0xe2800000 0x0 0x08000000>; 268 - clock-frequency = <33000000>; 269 - #interrupt-cells = <1>; 270 - #size-cells = <2>; 271 - #address-cells = <3>; 272 - reg = <0xe000a000 0x1000>; 273 - compatible = "fsl,mpc8548-pcie"; 274 - device_type = "pci"; 275 - pcie@0 { 276 - reg = <0x0 0x0 0x0 0x0 0x0>; 277 - #size-cells = <2>; 278 - #address-cells = <3>; 279 - device_type = "pci"; 280 - ranges = <0x02000000 0x0 0xa0000000 281 - 0x02000000 0x0 0xa0000000 282 - 0x0 0x10000000 283 - 284 - 0x01000000 0x0 0x00000000 285 - 0x01000000 0x0 0x00000000 286 - 0x0 0x00800000>; 287 - }; 288 - }; 289 - };
-48
arch/powerpc/boot/dts/sbc8548-pre.dtsi
···
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * SBC8548 Device Tree Source
-  *
-  * Copyright 2007 Wind River Systems Inc.
-  *
-  * Paul Gortmaker (see MAINTAINERS for contact information)
-  */
-
- /{
-     model = "SBC8548";
-     compatible = "SBC8548";
-     #address-cells = <1>;
-     #size-cells = <1>;
-
-     aliases {
-         ethernet0 = &enet0;
-         ethernet1 = &enet1;
-         serial0 = &serial0;
-         serial1 = &serial1;
-         pci0 = &pci0;
-         pci1 = &pci1;
-     };
-
-     cpus {
-         #address-cells = <1>;
-         #size-cells = <0>;
-
-         PowerPC,8548@0 {
-             device_type = "cpu";
-             reg = <0>;
-             d-cache-line-size = <0x20>; // 32 bytes
-             i-cache-line-size = <0x20>; // 32 bytes
-             d-cache-size = <0x8000>; // L1, 32K
-             i-cache-size = <0x8000>; // L1, 32K
-             timebase-frequency = <0>; // From uboot
-             bus-frequency = <0>;
-             clock-frequency = <0>;
-             next-level-cache = <&L2>;
-         };
-     };
-
-     memory {
-         device_type = "memory";
-         reg = <0x00000000 0x10000000>;
-     };
-
- };
-106
arch/powerpc/boot/dts/sbc8548.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Copyright 2007 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - */ 9 - 10 - 11 - /dts-v1/; 12 - 13 - /include/ "sbc8548-pre.dtsi" 14 - 15 - /{ 16 - localbus@e0000000 { 17 - #address-cells = <2>; 18 - #size-cells = <1>; 19 - compatible = "simple-bus"; 20 - reg = <0xe0000000 0x5000>; 21 - interrupt-parent = <&mpic>; 22 - 23 - ranges = <0x0 0x0 0xff800000 0x00800000 /*8MB Flash*/ 24 - 0x3 0x0 0xf0000000 0x04000000 /*64MB SDRAM*/ 25 - 0x4 0x0 0xf4000000 0x04000000 /*64MB SDRAM*/ 26 - 0x5 0x0 0xf8000000 0x00b10000 /* EPLD */ 27 - 0x6 0x0 0xec000000 0x04000000>; /*64MB Flash*/ 28 - 29 - 30 - flash@0,0 { 31 - #address-cells = <1>; 32 - #size-cells = <1>; 33 - compatible = "intel,JS28F640", "cfi-flash"; 34 - reg = <0x0 0x0 0x800000>; 35 - bank-width = <1>; 36 - device-width = <1>; 37 - partition@0 { 38 - label = "space"; 39 - /* FF800000 -> FFF9FFFF */ 40 - reg = <0x00000000 0x007a0000>; 41 - }; 42 - partition@7a0000 { 43 - label = "bootloader"; 44 - /* FFFA0000 -> FFFFFFFF */ 45 - reg = <0x007a0000 0x00060000>; 46 - read-only; 47 - }; 48 - }; 49 - 50 - epld@5,0 { 51 - compatible = "wrs,epld-localbus"; 52 - #address-cells = <2>; 53 - #size-cells = <1>; 54 - reg = <0x5 0x0 0x00b10000>; 55 - ranges = < 56 - 0x0 0x0 0x5 0x000000 0x1fff /* LED */ 57 - 0x1 0x0 0x5 0x100000 0x1fff /* Switches */ 58 - 0x3 0x0 0x5 0x300000 0x1fff /* HW Rev. 
*/ 59 - 0xb 0x0 0x5 0xb00000 0x1fff /* EEPROM */ 60 - >; 61 - 62 - led@0,0 { 63 - compatible = "led"; 64 - reg = <0x0 0x0 0x1fff>; 65 - }; 66 - 67 - switches@1,0 { 68 - compatible = "switches"; 69 - reg = <0x1 0x0 0x1fff>; 70 - }; 71 - 72 - hw-rev@3,0 { 73 - compatible = "hw-rev"; 74 - reg = <0x3 0x0 0x1fff>; 75 - }; 76 - 77 - eeprom@b,0 { 78 - compatible = "eeprom"; 79 - reg = <0xb 0 0x1fff>; 80 - }; 81 - 82 - }; 83 - 84 - alt-flash@6,0 { 85 - #address-cells = <1>; 86 - #size-cells = <1>; 87 - reg = <0x6 0x0 0x04000000>; 88 - compatible = "intel,JS28F128", "cfi-flash"; 89 - bank-width = <4>; 90 - device-width = <1>; 91 - partition@0 { 92 - label = "space"; 93 - /* EC000000 -> EFEFFFFF */ 94 - reg = <0x00000000 0x03f00000>; 95 - }; 96 - partition@3f00000 { 97 - label = "bootloader"; 98 - /* EFF00000 -> EFFFFFFF */ 99 - reg = <0x03f00000 0x00100000>; 100 - read-only; 101 - }; 102 - }; 103 - }; 104 - }; 105 - 106 - /include/ "sbc8548-post.dtsi"
+12 -1
arch/powerpc/boot/dts/wii.dts
···

  control@d800100 {
      compatible = "nintendo,hollywood-control";
-     reg = <0x0d800100 0x300>;
+     /*
+      * Both the address and length are wrong, according to
+      * Wiibrew this should be <0x0d800000 0x400>, but it
+      * requires refactoring the PIC1, GPIO and OTP nodes
+      * before changing that.
+      */
+     reg = <0x0d800100 0xa0>;
+ };
+
+ otp@d8001ec {
+     compatible = "nintendo,hollywood-otp";
+     reg = <0x0d8001ec 0x8>;
  };

  disk@d806000 {
+14 -13
arch/powerpc/boot/install.sh
···
  # $2 - kernel image file
  # $3 - kernel map file
  # $4 - default install path (blank if root directory)
- # $5 and more - kernel boot files; zImage*, uImage, cuImage.*, etc.
  #

  # Bail with error code if anything goes wrong
  set -e
+
+ verify () {
+     if [ ! -f "$1" ]; then
+         echo ""                                                   1>&2
+         echo " *** Missing file: $1"                              1>&2
+         echo ' *** You need to run "make" before "make install".' 1>&2
+         echo ""                                                   1>&2
+         exit 1
+     fi
+ }
+
+ # Make sure the files actually exist
+ verify "$2"
+ verify "$3"

  # User may have a custom install script
···

  cat $2 > $4/$image_name
  cp $3 $4/System.map
-
- # Copy all the bootable image files
- path=$4
- shift 4
- while [ $# -ne 0 ]; do
-     image_name=`basename $1`
-     if [ -f $path/$image_name ]; then
-         mv $path/$image_name $path/$image_name.old
-     fi
-     cat $1 > $path/$image_name
-     shift
- done;
+1 -1
arch/powerpc/boot/wrapper
··· 298 298 *-tqm8541|*-mpc8560*|*-tqm8560|*-tqm8555|*-ksi8560*) 299 299 platformo=$object/cuboot-85xx-cpm2.o 300 300 ;; 301 - *-mpc85*|*-tqm85*|*-sbc85*) 301 + *-mpc85*|*-tqm85*) 302 302 platformo=$object/cuboot-85xx.o 303 303 ;; 304 304 *-amigaone)
-50
arch/powerpc/configs/85xx/sbc8548_defconfig
··· 1 - CONFIG_PPC_85xx=y 2 - CONFIG_SYSVIPC=y 3 - CONFIG_LOG_BUF_SHIFT=14 4 - CONFIG_BLK_DEV_INITRD=y 5 - CONFIG_EXPERT=y 6 - CONFIG_SLAB=y 7 - # CONFIG_BLK_DEV_BSG is not set 8 - CONFIG_SBC8548=y 9 - CONFIG_GEN_RTC=y 10 - CONFIG_BINFMT_MISC=y 11 - CONFIG_MATH_EMULATION=y 12 - # CONFIG_SECCOMP is not set 13 - CONFIG_PCI=y 14 - CONFIG_NET=y 15 - CONFIG_PACKET=y 16 - CONFIG_UNIX=y 17 - CONFIG_XFRM_USER=y 18 - CONFIG_INET=y 19 - CONFIG_IP_MULTICAST=y 20 - CONFIG_IP_PNP=y 21 - CONFIG_IP_PNP_DHCP=y 22 - CONFIG_IP_PNP_BOOTP=y 23 - CONFIG_SYN_COOKIES=y 24 - # CONFIG_IPV6 is not set 25 - # CONFIG_FW_LOADER is not set 26 - CONFIG_MTD=y 27 - CONFIG_MTD_BLOCK=y 28 - CONFIG_MTD_CFI=y 29 - CONFIG_MTD_CFI_ADV_OPTIONS=y 30 - CONFIG_MTD_CFI_GEOMETRY=y 31 - CONFIG_MTD_CFI_I4=y 32 - CONFIG_MTD_CFI_INTELEXT=y 33 - CONFIG_MTD_PHYSMAP_OF=y 34 - CONFIG_BLK_DEV_LOOP=y 35 - CONFIG_BLK_DEV_RAM=y 36 - CONFIG_NETDEVICES=y 37 - CONFIG_GIANFAR=y 38 - CONFIG_BROADCOM_PHY=y 39 - # CONFIG_INPUT_KEYBOARD is not set 40 - # CONFIG_INPUT_MOUSE is not set 41 - # CONFIG_SERIO is not set 42 - # CONFIG_VT is not set 43 - CONFIG_SERIAL_8250=y 44 - CONFIG_SERIAL_8250_CONSOLE=y 45 - # CONFIG_HW_RANDOM is not set 46 - # CONFIG_USB_SUPPORT is not set 47 - CONFIG_PROC_KCORE=y 48 - CONFIG_TMPFS=y 49 - CONFIG_NFS_FS=y 50 - CONFIG_ROOT_NFS=y
+6 -1
arch/powerpc/configs/microwatt_defconfig
··· 5 5 CONFIG_TICK_CPU_ACCOUNTING=y 6 6 CONFIG_LOG_BUF_SHIFT=16 7 7 CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=12 8 + CONFIG_CGROUPS=y 8 9 CONFIG_BLK_DEV_INITRD=y 9 10 CONFIG_CC_OPTIMIZE_FOR_SIZE=y 10 11 CONFIG_KALLSYMS_ALL=y ··· 54 53 CONFIG_BLK_DEV_LOOP=y 55 54 CONFIG_BLK_DEV_RAM=y 56 55 CONFIG_NETDEVICES=y 56 + CONFIG_LITEX_LITEETH=y 57 57 # CONFIG_WLAN is not set 58 58 # CONFIG_INPUT is not set 59 59 # CONFIG_SERIO is not set 60 60 # CONFIG_VT is not set 61 + # CONFIG_LEGACY_PTYS is not set 61 62 CONFIG_SERIAL_8250=y 62 63 # CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set 63 64 CONFIG_SERIAL_8250_CONSOLE=y ··· 79 76 CONFIG_EXT4_FS=y 80 77 # CONFIG_FILE_LOCKING is not set 81 78 # CONFIG_DNOTIFY is not set 82 - # CONFIG_INOTIFY_USER is not set 79 + CONFIG_AUTOFS_FS=y 80 + CONFIG_TMPFS=y 83 81 # CONFIG_MISC_FILESYSTEMS is not set 82 + CONFIG_CRYPTO_SHA256=y 84 83 # CONFIG_CRYPTO_HW is not set 85 84 # CONFIG_XZ_DEC_X86 is not set 86 85 # CONFIG_XZ_DEC_IA64 is not set
-1
arch/powerpc/configs/mpc85xx_base.config
··· 13 13 CONFIG_P1022_RDK=y 14 14 CONFIG_P1023_RDB=y 15 15 CONFIG_TWR_P102x=y 16 - CONFIG_SBC8548=y 17 16 CONFIG_SOCRATES=y 18 17 CONFIG_STX_GP3=y 19 18 CONFIG_TQM8540=y
-1
arch/powerpc/configs/mpc86xx_base.config
··· 1 1 CONFIG_PPC_86xx=y 2 2 CONFIG_MPC8641_HPCN=y 3 - CONFIG_SBC8641D=y 4 3 CONFIG_MPC8610_HPCD=y 5 4 CONFIG_GEF_PPC9A=y 6 5 CONFIG_GEF_SBC310=y
+23 -26
arch/powerpc/configs/mpc885_ads_defconfig
··· 1 - CONFIG_PPC_8xx=y 2 1 # CONFIG_SWAP is not set 3 2 CONFIG_SYSVIPC=y 4 3 CONFIG_NO_HZ=y 5 4 CONFIG_HIGH_RES_TIMERS=y 5 + CONFIG_BPF_JIT=y 6 + CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y 6 7 CONFIG_LOG_BUF_SHIFT=14 7 8 CONFIG_EXPERT=y 8 9 # CONFIG_ELF_CORE is not set 9 10 # CONFIG_BASE_FULL is not set 10 11 # CONFIG_FUTEX is not set 12 + CONFIG_PERF_EVENTS=y 11 13 # CONFIG_VM_EVENT_COUNTERS is not set 12 - # CONFIG_BLK_DEV_BSG is not set 13 - CONFIG_PARTITION_ADVANCED=y 14 + CONFIG_PPC_8xx=y 15 + CONFIG_8xx_GPIO=y 16 + CONFIG_SMC_UCODE_PATCH=y 17 + CONFIG_PIN_TLB=y 14 18 CONFIG_GEN_RTC=y 15 19 CONFIG_HZ_100=y 20 + CONFIG_MATH_EMULATION=y 21 + CONFIG_PPC_16K_PAGES=y 22 + CONFIG_ADVANCED_OPTIONS=y 16 23 # CONFIG_SECCOMP is not set 24 + CONFIG_STRICT_KERNEL_RWX=y 25 + CONFIG_MODULES=y 26 + # CONFIG_BLK_DEV_BSG is not set 27 + CONFIG_PARTITION_ADVANCED=y 17 28 CONFIG_NET=y 18 29 CONFIG_PACKET=y 19 30 CONFIG_UNIX=y ··· 32 21 CONFIG_IP_MULTICAST=y 33 22 CONFIG_IP_PNP=y 34 23 CONFIG_SYN_COOKIES=y 35 - # CONFIG_IPV6 is not set 36 24 # CONFIG_FW_LOADER is not set 37 25 CONFIG_MTD=y 38 26 CONFIG_MTD_BLOCK=y ··· 44 34 # CONFIG_MTD_CFI_I2 is not set 45 35 CONFIG_MTD_CFI_I4=y 46 36 CONFIG_MTD_CFI_AMDSTD=y 37 + CONFIG_MTD_PHYSMAP=y 47 38 CONFIG_MTD_PHYSMAP_OF=y 48 39 # CONFIG_BLK_DEV is not set 49 40 CONFIG_NETDEVICES=y ··· 57 46 # CONFIG_LEGACY_PTYS is not set 58 47 CONFIG_SERIAL_CPM=y 59 48 CONFIG_SERIAL_CPM_CONSOLE=y 49 + CONFIG_SPI=y 50 + CONFIG_SPI_FSL_SPI=y 60 51 # CONFIG_HWMON is not set 52 + CONFIG_WATCHDOG=y 53 + CONFIG_8xxx_WDT=y 61 54 # CONFIG_USB_SUPPORT is not set 62 55 # CONFIG_DNOTIFY is not set 63 56 CONFIG_TMPFS=y 64 57 CONFIG_CRAMFS=y 65 58 CONFIG_NFS_FS=y 66 59 CONFIG_ROOT_NFS=y 60 + CONFIG_CRYPTO=y 61 + CONFIG_CRYPTO_DEV_TALITOS=y 67 62 CONFIG_CRC32_SLICEBY4=y 68 63 CONFIG_DEBUG_INFO=y 69 64 CONFIG_MAGIC_SYSRQ=y 70 - CONFIG_DETECT_HUNG_TASK=y 71 - CONFIG_PPC_16K_PAGES=y 72 - CONFIG_DEBUG_KERNEL=y 73 65 CONFIG_DEBUG_FS=y 74 - CONFIG_PPC_PTDUMP=y 75 - CONFIG_MODULES=y 76 - CONFIG_SPI=y 77 - CONFIG_SPI_FSL_SPI=y 78 - CONFIG_CRYPTO=y 79 - CONFIG_CRYPTO_DEV_TALITOS=y 80 - CONFIG_8xx_GPIO=y 81 - CONFIG_WATCHDOG=y 82 - CONFIG_8xxx_WDT=y 83 - CONFIG_SMC_UCODE_PATCH=y 84 - CONFIG_ADVANCED_OPTIONS=y 85 - CONFIG_PIN_TLB=y 86 - CONFIG_PERF_EVENTS=y 87 - CONFIG_MATH_EMULATION=y 88 - CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y 89 - CONFIG_STRICT_KERNEL_RWX=y 90 - CONFIG_IPV6=y 91 - CONFIG_BPF_JIT=y 92 66 CONFIG_DEBUG_VM_PGTABLE=y 67 + CONFIG_DETECT_HUNG_TASK=y 93 68 CONFIG_BDI_SWITCH=y 94 69 CONFIG_PPC_EARLY_DEBUG=y 95 - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 70 + CONFIG_PPC_PTDUMP=y
-1
arch/powerpc/configs/ppc6xx_defconfig
··· 55 55 CONFIG_ASP834x=y 56 56 CONFIG_PPC_86xx=y 57 57 CONFIG_MPC8641_HPCN=y 58 - CONFIG_SBC8641D=y 59 58 CONFIG_MPC8610_HPCD=y 60 59 CONFIG_GEF_SBC610=y 61 60 CONFIG_CPU_FREQ=y
+1
arch/powerpc/configs/wii_defconfig
··· 99 99 CONFIG_LEDS_TRIGGER_PANIC=y 100 100 CONFIG_RTC_CLASS=y 101 101 CONFIG_RTC_DRV_GENERIC=y 102 + CONFIG_NVMEM_NINTENDO_OTP=y 102 103 CONFIG_EXT2_FS=y 103 104 CONFIG_EXT4_FS=y 104 105 CONFIG_FUSE_FS=m
+2 -2
arch/powerpc/include/asm/asm-compat.h
··· 17 17 #define PPC_LONG stringify_in_c(.8byte) 18 18 #define PPC_LONG_ALIGN stringify_in_c(.balign 8) 19 19 #define PPC_TLNEI stringify_in_c(tdnei) 20 - #define PPC_LLARX(t, a, b, eh) PPC_LDARX(t, a, b, eh) 20 + #define PPC_LLARX stringify_in_c(ldarx) 21 21 #define PPC_STLCX stringify_in_c(stdcx.) 22 22 #define PPC_CNTLZL stringify_in_c(cntlzd) 23 23 #define PPC_MTOCRF(FXM, RS) MTOCRF((FXM), RS) ··· 50 50 #define PPC_LONG stringify_in_c(.long) 51 51 #define PPC_LONG_ALIGN stringify_in_c(.balign 4) 52 52 #define PPC_TLNEI stringify_in_c(twnei) 53 - #define PPC_LLARX(t, a, b, eh) PPC_LWARX(t, a, b, eh) 53 + #define PPC_LLARX stringify_in_c(lwarx) 54 54 #define PPC_STLCX stringify_in_c(stwcx.) 55 55 #define PPC_CNTLZL stringify_in_c(cntlzw) 56 56 #define PPC_MTOCRF stringify_in_c(mtcrf)
+2 -2
arch/powerpc/include/asm/atomic.h
··· 207 207 int r, o = *old; 208 208 209 209 __asm__ __volatile__ ( 210 - "1:\t" PPC_LWARX(%0,0,%2,1) " # atomic_try_cmpxchg_acquire \n" 210 + "1: lwarx %0,0,%2,%5 # atomic_try_cmpxchg_acquire \n" 211 211 " cmpw 0,%0,%3 \n" 212 212 " bne- 2f \n" 213 213 " stwcx. %4,0,%2 \n" ··· 215 215 "\t" PPC_ACQUIRE_BARRIER " \n" 216 216 "2: \n" 217 217 : "=&r" (r), "+m" (v->counter) 218 - : "r" (&v->counter), "r" (o), "r" (new) 218 + : "r" (&v->counter), "r" (o), "r" (new), "i" (IS_ENABLED(CONFIG_PPC64) ? 1 : 0) 219 219 : "cr0", "memory"); 220 220 221 221 if (unlikely(r != o))
+4 -4
arch/powerpc/include/asm/bitops.h
··· 70 70 unsigned long *p = (unsigned long *)_p; \ 71 71 __asm__ __volatile__ ( \ 72 72 prefix \ 73 - "1:" PPC_LLARX(%0,0,%3,0) "\n" \ 73 + "1:" PPC_LLARX "%0,0,%3,0\n" \ 74 74 stringify_in_c(op) "%0,%0,%2\n" \ 75 75 PPC_STLCX "%0,0,%3\n" \ 76 76 "bne- 1b\n" \ ··· 115 115 unsigned long *p = (unsigned long *)_p; \ 116 116 __asm__ __volatile__ ( \ 117 117 prefix \ 118 - "1:" PPC_LLARX(%0,0,%3,eh) "\n" \ 118 + "1:" PPC_LLARX "%0,0,%3,%4\n" \ 119 119 stringify_in_c(op) "%1,%0,%2\n" \ 120 120 PPC_STLCX "%1,0,%3\n" \ 121 121 "bne- 1b\n" \ 122 122 postfix \ 123 123 : "=&r" (old), "=&r" (t) \ 124 - : "r" (mask), "r" (p) \ 124 + : "r" (mask), "r" (p), "i" (IS_ENABLED(CONFIG_PPC64) ? eh : 0) \ 125 125 : "cc", "memory"); \ 126 126 return (old & mask); \ 127 127 } ··· 170 170 171 171 __asm__ __volatile__ ( 172 172 PPC_RELEASE_BARRIER 173 - "1:" PPC_LLARX(%0,0,%3,0) "\n" 173 + "1:" PPC_LLARX "%0,0,%3,0\n" 174 174 "andc %1,%0,%2\n" 175 175 PPC_STLCX "%1,0,%3\n" 176 176 "bne- 1b\n"
+1 -1
arch/powerpc/include/asm/book3s/64/kup.h
··· 90 90 /* Prevent access to userspace using any key values */ 91 91 LOAD_REG_IMMEDIATE(\gpr2, AMR_KUAP_BLOCKED) 92 92 999: tdne \gpr1, \gpr2 93 - EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) 93 + EMIT_WARN_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) 94 94 END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_BOOK3S_KUAP, 67) 95 95 #endif 96 96 .endm
+51 -11
arch/powerpc/include/asm/bug.h
··· 4 4 #ifdef __KERNEL__ 5 5 6 6 #include <asm/asm-compat.h> 7 + #include <asm/extable.h> 7 8 8 9 #ifdef CONFIG_BUG 9 10 ··· 30 29 .previous 31 30 .endm 32 31 #endif /* verbose */ 32 + 33 + .macro EMIT_WARN_ENTRY addr,file,line,flags 34 + EX_TABLE(\addr,\addr+4) 35 + EMIT_BUG_ENTRY \addr,\file,\line,\flags 36 + .endm 33 37 34 38 #else /* !__ASSEMBLY__ */ 35 39 /* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and ··· 64 58 "i" (sizeof(struct bug_entry)), \ 65 59 ##__VA_ARGS__) 66 60 61 + #define WARN_ENTRY(insn, flags, label, ...) \ 62 + asm_volatile_goto( \ 63 + "1: " insn "\n" \ 64 + EX_TABLE(1b, %l[label]) \ 65 + _EMIT_BUG_ENTRY \ 66 + : : "i" (__FILE__), "i" (__LINE__), \ 67 + "i" (flags), \ 68 + "i" (sizeof(struct bug_entry)), \ 69 + ##__VA_ARGS__ : : label) 70 + 67 71 /* 68 72 * BUG_ON() and WARN_ON() do their best to cooperate with compile-time 69 73 * optimisations. However depending on the complexity of the condition ··· 84 68 BUG_ENTRY("twi 31, 0, 0", 0); \ 85 69 unreachable(); \ 86 70 } while (0) 71 + #define HAVE_ARCH_BUG 87 72 73 + #define __WARN_FLAGS(flags) do { \ 74 + __label__ __label_warn_on; \ 75 + \ 76 + WARN_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags), __label_warn_on); \ 77 + unreachable(); \ 78 + \ 79 + __label_warn_on: \ 80 + break; \ 81 + } while (0) 82 + 83 + #ifdef CONFIG_PPC64 88 84 #define BUG_ON(x) do { \ 89 85 if (__builtin_constant_p(x)) { \ 90 86 if (x) \ ··· 106 78 } \ 107 79 } while (0) 108 80 109 - #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) 110 - 111 81 #define WARN_ON(x) ({ \ 112 - int __ret_warn_on = !!(x); \ 113 - if (__builtin_constant_p(__ret_warn_on)) { \ 114 - if (__ret_warn_on) \ 82 bool __ret_warn_on = false; \ 83 do { \ 84 if (__builtin_constant_p((x))) { \ 85 if (!(x)) \ 86 break; \ 115 87 __WARN(); \ 116 - } else { \ 117 - BUG_ENTRY(PPC_TLNEI " %4, 0", \ 118 - BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ 119 - "r" (__ret_warn_on)); \ 120 - } \ 88 + __ret_warn_on = true; \ 89 + } else { \ 90 + __label__ __label_warn_on; \ 91 + \ 92 + WARN_ENTRY(PPC_TLNEI " %4, 0", \ 93 + BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ 94 + __label_warn_on, \ 95 + "r" ((__force long)(x))); \ 96 + break; \ 97 + __label_warn_on: \ 98 + __ret_warn_on = true; \ 99 + } \ 100 + } while (0); \ 121 101 unlikely(__ret_warn_on); \ 122 102 }) 123 103 124 - #define HAVE_ARCH_BUG 125 104 #define HAVE_ARCH_BUG_ON 126 105 #define HAVE_ARCH_WARN_ON 106 + #endif 107 + 127 108 #endif /* __ASSEMBLY __ */ 128 109 #else 129 110 #ifdef __ASSEMBLY__ 130 111 .macro EMIT_BUG_ENTRY addr,file,line,flags 131 112 .endm 113 + .macro EMIT_WARN_ENTRY addr,file,line,flags 114 + .endm 132 115 #else /* !__ASSEMBLY__ */ 133 116 #define _EMIT_BUG_ENTRY 117 + #define _EMIT_WARN_ENTRY 134 118 #endif 135 119 #endif /* CONFIG_BUG */ 136 120
-13
arch/powerpc/include/asm/debugfs.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 - #ifndef _ASM_POWERPC_DEBUGFS_H 3 - #define _ASM_POWERPC_DEBUGFS_H 4 - 5 - /* 6 - * Copyright 2017, Michael Ellerman, IBM Corporation. 7 - */ 8 - 9 - #include <linux/debugfs.h> 10 - 11 - extern struct dentry *powerpc_debugfs_root; 12 - 13 - #endif /* _ASM_POWERPC_DEBUGFS_H */
+1
arch/powerpc/include/asm/drmem.h
··· 111 111 int __init 112 112 walk_drmem_lmbs_early(unsigned long node, void *data, 113 113 int (*func)(struct drmem_lmb *, const __be32 **, void *)); 114 + void drmem_update_lmbs(struct property *prop); 114 115 #endif 115 116 116 117 static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
+14
arch/powerpc/include/asm/extable.h
··· 17 17 18 18 #define ARCH_HAS_RELATIVE_EXTABLE 19 19 20 + #ifndef __ASSEMBLY__ 21 + 20 22 struct exception_table_entry { 21 23 int insn; 22 24 int fixup; ··· 28 26 { 29 27 return (unsigned long)&x->fixup + x->fixup; 30 28 } 29 + 30 + #endif 31 + 32 + /* 33 + * Helper macro for exception table entries 34 + */ 35 + #define EX_TABLE(_fault, _target) \ 36 + stringify_in_c(.section __ex_table,"a";)\ 37 + stringify_in_c(.balign 4;) \ 38 + stringify_in_c(.long (_fault) - . ;) \ 39 + stringify_in_c(.long (_target) - . ;) \ 40 + stringify_in_c(.previous) 31 41 32 42 #endif
+4 -3
arch/powerpc/include/asm/firmware.h
··· 44 44 #define FW_FEATURE_OPAL ASM_CONST(0x0000000010000000) 45 45 #define FW_FEATURE_SET_MODE ASM_CONST(0x0000000040000000) 46 46 #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x0000000080000000) 47 - #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0000000100000000) 47 + #define FW_FEATURE_FORM1_AFFINITY ASM_CONST(0x0000000100000000) 48 48 #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) 49 49 #define FW_FEATURE_DRMEM_V2 ASM_CONST(0x0000000400000000) 50 50 #define FW_FEATURE_DRC_INFO ASM_CONST(0x0000000800000000) ··· 53 53 #define FW_FEATURE_ULTRAVISOR ASM_CONST(0x0000004000000000) 54 54 #define FW_FEATURE_STUFF_TCE ASM_CONST(0x0000008000000000) 55 55 #define FW_FEATURE_RPT_INVALIDATE ASM_CONST(0x0000010000000000) 56 + #define FW_FEATURE_FORM2_AFFINITY ASM_CONST(0x0000020000000000) 56 57 57 58 #ifndef __ASSEMBLY__ 58 59 ··· 70 69 FW_FEATURE_SPLPAR | FW_FEATURE_LPAR | 71 70 FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO | 72 71 FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | 73 - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | 72 + FW_FEATURE_FORM1_AFFINITY | FW_FEATURE_PRRN | 74 73 FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | 75 74 FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE | 76 75 FW_FEATURE_PAPR_SCM | FW_FEATURE_ULTRAVISOR | 77 - FW_FEATURE_RPT_INVALIDATE, 76 + FW_FEATURE_RPT_INVALIDATE | FW_FEATURE_FORM2_AFFINITY, 78 77 FW_FEATURE_PSERIES_ALWAYS = 0, 79 78 FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_ULTRAVISOR, 80 79 FW_FEATURE_POWERNV_ALWAYS = 0,
+1
arch/powerpc/include/asm/iommu.h
··· 154 154 */ 155 155 extern struct iommu_table *iommu_init_table(struct iommu_table *tbl, 156 156 int nid, unsigned long res_start, unsigned long res_end); 157 + bool iommu_table_in_use(struct iommu_table *tbl); 157 158 158 159 #define IOMMU_TABLE_GROUP_MAX_TABLES 2 159 160
+1
arch/powerpc/include/asm/kvm_book3s_64.h
··· 39 39 pgd_t *shadow_pgtable; /* our page table for this guest */ 40 40 u64 l1_gr_to_hr; /* L1's addr of part'n-scoped table */ 41 41 u64 process_table; /* process table entry for this guest */ 42 + u64 hfscr; /* HFSCR that the L1 requested for this nested guest */ 42 43 long refcnt; /* number of pointers to this struct */ 43 44 struct mutex tlb_lock; /* serialize page faults and tlbies */ 44 45 struct kvm_nested_guest *next;
+2
arch/powerpc/include/asm/kvm_host.h
··· 811 811 812 812 u32 online; 813 813 814 + u64 hfscr_permitted; /* A mask of permitted HFSCR facilities */ 815 + 814 816 /* For support of nested guests */ 815 817 struct kvm_nested_guest *nested; 816 818 u32 nested_vcpu_id;
+2 -2
arch/powerpc/include/asm/kvm_ppc.h
··· 664 664 struct kvm_vcpu *vcpu, u32 cpu); 665 665 extern void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu); 666 666 extern int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq, 667 - struct irq_desc *host_desc); 667 + unsigned long host_irq); 668 668 extern int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq, 669 - struct irq_desc *host_desc); 669 + unsigned long host_irq); 670 670 extern u64 kvmppc_xive_get_icp(struct kvm_vcpu *vcpu); 671 671 extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval); 672 672
+2 -1
arch/powerpc/include/asm/membarrier.h
··· 12 12 * when switching from userspace to kernel is not needed after 13 13 * store to rq->curr. 14 14 */ 15 - if (likely(!(atomic_read(&next->membarrier_state) & 15 + if (IS_ENABLED(CONFIG_SMP) && 16 + likely(!(atomic_read(&next->membarrier_state) & 16 17 (MEMBARRIER_STATE_PRIVATE_EXPEDITED | 17 18 MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) 18 19 return;
+1 -1
arch/powerpc/include/asm/mmu.h
··· 324 324 } 325 325 #endif /* !CONFIG_DEBUG_VM */ 326 326 327 - static inline bool radix_enabled(void) 327 + static __always_inline bool radix_enabled(void) 328 328 { 329 329 return mmu_has_feature(MMU_FTR_TYPE_RADIX); 330 330 }
+5
arch/powerpc/include/asm/pci-bridge.h
··· 126 126 #endif /* CONFIG_PPC64 */ 127 127 128 128 void *private_data; 129 + 130 + /* IRQ domain hierarchy */ 131 + struct irq_domain *dev_domain; 132 + struct irq_domain *msi_domain; 133 + struct fwnode_handle *fwnode; 129 134 }; 130 135 131 136 /* These are used for config access before all the PCI probing
+7
arch/powerpc/include/asm/pmc.h
··· 34 34 #endif 35 35 } 36 36 37 + #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE 38 + static inline int ppc_get_pmu_inuse(void) 39 + { 40 + return get_paca()->pmcregs_in_use; 41 + } 42 + #endif 43 + 37 44 extern void power4_enable_pmcs(void); 38 45 39 46 #else /* CONFIG_PPC64 */
+1 -1
arch/powerpc/include/asm/pnv-pci.h
··· 33 33 void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num); 34 34 int pnv_cxl_get_irq_count(struct pci_dev *dev); 35 35 struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev); 36 - int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq); 36 + int64_t pnv_opal_pci_msi_eoi(struct irq_data *d); 37 37 bool is_pnv_opal_msi(struct irq_chip *chip); 38 38 39 39 #ifdef CONFIG_CXL_BASE
-2
arch/powerpc/include/asm/ppc-opcode.h
··· 576 576 #define PPC_DIVDE(t, a, b) stringify_in_c(.long PPC_RAW_DIVDE(t, a, b)) 577 577 #define PPC_DIVDEU(t, a, b) stringify_in_c(.long PPC_RAW_DIVDEU(t, a, b)) 578 578 #define PPC_LQARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LQARX(t, a, b, eh)) 579 - #define PPC_LDARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LDARX(t, a, b, eh)) 580 - #define PPC_LWARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LWARX(t, a, b, eh)) 581 579 #define PPC_STQCX(t, a, b) stringify_in_c(.long PPC_RAW_STQCX(t, a, b)) 582 580 #define PPC_MADDHD(t, a, b, c) stringify_in_c(.long PPC_RAW_MADDHD(t, a, b, c)) 583 581 #define PPC_MADDHDU(t, a, b, c) stringify_in_c(.long PPC_RAW_MADDHDU(t, a, b, c))
+2 -11
arch/powerpc/include/asm/ppc_asm.h
··· 10 10 #include <asm/ppc-opcode.h> 11 11 #include <asm/firmware.h> 12 12 #include <asm/feature-fixups.h> 13 + #include <asm/extable.h> 13 14 14 15 #ifdef __ASSEMBLY__ 15 16 ··· 260 259 261 260 /* Be careful, this will clobber the lr register. */ 262 261 #define LOAD_REG_ADDR_PIC(reg, name) \ 263 - bl 0f; \ 262 + bcl 20,31,$+4; \ 264 263 0: mflr reg; \ 265 264 addis reg,reg,(name - 0b)@ha; \ 266 265 addi reg,reg,(name - 0b)@l; ··· 752 751 #endif /* !CONFIG_PPC_BOOK3E */ 753 752 754 753 #endif /* __ASSEMBLY__ */ 755 - 756 - /* 757 - * Helper macro for exception table entries 758 - */ 759 - #define EX_TABLE(_fault, _target) \ 760 - stringify_in_c(.section __ex_table,"a";)\ 761 - stringify_in_c(.balign 4;) \ 762 - stringify_in_c(.long (_fault) - . ;) \ 763 - stringify_in_c(.long (_target) - . ;) \ 764 - stringify_in_c(.previous) 765 754 766 755 #define SOFT_MASK_TABLE(_start, _end) \ 767 756 stringify_in_c(.section __soft_mask_table,"a";)\
+2 -1
arch/powerpc/include/asm/prom.h
··· 147 147 #define OV5_MSI 0x0201 /* PCIe/MSI support */ 148 148 #define OV5_CMO 0x0480 /* Cooperative Memory Overcommitment */ 149 149 #define OV5_XCMO 0x0440 /* Page Coalescing */ 150 - #define OV5_TYPE1_AFFINITY 0x0580 /* Type 1 NUMA affinity */ 150 + #define OV5_FORM1_AFFINITY 0x0580 /* FORM1 NUMA affinity */ 151 151 #define OV5_PRRN 0x0540 /* Platform Resource Reassignment */ 152 + #define OV5_FORM2_AFFINITY 0x0520 /* Form2 NUMA affinity */ 152 153 #define OV5_HP_EVT 0x0604 /* Hot Plug Event support */ 153 154 #define OV5_RESIZE_HPT 0x0601 /* Hash Page Table resizing */ 154 155 #define OV5_PFO_HW_RNG 0x1180 /* PFO Random Number Generator */
+31 -6
arch/powerpc/include/asm/ptrace.h
··· 22 22 #include <linux/err.h> 23 23 #include <uapi/asm/ptrace.h> 24 24 #include <asm/asm-const.h> 25 + #include <asm/reg.h> 25 26 26 27 #ifndef __ASSEMBLY__ 27 28 struct pt_regs ··· 44 43 unsigned long mq; 45 44 #endif 46 45 unsigned long trap; 47 - unsigned long dar; 48 - unsigned long dsisr; 46 + union { 47 + unsigned long dar; 48 + unsigned long dear; 49 + }; 50 + union { 51 + unsigned long dsisr; 52 + unsigned long esr; 53 + }; 49 54 unsigned long result; 50 55 }; 51 56 }; ··· 204 197 return 0; 205 198 } 206 199 207 - #ifdef __powerpc64__ 208 - #define user_mode(regs) ((((regs)->msr) >> MSR_PR_LG) & 0x1) 209 - #else 210 200 #define user_mode(regs) (((regs)->msr & MSR_PR) != 0) 211 - #endif 212 201 213 202 #define force_successful_syscall_return() \ 214 203 do { \ ··· 287 284 static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc) 288 285 { 289 286 regs->gpr[3] = rc; 287 + } 288 + 289 + static inline bool cpu_has_msr_ri(void) 290 + { 291 + return !IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x); 292 + } 293 + 294 + static inline bool regs_is_unrecoverable(struct pt_regs *regs) 295 + { 296 + return unlikely(cpu_has_msr_ri() && !(regs->msr & MSR_RI)); 297 + } 298 + 299 + static inline void regs_set_recoverable(struct pt_regs *regs) 300 + { 301 + if (cpu_has_msr_ri()) 302 + regs_set_return_msr(regs, regs->msr | MSR_RI); 303 + } 304 + 305 + static inline void regs_set_unrecoverable(struct pt_regs *regs) 306 + { 307 + if (cpu_has_msr_ri()) 308 + regs_set_return_msr(regs, regs->msr & ~MSR_RI); 290 309 } 291 310 292 311 #define arch_has_single_step() (1)
+2 -1
arch/powerpc/include/asm/reg.h
··· 415 415 #define FSCR_TAR __MASK(FSCR_TAR_LG) 416 416 #define FSCR_EBB __MASK(FSCR_EBB_LG) 417 417 #define FSCR_DSCR __MASK(FSCR_DSCR_LG) 418 + #define FSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ 418 419 #define SPRN_HFSCR 0xbe /* HV=1 Facility Status & Control Register */ 419 420 #define HFSCR_PREFIX __MASK(FSCR_PREFIX_LG) 420 421 #define HFSCR_MSGP __MASK(FSCR_MSGP_LG) ··· 427 426 #define HFSCR_DSCR __MASK(FSCR_DSCR_LG) 428 427 #define HFSCR_VECVSX __MASK(FSCR_VECVSX_LG) 429 428 #define HFSCR_FP __MASK(FSCR_FP_LG) 430 - #define HFSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ 429 + #define HFSCR_INTR_CAUSE FSCR_INTR_CAUSE 431 430 #define SPRN_TAR 0x32f /* Target Address Register */ 432 431 #define SPRN_LPCR 0x13E /* LPAR Control Register */ 433 432 #define LPCR_VPM0 ASM_CONST(0x8000000000000000)
-8
arch/powerpc/include/asm/sections.h
··· 38 38 extern char end_virt_trampolines[]; 39 39 #endif 40 40 41 - static inline int in_kernel_text(unsigned long addr) 42 - { 43 - if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end) 44 - return 1; 45 - 46 - return 0; 47 - } 48 - 49 41 static inline unsigned long kernel_toc_addr(void) 50 42 { 51 43 /* Defined by the linker, see vmlinux.lds.S */
+3 -3
arch/powerpc/include/asm/simple_spinlock.h
··· 51 51 52 52 token = LOCK_TOKEN; 53 53 __asm__ __volatile__( 54 - "1: " PPC_LWARX(%0,0,%2,1) "\n\ 54 + "1: lwarx %0,0,%2,1\n\ 55 55 cmpwi 0,%0,0\n\ 56 56 bne- 2f\n\ 57 57 stwcx. %1,0,%2\n\ ··· 179 179 long tmp; 180 180 181 181 __asm__ __volatile__( 182 - "1: " PPC_LWARX(%0,0,%1,1) "\n" 182 + "1: lwarx %0,0,%1,1\n" 183 183 __DO_SIGN_EXTEND 184 184 " addic. %0,%0,1\n\ 185 185 ble- 2f\n" ··· 203 203 204 204 token = WRLOCK_TOKEN; 205 205 __asm__ __volatile__( 206 - "1: " PPC_LWARX(%0,0,%2,1) "\n\ 206 + "1: lwarx %0,0,%2,1\n\ 207 207 cmpwi 0,%0,0\n\ 208 208 bne- 2f\n" 209 209 " stwcx. %1,0,%2\n\
+6
arch/powerpc/include/asm/smp.h
··· 33 33 extern int cpu_to_chip_id(int cpu); 34 34 extern int *chip_id_lookup_table; 35 35 36 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 37 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 38 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map); 39 + 36 40 #ifdef CONFIG_SMP 37 41 38 42 struct smp_ops_t { ··· 145 141 146 142 extern bool has_big_cores; 147 143 extern bool thread_group_shares_l2; 144 + extern bool thread_group_shares_l3; 148 145 149 146 #define cpu_smt_mask cpu_smt_mask 150 147 #ifdef CONFIG_SCHED_SMT ··· 200 195 #define hard_smp_processor_id() get_hard_smp_processor_id(0) 201 196 #define smp_setup_cpu_maps() 202 197 #define thread_group_shares_l2 0 198 + #define thread_group_shares_l3 0 203 199 static inline void inhibit_secondary_onlining(void) {} 204 200 static inline void uninhibit_secondary_onlining(void) {} 205 201 static inline const struct cpumask *cpu_sibling_mask(int cpu)
+7 -13
arch/powerpc/include/asm/syscall.h
··· 90 90 unsigned long val, mask = -1UL; 91 91 unsigned int n = 6; 92 92 93 - #ifdef CONFIG_COMPAT 94 - if (test_tsk_thread_flag(task, TIF_32BIT)) 93 + if (is_32bit_task()) 95 94 mask = 0xffffffff; 96 - #endif 95 + 97 96 while (n--) { 98 97 if (n == 0) 99 98 val = regs->orig_gpr3; ··· 115 116 116 117 static inline int syscall_get_arch(struct task_struct *task) 117 118 { 118 - int arch; 119 - 120 - if (IS_ENABLED(CONFIG_PPC64) && !test_tsk_thread_flag(task, TIF_32BIT)) 121 - arch = AUDIT_ARCH_PPC64; 119 + if (is_32bit_task()) 120 + return AUDIT_ARCH_PPC; 121 + else if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN)) 122 + return AUDIT_ARCH_PPC64LE; 122 123 else 123 - arch = AUDIT_ARCH_PPC; 124 - 125 - #ifdef __LITTLE_ENDIAN__ 126 - arch |= __AUDIT_ARCH_LE; 127 - #endif 128 - return arch; 124 + return AUDIT_ARCH_PPC64; 129 125 } 130 126 #endif /* _ASM_SYSCALL_H */
+30
arch/powerpc/include/asm/syscalls.h
··· 6 6 #include <linux/compiler.h> 7 7 #include <linux/linkage.h> 8 8 #include <linux/types.h> 9 + #include <linux/compat.h> 9 10 10 11 struct rtas_args; 11 12 ··· 18 17 unsigned long fd, unsigned long pgoff); 19 18 asmlinkage long ppc64_personality(unsigned long personality); 20 19 asmlinkage long sys_rtas(struct rtas_args __user *uargs); 20 + 21 + #ifdef CONFIG_COMPAT 22 + unsigned long compat_sys_mmap2(unsigned long addr, size_t len, 23 + unsigned long prot, unsigned long flags, 24 + unsigned long fd, unsigned long pgoff); 25 + 26 + compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, 27 + u32 reg6, u32 pos1, u32 pos2); 28 + 29 + compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, 30 + u32 reg6, u32 pos1, u32 pos2); 31 + 32 + compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count); 33 + 34 + int compat_sys_truncate64(const char __user *path, u32 reg4, 35 + unsigned long len1, unsigned long len2); 36 + 37 + long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2); 38 + 39 + int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, 40 + unsigned long len2); 41 + 42 + long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, 43 + size_t len, int advice); 44 + 45 + long compat_sys_sync_file_range2(int fd, unsigned int flags, 46 + unsigned int offset1, unsigned int offset2, 47 + unsigned int nbytes1, unsigned int nbytes2); 48 + #endif 21 49 22 50 #endif /* __KERNEL__ */ 23 51 #endif /* __ASM_POWERPC_SYSCALLS_H */
-8
arch/powerpc/include/asm/tce.h
··· 19 19 #define TCE_VB 0 20 20 #define TCE_PCI 1 21 21 22 - /* TCE page size is 4096 bytes (1 << 12) */ 23 - 24 - #define TCE_SHIFT 12 25 - #define TCE_PAGE_SIZE (1 << TCE_SHIFT) 26 - 27 22 #define TCE_ENTRY_SIZE 8 /* each TCE is 64 bits */ 28 - 29 - #define TCE_RPN_MASK 0xfffffffffful /* 40-bit RPN (4K pages) */ 30 - #define TCE_RPN_SHIFT 12 31 23 #define TCE_VALID 0x800 /* TCE valid */ 32 24 #define TCE_ALLIO 0x400 /* TCE valid for all lpars */ 33 25 #define TCE_PCI_WRITE 0x2 /* write from PCI allowed */
+17 -2
arch/powerpc/include/asm/topology.h
··· 36 36 cpu_all_mask : \ 37 37 cpumask_of_node(pcibus_to_node(bus))) 38 38 39 - extern int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc); 39 + int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc); 40 40 extern int __node_distance(int, int); 41 41 #define node_distance(a, b) __node_distance(a, b) 42 42 ··· 64 64 } 65 65 66 66 int of_drconf_to_nid_single(struct drmem_lmb *lmb); 67 + void update_numa_distance(struct device_node *node); 68 + 69 + extern void map_cpu_to_node(int cpu, int node); 70 + #ifdef CONFIG_HOTPLUG_CPU 71 + extern void unmap_cpu_from_node(unsigned long cpu); 72 + #endif /* CONFIG_HOTPLUG_CPU */ 67 73 68 74 #else 69 75 ··· 89 83 90 84 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {} 91 85 92 - static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc) 86 + static inline int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc) 93 87 { 94 88 return 0; 95 89 } ··· 98 92 { 99 93 return first_online_node; 100 94 } 95 + 96 + static inline void update_numa_distance(struct device_node *node) {} 97 + 98 + #ifdef CONFIG_SMP 99 + static inline void map_cpu_to_node(int cpu, int node) {} 100 + #ifdef CONFIG_HOTPLUG_CPU 101 + static inline void unmap_cpu_from_node(unsigned long cpu) {} 102 + #endif /* CONFIG_HOTPLUG_CPU */ 103 + #endif /* CONFIG_SMP */ 101 104 102 105 #endif /* CONFIG_NUMA */ 103 106
-2
arch/powerpc/include/asm/unistd.h
··· 9 9 10 10 #define NR_syscalls __NR_syscalls 11 11 12 - #define __NR__exit __NR_exit 13 - 14 12 #ifndef __ASSEMBLY__ 15 13 16 14 #include <linux/types.h>
+9
arch/powerpc/include/asm/vdso/processor.h
··· 5 5 #ifndef __ASSEMBLY__ 6 6 7 7 /* Macros for adjusting thread priority (hardware multi-threading) */ 8 + #ifdef CONFIG_PPC64 8 9 #define HMT_very_low() asm volatile("or 31, 31, 31 # very low priority") 9 10 #define HMT_low() asm volatile("or 1, 1, 1 # low priority") 10 11 #define HMT_medium_low() asm volatile("or 6, 6, 6 # medium low priority") 11 12 #define HMT_medium() asm volatile("or 2, 2, 2 # medium priority") 12 13 #define HMT_medium_high() asm volatile("or 5, 5, 5 # medium high priority") 13 14 #define HMT_high() asm volatile("or 3, 3, 3 # high priority") 15 + #else 16 + #define HMT_very_low() 17 + #define HMT_low() 18 + #define HMT_medium_low() 19 + #define HMT_medium() 20 + #define HMT_medium_high() 21 + #define HMT_high() 22 + #endif 14 23 15 24 #ifdef CONFIG_PPC64 16 25 #define cpu_relax() do { HMT_low(); HMT_medium(); barrier(); } while (0)
+2 -1
arch/powerpc/include/asm/xics.h
··· 89 89 /* ICS instance, hooked up to chip_data of an irq */ 90 90 struct ics { 91 91 struct list_head link; 92 - int (*map)(struct ics *ics, unsigned int virq); 92 + int (*check)(struct ics *ics, unsigned int hwirq); 93 93 void (*mask_unknown)(struct ics *ics, unsigned long vec); 94 94 long (*get_server)(struct ics *ics, unsigned long vec); 95 95 int (*host_match)(struct ics *ics, struct device_node *node); 96 + struct irq_chip *chip; 96 97 char data[]; 97 98 }; 98 99
+3
arch/powerpc/include/asm/xive-regs.h
··· 80 80 #define TM_QW0W2_VU PPC_BIT32(0) 81 81 #define TM_QW0W2_LOGIC_SERV PPC_BITMASK32(1,31) // XX 2,31 ? 82 82 #define TM_QW1W2_VO PPC_BIT32(0) 83 + #define TM_QW1W2_HO PPC_BIT32(1) /* P10 XIVE2 */ 83 84 #define TM_QW1W2_OS_CAM PPC_BITMASK32(8,31) 84 85 #define TM_QW2W2_VP PPC_BIT32(0) 86 + #define TM_QW2W2_HP PPC_BIT32(1) /* P10 XIVE2 */ 85 87 #define TM_QW2W2_POOL_CAM PPC_BITMASK32(8,31) 86 88 #define TM_QW3W2_VT PPC_BIT32(0) 89 + #define TM_QW3W2_HT PPC_BIT32(1) /* P10 XIVE2 */ 87 90 #define TM_QW3W2_LP PPC_BIT32(6) 88 91 #define TM_QW3W2_LE PPC_BIT32(7) 89 92 #define TM_QW3W2_T PPC_BIT32(31)
+2
arch/powerpc/include/asm/xive.h
··· 111 111 int xive_native_populate_irq_data(u32 hw_irq, 112 112 struct xive_irq_data *data); 113 113 void xive_cleanup_irq_data(struct xive_irq_data *xd); 114 + void xive_irq_free_data(unsigned int virq); 114 115 void xive_native_free_irq(u32 irq); 115 116 int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq); 116 117 ··· 126 125 int xive_native_disable_vp(u32 vp_id); 127 126 int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id); 128 127 bool xive_native_has_single_escalation(void); 128 + bool xive_native_has_save_restore(void); 129 129 130 130 int xive_native_get_queue_info(u32 vp_id, uint32_t prio, 131 131 u64 *out_qpage,
+2 -1
arch/powerpc/kernel/Makefile
··· 46 46 prom.o traps.o setup-common.o \ 47 47 udbg.o misc.o io.o misc_$(BITS).o \ 48 48 of_platform.o prom_parse.o firmware.o \ 49 - hw_breakpoint_constraints.o interrupt.o 49 + hw_breakpoint_constraints.o interrupt.o \ 50 + kdebugfs.o 50 51 obj-y += ptrace/ 51 52 obj-$(CONFIG_PPC64) += setup_64.o \ 52 53 paca.o nvram_64.o note.o
+4 -11
arch/powerpc/kernel/asm-offsets.c
··· 286 286 STACK_PT_REGS_OFFSET(_CCR, ccr); 287 287 STACK_PT_REGS_OFFSET(_XER, xer); 288 288 STACK_PT_REGS_OFFSET(_DAR, dar); 289 + STACK_PT_REGS_OFFSET(_DEAR, dear); 289 290 STACK_PT_REGS_OFFSET(_DSISR, dsisr); 291 + STACK_PT_REGS_OFFSET(_ESR, esr); 290 292 STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3); 291 293 STACK_PT_REGS_OFFSET(RESULT, result); 292 294 STACK_PT_REGS_OFFSET(_TRAP, trap); 293 - #ifndef CONFIG_PPC64 294 - /* 295 - * The PowerPC 400-class & Book-E processors have neither the DAR 296 - * nor the DSISR SPRs. Hence, we overload them to hold the similar 297 - * DEAR and ESR SPRs for such processors. For critical interrupts 298 - * we use them to hold SRR0 and SRR1. 299 - */ 300 - STACK_PT_REGS_OFFSET(_DEAR, dar); 301 - STACK_PT_REGS_OFFSET(_ESR, dsisr); 302 - #else /* CONFIG_PPC64 */ 295 + #ifdef CONFIG_PPC64 303 296 STACK_PT_REGS_OFFSET(SOFTE, softe); 304 297 STACK_PT_REGS_OFFSET(_PPR, ppr); 305 - #endif /* CONFIG_PPC64 */ 298 + #endif 306 299 307 300 #ifdef CONFIG_PPC_PKEY 308 301 STACK_PT_REGS_OFFSET(STACK_REGS_AMR, amr);
+62 -62
arch/powerpc/kernel/cacheinfo.c
··· 120 120 struct cpumask shared_cpu_map; /* online CPUs using this cache */ 121 121 int type; /* split cache disambiguation */ 122 122 int level; /* level not explicit in device tree */ 123 + int group_id; /* id of the group of threads that share this cache */ 123 124 struct list_head list; /* global list of cache objects */ 124 125 struct cache *next_local; /* next cache of >= level */ 125 126 }; ··· 143 142 } 144 143 145 144 static void cache_init(struct cache *cache, int type, int level, 146 - struct device_node *ofnode) 145 + struct device_node *ofnode, int group_id) 147 146 { 148 147 cache->type = type; 149 148 cache->level = level; 150 149 cache->ofnode = of_node_get(ofnode); 150 + cache->group_id = group_id; 151 151 INIT_LIST_HEAD(&cache->list); 152 152 list_add(&cache->list, &cache_list); 153 153 } 154 154 155 - static struct cache *new_cache(int type, int level, struct device_node *ofnode) 155 + static struct cache *new_cache(int type, int level, 156 + struct device_node *ofnode, int group_id) 156 157 { 157 158 struct cache *cache; 158 159 159 160 cache = kzalloc(sizeof(*cache), GFP_KERNEL); 160 161 if (cache) 161 - cache_init(cache, type, level, ofnode); 162 + cache_init(cache, type, level, ofnode, group_id); 162 163 163 164 return cache; 164 165 } ··· 312 309 return cache; 313 310 314 311 list_for_each_entry(iter, &cache_list, list) 315 - if (iter->ofnode == cache->ofnode && iter->next_local == cache) 312 + if (iter->ofnode == cache->ofnode && 313 + iter->group_id == cache->group_id && 314 + iter->next_local == cache) 316 315 return iter; 317 316 318 317 return cache; 319 318 } 320 319 321 - /* return the first cache on a local list matching node */ 322 - static struct cache *cache_lookup_by_node(const struct device_node *node) 320 + /* return the first cache on a local list matching node and thread-group id */ 321 + static struct cache *cache_lookup_by_node_group(const struct device_node *node, 322 + int group_id) 323 323 { 324 324 struct cache *cache 
= NULL; 325 325 struct cache *iter; 326 326 327 327 list_for_each_entry(iter, &cache_list, list) { 328 - if (iter->ofnode != node) 328 + if (iter->ofnode != node || 329 + iter->group_id != group_id) 329 330 continue; 330 331 cache = cache_find_first_sibling(iter); 331 332 break; ··· 359 352 CACHE_TYPE_UNIFIED_D : CACHE_TYPE_UNIFIED; 360 353 } 361 354 362 - static struct cache *cache_do_one_devnode_unified(struct device_node *node, int level) 355 + static struct cache *cache_do_one_devnode_unified(struct device_node *node, int group_id, 356 + int level) 363 357 { 364 358 pr_debug("creating L%d ucache for %pOFP\n", level, node); 365 359 366 - return new_cache(cache_is_unified_d(node), level, node); 360 + return new_cache(cache_is_unified_d(node), level, node, group_id); 367 361 } 368 362 369 - static struct cache *cache_do_one_devnode_split(struct device_node *node, 363 + static struct cache *cache_do_one_devnode_split(struct device_node *node, int group_id, 370 364 int level) 371 365 { 372 366 struct cache *dcache, *icache; ··· 375 367 pr_debug("creating L%d dcache and icache for %pOFP\n", level, 376 368 node); 377 369 378 - dcache = new_cache(CACHE_TYPE_DATA, level, node); 379 - icache = new_cache(CACHE_TYPE_INSTRUCTION, level, node); 370 + dcache = new_cache(CACHE_TYPE_DATA, level, node, group_id); 371 + icache = new_cache(CACHE_TYPE_INSTRUCTION, level, node, group_id); 380 372 381 373 if (!dcache || !icache) 382 374 goto err; ··· 390 382 return NULL; 391 383 } 392 384 393 - static struct cache *cache_do_one_devnode(struct device_node *node, int level) 385 + static struct cache *cache_do_one_devnode(struct device_node *node, int group_id, int level) 394 386 { 395 387 struct cache *cache; 396 388 397 389 if (cache_node_is_unified(node)) 398 - cache = cache_do_one_devnode_unified(node, level); 390 + cache = cache_do_one_devnode_unified(node, group_id, level); 399 391 else 400 - cache = cache_do_one_devnode_split(node, level); 392 + cache = 
cache_do_one_devnode_split(node, group_id, level); 401 393 402 394 return cache; 403 395 } 404 396 405 397 static struct cache *cache_lookup_or_instantiate(struct device_node *node, 398 + int group_id, 406 399 int level) 407 400 { 408 401 struct cache *cache; 409 402 410 - cache = cache_lookup_by_node(node); 403 + cache = cache_lookup_by_node_group(node, group_id); 411 404 412 405 WARN_ONCE(cache && cache->level != level, 413 406 "cache level mismatch on lookup (got %d, expected %d)\n", 414 407 cache->level, level); 415 408 416 409 if (!cache) 417 - cache = cache_do_one_devnode(node, level); 410 + cache = cache_do_one_devnode(node, group_id, level); 418 411 419 412 return cache; 420 413 } ··· 452 443 of_node_get_device_type(cache->ofnode)); 453 444 } 454 445 455 - static void do_subsidiary_caches(struct cache *cache) 446 + /* 447 + * If sub-groups of threads in a core containing @cpu_id share the 448 + * L@level-cache (information obtained via "ibm,thread-groups" 449 + * device-tree property), then we identify the group by the first 450 + * thread-sibling in the group. We define this to be the group-id. 451 + * 452 + * In the absence of any thread-group information for L@level-cache, 453 + * this function returns -1. 
454 + */ 455 + static int get_group_id(unsigned int cpu_id, int level) 456 + { 457 + if (has_big_cores && level == 1) 458 + return cpumask_first(per_cpu(thread_group_l1_cache_map, 459 + cpu_id)); 460 + else if (thread_group_shares_l2 && level == 2) 461 + return cpumask_first(per_cpu(thread_group_l2_cache_map, 462 + cpu_id)); 463 + else if (thread_group_shares_l3 && level == 3) 464 + return cpumask_first(per_cpu(thread_group_l3_cache_map, 465 + cpu_id)); 466 + return -1; 467 + } 468 + 469 + static void do_subsidiary_caches(struct cache *cache, unsigned int cpu_id) 456 470 { 457 471 struct device_node *subcache_node; 458 472 int level = cache->level; ··· 484 452 485 453 while ((subcache_node = of_find_next_cache_node(cache->ofnode))) { 486 454 struct cache *subcache; 455 + int group_id; 487 456 488 457 level++; 489 - subcache = cache_lookup_or_instantiate(subcache_node, level); 458 + group_id = get_group_id(cpu_id, level); 459 + subcache = cache_lookup_or_instantiate(subcache_node, group_id, level); 490 460 of_node_put(subcache_node); 491 461 if (!subcache) 492 462 break; ··· 502 468 { 503 469 struct device_node *cpu_node; 504 470 struct cache *cpu_cache = NULL; 471 + int group_id; 505 472 506 473 pr_debug("creating cache object(s) for CPU %i\n", cpu_id); 507 474 ··· 511 476 if (!cpu_node) 512 477 goto out; 513 478 514 - cpu_cache = cache_lookup_or_instantiate(cpu_node, 1); 479 + group_id = get_group_id(cpu_id, 1); 480 + 481 + cpu_cache = cache_lookup_or_instantiate(cpu_node, group_id, 1); 515 482 if (!cpu_cache) 516 483 goto out; 517 484 518 - do_subsidiary_caches(cpu_cache); 485 + do_subsidiary_caches(cpu_cache, cpu_id); 519 486 520 487 cache_cpu_set(cpu_cache, cpu_id); 521 488 out: ··· 678 641 static struct kobj_attribute cache_level_attr = 679 642 __ATTR(level, 0444, level_show, NULL); 680 643 681 - static unsigned int index_dir_to_cpu(struct cache_index_dir *index) 682 - { 683 - struct kobject *index_dir_kobj = &index->kobj; 684 - struct kobject *cache_dir_kobj 
= index_dir_kobj->parent; 685 - struct kobject *cpu_dev_kobj = cache_dir_kobj->parent; 686 - struct device *dev = kobj_to_dev(cpu_dev_kobj); 687 - 688 - return dev->id; 689 - } 690 - 691 - /* 692 - * On big-core systems, each core has two groups of CPUs each of which 693 - * has its own L1-cache. The thread-siblings which share l1-cache with 694 - * @cpu can be obtained via cpu_smallcore_mask(). 695 - * 696 - * On some big-core systems, the L2 cache is shared only between some 697 - * groups of siblings. This is already parsed and encoded in 698 - * cpu_l2_cache_mask(). 699 - * 700 - * TODO: cache_lookup_or_instantiate() needs to be made aware of the 701 - * "ibm,thread-groups" property so that cache->shared_cpu_map 702 - * reflects the correct siblings on platforms that have this 703 - * device-tree property. This helper function is only a stop-gap 704 - * solution so that we report the correct siblings to the 705 - * userspace via sysfs. 706 - */ 707 - static const struct cpumask *get_shared_cpu_map(struct cache_index_dir *index, struct cache *cache) 708 - { 709 - if (has_big_cores) { 710 - int cpu = index_dir_to_cpu(index); 711 - if (cache->level == 1) 712 - return cpu_smallcore_mask(cpu); 713 - if (cache->level == 2 && thread_group_shares_l2) 714 - return cpu_l2_cache_mask(cpu); 715 - } 716 - 717 - return &cache->shared_cpu_map; 718 - } 719 - 720 644 static ssize_t 721 645 show_shared_cpumap(struct kobject *k, struct kobj_attribute *attr, char *buf, bool list) 722 646 { ··· 688 690 index = kobj_to_cache_index_dir(k); 689 691 cache = index->cache; 690 692 691 - mask = get_shared_cpu_map(index, cache); 693 + mask = &cache->shared_cpu_map; 692 694 693 695 return cpumap_print_to_pagebuf(list, buf, mask); 694 696 } ··· 846 848 { 847 849 struct device_node *cpu_node; 848 850 struct cache *cache; 851 + int group_id; 849 852 850 853 cpu_node = of_get_cpu_node(cpu_id, NULL); 851 854 WARN_ONCE(!cpu_node, "no OF node found for CPU %i\n", cpu_id); 852 855 if (!cpu_node) 
853 856 return NULL; 854 857 855 - cache = cache_lookup_by_node(cpu_node); 858 + group_id = get_group_id(cpu_id, 1); 859 + cache = cache_lookup_by_node_group(cpu_node, group_id); 856 860 of_node_put(cpu_node); 857 861 858 862 return cache;
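The cacheinfo.c hunks above re-key cache objects by the pair (device-tree node, thread-group id): get_group_id() returns the first thread-sibling of the group sharing the cache at that level (from "ibm,thread-groups"), or -1 when no such information exists, and cache_lookup_by_node_group() matches on both fields. A rough userspace model of that lookup (illustrative names, not the kernel's structures; error handling elided):

```c
#include <stddef.h>
#include <stdlib.h>

/* Model of the change: cache objects are looked up by (ofnode, group_id)
 * instead of ofnode alone, so two thread groups behind the same
 * device-tree node that do not share the physical cache get distinct
 * objects.  group_id is -1 when "ibm,thread-groups" says nothing. */
struct model_cache {
	const void *ofnode;		/* device-tree node identity */
	int group_id;			/* first thread of the sharing group, or -1 */
	int level;
	struct model_cache *next;	/* global list, like cache_list */
};

static struct model_cache *cache_list;
static int node_a, node_b;		/* stand-ins for two device-tree nodes */

static struct model_cache *lookup(const void *ofnode, int group_id)
{
	struct model_cache *it;

	for (it = cache_list; it; it = it->next)
		if (it->ofnode == ofnode && it->group_id == group_id)
			return it;
	return NULL;
}

static struct model_cache *lookup_or_instantiate(const void *ofnode,
						 int group_id, int level)
{
	struct model_cache *c = lookup(ofnode, group_id);

	if (c)
		return c;
	c = calloc(1, sizeof(*c));
	c->ofnode = ofnode;
	c->group_id = group_id;
	c->level = level;
	c->next = cache_list;
	cache_list = c;
	return c;
}
```

With distinct objects per (node, group), shared_cpu_map can be populated correctly up front, which is what lets the series drop the old get_shared_cpu_map() sysfs stop-gap visible in the deletions above.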
+1 -2
arch/powerpc/kernel/dawr.c
··· 9 9 #include <linux/export.h> 10 10 #include <linux/fs.h> 11 11 #include <linux/debugfs.h> 12 - #include <asm/debugfs.h> 13 12 #include <asm/machdep.h> 14 13 #include <asm/hvcall.h> 15 14 ··· 100 101 if (PVR_VER(mfspr(SPRN_PVR)) == PVR_POWER9) { 101 102 /* Turn DAWR off by default, but allow admin to turn it on */ 102 103 debugfs_create_file_unsafe("dawr_enable_dangerous", 0600, 103 - powerpc_debugfs_root, 104 + arch_debugfs_dir, 104 105 &dawr_force_enable, 105 106 &dawr_enable_fops); 106 107 }
+8 -8
arch/powerpc/kernel/eeh.c
··· 21 21 #include <linux/spinlock.h> 22 22 #include <linux/export.h> 23 23 #include <linux/of.h> 24 + #include <linux/debugfs.h> 24 25 25 26 #include <linux/atomic.h> 26 - #include <asm/debugfs.h> 27 27 #include <asm/eeh.h> 28 28 #include <asm/eeh_event.h> 29 29 #include <asm/io.h> ··· 1901 1901 proc_create_single("powerpc/eeh", 0, NULL, proc_eeh_show); 1902 1902 #ifdef CONFIG_DEBUG_FS 1903 1903 debugfs_create_file_unsafe("eeh_enable", 0600, 1904 - powerpc_debugfs_root, NULL, 1904 + arch_debugfs_dir, NULL, 1905 1905 &eeh_enable_dbgfs_ops); 1906 1906 debugfs_create_u32("eeh_max_freezes", 0600, 1907 - powerpc_debugfs_root, &eeh_max_freezes); 1907 + arch_debugfs_dir, &eeh_max_freezes); 1908 1908 debugfs_create_bool("eeh_disable_recovery", 0600, 1909 - powerpc_debugfs_root, 1909 + arch_debugfs_dir, 1910 1910 &eeh_debugfs_no_recover); 1911 1911 debugfs_create_file_unsafe("eeh_dev_check", 0600, 1912 - powerpc_debugfs_root, NULL, 1912 + arch_debugfs_dir, NULL, 1913 1913 &eeh_dev_check_fops); 1914 1914 debugfs_create_file_unsafe("eeh_dev_break", 0600, 1915 - powerpc_debugfs_root, NULL, 1915 + arch_debugfs_dir, NULL, 1916 1916 &eeh_dev_break_fops); 1917 1917 debugfs_create_file_unsafe("eeh_force_recover", 0600, 1918 - powerpc_debugfs_root, NULL, 1918 + arch_debugfs_dir, NULL, 1919 1919 &eeh_force_recover_fops); 1920 1920 debugfs_create_file_unsafe("eeh_dev_can_recover", 0600, 1921 - powerpc_debugfs_root, NULL, 1921 + arch_debugfs_dir, NULL, 1922 1922 &eeh_dev_can_recover_fops); 1923 1923 eeh_cache_debugfs_init(); 1924 1924 #endif
+2 -2
arch/powerpc/kernel/eeh_cache.c
··· 12 12 #include <linux/slab.h> 13 13 #include <linux/spinlock.h> 14 14 #include <linux/atomic.h> 15 + #include <linux/debugfs.h> 15 16 #include <asm/pci-bridge.h> 16 - #include <asm/debugfs.h> 17 17 #include <asm/ppc-pci.h> 18 18 19 19 ··· 283 283 void eeh_cache_debugfs_init(void) 284 284 { 285 285 debugfs_create_file_unsafe("eeh_address_cache", 0400, 286 - powerpc_debugfs_root, NULL, 286 + arch_debugfs_dir, NULL, 287 287 &eeh_addr_cache_fops); 288 288 }
+2 -2
arch/powerpc/kernel/entry_32.S
··· 161 161 ret_from_kernel_thread: 162 162 REST_NVGPRS(r1) 163 163 bl schedule_tail 164 - mtlr r14 164 + mtctr r14 165 165 mr r3,r15 166 166 PPC440EP_ERR42 167 - blrl 167 + bctrl 168 168 li r3,0 169 169 b ret_from_syscall 170 170
+1 -1
arch/powerpc/kernel/entry_64.S
··· 309 309 */ 310 310 lbz r0,PACAIRQSOFTMASK(r13) 311 311 1: tdeqi r0,IRQS_ENABLED 312 - EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING 312 + EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING 313 313 #endif 314 314 315 315 /* Hard-disable interrupts */
+12 -12
arch/powerpc/kernel/exceptions-64e.S
··· 545 545 PROLOG_ADDITION_2REGS) 546 546 mfspr r14,SPRN_DEAR 547 547 mfspr r15,SPRN_ESR 548 - std r14,_DAR(r1) 549 - std r15,_DSISR(r1) 548 + std r14,_DEAR(r1) 549 + std r15,_ESR(r1) 550 550 ld r14,PACA_EXGEN+EX_R14(r13) 551 551 ld r15,PACA_EXGEN+EX_R15(r13) 552 552 EXCEPTION_COMMON(0x300) ··· 558 558 PROLOG_ADDITION_2REGS) 559 559 li r15,0 560 560 mr r14,r10 561 - std r14,_DAR(r1) 562 - std r15,_DSISR(r1) 561 + std r14,_DEAR(r1) 562 + std r15,_ESR(r1) 563 563 ld r14,PACA_EXGEN+EX_R14(r13) 564 564 ld r15,PACA_EXGEN+EX_R15(r13) 565 565 EXCEPTION_COMMON(0x400) ··· 575 575 PROLOG_ADDITION_2REGS) 576 576 mfspr r14,SPRN_DEAR 577 577 mfspr r15,SPRN_ESR 578 - std r14,_DAR(r1) 579 - std r15,_DSISR(r1) 578 + std r14,_DEAR(r1) 579 + std r15,_ESR(r1) 580 580 ld r14,PACA_EXGEN+EX_R14(r13) 581 581 ld r15,PACA_EXGEN+EX_R15(r13) 582 582 EXCEPTION_COMMON(0x600) ··· 587 587 NORMAL_EXCEPTION_PROLOG(0x700, BOOKE_INTERRUPT_PROGRAM, 588 588 PROLOG_ADDITION_1REG) 589 589 mfspr r14,SPRN_ESR 590 - std r14,_DSISR(r1) 590 + std r14,_ESR(r1) 591 591 ld r14,PACA_EXGEN+EX_R14(r13) 592 592 EXCEPTION_COMMON(0x700) 593 593 addi r3,r1,STACK_FRAME_OVERHEAD ··· 1057 1057 std r11,_CCR(r1) 1058 1058 mfspr r10,SPRN_DEAR 1059 1059 mfspr r11,SPRN_ESR 1060 - std r10,_DAR(r1) 1061 - std r11,_DSISR(r1) 1060 + std r10,_DEAR(r1) 1061 + std r11,_ESR(r1) 1062 1062 std r0,GPR0(r1); /* save r0 in stackframe */ \ 1063 1063 std r2,GPR2(r1); /* save r2 in stackframe */ \ 1064 1064 SAVE_4GPRS(3, r1); /* save r3 - r6 in stackframe */ \ ··· 1127 1127 * r3 = MAS0_TLBSEL (for the iprot array) 1128 1128 * r4 = SPRN_TLBnCFG 1129 1129 */ 1130 - bl invstr /* Find our address */ 1130 + bcl 20,31,$+4 /* Find our address */ 1131 1131 invstr: mflr r6 /* Make it accessible */ 1132 1132 mfmsr r7 1133 1133 rlwinm r5,r7,27,31,31 /* extract MSR[IS] */ ··· 1196 1196 mfmsr r6 1197 1197 xori r6,r6,MSR_IS 1198 1198 mtspr SPRN_SRR1,r6 1199 - bl 1f /* Find our address */ 1199 + bcl 20,31,$+4 /* Find our address */ 1200 1200 1: mflr r6 
1201 1201 addi r6,r6,(2f - 1b) 1202 1202 mtspr SPRN_SRR0,r6 ··· 1256 1256 * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping 1257 1257 */ 1258 1258 /* Now we branch the new virtual address mapped by this entry */ 1259 - bl 1f /* Find our address */ 1259 + bcl 20,31,$+4 /* Find our address */ 1260 1260 1: mflr r6 1261 1261 addi r6,r6,(2f - 1b) 1262 1262 tovirt(r6,r6)
+2 -2
arch/powerpc/kernel/fadump.c
··· 24 24 #include <linux/slab.h> 25 25 #include <linux/cma.h> 26 26 #include <linux/hugetlb.h> 27 + #include <linux/debugfs.h> 27 28 28 - #include <asm/debugfs.h> 29 29 #include <asm/page.h> 30 30 #include <asm/prom.h> 31 31 #include <asm/fadump.h> ··· 1557 1557 return; 1558 1558 } 1559 1559 1560 - debugfs_create_file("fadump_region", 0444, powerpc_debugfs_root, NULL, 1560 + debugfs_create_file("fadump_region", 0444, arch_debugfs_dir, NULL, 1561 1561 &fadump_region_fops); 1562 1562 1563 1563 if (fw_dump.dump_active) {
+1 -2
arch/powerpc/kernel/fpu.S
··· 91 91 isync 92 92 /* enable use of FP after return */ 93 93 #ifdef CONFIG_PPC32 94 - mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */ 95 - tovirt(r5, r5) 94 + addi r5,r2,THREAD 96 95 lwz r4,THREAD_FPEXC_MODE(r5) 97 96 ori r9,r9,MSR_FP /* enable FP for current */ 98 97 or r9,r9,r4
+4 -4
arch/powerpc/kernel/fsl_booke_entry_mapping.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0 */ 2 2 3 3 /* 1. Find the index of the entry we're executing in */ 4 - bl invstr /* Find our address */ 4 + bcl 20,31,$+4 /* Find our address */ 5 5 invstr: mflr r6 /* Make it accessible */ 6 6 mfmsr r7 7 7 rlwinm r4,r7,27,31,31 /* extract MSR[IS] */ ··· 85 85 addi r6,r6,10 86 86 slw r6,r8,r6 /* convert to mask */ 87 87 88 - bl 1f /* Find our address */ 88 + bcl 20,31,$+4 /* Find our address */ 89 89 1: mflr r7 90 90 91 91 mfspr r8,SPRN_MAS3 ··· 117 117 118 118 xori r6,r4,1 119 119 slwi r6,r6,5 /* setup new context with other address space */ 120 - bl 1f /* Find our address */ 120 + bcl 20,31,$+4 /* Find our address */ 121 121 1: mflr r9 122 122 rlwimi r7,r9,0,20,31 123 123 addi r7,r7,(2f - 1b) ··· 207 207 208 208 lis r7,MSR_KERNEL@h 209 209 ori r7,r7,MSR_KERNEL@l 210 - bl 1f /* Find our address */ 210 + bcl 20,31,$+4 /* Find our address */ 211 211 1: mflr r9 212 212 rlwimi r6,r9,0,20,31 213 213 addi r6,r6,(2f - 1b)
+3 -3
arch/powerpc/kernel/head_44x.S
··· 70 70 * address. 71 71 * r21 will be loaded with the physical runtime address of _stext 72 72 */ 73 - bl 0f /* Get our runtime address */ 73 + bcl 20,31,$+4 /* Get our runtime address */ 74 74 0: mflr r21 /* Make it accessible */ 75 75 addis r21,r21,(_stext - 0b)@ha 76 76 addi r21,r21,(_stext - 0b)@l /* Get our current runtime base */ ··· 853 853 wmmucr: mtspr SPRN_MMUCR,r3 /* Put MMUCR */ 854 854 sync 855 855 856 - bl invstr /* Find our address */ 856 + bcl 20,31,$+4 /* Find our address */ 857 857 invstr: mflr r5 /* Make it accessible */ 858 858 tlbsx r23,0,r5 /* Find entry we are in */ 859 859 li r4,0 /* Start at TLB entry 0 */ ··· 1045 1045 sync 1046 1046 1047 1047 /* Find the entry we are running from */ 1048 - bl 1f 1048 + bcl 20,31,$+4 1049 1049 1: mflr r23 1050 1050 tlbsx r23,0,r23 1051 1051 tlbre r24,r23,0
+2
arch/powerpc/kernel/head_64.S
··· 712 712 isync 713 713 blr 714 714 715 + _ASM_NOKPROBE_SYMBOL(copy_and_flush); /* Called in real mode */ 716 + 715 717 .align 8 716 718 copy_to_here: 717 719
+3 -3
arch/powerpc/kernel/head_fsl_booke.S
··· 79 79 mr r23,r3 80 80 mr r25,r4 81 81 82 - bl 0f 82 + bcl 20,31,$+4 83 83 0: mflr r8 84 84 addis r3,r8,(is_second_reloc - 0b)@ha 85 85 lwz r19,(is_second_reloc - 0b)@l(r3) ··· 1132 1132 bne 1b 1133 1133 1134 1134 /* Get the tlb entry used by the current running code */ 1135 - bl 0f 1135 + bcl 20,31,$+4 1136 1136 0: mflr r4 1137 1137 tlbsx 0,r4 1138 1138 ··· 1166 1166 _GLOBAL(restore_to_as0) 1167 1167 mflr r0 1168 1168 1169 - bl 0f 1169 + bcl 20,31,$+4 1170 1170 0: mflr r9 1171 1171 addi r9,r9,1f - 0b 1172 1172
-1
arch/powerpc/kernel/hw_breakpoint.c
··· 22 22 #include <asm/processor.h> 23 23 #include <asm/sstep.h> 24 24 #include <asm/debug.h> 25 - #include <asm/debugfs.h> 26 25 #include <asm/hvcall.h> 27 26 #include <asm/inst.h> 28 27 #include <linux/uaccess.h>
+3 -9
arch/powerpc/kernel/interrupt.c
··· 8 8 #include <asm/asm-prototypes.h> 9 9 #include <asm/kup.h> 10 10 #include <asm/cputime.h> 11 - #include <asm/interrupt.h> 12 11 #include <asm/hw_irq.h> 13 12 #include <asm/interrupt.h> 14 13 #include <asm/kprobes.h> ··· 92 93 CT_WARN_ON(ct_state() == CONTEXT_KERNEL); 93 94 user_exit_irqoff(); 94 95 95 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x)) 96 - BUG_ON(!(regs->msr & MSR_RI)); 96 + BUG_ON(regs_is_unrecoverable(regs)); 97 97 BUG_ON(!(regs->msr & MSR_PR)); 98 98 BUG_ON(arch_irq_disabled_regs(regs)); 99 99 ··· 461 463 { 462 464 unsigned long ret; 463 465 464 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x)) 465 - BUG_ON(!(regs->msr & MSR_RI)); 466 - BUG_ON(!(regs->msr & MSR_PR)); 466 + BUG_ON(regs_is_unrecoverable(regs)); 467 467 BUG_ON(arch_irq_disabled_regs(regs)); 468 468 CT_WARN_ON(ct_state() == CONTEXT_USER); 469 469 ··· 492 496 bool stack_store = current_thread_info()->flags & 493 497 _TIF_EMULATE_STACK_STORE; 494 498 495 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x) && 496 - unlikely(!(regs->msr & MSR_RI))) 499 + if (regs_is_unrecoverable(regs)) 497 500 unrecoverable_exception(regs); 498 - BUG_ON(regs->msr & MSR_PR); 499 501 /* 500 502 * CT_WARN_ON comes here via program_check_exception, 501 503 * so avoid recursion.
+31 -30
arch/powerpc/kernel/iommu.c
··· 688 688 if (tbl->it_offset == 0) 689 689 set_bit(0, tbl->it_map); 690 690 691 + if (res_start < tbl->it_offset) 692 + res_start = tbl->it_offset; 693 + 694 + if (res_end > (tbl->it_offset + tbl->it_size)) 695 + res_end = tbl->it_offset + tbl->it_size; 696 + 697 + /* Check if res_start..res_end is a valid range in the table */ 698 + if (res_start >= res_end) { 699 + tbl->it_reserved_start = tbl->it_offset; 700 + tbl->it_reserved_end = tbl->it_offset; 701 + return; 702 + } 703 + 691 704 tbl->it_reserved_start = res_start; 692 705 tbl->it_reserved_end = res_end; 693 706 694 - /* Check if res_start..res_end isn't empty and overlaps the table */ 695 - if (res_start && res_end && 696 - (tbl->it_offset + tbl->it_size < res_start || 697 - res_end < tbl->it_offset)) 698 - return; 699 - 700 707 for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i) 701 708 set_bit(i - tbl->it_offset, tbl->it_map); 702 - } 703 - 704 - static void iommu_table_release_pages(struct iommu_table *tbl) 705 - { 706 - int i; 707 - 708 - /* 709 - * In case we have reserved the first bit, we should not emit 710 - * the warning below. 
711 - */ 712 - if (tbl->it_offset == 0) 713 - clear_bit(0, tbl->it_map); 714 - 715 - for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i) 716 - clear_bit(i - tbl->it_offset, tbl->it_map); 717 709 } 718 710 719 711 /* ··· 769 777 return tbl; 770 778 } 771 779 780 + bool iommu_table_in_use(struct iommu_table *tbl) 781 + { 782 + unsigned long start = 0, end; 783 + 784 + /* ignore reserved bit0 */ 785 + if (tbl->it_offset == 0) 786 + start = 1; 787 + end = tbl->it_reserved_start - tbl->it_offset; 788 + if (find_next_bit(tbl->it_map, end, start) != end) 789 + return true; 790 + 791 + start = tbl->it_reserved_end - tbl->it_offset; 792 + end = tbl->it_size; 793 + return find_next_bit(tbl->it_map, end, start) != end; 794 + } 795 + 772 796 static void iommu_table_free(struct kref *kref) 773 797 { 774 798 struct iommu_table *tbl; ··· 801 793 802 794 iommu_debugfs_del(tbl); 803 795 804 - iommu_table_release_pages(tbl); 805 - 806 796 /* verify that table contains no entries */ 807 - if (!bitmap_empty(tbl->it_map, tbl->it_size)) 797 + if (iommu_table_in_use(tbl)) 808 798 pr_warn("%s: Unexpected TCEs\n", __func__); 809 799 810 800 /* free bitmap */ ··· 1103 1097 for (i = 0; i < tbl->nr_pools; i++) 1104 1098 spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock); 1105 1099 1106 - iommu_table_release_pages(tbl); 1107 - 1108 - if (!bitmap_empty(tbl->it_map, tbl->it_size)) { 1100 + if (iommu_table_in_use(tbl)) { 1109 1101 pr_err("iommu_tce: it_map is not empty"); 1110 1102 ret = -EBUSY; 1111 - /* Undo iommu_table_release_pages, i.e. restore bit#0, etc */ 1112 - iommu_table_reserve_pages(tbl, tbl->it_reserved_start, 1113 - tbl->it_reserved_end); 1114 1103 } else { 1115 1104 memset(tbl->it_map, 0xff, sz); 1116 1105 }
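The iommu.c hunks replace the old release-then-recheck dance with a clamped reserved window plus iommu_table_in_use(), which scans only the bits outside [it_reserved_start, it_reserved_end) and skips the always-reserved bit 0. A small userspace model of that logic, using a byte-per-TCE map instead of a real bitmap (sizes and names are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Model: the reservation is clamped to the table's range, an empty or
 * out-of-range reservation collapses to an empty window at it_offset,
 * and "in use" means any bit set outside the reserved window. */
#define IT_SIZE 64

struct model_tbl {
	unsigned long offset;			/* it_offset */
	unsigned long res_start, res_end;	/* clamped reservation */
	unsigned char map[IT_SIZE];		/* one byte per TCE */
};

static void reserve_pages(struct model_tbl *t,
			  unsigned long res_start, unsigned long res_end)
{
	unsigned long i;

	if (t->offset == 0)
		t->map[0] = 1;			/* bit 0 is always reserved */

	if (res_start < t->offset)
		res_start = t->offset;
	if (res_end > t->offset + IT_SIZE)
		res_end = t->offset + IT_SIZE;

	if (res_start >= res_end) {		/* empty/invalid range */
		t->res_start = t->res_end = t->offset;
		return;
	}
	t->res_start = res_start;
	t->res_end = res_end;
	for (i = res_start; i < res_end; i++)
		t->map[i - t->offset] = 1;
}

static bool table_in_use(const struct model_tbl *t)
{
	unsigned long i, start = (t->offset == 0) ? 1 : 0;

	for (i = start; i < t->res_start - t->offset; i++)	/* below window */
		if (t->map[i])
			return true;
	for (i = t->res_end - t->offset; i < IT_SIZE; i++)	/* above window */
		if (t->map[i])
			return true;
	return false;
}

static bool demo_fresh_table_unused(void)
{
	struct model_tbl t = { .offset = 0 };

	reserve_pages(&t, 10, 20);
	return !table_in_use(&t);	/* reserved bits don't count as in use */
}

static bool demo_mapping_detected(void)
{
	struct model_tbl t = { .offset = 0 };

	reserve_pages(&t, 10, 20);
	t.map[30] = 1;			/* a mapping outside the window */
	return table_in_use(&t);
}
```

Checking around the window rather than clearing and re-setting the reserved bits is what lets iommu_table_free() and the take-ownership path drop the old iommu_table_release_pages() helper.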
+14
arch/powerpc/kernel/kdebugfs.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/debugfs.h> 3 + #include <linux/export.h> 4 + #include <linux/init.h> 5 + 6 + struct dentry *arch_debugfs_dir; 7 + EXPORT_SYMBOL(arch_debugfs_dir); 8 + 9 + static int __init arch_kdebugfs_init(void) 10 + { 11 + arch_debugfs_dir = debugfs_create_dir("powerpc", NULL); 12 + return 0; 13 + } 14 + arch_initcall(arch_kdebugfs_init);
+1 -1
arch/powerpc/kernel/misc.S
··· 29 29 li r3, 0 30 30 _GLOBAL(add_reloc_offset) 31 31 mflr r0 32 - bl 1f 32 + bcl 20,31,$+4 33 33 1: mflr r5 34 34 PPC_LL r4,(2f-1b)(r5) 35 35 subf r5,r4,r5
+2 -2
arch/powerpc/kernel/misc_32.S
··· 67 67 srwi. r8,r8,2 68 68 beqlr 69 69 mtctr r8 70 - bl 1f 70 + bcl 20,31,$+4 71 71 1: mflr r0 72 72 lis r4,1b@ha 73 73 addi r4,r4,1b@l ··· 237 237 addi r3,r3,-4 238 238 239 239 0: twnei r5, 0 /* WARN if r3 is not cache aligned */ 240 - EMIT_BUG_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING 240 + EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING 241 241 242 242 addi r4,r4,-4 243 243
+1 -1
arch/powerpc/kernel/misc_64.S
··· 255 255 * Physical (hardware) cpu id should be in r3. 256 256 */ 257 257 _GLOBAL(kexec_wait) 258 - bl 1f 258 + bcl 20,31,$+4 259 259 1: mflr r5 260 260 addi r5,r5,kexec_flag-1b 261 261
+6
arch/powerpc/kernel/pci-common.c
··· 29 29 #include <linux/slab.h> 30 30 #include <linux/vgaarb.h> 31 31 #include <linux/numa.h> 32 + #include <linux/msi.h> 32 33 33 34 #include <asm/processor.h> 34 35 #include <asm/io.h> ··· 1061 1060 1062 1061 int pcibios_add_device(struct pci_dev *dev) 1063 1062 { 1063 + struct irq_domain *d; 1064 + 1064 1065 #ifdef CONFIG_PCI_IOV 1065 1066 if (ppc_md.pcibios_fixup_sriov) 1066 1067 ppc_md.pcibios_fixup_sriov(dev); 1067 1068 #endif /* CONFIG_PCI_IOV */ 1068 1069 1070 + d = dev_get_msi_domain(&dev->bus->dev); 1071 + if (d) 1072 + dev_set_msi_domain(&dev->dev, d); 1069 1073 return 0; 1070 1074 } 1071 1075
+1 -1
arch/powerpc/kernel/process.c
··· 1499 1499 trap == INTERRUPT_DATA_STORAGE || 1500 1500 trap == INTERRUPT_ALIGNMENT) { 1501 1501 if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE)) 1502 - pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr); 1502 + pr_cont("DEAR: "REG" ESR: "REG" ", regs->dear, regs->esr); 1503 1503 else 1504 1504 pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr); 1505 1505 }
+3 -2
arch/powerpc/kernel/prom.c
··· 640 640 } 641 641 #endif /* CONFIG_BLK_DEV_INITRD */ 642 642 643 - #ifdef CONFIG_PPC32 643 + if (!IS_ENABLED(CONFIG_PPC32)) 644 + return; 645 + 644 646 /* 645 647 * Handle the case where we might be booting from an old kexec 646 648 * image that setup the mem_rsvmap as pairs of 32-bit values ··· 663 661 } 664 662 return; 665 663 } 666 - #endif 667 664 } 668 665 669 666 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+2 -1
arch/powerpc/kernel/prom_init.c
··· 1096 1096 #else 1097 1097 0, 1098 1098 #endif 1099 - .associativity = OV5_FEAT(OV5_TYPE1_AFFINITY) | OV5_FEAT(OV5_PRRN), 1099 + .associativity = OV5_FEAT(OV5_FORM1_AFFINITY) | OV5_FEAT(OV5_PRRN) | 1100 + OV5_FEAT(OV5_FORM2_AFFINITY), 1100 1101 .bin_opts = OV5_FEAT(OV5_RESIZE_HPT) | OV5_FEAT(OV5_HP_EVT), 1101 1102 .micro_checkpoint = 0, 1102 1103 .reserved0 = 0,
+4
arch/powerpc/kernel/ptrace/ptrace.c
··· 373 373 offsetof(struct user_pt_regs, trap)); 374 374 BUILD_BUG_ON(offsetof(struct pt_regs, dar) != 375 375 offsetof(struct user_pt_regs, dar)); 376 + BUILD_BUG_ON(offsetof(struct pt_regs, dear) != 377 + offsetof(struct user_pt_regs, dar)); 376 378 BUILD_BUG_ON(offsetof(struct pt_regs, dsisr) != 379 + offsetof(struct user_pt_regs, dsisr)); 380 + BUILD_BUG_ON(offsetof(struct pt_regs, esr) != 377 381 offsetof(struct user_pt_regs, dsisr)); 378 382 BUILD_BUG_ON(offsetof(struct pt_regs, result) != 379 383 offsetof(struct user_pt_regs, result));
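The new BUILD_BUG_ONs pin pt_regs::dear and ::esr to the same offsets as dar and dsisr in user_pt_regs, matching the asm-offsets.c and exceptions-64e.S hunks that stop overloading DAR/DSISR to hold Book-E's DEAR/ESR. The effect can be modelled with anonymous unions (a sketch only; model_regs is not the kernel's struct):

```c
#include <stddef.h>

/* 4xx/Book-E parts report fault state in DEAR/ESR rather than DAR/DSISR;
 * giving pt_regs both names for the same storage keeps the user-visible
 * layout (and the ptrace ABI) unchanged while letting kernel code use
 * the architecturally correct name. */
struct model_regs {
	unsigned long nip;
	union { unsigned long dar;   unsigned long dear; };
	union { unsigned long dsisr; unsigned long esr;  };
};

static int offsets_alias(void)
{
	/* mirrors what the BUILD_BUG_ON checks above enforce */
	return offsetof(struct model_regs, dar) ==
		       offsetof(struct model_regs, dear) &&
	       offsetof(struct model_regs, dsisr) ==
		       offsetof(struct model_regs, esr);
}

static unsigned long read_back_as_dar(unsigned long fault_addr)
{
	struct model_regs regs = { 0 };

	regs.dear = fault_addr;		/* written under the Book-E name... */
	return regs.dar;		/* ...readable under the classic name */
}
```

This is also why the exceptions-64e.S hunks can switch from `std r14,_DAR(r1)` to `std r14,_DEAR(r1)` without changing the stack frame layout: both offsets resolve to the same slot.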
+1 -1
arch/powerpc/kernel/reloc_32.S
··· 30 30 _GLOBAL(relocate) 31 31 32 32 mflr r0 /* Save our LR */ 33 - bl 0f /* Find our current runtime address */ 33 + bcl 20,31,$+4 /* Find our current runtime address */ 34 34 0: mflr r12 /* Make it accessible */ 35 35 mtlr r0 36 36
+2 -2
arch/powerpc/kernel/rtasd.c
··· 429 429 430 430 do_event_scan(); 431 431 432 - get_online_cpus(); 432 + cpus_read_lock(); 433 433 434 434 /* raw_ OK because just using CPU as starting point. */ 435 435 cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); ··· 451 451 schedule_delayed_work_on(cpu, &event_scan_work, 452 452 __round_jiffies_relative(event_scan_delay, cpu)); 453 453 454 - put_online_cpus(); 454 + cpus_read_unlock(); 455 455 } 456 456 457 457 #ifdef CONFIG_PPC64
+8 -8
arch/powerpc/kernel/security.c
··· 11 11 #include <linux/nospec.h> 12 12 #include <linux/prctl.h> 13 13 #include <linux/seq_buf.h> 14 + #include <linux/debugfs.h> 14 15 15 16 #include <asm/asm-prototypes.h> 16 17 #include <asm/code-patching.h> 17 - #include <asm/debugfs.h> 18 18 #include <asm/security_features.h> 19 19 #include <asm/setup.h> 20 20 #include <asm/inst.h> ··· 106 106 static __init int barrier_nospec_debugfs_init(void) 107 107 { 108 108 debugfs_create_file_unsafe("barrier_nospec", 0600, 109 - powerpc_debugfs_root, NULL, 109 + arch_debugfs_dir, NULL, 110 110 &fops_barrier_nospec); 111 111 return 0; 112 112 } ··· 114 114 115 115 static __init int security_feature_debugfs_init(void) 116 116 { 117 - debugfs_create_x64("security_features", 0400, powerpc_debugfs_root, 117 + debugfs_create_x64("security_features", 0400, arch_debugfs_dir, 118 118 &powerpc_security_features); 119 119 return 0; 120 120 } ··· 420 420 421 421 static __init int stf_barrier_debugfs_init(void) 422 422 { 423 - debugfs_create_file_unsafe("stf_barrier", 0600, powerpc_debugfs_root, 423 + debugfs_create_file_unsafe("stf_barrier", 0600, arch_debugfs_dir, 424 424 NULL, &fops_stf_barrier); 425 425 return 0; 426 426 } ··· 748 748 static __init int count_cache_flush_debugfs_init(void) 749 749 { 750 750 debugfs_create_file_unsafe("count_cache_flush", 0600, 751 - powerpc_debugfs_root, NULL, 751 + arch_debugfs_dir, NULL, 752 752 &fops_count_cache_flush); 753 753 return 0; 754 754 } ··· 834 834 835 835 static __init int rfi_flush_debugfs_init(void) 836 836 { 837 - debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush); 838 - debugfs_create_file("entry_flush", 0600, powerpc_debugfs_root, NULL, &fops_entry_flush); 839 - debugfs_create_file("uaccess_flush", 0600, powerpc_debugfs_root, NULL, &fops_uaccess_flush); 837 + debugfs_create_file("rfi_flush", 0600, arch_debugfs_dir, NULL, &fops_rfi_flush); 838 + debugfs_create_file("entry_flush", 0600, arch_debugfs_dir, NULL, &fops_entry_flush); 839 + 
debugfs_create_file("uaccess_flush", 0600, arch_debugfs_dir, NULL, &fops_uaccess_flush); 840 840 return 0; 841 841 } 842 842 device_initcall(rfi_flush_debugfs_init);
-13
arch/powerpc/kernel/setup-common.c
··· 33 33 #include <linux/of_platform.h> 34 34 #include <linux/hugetlb.h> 35 35 #include <linux/pgtable.h> 36 - #include <asm/debugfs.h> 37 36 #include <asm/io.h> 38 37 #include <asm/paca.h> 39 38 #include <asm/prom.h> ··· 771 772 772 773 late_initcall(check_cache_coherency); 773 774 #endif /* CONFIG_CHECK_CACHE_COHERENCY */ 774 - 775 - #ifdef CONFIG_DEBUG_FS 776 - struct dentry *powerpc_debugfs_root; 777 - EXPORT_SYMBOL(powerpc_debugfs_root); 778 - 779 - static int powerpc_debugfs_init(void) 780 - { 781 - powerpc_debugfs_root = debugfs_create_dir("powerpc", NULL); 782 - return 0; 783 - } 784 - arch_initcall(powerpc_debugfs_init); 785 - #endif 786 775 787 776 void ppc_printk_progress(char *s, unsigned short hex) 788 777 {
-1
arch/powerpc/kernel/setup_64.c
··· 32 32 #include <linux/nmi.h> 33 33 #include <linux/pgtable.h> 34 34 35 - #include <asm/debugfs.h> 36 35 #include <asm/kvm_guest.h> 37 36 #include <asm/io.h> 38 37 #include <asm/kdump.h>
+64 -38
arch/powerpc/kernel/smp.c
··· 78 78 bool has_big_cores; 79 79 bool coregroup_enabled; 80 80 bool thread_group_shares_l2; 81 + bool thread_group_shares_l3; 81 82 82 83 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map); 83 84 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map); ··· 102 101 103 102 #define MAX_THREAD_LIST_SIZE 8 104 103 #define THREAD_GROUP_SHARE_L1 1 105 - #define THREAD_GROUP_SHARE_L2 2 104 + #define THREAD_GROUP_SHARE_L2_L3 2 106 105 struct thread_groups { 107 106 unsigned int property; 108 107 unsigned int nr_groups; ··· 123 122 * On big-cores system, thread_group_l1_cache_map for each CPU corresponds to 124 123 * the set its siblings that share the L1-cache. 125 124 */ 126 - static DEFINE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 125 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 127 126 128 127 /* 129 128 * On some big-cores system, thread_group_l2_cache_map for each CPU 130 129 * corresponds to the set its siblings within the core that share the 131 130 * L2-cache. 132 131 */ 133 - static DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 132 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 133 + 134 + /* 135 + * On P10, thread_group_l3_cache_map for each CPU is equal to the 136 + * thread_group_l2_cache_map 137 + */ 138 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map); 134 139 135 140 /* SMP operations for this machine */ 136 141 struct smp_ops_t *smp_ops; ··· 896 889 return tg; 897 890 } 898 891 899 - static int __init init_thread_group_cache_map(int cpu, int cache_property) 900 - 892 + static int update_mask_from_threadgroup(cpumask_var_t *mask, struct thread_groups *tg, int cpu, int cpu_group_start) 901 893 { 902 894 int first_thread = cpu_first_thread_sibling(cpu); 903 - int i, cpu_group_start = -1, err = 0; 904 - struct thread_groups *tg = NULL; 905 - cpumask_var_t *mask = NULL; 906 - 907 - if (cache_property != THREAD_GROUP_SHARE_L1 && 908 - cache_property != THREAD_GROUP_SHARE_L2) 909 - return -EINVAL; 910 - 911 - tg = 
get_thread_groups(cpu, cache_property, &err); 912 - if (!tg) 913 - return err; 914 - 915 - cpu_group_start = get_cpu_thread_group_start(cpu, tg); 916 - 917 - if (unlikely(cpu_group_start == -1)) { 918 - WARN_ON_ONCE(1); 919 - return -ENODATA; 920 - } 921 - 922 - if (cache_property == THREAD_GROUP_SHARE_L1) 923 - mask = &per_cpu(thread_group_l1_cache_map, cpu); 924 - else if (cache_property == THREAD_GROUP_SHARE_L2) 925 - mask = &per_cpu(thread_group_l2_cache_map, cpu); 895 + int i; 926 896 927 897 zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu)); 928 898 ··· 914 930 if (i_group_start == cpu_group_start) 915 931 cpumask_set_cpu(i, *mask); 916 932 } 933 + 934 + return 0; 935 + } 936 + 937 + static int __init init_thread_group_cache_map(int cpu, int cache_property) 938 + 939 + { 940 + int cpu_group_start = -1, err = 0; 941 + struct thread_groups *tg = NULL; 942 + cpumask_var_t *mask = NULL; 943 + 944 + if (cache_property != THREAD_GROUP_SHARE_L1 && 945 + cache_property != THREAD_GROUP_SHARE_L2_L3) 946 + return -EINVAL; 947 + 948 + tg = get_thread_groups(cpu, cache_property, &err); 949 + 950 + if (!tg) 951 + return err; 952 + 953 + cpu_group_start = get_cpu_thread_group_start(cpu, tg); 954 + 955 + if (unlikely(cpu_group_start == -1)) { 956 + WARN_ON_ONCE(1); 957 + return -ENODATA; 958 + } 959 + 960 + if (cache_property == THREAD_GROUP_SHARE_L1) { 961 + mask = &per_cpu(thread_group_l1_cache_map, cpu); 962 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 963 + } 964 + else if (cache_property == THREAD_GROUP_SHARE_L2_L3) { 965 + mask = &per_cpu(thread_group_l2_cache_map, cpu); 966 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 967 + mask = &per_cpu(thread_group_l3_cache_map, cpu); 968 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 969 + } 970 + 917 971 918 972 return 0; 919 973 } ··· 1042 1020 has_big_cores = true; 1043 1021 1044 1022 for_each_possible_cpu(cpu) { 1045 - int err = init_thread_group_cache_map(cpu, 
THREAD_GROUP_SHARE_L2); 1023 + int err = init_thread_group_cache_map(cpu, THREAD_GROUP_SHARE_L2_L3); 1046 1024 1047 1025 if (err) 1048 1026 return err; 1049 1027 } 1050 1028 1051 1029 thread_group_shares_l2 = true; 1052 - pr_debug("L2 cache only shared by the threads in the small core\n"); 1030 + thread_group_shares_l3 = true; 1031 + pr_debug("L2/L3 cache only shared by the threads in the small core\n"); 1032 + 1053 1033 return 0; 1054 1034 } 1055 1035 ··· 1109 1085 } 1110 1086 1111 1087 if (cpu_to_chip_id(boot_cpuid) != -1) { 1112 - int idx = num_possible_cpus() / threads_per_core; 1088 + int idx = DIV_ROUND_UP(num_possible_cpus(), threads_per_core); 1113 1089 1114 1090 /* 1115 1091 * All threads of a core will all belong to the same core, ··· 1400 1376 l2_cache = cpu_to_l2cache(cpu); 1401 1377 if (!l2_cache || !*mask) { 1402 1378 /* Assume only core siblings share cache with this CPU */ 1403 - for_each_cpu(i, submask_fn(cpu)) 1379 + for_each_cpu(i, cpu_sibling_mask(cpu)) 1404 1380 set_cpus_related(cpu, i, cpu_l2_cache_mask); 1405 1381 1406 1382 return false; ··· 1441 1417 { 1442 1418 struct cpumask *(*mask_fn)(int) = cpu_sibling_mask; 1443 1419 int i; 1420 + 1421 + unmap_cpu_from_node(cpu); 1444 1422 1445 1423 if (shared_caches) 1446 1424 mask_fn = cpu_l2_cache_mask; ··· 1528 1502 * This CPU will not be in the online mask yet so we need to manually 1529 1503 * add it to it's own thread sibling mask. 
1530 1504 */ 1505 + map_cpu_to_node(cpu, cpu_to_node(cpu)); 1531 1506 cpumask_set_cpu(cpu, cpu_sibling_mask(cpu)); 1507 + cpumask_set_cpu(cpu, cpu_core_mask(cpu)); 1532 1508 1533 1509 for (i = first_thread; i < first_thread + threads_per_core; i++) 1534 1510 if (cpu_online(i)) ··· 1548 1520 if (chip_id_lookup_table && ret) 1549 1521 chip_id = cpu_to_chip_id(cpu); 1550 1522 1551 - if (chip_id == -1) { 1552 - cpumask_copy(per_cpu(cpu_core_map, cpu), cpu_cpu_mask(cpu)); 1553 - goto out; 1554 - } 1555 - 1556 1523 if (shared_caches) 1557 1524 submask_fn = cpu_l2_cache_mask; 1558 1525 ··· 1556 1533 1557 1534 /* Skip all CPUs already part of current CPU core mask */ 1558 1535 cpumask_andnot(mask, cpu_online_mask, cpu_core_mask(cpu)); 1536 + 1537 + /* If chip_id is -1; limit the cpu_core_mask to within DIE*/ 1538 + if (chip_id == -1) 1539 + cpumask_and(mask, mask, cpu_cpu_mask(cpu)); 1559 1540 1560 1541 for_each_cpu(i, mask) { 1561 1542 if (chip_id == cpu_to_chip_id(i)) { ··· 1570 1543 } 1571 1544 } 1572 1545 1573 - out: 1574 1546 free_cpumask_var(mask); 1575 1547 } 1576 1548
+1
arch/powerpc/kernel/stacktrace.c
··· 8 8 * Copyright 2018 Nick Piggin, Michael Ellerman, IBM Corp. 9 9 */ 10 10 11 + #include <linux/delay.h> 11 12 #include <linux/export.h> 12 13 #include <linux/kallsyms.h> 13 14 #include <linux/module.h>
+4 -11
arch/powerpc/kernel/syscalls.c
··· 41 41 unsigned long prot, unsigned long flags, 42 42 unsigned long fd, unsigned long off, int shift) 43 43 { 44 - long ret = -EINVAL; 45 - 46 44 if (!arch_validate_prot(prot, addr)) 47 - goto out; 45 + return -EINVAL; 48 46 49 - if (shift) { 50 - if (off & ((1 << shift) - 1)) 51 - goto out; 52 - off >>= shift; 53 - } 47 + if (!IS_ALIGNED(off, 1 << shift)) 48 + return -EINVAL; 54 49 55 - ret = ksys_mmap_pgoff(addr, len, prot, flags, fd, off); 56 - out: 57 - return ret; 50 + return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> shift); 58 51 } 59 52 60 53 SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
+1 -1
arch/powerpc/kernel/tau_6xx.c
··· 164 164 queue_work(tau_workq, work); 165 165 } 166 166 167 - DECLARE_WORK(tau_work, tau_work_func); 167 + static DECLARE_WORK(tau_work, tau_work_func); 168 168 169 169 /* 170 170 * setup the TAU
+1 -2
arch/powerpc/kernel/time.c
··· 31 31 #include <linux/export.h> 32 32 #include <linux/sched.h> 33 33 #include <linux/sched/clock.h> 34 + #include <linux/sched/cputime.h> 34 35 #include <linux/kernel.h> 35 36 #include <linux/param.h> 36 37 #include <linux/string.h> ··· 53 52 #include <linux/irq_work.h> 54 53 #include <linux/of_clk.h> 55 54 #include <linux/suspend.h> 56 - #include <linux/sched/cputime.h> 57 - #include <linux/sched/clock.h> 58 55 #include <linux/processor.h> 59 56 #include <asm/trace.h> 60 57
+14 -9
arch/powerpc/kernel/traps.c
··· 37 37 #include <linux/smp.h> 38 38 #include <linux/console.h> 39 39 #include <linux/kmsg_dump.h> 40 + #include <linux/debugfs.h> 40 41 41 42 #include <asm/emulated_ops.h> 42 43 #include <linux/uaccess.h> 43 - #include <asm/debugfs.h> 44 44 #include <asm/interrupt.h> 45 45 #include <asm/io.h> 46 46 #include <asm/machdep.h> ··· 427 427 return; 428 428 429 429 nonrecoverable: 430 - regs_set_return_msr(regs, regs->msr & ~MSR_RI); 430 + regs_set_unrecoverable(regs); 431 431 #endif 432 432 } 433 433 DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception) ··· 497 497 die("Unrecoverable nested System Reset", regs, SIGABRT); 498 498 #endif 499 499 /* Must die if the interrupt is not recoverable */ 500 - if (!(regs->msr & MSR_RI)) { 500 + if (regs_is_unrecoverable(regs)) { 501 501 /* For the reason explained in die_mce, nmi_exit before die */ 502 502 nmi_exit(); 503 503 die("Unrecoverable System Reset", regs, SIGABRT); ··· 549 549 printk(KERN_DEBUG "%s bad port %lx at %p\n", 550 550 (*nip & 0x100)? "OUT to": "IN from", 551 551 regs->gpr[rb] - _IO_BASE, nip); 552 - regs_set_return_msr(regs, regs->msr | MSR_RI); 552 + regs_set_recoverable(regs); 553 553 regs_set_return_ip(regs, extable_fixup(entry)); 554 554 return 1; 555 555 } ··· 561 561 #ifdef CONFIG_PPC_ADV_DEBUG_REGS 562 562 /* On 4xx, the reason for the machine check or program exception 563 563 is in the ESR. 
*/ 564 - #define get_reason(regs) ((regs)->dsisr) 564 + #define get_reason(regs) ((regs)->esr) 565 565 #define REASON_FP ESR_FP 566 566 #define REASON_ILLEGAL (ESR_PIL | ESR_PUO) 567 567 #define REASON_PRIVILEGED ESR_PPR ··· 839 839 840 840 bail: 841 841 /* Must die if the interrupt is not recoverable */ 842 - if (!(regs->msr & MSR_RI)) 842 + if (regs_is_unrecoverable(regs)) 843 843 die_mce("Unrecoverable Machine check", regs, SIGBUS); 844 844 845 845 #ifdef CONFIG_PPC_BOOK3S_64 ··· 1481 1481 1482 1482 if (!(regs->msr & MSR_PR) && /* not user-mode */ 1483 1483 report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) { 1484 - regs_add_return_ip(regs, 4); 1485 - return; 1484 + const struct exception_table_entry *entry; 1485 + 1486 + entry = search_exception_tables(bugaddr); 1487 + if (entry) { 1488 + regs_set_return_ip(regs, extable_fixup(entry) + regs->nip - bugaddr); 1489 + return; 1490 + } 1486 1491 } 1487 1492 _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); 1488 1493 return; ··· 2276 2271 struct ppc_emulated_entry *entries = (void *)&ppc_emulated; 2277 2272 2278 2273 dir = debugfs_create_dir("emulated_instructions", 2279 - powerpc_debugfs_root); 2274 + arch_debugfs_dir); 2280 2275 2281 2276 debugfs_create_u32("do_warn", 0644, dir, &ppc_warn_emulated); 2282 2277
+1 -3
arch/powerpc/kernel/vector.S
··· 65 65 1: 66 66 /* enable use of VMX after return */ 67 67 #ifdef CONFIG_PPC32 68 - mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */ 68 + addi r5,r2,THREAD 69 69 oris r9,r9,MSR_VEC@h 70 - tovirt(r5, r5) 71 70 #else 72 71 ld r4,PACACURRENT(r13) 73 72 addi r5,r4,THREAD /* Get THREAD */ ··· 80 81 li r4,1 81 82 stb r4,THREAD_LOAD_VEC(r5) 82 83 addi r6,r5,THREAD_VRSTATE 83 - li r4,1 84 84 li r10,VRSTATE_VSCR 85 85 stw r4,THREAD_USED_VR(r5) 86 86 lvx v0,r10,r6
+7 -3
arch/powerpc/kexec/core_64.c
··· 64 64 begin = image->segment[i].mem; 65 65 end = begin + image->segment[i].memsz; 66 66 67 - if ((begin < high) && (end > low)) 67 + if ((begin < high) && (end > low)) { 68 + of_node_put(node); 68 69 return -ETXTBSY; 70 + } 69 71 } 70 72 } 71 73 72 74 return 0; 73 75 } 74 76 75 - static void copy_segments(unsigned long ind) 77 + /* Called during kexec sequence with MMU off */ 78 + static notrace void copy_segments(unsigned long ind) 76 79 { 77 80 unsigned long entry; 78 81 unsigned long *ptr; ··· 108 105 } 109 106 } 110 107 111 - void kexec_copy_flush(struct kimage *image) 108 + /* Called during kexec sequence with MMU off */ 109 + notrace void kexec_copy_flush(struct kimage *image) 112 110 { 113 111 long i, nr_segments = image->nr_segments; 114 112 struct kexec_segment ranges[KEXEC_SEGMENT_MAX];
+6 -6
arch/powerpc/kexec/relocate_32.S
··· 93 93 * Invalidate all the TLB entries except the current entry 94 94 * where we are running from 95 95 */ 96 - bl 0f /* Find our address */ 96 + bcl 20,31,$+4 /* Find our address */ 97 97 0: mflr r5 /* Make it accessible */ 98 98 tlbsx r23,0,r5 /* Find entry we are in */ 99 99 li r4,0 /* Start at TLB entry 0 */ ··· 158 158 /* Switch to other address space in MSR */ 159 159 insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */ 160 160 161 - bl 1f 161 + bcl 20,31,$+4 162 162 1: mflr r8 163 163 addi r8, r8, (2f-1b) /* Find the target offset */ 164 164 ··· 202 202 li r9,0 203 203 insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */ 204 204 205 - bl 1f 205 + bcl 20,31,$+4 206 206 1: mflr r8 207 207 and r8, r8, r11 /* Get our offset within page */ 208 208 addi r8, r8, (2f-1b) ··· 240 240 sync 241 241 242 242 /* Find the entry we are running from */ 243 - bl 2f 243 + bcl 20,31,$+4 244 244 2: mflr r23 245 245 tlbsx r23, 0, r23 246 246 tlbre r24, r23, 0 /* TLB Word 0 */ ··· 296 296 /* Update the msr to the new TS */ 297 297 insrwi r5, r7, 1, 26 298 298 299 - bl 1f 299 + bcl 20,31,$+4 300 300 1: mflr r6 301 301 addi r6, r6, (2f-1b) 302 302 ··· 355 355 /* Defaults to 256M */ 356 356 lis r10, 0x1000 357 357 358 - bl 1f 358 + bcl 20,31,$+4 359 359 1: mflr r4 360 360 addi r4, r4, (2f-1b) /* virtual address of 2f */ 361 361
-1
arch/powerpc/kvm/Kconfig
··· 38 38 config KVM_BOOK3S_64_HANDLER 39 39 bool 40 40 select KVM_BOOK3S_HANDLER 41 - select PPC_DAWR_FORCE_ENABLE 42 41 43 42 config KVM_BOOK3S_PR_POSSIBLE 44 43 bool
+2 -1
arch/powerpc/kvm/book3s.h
··· 23 23 extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, 24 24 int sprn, ulong *spr_val); 25 25 extern int kvmppc_book3s_init_pr(void); 26 - extern void kvmppc_book3s_exit_pr(void); 26 + void kvmppc_book3s_exit_pr(void); 27 + extern int kvmppc_handle_exit_pr(struct kvm_vcpu *vcpu, unsigned int exit_nr); 27 28 28 29 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 29 30 extern void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val);
+1 -2
arch/powerpc/kvm/book3s_64_mmu.c
··· 196 196 hva_t ptegp; 197 197 u64 pteg[16]; 198 198 u64 avpn = 0; 199 - u64 v, r; 199 + u64 r; 200 200 u64 v_val, v_mask; 201 201 u64 eaddr_mask; 202 202 int i; ··· 285 285 goto do_second; 286 286 } 287 287 288 - v = be64_to_cpu(pteg[i]); 289 288 r = be64_to_cpu(pteg[i+1]); 290 289 pp = (r & HPTE_R_PP) | key; 291 290 if (r & HPTE_R_PP0)
+7 -5
arch/powerpc/kvm/book3s_64_mmu_radix.c
··· 44 44 (to != NULL) ? __pa(to): 0, 45 45 (from != NULL) ? __pa(from): 0, n); 46 46 47 + if (eaddr & (0xFFFUL << 52)) 48 + return ret; 49 + 47 50 quadrant = 1; 48 51 if (!pid) 49 52 quadrant = 2; ··· 68 65 } 69 66 isync(); 70 67 68 + pagefault_disable(); 71 69 if (is_load) 72 - ret = copy_from_user_nofault(to, (const void __user *)from, n); 70 + ret = __copy_from_user_inatomic(to, (const void __user *)from, n); 73 71 else 74 - ret = copy_to_user_nofault((void __user *)to, from, n); 72 + ret = __copy_to_user_inatomic((void __user *)to, from, n); 73 + pagefault_enable(); 75 74 76 75 /* switch the pid first to avoid running host with unallocated pid */ 77 76 if (quadrant == 1 && pid != old_pid) ··· 86 81 87 82 return ret; 88 83 } 89 - EXPORT_SYMBOL_GPL(__kvmhv_copy_tofrom_guest_radix); 90 84 91 85 static long kvmhv_copy_tofrom_guest_radix(struct kvm_vcpu *vcpu, gva_t eaddr, 92 86 void *to, void *from, unsigned long n) ··· 121 117 122 118 return ret; 123 119 } 124 - EXPORT_SYMBOL_GPL(kvmhv_copy_from_guest_radix); 125 120 126 121 long kvmhv_copy_to_guest_radix(struct kvm_vcpu *vcpu, gva_t eaddr, void *from, 127 122 unsigned long n) 128 123 { 129 124 return kvmhv_copy_tofrom_guest_radix(vcpu, eaddr, NULL, from, n); 130 125 } 131 - EXPORT_SYMBOL_GPL(kvmhv_copy_to_guest_radix); 132 126 133 127 int kvmppc_mmu_walk_radix_tree(struct kvm_vcpu *vcpu, gva_t eaddr, 134 128 struct kvmppc_pte *gpte, u64 root,
+6 -3
arch/powerpc/kvm/book3s_64_vio_hv.c
··· 173 173 idx -= stt->offset; 174 174 page = stt->pages[idx / TCES_PER_PAGE]; 175 175 /* 176 - * page must not be NULL in real mode, 177 - * kvmppc_rm_ioba_validate() must have taken care of this. 176 + * kvmppc_rm_ioba_validate() allows pages not be allocated if TCE is 177 + * being cleared, otherwise it returns H_TOO_HARD and we skip this. 178 178 */ 179 - WARN_ON_ONCE_RM(!page); 179 + if (!page) { 180 + WARN_ON_ONCE_RM(tce != 0); 181 + return; 182 + } 180 183 tbl = kvmppc_page_address(page); 181 184 182 185 tbl[idx % TCES_PER_PAGE] = tce;
+86 -22
arch/powerpc/kvm/book3s_hv.c
··· 59 59 #include <asm/kvm_book3s.h> 60 60 #include <asm/mmu_context.h> 61 61 #include <asm/lppaca.h> 62 + #include <asm/pmc.h> 62 63 #include <asm/processor.h> 63 64 #include <asm/cputhreads.h> 64 65 #include <asm/page.h> ··· 1166 1165 break; 1167 1166 #endif 1168 1167 case H_RANDOM: 1169 - if (!powernv_get_random_long(&vcpu->arch.regs.gpr[4])) 1168 + if (!arch_get_random_seed_long(&vcpu->arch.regs.gpr[4])) 1170 1169 ret = H_HARDWARE; 1171 1170 break; 1172 1171 case H_RPT_INVALIDATE: ··· 1680 1679 r = RESUME_GUEST; 1681 1680 } 1682 1681 break; 1682 + 1683 + #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 1684 + case BOOK3S_INTERRUPT_HV_SOFTPATCH: 1685 + /* 1686 + * This occurs for various TM-related instructions that 1687 + * we need to emulate on POWER9 DD2.2. We have already 1688 + * handled the cases where the guest was in real-suspend 1689 + * mode and was transitioning to transactional state. 1690 + */ 1691 + r = kvmhv_p9_tm_emulation(vcpu); 1692 + if (r != -1) 1693 + break; 1694 + fallthrough; /* go to facility unavailable handler */ 1695 + #endif 1696 + 1683 1697 /* 1684 1698 * This occurs if the guest (kernel or userspace), does something that 1685 1699 * is prohibited by HFSCR. ··· 1712 1696 r = RESUME_GUEST; 1713 1697 } 1714 1698 break; 1715 - 1716 - #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 1717 - case BOOK3S_INTERRUPT_HV_SOFTPATCH: 1718 - /* 1719 - * This occurs for various TM-related instructions that 1720 - * we need to emulate on POWER9 DD2.2. We have already 1721 - * handled the cases where the guest was in real-suspend 1722 - * mode and was transitioning to transactional state. 
1723 - */ 1724 - r = kvmhv_p9_tm_emulation(vcpu); 1725 - break; 1726 - #endif 1727 1699 1728 1700 case BOOK3S_INTERRUPT_HV_RM_HARD: 1729 1701 r = RESUME_PASSTHROUGH; ··· 1731 1727 1732 1728 static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu) 1733 1729 { 1730 + struct kvm_nested_guest *nested = vcpu->arch.nested; 1734 1731 int r; 1735 1732 int srcu_idx; 1736 1733 ··· 1816 1811 * mode and was transitioning to transactional state. 1817 1812 */ 1818 1813 r = kvmhv_p9_tm_emulation(vcpu); 1819 - break; 1814 + if (r != -1) 1815 + break; 1816 + fallthrough; /* go to facility unavailable handler */ 1820 1817 #endif 1818 + 1819 + case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: { 1820 + u64 cause = vcpu->arch.hfscr >> 56; 1821 + 1822 + /* 1823 + * Only pass HFU interrupts to the L1 if the facility is 1824 + * permitted but disabled by the L1's HFSCR, otherwise 1825 + * the interrupt does not make sense to the L1 so turn 1826 + * it into a HEAI. 1827 + */ 1828 + if (!(vcpu->arch.hfscr_permitted & (1UL << cause)) || 1829 + (nested->hfscr & (1UL << cause))) { 1830 + vcpu->arch.trap = BOOK3S_INTERRUPT_H_EMUL_ASSIST; 1831 + 1832 + /* 1833 + * If the fetch failed, return to guest and 1834 + * try executing it again. 
1835 + */ 1836 + r = kvmppc_get_last_inst(vcpu, INST_GENERIC, 1837 + &vcpu->arch.emul_inst); 1838 + if (r != EMULATE_DONE) 1839 + r = RESUME_GUEST; 1840 + else 1841 + r = RESUME_HOST; 1842 + } else { 1843 + r = RESUME_HOST; 1844 + } 1845 + 1846 + break; 1847 + } 1821 1848 1822 1849 case BOOK3S_INTERRUPT_HV_RM_HARD: 1823 1850 vcpu->arch.trap = 0; ··· 2721 2684 spin_lock_init(&vcpu->arch.vpa_update_lock); 2722 2685 spin_lock_init(&vcpu->arch.tbacct_lock); 2723 2686 vcpu->arch.busy_preempt = TB_NIL; 2687 + vcpu->arch.shregs.msr = MSR_ME; 2724 2688 vcpu->arch.intr_msr = MSR_SF | MSR_ME; 2725 2689 2726 2690 /* ··· 2742 2704 } 2743 2705 if (cpu_has_feature(CPU_FTR_TM_COMP)) 2744 2706 vcpu->arch.hfscr |= HFSCR_TM; 2707 + 2708 + vcpu->arch.hfscr_permitted = vcpu->arch.hfscr; 2745 2709 2746 2710 kvmppc_mmu_book3s_hv_init(vcpu); 2747 2711 ··· 3767 3727 mtspr(SPRN_EBBHR, vcpu->arch.ebbhr); 3768 3728 mtspr(SPRN_EBBRR, vcpu->arch.ebbrr); 3769 3729 mtspr(SPRN_BESCR, vcpu->arch.bescr); 3770 - mtspr(SPRN_WORT, vcpu->arch.wort); 3771 3730 mtspr(SPRN_TIDR, vcpu->arch.tid); 3772 3731 mtspr(SPRN_AMR, vcpu->arch.amr); 3773 3732 mtspr(SPRN_UAMOR, vcpu->arch.uamor); ··· 3793 3754 vcpu->arch.ebbhr = mfspr(SPRN_EBBHR); 3794 3755 vcpu->arch.ebbrr = mfspr(SPRN_EBBRR); 3795 3756 vcpu->arch.bescr = mfspr(SPRN_BESCR); 3796 - vcpu->arch.wort = mfspr(SPRN_WORT); 3797 3757 vcpu->arch.tid = mfspr(SPRN_TIDR); 3798 3758 vcpu->arch.amr = mfspr(SPRN_AMR); 3799 3759 vcpu->arch.uamor = mfspr(SPRN_UAMOR); ··· 3824 3786 struct p9_host_os_sprs *host_os_sprs) 3825 3787 { 3826 3788 mtspr(SPRN_PSPB, 0); 3827 - mtspr(SPRN_WORT, 0); 3828 3789 mtspr(SPRN_UAMOR, 0); 3829 3790 3830 3791 mtspr(SPRN_DSCR, host_os_sprs->dscr); ··· 3889 3852 cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) 3890 3853 kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true); 3891 3854 3855 + #ifdef CONFIG_PPC_PSERIES 3856 + if (kvmhv_on_pseries()) { 3857 + barrier(); 3858 + if (vcpu->arch.vpa.pinned_addr) { 3859 + struct lppaca *lp = 
vcpu->arch.vpa.pinned_addr; 3860 + get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use; 3861 + } else { 3862 + get_lppaca()->pmcregs_in_use = 1; 3863 + } 3864 + barrier(); 3865 + } 3866 + #endif 3892 3867 kvmhv_load_guest_pmu(vcpu); 3893 3868 3894 3869 msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX); ··· 4035 3986 save_pmu |= nesting_enabled(vcpu->kvm); 4036 3987 4037 3988 kvmhv_save_guest_pmu(vcpu, save_pmu); 3989 + #ifdef CONFIG_PPC_PSERIES 3990 + if (kvmhv_on_pseries()) { 3991 + barrier(); 3992 + get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse(); 3993 + barrier(); 3994 + } 3995 + #endif 4038 3996 4039 3997 vc->entry_exit_map = 0x101; 4040 3998 vc->in_guest = 0; ··· 5384 5328 struct kvmppc_passthru_irqmap *pimap; 5385 5329 struct irq_chip *chip; 5386 5330 int i, rc = 0; 5331 + struct irq_data *host_data; 5387 5332 5388 5333 if (!kvm_irq_bypass) 5389 5334 return 1; ··· 5412 5355 * what our real-mode EOI code does, or a XIVE interrupt 5413 5356 */ 5414 5357 chip = irq_data_get_irq_chip(&desc->irq_data); 5415 - if (!chip || !(is_pnv_opal_msi(chip) || is_xive_irq(chip))) { 5358 + if (!chip || !is_pnv_opal_msi(chip)) { 5416 5359 pr_warn("kvmppc_set_passthru_irq_hv: Could not assign IRQ map for (%d,%d)\n", 5417 5360 host_irq, guest_gsi); 5418 5361 mutex_unlock(&kvm->lock); ··· 5449 5392 * the KVM real mode handler. 5450 5393 */ 5451 5394 smp_wmb(); 5452 - irq_map->r_hwirq = desc->irq_data.hwirq; 5395 + 5396 + /* 5397 + * The 'host_irq' number is mapped in the PCI-MSI domain but 5398 + * the underlying calls, which will EOI the interrupt in real 5399 + * mode, need an HW IRQ number mapped in the XICS IRQ domain. 
5400 + */ 5401 + host_data = irq_domain_get_irq_data(irq_get_default_host(), host_irq); 5402 + irq_map->r_hwirq = (unsigned int)irqd_to_hwirq(host_data); 5453 5403 5454 5404 if (i == pimap->n_mapped) 5455 5405 pimap->n_mapped++; 5456 5406 5457 5407 if (xics_on_xive()) 5458 - rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc); 5408 + rc = kvmppc_xive_set_mapped(kvm, guest_gsi, host_irq); 5459 5409 else 5460 - kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq); 5410 + kvmppc_xics_set_mapped(kvm, guest_gsi, irq_map->r_hwirq); 5461 5411 if (rc) 5462 5412 irq_map->r_hwirq = 0; 5463 5413 ··· 5503 5439 } 5504 5440 5505 5441 if (xics_on_xive()) 5506 - rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, pimap->mapped[i].desc); 5442 + rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, host_irq); 5507 5443 else 5508 5444 kvmppc_xics_clr_mapped(kvm, guest_gsi, pimap->mapped[i].r_hwirq); 5509 5445
+5 -5
arch/powerpc/kvm/book3s_hv_builtin.c
··· 137 137 * exist in the system. We use a counter of VMs to track this. 138 138 * 139 139 * One of the operations we need to block is onlining of secondaries, so we 140 - * protect hv_vm_count with get/put_online_cpus(). 140 + * protect hv_vm_count with cpus_read_lock/unlock(). 141 141 */ 142 142 static atomic_t hv_vm_count; 143 143 144 144 void kvm_hv_vm_activated(void) 145 145 { 146 - get_online_cpus(); 146 + cpus_read_lock(); 147 147 atomic_inc(&hv_vm_count); 148 - put_online_cpus(); 148 + cpus_read_unlock(); 149 149 } 150 150 EXPORT_SYMBOL_GPL(kvm_hv_vm_activated); 151 151 152 152 void kvm_hv_vm_deactivated(void) 153 153 { 154 - get_online_cpus(); 154 + cpus_read_lock(); 155 155 atomic_dec(&hv_vm_count); 156 - put_online_cpus(); 156 + cpus_read_unlock(); 157 157 } 158 158 EXPORT_SYMBOL_GPL(kvm_hv_vm_deactivated); 159 159
+50 -51
arch/powerpc/kvm/book3s_hv_nested.c
··· 99 99 hr->dawrx1 = swab64(hr->dawrx1); 100 100 } 101 101 102 - static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap, 102 + static void save_hv_return_state(struct kvm_vcpu *vcpu, 103 103 struct hv_guest_state *hr) 104 104 { 105 105 struct kvmppc_vcore *vc = vcpu->arch.vcore; 106 106 107 107 hr->dpdes = vc->dpdes; 108 - hr->hfscr = vcpu->arch.hfscr; 109 108 hr->purr = vcpu->arch.purr; 110 109 hr->spurr = vcpu->arch.spurr; 111 110 hr->ic = vcpu->arch.ic; ··· 118 119 hr->pidr = vcpu->arch.pid; 119 120 hr->cfar = vcpu->arch.cfar; 120 121 hr->ppr = vcpu->arch.ppr; 121 - switch (trap) { 122 + switch (vcpu->arch.trap) { 122 123 case BOOK3S_INTERRUPT_H_DATA_STORAGE: 123 124 hr->hdar = vcpu->arch.fault_dar; 124 125 hr->hdsisr = vcpu->arch.fault_dsisr; ··· 127 128 case BOOK3S_INTERRUPT_H_INST_STORAGE: 128 129 hr->asdr = vcpu->arch.fault_gpa; 129 130 break; 131 + case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: 132 + hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) | 133 + (HFSCR_INTR_CAUSE & vcpu->arch.hfscr)); 134 + break; 130 135 case BOOK3S_INTERRUPT_H_EMUL_ASSIST: 131 136 hr->heir = vcpu->arch.emul_inst; 132 137 break; 133 138 } 134 139 } 135 140 136 - /* 137 - * This can result in some L0 HV register state being leaked to an L1 138 - * hypervisor when the hv_guest_state is copied back to the guest after 139 - * being modified here. 140 - * 141 - * There is no known problem with such a leak, and in many cases these 142 - * register settings could be derived by the guest by observing behaviour 143 - * and timing, interrupts, etc., but it is an issue to consider. 
144 - */ 145 - static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr) 146 - { 147 - struct kvmppc_vcore *vc = vcpu->arch.vcore; 148 - u64 mask; 149 - 150 - /* 151 - * Don't let L1 change LPCR bits for the L2 except these: 152 - */ 153 - mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | 154 - LPCR_LPES | LPCR_MER; 155 - 156 - /* 157 - * Additional filtering is required depending on hardware 158 - * and configuration. 159 - */ 160 - hr->lpcr = kvmppc_filter_lpcr_hv(vcpu->kvm, 161 - (vc->lpcr & ~mask) | (hr->lpcr & mask)); 162 - 163 - /* 164 - * Don't let L1 enable features for L2 which we've disabled for L1, 165 - * but preserve the interrupt cause field. 166 - */ 167 - hr->hfscr &= (HFSCR_INTR_CAUSE | vcpu->arch.hfscr); 168 - 169 - /* Don't let data address watchpoint match in hypervisor state */ 170 - hr->dawrx0 &= ~DAWRX_HYP; 171 - hr->dawrx1 &= ~DAWRX_HYP; 172 - 173 - /* Don't let completed instruction address breakpt match in HV state */ 174 - if ((hr->ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER) 175 - hr->ciabr &= ~CIABR_PRIV; 176 - } 177 - 178 - static void restore_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr) 141 + static void restore_hv_regs(struct kvm_vcpu *vcpu, const struct hv_guest_state *hr) 179 142 { 180 143 struct kvmppc_vcore *vc = vcpu->arch.vcore; 181 144 ··· 249 288 sizeof(struct pt_regs)); 250 289 } 251 290 291 + static void load_l2_hv_regs(struct kvm_vcpu *vcpu, 292 + const struct hv_guest_state *l2_hv, 293 + const struct hv_guest_state *l1_hv, u64 *lpcr) 294 + { 295 + struct kvmppc_vcore *vc = vcpu->arch.vcore; 296 + u64 mask; 297 + 298 + restore_hv_regs(vcpu, l2_hv); 299 + 300 + /* 301 + * Don't let L1 change LPCR bits for the L2 except these: 302 + */ 303 + mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | 304 + LPCR_LPES | LPCR_MER; 305 + 306 + /* 307 + * Additional filtering is required depending on hardware 308 + * and configuration. 
309 + */ 310 + *lpcr = kvmppc_filter_lpcr_hv(vcpu->kvm, 311 + (vc->lpcr & ~mask) | (*lpcr & mask)); 312 + 313 + /* 314 + * Don't let L1 enable features for L2 which we don't allow for L1, 315 + * but preserve the interrupt cause field. 316 + */ 317 + vcpu->arch.hfscr = l2_hv->hfscr & (HFSCR_INTR_CAUSE | vcpu->arch.hfscr_permitted); 318 + 319 + /* Don't let data address watchpoint match in hypervisor state */ 320 + vcpu->arch.dawrx0 = l2_hv->dawrx0 & ~DAWRX_HYP; 321 + vcpu->arch.dawrx1 = l2_hv->dawrx1 & ~DAWRX_HYP; 322 + 323 + /* Don't let completed instruction address breakpt match in HV state */ 324 + if ((l2_hv->ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER) 325 + vcpu->arch.ciabr = l2_hv->ciabr & ~CIABR_PRIV; 326 + } 327 + 252 328 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) 253 329 { 254 330 long int err, r; ··· 294 296 struct hv_guest_state l2_hv = {0}, saved_l1_hv; 295 297 struct kvmppc_vcore *vc = vcpu->arch.vcore; 296 298 u64 hv_ptr, regs_ptr; 297 - u64 hdec_exp; 299 + u64 hdec_exp, lpcr; 298 300 s64 delta_purr, delta_spurr, delta_ic, delta_vtb; 299 301 300 302 if (vcpu->kvm->arch.l1_ptcr == 0) ··· 362 364 /* set L1 state to L2 state */ 363 365 vcpu->arch.nested = l2; 364 366 vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token; 367 + l2->hfscr = l2_hv.hfscr; 365 368 vcpu->arch.regs = l2_regs; 366 369 367 370 /* Guest must always run with ME enabled, HV disabled. 
*/ 368 371 vcpu->arch.shregs.msr = (vcpu->arch.regs.msr | MSR_ME) & ~MSR_HV; 369 372 370 - sanitise_hv_regs(vcpu, &l2_hv); 371 - restore_hv_regs(vcpu, &l2_hv); 373 + lpcr = l2_hv.lpcr; 374 + load_l2_hv_regs(vcpu, &l2_hv, &saved_l1_hv, &lpcr); 372 375 373 376 vcpu->arch.ret = RESUME_GUEST; 374 377 vcpu->arch.trap = 0; ··· 379 380 r = RESUME_HOST; 380 381 break; 381 382 } 382 - r = kvmhv_run_single_vcpu(vcpu, hdec_exp, l2_hv.lpcr); 383 + r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr); 383 384 } while (is_kvmppc_resume_guest(r)); 384 385 385 386 /* save L2 state for return */ ··· 389 390 delta_spurr = vcpu->arch.spurr - l2_hv.spurr; 390 391 delta_ic = vcpu->arch.ic - l2_hv.ic; 391 392 delta_vtb = vc->vtb - l2_hv.vtb; 392 - save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv); 393 + save_hv_return_state(vcpu, &l2_hv); 393 394 394 395 /* restore L1 state */ 395 396 vcpu->arch.nested = NULL;
+4 -4
arch/powerpc/kvm/book3s_hv_rm_xics.c
··· 706 706 icp->rm_eoied_irq = irq; 707 707 } 708 708 709 + /* Handle passthrough interrupts */ 709 710 if (state->host_irq) { 710 711 ++vcpu->stat.pthru_all; 711 712 if (state->intr_cpu != -1) { ··· 760 759 761 760 static unsigned long eoi_rc; 762 761 763 - static void icp_eoi(struct irq_chip *c, u32 hwirq, __be32 xirr, bool *again) 762 + static void icp_eoi(struct irq_data *d, u32 hwirq, __be32 xirr, bool *again) 764 763 { 765 764 void __iomem *xics_phys; 766 765 int64_t rc; 767 766 768 - rc = pnv_opal_pci_msi_eoi(c, hwirq); 767 + rc = pnv_opal_pci_msi_eoi(d); 769 768 770 769 if (rc) 771 770 eoi_rc = rc; ··· 873 872 icp_rm_deliver_irq(xics, icp, irq, false); 874 873 875 874 /* EOI the interrupt */ 876 - icp_eoi(irq_desc_get_chip(irq_map->desc), irq_map->r_hwirq, xirr, 877 - again); 875 + icp_eoi(irq_desc_get_irq_data(irq_map->desc), irq_map->r_hwirq, xirr, again); 878 876 879 877 if (check_too_hard(xics, icp) == H_TOO_HARD) 880 878 return 2;
arch/powerpc/kvm/book3s_hv_rmhandlers.S (-42)
···
 	cmpwi	r12, BOOK3S_INTERRUPT_H_INST_STORAGE
 	beq	kvmppc_hisi
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-	/* For softpatch interrupt, go off and do TM instruction emulation */
-	cmpwi	r12, BOOK3S_INTERRUPT_HV_SOFTPATCH
-	beq	kvmppc_tm_emul
-#endif
-
 	/* See if this is a leftover HDEC interrupt */
 	cmpwi	r12,BOOK3S_INTERRUPT_HV_DECREMENTER
 	bne	2f
···
 	mr	r4, r9
 	blt	deliver_guest_interrupt
 	b	guest_exit_cont
-
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-/*
- * Softpatch interrupt for transactional memory emulation cases
- * on POWER9 DD2.2. This is early in the guest exit path - we
- * haven't saved registers or done a treclaim yet.
- */
-kvmppc_tm_emul:
-	/* Save instruction image in HEIR */
-	mfspr	r3, SPRN_HEIR
-	stw	r3, VCPU_HEIR(r9)
-
-	/*
-	 * The cases we want to handle here are those where the guest
-	 * is in real suspend mode and is trying to transition to
-	 * transactional mode.
-	 */
-	lbz	r0, HSTATE_FAKE_SUSPEND(r13)
-	cmpwi	r0, 0		/* keep exiting guest if in fake suspend */
-	bne	guest_exit_cont
-	rldicl	r3, r11, 64 - MSR_TS_S_LG, 62
-	cmpwi	r3, 1		/* or if not in suspend state */
-	bne	guest_exit_cont
-
-	/* Call C code to do the emulation */
-	mr	r3, r9
-	bl	kvmhv_p9_tm_emulation_early
-	nop
-	ld	r9, HSTATE_KVM_VCPU(r13)
-	li	r12, BOOK3S_INTERRUPT_HV_SOFTPATCH
-	cmpwi	r3, 0
-	beq	guest_exit_cont		/* continue exiting if not handled */
-	ld	r10, VCPU_PC(r9)
-	ld	r11, VCPU_MSR(r9)
-	b	fast_interrupt_c_return	/* go back to guest if handled */
-#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
 /*
  * Check whether an HDSI is an HPTE not found fault or something else.
arch/powerpc/kvm/book3s_hv_tm.c (+39 -22)
···
 	int ra, rs;
 
 	/*
+	 * The TM softpatch interrupt sets NIP to the instruction following
+	 * the faulting instruction, which is not executed. Rewind nip to the
+	 * faulting instruction so it looks like a normal synchronous
+	 * interrupt, then update nip in the places where the instruction is
+	 * emulated.
+	 */
+	vcpu->arch.regs.nip -= 4;
+
+	/*
 	 * rfid, rfebb, and mtmsrd encode bit 31 = 0 since it's a reserved bit
 	 * in these instructions, so masking bit 31 out doesn't change these
 	 * instructions. For treclaim., tsr., and trechkpt. instructions if bit
···
 			       (newmsr & MSR_TM)));
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
-		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.cfar = vcpu->arch.regs.nip;
 		vcpu->arch.regs.nip = vcpu->arch.shregs.srr0;
 		return RESUME_GUEST;
···
 		}
 		/* check EBB facility is available */
 		if (!(vcpu->arch.hfscr & HFSCR_EBB)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_EBB_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if ((msr & MSR_PR) && !(vcpu->arch.fscr & FSCR_EBB)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_EBB_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_EBB_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
 		}
···
 		vcpu->arch.bescr = bescr;
 		msr = (msr & ~MSR_TS_MASK) | MSR_TS_T;
 		vcpu->arch.shregs.msr = msr;
-		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.cfar = vcpu->arch.regs.nip;
 		vcpu->arch.regs.nip = vcpu->arch.ebbrr;
 		return RESUME_GUEST;
···
 		newmsr = (newmsr & ~MSR_LE) | (msr & MSR_LE);
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
···
 		}
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 			msr = (msr & ~MSR_TS_MASK) | MSR_TS_S;
 		}
 		vcpu->arch.shregs.msr = msr;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
 	case (PPC_INST_TRECLAIM & PO_XOP_OPCODE_MASK):
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 		vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) |
 			(((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29);
 		vcpu->arch.shregs.msr &= ~MSR_TS_MASK;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
···
 		/* XXX do we need to check for PR=0 here? */
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 		vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) |
 			(((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29);
 		vcpu->arch.shregs.msr = msr | MSR_TS_S;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 	}
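The file above adopts a rewind-then-advance convention: nip is stepped back by one 4-byte instruction on entry, and stepped forward again only on the paths that actually emulate the instruction, so failure paths naturally report the faulting address. A standalone sketch of that convention; `demo_softpatch` and its argument are hypothetical stand-ins:

```c
#include <assert.h>
#include <stdint.h>

struct demo_vcpu { uint64_t nip; };

/* The interrupt hardware left nip past the faulting instruction.
 * Rewind first, so any bail-out path names the culprit; advance
 * only when the instruction was actually emulated. */
static int demo_softpatch(struct demo_vcpu *v, int can_emulate)
{
	v->nip -= 4;		/* point at the faulting instruction */
	if (!can_emulate)
		return -1;	/* caller sees the faulting address */
	/* ... emulation of the instruction would happen here ... */
	v->nip += 4;		/* step past it, like normal execution */
	return 0;
}
```

This is why the `cfar` assignments in the hunk drop their `- 4`: after the up-front rewind, nip already points at the faulting instruction.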
arch/powerpc/kvm/book3s_xics.c (+3 -3)
···
 #include <linux/gfp.h>
 #include <linux/anon_inodes.h>
 #include <linux/spinlock.h>
-
+#include <linux/debugfs.h>
 #include <linux/uaccess.h>
+
 #include <asm/kvm_book3s.h>
 #include <asm/kvm_ppc.h>
 #include <asm/hvcall.h>
 #include <asm/xics.h>
-#include <asm/debugfs.h>
 #include <asm/time.h>
 
 #include <linux/seq_file.h>
···
 		return;
 	}
 
-	xics->dentry = debugfs_create_file(name, 0444, powerpc_debugfs_root,
+	xics->dentry = debugfs_create_file(name, 0444, arch_debugfs_dir,
 					   xics, &xics_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/kvm/book3s_xive.c (+52 -22)
···
 #include <asm/xive.h>
 #include <asm/xive-regs.h>
 #include <asm/debug.h>
-#include <asm/debugfs.h>
 #include <asm/time.h>
 #include <asm/opal.h>
···
  */
 #define XIVE_Q_GAP	2
 
+static bool kvmppc_xive_vcpu_has_save_restore(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
+
+	/* Check enablement at VP level */
+	return xc->vp_cam & TM_QW1W2_HO;
+}
+
+bool kvmppc_xive_check_save_restore(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
+	struct kvmppc_xive *xive = xc->xive;
+
+	if (xive->flags & KVMPPC_XIVE_FLAG_SAVE_RESTORE)
+		return kvmppc_xive_vcpu_has_save_restore(vcpu);
+
+	return true;
+}
+
 /*
  * Push a vcpu's context to the XIVE on guest entry.
  * This assumes we are in virtual mode (MMU on)
···
 		return;
 
 	eieio();
-	__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
+	if (!kvmppc_xive_vcpu_has_save_restore(vcpu))
+		__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
 	__raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2);
 	vcpu->arch.xive_pushed = 1;
 	eieio();
···
 	/* First load to pull the context, we ignore the value */
 	__raw_readl(tima + TM_SPC_PULL_OS_CTX);
 	/* Second load to recover the context state (Words 0 and 1) */
-	vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);
+	if (!kvmppc_xive_vcpu_has_save_restore(vcpu))
+		vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);
 
 	/* Fixup some of the state for the next load */
 	vcpu->arch.xive_saved_state.lsmfb = 0;
···
 		if (!vcpu->arch.xive_vcpu)
 			continue;
 		rc = xive_provision_queue(vcpu, prio);
-		if (rc == 0 && !xive->single_escalation)
+		if (rc == 0 && !kvmppc_xive_has_single_escalation(xive))
 			kvmppc_xive_attach_escalation(vcpu, prio,
-						      xive->single_escalation);
+						      kvmppc_xive_has_single_escalation(xive));
 		if (rc)
 			return rc;
 	}
···
 }
 
 int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
-			   struct irq_desc *host_desc)
+			   unsigned long host_irq)
 {
 	struct kvmppc_xive *xive = kvm->arch.xive;
 	struct kvmppc_xive_src_block *sb;
 	struct kvmppc_xive_irq_state *state;
-	struct irq_data *host_data = irq_desc_get_irq_data(host_desc);
-	unsigned int host_irq = irq_desc_get_irq(host_desc);
+	struct irq_data *host_data =
+		irq_domain_get_irq_data(irq_get_default_host(), host_irq);
 	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(host_data);
 	u16 idx;
 	u8 prio;
···
 	if (!xive)
 		return -ENODEV;
 
-	pr_devel("set_mapped girq 0x%lx host HW irq 0x%x...\n",guest_irq, hw_irq);
+	pr_debug("%s: GIRQ 0x%lx host IRQ %ld XIVE HW IRQ 0x%x\n",
+		 __func__, guest_irq, host_irq, hw_irq);
 
 	sb = kvmppc_xive_find_source(xive, guest_irq, &idx);
 	if (!sb)
···
 	 */
 	rc = irq_set_vcpu_affinity(host_irq, state);
 	if (rc) {
-		pr_err("Failed to set VCPU affinity for irq %d\n", host_irq);
+		pr_err("Failed to set VCPU affinity for host IRQ %ld\n", host_irq);
 		return rc;
 	}
···
 EXPORT_SYMBOL_GPL(kvmppc_xive_set_mapped);
 
 int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
-			   struct irq_desc *host_desc)
+			   unsigned long host_irq)
 {
 	struct kvmppc_xive *xive = kvm->arch.xive;
 	struct kvmppc_xive_src_block *sb;
 	struct kvmppc_xive_irq_state *state;
-	unsigned int host_irq = irq_desc_get_irq(host_desc);
 	u16 idx;
 	u8 prio;
 	int rc;
···
 	if (!xive)
 		return -ENODEV;
 
-	pr_devel("clr_mapped girq 0x%lx...\n", guest_irq);
+	pr_debug("%s: GIRQ 0x%lx host IRQ %ld\n", __func__, guest_irq, host_irq);
 
 	sb = kvmppc_xive_find_source(xive, guest_irq, &idx);
 	if (!sb)
···
 	/* Release the passed-through interrupt to the host */
 	rc = irq_set_vcpu_affinity(host_irq, NULL);
 	if (rc) {
-		pr_err("Failed to clr VCPU affinity for irq %d\n", host_irq);
+		pr_err("Failed to clr VCPU affinity for host IRQ %ld\n", host_irq);
 		return rc;
 	}
···
 	/* Free escalations */
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		if (xc->esc_virq[i]) {
-			if (xc->xive->single_escalation)
+			if (kvmppc_xive_has_single_escalation(xc->xive))
 				xive_cleanup_single_escalation(vcpu, xc,
 							       xc->esc_virq[i]);
 			free_irq(xc->esc_virq[i], vcpu);
···
 	if (r)
 		goto bail;
 
+	if (!kvmppc_xive_check_save_restore(vcpu)) {
+		pr_err("inconsistent save-restore setup for VCPU %d\n", cpu);
+		r = -EIO;
+		goto bail;
+	}
+
 	/* Configure VCPU fields for use by assembly push/pull */
 	vcpu->arch.xive_saved_state.w01 = cpu_to_be64(0xff000000);
 	vcpu->arch.xive_cam_word = cpu_to_be32(xc->vp_cam | TM_QW1W2_VO);
···
 	 * Enable the VP first as the single escalation mode will
 	 * affect escalation interrupts numbering
 	 */
-	r = xive_native_enable_vp(xc->vp_id, xive->single_escalation);
+	r = xive_native_enable_vp(xc->vp_id, kvmppc_xive_has_single_escalation(xive));
 	if (r) {
 		pr_err("Failed to enable VP in OPAL, err %d\n", r);
 		goto bail;
···
 		struct xive_q *q = &xc->queues[i];
 
 		/* Single escalation, no queue 7 */
-		if (i == 7 && xive->single_escalation)
+		if (i == 7 && kvmppc_xive_has_single_escalation(xive))
 			break;
 
 		/* Is queue already enabled ? Provision it */
 		if (xive->qmap & (1 << i)) {
 			r = xive_provision_queue(vcpu, i);
-			if (r == 0 && !xive->single_escalation)
+			if (r == 0 && !kvmppc_xive_has_single_escalation(xive))
 				kvmppc_xive_attach_escalation(
-					vcpu, i, xive->single_escalation);
+					vcpu, i, kvmppc_xive_has_single_escalation(xive));
 			if (r)
 				goto bail;
 		} else {
···
 	}
 
 	/* If not done above, attach priority 0 escalation */
-	r = kvmppc_xive_attach_escalation(vcpu, 0, xive->single_escalation);
+	r = kvmppc_xive_attach_escalation(vcpu, 0, kvmppc_xive_has_single_escalation(xive));
 	if (r)
 		goto bail;
···
 	 */
 	xive->nr_servers = KVM_MAX_VCPUS;
 
-	xive->single_escalation = xive_native_has_single_escalation();
+	if (xive_native_has_single_escalation())
+		xive->flags |= KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+
+	if (xive_native_has_save_restore())
+		xive->flags |= KVMPPC_XIVE_FLAG_SAVE_RESTORE;
 
 	kvm->arch.xive = xive;
 	return 0;
···
 		return;
 	}
 
-	xive->dentry = debugfs_create_file(name, S_IRUGO, powerpc_debugfs_root,
+	xive->dentry = debugfs_create_file(name, S_IRUGO, arch_debugfs_dir,
 					   xive, &xive_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/kvm/book3s_xive.h (+10 -1)
···
 	int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
 };
 
+#define KVMPPC_XIVE_FLAG_SINGLE_ESCALATION 0x1
+#define KVMPPC_XIVE_FLAG_SAVE_RESTORE 0x2
+
 struct kvmppc_xive {
 	struct kvm *kvm;
 	struct kvm_device *dev;
···
 	u32 q_page_order;
 
 	/* Flags */
-	u8 single_escalation;
+	u8 flags;
 
 	/* Number of entries in the VP block */
 	u32 nr_servers;
···
 				    struct kvmppc_xive_vcpu *xc, int irq);
 int kvmppc_xive_compute_vp_id(struct kvmppc_xive *xive, u32 cpu, u32 *vp);
 int kvmppc_xive_set_nr_servers(struct kvmppc_xive *xive, u64 addr);
+bool kvmppc_xive_check_save_restore(struct kvm_vcpu *vcpu);
+
+static inline bool kvmppc_xive_has_single_escalation(struct kvmppc_xive *xive)
+{
+	return xive->flags & KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+}
 
 #endif /* CONFIG_KVM_XICS */
 #endif /* _KVM_PPC_BOOK3S_XICS_H */
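The header change replaces the lone `single_escalation` byte with a `flags` bitfield plus an inline predicate, the usual refactor once a second boolean (save-restore) arrives. The shape in isolation, with `demo_` names standing in for the kernel types:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLAG_SINGLE_ESCALATION 0x1
#define DEMO_FLAG_SAVE_RESTORE      0x2

struct demo_xive { uint8_t flags; };

/* Call sites test the predicate instead of poking the raw field,
 * so adding a third flag later touches only this header. */
static inline int demo_has_single_escalation(const struct demo_xive *x)
{
	return x->flags & DEMO_FLAG_SINGLE_ESCALATION;
}
```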
arch/powerpc/kvm/book3s_xive_native.c (+17 -7)
···
 #include <asm/xive.h>
 #include <asm/xive-regs.h>
 #include <asm/debug.h>
-#include <asm/debugfs.h>
 #include <asm/opal.h>
 
 #include <linux/debugfs.h>
···
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		/* Free the escalation irq */
 		if (xc->esc_virq[i]) {
-			if (xc->xive->single_escalation)
+			if (kvmppc_xive_has_single_escalation(xc->xive))
 				xive_cleanup_single_escalation(vcpu, xc,
 							       xc->esc_virq[i]);
 			free_irq(xc->esc_virq[i], vcpu);
···
 		goto bail;
 	}
 
+	if (!kvmppc_xive_check_save_restore(vcpu)) {
+		pr_err("inconsistent save-restore setup for VCPU %d\n", server_num);
+		rc = -EIO;
+		goto bail;
+	}
+
 	/*
 	 * Enable the VP first as the single escalation mode will
 	 * affect escalation interrupts numbering
 	 */
-	rc = xive_native_enable_vp(xc->vp_id, xive->single_escalation);
+	rc = xive_native_enable_vp(xc->vp_id, kvmppc_xive_has_single_escalation(xive));
 	if (rc) {
 		pr_err("Failed to enable VP in OPAL: %d\n", rc);
 		goto bail;
···
 	}
 
 	rc = kvmppc_xive_attach_escalation(vcpu, priority,
-					   xive->single_escalation);
+					   kvmppc_xive_has_single_escalation(xive));
 error:
 	if (rc)
 		kvmppc_xive_native_cleanup_queue(vcpu, priority);
···
 	for (prio = 0; prio < KVMPPC_XIVE_Q_COUNT; prio++) {
 
 		/* Single escalation, no queue 7 */
-		if (prio == 7 && xive->single_escalation)
+		if (prio == 7 && kvmppc_xive_has_single_escalation(xive))
 			break;
 
 		if (xc->esc_virq[prio]) {
···
 	 */
 	xive->nr_servers = KVM_MAX_VCPUS;
 
-	xive->single_escalation = xive_native_has_single_escalation();
+	if (xive_native_has_single_escalation())
+		xive->flags |= KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+
+	if (xive_native_has_save_restore())
+		xive->flags |= KVMPPC_XIVE_FLAG_SAVE_RESTORE;
+
 	xive->ops = &kvmppc_xive_native_ops;
 
 	kvm->arch.xive = xive;
···
 		return;
 	}
 
-	xive->dentry = debugfs_create_file(name, 0444, powerpc_debugfs_root,
+	xive->dentry = debugfs_create_file(name, 0444, arch_debugfs_dir,
 					   xive, &xive_native_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/mm/Makefile (+1 -1)
···
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
-obj-$(CONFIG_PPC_PTDUMP)	+= ptdump/
+obj-$(CONFIG_PTDUMP_CORE)	+= ptdump/
 obj-$(CONFIG_KASAN)		+= kasan/
arch/powerpc/mm/book3s64/hash_native.c (+1 -1)
···
  * TODO: add batching support when enabled.  remember, no dynamic memory here,
  * although there is the control page available...
  */
-static void native_hpte_clear(void)
+static notrace void native_hpte_clear(void)
 {
 	unsigned long vpn = 0;
 	unsigned long slot, slots;
arch/powerpc/mm/book3s64/hash_utils.c (+2 -2)
···
 #include <linux/hugetlb.h>
 #include <linux/cpu.h>
 #include <linux/pgtable.h>
+#include <linux/debugfs.h>
 
-#include <asm/debugfs.h>
 #include <asm/interrupt.h>
 #include <asm/processor.h>
 #include <asm/mmu.h>
···
 
 static int __init hash64_debugfs(void)
 {
-	debugfs_create_file("hpt_order", 0600, powerpc_debugfs_root, NULL,
+	debugfs_create_file("hpt_order", 0600, arch_debugfs_dir, NULL,
 			    &fops_hpt_order);
 	return 0;
 }
arch/powerpc/mm/book3s64/pgtable.c (+4 -4)
···
 #include <linux/sched.h>
 #include <linux/mm_types.h>
 #include <linux/memblock.h>
+#include <linux/debugfs.h>
 #include <misc/cxl-base.h>
 
-#include <asm/debugfs.h>
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
 #include <asm/trace.h>
···
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-/* For use by kexec */
-void mmu_cleanup_all(void)
+/* For use by kexec, called with MMU off */
+notrace void mmu_cleanup_all(void)
 {
 	if (radix_enabled())
 		radix__mmu_cleanup_all();
···
 	 * invalidated as expected.
 	 */
 	debugfs_create_bool("tlbie_enabled", 0600,
-			    powerpc_debugfs_root,
+			    arch_debugfs_dir,
 			    &tlbie_enabled);
 
 	return 0;
arch/powerpc/mm/book3s64/radix_pgtable.c (+2 -1)
···
 	mtspr(SPRN_UAMOR, 0);
 }
 
-void radix__mmu_cleanup_all(void)
+/* Called during kexec sequence with MMU off */
+notrace void radix__mmu_cleanup_all(void)
 {
 	unsigned long lpcr;
arch/powerpc/mm/book3s64/radix_tlb.c (+14 -2)
···
 #include <linux/memblock.h>
 #include <linux/mmu_context.h>
 #include <linux/sched/mm.h>
+#include <linux/debugfs.h>
 
 #include <asm/ppc-opcode.h>
 #include <asm/tlb.h>
···
 * invalidating a full PID, so it has a far lower threshold to change from
 * individual page flushes to full-pid flushes.
 */
-static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
-static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
+static u32 tlb_single_page_flush_ceiling __read_mostly = 33;
+static u32 tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 					    unsigned long start, unsigned long end)
···
 EXPORT_SYMBOL_GPL(do_h_rpt_invalidate_prt);
 
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
+
+static int __init create_tlb_single_page_flush_ceiling(void)
+{
+	debugfs_create_u32("tlb_single_page_flush_ceiling", 0600,
+			   arch_debugfs_dir, &tlb_single_page_flush_ceiling);
+	debugfs_create_u32("tlb_local_single_page_flush_ceiling", 0600,
+			   arch_debugfs_dir, &tlb_local_single_page_flush_ceiling);
+	return 0;
+}
+late_initcall(create_tlb_single_page_flush_ceiling);
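The ceilings above pick the point at which a ranged TLB flush degenerates into a single full-PID invalidation; narrowing them to `u32` lets them be exposed directly via `debugfs_create_u32`. The decision itself is just a threshold compare, sketched here with the new tunable type (numbers are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* At or below the ceiling, flush page by page; above it,
 * one full-PID invalidation is cheaper than many tlbies. */
static int demo_use_full_flush(uint64_t nr_pages, uint32_t ceiling)
{
	return nr_pages > ceiling;
}
```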
arch/powerpc/mm/book3s64/slb.c (+1 -1)
···
 	/* IRQs are not reconciled here, so can't check irqs_disabled */
 	VM_WARN_ON(mfmsr() & MSR_EE);
 
-	if (unlikely(!(regs->msr & MSR_RI)))
+	if (regs_is_unrecoverable(regs))
 		return -EINVAL;
 
 	/*
arch/powerpc/mm/drmem.c (+46)
···
 
 static struct drmem_lmb_info __drmem_info;
 struct drmem_lmb_info *drmem_info = &__drmem_info;
+static bool in_drmem_update;
 
 u64 drmem_lmb_memory_max(void)
 {
···
 	if (!memory)
 		return -1;
 
+	/*
+	 * Set in_drmem_update to prevent the notifier callback to process the
+	 * DT property back since the change is coming from the LMB tree.
+	 */
+	in_drmem_update = true;
 	prop = of_find_property(memory, "ibm,dynamic-memory", NULL);
 	if (prop) {
 		rc = drmem_update_dt_v1(memory, prop);
···
 		if (prop)
 			rc = drmem_update_dt_v2(memory, prop);
 	}
+	in_drmem_update = false;
 
 	of_node_put(memory);
 	return rc;
···
 	return ret;
 }
 
+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+		      __maybe_unused const __be32 **usm,
+		      __maybe_unused void *data)
+{
+	struct drmem_lmb *lmb;
+
+	for_each_drmem_lmb(lmb) {
+		if (lmb->drc_index != updated_lmb->drc_index)
+			continue;
+
+		lmb->aa_index = updated_lmb->aa_index;
+		break;
+	}
+	return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+	/*
+	 * Don't update the LMBs if triggered by the update done in
+	 * drmem_update_dt(), the LMB values have been used to the update the DT
+	 * property in that case.
+	 */
+	if (in_drmem_update)
+		return;
+	if (!strcmp(prop->name, "ibm,dynamic-memory"))
+		__walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+	else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+		__walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
 #endif
 
 static int init_drmem_lmb_size(struct device_node *dn)
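`in_drmem_update` is a simple reentrancy guard: when drmem itself rewrites the device-tree property, the property-change notifier fires, and the flag suppresses that echo so the LMB tree is not rebuilt from values it just produced. The pattern in miniature, with hypothetical `demo_` names:

```c
#include <assert.h>

static int guard_active;
static int notifier_work_done;

/* Notifier callback: skip work that merely echoes our own update. */
static void demo_notifier(void)
{
	if (guard_active)
		return;
	notifier_work_done++;
}

/* Updater: raise the guard, make the change (which re-triggers the
 * notifier synchronously here), then drop the guard again. */
static void demo_update(void)
{
	guard_active = 1;
	demo_notifier();	/* stands in for the DT-change callback */
	guard_active = 0;
}
```

A plain bool suffices in the kernel code because the update path runs serialized; a concurrent design would need a lock or atomic instead.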
arch/powerpc/mm/mmu_decl.h (+1 -1)
···
 void __init mmu_mapin_immr(void);
 #endif
 
-#ifdef CONFIG_PPC_DEBUG_WX
+#ifdef CONFIG_DEBUG_WX
 void ptdump_check_wx(void);
 #else
 static inline void ptdump_check_wx(void) { }
arch/powerpc/mm/nohash/tlb_low.S (+2 -2)
···
  * Touch enough instruction cache lines to ensure cache hits
  */
 1:	mflr	r9
-	bl	2f
+	bcl	20,31,$+4
 2:	mflr	r6
 	li	r7,32
 	PPC_ICBT(0,R6,R7)		/* touch next cache line */
···
  * Set up temporary TLB entry that is the same as what we're
  * running from, but in AS=1.
  */
-	bl	1f
+	bcl	20,31,$+4
 1:	mflr	r6
 	tlbsx	0,r8
 	mfspr	r6,SPRN_MAS1
arch/powerpc/mm/numa.c (+357 -134)
···
 
 static char *cmdline __initdata;
 
-static int numa_debug;
-#define dbg(args...) if (numa_debug) { printk(KERN_INFO args); }
-
 int numa_cpu_lookup_table[NR_CPUS];
 cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
 struct pglist_data *node_data[MAX_NUMNODES];
···
 EXPORT_SYMBOL(node_to_cpumask_map);
 EXPORT_SYMBOL(node_data);
 
-static int min_common_depth;
+static int primary_domain_index;
 static int n_mem_addr_cells, n_mem_size_cells;
-static int form1_affinity;
+
+#define FORM0_AFFINITY 0
+#define FORM1_AFFINITY 1
+#define FORM2_AFFINITY 2
+static int affinity_form;
 
 #define MAX_DISTANCE_REF_POINTS 4
 static int distance_ref_points_depth;
 static const __be32 *distance_ref_points;
 static int distance_lookup_table[MAX_NUMNODES][MAX_DISTANCE_REF_POINTS];
+static int numa_distance_table[MAX_NUMNODES][MAX_NUMNODES] = {
+	[0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] = -1 }
+};
+static int numa_id_index_table[MAX_NUMNODES] = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE };
 
 /*
  * Allocate node_to_cpumask_map based on number of available nodes
···
 		alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
 
 	/* cpumask_of_node() will now work */
-	dbg("Node to cpumask map for %u nodes\n", nr_node_ids);
+	pr_debug("Node to cpumask map for %u nodes\n", nr_node_ids);
 }
 
 static int __init fake_numa_create_new_node(unsigned long end_pfn,
···
 			cmdline = p;
 		fake_nid++;
 		*nid = fake_nid;
-		dbg("created new fake_node with id %d\n", fake_nid);
+		pr_debug("created new fake_node with id %d\n", fake_nid);
 		return 1;
 	}
 	return 0;
···
 	numa_cpu_lookup_table[cpu] = -1;
 }
 
-static void map_cpu_to_node(int cpu, int node)
+void map_cpu_to_node(int cpu, int node)
 {
 	update_numa_cpu_lookup_table(cpu, node);
 
-	dbg("adding cpu %d to node %d\n", cpu, node);
-
-	if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node])))
+	if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node]))) {
+		pr_debug("adding cpu %d to node %d\n", cpu, node);
 		cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
+	}
 }
 
 #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PPC_SPLPAR)
-static void unmap_cpu_from_node(unsigned long cpu)
+void unmap_cpu_from_node(unsigned long cpu)
 {
 	int node = numa_cpu_lookup_table[cpu];
 
-	dbg("removing cpu %lu from node %d\n", cpu, node);
-
 	if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
 		cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
+		pr_debug("removing cpu %lu from node %d\n", cpu, node);
 	} else {
-		printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
-		       cpu, node);
+		pr_warn("Warning: cpu %lu not found in node %d\n", cpu, node);
 	}
 }
 #endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */
 
-int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+static int __associativity_to_nid(const __be32 *associativity,
+				  int max_array_sz)
+{
+	int nid;
+	/*
+	 * primary_domain_index is 1 based array index.
+	 */
+	int index = primary_domain_index - 1;
+
+	if (!numa_enabled || index >= max_array_sz)
+		return NUMA_NO_NODE;
+
+	nid = of_read_number(&associativity[index], 1);
+
+	/* POWER4 LPAR uses 0xffff as invalid node */
+	if (nid == 0xffff || nid >= nr_node_ids)
+		nid = NUMA_NO_NODE;
+	return nid;
+}
+/*
+ * Returns nid in the range [0..nr_node_ids], or -1 if no useful NUMA
+ * info is found.
+ */
+static int associativity_to_nid(const __be32 *associativity)
+{
+	int array_sz = of_read_number(associativity, 1);
+
+	/* Skip the first element in the associativity array */
+	return __associativity_to_nid((associativity + 1), array_sz);
+}
+
+static int __cpu_form2_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+	int dist;
+	int node1, node2;
+
+	node1 = associativity_to_nid(cpu1_assoc);
+	node2 = associativity_to_nid(cpu2_assoc);
+
+	dist = numa_distance_table[node1][node2];
+	if (dist <= LOCAL_DISTANCE)
+		return 0;
+	else if (dist <= REMOTE_DISTANCE)
+		return 1;
+	else
+		return 2;
+}
+
+static int __cpu_form1_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 {
 	int dist = 0;
 
···
 	return dist;
 }
 
+int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+	/* We should not get called with FORM0 */
+	VM_WARN_ON(affinity_form == FORM0_AFFINITY);
+	if (affinity_form == FORM1_AFFINITY)
+		return __cpu_form1_relative_distance(cpu1_assoc, cpu2_assoc);
+	return __cpu_form2_relative_distance(cpu1_assoc, cpu2_assoc);
+}
+
 /* must hold reference to node during call */
 static const __be32 *of_get_associativity(struct device_node *dev)
 {
···
 	int i;
 	int distance = LOCAL_DISTANCE;
 
-	if (!form1_affinity)
+	if (affinity_form == FORM2_AFFINITY)
+		return numa_distance_table[a][b];
+	else if (affinity_form == FORM0_AFFINITY)
 		return ((a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE);
 
 	for (i = 0; i < distance_ref_points_depth; i++) {
···
 	return distance;
 }
 EXPORT_SYMBOL(__node_distance);
-
-static void initialize_distance_lookup_table(int nid,
-		const __be32 *associativity)
-{
-	int i;
-
-	if (!form1_affinity)
-		return;
-
-	for (i = 0; i < distance_ref_points_depth; i++) {
-		const __be32 *entry;
-
-		entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
-		distance_lookup_table[nid][i] = of_read_number(entry, 1);
-	}
-}
-
-/*
- * Returns nid in the range [0..nr_node_ids], or -1 if no useful NUMA
- * info is found.
- */
-static int associativity_to_nid(const __be32 *associativity)
-{
-	int nid = NUMA_NO_NODE;
-
-	if (!numa_enabled)
-		goto out;
-
-	if (of_read_number(associativity, 1) >= min_common_depth)
-		nid = of_read_number(&associativity[min_common_depth], 1);
-
-	/* POWER4 LPAR uses 0xffff as invalid node */
-	if (nid == 0xffff || nid >= nr_node_ids)
-		nid = NUMA_NO_NODE;
-
-	if (nid > 0 &&
-		of_read_number(associativity, 1) >= distance_ref_points_depth) {
-		/*
-		 * Skip the length field and send start of associativity array
-		 */
-		initialize_distance_lookup_table(nid, associativity + 1);
-	}
-
-out:
-	return nid;
-}
 
 /* Returns the nid associated with the given device tree node,
  * or -1 if not found.
···
 }
 EXPORT_SYMBOL(of_node_to_nid);
 
-static int __init find_min_common_depth(void)
+static void __initialize_form1_numa_distance(const __be32 *associativity,
+					     int max_array_sz)
 {
-	int depth;
+	int i, nid;
+
+	if (affinity_form != FORM1_AFFINITY)
+		return;
+
+	nid = __associativity_to_nid(associativity, max_array_sz);
+	if (nid != NUMA_NO_NODE) {
+		for (i = 0; i < distance_ref_points_depth; i++) {
+			const __be32 *entry;
+			int index = be32_to_cpu(distance_ref_points[i]) - 1;
+
+			/*
+			 * broken hierarchy, return with broken distance table
+			 */
+			if (WARN(index >= max_array_sz, "Broken ibm,associativity property"))
+				return;
+
+			entry = &associativity[index];
+			distance_lookup_table[nid][i] = of_read_number(entry, 1);
+		}
+	}
+}
+
+static void initialize_form1_numa_distance(const __be32 *associativity)
+{
+	int array_sz;
+
+	array_sz = of_read_number(associativity, 1);
+	/* Skip the first element in the associativity array */
+	__initialize_form1_numa_distance(associativity + 1, array_sz);
+}
+
+/*
+ * Used to update distance information w.r.t newly added node.
+ */
+void update_numa_distance(struct device_node *node)
+{
+	int nid;
+
+	if (affinity_form == FORM0_AFFINITY)
+		return;
+	else if (affinity_form == FORM1_AFFINITY) {
+		const __be32 *associativity;
+
+		associativity = of_get_associativity(node);
+		if (!associativity)
+			return;
+
+		initialize_form1_numa_distance(associativity);
+		return;
+	}
+
+	/* FORM2 affinity */
+	nid = of_node_to_nid_single(node);
+	if (nid == NUMA_NO_NODE)
+		return;
+
+	/*
+	 * With FORM2 we expect NUMA distance of all possible NUMA
+	 * nodes to be provided during boot.
+	 */
+	WARN(numa_distance_table[nid][nid] == -1,
+	     "NUMA distance details for node %d not provided\n", nid);
+}
+
+/*
+ * ibm,numa-lookup-index-table= {N, domainid1, domainid2, ..... domainidN}
+ * ibm,numa-distance-table = { N, 1, 2, 4, 5, 1, 6, .... N elements}
+ */
+static void initialize_form2_numa_distance_lookup_table(void)
+{
+	int i, j;
 	struct device_node *root;
+	const __u8 *numa_dist_table;
+	const __be32 *numa_lookup_index;
+	int numa_dist_table_length;
+	int max_numa_index, distance_index;
+
+	if (firmware_has_feature(FW_FEATURE_OPAL))
+		root = of_find_node_by_path("/ibm,opal");
+	else
+		root = of_find_node_by_path("/rtas");
+	if (!root)
+		root = of_find_node_by_path("/");
+
+	numa_lookup_index = of_get_property(root, "ibm,numa-lookup-index-table", NULL);
+	max_numa_index = of_read_number(&numa_lookup_index[0], 1);
+
+	/* first element of the array is the size and is encode-int */
+	numa_dist_table = of_get_property(root, "ibm,numa-distance-table", NULL);
+	numa_dist_table_length = of_read_number((const __be32 *)&numa_dist_table[0], 1);
+	/* Skip the size which is encoded int */
+	numa_dist_table += sizeof(__be32);
+
+	pr_debug("numa_dist_table_len = %d, numa_dist_indexes_len = %d\n",
+		 numa_dist_table_length, max_numa_index);
+
+	for (i = 0; i < max_numa_index; i++)
+		/* +1 skip the max_numa_index in the property */
+		numa_id_index_table[i] = of_read_number(&numa_lookup_index[i + 1], 1);
+
+	if (numa_dist_table_length != max_numa_index * max_numa_index) {
+		WARN(1, "Wrong NUMA distance information\n");
+		/* consider everybody else just remote.
*/ 395 + for (i = 0; i < max_numa_index; i++) { 396 + for (j = 0; j < max_numa_index; j++) { 397 + int nodeA = numa_id_index_table[i]; 398 + int nodeB = numa_id_index_table[j]; 399 + 400 + if (nodeA == nodeB) 401 + numa_distance_table[nodeA][nodeB] = LOCAL_DISTANCE; 402 + else 403 + numa_distance_table[nodeA][nodeB] = REMOTE_DISTANCE; 404 + } 405 + } 406 + } 407 + 408 + distance_index = 0; 409 + for (i = 0; i < max_numa_index; i++) { 410 + for (j = 0; j < max_numa_index; j++) { 411 + int nodeA = numa_id_index_table[i]; 412 + int nodeB = numa_id_index_table[j]; 413 + 414 + numa_distance_table[nodeA][nodeB] = numa_dist_table[distance_index++]; 415 + pr_debug("dist[%d][%d]=%d ", nodeA, nodeB, numa_distance_table[nodeA][nodeB]); 416 + } 417 + } 418 + of_node_put(root); 419 + } 420 + 421 + static int __init find_primary_domain_index(void) 422 + { 423 + int index; 424 + struct device_node *root; 425 + 426 + /* 427 + * Check for which form of affinity. 428 + */ 429 + if (firmware_has_feature(FW_FEATURE_OPAL)) { 430 + affinity_form = FORM1_AFFINITY; 431 + } else if (firmware_has_feature(FW_FEATURE_FORM2_AFFINITY)) { 432 + pr_debug("Using form 2 affinity\n"); 433 + affinity_form = FORM2_AFFINITY; 434 + } else if (firmware_has_feature(FW_FEATURE_FORM1_AFFINITY)) { 435 + pr_debug("Using form 1 affinity\n"); 436 + affinity_form = FORM1_AFFINITY; 437 + } else 438 + affinity_form = FORM0_AFFINITY; 307 439 308 440 if (firmware_has_feature(FW_FEATURE_OPAL)) 309 441 root = of_find_node_by_path("/ibm,opal"); ··· 477 313 &distance_ref_points_depth); 478 314 479 315 if (!distance_ref_points) { 480 - dbg("NUMA: ibm,associativity-reference-points not found.\n"); 316 + pr_debug("ibm,associativity-reference-points not found.\n"); 481 317 goto err; 482 318 } 483 319 484 320 distance_ref_points_depth /= sizeof(int); 485 - 486 - if (firmware_has_feature(FW_FEATURE_OPAL) || 487 - firmware_has_feature(FW_FEATURE_TYPE1_AFFINITY)) { 488 - dbg("Using form 1 affinity\n"); 489 - form1_affinity = 1; 
490 - } 491 - 492 - if (form1_affinity) { 493 - depth = of_read_number(distance_ref_points, 1); 494 - } else { 321 + if (affinity_form == FORM0_AFFINITY) { 495 322 if (distance_ref_points_depth < 2) { 496 - printk(KERN_WARNING "NUMA: " 497 - "short ibm,associativity-reference-points\n"); 323 + pr_warn("short ibm,associativity-reference-points\n"); 498 324 goto err; 499 325 } 500 326 501 - depth = of_read_number(&distance_ref_points[1], 1); 327 + index = of_read_number(&distance_ref_points[1], 1); 328 + } else { 329 + /* 330 + * Both FORM1 and FORM2 affinity find the primary domain details 331 + * at the same offset. 332 + */ 333 + index = of_read_number(distance_ref_points, 1); 502 334 } 503 - 504 335 /* 505 336 * Warn and cap if the hardware supports more than 506 337 * MAX_DISTANCE_REF_POINTS domains. 507 338 */ 508 339 if (distance_ref_points_depth > MAX_DISTANCE_REF_POINTS) { 509 - printk(KERN_WARNING "NUMA: distance array capped at " 510 - "%d entries\n", MAX_DISTANCE_REF_POINTS); 340 + pr_warn("distance array capped at %d entries\n", 341 + MAX_DISTANCE_REF_POINTS); 511 342 distance_ref_points_depth = MAX_DISTANCE_REF_POINTS; 512 343 } 513 344 514 345 of_node_put(root); 515 - return depth; 346 + return index; 516 347 517 348 err: 518 349 of_node_put(root); ··· 585 426 return 0; 586 427 } 587 428 429 + static int get_nid_and_numa_distance(struct drmem_lmb *lmb) 430 + { 431 + struct assoc_arrays aa = { .arrays = NULL }; 432 + int default_nid = NUMA_NO_NODE; 433 + int nid = default_nid; 434 + int rc, index; 435 + 436 + if ((primary_domain_index < 0) || !numa_enabled) 437 + return default_nid; 438 + 439 + rc = of_get_assoc_arrays(&aa); 440 + if (rc) 441 + return default_nid; 442 + 443 + if (primary_domain_index <= aa.array_sz && 444 + !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) { 445 + const __be32 *associativity; 446 + 447 + index = lmb->aa_index * aa.array_sz; 448 + associativity = &aa.arrays[index]; 449 + nid = 
__associativity_to_nid(associativity, aa.array_sz); 450 + if (nid > 0 && affinity_form == FORM1_AFFINITY) { 451 + /* 452 + * lookup array associativity entries have 453 + * no length of the array as the first element. 454 + */ 455 + __initialize_form1_numa_distance(associativity, aa.array_sz); 456 + } 457 + } 458 + return nid; 459 + } 460 + 588 461 /* 589 462 * This is like of_node_to_nid_single() for memory represented in the 590 463 * ibm,dynamic-reconfiguration-memory node. ··· 628 437 int nid = default_nid; 629 438 int rc, index; 630 439 631 - if ((min_common_depth < 0) || !numa_enabled) 440 + if ((primary_domain_index < 0) || !numa_enabled) 632 441 return default_nid; 633 442 634 443 rc = of_get_assoc_arrays(&aa); 635 444 if (rc) 636 445 return default_nid; 637 446 638 - if (min_common_depth <= aa.array_sz && 447 + if (primary_domain_index <= aa.array_sz && 639 448 !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) { 640 - index = lmb->aa_index * aa.array_sz + min_common_depth - 1; 641 - nid = of_read_number(&aa.arrays[index], 1); 449 + const __be32 *associativity; 642 450 643 - if (nid == 0xffff || nid >= nr_node_ids) 644 - nid = default_nid; 645 - 646 - if (nid > 0) { 647 - index = lmb->aa_index * aa.array_sz; 648 - initialize_distance_lookup_table(nid, 649 - &aa.arrays[index]); 650 - } 451 + index = lmb->aa_index * aa.array_sz; 452 + associativity = &aa.arrays[index]; 453 + nid = __associativity_to_nid(associativity, aa.array_sz); 651 454 } 652 - 653 455 return nid; 654 456 } 655 457 656 458 #ifdef CONFIG_PPC_SPLPAR 657 - static int vphn_get_nid(long lcpu) 459 + 460 + static int __vphn_get_associativity(long lcpu, __be32 *associativity) 658 461 { 659 - __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; 660 462 long rc, hwid; 661 463 662 464 /* ··· 669 485 670 486 rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity); 671 487 if (rc == H_SUCCESS) 672 - return associativity_to_nid(associativity); 488 + return 0; 673 489 } 674 490 491 + return 
-1; 492 + } 493 + 494 + static int vphn_get_nid(long lcpu) 495 + { 496 + __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; 497 + 498 + 499 + if (!__vphn_get_associativity(lcpu, associativity)) 500 + return associativity_to_nid(associativity); 501 + 675 502 return NUMA_NO_NODE; 503 + 676 504 } 677 505 #else 506 + 507 + static int __vphn_get_associativity(long lcpu, __be32 *associativity) 508 + { 509 + return -1; 510 + } 511 + 678 512 static int vphn_get_nid(long unused) 679 513 { 680 514 return NUMA_NO_NODE; ··· 800 598 801 599 static int ppc_numa_cpu_dead(unsigned int cpu) 802 600 { 803 - #ifdef CONFIG_HOTPLUG_CPU 804 - unmap_cpu_from_node(cpu); 805 - #endif 806 601 return 0; 807 602 } 808 603 ··· 884 685 size = read_n_cells(n_mem_size_cells, usm); 885 686 } 886 687 887 - nid = of_drconf_to_nid_single(lmb); 688 + nid = get_nid_and_numa_distance(lmb); 888 689 fake_numa_create_new_node(((base + size) >> PAGE_SHIFT), 889 690 &nid); 890 691 node_set_online(nid); ··· 901 702 struct device_node *memory; 902 703 int default_nid = 0; 903 704 unsigned long i; 705 + const __be32 *associativity; 904 706 905 707 if (numa_enabled == 0) { 906 - printk(KERN_WARNING "NUMA disabled by user\n"); 708 + pr_warn("disabled by user\n"); 907 709 return -1; 908 710 } 909 711 910 - min_common_depth = find_min_common_depth(); 712 + primary_domain_index = find_primary_domain_index(); 911 713 912 - if (min_common_depth < 0) { 714 + if (primary_domain_index < 0) { 913 715 /* 914 - * if we fail to parse min_common_depth from device tree 716 + * if we fail to parse primary_domain_index from device tree 915 717 * mark the numa disabled, boot with numa disabled. 
916 718 */ 917 719 numa_enabled = false; 918 - return min_common_depth; 720 + return primary_domain_index; 919 721 } 920 722 921 - dbg("NUMA associativity depth for CPU/Memory: %d\n", min_common_depth); 723 + pr_debug("associativity depth for CPU/Memory: %d\n", primary_domain_index); 724 + 725 + /* 726 + * If it is FORM2 initialize the distance table here. 727 + */ 728 + if (affinity_form == FORM2_AFFINITY) 729 + initialize_form2_numa_distance_lookup_table(); 922 730 923 731 /* 924 732 * Even though we connect cpus to numa domains later in SMP ··· 933 727 * each node to be onlined must have NODE_DATA etc backing it. 934 728 */ 935 729 for_each_present_cpu(i) { 730 + __be32 vphn_assoc[VPHN_ASSOC_BUFSIZE]; 936 731 struct device_node *cpu; 937 - int nid = vphn_get_nid(i); 732 + int nid = NUMA_NO_NODE; 938 733 939 - /* 940 - * Don't fall back to default_nid yet -- we will plug 941 - * cpus into nodes once the memory scan has discovered 942 - * the topology. 943 - */ 944 - if (nid == NUMA_NO_NODE) { 734 + memset(vphn_assoc, 0, VPHN_ASSOC_BUFSIZE * sizeof(__be32)); 735 + 736 + if (__vphn_get_associativity(i, vphn_assoc) == 0) { 737 + nid = associativity_to_nid(vphn_assoc); 738 + initialize_form1_numa_distance(vphn_assoc); 739 + } else { 740 + 741 + /* 742 + * Don't fall back to default_nid yet -- we will plug 743 + * cpus into nodes once the memory scan has discovered 744 + * the topology. 745 + */ 945 746 cpu = of_get_cpu_node(i, NULL); 946 747 BUG_ON(!cpu); 947 - nid = of_node_to_nid_single(cpu); 748 + 749 + associativity = of_get_associativity(cpu); 750 + if (associativity) { 751 + nid = associativity_to_nid(associativity); 752 + initialize_form1_numa_distance(associativity); 753 + } 948 754 of_node_put(cpu); 949 755 } 950 756 ··· 992 774 * have associativity properties. If none, then 993 775 * everything goes to default_nid. 
994 776 */ 995 - nid = of_node_to_nid_single(memory); 996 - if (nid < 0) 777 + associativity = of_get_associativity(memory); 778 + if (associativity) { 779 + nid = associativity_to_nid(associativity); 780 + initialize_form1_numa_distance(associativity); 781 + } else 997 782 nid = default_nid; 998 783 999 784 fake_numa_create_new_node(((start + size) >> PAGE_SHIFT), &nid); ··· 1032 811 unsigned int nid = 0; 1033 812 int i; 1034 813 1035 - printk(KERN_DEBUG "Top of RAM: 0x%lx, Total RAM: 0x%lx\n", 1036 - top_of_ram, total_ram); 1037 - printk(KERN_DEBUG "Memory hole size: %ldMB\n", 1038 - (top_of_ram - total_ram) >> 20); 814 + pr_debug("Top of RAM: 0x%lx, Total RAM: 0x%lx\n", top_of_ram, total_ram); 815 + pr_debug("Memory hole size: %ldMB\n", (top_of_ram - total_ram) >> 20); 1039 816 1040 817 for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) { 1041 818 fake_numa_create_new_node(end_pfn, &nid); ··· 1112 893 static void __init find_possible_nodes(void) 1113 894 { 1114 895 struct device_node *rtas; 1115 - const __be32 *domains; 896 + const __be32 *domains = NULL; 1116 897 int prop_length, max_nodes; 1117 898 u32 i; 1118 899 ··· 1128 909 * it doesn't exist, then fallback on ibm,max-associativity-domains. 1129 910 * Current denotes what the platform can support compared to max 1130 911 * which denotes what the Hypervisor can support. 912 + * 913 + * If the LPAR is migratable, new nodes might be activated after a LPM, 914 + * so we should consider the max number in that case. 
1131 915 */ 1132 - domains = of_get_property(rtas, "ibm,current-associativity-domains", 1133 - &prop_length); 916 + if (!of_get_property(of_root, "ibm,migratable-partition", NULL)) 917 + domains = of_get_property(rtas, 918 + "ibm,current-associativity-domains", 919 + &prop_length); 1134 920 if (!domains) { 1135 921 domains = of_get_property(rtas, "ibm,max-associativity-domains", 1136 922 &prop_length); ··· 1143 919 goto out; 1144 920 } 1145 921 1146 - max_nodes = of_read_number(&domains[min_common_depth], 1); 922 + max_nodes = of_read_number(&domains[primary_domain_index], 1); 923 + pr_info("Partition configured for %d NUMA nodes.\n", max_nodes); 924 + 1147 925 for (i = 0; i < max_nodes; i++) { 1148 926 if (!node_possible(i)) 1149 927 node_set(i, node_possible_map); 1150 928 } 1151 929 1152 930 prop_length /= sizeof(int); 1153 - if (prop_length > min_common_depth + 2) 931 + if (prop_length > primary_domain_index + 2) 1154 932 coregroup_enabled = 1; 1155 933 1156 934 out: ··· 1239 1013 1240 1014 if (strstr(p, "off")) 1241 1015 numa_enabled = 0; 1242 - 1243 - if (strstr(p, "debug")) 1244 - numa_debug = 1; 1245 1016 1246 1017 p = strstr(p, "fake="); 1247 1018 if (p) ··· 1402 1179 1403 1180 switch (rc) { 1404 1181 case H_SUCCESS: 1405 - dbg("VPHN hcall succeeded. Reset polling...\n"); 1182 + pr_debug("VPHN hcall succeeded. Reset polling...\n"); 1406 1183 goto out; 1407 1184 1408 1185 case H_FUNCTION: ··· 1482 1259 goto out; 1483 1260 1484 1261 index = of_read_number(associativity, 1); 1485 - if (index > min_common_depth + 1) 1262 + if (index > primary_domain_index + 1) 1486 1263 return of_read_number(&associativity[index - 1], 1); 1487 1264 1488 1265 out:
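The FORM2 layout that `initialize_form2_numa_distance_lookup_table()` consumes above — one encode-int element count followed by one byte per distance, row-major over the `ibm,numa-lookup-index-table` order — can be sketched in plain userspace C. This is an illustration only, not the kernel code; the names (`parse_form2_distance`, `read_be32`, `MAX_NODES`) and the flat `lookup[]` argument are invented for the sketch:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8

static int numa_distance[MAX_NODES][MAX_NODES];

/* Device-tree cells are big-endian; decode one encode-int. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/*
 * prop: raw "ibm,numa-distance-table" bytes (length cell + N*N distances).
 * lookup[]: matrix index -> NUMA node id, as given by the
 * "ibm,numa-lookup-index-table" property.  Returns 0 on success, -1 when
 * the element count does not match max_index * max_index (the case the
 * kernel WARNs about and falls back to LOCAL/REMOTE distances for).
 */
static int parse_form2_distance(const uint8_t *prop, const int *lookup,
				int max_index)
{
	uint32_t nelems = read_be32(prop);
	const uint8_t *dist = prop + 4;	/* skip the encode-int length */
	int i, j, k = 0;

	if (nelems != (uint32_t)(max_index * max_index))
		return -1;

	for (i = 0; i < max_index; i++)
		for (j = 0; j < max_index; j++)
			numa_distance[lookup[i]][lookup[j]] = dist[k++];
	return 0;
}
```

Note the indirection through `lookup[]`: FORM2 domain IDs need not be dense, so the i-th row/column of the flat table belongs to the i-th entry of the lookup-index table, not to node i.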
+4 -2
arch/powerpc/mm/ptdump/8xx.c
··· 75 75 }; 76 76 77 77 struct pgtable_level pg_level[5] = { 78 - { 79 - }, { /* pgd */ 78 + { /* pgd */ 79 + .flag = flag_array, 80 + .num = ARRAY_SIZE(flag_array), 81 + }, { /* p4d */ 80 82 .flag = flag_array, 81 83 .num = ARRAY_SIZE(flag_array), 82 84 }, { /* pud */
+7 -2
arch/powerpc/mm/ptdump/Makefile
··· 5 5 obj-$(CONFIG_4xx) += shared.o 6 6 obj-$(CONFIG_PPC_8xx) += 8xx.o 7 7 obj-$(CONFIG_PPC_BOOK3E_MMU) += shared.o 8 - obj-$(CONFIG_PPC_BOOK3S_32) += shared.o bats.o segment_regs.o 9 - obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o hashpagetable.o 8 + obj-$(CONFIG_PPC_BOOK3S_32) += shared.o 9 + obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o 10 + 11 + ifdef CONFIG_PTDUMP_DEBUGFS 12 + obj-$(CONFIG_PPC_BOOK3S_32) += bats.o segment_regs.o 13 + obj-$(CONFIG_PPC_BOOK3S_64) += hashpagetable.o 14 + endif
+4 -14
arch/powerpc/mm/ptdump/bats.c
··· 7 7 */ 8 8 9 9 #include <linux/pgtable.h> 10 - #include <asm/debugfs.h> 10 + #include <linux/debugfs.h> 11 11 #include <asm/cpu_has_feature.h> 12 12 13 13 #include "ptdump.h" ··· 57 57 58 58 #define BAT_SHOW_603(_m, _n, _l, _u, _d) bat_show_603(_m, _n, mfspr(_l), mfspr(_u), _d) 59 59 60 - static int bats_show_603(struct seq_file *m, void *v) 60 + static int bats_show(struct seq_file *m, void *v) 61 61 { 62 62 seq_puts(m, "---[ Instruction Block Address Translation ]---\n"); 63 63 ··· 88 88 return 0; 89 89 } 90 90 91 - static int bats_open(struct inode *inode, struct file *file) 92 - { 93 - return single_open(file, bats_show_603, NULL); 94 - } 95 - 96 - static const struct file_operations bats_fops = { 97 - .open = bats_open, 98 - .read = seq_read, 99 - .llseek = seq_lseek, 100 - .release = single_release, 101 - }; 91 + DEFINE_SHOW_ATTRIBUTE(bats); 102 92 103 93 static int __init bats_init(void) 104 94 { 105 95 debugfs_create_file("block_address_translation", 0400, 106 - powerpc_debugfs_root, NULL, &bats_fops); 96 + arch_debugfs_dir, NULL, &bats_fops); 107 97 return 0; 108 98 } 109 99 device_initcall(bats_init);
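The conversions in this and the following ptdump files all rely on the same trick: `DEFINE_SHOW_ATTRIBUTE(name)` token-pastes `name_show` into generated `open`/`fops` boilerplate, so each debugfs file no longer hand-writes its own `*_open()` and `file_operations`. A toy userspace imitation of the pattern (the structs here are stand-ins, not the kernel's `seq_file` machinery):

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the kernel's struct file_operations. */
struct file_operations {
	int (*show)(void);
	const char *name;
};

/*
 * Toy version of the kernel macro: given "bats", generate a static
 * "bats_fops" wired to the existing "bats_show" function.  The real
 * macro additionally generates a <name>_open() wrapper around
 * single_open() and fills in seq_read/seq_lseek/single_release.
 */
#define DEFINE_SHOW_ATTRIBUTE(__name)				\
	static const struct file_operations __name##_fops = {	\
		.show = __name##_show,				\
		.name = #__name,				\
	}

static int bats_show(void)
{
	return 42;	/* the real one prints BAT registers to a seq_file */
}
DEFINE_SHOW_ATTRIBUTE(bats);
```

This is why the diff can delete ten lines of `bats_open()`/`bats_fops` per file and keep only the `*_show()` body.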
+4 -2
arch/powerpc/mm/ptdump/book3s64.c
··· 103 103 }; 104 104 105 105 struct pgtable_level pg_level[5] = { 106 - { 107 - }, { /* pgd */ 106 + { /* pgd */ 107 + .flag = flag_array, 108 + .num = ARRAY_SIZE(flag_array), 109 + }, { /* p4d */ 108 110 .flag = flag_array, 109 111 .num = ARRAY_SIZE(flag_array), 110 112 }, { /* pud */
+1 -11
arch/powerpc/mm/ptdump/hashpagetable.c
··· 526 526 return 0; 527 527 } 528 528 529 - static int ptdump_open(struct inode *inode, struct file *file) 530 - { 531 - return single_open(file, ptdump_show, NULL); 532 - } 533 - 534 - static const struct file_operations ptdump_fops = { 535 - .open = ptdump_open, 536 - .read = seq_read, 537 - .llseek = seq_lseek, 538 - .release = single_release, 539 - }; 529 + DEFINE_SHOW_ATTRIBUTE(ptdump); 540 530 541 531 static int ptdump_init(void) 542 532 {
+48 -130
arch/powerpc/mm/ptdump/ptdump.c
··· 16 16 #include <linux/io.h> 17 17 #include <linux/mm.h> 18 18 #include <linux/highmem.h> 19 + #include <linux/ptdump.h> 19 20 #include <linux/sched.h> 20 21 #include <linux/seq_file.h> 21 22 #include <asm/fixmap.h> ··· 55 54 * 56 55 */ 57 56 struct pg_state { 57 + struct ptdump_state ptdump; 58 58 struct seq_file *seq; 59 59 const struct addr_marker *marker; 60 60 unsigned long start_address; 61 61 unsigned long start_pa; 62 - unsigned int level; 62 + int level; 63 63 u64 current_flags; 64 64 bool check_wx; 65 65 unsigned long wx_pages; ··· 102 100 { 0, "kasan shadow mem end" }, 103 101 #endif 104 102 { -1, NULL }, 103 + }; 104 + 105 + static struct ptdump_range ptdump_range[] __ro_after_init = { 106 + {TASK_SIZE_MAX, ~0UL}, 107 + {0, 0} 105 108 }; 106 109 107 110 #define pt_dump_seq_printf(m, fmt, args...) \ ··· 195 188 st->wx_pages += (addr - st->start_address) / PAGE_SIZE; 196 189 } 197 190 198 - static void note_page_update_state(struct pg_state *st, unsigned long addr, 199 - unsigned int level, u64 val, unsigned long page_size) 191 + static void note_page_update_state(struct pg_state *st, unsigned long addr, int level, u64 val) 200 192 { 201 - u64 flag = val & pg_level[level].mask; 193 + u64 flag = level >= 0 ? val & pg_level[level].mask : 0; 202 194 u64 pa = val & PTE_RPN_MASK; 203 195 204 196 st->level = level; ··· 211 205 } 212 206 } 213 207 214 - static void note_page(struct pg_state *st, unsigned long addr, 215 - unsigned int level, u64 val, unsigned long page_size) 208 + static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level, u64 val) 216 209 { 217 - u64 flag = val & pg_level[level].mask; 210 + u64 flag = level >= 0 ? 
val & pg_level[level].mask : 0; 211 + struct pg_state *st = container_of(pt_st, struct pg_state, ptdump); 218 212 219 213 /* At first no level is set */ 220 - if (!st->level) { 214 + if (st->level == -1) { 221 215 pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name); 222 - note_page_update_state(st, addr, level, val, page_size); 216 + note_page_update_state(st, addr, level, val); 223 217 /* 224 218 * Dump the section of virtual memory when: 225 219 * - the PTE flags from one entry to the next differs. ··· 248 242 * Address indicates we have passed the end of the 249 243 * current section of virtual memory 250 244 */ 251 - note_page_update_state(st, addr, level, val, page_size); 252 - } 253 - } 254 - 255 - static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start) 256 - { 257 - pte_t *pte = pte_offset_kernel(pmd, 0); 258 - unsigned long addr; 259 - unsigned int i; 260 - 261 - for (i = 0; i < PTRS_PER_PTE; i++, pte++) { 262 - addr = start + i * PAGE_SIZE; 263 - note_page(st, addr, 4, pte_val(*pte), PAGE_SIZE); 264 - 265 - } 266 - } 267 - 268 - static void walk_hugepd(struct pg_state *st, hugepd_t *phpd, unsigned long start, 269 - int pdshift, int level) 270 - { 271 - #ifdef CONFIG_ARCH_HAS_HUGEPD 272 - unsigned int i; 273 - int shift = hugepd_shift(*phpd); 274 - int ptrs_per_hpd = pdshift - shift > 0 ? 
1 << (pdshift - shift) : 1; 275 - 276 - if (start & ((1 << shift) - 1)) 277 - return; 278 - 279 - for (i = 0; i < ptrs_per_hpd; i++) { 280 - unsigned long addr = start + (i << shift); 281 - pte_t *pte = hugepte_offset(*phpd, addr, pdshift); 282 - 283 - note_page(st, addr, level + 1, pte_val(*pte), 1 << shift); 284 - } 285 - #endif 286 - } 287 - 288 - static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) 289 - { 290 - pmd_t *pmd = pmd_offset(pud, 0); 291 - unsigned long addr; 292 - unsigned int i; 293 - 294 - for (i = 0; i < PTRS_PER_PMD; i++, pmd++) { 295 - addr = start + i * PMD_SIZE; 296 - if (!pmd_none(*pmd) && !pmd_is_leaf(*pmd)) 297 - /* pmd exists */ 298 - walk_pte(st, pmd, addr); 299 - else 300 - note_page(st, addr, 3, pmd_val(*pmd), PMD_SIZE); 301 - } 302 - } 303 - 304 - static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start) 305 - { 306 - pud_t *pud = pud_offset(p4d, 0); 307 - unsigned long addr; 308 - unsigned int i; 309 - 310 - for (i = 0; i < PTRS_PER_PUD; i++, pud++) { 311 - addr = start + i * PUD_SIZE; 312 - if (!pud_none(*pud) && !pud_is_leaf(*pud)) 313 - /* pud exists */ 314 - walk_pmd(st, pud, addr); 315 - else 316 - note_page(st, addr, 2, pud_val(*pud), PUD_SIZE); 317 - } 318 - } 319 - 320 - static void walk_pagetables(struct pg_state *st) 321 - { 322 - unsigned int i; 323 - unsigned long addr = st->start_address & PGDIR_MASK; 324 - pgd_t *pgd = pgd_offset_k(addr); 325 - 326 - /* 327 - * Traverse the linux pagetable structure and dump pages that are in 328 - * the hash pagetable. 
329 - */ 330 - for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) { 331 - p4d_t *p4d = p4d_offset(pgd, 0); 332 - 333 - if (p4d_none(*p4d) || p4d_is_leaf(*p4d)) 334 - note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE); 335 - else if (is_hugepd(__hugepd(p4d_val(*p4d)))) 336 - walk_hugepd(st, (hugepd_t *)p4d, addr, PGDIR_SHIFT, 1); 337 - else 338 - /* p4d exists */ 339 - walk_pud(st, p4d, addr); 245 + note_page_update_state(st, addr, level, val); 340 246 } 341 247 } 342 248 ··· 301 383 struct pg_state st = { 302 384 .seq = m, 303 385 .marker = address_markers, 304 - .start_address = IS_ENABLED(CONFIG_PPC64) ? PAGE_OFFSET : TASK_SIZE, 386 + .level = -1, 387 + .ptdump = { 388 + .note_page = note_page, 389 + .range = ptdump_range, 390 + } 305 391 }; 306 392 307 - #ifdef CONFIG_PPC64 308 - if (!radix_enabled()) 309 - st.start_address = KERN_VIRT_START; 310 - #endif 311 - 312 393 /* Traverse kernel page tables */ 313 - walk_pagetables(&st); 314 - note_page(&st, 0, 0, 0, 0); 394 + ptdump_walk_pgd(&st.ptdump, &init_mm, NULL); 315 395 return 0; 316 396 } 317 397 318 - 319 - static int ptdump_open(struct inode *inode, struct file *file) 320 - { 321 - return single_open(file, ptdump_show, NULL); 322 - } 323 - 324 - static const struct file_operations ptdump_fops = { 325 - .open = ptdump_open, 326 - .read = seq_read, 327 - .llseek = seq_lseek, 328 - .release = single_release, 329 - }; 398 + DEFINE_SHOW_ATTRIBUTE(ptdump); 330 399 331 400 static void build_pgtable_complete_mask(void) 332 401 { ··· 325 420 pg_level[i].mask |= pg_level[i].flag[j].mask; 326 421 } 327 422 328 - #ifdef CONFIG_PPC_DEBUG_WX 423 + #ifdef CONFIG_DEBUG_WX 329 424 void ptdump_check_wx(void) 330 425 { 331 426 struct pg_state st = { 332 427 .seq = NULL, 333 - .marker = address_markers, 428 + .marker = (struct addr_marker[]) { 429 + { 0, NULL}, 430 + { -1, NULL}, 431 + }, 432 + .level = -1, 334 433 .check_wx = true, 335 - .start_address = IS_ENABLED(CONFIG_PPC64) ? 
PAGE_OFFSET : TASK_SIZE, 434 + .ptdump = { 435 + .note_page = note_page, 436 + .range = ptdump_range, 437 + } 336 438 }; 337 439 338 - #ifdef CONFIG_PPC64 339 - if (!radix_enabled()) 340 - st.start_address = KERN_VIRT_START; 341 - #endif 342 - 343 - walk_pagetables(&st); 440 + ptdump_walk_pgd(&st.ptdump, &init_mm, NULL); 344 441 345 442 if (st.wx_pages) 346 443 pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found\n", ··· 352 445 } 353 446 #endif 354 447 355 - static int ptdump_init(void) 448 + static int __init ptdump_init(void) 356 449 { 450 + #ifdef CONFIG_PPC64 451 + if (!radix_enabled()) 452 + ptdump_range[0].start = KERN_VIRT_START; 453 + else 454 + ptdump_range[0].start = PAGE_OFFSET; 455 + 456 + ptdump_range[0].end = PAGE_OFFSET + (PGDIR_SIZE * PTRS_PER_PGD); 457 + #endif 458 + 357 459 populate_markers(); 358 460 build_pgtable_complete_mask(); 359 - debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, 360 - &ptdump_fops); 461 + 462 + if (IS_ENABLED(CONFIG_PTDUMP_DEBUGFS)) 463 + debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops); 464 + 361 465 return 0; 362 466 } 363 467 device_initcall(ptdump_init);
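The state machine that survives the conversion above — `note_page()` folding consecutive pages with identical protection flags into one printed range — can be sketched in isolation. This is a simplified userspace model, not the kernel code; `struct range`, `coalesce()`, and the flat `flags[]` input are invented for illustration (the real walker gets callbacks per level from `ptdump_walk_pgd()` and also tracks physical addresses):

```c
#include <assert.h>
#include <stddef.h>

struct range {
	unsigned long start, end;
	unsigned long flags;
};

/*
 * Fold npages consecutive pages into maximal runs of identical flags.
 * A new output range starts exactly when the flags differ from the
 * current run's flags -- the same condition note_page() uses to decide
 * when to print a line of /sys/kernel/debug/kernel_page_tables.
 * Returns the number of ranges written to out[].
 */
static size_t coalesce(const unsigned long *flags, size_t npages,
		       unsigned long page_size, struct range *out)
{
	size_t n = 0, i;

	if (!npages)
		return 0;

	out[0].start = 0;
	out[0].flags = flags[0];
	for (i = 1; i < npages; i++) {
		if (flags[i] != out[n].flags) {
			out[n].end = i * page_size;
			n++;
			out[n].start = i * page_size;
			out[n].flags = flags[i];
		}
	}
	out[n].end = npages * page_size;
	return n + 1;
}
```

The W+X check in the patch is the same walk with printing suppressed: it only counts pages whose coalesced flags contain both write and execute.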
+3 -13
arch/powerpc/mm/ptdump/segment_regs.c
··· 6 6 * This dumps the content of Segment Registers 7 7 */ 8 8 9 - #include <asm/debugfs.h> 9 + #include <linux/debugfs.h> 10 10 11 11 static void seg_show(struct seq_file *m, int i) 12 12 { ··· 41 41 return 0; 42 42 } 43 43 44 - static int sr_open(struct inode *inode, struct file *file) 45 - { 46 - return single_open(file, sr_show, NULL); 47 - } 48 - 49 - static const struct file_operations sr_fops = { 50 - .open = sr_open, 51 - .read = seq_read, 52 - .llseek = seq_lseek, 53 - .release = single_release, 54 - }; 44 + DEFINE_SHOW_ATTRIBUTE(sr); 55 45 56 46 static int __init sr_init(void) 57 47 { 58 - debugfs_create_file("segment_registers", 0400, powerpc_debugfs_root, 48 + debugfs_create_file("segment_registers", 0400, arch_debugfs_dir, 59 49 NULL, &sr_fops); 60 50 return 0; 61 51 }
+4 -2
arch/powerpc/mm/ptdump/shared.c
··· 68 68 }; 69 69 70 70 struct pgtable_level pg_level[5] = { 71 - { 72 - }, { /* pgd */ 71 + { /* pgd */ 72 + .flag = flag_array, 73 + .num = ARRAY_SIZE(flag_array), 74 + }, { /* p4d */ 73 75 .flag = flag_array, 74 76 .num = ARRAY_SIZE(flag_array), 75 77 }, { /* pud */
+11 -10
arch/powerpc/perf/core-book3s.c
··· 340 340 * If the PMU doesn't update the SIAR for non marked events use 341 341 * pt_regs. 342 342 * 343 + * If regs is a kernel interrupt, always use SIAR. Some PMUs have an 344 + * issue with regs_sipr not being in synch with SIAR in interrupt entry 345 + * and return sequences, which can result in regs_sipr being true for 346 + * kernel interrupts and SIAR, which has the effect of causing samples 347 + * to pile up at mtmsrd MSR[EE] 0->1 or pending irq replay around 348 + * interrupt entry/exit. 349 + * 343 350 * If the PMU has HV/PR flags then check to see if they 344 351 * place the exception in userspace. If so, use pt_regs. In 345 352 * continuous sampling mode the SIAR and the PMU exception are ··· 363 356 use_siar = 1; 364 357 else if ((ppmu->flags & PPMU_NO_CONT_SAMPLING)) 365 358 use_siar = 0; 359 + else if (!user_mode(regs)) 360 + use_siar = 1; 366 361 else if (!(ppmu->flags & PPMU_NO_SIPR) && regs_sipr(regs)) 367 362 use_siar = 0; 368 363 else ··· 2260 2251 */ 2261 2252 unsigned long perf_instruction_pointer(struct pt_regs *regs) 2262 2253 { 2263 - bool use_siar = regs_use_siar(regs); 2264 2254 unsigned long siar = mfspr(SPRN_SIAR); 2265 2255 2266 - if (ppmu && (ppmu->flags & PPMU_P10_DD1)) { 2267 - if (siar) 2268 - return siar; 2269 - else 2270 - return regs->nip; 2271 - } else if (use_siar && siar_valid(regs)) 2272 - return mfspr(SPRN_SIAR) + perf_ip_adjust(regs); 2273 - else if (use_siar) 2274 - return 0; // no valid instruction pointer 2256 + if (regs_use_siar(regs) && siar_valid(regs) && siar) 2257 + return siar + perf_ip_adjust(regs); 2275 2258 else 2276 2259 return regs->nip; 2277 2260 }
+1 -1
arch/powerpc/perf/hv-gpci.c
··· 175 175 */ 176 176 count = 0; 177 177 for (i = offset; i < offset + length; i++) 178 - count |= arg->bytes[i] << (i - offset); 178 + count |= (u64)(arg->bytes[i]) << ((length - 1 - (i - offset)) * 8); 179 179 180 180 *value = count; 181 181 out:
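The hv-gpci one-liner above is worth unpacking: the counter bytes arrive most-significant first, so byte `(i - offset)` must land at bit position `(length - 1 - (i - offset)) * 8`. The old expression shifted by `(i - offset)` *bits*, so every byte past the first was shifted by 1, 2, 3... bits and the assembled value was garbage. A userspace illustration of the corrected assembly (the function name is invented here):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assemble a big-endian counter of `length` bytes starting at
 * bytes[offset], exactly as the fixed hv-gpci loop does: the first
 * byte is the most significant, and each byte is shifted by a whole
 * number of bytes, not bits.
 */
static uint64_t be_bytes_to_u64(const uint8_t *bytes, int offset, int length)
{
	uint64_t count = 0;
	int i;

	for (i = offset; i < offset + length; i++)
		count |= (uint64_t)bytes[i] << ((length - 1 - (i - offset)) * 8);
	return count;
}
```

The cast to `uint64_t` before shifting also matters: without it, a byte shifted by 24 or more would be promoted only to `int` and overflow for counters wider than 4 bytes.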
+2 -2
arch/powerpc/platforms/44x/machine_check.c
··· 11 11 12 12 int machine_check_440A(struct pt_regs *regs) 13 13 { 14 - unsigned long reason = regs->dsisr; 14 + unsigned long reason = regs->esr; 15 15 16 16 printk("Machine check in kernel mode.\n"); 17 17 if (reason & ESR_IMCP){ ··· 48 48 #ifdef CONFIG_PPC_47x 49 49 int machine_check_47x(struct pt_regs *regs) 50 50 { 51 - unsigned long reason = regs->dsisr; 51 + unsigned long reason = regs->esr; 52 52 u32 mcsr; 53 53 54 54 printk(KERN_ERR "Machine check in kernel mode.\n");
+1 -1
arch/powerpc/platforms/4xx/machine_check.c
··· 10 10 11 11 int machine_check_4xx(struct pt_regs *regs) 12 12 { 13 - unsigned long reason = regs->dsisr; 13 + unsigned long reason = regs->esr; 14 14 15 15 if (reason & ESR_IMCP) { 16 16 printk("Instruction");
-6
arch/powerpc/platforms/85xx/Kconfig
··· 208 208 select TQM85xx 209 209 select CPM2 210 210 211 - config SBC8548 212 - bool "Wind River SBC8548" 213 - select DEFAULT_UIMAGE 214 - help 215 - This option enables support for the Wind River SBC8548 board 216 - 217 211 config PPA8548 218 212 bool "Prodrive PPA8548" 219 213 help
-1
arch/powerpc/platforms/85xx/Makefile
··· 26 26 obj-$(CONFIG_FB_FSL_DIU) += t1042rdb_diu.o 27 27 obj-$(CONFIG_STX_GP3) += stx_gp3.o 28 28 obj-$(CONFIG_TQM85xx) += tqm85xx.o 29 - obj-$(CONFIG_SBC8548) += sbc8548.o 30 29 obj-$(CONFIG_PPA8548) += ppa8548.o 31 30 obj-$(CONFIG_SOCRATES) += socrates.o socrates_fpga_pic.o 32 31 obj-$(CONFIG_KSI8560) += ksi8560.o
-134
arch/powerpc/platforms/85xx/sbc8548.c
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * Wind River SBC8548 setup and early boot code.
-  *
-  * Copyright 2007 Wind River Systems Inc.
-  *
-  * By Paul Gortmaker (see MAINTAINERS for contact information)
-  *
-  * Based largely on the MPC8548CDS support - Copyright 2005 Freescale Inc.
-  */
-
- #include <linux/stddef.h>
- #include <linux/kernel.h>
- #include <linux/init.h>
- #include <linux/errno.h>
- #include <linux/reboot.h>
- #include <linux/pci.h>
- #include <linux/kdev_t.h>
- #include <linux/major.h>
- #include <linux/console.h>
- #include <linux/delay.h>
- #include <linux/seq_file.h>
- #include <linux/initrd.h>
- #include <linux/interrupt.h>
- #include <linux/fsl_devices.h>
- #include <linux/of_platform.h>
- #include <linux/pgtable.h>
-
- #include <asm/page.h>
- #include <linux/atomic.h>
- #include <asm/time.h>
- #include <asm/io.h>
- #include <asm/machdep.h>
- #include <asm/ipic.h>
- #include <asm/pci-bridge.h>
- #include <asm/irq.h>
- #include <mm/mmu_decl.h>
- #include <asm/prom.h>
- #include <asm/udbg.h>
- #include <asm/mpic.h>
-
- #include <sysdev/fsl_soc.h>
- #include <sysdev/fsl_pci.h>
-
- #include "mpc85xx.h"
-
- static int sbc_rev;
-
- static void __init sbc8548_pic_init(void)
- {
-         struct mpic *mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN,
-                         0, 256, " OpenPIC  ");
-         BUG_ON(mpic == NULL);
-         mpic_init(mpic);
- }
-
- /* Extract the HW Rev from the EPLD on the board */
- static int __init sbc8548_hw_rev(void)
- {
-         struct device_node *np;
-         struct resource res;
-         unsigned int *rev;
-         int board_rev = 0;
-
-         np = of_find_compatible_node(NULL, NULL, "hw-rev");
-         if (np == NULL) {
-                 printk("No HW-REV found in DTB.\n");
-                 return -ENODEV;
-         }
-
-         of_address_to_resource(np, 0, &res);
-         of_node_put(np);
-
-         rev = ioremap(res.start, sizeof(unsigned int));
-         board_rev = (*rev) >> 28;
-         iounmap(rev);
-
-         return board_rev;
- }
-
- /*
-  * Setup the architecture
-  */
- static void __init sbc8548_setup_arch(void)
- {
-         if (ppc_md.progress)
-                 ppc_md.progress("sbc8548_setup_arch()", 0);
-
-         fsl_pci_assign_primary();
-
-         sbc_rev = sbc8548_hw_rev();
- }
-
- static void sbc8548_show_cpuinfo(struct seq_file *m)
- {
-         uint pvid, svid, phid1;
-
-         pvid = mfspr(SPRN_PVR);
-         svid = mfspr(SPRN_SVR);
-
-         seq_printf(m, "Vendor\t\t: Wind River\n");
-         seq_printf(m, "Machine\t\t: SBC8548 v%d\n", sbc_rev);
-         seq_printf(m, "PVR\t\t: 0x%x\n", pvid);
-         seq_printf(m, "SVR\t\t: 0x%x\n", svid);
-
-         /* Display cpu Pll setting */
-         phid1 = mfspr(SPRN_HID1);
-         seq_printf(m, "PLL setting\t: 0x%x\n", ((phid1 >> 24) & 0x3f));
- }
-
- machine_arch_initcall(sbc8548, mpc85xx_common_publish_devices);
-
- /*
-  * Called very early, device-tree isn't unflattened
-  */
- static int __init sbc8548_probe(void)
- {
-         return of_machine_is_compatible("SBC8548");
- }
-
- define_machine(sbc8548) {
-         .name           = "SBC8548",
-         .probe          = sbc8548_probe,
-         .setup_arch     = sbc8548_setup_arch,
-         .init_IRQ       = sbc8548_pic_init,
-         .show_cpuinfo   = sbc8548_show_cpuinfo,
-         .get_irq        = mpic_get_irq,
- #ifdef CONFIG_PCI
-         .pcibios_fixup_bus = fsl_pcibios_fixup_bus,
-         .pcibios_fixup_phb = fsl_pcibios_fixup_phb,
- #endif
-         .calibrate_decr = generic_calibrate_decr,
-         .progress       = udbg_progress,
- };
+1 -7
arch/powerpc/platforms/86xx/Kconfig
··· 20 20
          help
            This option enables support for the MPC8641 HPCN board.

- config SBC8641D
-         bool "Wind River SBC8641D"
-         select DEFAULT_UIMAGE
-         help
-           This option enables support for the WRS SBC8641D board.
-
  config MPC8610_HPCD
          bool "Freescale MPC8610 HPCD"
          select DEFAULT_UIMAGE
··· 68 74
          select FSL_PCI if PCI
          select PPC_UDBG_16550
          select MPIC
-         default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || GEF_PPC9A \
+         default y if MPC8641_HPCN || GEF_SBC610 || GEF_SBC310 || GEF_PPC9A \
                  || MVME7100

  config MPC8610
-1
arch/powerpc/platforms/86xx/Makefile
··· 6 6
  obj-y := pic.o common.o
  obj-$(CONFIG_SMP) += mpc86xx_smp.o
  obj-$(CONFIG_MPC8641_HPCN) += mpc86xx_hpcn.o
- obj-$(CONFIG_SBC8641D) += sbc8641d.o
  obj-$(CONFIG_MPC8610_HPCD) += mpc8610_hpcd.o
  obj-$(CONFIG_GEF_SBC610) += gef_sbc610.o
  obj-$(CONFIG_GEF_SBC310) += gef_sbc310.o
-87
arch/powerpc/platforms/86xx/sbc8641d.c
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * SBC8641D board specific routines
-  *
-  * Copyright 2008 Wind River Systems Inc.
-  *
-  * By Paul Gortmaker (see MAINTAINERS for contact information)
-  *
-  * Based largely on the 8641 HPCN support by Freescale Semiconductor Inc.
-  */
-
- #include <linux/stddef.h>
- #include <linux/kernel.h>
- #include <linux/pci.h>
- #include <linux/kdev_t.h>
- #include <linux/delay.h>
- #include <linux/seq_file.h>
- #include <linux/of_platform.h>
-
- #include <asm/time.h>
- #include <asm/machdep.h>
- #include <asm/pci-bridge.h>
- #include <asm/prom.h>
- #include <mm/mmu_decl.h>
- #include <asm/udbg.h>
-
- #include <asm/mpic.h>
-
- #include <sysdev/fsl_pci.h>
- #include <sysdev/fsl_soc.h>
-
- #include "mpc86xx.h"
-
- static void __init
- sbc8641_setup_arch(void)
- {
-         if (ppc_md.progress)
-                 ppc_md.progress("sbc8641_setup_arch()", 0);
-
-         printk("SBC8641 board from Wind River\n");
-
- #ifdef CONFIG_SMP
-         mpc86xx_smp_init();
- #endif
-
-         fsl_pci_assign_primary();
- }
-
-
- static void
- sbc8641_show_cpuinfo(struct seq_file *m)
- {
-         uint svid = mfspr(SPRN_SVR);
-
-         seq_printf(m, "Vendor\t\t: Wind River Systems\n");
-
-         seq_printf(m, "SVR\t\t: 0x%x\n", svid);
- }
-
-
- /*
-  * Called very early, device-tree isn't unflattened
-  */
- static int __init sbc8641_probe(void)
- {
-         if (of_machine_is_compatible("wind,sbc8641"))
-                 return 1;       /* Looks good */
-
-         return 0;
- }
-
- machine_arch_initcall(sbc8641, mpc86xx_common_publish_devices);
-
- define_machine(sbc8641) {
-         .name                   = "SBC8641D",
-         .probe                  = sbc8641_probe,
-         .setup_arch             = sbc8641_setup_arch,
-         .init_IRQ               = mpc86xx_init_irq,
-         .show_cpuinfo           = sbc8641_show_cpuinfo,
-         .get_irq                = mpic_get_irq,
-         .time_init              = mpc86xx_time_init,
-         .calibrate_decr         = generic_calibrate_decr,
-         .progress               = udbg_progress,
- #ifdef CONFIG_PCI
-         .pcibios_fixup_bus      = fsl_pcibios_fixup_bus,
- #endif
- };
+2 -2
arch/powerpc/platforms/cell/axon_msi.c
··· 12 12
  #include <linux/export.h>
  #include <linux/of_platform.h>
  #include <linux/slab.h>
+ #include <linux/debugfs.h>

- #include <asm/debugfs.h>
  #include <asm/dcr.h>
  #include <asm/machdep.h>
  #include <asm/prom.h>
··· 480 480

          snprintf(name, sizeof(name), "msic_%d", of_node_to_nid(dn));

-         debugfs_create_file(name, 0600, powerpc_debugfs_root, msic, &fops_msic);
+         debugfs_create_file(name, 0600, arch_debugfs_dir, msic, &fops_msic);
  }
  #endif /* DEBUG */
+1 -1
arch/powerpc/platforms/embedded6xx/holly.c
··· 251 251
          /* Are we prepared to handle this fault */
          if ((entry = search_exception_tables(regs->nip)) != NULL) {
                  tsi108_clear_pci_cfg_error();
-                 regs_set_return_msr(regs, regs->msr | MSR_RI);
+                 regs_set_recoverable(regs);
                  regs_set_return_ip(regs, extable_fixup(entry));
                  return 1;
          }
+1 -1
arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
··· 173 173
          /* Are we prepared to handle this fault */
          if ((entry = search_exception_tables(regs->nip)) != NULL) {
                  tsi108_clear_pci_cfg_error();
-                 regs_set_return_msr(regs, regs->msr | MSR_RI);
+                 regs_set_recoverable(regs);
                  regs_set_return_ip(regs, extable_fixup(entry));
                  return 1;
          }
+1 -1
arch/powerpc/platforms/pasemi/idle.c
··· 59 59
                  restore_astate(hard_smp_processor_id());

          /* everything handled */
-         regs_set_return_msr(regs, regs->msr | MSR_RI);
+         regs_set_recoverable(regs);
          return 1;
  }
+2 -4
arch/powerpc/platforms/powernv/idle.c
··· 199 199
           */
          power7_fastsleep_workaround_exit = false;

-         get_online_cpus();
+         cpus_read_lock();
          primary_thread_mask = cpu_online_cores_map();
          on_each_cpu_mask(&primary_thread_mask,
                          pnv_fastsleep_workaround_apply,
                          &err, 1);
-         put_online_cpus();
+         cpus_read_unlock();
          if (err) {
                  pr_err("fastsleep_workaround_applyonce change failed while running pnv_fastsleep_workaround_apply");
                  goto fail;
··· 667 667
          sprs.purr       = mfspr(SPRN_PURR);
          sprs.spurr      = mfspr(SPRN_SPURR);
          sprs.dscr       = mfspr(SPRN_DSCR);
-         sprs.wort       = mfspr(SPRN_WORT);
          sprs.ciabr      = mfspr(SPRN_CIABR);

          sprs.mmcra      = mfspr(SPRN_MMCRA);
··· 784 785
          mtspr(SPRN_PURR,        sprs.purr);
          mtspr(SPRN_SPURR,       sprs.spurr);
          mtspr(SPRN_DSCR,        sprs.dscr);
-         mtspr(SPRN_WORT,        sprs.wort);
          mtspr(SPRN_CIABR,       sprs.ciabr);

          mtspr(SPRN_MMCRA,       sprs.mmcra);
+1 -2
arch/powerpc/platforms/powernv/memtrace.c
··· 18 18
  #include <linux/memory_hotplug.h>
  #include <linux/numa.h>
  #include <asm/machdep.h>
- #include <asm/debugfs.h>
  #include <asm/cacheflush.h>

  /* This enables us to keep track of the memory removed from each node. */
··· 329 330
  static int memtrace_init(void)
  {
          memtrace_debugfs_dir = debugfs_create_dir("memtrace",
-                                                   powerpc_debugfs_root);
+                                                   arch_debugfs_dir);

          debugfs_create_file("enable", 0600, memtrace_debugfs_dir,
                              NULL, &memtrace_init_fops);
+6 -6
arch/powerpc/platforms/powernv/opal-imc.c
··· 13 13
  #include <linux/of_address.h>
  #include <linux/of_platform.h>
  #include <linux/crash_dump.h>
+ #include <linux/debugfs.h>
  #include <asm/opal.h>
  #include <asm/io.h>
  #include <asm/imc-pmu.h>
  #include <asm/cputhreads.h>
- #include <asm/debugfs.h>

  static struct dentry *imc_debugfs_parent;

··· 56 56
          u32 cb_offset;
          struct imc_mem_info *ptr = pmu_ptr->mem_info;

-         imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root);
+         imc_debugfs_parent = debugfs_create_dir("imc", arch_debugfs_dir);

          if (of_property_read_u32(node, "cb_offset", &cb_offset))
                  cb_offset = IMC_CNTL_BLK_OFFSET;
··· 186 186
          int nid, cpu;
          const struct cpumask *l_cpumask;

-         get_online_cpus();
+         cpus_read_lock();
          for_each_node_with_cpus(nid) {
                  l_cpumask = cpumask_of_node(nid);
                  cpu = cpumask_first_and(l_cpumask, cpu_online_mask);
··· 195 195
                  opal_imc_counters_stop(OPAL_IMC_COUNTERS_NEST,
                                         get_hard_smp_processor_id(cpu));
          }
-         put_online_cpus();
+         cpus_read_unlock();
  }

  static void disable_core_pmu_counters(void)
··· 203 203
          cpumask_t cores_map;
          int cpu, rc;

-         get_online_cpus();
+         cpus_read_lock();
          /* Disable the IMC Core functions */
          cores_map = cpu_online_cores_map();
          for_each_cpu(cpu, &cores_map) {
··· 213 213
                  pr_err("%s: Failed to stop Core (cpu = %d)\n",
                          __FUNCTION__, cpu);
          }
-         put_online_cpus();
+         cpus_read_unlock();
  }

  int get_max_nest_dev(void)
+2 -2
arch/powerpc/platforms/powernv/opal-lpc.c
··· 10 10
  #include <linux/bug.h>
  #include <linux/io.h>
  #include <linux/slab.h>
+ #include <linux/debugfs.h>

  #include <asm/machdep.h>
  #include <asm/firmware.h>
  #include <asm/opal.h>
  #include <asm/prom.h>
  #include <linux/uaccess.h>
- #include <asm/debugfs.h>
  #include <asm/isa-bridge.h>

  static int opal_lpc_chip_id = -1;
··· 371 371
          if (opal_lpc_chip_id < 0)
                  return -ENODEV;

-         root = debugfs_create_dir("lpc", powerpc_debugfs_root);
+         root = debugfs_create_dir("lpc", arch_debugfs_dir);

          rc |= opal_lpc_debugfs_create_type(root, "io", OPAL_LPC_IO);
          rc |= opal_lpc_debugfs_create_type(root, "mem", OPAL_LPC_MEM);
+2 -2
arch/powerpc/platforms/powernv/opal-xscom.c
··· 14 14
  #include <linux/gfp.h>
  #include <linux/slab.h>
  #include <linux/uaccess.h>
+ #include <linux/debugfs.h>

  #include <asm/machdep.h>
  #include <asm/firmware.h>
  #include <asm/opal.h>
- #include <asm/debugfs.h>
  #include <asm/prom.h>

  static u64 opal_scom_unmangle(u64 addr)
··· 189 189
          if (!firmware_has_feature(FW_FEATURE_OPAL))
                  return 0;

-         root = debugfs_create_dir("scom", powerpc_debugfs_root);
+         root = debugfs_create_dir("scom", arch_debugfs_dir);
          if (!root)
                  return -1;
+1 -1
arch/powerpc/platforms/powernv/opal.c
··· 588 588
  {
          int recovered = 0;

-         if (!(regs->msr & MSR_RI)) {
+         if (regs_is_unrecoverable(regs)) {
                  /* If MSR_RI isn't set, we cannot recover */
                  pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");
                  recovered = 0;
+237 -23
arch/powerpc/platforms/powernv/pci-ioda.c
··· 20 20
  #include <linux/iommu.h>
  #include <linux/rculist.h>
  #include <linux/sizes.h>
+ #include <linux/debugfs.h>

  #include <asm/sections.h>
  #include <asm/io.h>
··· 33 32
  #include <asm/iommu.h>
  #include <asm/tce.h>
  #include <asm/xics.h>
- #include <asm/debugfs.h>
  #include <asm/firmware.h>
  #include <asm/pnv-pci.h>
  #include <asm/mmzone.h>
+ #include <asm/xive.h>

  #include <misc/cxl-base.h>

··· 1963 1962
          pe->dma_setup_done = true;
  }

- int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq)
+ /*
+  * Called from KVM in real mode to EOI passthru interrupts. The ICP
+  * EOI is handled directly in KVM in kvmppc_deliver_irq_passthru().
+  *
+  * The IRQ data is mapped in the PCI-MSI domain and the EOI OPAL call
+  * needs an HW IRQ number mapped in the XICS IRQ domain. The HW IRQ
+  * numbers of the in-the-middle MSI domain are vector numbers and it's
+  * good enough for OPAL. Use that.
+  */
+ int64_t pnv_opal_pci_msi_eoi(struct irq_data *d)
  {
-         struct pnv_phb *phb = container_of(chip, struct pnv_phb,
-                                            ioda.irq_chip);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d->parent_data);
+         struct pnv_phb *phb = hose->private_data;

-         return opal_pci_msi_eoi(phb->opal_id, hw_irq);
+         return opal_pci_msi_eoi(phb->opal_id, d->parent_data->hwirq);
  }

+ /*
+  * The IRQ data is mapped in the XICS domain, with OPAL HW IRQ numbers
+  */
  static void pnv_ioda2_msi_eoi(struct irq_data *d)
  {
          int64_t rc;
          unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
-         struct irq_chip *chip = irq_data_get_irq_chip(d);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;

-         rc = pnv_opal_pci_msi_eoi(chip, hw_irq);
+         rc = opal_pci_msi_eoi(phb->opal_id, hw_irq);
          WARN_ON_ONCE(rc);

          icp_native_eoi(d);
  }

-
+ /* P8/CXL only */
  void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
  {
          struct irq_data *idata;
··· 2018 2004
                  phb->ioda.irq_chip.irq_eoi = pnv_ioda2_msi_eoi;
          }
          irq_set_chip(virq, &phb->ioda.irq_chip);
+         irq_set_chip_data(virq, phb->hose);
  }
+
+ static struct irq_chip pnv_pci_msi_irq_chip;

  /*
   * Returns true iff chip is something that we could call
··· 2029 2012
   */
  bool is_pnv_opal_msi(struct irq_chip *chip)
  {
-         return chip->irq_eoi == pnv_ioda2_msi_eoi;
+         return chip == &pnv_pci_msi_irq_chip;
  }
  EXPORT_SYMBOL_GPL(is_pnv_opal_msi);

- static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
-                                   unsigned int hwirq, unsigned int virq,
-                                   unsigned int is_64, struct msi_msg *msg)
+ static int __pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
+                                     unsigned int xive_num,
+                                     unsigned int is_64, struct msi_msg *msg)
  {
          struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
-         unsigned int xive_num = hwirq - phb->msi_base;
          __be32 data;
          int rc;
+
+         dev_dbg(&dev->dev, "%s: setup %s-bit MSI for vector #%d\n", __func__,
+                 is_64 ? "64" : "32", xive_num);

          /* No PE assigned ? bail out ... no MSI for you ! */
          if (pe == NULL)
··· 2091 2072
          }
          msg->data = be32_to_cpu(data);

-         pnv_set_msi_irq_chip(phb, virq);
+         return 0;
+ }

-         pr_devel("%s: %s-bit MSI on hwirq %x (xive #%d),"
-                  " address=%x_%08x data=%x PE# %x\n",
-                  pci_name(dev), is_64 ? "64" : "32", hwirq, xive_num,
-                  msg->address_hi, msg->address_lo, data, pe->pe_number);
+ /*
+  * The msi_free() op is called before irq_domain_free_irqs_top() when
+  * the handler data is still available. Use that to clear the XIVE
+  * controller.
+  */
+ static void pnv_msi_ops_msi_free(struct irq_domain *domain,
+                                  struct msi_domain_info *info,
+                                  unsigned int irq)
+ {
+         if (xive_enabled())
+                 xive_irq_free_data(irq);
+ }
+
+ static struct msi_domain_ops pnv_pci_msi_domain_ops = {
+         .msi_free       = pnv_msi_ops_msi_free,
+ };
+
+ static void pnv_msi_shutdown(struct irq_data *d)
+ {
+         d = d->parent_data;
+         if (d->chip->irq_shutdown)
+                 d->chip->irq_shutdown(d);
+ }
+
+ static void pnv_msi_mask(struct irq_data *d)
+ {
+         pci_msi_mask_irq(d);
+         irq_chip_mask_parent(d);
+ }
+
+ static void pnv_msi_unmask(struct irq_data *d)
+ {
+         pci_msi_unmask_irq(d);
+         irq_chip_unmask_parent(d);
+ }
+
+ static struct irq_chip pnv_pci_msi_irq_chip = {
+         .name           = "PNV-PCI-MSI",
+         .irq_shutdown   = pnv_msi_shutdown,
+         .irq_mask       = pnv_msi_mask,
+         .irq_unmask     = pnv_msi_unmask,
+         .irq_eoi        = irq_chip_eoi_parent,
+ };
+
+ static struct msi_domain_info pnv_msi_domain_info = {
+         .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+                   MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
+         .ops   = &pnv_pci_msi_domain_ops,
+         .chip  = &pnv_pci_msi_irq_chip,
+ };
+
+ static void pnv_msi_compose_msg(struct irq_data *d, struct msi_msg *msg)
+ {
+         struct msi_desc *entry = irq_data_get_msi_desc(d);
+         struct pci_dev *pdev = msi_desc_to_pci_dev(entry);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+         int rc;
+
+         rc = __pnv_pci_ioda_msi_setup(phb, pdev, d->hwirq,
+                                       entry->msi_attrib.is_64, msg);
+         if (rc)
+                 dev_err(&pdev->dev, "Failed to setup %s-bit MSI #%ld : %d\n",
+                         entry->msi_attrib.is_64 ? "64" : "32", d->hwirq, rc);
+ }
+
+ /*
+  * The IRQ data is mapped in the MSI domain in which HW IRQ numbers
+  * correspond to vector numbers.
+  */
+ static void pnv_msi_eoi(struct irq_data *d)
+ {
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+
+         if (phb->model == PNV_PHB_MODEL_PHB3) {
+                 /*
+                  * The EOI OPAL call takes an OPAL HW IRQ number but
+                  * since it is translated into a vector number in
+                  * OPAL, use that directly.
+                  */
+                 WARN_ON_ONCE(opal_pci_msi_eoi(phb->opal_id, d->hwirq));
+         }
+
+         irq_chip_eoi_parent(d);
+ }
+
+ static struct irq_chip pnv_msi_irq_chip = {
+         .name                   = "PNV-MSI",
+         .irq_shutdown           = pnv_msi_shutdown,
+         .irq_mask               = irq_chip_mask_parent,
+         .irq_unmask             = irq_chip_unmask_parent,
+         .irq_eoi                = pnv_msi_eoi,
+         .irq_set_affinity       = irq_chip_set_affinity_parent,
+         .irq_compose_msi_msg    = pnv_msi_compose_msg,
+ };
+
+ static int pnv_irq_parent_domain_alloc(struct irq_domain *domain,
+                                        unsigned int virq, int hwirq)
+ {
+         struct irq_fwspec parent_fwspec;
+         int ret;
+
+         parent_fwspec.fwnode = domain->parent->fwnode;
+         parent_fwspec.param_count = 2;
+         parent_fwspec.param[0] = hwirq;
+         parent_fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
+
+         ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &parent_fwspec);
+         if (ret)
+                 return ret;
+
+         return 0;
+ }
+
+ static int pnv_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+                                 unsigned int nr_irqs, void *arg)
+ {
+         struct pci_controller *hose = domain->host_data;
+         struct pnv_phb *phb = hose->private_data;
+         msi_alloc_info_t *info = arg;
+         struct pci_dev *pdev = msi_desc_to_pci_dev(info->desc);
+         int hwirq;
+         int i, ret;
+
+         hwirq = msi_bitmap_alloc_hwirqs(&phb->msi_bmp, nr_irqs);
+         if (hwirq < 0) {
+                 dev_warn(&pdev->dev, "failed to find a free MSI\n");
+                 return -ENOSPC;
+         }
+
+         dev_dbg(&pdev->dev, "%s bridge %pOF %d/%x #%d\n", __func__,
+                 hose->dn, virq, hwirq, nr_irqs);
+
+         for (i = 0; i < nr_irqs; i++) {
+                 ret = pnv_irq_parent_domain_alloc(domain, virq + i,
+                                                   phb->msi_base + hwirq + i);
+                 if (ret)
+                         goto out;
+
+                 irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+                                               &pnv_msi_irq_chip, hose);
+         }
+
+         return 0;
+
+ out:
+         irq_domain_free_irqs_parent(domain, virq, i - 1);
+         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, nr_irqs);
+         return ret;
+ }
+
+ static void pnv_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+                                 unsigned int nr_irqs)
+ {
+         struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+
+         pr_debug("%s bridge %pOF %d/%lx #%d\n", __func__, hose->dn,
+                  virq, d->hwirq, nr_irqs);
+
+         msi_bitmap_free_hwirqs(&phb->msi_bmp, d->hwirq, nr_irqs);
+         /* XIVE domain is cleared through ->msi_free() */
+ }
+
+ static const struct irq_domain_ops pnv_irq_domain_ops = {
+         .alloc  = pnv_irq_domain_alloc,
+         .free   = pnv_irq_domain_free,
+ };
+
+ static int pnv_msi_allocate_domains(struct pci_controller *hose, unsigned int count)
+ {
+         struct pnv_phb *phb = hose->private_data;
+         struct irq_domain *parent = irq_get_default_host();
+
+         hose->fwnode = irq_domain_alloc_named_id_fwnode("PNV-MSI", phb->opal_id);
+         if (!hose->fwnode)
+                 return -ENOMEM;
+
+         hose->dev_domain = irq_domain_create_hierarchy(parent, 0, count,
+                                                        hose->fwnode,
+                                                        &pnv_irq_domain_ops, hose);
+         if (!hose->dev_domain) {
+                 pr_err("PCI: failed to create IRQ domain bridge %pOF (domain %d)\n",
+                        hose->dn, hose->global_number);
+                 irq_domain_free_fwnode(hose->fwnode);
+                 return -ENOMEM;
+         }
+
+         hose->msi_domain = pci_msi_create_irq_domain(of_node_to_fwnode(hose->dn),
+                                                      &pnv_msi_domain_info,
+                                                      hose->dev_domain);
+         if (!hose->msi_domain) {
+                 pr_err("PCI: failed to create MSI IRQ domain bridge %pOF (domain %d)\n",
+                        hose->dn, hose->global_number);
+                 irq_domain_free_fwnode(hose->fwnode);
+                 irq_domain_remove(hose->dev_domain);
+                 return -ENOMEM;
+         }

          return 0;
  }
··· 2318 2102
                  return;
          }

-         phb->msi_setup = pnv_pci_ioda_msi_setup;
-         phb->msi32_support = 1;
          pr_info("  Allocated bitmap for %d MSIs (base IRQ 0x%x)\n",
                  count, phb->msi_base);
+
+         pnv_msi_allocate_domains(phb->hose, count);
  }

  static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
··· 2475 2259
          phb = hose->private_data;

          sprintf(name, "PCI%04x", hose->global_number);
-         phb->dbgfs = debugfs_create_dir(name, powerpc_debugfs_root);
+         phb->dbgfs = debugfs_create_dir(name, arch_debugfs_dir);

          debugfs_create_file_unsafe("dump_diag_regs", 0200, phb->dbgfs,
                                     phb, &pnv_pci_diag_data_fops);
··· 2925 2709
          .dma_dev_setup          = pnv_pci_ioda_dma_dev_setup,
          .dma_bus_setup          = pnv_pci_ioda_dma_bus_setup,
          .iommu_bypass_supported = pnv_pci_ioda_iommu_bypass_supported,
-         .setup_msi_irqs         = pnv_setup_msi_irqs,
-         .teardown_msi_irqs      = pnv_teardown_msi_irqs,
          .enable_device_hook     = pnv_pci_enable_device_hook,
          .release_device         = pnv_pci_release_device,
          .window_alignment       = pnv_pci_window_alignment,
-67
arch/powerpc/platforms/powernv/pci.c
··· 160 160
  }
  EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);

- int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
- {
-         struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-         struct msi_desc *entry;
-         struct msi_msg msg;
-         int hwirq;
-         unsigned int virq;
-         int rc;
-
-         if (WARN_ON(!phb) || !phb->msi_bmp.bitmap)
-                 return -ENODEV;
-
-         if (pdev->no_64bit_msi && !phb->msi32_support)
-                 return -ENODEV;
-
-         for_each_pci_msi_entry(entry, pdev) {
-                 if (!entry->msi_attrib.is_64 && !phb->msi32_support) {
-                         pr_warn("%s: Supports only 64-bit MSIs\n",
-                                 pci_name(pdev));
-                         return -ENXIO;
-                 }
-                 hwirq = msi_bitmap_alloc_hwirqs(&phb->msi_bmp, 1);
-                 if (hwirq < 0) {
-                         pr_warn("%s: Failed to find a free MSI\n",
-                                 pci_name(pdev));
-                         return -ENOSPC;
-                 }
-                 virq = irq_create_mapping(NULL, phb->msi_base + hwirq);
-                 if (!virq) {
-                         pr_warn("%s: Failed to map MSI to linux irq\n",
-                                 pci_name(pdev));
-                         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, 1);
-                         return -ENOMEM;
-                 }
-                 rc = phb->msi_setup(phb, pdev, phb->msi_base + hwirq,
-                                     virq, entry->msi_attrib.is_64, &msg);
-                 if (rc) {
-                         pr_warn("%s: Failed to setup MSI\n", pci_name(pdev));
-                         irq_dispose_mapping(virq);
-                         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, 1);
-                         return rc;
-                 }
-                 irq_set_msi_desc(virq, entry);
-                 pci_write_msi_msg(virq, &msg);
-         }
-         return 0;
- }
-
- void pnv_teardown_msi_irqs(struct pci_dev *pdev)
- {
-         struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-         struct msi_desc *entry;
-         irq_hw_number_t hwirq;
-
-         if (WARN_ON(!phb))
-                 return;
-
-         for_each_pci_msi_entry(entry, pdev) {
-                 if (!entry->irq)
-                         continue;
-                 hwirq = virq_to_hw(entry->irq);
-                 irq_set_msi_desc(entry->irq, NULL);
-                 irq_dispose_mapping(entry->irq);
-                 msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq - phb->msi_base, 1);
-         }
- }
-
  /* Nicely print the contents of the PE State Tables (PEST). */
  static void pnv_pci_dump_pest(__be64 pestA[], __be64 pestB[], int pest_size)
  {
-6
arch/powerpc/platforms/powernv/pci.h
··· 123 123
  #endif

          unsigned int            msi_base;
-         unsigned int            msi32_support;
          struct msi_bitmap       msi_bmp;
-         int (*msi_setup)(struct pnv_phb *phb, struct pci_dev *dev,
-                          unsigned int hwirq, unsigned int virq,
-                          unsigned int is_64, struct msi_msg *msg);
          int (*init_m64)(struct pnv_phb *phb);
          int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
          void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
··· 285 289
  extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
  extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);

- extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
- extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
  extern struct pnv_ioda_pe *pnv_pci_bdfn_to_pe(struct pnv_phb *phb, u16 bdfn);
  extern struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev);
  extern void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq);
+2 -1
arch/powerpc/platforms/ps3/htab.c
··· 169 169
          spin_unlock_irqrestore(&ps3_htab_lock, flags);
  }

- static void ps3_hpte_clear(void)
+ /* Called during kexec sequence with MMU off */
+ static notrace void ps3_hpte_clear(void)
  {
          unsigned long hpte_count = (1UL << ppc64_pft_size) >> 4;
          u64 i;
+6 -2
arch/powerpc/platforms/ps3/mm.c
··· 195 195

  /**
   * ps3_mm_vas_destroy -
+  *
+  * called during kexec sequence with MMU off.
   */

- void ps3_mm_vas_destroy(void)
+ notrace void ps3_mm_vas_destroy(void)
  {
          int result;

··· 1245 1243

  /**
   * ps3_mm_shutdown - final cleanup of address space
+  *
+  * called during kexec sequence with MMU off.
   */

- void ps3_mm_shutdown(void)
+ notrace void ps3_mm_shutdown(void)
  {
          ps3_mm_region_destroy(&map.r1);
  }
+2 -2
arch/powerpc/platforms/pseries/dtl.c
··· 11 11
  #include <linux/spinlock.h>
  #include <asm/smp.h>
  #include <linux/uaccess.h>
+ #include <linux/debugfs.h>
  #include <asm/firmware.h>
  #include <asm/dtl.h>
  #include <asm/lppaca.h>
- #include <asm/debugfs.h>
  #include <asm/plpar_wrappers.h>
  #include <asm/machdep.h>

··· 338 338

          /* set up common debugfs structure */

-         dtl_dir = debugfs_create_dir("dtl", powerpc_debugfs_root);
+         dtl_dir = debugfs_create_dir("dtl", arch_debugfs_dir);

          debugfs_create_x8("dtl_event_mask", 0600, dtl_dir, &dtl_event_mask);
          debugfs_create_u32("dtl_buf_entries", 0400, dtl_dir, &dtl_buf_entries);
+2 -1
arch/powerpc/platforms/pseries/firmware.c
··· 119 119

  static __initdata struct vec5_fw_feature
  vec5_fw_features_table[] = {
-         {FW_FEATURE_TYPE1_AFFINITY,     OV5_TYPE1_AFFINITY},
+         {FW_FEATURE_FORM1_AFFINITY,     OV5_FORM1_AFFINITY},
          {FW_FEATURE_PRRN,               OV5_PRRN},
          {FW_FEATURE_DRMEM_V2,           OV5_DRMEM_V2},
          {FW_FEATURE_DRC_INFO,           OV5_DRC_INFO},
+         {FW_FEATURE_FORM2_AFFINITY,     OV5_FORM2_AFFINITY},
  };

  static void __init fw_vec5_feature_init(const char *vec5, unsigned long len)
+134 -39
arch/powerpc/platforms/pseries/hotplug-cpu.c
··· 39 39 /* This version can't take the spinlock, because it never returns */ 40 40 static int rtas_stop_self_token = RTAS_UNKNOWN_SERVICE; 41 41 42 + /* 43 + * Record the CPU ids used on each nodes. 44 + * Protected by cpu_add_remove_lock. 45 + */ 46 + static cpumask_var_t node_recorded_ids_map[MAX_NUMNODES]; 47 + 42 48 static void rtas_stop_self(void) 43 49 { 44 50 static struct rtas_args args; ··· 145 139 paca_ptrs[cpu]->cpu_start = 0; 146 140 } 147 141 142 + /** 143 + * find_cpu_id_range - found a linear ranger of @nthreads free CPU ids. 144 + * @nthreads : the number of threads (cpu ids) 145 + * @assigned_node : the node it belongs to or NUMA_NO_NODE if free ids from any 146 + * node can be peek. 147 + * @cpu_mask: the returned CPU mask. 148 + * 149 + * Returns 0 on success. 150 + */ 151 + static int find_cpu_id_range(unsigned int nthreads, int assigned_node, 152 + cpumask_var_t *cpu_mask) 153 + { 154 + cpumask_var_t candidate_mask; 155 + unsigned int cpu, node; 156 + int rc = -ENOSPC; 157 + 158 + if (!zalloc_cpumask_var(&candidate_mask, GFP_KERNEL)) 159 + return -ENOMEM; 160 + 161 + cpumask_clear(*cpu_mask); 162 + for (cpu = 0; cpu < nthreads; cpu++) 163 + cpumask_set_cpu(cpu, *cpu_mask); 164 + 165 + BUG_ON(!cpumask_subset(cpu_present_mask, cpu_possible_mask)); 166 + 167 + /* Get a bitmap of unoccupied slots. */ 168 + cpumask_xor(candidate_mask, cpu_possible_mask, cpu_present_mask); 169 + 170 + if (assigned_node != NUMA_NO_NODE) { 171 + /* 172 + * Remove free ids previously assigned on the other nodes. We 173 + * can walk only online nodes because once a node became online 174 + * it is not turned offlined back. 
175 + */ 176 + for_each_online_node(node) { 177 + if (node == assigned_node) 178 + continue; 179 + cpumask_andnot(candidate_mask, candidate_mask, 180 + node_recorded_ids_map[node]); 181 + } 182 + } 183 + 184 + if (cpumask_empty(candidate_mask)) 185 + goto out; 186 + 187 + while (!cpumask_empty(*cpu_mask)) { 188 + if (cpumask_subset(*cpu_mask, candidate_mask)) 189 + /* Found a range where we can insert the new cpu(s) */ 190 + break; 191 + cpumask_shift_left(*cpu_mask, *cpu_mask, nthreads); 192 + } 193 + 194 + if (!cpumask_empty(*cpu_mask)) 195 + rc = 0; 196 + 197 + out: 198 + free_cpumask_var(candidate_mask); 199 + return rc; 200 + } 201 + 148 202 /* 149 203 * Update cpu_present_mask and paca(s) for a new cpu node. The wrinkle 150 - * here is that a cpu device node may represent up to two logical cpus 204 + * here is that a cpu device node may represent multiple logical cpus 151 205 * in the SMT case. We must honor the assumption in other code that 152 206 * the logical ids for sibling SMT threads x and y are adjacent, such 153 207 * that x^1 == y and y^1 == x. 154 208 */ 155 209 static int pseries_add_processor(struct device_node *np) 156 210 { 157 - unsigned int cpu; 158 - cpumask_var_t candidate_mask, tmp; 159 - int err = -ENOSPC, len, nthreads, i; 211 + int len, nthreads, node, cpu, assigned_node; 212 + int rc = 0; 213 + cpumask_var_t cpu_mask; 160 214 const __be32 *intserv; 161 215 162 216 intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len); 163 217 if (!intserv) 164 218 return 0; 165 219 166 - zalloc_cpumask_var(&candidate_mask, GFP_KERNEL); 167 - zalloc_cpumask_var(&tmp, GFP_KERNEL); 168 - 169 220 nthreads = len / sizeof(u32); 170 - for (i = 0; i < nthreads; i++) 171 - cpumask_set_cpu(i, tmp); 221 + 222 + if (!alloc_cpumask_var(&cpu_mask, GFP_KERNEL)) 223 + return -ENOMEM; 224 + 225 + /* 226 + * Fetch from the DT nodes read by dlpar_configure_connector() the NUMA 227 + * node id the added CPU belongs to. 
228 + */ 229 + node = of_node_to_nid(np); 230 + if (node < 0 || !node_possible(node)) 231 + node = first_online_node; 232 + 233 + BUG_ON(node == NUMA_NO_NODE); 234 + assigned_node = node; 172 235 173 236 cpu_maps_update_begin(); 174 237 175 - BUG_ON(!cpumask_subset(cpu_present_mask, cpu_possible_mask)); 176 - 177 - /* Get a bitmap of unoccupied slots. */ 178 - cpumask_xor(candidate_mask, cpu_possible_mask, cpu_present_mask); 179 - if (cpumask_empty(candidate_mask)) { 180 - /* If we get here, it most likely means that NR_CPUS is 181 - * less than the partition's max processors setting. 238 + rc = find_cpu_id_range(nthreads, node, &cpu_mask); 239 + if (rc && nr_node_ids > 1) { 240 + /* 241 + * Try again, considering the free CPU ids from the other nodes. 182 242 */ 183 - printk(KERN_ERR "Cannot add cpu %pOF; this system configuration" 184 - " supports %d logical cpus.\n", np, 185 - num_possible_cpus()); 186 - goto out_unlock; 243 + node = NUMA_NO_NODE; 244 + rc = find_cpu_id_range(nthreads, NUMA_NO_NODE, &cpu_mask); 187 245 } 188 246 189 - while (!cpumask_empty(tmp)) 190 - if (cpumask_subset(tmp, candidate_mask)) 191 - /* Found a range where we can insert the new cpu(s) */ 192 - break; 193 - else 194 - cpumask_shift_left(tmp, tmp, nthreads); 195 - 196 - if (cpumask_empty(tmp)) { 197 - printk(KERN_ERR "Unable to find space in cpu_present_mask for" 198 - " processor %pOFn with %d thread(s)\n", np, 199 - nthreads); 200 - goto out_unlock; 247 + if (rc) { 248 + pr_err("Cannot add cpu %pOF; this system configuration" 249 + " supports %d logical cpus.\n", np, num_possible_cpus()); 250 + goto out; 201 251 } 202 252 203 - for_each_cpu(cpu, tmp) { 253 + for_each_cpu(cpu, cpu_mask) { 204 254 BUG_ON(cpu_present(cpu)); 205 255 set_cpu_present(cpu, true); 206 256 set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++)); 207 257 } 208 - err = 0; 209 - out_unlock: 258 + 259 + /* Record the newly used CPU ids for the associated node. 
*/ 260 + cpumask_or(node_recorded_ids_map[assigned_node], 261 + node_recorded_ids_map[assigned_node], cpu_mask); 262 + 263 + /* 264 + * If node is set to NUMA_NO_NODE, CPU ids have been reused from 265 + * another node, remove them from its mask. 266 + */ 267 + if (node == NUMA_NO_NODE) { 268 + cpu = cpumask_first(cpu_mask); 269 + pr_warn("Reusing free CPU ids %d-%d from another node\n", 270 + cpu, cpu + nthreads - 1); 271 + for_each_online_node(node) { 272 + if (node == assigned_node) 273 + continue; 274 + cpumask_andnot(node_recorded_ids_map[node], 275 + node_recorded_ids_map[node], 276 + cpu_mask); 277 + } 278 + } 279 + 280 + out: 210 281 cpu_maps_update_done(); 211 - free_cpumask_var(candidate_mask); 212 - free_cpumask_var(tmp); 213 - return err; 282 + free_cpumask_var(cpu_mask); 283 + return rc; 214 284 } 215 285 216 286 /* ··· 579 497 580 498 return saved_rc; 581 499 } 500 + 501 + update_numa_distance(dn); 582 502 583 503 rc = dlpar_online_cpu(dn); 584 504 if (rc) { ··· 992 908 static int __init pseries_cpu_hotplug_init(void) 993 909 { 994 910 int qcss_tok; 911 + unsigned int node; 995 912 996 913 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE 997 914 ppc_md.cpu_probe = dlpar_cpu_probe; ··· 1014 929 smp_ops->cpu_die = pseries_cpu_die; 1015 930 1016 931 /* Processors can be added/removed only on LPAR */ 1017 - if (firmware_has_feature(FW_FEATURE_LPAR)) 932 + if (firmware_has_feature(FW_FEATURE_LPAR)) { 933 + for_each_node(node) { 934 + alloc_bootmem_cpumask_var(&node_recorded_ids_map[node]); 935 + 936 + /* Record ids of CPUs added at boot time */ 937 + cpumask_or(node_recorded_ids_map[node], 938 + node_recorded_ids_map[node], 939 + cpumask_of_node(node)); 940 + } 941 + 1018 942 of_reconfig_notifier_register(&pseries_smp_nb); 943 + } 1019 944 1020 945 return 0; 1021 946 }
+6
arch/powerpc/platforms/pseries/hotplug-memory.c
··· 180 180 return -ENODEV; 181 181 } 182 182 183 + update_numa_distance(lmb_node); 184 + 183 185 dr_node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory"); 184 186 if (!dr_node) { 185 187 dlpar_free_cc_nodes(lmb_node); ··· 979 977 case OF_RECONFIG_DETACH_NODE: 980 978 err = pseries_remove_mem_node(rd->dn); 981 979 break; 980 + case OF_RECONFIG_UPDATE_PROPERTY: 981 + if (!strcmp(rd->dn->name, 982 + "ibm,dynamic-reconfiguration-memory")) 983 + drmem_update_lmbs(rd->prop); 982 984 } 983 985 return notifier_from_errno(err); 984 986 }
+326 -194
arch/powerpc/platforms/pseries/iommu.c
··· 53 53 DDW_EXT_QUERY_OUT_SIZE = 2 54 54 }; 55 55 56 - static struct iommu_table_group *iommu_pseries_alloc_group(int node) 56 + static struct iommu_table *iommu_pseries_alloc_table(int node) 57 57 { 58 - struct iommu_table_group *table_group; 59 58 struct iommu_table *tbl; 60 - 61 - table_group = kzalloc_node(sizeof(struct iommu_table_group), GFP_KERNEL, 62 - node); 63 - if (!table_group) 64 - return NULL; 65 59 66 60 tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL, node); 67 61 if (!tbl) 68 - goto free_group; 62 + return NULL; 69 63 70 64 INIT_LIST_HEAD_RCU(&tbl->it_group_list); 71 65 kref_init(&tbl->it_kref); 66 + return tbl; 67 + } 72 68 73 - table_group->tables[0] = tbl; 69 + static struct iommu_table_group *iommu_pseries_alloc_group(int node) 70 + { 71 + struct iommu_table_group *table_group; 74 72 75 - return table_group; 73 + table_group = kzalloc_node(sizeof(*table_group), GFP_KERNEL, node); 74 + if (!table_group) 75 + return NULL; 76 76 77 - free_group: 77 + table_group->tables[0] = iommu_pseries_alloc_table(node); 78 + if (table_group->tables[0]) 79 + return table_group; 80 + 78 81 kfree(table_group); 79 82 return NULL; 80 83 } ··· 110 107 u64 proto_tce; 111 108 __be64 *tcep; 112 109 u64 rpn; 110 + const unsigned long tceshift = tbl->it_page_shift; 111 + const unsigned long pagesize = IOMMU_PAGE_SIZE(tbl); 113 112 114 113 proto_tce = TCE_PCI_READ; // Read allowed 115 114 ··· 122 117 123 118 while (npages--) { 124 119 /* can't move this out since we might cross MEMBLOCK boundary */ 125 - rpn = __pa(uaddr) >> TCE_SHIFT; 126 - *tcep = cpu_to_be64(proto_tce | (rpn & TCE_RPN_MASK) << TCE_RPN_SHIFT); 120 + rpn = __pa(uaddr) >> tceshift; 121 + *tcep = cpu_to_be64(proto_tce | rpn << tceshift); 127 122 128 - uaddr += TCE_PAGE_SIZE; 123 + uaddr += pagesize; 129 124 tcep++; 130 125 } 131 126 return 0; ··· 151 146 return be64_to_cpu(*tcep); 152 147 } 153 148 154 - static void tce_free_pSeriesLP(unsigned long liobn, long, long); 149 + static void 
tce_free_pSeriesLP(unsigned long liobn, long, long, long); 155 150 static void tce_freemulti_pSeriesLP(struct iommu_table*, long, long); 156 151 157 152 static int tce_build_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, ··· 171 166 proto_tce |= TCE_PCI_WRITE; 172 167 173 168 while (npages--) { 174 - tce = proto_tce | (rpn & TCE_RPN_MASK) << tceshift; 169 + tce = proto_tce | rpn << tceshift; 175 170 rc = plpar_tce_put((u64)liobn, (u64)tcenum << tceshift, tce); 176 171 177 172 if (unlikely(rc == H_NOT_ENOUGH_RESOURCES)) { 178 173 ret = (int)rc; 179 - tce_free_pSeriesLP(liobn, tcenum_start, 174 + tce_free_pSeriesLP(liobn, tcenum_start, tceshift, 180 175 (npages_start - (npages + 1))); 181 176 break; 182 177 } ··· 210 205 long tcenum_start = tcenum, npages_start = npages; 211 206 int ret = 0; 212 207 unsigned long flags; 208 + const unsigned long tceshift = tbl->it_page_shift; 213 209 214 210 if ((npages == 1) || !firmware_has_feature(FW_FEATURE_PUT_TCE_IND)) { 215 211 return tce_build_pSeriesLP(tbl->it_index, tcenum, 216 - tbl->it_page_shift, npages, uaddr, 212 + tceshift, npages, uaddr, 217 213 direction, attrs); 218 214 } 219 215 ··· 231 225 if (!tcep) { 232 226 local_irq_restore(flags); 233 227 return tce_build_pSeriesLP(tbl->it_index, tcenum, 234 - tbl->it_page_shift, 228 + tceshift, 235 229 npages, uaddr, direction, attrs); 236 230 } 237 231 __this_cpu_write(tce_page, tcep); 238 232 } 239 233 240 - rpn = __pa(uaddr) >> TCE_SHIFT; 234 + rpn = __pa(uaddr) >> tceshift; 241 235 proto_tce = TCE_PCI_READ; 242 236 if (direction != DMA_TO_DEVICE) 243 237 proto_tce |= TCE_PCI_WRITE; ··· 251 245 limit = min_t(long, npages, 4096/TCE_ENTRY_SIZE); 252 246 253 247 for (l = 0; l < limit; l++) { 254 - tcep[l] = cpu_to_be64(proto_tce | (rpn & TCE_RPN_MASK) << TCE_RPN_SHIFT); 248 + tcep[l] = cpu_to_be64(proto_tce | rpn << tceshift); 255 249 rpn++; 256 250 } 257 251 258 252 rc = plpar_tce_put_indirect((u64)tbl->it_index, 259 - (u64)tcenum << 12, 253 + (u64)tcenum << 
tceshift, 260 254 (u64)__pa(tcep), 261 255 limit); 262 256 ··· 283 277 return ret; 284 278 } 285 279 286 - static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long npages) 280 + static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, 281 + long npages) 287 282 { 288 283 u64 rc; 289 284 290 285 while (npages--) { 291 - rc = plpar_tce_put((u64)liobn, (u64)tcenum << 12, 0); 286 + rc = plpar_tce_put((u64)liobn, (u64)tcenum << tceshift, 0); 292 287 293 288 if (rc && printk_ratelimit()) { 294 289 printk("tce_free_pSeriesLP: plpar_tce_put failed. rc=%lld\n", rc); ··· 308 301 u64 rc; 309 302 310 303 if (!firmware_has_feature(FW_FEATURE_STUFF_TCE)) 311 - return tce_free_pSeriesLP(tbl->it_index, tcenum, npages); 304 + return tce_free_pSeriesLP(tbl->it_index, tcenum, 305 + tbl->it_page_shift, npages); 312 306 313 - rc = plpar_tce_stuff((u64)tbl->it_index, (u64)tcenum << 12, 0, npages); 307 + rc = plpar_tce_stuff((u64)tbl->it_index, 308 + (u64)tcenum << tbl->it_page_shift, 0, npages); 314 309 315 310 if (rc && printk_ratelimit()) { 316 311 printk("tce_freemulti_pSeriesLP: plpar_tce_stuff failed\n"); ··· 328 319 u64 rc; 329 320 unsigned long tce_ret; 330 321 331 - rc = plpar_tce_get((u64)tbl->it_index, (u64)tcenum << 12, &tce_ret); 322 + rc = plpar_tce_get((u64)tbl->it_index, 323 + (u64)tcenum << tbl->it_page_shift, &tce_ret); 332 324 333 325 if (rc && printk_ratelimit()) { 334 326 printk("tce_get_pSeriesLP: plpar_tce_get failed. 
rc=%lld\n", rc); ··· 349 339 __be32 window_shift; /* ilog2(tce_window_size) */ 350 340 }; 351 341 352 - struct direct_window { 342 + struct dma_win { 353 343 struct device_node *device; 354 344 const struct dynamic_dma_window_prop *prop; 355 345 struct list_head list; ··· 369 359 u32 addr_lo; 370 360 }; 371 361 372 - static LIST_HEAD(direct_window_list); 362 + static LIST_HEAD(dma_win_list); 373 363 /* prevents races between memory on/offline and window creation */ 374 - static DEFINE_SPINLOCK(direct_window_list_lock); 364 + static DEFINE_SPINLOCK(dma_win_list_lock); 375 365 /* protects initializing window twice for same device */ 376 - static DEFINE_MUTEX(direct_window_init_mutex); 366 + static DEFINE_MUTEX(dma_win_init_mutex); 377 367 #define DIRECT64_PROPNAME "linux,direct64-ddr-window-info" 368 + #define DMA64_PROPNAME "linux,dma64-ddr-window-info" 378 369 379 370 static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn, 380 371 unsigned long num_pfn, const void *arg) ··· 502 491 return tce_setrange_multi_pSeriesLP(start_pfn, num_pfn, arg); 503 492 } 504 493 494 + static void iommu_table_setparms_common(struct iommu_table *tbl, unsigned long busno, 495 + unsigned long liobn, unsigned long win_addr, 496 + unsigned long window_size, unsigned long page_shift, 497 + void *base, struct iommu_table_ops *table_ops) 498 + { 499 + tbl->it_busno = busno; 500 + tbl->it_index = liobn; 501 + tbl->it_offset = win_addr >> page_shift; 502 + tbl->it_size = window_size >> page_shift; 503 + tbl->it_page_shift = page_shift; 504 + tbl->it_base = (unsigned long)base; 505 + tbl->it_blocksize = 16; 506 + tbl->it_type = TCE_PCI; 507 + tbl->it_ops = table_ops; 508 + } 509 + 510 + struct iommu_table_ops iommu_table_pseries_ops; 511 + 505 512 static void iommu_table_setparms(struct pci_controller *phb, 506 513 struct device_node *dn, 507 514 struct iommu_table *tbl) ··· 528 499 const unsigned long *basep; 529 500 const u32 *sizep; 530 501 531 - node = phb->dn; 502 + /* Test if we 
are going over 2GB of DMA space */ 503 + if (phb->dma_window_base_cur + phb->dma_window_size > SZ_2G) { 504 + udbg_printf("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 505 + panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 506 + } 532 507 508 + node = phb->dn; 533 509 basep = of_get_property(node, "linux,tce-base", NULL); 534 510 sizep = of_get_property(node, "linux,tce-size", NULL); 535 511 if (basep == NULL || sizep == NULL) { ··· 543 509 return; 544 510 } 545 511 546 - tbl->it_base = (unsigned long)__va(*basep); 512 + iommu_table_setparms_common(tbl, phb->bus->number, 0, phb->dma_window_base_cur, 513 + phb->dma_window_size, IOMMU_PAGE_SHIFT_4K, 514 + __va(*basep), &iommu_table_pseries_ops); 547 515 548 516 if (!is_kdump_kernel()) 549 517 memset((void *)tbl->it_base, 0, *sizep); 550 518 551 - tbl->it_busno = phb->bus->number; 552 - tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K; 553 - 554 - /* Units of tce entries */ 555 - tbl->it_offset = phb->dma_window_base_cur >> tbl->it_page_shift; 556 - 557 - /* Test if we are going over 2GB of DMA space */ 558 - if (phb->dma_window_base_cur + phb->dma_window_size > 0x80000000ul) { 559 - udbg_printf("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 560 - panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 561 - } 562 - 563 519 phb->dma_window_base_cur += phb->dma_window_size; 564 - 565 - /* Set the tce table size - measured in entries */ 566 - tbl->it_size = phb->dma_window_size >> tbl->it_page_shift; 567 - 568 - tbl->it_index = 0; 569 - tbl->it_blocksize = 16; 570 - tbl->it_type = TCE_PCI; 571 520 } 521 + 522 + struct iommu_table_ops iommu_table_lpar_multi_ops; 572 523 573 524 /* 574 525 * iommu_table_setparms_lpar ··· 566 547 struct iommu_table_group *table_group, 567 548 const __be32 *dma_window) 568 549 { 569 - unsigned long offset, size; 550 + unsigned long offset, size, liobn; 570 551 571 - of_parse_dma_window(dn, dma_window, &tbl->it_index, &offset, &size); 552 + 
of_parse_dma_window(dn, dma_window, &liobn, &offset, &size); 572 553 573 - tbl->it_busno = phb->bus->number; 574 - tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K; 575 - tbl->it_base = 0; 576 - tbl->it_blocksize = 16; 577 - tbl->it_type = TCE_PCI; 578 - tbl->it_offset = offset >> tbl->it_page_shift; 579 - tbl->it_size = size >> tbl->it_page_shift; 554 + iommu_table_setparms_common(tbl, phb->bus->number, liobn, offset, size, IOMMU_PAGE_SHIFT_4K, NULL, 555 + &iommu_table_lpar_multi_ops); 556 + 580 557 581 558 table_group->tce32_start = offset; 582 559 table_group->tce32_size = size; ··· 652 637 tbl = pci->table_group->tables[0]; 653 638 654 639 iommu_table_setparms(pci->phb, dn, tbl); 655 - tbl->it_ops = &iommu_table_pseries_ops; 640 + 656 641 if (!iommu_init_table(tbl, pci->phb->node, 0, 0)) 657 642 panic("Failed to initialize iommu table"); 658 643 ··· 713 698 pr_debug("pci_dma_bus_setup_pSeriesLP: setting up bus %pOF\n", 714 699 dn); 715 700 716 - /* Find nearest ibm,dma-window, walking up the device tree */ 701 + /* 702 + * Find nearest ibm,dma-window (default DMA window), walking up the 703 + * device tree 704 + */ 717 705 for (pdn = dn; pdn != NULL; pdn = pdn->parent) { 718 706 dma_window = of_get_property(pdn, "ibm,dma-window", NULL); 719 707 if (dma_window != NULL) ··· 738 720 tbl = ppci->table_group->tables[0]; 739 721 iommu_table_setparms_lpar(ppci->phb, pdn, tbl, 740 722 ppci->table_group, dma_window); 741 - tbl->it_ops = &iommu_table_lpar_multi_ops; 723 + 742 724 if (!iommu_init_table(tbl, ppci->phb->node, 0, 0)) 743 725 panic("Failed to initialize iommu table"); 744 726 iommu_register_group(ppci->table_group, ··· 768 750 PCI_DN(dn)->table_group = iommu_pseries_alloc_group(phb->node); 769 751 tbl = PCI_DN(dn)->table_group->tables[0]; 770 752 iommu_table_setparms(phb, dn, tbl); 771 - tbl->it_ops = &iommu_table_pseries_ops; 753 + 772 754 if (!iommu_init_table(tbl, phb->node, 0, 0)) 773 755 panic("Failed to initialize iommu table"); 774 756 ··· 803 785 804 786 
early_param("disable_ddw", disable_ddw_setup); 805 787 806 - static void remove_dma_window(struct device_node *np, u32 *ddw_avail, 807 - struct property *win) 788 + static void clean_dma_window(struct device_node *np, struct dynamic_dma_window_prop *dwp) 808 789 { 809 - struct dynamic_dma_window_prop *dwp; 810 - u64 liobn; 811 790 int ret; 812 791 813 - dwp = win->value; 814 - liobn = (u64)be32_to_cpu(dwp->liobn); 815 - 816 - /* clear the whole window, note the arg is in kernel pages */ 817 792 ret = tce_clearrange_multi_pSeriesLP(0, 818 793 1ULL << (be32_to_cpu(dwp->window_shift) - PAGE_SHIFT), dwp); 819 794 if (ret) ··· 815 804 else 816 805 pr_debug("%pOF successfully cleared tces in window.\n", 817 806 np); 807 + } 808 + 809 + /* 810 + * Call only if DMA window is clean. 811 + */ 812 + static void __remove_dma_window(struct device_node *np, u32 *ddw_avail, u64 liobn) 813 + { 814 + int ret; 818 815 819 816 ret = rtas_call(ddw_avail[DDW_REMOVE_PE_DMA_WIN], 1, 1, NULL, liobn); 820 817 if (ret) 821 - pr_warn("%pOF: failed to remove direct window: rtas returned " 818 + pr_warn("%pOF: failed to remove DMA window: rtas returned " 822 819 "%d to ibm,remove-pe-dma-window(%x) %llx\n", 823 820 np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn); 824 821 else 825 - pr_debug("%pOF: successfully removed direct window: rtas returned " 822 + pr_debug("%pOF: successfully removed DMA window: rtas returned " 826 823 "%d to ibm,remove-pe-dma-window(%x) %llx\n", 827 824 np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn); 828 825 } 829 826 830 - static void remove_ddw(struct device_node *np, bool remove_prop) 827 + static void remove_dma_window(struct device_node *np, u32 *ddw_avail, 828 + struct property *win) 829 + { 830 + struct dynamic_dma_window_prop *dwp; 831 + u64 liobn; 832 + 833 + dwp = win->value; 834 + liobn = (u64)be32_to_cpu(dwp->liobn); 835 + 836 + clean_dma_window(np, dwp); 837 + __remove_dma_window(np, ddw_avail, liobn); 838 + } 839 + 840 + static int remove_ddw(struct 
device_node *np, bool remove_prop, const char *win_name) 831 841 { 832 842 struct property *win; 833 843 u32 ddw_avail[DDW_APPLICABLE_SIZE]; 834 844 int ret = 0; 835 845 846 + win = of_find_property(np, win_name, NULL); 847 + if (!win) 848 + return -EINVAL; 849 + 836 850 ret = of_property_read_u32_array(np, "ibm,ddw-applicable", 837 851 &ddw_avail[0], DDW_APPLICABLE_SIZE); 838 852 if (ret) 839 - return; 853 + return 0; 840 854 841 - win = of_find_property(np, DIRECT64_PROPNAME, NULL); 842 - if (!win) 843 - return; 844 855 845 856 if (win->length >= sizeof(struct dynamic_dma_window_prop)) 846 857 remove_dma_window(np, ddw_avail, win); 847 858 848 859 if (!remove_prop) 849 - return; 860 + return 0; 850 861 851 862 ret = of_remove_property(np, win); 852 863 if (ret) 853 - pr_warn("%pOF: failed to remove direct window property: %d\n", 864 + pr_warn("%pOF: failed to remove DMA window property: %d\n", 854 865 np, ret); 866 + return 0; 855 867 } 856 868 857 - static u64 find_existing_ddw(struct device_node *pdn, int *window_shift) 869 + static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift) 858 870 { 859 - struct direct_window *window; 860 - const struct dynamic_dma_window_prop *direct64; 861 - u64 dma_addr = 0; 871 + struct dma_win *window; 872 + const struct dynamic_dma_window_prop *dma64; 873 + bool found = false; 862 874 863 - spin_lock(&direct_window_list_lock); 875 + spin_lock(&dma_win_list_lock); 864 876 /* check if we already created a window and dupe that config if so */ 865 - list_for_each_entry(window, &direct_window_list, list) { 877 + list_for_each_entry(window, &dma_win_list, list) { 866 878 if (window->device == pdn) { 867 - direct64 = window->prop; 868 - dma_addr = be64_to_cpu(direct64->dma_base); 869 - *window_shift = be32_to_cpu(direct64->window_shift); 879 + dma64 = window->prop; 880 + *dma_addr = be64_to_cpu(dma64->dma_base); 881 + *window_shift = be32_to_cpu(dma64->window_shift); 882 + found = true; 870 883 break; 871 
884 } 872 885 } 873 - spin_unlock(&direct_window_list_lock); 886 + spin_unlock(&dma_win_list_lock); 874 887 875 - return dma_addr; 888 + return found; 889 + } 890 + 891 + static struct dma_win *ddw_list_new_entry(struct device_node *pdn, 892 + const struct dynamic_dma_window_prop *dma64) 893 + { 894 + struct dma_win *window; 895 + 896 + window = kzalloc(sizeof(*window), GFP_KERNEL); 897 + if (!window) 898 + return NULL; 899 + 900 + window->device = pdn; 901 + window->prop = dma64; 902 + 903 + return window; 904 + } 905 + 906 + static void find_existing_ddw_windows_named(const char *name) 907 + { 908 + int len; 909 + struct device_node *pdn; 910 + struct dma_win *window; 911 + const struct dynamic_dma_window_prop *dma64; 912 + 913 + for_each_node_with_property(pdn, name) { 914 + dma64 = of_get_property(pdn, name, &len); 915 + if (!dma64 || len < sizeof(*dma64)) { 916 + remove_ddw(pdn, true, name); 917 + continue; 918 + } 919 + 920 + window = ddw_list_new_entry(pdn, dma64); 921 + if (!window) 922 + break; 923 + 924 + spin_lock(&dma_win_list_lock); 925 + list_add(&window->list, &dma_win_list); 926 + spin_unlock(&dma_win_list_lock); 927 + } 876 928 } 877 929 878 930 static int find_existing_ddw_windows(void) 879 931 { 880 - int len; 881 - struct device_node *pdn; 882 - struct direct_window *window; 883 - const struct dynamic_dma_window_prop *direct64; 884 - 885 932 if (!firmware_has_feature(FW_FEATURE_LPAR)) 886 933 return 0; 887 934 888 - for_each_node_with_property(pdn, DIRECT64_PROPNAME) { 889 - direct64 = of_get_property(pdn, DIRECT64_PROPNAME, &len); 890 - if (!direct64) 891 - continue; 892 - 893 - window = kzalloc(sizeof(*window), GFP_KERNEL); 894 - if (!window || len < sizeof(struct dynamic_dma_window_prop)) { 895 - kfree(window); 896 - remove_ddw(pdn, true); 897 - continue; 898 - } 899 - 900 - window->device = pdn; 901 - window->prop = direct64; 902 - spin_lock(&direct_window_list_lock); 903 - list_add(&window->list, &direct_window_list); 904 - 
spin_unlock(&direct_window_list_lock); 905 - } 935 + find_existing_ddw_windows_named(DIRECT64_PROPNAME); 936 + find_existing_ddw_windows_named(DMA64_PROPNAME); 906 937 907 938 return 0; 908 939 } ··· 1183 1130 return 0; 1184 1131 } 1185 1132 1133 + static struct property *ddw_property_create(const char *propname, u32 liobn, u64 dma_addr, 1134 + u32 page_shift, u32 window_shift) 1135 + { 1136 + struct dynamic_dma_window_prop *ddwprop; 1137 + struct property *win64; 1138 + 1139 + win64 = kzalloc(sizeof(*win64), GFP_KERNEL); 1140 + if (!win64) 1141 + return NULL; 1142 + 1143 + win64->name = kstrdup(propname, GFP_KERNEL); 1144 + ddwprop = kzalloc(sizeof(*ddwprop), GFP_KERNEL); 1145 + win64->value = ddwprop; 1146 + win64->length = sizeof(*ddwprop); 1147 + if (!win64->name || !win64->value) { 1148 + kfree(win64->name); 1149 + kfree(win64->value); 1150 + kfree(win64); 1151 + return NULL; 1152 + } 1153 + 1154 + ddwprop->liobn = cpu_to_be32(liobn); 1155 + ddwprop->dma_base = cpu_to_be64(dma_addr); 1156 + ddwprop->tce_shift = cpu_to_be32(page_shift); 1157 + ddwprop->window_shift = cpu_to_be32(window_shift); 1158 + 1159 + return win64; 1160 + } 1161 + 1186 1162 /* 1187 1163 * If the PE supports dynamic dma windows, and there is space for a table 1188 1164 * that can map all pages in a linear offset, then setup such a table, ··· 1221 1139 * pdn: the parent pe node with the ibm,dma_window property 1222 1140 * Future: also check if we can remap the base window for our base page size 1223 1141 * 1224 - * returns the dma offset for use by the direct mapped DMA code. 1142 + * returns true if it can map all pages (direct mapping), false otherwise. 
1225 1143 */ 1226 - static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) 1144 + static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) 1227 1145 { 1228 1146 int len = 0, ret; 1229 1147 int max_ram_len = order_base_2(ddw_memory_hotplug_max()); 1230 1148 struct ddw_query_response query; 1231 1149 struct ddw_create_response create; 1232 1150 int page_shift; 1233 - u64 dma_addr; 1151 + u64 win_addr; 1152 + const char *win_name; 1234 1153 struct device_node *dn; 1235 1154 u32 ddw_avail[DDW_APPLICABLE_SIZE]; 1236 - struct direct_window *window; 1155 + struct dma_win *window; 1237 1156 struct property *win64; 1238 - struct dynamic_dma_window_prop *ddwprop; 1157 + bool ddw_enabled = false; 1239 1158 struct failed_ddw_pdn *fpdn; 1240 - bool default_win_removed = false; 1159 + bool default_win_removed = false, direct_mapping = false; 1241 1160 bool pmem_present; 1161 + struct pci_dn *pci = PCI_DN(pdn); 1162 + struct iommu_table *tbl = pci->table_group->tables[0]; 1242 1163 1243 1164 dn = of_find_node_by_type(NULL, "ibm,pmemory"); 1244 1165 pmem_present = dn != NULL; 1245 1166 of_node_put(dn); 1246 1167 1247 - mutex_lock(&direct_window_init_mutex); 1168 + mutex_lock(&dma_win_init_mutex); 1248 1169 1249 - dma_addr = find_existing_ddw(pdn, &len); 1250 - if (dma_addr != 0) 1170 + if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { 1171 + direct_mapping = (len >= max_ram_len); 1172 + ddw_enabled = true; 1251 1173 goto out_unlock; 1174 + } 1252 1175 1253 1176 /* 1254 1177 * If we already went through this for a previous function of ··· 1327 1240 1328 1241 page_shift = iommu_get_page_shift(query.page_size); 1329 1242 if (!page_shift) { 1330 - dev_dbg(&dev->dev, "no supported direct page size in mask %x", 1331 - query.page_size); 1243 + dev_dbg(&dev->dev, "no supported page size in mask %x", 1244 + query.page_size); 1332 1245 goto out_failed; 1333 1246 } 1334 - /* verify the window * number of ptes will map the partition */ 1335 - /* check 
largest block * page size > max memory hotplug addr */ 1247 + 1248 + 1336 1249 /* 1337 1250 * The "ibm,pmemory" can appear anywhere in the address space. 1338 1251 * Assuming it is still backed by page structs, try MAX_PHYSMEM_BITS ··· 1348 1261 dev_info(&dev->dev, "Skipping ibm,pmemory"); 1349 1262 } 1350 1263 1264 + /* check if the available block * number of ptes will map everything */ 1351 1265 if (query.largest_available_block < (1ULL << (len - page_shift))) { 1352 1266 dev_dbg(&dev->dev, 1353 1267 "can't map partition max 0x%llx with %llu %llu-sized pages\n", 1354 1268 1ULL << len, 1355 1269 query.largest_available_block, 1356 1270 1ULL << page_shift); 1357 - goto out_failed; 1358 - } 1359 - win64 = kzalloc(sizeof(struct property), GFP_KERNEL); 1360 - if (!win64) { 1361 - dev_info(&dev->dev, 1362 - "couldn't allocate property for 64bit dma window\n"); 1363 - goto out_failed; 1364 - } 1365 - win64->name = kstrdup(DIRECT64_PROPNAME, GFP_KERNEL); 1366 - win64->value = ddwprop = kmalloc(sizeof(*ddwprop), GFP_KERNEL); 1367 - win64->length = sizeof(*ddwprop); 1368 - if (!win64->name || !win64->value) { 1369 - dev_info(&dev->dev, 1370 - "couldn't allocate property name and value\n"); 1371 - goto out_free_prop; 1271 + 1272 + /* DDW + IOMMU on single window may fail if there is any allocation */ 1273 + if (default_win_removed && iommu_table_in_use(tbl)) { 1274 + dev_dbg(&dev->dev, "current IOMMU table in use, can't be replaced.\n"); 1275 + goto out_failed; 1276 + } 1277 + 1278 + len = order_base_2(query.largest_available_block << page_shift); 1279 + win_name = DMA64_PROPNAME; 1280 + } else { 1281 + direct_mapping = true; 1282 + win_name = DIRECT64_PROPNAME; 1372 1283 } 1373 1284 1374 1285 ret = create_ddw(dev, ddw_avail, &create, page_shift, len); 1375 1286 if (ret != 0) 1376 - goto out_free_prop; 1377 - 1378 - ddwprop->liobn = cpu_to_be32(create.liobn); 1379 - ddwprop->dma_base = cpu_to_be64(((u64)create.addr_hi << 32) | 1380 - create.addr_lo); 1381 - 
ddwprop->tce_shift = cpu_to_be32(page_shift); 1382 - ddwprop->window_shift = cpu_to_be32(len); 1287 + goto out_failed; 1383 1288 1384 1289 dev_dbg(&dev->dev, "created tce table LIOBN 0x%x for %pOF\n", 1385 1290 create.liobn, dn); 1386 1291 1387 - window = kzalloc(sizeof(*window), GFP_KERNEL); 1388 - if (!window) 1389 - goto out_clear_window; 1292 + win_addr = ((u64)create.addr_hi << 32) | create.addr_lo; 1293 + win64 = ddw_property_create(win_name, create.liobn, win_addr, page_shift, len); 1390 1294 1391 - ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT, 1392 - win64->value, tce_setrange_multi_pSeriesLP_walk); 1393 - if (ret) { 1394 - dev_info(&dev->dev, "failed to map direct window for %pOF: %d\n", 1395 - dn, ret); 1396 - goto out_free_window; 1295 + if (!win64) { 1296 + dev_info(&dev->dev, 1297 + "couldn't allocate property, property name, or value\n"); 1298 + goto out_remove_win; 1397 1299 } 1398 1300 1399 1301 ret = of_add_property(pdn, win64); 1400 1302 if (ret) { 1401 - dev_err(&dev->dev, "unable to add dma window property for %pOF: %d", 1402 - pdn, ret); 1403 - goto out_free_window; 1303 + dev_err(&dev->dev, "unable to add DMA window property for %pOF: %d", 1304 + pdn, ret); 1305 + goto out_free_prop; 1404 1306 } 1405 1307 1406 - window->device = pdn; 1407 - window->prop = ddwprop; 1408 - spin_lock(&direct_window_list_lock); 1409 - list_add(&window->list, &direct_window_list); 1410 - spin_unlock(&direct_window_list_lock); 1308 + window = ddw_list_new_entry(pdn, win64->value); 1309 + if (!window) 1310 + goto out_del_prop; 1411 1311 1412 - dma_addr = be64_to_cpu(ddwprop->dma_base); 1312 + if (direct_mapping) { 1313 + /* DDW maps the whole partition, so enable direct DMA mapping */ 1314 + ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT, 1315 + win64->value, tce_setrange_multi_pSeriesLP_walk); 1316 + if (ret) { 1317 + dev_info(&dev->dev, "failed to map DMA window for %pOF: %d\n", 1318 + dn, ret); 1319 + 1320 + /* Make sure 
to clean DDW if any TCE was set */ 1321 + clean_dma_window(pdn, win64->value); 1322 + goto out_del_list; 1323 + } 1324 + } else { 1325 + struct iommu_table *newtbl; 1326 + int i; 1327 + 1328 + for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources); i++) { 1329 + const unsigned long mask = IORESOURCE_MEM_64 | IORESOURCE_MEM; 1330 + 1331 + /* Look for MMIO32 */ 1332 + if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM) 1333 + break; 1334 + } 1335 + 1336 + if (i == ARRAY_SIZE(pci->phb->mem_resources)) 1337 + goto out_del_list; 1338 + 1339 + /* New table for using DDW instead of the default DMA window */ 1340 + newtbl = iommu_pseries_alloc_table(pci->phb->node); 1341 + if (!newtbl) { 1342 + dev_dbg(&dev->dev, "couldn't create new IOMMU table\n"); 1343 + goto out_del_list; 1344 + } 1345 + 1346 + iommu_table_setparms_common(newtbl, pci->phb->bus->number, create.liobn, win_addr, 1347 + 1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops); 1348 + iommu_init_table(newtbl, pci->phb->node, pci->phb->mem_resources[i].start, 1349 + pci->phb->mem_resources[i].end); 1350 + 1351 + pci->table_group->tables[1] = newtbl; 1352 + 1353 + /* Keep default DMA window struct if removed */ 1354 + if (default_win_removed) { 1355 + tbl->it_size = 0; 1356 + kfree(tbl->it_map); 1357 + } 1358 + 1359 + set_iommu_table_base(&dev->dev, newtbl); 1360 + } 1361 + 1362 + spin_lock(&dma_win_list_lock); 1363 + list_add(&window->list, &dma_win_list); 1364 + spin_unlock(&dma_win_list_lock); 1365 + 1366 + dev->dev.archdata.dma_offset = win_addr; 1367 + ddw_enabled = true; 1413 1368 goto out_unlock; 1414 1369 1415 - out_free_window: 1370 + out_del_list: 1416 1371 kfree(window); 1417 1372 1418 - out_clear_window: 1419 - remove_ddw(pdn, true); 1373 + out_del_prop: 1374 + of_remove_property(pdn, win64); 1420 1375 1421 1376 out_free_prop: 1422 1377 kfree(win64->name); 1423 1378 kfree(win64->value); 1424 1379 kfree(win64); 1380 + 1381 + out_remove_win: 1382 + /* DDW is clean, so it's ok to call this 
directly. */ 1383 + __remove_dma_window(pdn, ddw_avail, create.liobn); 1425 1384 1426 1385 out_failed: 1427 1386 if (default_win_removed) ··· 1480 1347 list_add(&fpdn->list, &failed_ddw_pdn_list); 1481 1348 1482 1349 out_unlock: 1483 - mutex_unlock(&direct_window_init_mutex); 1350 + mutex_unlock(&dma_win_init_mutex); 1484 1351 1485 1352 /* 1486 1353 * If we have persistent memory and the window size is only as big 1487 1354 * as RAM, then we failed to create a window to cover persistent 1488 1355 * memory and need to set the DMA limit. 1489 1356 */ 1490 - if (pmem_present && dma_addr && (len == max_ram_len)) 1491 - dev->dev.bus_dma_limit = dma_addr + (1ULL << len); 1357 + if (pmem_present && ddw_enabled && direct_mapping && len == max_ram_len) 1358 + dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len); 1492 1359 1493 - return dma_addr; 1360 + return ddw_enabled && direct_mapping; 1494 1361 } 1495 1362 1496 1363 static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) ··· 1532 1399 tbl = pci->table_group->tables[0]; 1533 1400 iommu_table_setparms_lpar(pci->phb, pdn, tbl, 1534 1401 pci->table_group, dma_window); 1535 - tbl->it_ops = &iommu_table_lpar_multi_ops; 1402 + 1536 1403 iommu_init_table(tbl, pci->phb->node, 0, 0); 1537 1404 iommu_register_group(pci->table_group, 1538 1405 pci_domain_nr(pci->phb->bus), 0); ··· 1569 1436 break; 1570 1437 } 1571 1438 1572 - if (pdn && PCI_DN(pdn)) { 1573 - pdev->dev.archdata.dma_offset = enable_ddw(pdev, pdn); 1574 - if (pdev->dev.archdata.dma_offset) 1575 - return true; 1576 - } 1439 + if (pdn && PCI_DN(pdn)) 1440 + return enable_ddw(pdev, pdn); 1577 1441 1578 1442 return false; 1579 1443 } ··· 1578 1448 static int iommu_mem_notifier(struct notifier_block *nb, unsigned long action, 1579 1449 void *data) 1580 1450 { 1581 - struct direct_window *window; 1451 + struct dma_win *window; 1582 1452 struct memory_notify *arg = data; 1583 1453 int ret = 0; 1584 1454 1585 1455 switch (action) { 1586 1456 case 
MEM_GOING_ONLINE: 1587 - spin_lock(&direct_window_list_lock); 1588 - list_for_each_entry(window, &direct_window_list, list) { 1457 + spin_lock(&dma_win_list_lock); 1458 + list_for_each_entry(window, &dma_win_list, list) { 1589 1459 ret |= tce_setrange_multi_pSeriesLP(arg->start_pfn, 1590 1460 arg->nr_pages, window->prop); 1591 1461 /* XXX log error */ 1592 1462 } 1593 - spin_unlock(&direct_window_list_lock); 1463 + spin_unlock(&dma_win_list_lock); 1594 1464 break; 1595 1465 case MEM_CANCEL_ONLINE: 1596 1466 case MEM_OFFLINE: 1597 - spin_lock(&direct_window_list_lock); 1598 - list_for_each_entry(window, &direct_window_list, list) { 1467 + spin_lock(&dma_win_list_lock); 1468 + list_for_each_entry(window, &dma_win_list, list) { 1599 1469 ret |= tce_clearrange_multi_pSeriesLP(arg->start_pfn, 1600 1470 arg->nr_pages, window->prop); 1601 1471 /* XXX log error */ 1602 1472 } 1603 - spin_unlock(&direct_window_list_lock); 1473 + spin_unlock(&dma_win_list_lock); 1604 1474 break; 1605 1475 default: 1606 1476 break; ··· 1621 1491 struct of_reconfig_data *rd = data; 1622 1492 struct device_node *np = rd->dn; 1623 1493 struct pci_dn *pci = PCI_DN(np); 1624 - struct direct_window *window; 1494 + struct dma_win *window; 1625 1495 1626 1496 switch (action) { 1627 1497 case OF_RECONFIG_DETACH_NODE: ··· 1632 1502 * we have to remove the property when releasing 1633 1503 * the device node. 
1634 1504 */ 1635 - remove_ddw(np, false); 1505 + if (remove_ddw(np, false, DIRECT64_PROPNAME)) 1506 + remove_ddw(np, false, DMA64_PROPNAME); 1507 + 1636 1508 if (pci && pci->table_group) 1637 1509 iommu_pseries_free_group(pci->table_group, 1638 1510 np->full_name); 1639 1511 1640 - spin_lock(&direct_window_list_lock); 1641 - list_for_each_entry(window, &direct_window_list, list) { 1512 + spin_lock(&dma_win_list_lock); 1513 + list_for_each_entry(window, &dma_win_list, list) { 1642 1514 if (window->device == np) { 1643 1515 list_del(&window->list); 1644 1516 kfree(window); 1645 1517 break; 1646 1518 } 1647 1519 } 1648 - spin_unlock(&direct_window_list_lock); 1520 + spin_unlock(&dma_win_list_lock); 1649 1521 break; 1650 1522 default: 1651 1523 err = NOTIFY_DONE;
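The hunk above scans `phb->mem_resources` for a 32-bit MMIO range: a resource whose flags have `IORESOURCE_MEM` set but not `IORESOURCE_MEM_64`. The selection logic can be sketched as a standalone userspace program; the flag values and names here are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IORESOURCE_MEM / IORESOURCE_MEM_64 */
#define RES_MEM     0x1UL
#define RES_MEM_64  0x2UL

struct toy_resource {
	unsigned long flags;
};

/*
 * Return the index of the first 32-bit MMIO resource (RES_MEM set,
 * RES_MEM_64 clear), or n if none is found -- modeling the loop in
 * the patched pseries DDW setup code.
 */
static size_t find_mmio32(const struct toy_resource *res, size_t n)
{
	const unsigned long mask = RES_MEM_64 | RES_MEM;
	size_t i;

	for (i = 0; i < n; i++)
		if ((res[i].flags & mask) == RES_MEM)
			break;
	return i;
}
```

The caller then treats an index equal to the array size as "no MMIO32 found" and bails out, exactly as the diff does with its `goto out_del_list`.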
+11 -7
arch/powerpc/platforms/pseries/lpar.c
···
 #include <linux/workqueue.h>
 #include <linux/proc_fs.h>
 #include <linux/pgtable.h>
+#include <linux/debugfs.h>
+
 #include <asm/processor.h>
 #include <asm/mmu.h>
 #include <asm/page.h>
···
 #include <asm/kexec.h>
 #include <asm/fadump.h>
 #include <asm/asm-prototypes.h>
-#include <asm/debugfs.h>
 #include <asm/dtl.h>
 
 #include "pseries.h"
···
 	if (!last_disp_cpu_assoc || !cur_disp_cpu_assoc)
 		return -EIO;
 
-	return cpu_distance(last_disp_cpu_assoc, cur_disp_cpu_assoc);
+	return cpu_relative_distance(last_disp_cpu_assoc, cur_disp_cpu_assoc);
 }
 
 static int cpu_home_node_dispatch_distance(int disp_cpu)
···
 	if (!disp_cpu_assoc || !vcpu_assoc)
 		return -EIO;
 
-	return cpu_distance(disp_cpu_assoc, vcpu_assoc);
+	return cpu_relative_distance(disp_cpu_assoc, vcpu_assoc);
 }
 
 static void update_vcpu_disp_stat(int disp_cpu)
···
 	return -1;
 }
 
-static void manual_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace void manual_hpte_clear_all(void)
 {
 	unsigned long size_bytes = 1UL << ppc64_pft_size;
 	unsigned long hpte_count = size_bytes >> 4;
···
 	}
 }
 
-static int hcall_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace int hcall_hpte_clear_all(void)
 {
 	int rc;
 
···
 	return rc;
 }
 
-static void pseries_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace void pseries_hpte_clear_all(void)
 {
 	int rc;
 
···
 	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
 		return 0;
 
-	vpa_dir = debugfs_create_dir("vpa", powerpc_debugfs_root);
+	vpa_dir = debugfs_create_dir("vpa", arch_debugfs_dir);
 
 	/* set up the per-cpu vpa file*/
 	for_each_possible_cpu(i) {
+225 -71
arch/powerpc/platforms/pseries/msi.c
···
 #include <asm/hw_irq.h>
 #include <asm/ppc-pci.h>
 #include <asm/machdep.h>
+#include <asm/xive.h>
 
 #include "pseries.h"
 
···
 	return rtas_ret[0];
 }
 
-static void rtas_teardown_msi_irqs(struct pci_dev *pdev)
-{
-	struct msi_desc *entry;
-
-	for_each_pci_msi_entry(entry, pdev) {
-		if (!entry->irq)
-			continue;
-
-		irq_set_msi_desc(entry->irq, NULL);
-		irq_dispose_mapping(entry->irq);
-	}
-
-	rtas_disable_msi(pdev);
-}
-
 static int check_req(struct pci_dev *pdev, int nvec, char *prop_name)
 {
 	struct device_node *dn;
···
 
 /* Quota calculation */
 
-static struct device_node *find_pe_total_msi(struct pci_dev *dev, int *total)
+static struct device_node *__find_pe_total_msi(struct device_node *node, int *total)
 {
 	struct device_node *dn;
 	const __be32 *p;
 
-	dn = of_node_get(pci_device_to_OF_node(dev));
+	dn = of_node_get(node);
 	while (dn) {
 		p = of_get_property(dn, "ibm,pe-total-#msi", NULL);
 		if (p) {
···
 	}
 
 	return NULL;
+}
+
+static struct device_node *find_pe_total_msi(struct pci_dev *dev, int *total)
+{
+	return __find_pe_total_msi(pci_device_to_OF_node(dev), total);
 }
 
 static struct device_node *find_pe_dn(struct pci_dev *dev, int *total)
···
 	pci_write_config_dword(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_HI, 0);
 }
 
-static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
+static int rtas_prepare_msi_irqs(struct pci_dev *pdev, int nvec_in, int type,
+				 msi_alloc_info_t *arg)
 {
 	struct pci_dn *pdn;
-	int hwirq, virq, i, quota, rc;
-	struct msi_desc *entry;
-	struct msi_msg msg;
+	int quota, rc;
 	int nvec = nvec_in;
 	int use_32bit_msi_hack = 0;
 
···
 		return rc;
 	}
 
-	i = 0;
-	for_each_pci_msi_entry(entry, pdev) {
-		hwirq = rtas_query_irq_number(pdn, i++);
-		if (hwirq < 0) {
-			pr_debug("rtas_msi: error (%d) getting hwirq\n", rc);
-			return hwirq;
-		}
+	return 0;
+}
 
-		/*
-		 * Depending on the number of online CPUs in the original
-		 * kernel, it is likely for CPU #0 to be offline in a kdump
-		 * kernel. The associated IRQs in the affinity mappings
-		 * provided by irq_create_affinity_masks() are thus not
-		 * started by irq_startup(), as per-design for managed IRQs.
-		 * This can be a problem with multi-queue block devices driven
-		 * by blk-mq : such a non-started IRQ is very likely paired
-		 * with the single queue enforced by blk-mq during kdump (see
-		 * blk_mq_alloc_tag_set()). This causes the device to remain
-		 * silent and likely hangs the guest at some point.
-		 *
-		 * We don't really care for fine-grained affinity when doing
-		 * kdump actually : simply ignore the pre-computed affinity
-		 * masks in this case and let the default mask with all CPUs
-		 * be used when creating the IRQ mappings.
-		 */
-		if (is_kdump_kernel())
-			virq = irq_create_mapping(NULL, hwirq);
-		else
-			virq = irq_create_mapping_affinity(NULL, hwirq,
-							   entry->affinity);
+static int pseries_msi_ops_prepare(struct irq_domain *domain, struct device *dev,
+				   int nvec, msi_alloc_info_t *arg)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct msi_desc *desc = first_pci_msi_entry(pdev);
+	int type = desc->msi_attrib.is_msix ? PCI_CAP_ID_MSIX : PCI_CAP_ID_MSI;
 
-		if (!virq) {
-			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-			return -ENOSPC;
-		}
+	return rtas_prepare_msi_irqs(pdev, nvec, type, arg);
+}
 
-		dev_dbg(&pdev->dev, "rtas_msi: allocated virq %d\n", virq);
-		irq_set_msi_desc(virq, entry);
+/*
+ * ->msi_free() is called before irq_domain_free_irqs_top() when the
+ * handler data is still available. Use that to clear the XIVE
+ * controller data.
+ */
+static void pseries_msi_ops_msi_free(struct irq_domain *domain,
+				     struct msi_domain_info *info,
+				     unsigned int irq)
+{
+	if (xive_enabled())
+		xive_irq_free_data(irq);
+}
 
-		/* Read config space back so we can restore after reset */
-		__pci_read_msi_msg(entry, &msg);
-		entry->msg = msg;
+/*
+ * RTAS can not disable one MSI at a time. It's all or nothing. Do it
+ * at the end after all IRQs have been freed.
+ */
+static void pseries_msi_domain_free_irqs(struct irq_domain *domain,
+					 struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return;
+
+	__msi_domain_free_irqs(domain, dev);
+
+	rtas_disable_msi(to_pci_dev(dev));
+}
+
+static struct msi_domain_ops pseries_pci_msi_domain_ops = {
+	.msi_prepare	= pseries_msi_ops_prepare,
+	.msi_free	= pseries_msi_ops_msi_free,
+	.domain_free_irqs = pseries_msi_domain_free_irqs,
+};
+
+static void pseries_msi_shutdown(struct irq_data *d)
+{
+	d = d->parent_data;
+	if (d->chip->irq_shutdown)
+		d->chip->irq_shutdown(d);
+}
+
+static void pseries_msi_mask(struct irq_data *d)
+{
+	pci_msi_mask_irq(d);
+	irq_chip_mask_parent(d);
+}
+
+static void pseries_msi_unmask(struct irq_data *d)
+{
+	pci_msi_unmask_irq(d);
+	irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip pseries_pci_msi_irq_chip = {
+	.name		= "pSeries-PCI-MSI",
+	.irq_shutdown	= pseries_msi_shutdown,
+	.irq_mask	= pseries_msi_mask,
+	.irq_unmask	= pseries_msi_unmask,
+	.irq_eoi	= irq_chip_eoi_parent,
+};
+
+static struct msi_domain_info pseries_msi_domain_info = {
+	.flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+		  MSI_FLAG_MULTI_PCI_MSI  | MSI_FLAG_PCI_MSIX),
+	.ops   = &pseries_pci_msi_domain_ops,
+	.chip  = &pseries_pci_msi_irq_chip,
+};
+
+static void pseries_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	__pci_read_msi_msg(irq_data_get_msi_desc(data), msg);
+}
+
+static struct irq_chip pseries_msi_irq_chip = {
+	.name			= "pSeries-MSI",
+	.irq_shutdown		= pseries_msi_shutdown,
+	.irq_mask		= irq_chip_mask_parent,
+	.irq_unmask		= irq_chip_unmask_parent,
+	.irq_eoi		= irq_chip_eoi_parent,
+	.irq_set_affinity	= irq_chip_set_affinity_parent,
+	.irq_compose_msi_msg	= pseries_msi_compose_msg,
+};
+
+static int pseries_irq_parent_domain_alloc(struct irq_domain *domain, unsigned int virq,
+					   irq_hw_number_t hwirq)
+{
+	struct irq_fwspec parent_fwspec;
+	int ret;
+
+	parent_fwspec.fwnode = domain->parent->fwnode;
+	parent_fwspec.param_count = 2;
+	parent_fwspec.param[0] = hwirq;
+	parent_fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &parent_fwspec);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int pseries_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs, void *arg)
+{
+	struct pci_controller *phb = domain->host_data;
+	msi_alloc_info_t *info = arg;
+	struct msi_desc *desc = info->desc;
+	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
+	int hwirq;
+	int i, ret;
+
+	hwirq = rtas_query_irq_number(pci_get_pdn(pdev), desc->msi_attrib.entry_nr);
+	if (hwirq < 0) {
+		dev_err(&pdev->dev, "Failed to query HW IRQ: %d\n", hwirq);
+		return hwirq;
+	}
+
+	dev_dbg(&pdev->dev, "%s bridge %pOF %d/%x #%d\n", __func__,
+		phb->dn, virq, hwirq, nr_irqs);
+
+	for (i = 0; i < nr_irqs; i++) {
+		ret = pseries_irq_parent_domain_alloc(domain, virq + i, hwirq + i);
+		if (ret)
+			goto out;
+
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &pseries_msi_irq_chip, domain->host_data);
 	}
 
 	return 0;
+
+out:
+	/* TODO: handle RTAS cleanup in ->msi_finish() ? */
+	irq_domain_free_irqs_parent(domain, virq, i - 1);
+	return ret;
+}
+
+static void pseries_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+	struct pci_controller *phb = irq_data_get_irq_chip_data(d);
+
+	pr_debug("%s bridge %pOF %d #%d\n", __func__, phb->dn, virq, nr_irqs);
+
+	/* XIVE domain data is cleared through ->msi_free() */
+}
+
+static const struct irq_domain_ops pseries_irq_domain_ops = {
+	.alloc  = pseries_irq_domain_alloc,
+	.free   = pseries_irq_domain_free,
+};
+
+static int __pseries_msi_allocate_domains(struct pci_controller *phb,
+					  unsigned int count)
+{
+	struct irq_domain *parent = irq_get_default_host();
+
+	phb->fwnode = irq_domain_alloc_named_id_fwnode("pSeries-MSI",
+						       phb->global_number);
+	if (!phb->fwnode)
+		return -ENOMEM;
+
+	phb->dev_domain = irq_domain_create_hierarchy(parent, 0, count,
+						      phb->fwnode,
+						      &pseries_irq_domain_ops, phb);
+	if (!phb->dev_domain) {
+		pr_err("PCI: failed to create IRQ domain bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		irq_domain_free_fwnode(phb->fwnode);
+		return -ENOMEM;
+	}
+
+	phb->msi_domain = pci_msi_create_irq_domain(of_node_to_fwnode(phb->dn),
+						    &pseries_msi_domain_info,
+						    phb->dev_domain);
+	if (!phb->msi_domain) {
+		pr_err("PCI: failed to create MSI IRQ domain bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		irq_domain_free_fwnode(phb->fwnode);
+		irq_domain_remove(phb->dev_domain);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+int pseries_msi_allocate_domains(struct pci_controller *phb)
+{
+	int count;
+
+	if (!__find_pe_total_msi(phb->dn, &count)) {
+		pr_err("PCI: failed to find MSIs for bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		return -ENOSPC;
+	}
+
+	return __pseries_msi_allocate_domains(phb, count);
+}
+
+void pseries_msi_free_domains(struct pci_controller *phb)
+{
+	if (phb->msi_domain)
+		irq_domain_remove(phb->msi_domain);
+	if (phb->dev_domain)
+		irq_domain_remove(phb->dev_domain);
+	if (phb->fwnode)
+		irq_domain_free_fwnode(phb->fwnode);
 }
 
 static void rtas_msi_pci_irq_fixup(struct pci_dev *pdev)
···
 
 static int rtas_msi_init(void)
 {
-	struct pci_controller *phb;
-
 	query_token  = rtas_token("ibm,query-interrupt-source-number");
 	change_token = rtas_token("ibm,change-msi");
 
···
 	}
 
 	pr_debug("rtas_msi: Registering RTAS MSI callbacks.\n");
-
-	WARN_ON(pseries_pci_controller_ops.setup_msi_irqs);
-	pseries_pci_controller_ops.setup_msi_irqs = rtas_setup_msi_irqs;
-	pseries_pci_controller_ops.teardown_msi_irqs = rtas_teardown_msi_irqs;
-
-	list_for_each_entry(phb, &hose_list, list_node) {
-		WARN_ON(phb->controller_ops.setup_msi_irqs);
-		phb->controller_ops.setup_msi_irqs = rtas_setup_msi_irqs;
-		phb->controller_ops.teardown_msi_irqs = rtas_teardown_msi_irqs;
-	}
 
 	WARN_ON(ppc_md.pci_irq_fixup);
 	ppc_md.pci_irq_fixup = rtas_msi_pci_irq_fixup;
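The new `pseries_msi_mask()`/`pseries_msi_unmask()` callbacks above show the layered-chip pattern of hierarchical IRQ domains: mask the interrupt at the local (PCI MSI) level, then forward the same request to the parent controller. A toy userspace model of that chaining, with all names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a two-level mask: the MSI entry is masked at the PCI
 * level first, then the request is chained to the parent (interrupt
 * controller) level. Not kernel API -- just the control flow.
 */
struct toy_irq {
	bool pci_masked;     /* MSI mask bit in config space */
	bool parent_masked;  /* source masked at the controller */
};

static void toy_pci_msi_mask(struct toy_irq *d)
{
	d->pci_masked = true;
}

static void toy_parent_mask(struct toy_irq *d)
{
	d->parent_masked = true;
}

/* Analogue of pseries_msi_mask(): mask locally, then chain upward */
static void toy_msi_mask(struct toy_irq *d)
{
	toy_pci_msi_mask(d);
	toy_parent_mask(d);
}
```

Masking at both levels is what lets the MSI layer keep its bookkeeping (e.g. for restore after reset) while the underlying XIVE/XICS source is also quiesced.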
+4
arch/powerpc/platforms/pseries/pci_dlpar.c
···
 
 	pci_devs_phb_init_dynamic(phb);
 
+	pseries_msi_allocate_domains(phb);
+
 	/* Create EEH devices for the PHB */
 	eeh_phb_pe_create(phb);
 
···
 			return 1;
 		}
 	}
+
+	pseries_msi_free_domains(phb);
 
 	/* Remove the PCI bus and unregister the bridge device from sysfs */
 	phb->bus = NULL;
+2
arch/powerpc/platforms/pseries/pseries.h
···
 int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
 
 extern struct pci_controller_ops pseries_pci_controller_ops;
+int pseries_msi_allocate_domains(struct pci_controller *phb);
+void pseries_msi_free_domains(struct pci_controller *phb);
 
 unsigned long pseries_memory_block_size(void);
 
+1 -1
arch/powerpc/platforms/pseries/ras.c
···
 {
 	int recovered = 0;
 
-	if (!(regs->msr & MSR_RI)) {
+	if (regs_is_unrecoverable(regs)) {
 		/* If MSR_RI isn't set, we cannot recover */
 		pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");
 		recovered = 0;
+2
arch/powerpc/platforms/pseries/setup.c
···
 
 		/* create pci_dn's for DT nodes under this PHB */
 		pci_devs_phb_init_dynamic(phb);
+
+		pseries_msi_allocate_domains(phb);
 	}
 
 	of_node_put(root);
+1 -1
arch/powerpc/platforms/pseries/vas.c
···
  * Note: The hypervisor forwards an interrupt for each fault request.
  * So one fault CRB to process for each H_GET_NX_FAULT hcall.
  */
-irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
+static irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
 {
 	struct pseries_vas_window *txwin = data;
 	struct coprocessor_request_block crb;
+1 -1
arch/powerpc/sysdev/fsl_rio.c
···
 			__func__);
 		out_be32((u32 *)(rio_regs_win + RIO_LTLEDCSR),
 			 0);
-		regs_set_return_msr(regs, regs->msr | MSR_RI);
+		regs_set_recoverable(regs);
 		regs_set_return_ip(regs, extable_fixup(entry));
 		return 1;
 	}
+5 -8
arch/powerpc/sysdev/xics/ics-native.c
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_native_map(struct ics *ics, unsigned int virq)
+static int ics_native_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int vec = (unsigned int)virq_to_hw(virq);
 	struct ics_native *in = to_ics_native(ics);
 
-	pr_devel("%s: vec=0x%x\n", __func__, vec);
+	pr_devel("%s: hw_irq=0x%x\n", __func__, hw_irq);
 
-	if (vec < in->ibase || vec >= (in->ibase + in->icount))
+	if (hw_irq < in->ibase || hw_irq >= (in->ibase + in->icount))
 		return -EINVAL;
-
-	irq_set_chip_and_handler(virq, &ics_native_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, ics);
 
 	return 0;
 }
···
 }
 
 static struct ics ics_native_template = {
-	.map		= ics_native_map,
+	.check		= ics_native_check,
 	.mask_unknown	= ics_native_mask_unknown,
 	.get_server	= ics_native_get_server,
 	.host_match	= ics_native_host_match,
+	.chip		= &ics_native_irq_chip,
 };
 
 static int __init ics_native_add_one(struct device_node *np)
+11 -29
arch/powerpc/sysdev/xics/ics-opal.c
···
 
 static unsigned int ics_opal_startup(struct irq_data *d)
 {
-#ifdef CONFIG_PCI_MSI
-	/*
-	 * The generic MSI code returns with the interrupt disabled on the
-	 * card, using the MSI mask bits. Firmware doesn't appear to unmask
-	 * at that level, so we do it here by hand.
-	 */
-	if (irq_data_get_msi_desc(d))
-		pci_msi_unmask_irq(d);
-#endif
-
-	/* unmask it */
 	ics_opal_unmask_irq(d);
 	return 0;
 }
···
 	}
 	server = ics_opal_mangle_server(wanted_server);
 
-	pr_devel("ics-hal: set-affinity irq %d [hw 0x%x] server: 0x%x/0x%x\n",
+	pr_debug("ics-hal: set-affinity irq %d [hw 0x%x] server: 0x%x/0x%x\n",
 		 d->irq, hw_irq, wanted_server, server);
 
 	rc = opal_set_xive(hw_irq, server, priority);
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_opal_map(struct ics *ics, unsigned int virq);
-static void ics_opal_mask_unknown(struct ics *ics, unsigned long vec);
-static long ics_opal_get_server(struct ics *ics, unsigned long vec);
-
 static int ics_opal_host_match(struct ics *ics, struct device_node *node)
 {
 	return 1;
 }
 
-/* Only one global & state struct ics */
-static struct ics ics_hal = {
-	.map		= ics_opal_map,
-	.mask_unknown	= ics_opal_mask_unknown,
-	.get_server	= ics_opal_get_server,
-	.host_match	= ics_opal_host_match,
-};
-
-static int ics_opal_map(struct ics *ics, unsigned int virq)
+static int ics_opal_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int hw_irq = (unsigned int)virq_to_hw(virq);
 	int64_t rc;
 	__be16 server;
 	int8_t priority;
···
 	rc = opal_get_xive(hw_irq, &server, &priority);
 	if (rc != OPAL_SUCCESS)
 		return -ENXIO;
-
-	irq_set_chip_and_handler(virq, &ics_opal_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, &ics_hal);
 
 	return 0;
 }
···
 		return -1;
 	return ics_opal_unmangle_server(be16_to_cpu(server));
 }
+
+/* Only one global & state struct ics */
+static struct ics ics_hal = {
+	.check		= ics_opal_check,
+	.mask_unknown	= ics_opal_mask_unknown,
+	.get_server	= ics_opal_get_server,
+	.host_match	= ics_opal_host_match,
+	.chip		= &ics_opal_irq_chip,
+};
 
 int __init ics_opal_init(void)
 {
+13 -27
arch/powerpc/sysdev/xics/ics-rtas.c
···
 static int ibm_int_on;
 static int ibm_int_off;
 
-static int ics_rtas_map(struct ics *ics, unsigned int virq);
-static void ics_rtas_mask_unknown(struct ics *ics, unsigned long vec);
-static long ics_rtas_get_server(struct ics *ics, unsigned long vec);
-static int ics_rtas_host_match(struct ics *ics, struct device_node *node);
-
-/* Only one global & state struct ics */
-static struct ics ics_rtas = {
-	.map		= ics_rtas_map,
-	.mask_unknown	= ics_rtas_mask_unknown,
-	.get_server	= ics_rtas_get_server,
-	.host_match	= ics_rtas_host_match,
-};
-
 static void ics_rtas_unmask_irq(struct irq_data *d)
 {
 	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
···
 
 static unsigned int ics_rtas_startup(struct irq_data *d)
 {
-#ifdef CONFIG_PCI_MSI
-	/*
-	 * The generic MSI code returns with the interrupt disabled on the
-	 * card, using the MSI mask bits. Firmware doesn't appear to unmask
-	 * at that level, so we do it here by hand.
-	 */
-	if (irq_data_get_msi_desc(d))
-		pci_msi_unmask_irq(d);
-#endif
 	/* unmask it */
 	ics_rtas_unmask_irq(d);
 	return 0;
···
 		return -1;
 	}
 
+	pr_debug("%s: irq %d [hw 0x%x] server: 0x%x\n", __func__, d->irq,
+		 hw_irq, irq_server);
+
 	status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL,
 				     hw_irq, irq_server, xics_status[1]);
 
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_rtas_map(struct ics *ics, unsigned int virq)
+static int ics_rtas_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int hw_irq = (unsigned int)virq_to_hw(virq);
 	int status[2];
 	int rc;
 
···
 	rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, hw_irq);
 	if (rc)
 		return -ENXIO;
-
-	irq_set_chip_and_handler(virq, &ics_rtas_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, &ics_rtas);
 
 	return 0;
 }
···
 	 */
 	return !of_device_is_compatible(node, "chrp,iic");
 }
+
+/* Only one global & state struct ics */
+static struct ics ics_rtas = {
+	.check		= ics_rtas_check,
+	.mask_unknown	= ics_rtas_mask_unknown,
+	.get_server	= ics_rtas_get_server,
+	.host_match	= ics_rtas_host_match,
+	.chip		= &ics_rtas_irq_chip,
+};
 
 __init int ics_rtas_init(void)
 {
+93 -38
arch/powerpc/sysdev/xics/xics-common.c
···
 
 struct irq_domain *xics_host;
 
-static LIST_HEAD(ics_list);
+static struct ics *xics_ics;
 
 void xics_update_irq_servers(void)
 {
···
 
 void xics_mask_unknown_vec(unsigned int vec)
 {
-	struct ics *ics;
-
 	pr_err("Interrupt 0x%x (real) is invalid, disabling it.\n", vec);
 
-	list_for_each_entry(ics, &ics_list, link)
-		ics->mask_unknown(ics, vec);
+	if (WARN_ON(!xics_ics))
+		return;
+	xics_ics->mask_unknown(xics_ics, vec);
 }
 
···
 	 * IPIs are marked IRQF_PERCPU. The handler was set in map.
 	 */
 	BUG_ON(request_irq(ipi, icp_ops->ipi_action,
-			   IRQF_PERCPU | IRQF_NO_THREAD, "IPI", NULL));
+			   IRQF_NO_DEBUG | IRQF_PERCPU | IRQF_NO_THREAD, "IPI", NULL));
 }
 
 void __init xics_smp_probe(void)
···
 	unsigned int irq, virq;
 	struct irq_desc *desc;
 
+	pr_debug("%s: CPU %u\n", __func__, cpu);
+
 	/* If we used to be the default server, move to the new "boot_cpuid" */
 	if (hw_cpu == xics_default_server)
 		xics_update_irq_servers();
···
 		struct irq_chip *chip;
 		long server;
 		unsigned long flags;
-		struct ics *ics;
+		struct irq_data *irqd;
 
 		/* We can't set affinity on ISA interrupts */
 		if (virq < NR_IRQS_LEGACY)
···
 		/* We only need to migrate enabled IRQS */
 		if (!desc->action)
 			continue;
-		if (desc->irq_data.domain != xics_host)
+		/* We need a mapping in the XICS IRQ domain */
+		irqd = irq_domain_get_irq_data(xics_host, virq);
+		if (!irqd)
 			continue;
-		irq = desc->irq_data.hwirq;
+		irq = irqd_to_hwirq(irqd);
 		/* We need to get IPIs still. */
 		if (irq == XICS_IPI || irq == XICS_IRQ_SPURIOUS)
 			continue;
···
 		raw_spin_lock_irqsave(&desc->lock, flags);
 
 		/* Locate interrupt server */
-		server = -1;
-		ics = irq_desc_get_chip_data(desc);
-		if (ics)
-			server = ics->get_server(ics, irq);
+		server = xics_ics->get_server(xics_ics, irq);
 		if (server < 0) {
-			printk(KERN_ERR "%s: Can't find server for irq %d\n",
-			       __func__, irq);
+			pr_err("%s: Can't find server for irq %d/%x\n",
+			       __func__, virq, irq);
 			goto unlock;
 		}
 
···
 static int xics_host_match(struct irq_domain *h, struct device_node *node,
 			   enum irq_domain_bus_token bus_token)
 {
-	struct ics *ics;
-
-	list_for_each_entry(ics, &ics_list, link)
-		if (ics->host_match(ics, node))
-			return 1;
-
-	return 0;
+	if (WARN_ON(!xics_ics))
+		return 0;
+	return xics_ics->host_match(xics_ics, node) ? 1 : 0;
 }
 
 /* Dummies */
···
 	.irq_unmask = xics_ipi_unmask,
 };
 
-static int xics_host_map(struct irq_domain *h, unsigned int virq,
-			 irq_hw_number_t hw)
+static int xics_host_map(struct irq_domain *domain, unsigned int virq,
+			 irq_hw_number_t hwirq)
 {
-	struct ics *ics;
-
-	pr_devel("xics: map virq %d, hwirq 0x%lx\n", virq, hw);
+	pr_devel("xics: map virq %d, hwirq 0x%lx\n", virq, hwirq);
 
 	/*
 	 * Mark interrupts as edge sensitive by default so that resend
···
 	irq_clear_status_flags(virq, IRQ_LEVEL);
 
 	/* Don't call into ICS for IPIs */
-	if (hw == XICS_IPI) {
+	if (hwirq == XICS_IPI) {
 		irq_set_chip_and_handler(virq, &xics_ipi_chip,
 					 handle_percpu_irq);
 		return 0;
 	}
 
-	/* Let the ICS setup the chip data */
-	list_for_each_entry(ics, &ics_list, link)
-		if (ics->map(ics, virq) == 0)
-			return 0;
+	if (WARN_ON(!xics_ics))
+		return -EINVAL;
 
-	return -EINVAL;
+	if (xics_ics->check(xics_ics, hwirq))
+		return -EINVAL;
+
+	/* No chip data for the XICS domain */
+	irq_domain_set_info(domain, virq, hwirq, xics_ics->chip,
+			    NULL, handle_fasteoi_irq, NULL, NULL);
+
+	return 0;
 }
 
 static int xics_host_xlate(struct irq_domain *h, struct device_node *ct,
···
 	return 0;
 }
 
+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
+static int xics_host_domain_translate(struct irq_domain *d, struct irq_fwspec *fwspec,
+				      unsigned long *hwirq, unsigned int *type)
+{
+	return xics_host_xlate(d, to_of_node(fwspec->fwnode), fwspec->param,
+			       fwspec->param_count, hwirq, type);
+}
+
+static int xics_host_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				  unsigned int nr_irqs, void *arg)
+{
+	struct irq_fwspec *fwspec = arg;
+	irq_hw_number_t hwirq;
+	unsigned int type = IRQ_TYPE_NONE;
+	int i, rc;
+
+	rc = xics_host_domain_translate(domain, fwspec, &hwirq, &type);
+	if (rc)
+		return rc;
+
+	pr_debug("%s %d/%lx #%d\n", __func__, virq, hwirq, nr_irqs);
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_set_info(domain, virq + i, hwirq + i, xics_ics->chip,
+				    xics_ics, handle_fasteoi_irq, NULL, NULL);
+
+	return 0;
+}
+
+static void xics_host_domain_free(struct irq_domain *domain,
+				  unsigned int virq, unsigned int nr_irqs)
+{
+	pr_debug("%s %d #%d\n", __func__, virq, nr_irqs);
+}
+#endif
+
 static const struct irq_domain_ops xics_host_ops = {
+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
+	.alloc	= xics_host_domain_alloc,
+	.free	= xics_host_domain_free,
+	.translate = xics_host_domain_translate,
+#endif
 	.match = xics_host_match,
 	.map = xics_host_map,
 	.xlate = xics_host_xlate,
 };
 
-static void __init xics_init_host(void)
+static int __init xics_allocate_domain(void)
 {
-	xics_host = irq_domain_add_tree(NULL, &xics_host_ops, NULL);
-	BUG_ON(xics_host == NULL);
+	struct fwnode_handle *fn;
+
+	fn = irq_domain_alloc_named_fwnode("XICS");
+	if (!fn)
+		return -ENOMEM;
+
+	xics_host = irq_domain_create_tree(fn, &xics_host_ops, NULL);
+	if (!xics_host) {
+		irq_domain_free_fwnode(fn);
+		return -ENOMEM;
+	}
+
 	irq_set_default_host(xics_host);
+	return 0;
 }
 
 void __init xics_register_ics(struct ics *ics)
 {
-	list_add(&ics->link, &ics_list);
+	if (WARN_ONCE(xics_ics, "XICS: Source Controller is already defined !"))
+		return;
+	xics_ics = ics;
 }
 
 static void __init xics_get_server_size(void)
···
 	/* Initialize common bits */
 	xics_get_server_size();
 	xics_update_irq_servers();
-	xics_init_host();
+	rc = xics_allocate_domain();
+	if (rc < 0)
+		pr_err("XICS: Failed to create IRQ domain");
 	xics_setup_cpu();
 }
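A recurring change in the XICS hunks above is replacing the `ics_list` linked list with a single `xics_ics` pointer, with `xics_register_ics()` refusing a second registration. The registration logic can be modeled in a few lines of userspace C; all names are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the single-controller registration pattern: the list of
 * ICS backends becomes one pointer, and a second registration is
 * refused (the kernel version emits WARN_ONCE and ignores it).
 */
struct toy_ics {
	const char *name;
};

static struct toy_ics *toy_xics_ics;

/* Returns 0 on success, -1 if a controller is already registered */
static int toy_register_ics(struct toy_ics *ics)
{
	if (toy_xics_ics)
		return -1;
	toy_xics_ics = ics;
	return 0;
}
```

Only one ICS backend (native, OPAL, or RTAS) can exist on a given platform, so a plain pointer removes the per-interrupt list walk that `xics_host_map()` previously needed.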
+77 -26
arch/powerpc/sysdev/xive/common.c
··· 21 21 #include <linux/msi.h> 22 22 #include <linux/vmalloc.h> 23 23 24 - #include <asm/debugfs.h> 25 24 #include <asm/prom.h> 26 25 #include <asm/io.h> 27 26 #include <asm/smp.h> ··· 312 313 struct irq_desc *desc; 313 314 314 315 for_each_irq_desc(i, desc) { 315 - struct irq_data *d = irq_desc_get_irq_data(desc); 316 - unsigned int hwirq = (unsigned int)irqd_to_hwirq(d); 316 + struct irq_data *d = irq_domain_get_irq_data(xive_irq_domain, i); 317 317 318 - if (d->domain == xive_irq_domain) 319 - xmon_xive_get_irq_config(hwirq, d); 318 + if (d) 319 + xmon_xive_get_irq_config(irqd_to_hwirq(d), d); 320 320 } 321 321 } 322 322 ··· 615 617 pr_devel("xive_irq_startup: irq %d [0x%x] data @%p\n", 616 618 d->irq, hw_irq, d); 617 619 618 - #ifdef CONFIG_PCI_MSI 619 - /* 620 - * The generic MSI code returns with the interrupt disabled on the 621 - * card, using the MSI mask bits. Firmware doesn't appear to unmask 622 - * at that level, so we do it here by hand. 623 - */ 624 - if (irq_data_get_msi_desc(d)) 625 - pci_msi_unmask_irq(d); 626 - #endif 627 - 628 620 /* Pick a target */ 629 621 target = xive_pick_irq_target(d, irq_data_get_affinity_mask(d)); 630 622 if (target == XIVE_INVALID_TARGET) { ··· 702 714 u32 target, old_target; 703 715 int rc = 0; 704 716 705 - pr_devel("xive_irq_set_affinity: irq %d\n", d->irq); 717 + pr_debug("%s: irq %d/%x\n", __func__, d->irq, hw_irq); 706 718 707 719 /* Is this valid ? */
708 720 if (cpumask_any_and(cpumask, cpu_online_mask) >= nr_cpu_ids) 709 721 return -EINVAL; 710 - 711 - /* Don't do anything if the interrupt isn't started */ 712 - if (!irqd_is_started(d)) 713 - return IRQ_SET_MASK_OK; 714 722 715 723 /* 716 724 * If existing target is already in the new mask, and is ··· 743 759 return rc; 744 760 } 745 761 746 - pr_devel(" target: 0x%x\n", target); 762 + pr_debug(" target: 0x%x\n", target); 747 763 xd->target = target; 748 764 749 765 /* Give up previous target */ ··· 974 990 975 991 void xive_cleanup_irq_data(struct xive_irq_data *xd) 976 992 { 993 + pr_debug("%s for HW %x\n", __func__, xd->hw_irq); 994 + 977 995 if (xd->eoi_mmio) { 978 996 iounmap(xd->eoi_mmio); 979 997 if (xd->eoi_mmio == xd->trig_mmio) ··· 1017 1031 return 0; 1018 1032 } 1019 1033 1020 - static void xive_irq_free_data(unsigned int virq) 1034 + void xive_irq_free_data(unsigned int virq) 1021 1035 { 1022 1036 struct xive_irq_data *xd = irq_get_handler_data(virq); 1023 1037 ··· 1027 1041 xive_cleanup_irq_data(xd); 1028 1042 kfree(xd); 1029 1043 } 1044 + EXPORT_SYMBOL_GPL(xive_irq_free_data); 1030 1045 1031 1046 #ifdef CONFIG_SMP 1032 1047 ··· 1166 1179 return 0; 1167 1180 1168 1181 ret = request_irq(xid->irq, xive_muxed_ipi_action, 1169 - IRQF_PERCPU | IRQF_NO_THREAD, 1182 + IRQF_NO_DEBUG | IRQF_PERCPU | IRQF_NO_THREAD, 1170 1183 xid->name, NULL); 1171 1184 1172 1185 WARN(ret < 0, "Failed to request IPI %d: %d\n", xid->irq, ret); ··· 1366 1379 } 1367 1380 #endif 1368 1381 1382 + #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY 1383 + static int xive_irq_domain_translate(struct irq_domain *d, 1384 + struct irq_fwspec *fwspec, 1385 + unsigned long *hwirq, 1386 + unsigned int *type) 1387 + { 1388 + return xive_irq_domain_xlate(d, to_of_node(fwspec->fwnode), 1389 + fwspec->param, fwspec->param_count, 1390 + hwirq, type); 1391 + } 1392 + 1393 + static int xive_irq_domain_alloc(struct irq_domain *domain, unsigned int virq, 1394 + unsigned int nr_irqs, void *arg) 1395 + { 1396 + struct irq_fwspec *fwspec = arg;
1397 + irq_hw_number_t hwirq; 1398 + unsigned int type = IRQ_TYPE_NONE; 1399 + int i, rc; 1400 + 1401 + rc = xive_irq_domain_translate(domain, fwspec, &hwirq, &type); 1402 + if (rc) 1403 + return rc; 1404 + 1405 + pr_debug("%s %d/%lx #%d\n", __func__, virq, hwirq, nr_irqs); 1406 + 1407 + for (i = 0; i < nr_irqs; i++) { 1408 + /* TODO: call xive_irq_domain_map() */ 1409 + 1410 + /* 1411 + * Mark interrupts as edge sensitive by default so that resend 1412 + * actually works. Will fix that up below if needed. 1413 + */ 1414 + irq_clear_status_flags(virq, IRQ_LEVEL); 1415 + 1416 + /* allocates and sets handler data */ 1417 + rc = xive_irq_alloc_data(virq + i, hwirq + i); 1418 + if (rc) 1419 + return rc; 1420 + 1421 + irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i, 1422 + &xive_irq_chip, domain->host_data); 1423 + irq_set_handler(virq + i, handle_fasteoi_irq); 1424 + } 1425 + 1426 + return 0; 1427 + } 1428 + 1429 + static void xive_irq_domain_free(struct irq_domain *domain, 1430 + unsigned int virq, unsigned int nr_irqs) 1431 + { 1432 + int i; 1433 + 1434 + pr_debug("%s %d #%d\n", __func__, virq, nr_irqs); 1435 + 1436 + for (i = 0; i < nr_irqs; i++) 1437 + xive_irq_free_data(virq + i); 1438 + } 1439 + #endif 1440 + 1369 1441 static const struct irq_domain_ops xive_irq_domain_ops = { 1442 + #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY 1443 + .alloc = xive_irq_domain_alloc, 1444 + .free = xive_irq_domain_free, 1445 + .translate = xive_irq_domain_translate, 1446 + #endif 1370 1447 .match = xive_irq_domain_match, 1371 1448 .map = xive_irq_domain_map, 1372 1449 .unmap = xive_irq_domain_unmap, ··· 1768 1717 xive_debug_show_cpu(m, cpu); 1769 1718 1770 1719 for_each_irq_desc(i, desc) { 1771 - struct irq_data *d = irq_desc_get_irq_data(desc); 1720 + struct irq_data *d = irq_domain_get_irq_data(xive_irq_domain, i); 1772 1721 1773 - if (d->domain == xive_irq_domain) 1722 + if (d) 1774 1723 xive_debug_show_irq(m, d); 1775 1724 } 1776 1725 return 0;
··· 1780 1729 int xive_core_debug_init(void) 1781 1730 { 1782 1731 if (xive_enabled()) 1783 - debugfs_create_file("xive", 0400, powerpc_debugfs_root, 1732 + debugfs_create_file("xive", 0400, arch_debugfs_dir, 1784 1733 NULL, &xive_core_debug_fops); 1785 1734 return 0; 1786 1735 }
+10
arch/powerpc/sysdev/xive/native.c
··· 41 41 static u32 xive_pool_vps = XIVE_INVALID_VP; 42 42 static struct kmem_cache *xive_provision_cache; 43 43 static bool xive_has_single_esc; 44 + static bool xive_has_save_restore; 44 45 45 46 int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data) 46 47 { ··· 589 588 if (of_get_property(np, "single-escalation-support", NULL) != NULL) 590 589 xive_has_single_esc = true; 591 590 591 + if (of_get_property(np, "vp-save-restore", NULL)) 592 + xive_has_save_restore = true; 593 + 592 594 /* Configure Thread Management areas for KVM */ 593 595 for_each_possible_cpu(cpu) 594 596 kvmppc_set_xive_tima(cpu, r.start, tima); ··· 755 751 return xive_has_single_esc; 756 752 } 757 753 EXPORT_SYMBOL_GPL(xive_native_has_single_escalation); 754 + 755 + bool xive_native_has_save_restore(void) 756 + { 757 + return xive_has_save_restore; 758 + } 759 + EXPORT_SYMBOL_GPL(xive_native_has_save_restore); 758 760 759 761 int xive_native_get_queue_info(u32 vp_id, u32 prio, 760 762 u64 *out_qpage,
+12 -12
arch/powerpc/tools/head_check.sh
··· 49 49 $nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" > .tmp_symbols.txt 50 50 51 51 52 - vma=$(cat .tmp_symbols.txt | grep -e " [TA] _stext$" | cut -d' ' -f1) 52 + vma=$(grep -e " [TA] _stext$" .tmp_symbols.txt | cut -d' ' -f1) 53 53 54 - expected_start_head_addr=$vma 54 + expected_start_head_addr="$vma" 55 55 56 - start_head_addr=$(cat .tmp_symbols.txt | grep " t start_first_256B$" | cut -d' ' -f1) 56 + start_head_addr=$(grep " t start_first_256B$" .tmp_symbols.txt | cut -d' ' -f1) 57 57 58 58 if [ "$start_head_addr" != "$expected_start_head_addr" ]; then 59 - echo "ERROR: head code starts at $start_head_addr, should be $expected_start_head_addr" 60 - echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 61 - echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 59 + echo "ERROR: head code starts at $start_head_addr, should be $expected_start_head_addr" 1>&2 60 + echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 1>&2 61 + echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 1>&2 62 62 63 63 exit 1 64 64 fi 65 65 66 - top_vma=$(echo $vma | cut -d'0' -f1) 66 + top_vma=$(echo "$vma" | cut -d'0' -f1) 67 67 68 - expected_start_text_addr=$(cat .tmp_symbols.txt | grep " a text_start$" | cut -d' ' -f1 | sed "s/^0/$top_vma/") 68 + expected_start_text_addr=$(grep " a text_start$" .tmp_symbols.txt | cut -d' ' -f1 | sed "s/^0/$top_vma/") 69 69 70 - start_text_addr=$(cat .tmp_symbols.txt | grep " t start_text$" | cut -d' ' -f1) 70 + start_text_addr=$(grep " t start_text$" .tmp_symbols.txt | cut -d' ' -f1) 71 71 72 72 if [ "$start_text_addr" != "$expected_start_text_addr" ]; then 73 - echo "ERROR: start_text address is $start_text_addr, should be $expected_start_text_addr" 74 - echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 75 - echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 73 + echo "ERROR: start_text address is $start_text_addr, should be $expected_start_text_addr" 1>&2 74 + echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 1>&2 75 + echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 1>&2 76 76 77 77 exit 1 78 78 fi
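Two patterns recur in the head_check.sh cleanup: `grep pattern file` replaces the useless-use-of-cat pipeline `cat file | grep pattern`, and every ERROR message gains `1>&2` so it lands on stderr instead of stdout. A standalone illustration with a throwaway symbols file (the addresses are invented):

```shell
#!/bin/sh
tmp=$(mktemp)
printf '%s\n' 'c000000000000000 T _stext' \
              'c000000000000000 t start_first_256B' > "$tmp"

# grep reads the file directly -- no "cat | grep" pipeline needed
vma=$(grep -e " [TA] _stext$" "$tmp" | cut -d' ' -f1)
start_head_addr=$(grep " t start_first_256B$" "$tmp" | cut -d' ' -f1)

if [ "$start_head_addr" != "$vma" ]; then
	# diagnostics go to stderr, not stdout
	echo "ERROR: head code starts at $start_head_addr, should be $vma" 1>&2
fi
echo "head VMA: $vma"
rm -f "$tmp"
```

Routing errors to stderr matters when the script's stdout is consumed by the build system; quoting `"$vma"` likewise guards against word splitting.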
+6 -16
arch/powerpc/xmon/xmon.c
··· 26 26 #include <linux/ctype.h> 27 27 #include <linux/highmem.h> 28 28 #include <linux/security.h> 29 + #include <linux/debugfs.h> 29 30 30 - #include <asm/debugfs.h> 31 31 #include <asm/ptrace.h> 32 32 #include <asm/smp.h> 33 33 #include <asm/string.h> ··· 482 482 static inline void release_output_lock(void) {} 483 483 #endif 484 484 485 - static inline int unrecoverable_excp(struct pt_regs *regs) 486 - { 487 - #if defined(CONFIG_4xx) || defined(CONFIG_PPC_BOOK3E) 488 - /* We have no MSR_RI bit on 4xx or Book3e, so we simply return false */ 489 - return 0; 490 - #else 491 - return ((regs->msr & MSR_RI) == 0); 492 - #endif 493 - } 494 - 495 485 static void xmon_touch_watchdogs(void) 496 486 { 497 487 touch_softlockup_watchdog_sync(); ··· 555 565 bp = NULL; 556 566 if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) 557 567 bp = at_breakpoint(regs->nip); 558 - if (bp || unrecoverable_excp(regs)) 568 + if (bp || regs_is_unrecoverable(regs)) 559 569 fromipi = 0; 560 570 561 571 if (!fromipi) { ··· 567 577 cpu, BP_NUM(bp)); 568 578 xmon_print_symbol(regs->nip, " ", ")\n"); 569 579 } 570 - if (unrecoverable_excp(regs)) 580 + if (regs_is_unrecoverable(regs)) 571 581 printf("WARNING: exception is not recoverable, " 572 582 "can't continue\n"); 573 583 release_output_lock(); ··· 683 693 printf("Stopped at breakpoint %tx (", BP_NUM(bp)); 684 694 xmon_print_symbol(regs->nip, " ", ")\n"); 685 695 } 686 - if (unrecoverable_excp(regs)) 696 + if (regs_is_unrecoverable(regs)) 687 697 printf("WARNING: exception is not recoverable, " 688 698 "can't continue\n"); 689 699 remove_bpts(); ··· 4067 4077 4068 4078 static int __init setup_xmon_dbgfs(void) 4069 4079 { 4070 - debugfs_create_file("xmon", 0600, powerpc_debugfs_root, NULL, 4071 - &xmon_dbgfs_ops); 4080 + debugfs_create_file("xmon", 0600, arch_debugfs_dir, NULL, 4081 + &xmon_dbgfs_ops); 4072 4082 return 0; 4073 4083 } 4074 4084 device_initcall(setup_xmon_dbgfs);
+14 -2
drivers/cpufreq/powernv-cpufreq.c
··· 36 36 #define MAX_PSTATE_SHIFT 32 37 37 #define LPSTATE_SHIFT 48 38 38 #define GPSTATE_SHIFT 56 39 + #define MAX_NR_CHIPS 32 39 40 40 41 #define MAX_RAMP_DOWN_TIME 5120 41 42 /* ··· 1047 1046 unsigned int *chip; 1048 1047 unsigned int cpu, i; 1049 1048 unsigned int prev_chip_id = UINT_MAX; 1049 + cpumask_t *chip_cpu_mask; 1050 1050 int ret = 0; 1051 1051 1052 1052 chip = kcalloc(num_possible_cpus(), sizeof(*chip), GFP_KERNEL); 1053 1053 if (!chip) 1054 1054 return -ENOMEM; 1055 + 1056 + /* Allocate a chip cpu mask large enough to fit mask for all chips */ 1057 + chip_cpu_mask = kcalloc(MAX_NR_CHIPS, sizeof(cpumask_t), GFP_KERNEL); 1058 + if (!chip_cpu_mask) { 1059 + ret = -ENOMEM; 1060 + goto free_and_return; 1061 + } 1055 1062 1056 1063 for_each_possible_cpu(cpu) { 1057 1064 unsigned int id = cpu_to_chip_id(cpu); ··· 1068 1059 prev_chip_id = id; 1069 1060 chip[nr_chips++] = id; 1070 1061 } 1062 + cpumask_set_cpu(cpu, &chip_cpu_mask[nr_chips-1]); 1071 1063 } 1072 1064 1073 1065 chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL); 1074 1066 if (!chips) { 1075 1067 ret = -ENOMEM; 1076 - goto free_and_return; 1068 + goto out_free_chip_cpu_mask; 1077 1069 } 1078 1070 1079 1071 for (i = 0; i < nr_chips; i++) { 1080 1072 chips[i].id = chip[i]; 1081 - cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i])); 1073 + cpumask_copy(&chips[i].mask, &chip_cpu_mask[i]); 1082 1074 INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn); 1083 1075 for_each_cpu(cpu, &chips[i].mask) 1084 1076 per_cpu(chip_info, cpu) = &chips[i]; 1085 1077 } 1086 1078 1079 + out_free_chip_cpu_mask: 1080 + kfree(chip_cpu_mask); 1087 1081 free_and_return: 1088 1082 kfree(chip); 1089 1083 return ret;
+47 -32
drivers/cpuidle/cpuidle-pseries.c
··· 346 346 static void __init fixup_cede0_latency(void) 347 347 { 348 348 struct xcede_latency_payload *payload; 349 - u64 min_latency_us; 349 + u64 min_xcede_latency_us = UINT_MAX; 350 350 int i; 351 - 352 - min_latency_us = dedicated_states[1].exit_latency; // CEDE latency 353 351 354 352 if (parse_cede_parameters()) 355 353 return; ··· 356 358 nr_xcede_records); 357 359 358 360 payload = &xcede_latency_parameter.payload; 361 + 362 + /* 363 + * The CEDE idle state maps to CEDE(0). While the hypervisor 364 + * does not advertise CEDE(0) exit latency values, it does 365 + * advertise the latency values of the extended CEDE states. 366 + * We use the lowest advertised exit latency value as a proxy 367 + * for the exit latency of CEDE(0). 368 + */ 359 369 for (i = 0; i < nr_xcede_records; i++) { 360 370 struct xcede_latency_record *record = &payload->records[i]; 371 + u8 hint = record->hint; 361 372 u64 latency_tb = be64_to_cpu(record->latency_ticks); 362 373 u64 latency_us = DIV_ROUND_UP_ULL(tb_to_ns(latency_tb), NSEC_PER_USEC); 363 374 364 - if (latency_us == 0) 365 - pr_warn("cpuidle: xcede record %d has an unrealistic latency of 0us.\n", i); 375 + /* 376 + * We expect the exit latency of an extended CEDE 377 + * state to be non-zero, since it takes at least 378 + * a few nanoseconds to wakeup the idle CPU and 379 + * dispatch the virtual processor into the Linux 380 + * Guest. 381 + * 382 + * So we consider only non-zero value for performing 383 + * the fixup of CEDE(0) latency. 384 + */ 385 + if (latency_us == 0) { 386 + pr_warn("cpuidle: Skipping xcede record %d [hint=%d]. Exit latency = 0us\n", 387 + i, hint); 388 + continue; 389 + } 366 390 367 - if (latency_us < min_latency_us) 368 - min_latency_us = latency_us; 391 + if (latency_us < min_xcede_latency_us) 392 + min_xcede_latency_us = latency_us; 369 393 } 370 394 371 - /* 372 - * By default, we assume that CEDE(0) has exit latency 10us, 373 - * since there is no way for us to query from the platform. 
374 - * 375 - * However, if the wakeup latency of an Extended CEDE state is 376 - * smaller than 10us, then we can be sure that CEDE(0) 377 - * requires no more than that. 378 - * 379 - * Perform the fix-up. 380 - */ 381 - if (min_latency_us < dedicated_states[1].exit_latency) { 382 - /* 383 - * We set a minimum of 1us wakeup latency for cede0 to 384 - * distinguish it from snooze 385 - */ 386 - u64 cede0_latency = 1; 387 - 388 - if (min_latency_us > cede0_latency) 389 - cede0_latency = min_latency_us - 1; 390 - 391 - dedicated_states[1].exit_latency = cede0_latency; 392 - dedicated_states[1].target_residency = 10 * (cede0_latency); 395 + if (min_xcede_latency_us != UINT_MAX) { 396 + dedicated_states[1].exit_latency = min_xcede_latency_us; 397 + dedicated_states[1].target_residency = 10 * (min_xcede_latency_us); 393 398 pr_info("cpuidle: Fixed up CEDE exit latency to %llu us\n", 394 - cede0_latency); 399 + min_xcede_latency_us); 395 400 } 396 401 397 402 } ··· 403 402 * pseries_idle_probe() 404 403 * Choose state table for shared versus dedicated partition 405 404 */ 406 - static int pseries_idle_probe(void) 405 + static int __init pseries_idle_probe(void) 407 406 { 408 407 409 408 if (cpuidle_disable != IDLE_NO_OVERRIDE) ··· 420 419 cpuidle_state_table = shared_states; 421 420 max_idle_state = ARRAY_SIZE(shared_states); 422 421 } else { 423 - fixup_cede0_latency(); 422 + /* 423 + * Use firmware provided latency values 424 + * starting with POWER10 platforms. In the 425 + * case that we are running on a POWER10 426 + * platform but in an earlier compat mode, we 427 + * can still use the firmware provided values. 428 + * 429 + * However, on platforms prior to POWER10, we 430 + * cannot rely on the accuracy of the firmware 431 + * provided latency values. On such platforms, 432 + * go with the conservative default estimate 433 + * of 10us. 
434 + */ 435 + if (cpu_has_feature(CPU_FTR_ARCH_31) || pvr_version_is(PVR_POWER10)) 436 + fixup_cede0_latency(); 424 437 cpuidle_state_table = dedicated_states; 425 438 max_idle_state = NR_DEDICATED_STATES; 426 439 }
+1
kernel/irq/irqdomain.c
··· 491 491 { 492 492 return irq_default_domain; 493 493 } 494 + EXPORT_SYMBOL_GPL(irq_get_default_host); 494 495 495 496 static bool irq_domain_is_nomap(struct irq_domain *domain) 496 497 {
+1 -1
scripts/mod/modpost.c
··· 931 931 ".kprobes.text", ".cpuidle.text", ".noinstr.text" 932 932 #define OTHER_TEXT_SECTIONS ".ref.text", ".head.text", ".spinlock.text", \ 933 933 ".fixup", ".entry.text", ".exception.text", ".text.*", \ 934 - ".coldtext" 934 + ".coldtext", ".softirqentry.text" 935 935 936 936 #define INIT_SECTIONS ".init.*" 937 937 #define MEM_INIT_SECTIONS ".meminit.*"
+2 -1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
··· 57 57 : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), 58 58 [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (&a), 59 59 [flt_2] "b" (&b), [cptr1] "b" (&cptr[1]) 60 - : "memory", "r7", "r8", "r9", "r10", 60 + : "memory", "r0", "r7", "r8", "r9", "r10", 61 61 "r11", "r12", "r13", "r14", "r15", "r16", 62 62 "r17", "r18", "r19", "r20", "r21", "r22", 63 63 "r23", "r24", "r25", "r26", "r27", "r28", ··· 113 113 int ret, status; 114 114 115 115 SKIP_IF(!have_htm()); 116 + SKIP_IF(htm_is_synthetic()); 116 117 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 117 118 pid = fork(); 118 119 if (pid < 0) {
+2 -1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c
··· 65 65 : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), [gpr_4]"i"(GPR_4), 66 66 [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (&a), 67 67 [flt_4] "b" (&d) 68 - : "memory", "r5", "r6", "r7", 68 + : "memory", "r0", "r5", "r6", "r7", 69 69 "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", 70 70 "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", 71 71 "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31" ··· 119 119 int ret, status; 120 120 121 121 SKIP_IF(!have_htm()); 122 + SKIP_IF(htm_is_synthetic()); 122 123 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 123 124 pid = fork(); 124 125 if (pid < 0) {
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
··· 129 129 int ret, status; 130 130 131 131 SKIP_IF(!have_htm()); 132 + SKIP_IF(htm_is_synthetic()); 132 133 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 133 134 pid = fork(); 134 135 if (pid == 0)
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
··· 129 129 int ret, status, i; 130 130 131 131 SKIP_IF(!have_htm()); 132 + SKIP_IF(htm_is_synthetic()); 132 133 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 133 134 134 135 for (i = 0; i < 128; i++) {
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
··· 114 114 int ret, status; 115 115 116 116 SKIP_IF(!have_htm()); 117 + SKIP_IF(htm_is_synthetic()); 117 118 shm_id = shmget(IPC_PRIVATE, sizeof(struct shared), 0777|IPC_CREAT); 118 119 shm_id1 = shmget(IPC_PRIVATE, sizeof(int), 0777|IPC_CREAT); 119 120 pid = fork();
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c
··· 117 117 int ret, status; 118 118 119 119 SKIP_IF(!have_htm()); 120 + SKIP_IF(htm_is_synthetic()); 120 121 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 121 122 pid = fork(); 122 123 if (pid == 0)
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c
··· 113 113 int ret, status, i; 114 114 115 115 SKIP_IF(!have_htm()); 116 + SKIP_IF(htm_is_synthetic()); 116 117 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 117 118 118 119 for (i = 0; i < 128; i++) {
+1
tools/testing/selftests/powerpc/signal/signal_tm.c
··· 56 56 } 57 57 58 58 SKIP_IF(!have_htm()); 59 + SKIP_IF(htm_is_synthetic()); 59 60 60 61 for (i = 0; i < MAX_ATTEMPT; i++) { 61 62 /*
+1
tools/testing/selftests/powerpc/tm/tm-exec.c
··· 27 27 static int test_exec(void) 28 28 { 29 29 SKIP_IF(!have_htm()); 30 + SKIP_IF(htm_is_synthetic()); 30 31 31 32 asm __volatile__( 32 33 "tbegin.;"
+1
tools/testing/selftests/powerpc/tm/tm-fork.c
··· 21 21 int test_fork(void) 22 22 { 23 23 SKIP_IF(!have_htm()); 24 + SKIP_IF(htm_is_synthetic()); 24 25 25 26 asm __volatile__( 26 27 "tbegin.;"
+1 -1
tools/testing/selftests/powerpc/tm/tm-poison.c
··· 20 20 #include <sched.h> 21 21 #include <sys/types.h> 22 22 #include <signal.h> 23 - #include <inttypes.h> 24 23 25 24 #include "tm.h" 26 25 ··· 33 34 bool fail_vr = false; 34 35 35 36 SKIP_IF(!have_htm()); 37 + SKIP_IF(htm_is_synthetic()); 36 38 37 39 cpu = pick_online_cpu(); 38 40 FAIL_IF(cpu < 0);
+1
tools/testing/selftests/powerpc/tm/tm-resched-dscr.c
··· 40 40 uint64_t rv, dscr1 = 1, dscr2, texasr; 41 41 42 42 SKIP_IF(!have_htm()); 43 + SKIP_IF(htm_is_synthetic()); 43 44 44 45 printf("Check DSCR TM context switch: "); 45 46 fflush(stdout);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-fpu.c
··· 79 79 pid_t pid = getpid(); 80 80 81 81 SKIP_IF(!have_htm()); 82 + SKIP_IF(htm_is_synthetic()); 82 83 83 84 act.sa_sigaction = signal_usr1; 84 85 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-gpr.c
··· 81 81 pid_t pid = getpid(); 82 82 83 83 SKIP_IF(!have_htm()); 84 + SKIP_IF(htm_is_synthetic()); 84 85 85 86 act.sa_sigaction = signal_usr1; 86 87 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vmx.c
··· 104 104 pid_t pid = getpid(); 105 105 106 106 SKIP_IF(!have_htm()); 107 + SKIP_IF(htm_is_synthetic()); 107 108 108 109 act.sa_sigaction = signal_usr1; 109 110 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vsx.c
··· 153 153 pid_t pid = getpid(); 154 154 155 155 SKIP_IF(!have_htm()); 156 + SKIP_IF(htm_is_synthetic()); 156 157 157 158 act.sa_sigaction = signal_usr1; 158 159 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-pagefault.c
··· 226 226 stack_t ss; 227 227 228 228 SKIP_IF(!have_htm()); 229 + SKIP_IF(htm_is_synthetic()); 229 230 SKIP_IF(!have_userfaultfd()); 230 231 231 232 setup_uf_mem();
+1
tools/testing/selftests/powerpc/tm/tm-signal-sigreturn-nt.c
··· 32 32 struct sigaction trap_sa; 33 33 34 34 SKIP_IF(!have_htm()); 35 + SKIP_IF(htm_is_synthetic()); 35 36 36 37 trap_sa.sa_flags = SA_SIGINFO; 37 38 trap_sa.sa_sigaction = trap_signal_handler;
+1
tools/testing/selftests/powerpc/tm/tm-signal-stack.c
··· 35 35 int pid; 36 36 37 37 SKIP_IF(!have_htm()); 38 + SKIP_IF(htm_is_synthetic()); 38 39 39 40 pid = fork(); 40 41 if (pid < 0)
+1
tools/testing/selftests/powerpc/tm/tm-sigreturn.c
··· 55 55 uint64_t ret = 0; 56 56 57 57 SKIP_IF(!have_htm()); 58 + SKIP_IF(htm_is_synthetic()); 58 59 SKIP_IF(!is_ppc64le()); 59 60 60 61 memset(&sa, 0, sizeof(sa));
+1 -1
tools/testing/selftests/powerpc/tm/tm-syscall.c
··· 25 25 unsigned retries = 0; 26 26 27 27 #define TEST_DURATION 10 /* seconds */ 28 - #define TM_RETRIES 100 29 28 30 29 pid_t getppid_tm(bool suspend) 31 30 { ··· 66 67 struct timeval end, now; 67 68 68 69 SKIP_IF(!have_htm_nosc()); 70 + SKIP_IF(htm_is_synthetic()); 69 71 70 72 setbuf(stdout, NULL); 71 73
+1
tools/testing/selftests/powerpc/tm/tm-tar.c
··· 26 26 int i; 27 27 28 28 SKIP_IF(!have_htm()); 29 + SKIP_IF(htm_is_synthetic()); 29 30 SKIP_IF(!is_ppc64le()); 30 31 31 32 for (i = 0; i < num_loops; i++)
+1
tools/testing/selftests/powerpc/tm/tm-tmspr.c
··· 96 96 unsigned long i; 97 97 98 98 SKIP_IF(!have_htm()); 99 + SKIP_IF(htm_is_synthetic()); 99 100 100 101 /* To cause some context switching */ 101 102 thread_num = 10 * sysconf(_SC_NPROCESSORS_ONLN);
+1
tools/testing/selftests/powerpc/tm/tm-trap.c
··· 255 255 struct sigaction trap_sa; 256 256 257 257 SKIP_IF(!have_htm()); 258 + SKIP_IF(htm_is_synthetic()); 258 259 259 260 trap_sa.sa_flags = SA_SIGINFO; 260 261 trap_sa.sa_sigaction = trap_signal_handler;
+1
tools/testing/selftests/powerpc/tm/tm-unavailable.c
··· 344 344 cpu_set_t cpuset; 345 345 346 346 SKIP_IF(!have_htm()); 347 + SKIP_IF(htm_is_synthetic()); 347 348 348 349 cpu = pick_online_cpu(); 349 350 FAIL_IF(cpu < 0);
+1
tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
··· 91 91 pthread_t *thread; 92 92 93 93 SKIP_IF(!have_htm()); 94 + SKIP_IF(htm_is_synthetic()); 94 95 95 96 passed = 1; 96 97
+1
tools/testing/selftests/powerpc/tm/tm-vmxcopy.c
··· 46 46 uint64_t aborted = 0; 47 47 48 48 SKIP_IF(!have_htm()); 49 + SKIP_IF(htm_is_synthetic()); 49 50 SKIP_IF(!is_ppc64le()); 50 51 51 52 fd = mkstemp(tmpfile);
+36
tools/testing/selftests/powerpc/tm/tm.h
··· 10 10 #include <asm/tm.h> 11 11 12 12 #include "utils.h" 13 + #include "reg.h" 14 + 15 + #define TM_RETRIES 100 13 16 14 17 static inline bool have_htm(void) 15 18 { ··· 32 29 printf("PPC_FEATURE2_HTM_NOSC not defined, can't check AT_HWCAP2\n"); 33 30 return false; 34 31 #endif 32 + } 33 + 34 + /* 35 + * Transactional Memory was removed in ISA 3.1. A synthetic TM implementation 36 + * is provided on P10 for threads running in P8/P9 compatibility mode. The 37 + * synthetic implementation immediately fails after tbegin. This failure sets 38 + * Bit 7 (Failure Persistent) and Bit 15 (Implementation-specific). 39 + */ 40 + static inline bool htm_is_synthetic(void) 41 + { 42 + int i; 43 + 44 + /* 45 + * Per the ISA, the Failure Persistent bit may be incorrect. Try a few 46 + * times in case we got an Implementation-specific failure on a non ISA 47 + * v3.1 system. On these systems the Implementation-specific failure 48 + * should not be persistent. 49 + */ 50 + for (i = 0; i < TM_RETRIES; i++) { 51 + asm volatile( 52 + "tbegin.;" 53 + "beq 1f;" 54 + "tend.;" 55 + "1:" 56 + : 57 + : 58 + : "memory"); 59 + 60 + if ((__builtin_get_texasr() & (TEXASR_FP | TEXASR_IC)) != 61 + (TEXASR_FP | TEXASR_IC)) 62 + break; 63 + } 64 + return i == TM_RETRIES; 35 65 } 36 66 37 67 static inline long failure_code(void)