Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

- Convert pseries & powernv to use MSI IRQ domains.

- Rework the pseries CPU numbering so that CPUs that are removed, and
later re-added, are given a CPU number on the same node as
previously, when possible.

- Add support for a new more flexible device-tree format for specifying
NUMA distances.

- Convert powerpc to GENERIC_PTDUMP.

- Retire sbc8548 and sbc8641d board support.

- Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard,
Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas,
Fangrui Song, Finn Thain, Gautham R. Shenoy, Hari Bathini, Joel
Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas
Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan
Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R.
Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan
Jiabing, Xiongwei Song, and Zheng Yongjun.

* tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits)
powerpc/bug: Cast to unsigned long before passing to inline asm
powerpc/ptdump: Fix generic ptdump for 64-bit
KVM: PPC: Fix clearing never mapped TCEs in realmode
powerpc/pseries/iommu: Rename "direct window" to "dma window"
powerpc/pseries/iommu: Make use of DDW for indirect mapping
powerpc/pseries/iommu: Find existing DDW with given property name
powerpc/pseries/iommu: Update remove_dma_window() to accept property name
powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper
powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
powerpc/pseries/iommu: Allow DDW windows starting at 0x00
powerpc/pseries/iommu: Add ddw_list_new_entry() helper
powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
powerpc/kernel/iommu: Add new iommu_table_in_use() helper
powerpc/pseries/iommu: Replace hard-coded page shift
powerpc/numa: Update cpu_cpu_map on CPU online/offline
powerpc/numa: Print debug statements only when required
powerpc/numa: convert printk to pr_xxx
powerpc/numa: Drop dbg in favour of pr_debug
powerpc/smp: Enable CACHE domain for shared processor
powerpc/smp: Update cpu_core_map on all PowerPc systems
...

+2824 -2627
+105
Documentation/powerpc/associativity.rst
···
+ ============================
+ NUMA resource associativity
+ ============================
+
+ Associativity represents the groupings of the various platform resources into
+ domains of substantially similar mean performance relative to resources outside
+ of that domain. Resource subsets of a given domain that exhibit better
+ performance relative to each other than to other resource subsets are
+ represented as members of a sub-grouping domain. This performance
+ characteristic is presented in terms of NUMA node distance within the Linux kernel.
+ From the platform view, these groups are also referred to as domains.
+
+ The PAPR interface currently supports different ways of communicating these
+ resource grouping details to the OS. These are referred to as Form 0, Form 1
+ and Form 2 associativity grouping. Form 0 is the oldest format and is now
+ considered deprecated.
+
+ The hypervisor indicates the type/form of associativity used via the
+ "ibm,architecture-vec-5" property. Bit 0 of byte 5 in the
+ "ibm,architecture-vec-5" property indicates usage of Form 0 or Form 1.
+ A value of 1 indicates the usage of Form 1 associativity. For Form 2
+ associativity, bit 2 of byte 5 in the "ibm,architecture-vec-5" property is used.
+
+ Form 0
+ ------
+ Form 0 associativity supports only two NUMA distances (LOCAL and REMOTE).
+
+ Form 1
+ ------
+ With Form 1, a combination of the ibm,associativity-reference-points and
+ ibm,associativity device tree properties is used to determine the NUMA
+ distance between resource groups/domains.
+
+ The "ibm,associativity" property contains a list of one or more numbers
+ (domainIDs) representing the resource's platform grouping domains.
+
+ The "ibm,associativity-reference-points" property contains a list of one or
+ more numbers (domainID indexes) that represent the 1-based ordinals in the
+ associativity lists.
+ The list of domainID indexes represents an increasing hierarchy of resource grouping.
+
+ ex:
+ { primary domainID index, secondary domainID index, tertiary domainID index.. }
+
+ The Linux kernel uses the domainID at the primary domainID index as the NUMA node id.
+ The Linux kernel computes the NUMA distance between two domains by recursively
+ comparing whether they belong to the same higher-level domains. For a mismatch
+ at each higher level of the resource group, the kernel doubles the NUMA
+ distance between the domains being compared.
+
+ Form 2
+ -------
+ The Form 2 associativity format adds separate device tree properties
+ representing the NUMA node distance, thereby making the node distance
+ computation flexible. Form 2 also allows flexible primary domain numbering.
+ With the NUMA distance computation now detached from the index value in the
+ "ibm,associativity-reference-points" property, Form 2 allows a large number of
+ primary domain ids at the same domainID index, representing resource groups of
+ different performance/latency characteristics.
+
+ The hypervisor indicates the usage of Form 2 associativity using bit 2 of
+ byte 5 in the "ibm,architecture-vec-5" property.
+
+ The "ibm,numa-lookup-index-table" property contains a list of one or more
+ numbers representing the domainIDs present in the system. The offset of a
+ domainID in this property is used as an index while computing NUMA distance
+ information via "ibm,numa-distance-table".
+
+ prop-encoded-array: The number N of the domainIDs encoded as with encode-int,
+ followed by N domainIDs encoded as with encode-int.
+
+ For ex:
+ "ibm,numa-lookup-index-table" = {4, 0, 8, 250, 252}. The offset of domainID 8
+ (2) is used when computing the distance of domain 8 from other domains present
+ in the system. For the rest of this document, this offset will be referred to
+ as the domain distance offset.
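The Form 1 doubling rule described above can be sketched in a few lines of Python (an illustrative model added for this write-up, not kernel code; the function name and argument encodings are invented for the example):

```python
# Illustrative model of the Form 1 NUMA distance rule (not kernel code).
# LOCAL_DISTANCE follows the Linux convention of 10 for "same node".
LOCAL_DISTANCE = 10

def form1_distance(assoc_a, assoc_b, ref_points):
    """assoc_a/assoc_b: "ibm,associativity" domainID lists for two resources;
    ref_points: 1-based domainID indexes, primary first."""
    distance = LOCAL_DISTANCE
    for idx in ref_points:
        if assoc_a[idx - 1] == assoc_b[idx - 1]:
            break          # same domain at this level: stop doubling
        distance *= 2      # mismatch: one more hierarchy level apart
    return distance
```

Two resources that share every reference-point domain stay at LOCAL_DISTANCE (10); each additional level of mismatch doubles the result (20, 40, ...).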
+
+ The "ibm,numa-distance-table" property contains a list of one or more numbers
+ representing the NUMA distance between resource groups/domains present in the
+ system.
+
+ prop-encoded-array: The number N of the distance values encoded as with
+ encode-int, followed by N distance values encoded as with encode-bytes. The
+ maximum distance value we can encode is 255. The number N must be equal to
+ the square of m, where m is the number of domainIDs in the
+ numa-lookup-index-table.
+
+ For ex:
+ ibm,numa-lookup-index-table = <3 0 8 40>;
+ ibm,numa-distance-table = <9>, /bits/ 8 < 10 20 80 20 10 160 80 160 10>;
+
+ ::
+
+       |  0    8   40
+     --|-------------
+       |
+     0 | 10   20   80
+       |
+     8 | 20   10  160
+       |
+     40| 80  160   10
+
+ A possible "ibm,associativity" property for resources in nodes 0, 8 and 40:
+
+ { 3, 6, 7, 0 }
+ { 3, 6, 9, 8 }
+ { 3, 6, 7, 40 }
+
+ With "ibm,associativity-reference-points" { 0x3 }
+
+ "ibm,numa-lookup-index-table" helps in having a compact representation of the
+ distance matrix. Since domainIDs can be sparse, the matrix of distances can
+ also be effectively sparse. With "ibm,numa-lookup-index-table" we can achieve
+ a compact representation of distance information.
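The Form 2 lookup described above reduces to two index lookups into the distance table; here is a minimal sketch using the example tables from the text (illustrative Python added for this write-up, not kernel code):

```python
# Illustrative model of the Form 2 NUMA distance lookup (not kernel code).
def form2_distance(index_table, distance_table, dom_a, dom_b):
    """index_table: domainIDs from "ibm,numa-lookup-index-table";
    distance_table: N*N distance bytes from "ibm,numa-distance-table"."""
    n = len(index_table)
    ia = index_table.index(dom_a)  # domain distance offset of dom_a
    ib = index_table.index(dom_b)
    return distance_table[ia * n + ib]

# The example matrix for domains 0, 8 and 40 from the text:
index_table = [0, 8, 40]
distance_table = [10, 20, 80,
                  20, 10, 160,
                  80, 160, 10]
```

For instance, `form2_distance(index_table, distance_table, 8, 40)` returns 160, matching row 8 / column 40 of the matrix above.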
+1
Documentation/powerpc/index.rst
···
  .. toctree::
     :maxdepth: 1

+    associativity
     booting
     bootwrapper
     cpu_families
-1
MAINTAINERS
···
  F: drivers/media/usb/em28xx/

  EMBEDDED LINUX
- M: Paul Gortmaker <paul.gortmaker@windriver.com>
  M: Matt Mackall <mpm@selenic.com>
  M: David Woodhouse <dwmw2@infradead.org>
  L: linux-embedded@vger.kernel.org
+2
arch/powerpc/Kconfig
···
  select ARCH_HAS_COPY_MC if PPC64
  select ARCH_HAS_DEBUG_VIRTUAL
  select ARCH_HAS_DEBUG_VM_PGTABLE
+ select ARCH_HAS_DEBUG_WX if STRICT_KERNEL_RWX
  select ARCH_HAS_DEVMEM_IS_ALLOWED
  select ARCH_HAS_DMA_MAP_DIRECT if PPC_PSERIES
  select ARCH_HAS_ELF_RANDOMIZE
···
  select GENERIC_IRQ_SHOW
  select GENERIC_IRQ_SHOW_LEVEL
  select GENERIC_PCI_IOMAP if PCI
+ select GENERIC_PTDUMP
  select GENERIC_SMP_IDLE_THREAD
  select GENERIC_TIME_VSYSCALL
  select GENERIC_VDSO_TIME_NS
-30
arch/powerpc/Kconfig.debug
···

  If you are unsure, say N.

- config PPC_PTDUMP
-     bool "Export kernel pagetable layout to userspace via debugfs"
-     depends on DEBUG_KERNEL && DEBUG_FS
-     help
-       This option exports the state of the kernel pagetables to a
-       debugfs file. This is only useful for kernel developers who are
-       working in architecture specific areas of the kernel - probably
-       not a good idea to enable this feature in a production kernel.
-
-       If you are unsure, say N.
-
- config PPC_DEBUG_WX
-     bool "Warn on W+X mappings at boot"
-     depends on PPC_PTDUMP && STRICT_KERNEL_RWX
-     help
-       Generate a warning if any W+X mappings are found at boot.
-
-       This is useful for discovering cases where the kernel is leaving
-       W+X mappings after applying NX, as such mappings are a security risk.
-
-       Note that even if the check fails, your kernel is possibly
-       still fine, as W+X mappings are not a security hole in
-       themselves, what they do is that they make the exploitation
-       of other unfixed kernel bugs easier.
-
-       There is no runtime or memory usage effect of this option
-       once the kernel has booted up - it's a one time check.
-
-       If in doubt, say "Y".
-
  config PPC_FAST_ENDIAN_SWITCH
      bool "Deprecated fast endian-switch syscall"
      depends on DEBUG_KERNEL && PPC_BOOK3S_64
+3 -1
arch/powerpc/Makefile
···
  LDFLAGS_vmlinux-y := -Bstatic
  LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
+ LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) += -z notext
  LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)

  ifdef CONFIG_PPC64
···

  PHONY += install
  install:
-     $(Q)$(MAKE) $(build)=$(boot) install
+     sh -x $(srctree)/$(boot)/install.sh "$(KERNELRELEASE)" vmlinux \
+         System.map "$(INSTALL_PATH)"

  archclean:
      $(Q)$(MAKE) $(clean)=$(boot)
-11
arch/powerpc/boot/Makefile
···
  image-$(CONFIG_TQM8548) += cuImage.tqm8548
  image-$(CONFIG_TQM8555) += cuImage.tqm8555
  image-$(CONFIG_TQM8560) += cuImage.tqm8560
- image-$(CONFIG_SBC8548) += cuImage.sbc8548
  image-$(CONFIG_KSI8560) += cuImage.ksi8560

  # Board ports in arch/powerpc/platform/86xx/Kconfig
···
      $(Q)rm -f $@; ln $< $@
  $(obj)/zImage.initrd: $(addprefix $(obj)/, $(initrd-y))
      $(Q)rm -f $@; ln $< $@
-
- # Only install the vmlinux
- install: $(CONFIGURE) $(addprefix $(obj)/, $(image-y))
-     sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux System.map "$(INSTALL_PATH)"
-
- # Install the vmlinux and other built boot targets.
- zInstall: $(CONFIGURE) $(addprefix $(obj)/, $(image-y))
-     sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux System.map "$(INSTALL_PATH)" $^
-
- PHONY += install zInstall

  # anything not in $(targets)
  clean-files += $(image-) $(initrd-) cuImage.* dtbImage.* treeImage.* \
-176
arch/powerpc/boot/dts/fsl/sbc8641d.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8641D Device Tree Source 4 - * 5 - * Copyright 2008 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - * 9 - * Based largely on the mpc8641_hpcn.dts by Freescale Semiconductor Inc. 10 - */ 11 - 12 - /include/ "mpc8641si-pre.dtsi" 13 - 14 - / { 15 - model = "SBC8641D"; 16 - compatible = "wind,sbc8641"; 17 - 18 - memory { 19 - device_type = "memory"; 20 - reg = <0x00000000 0x20000000>; // 512M at 0x0 21 - }; 22 - 23 - lbc: localbus@f8005000 { 24 - reg = <0xf8005000 0x1000>; 25 - 26 - ranges = <0 0 0xff000000 0x01000000 // 16MB Boot flash 27 - 1 0 0xf0000000 0x00010000 // 64KB EEPROM 28 - 2 0 0xf1000000 0x00100000 // EPLD (1MB) 29 - 3 0 0xe0000000 0x04000000 // 64MB LB SDRAM (CS3) 30 - 4 0 0xe4000000 0x04000000 // 64MB LB SDRAM (CS4) 31 - 6 0 0xf4000000 0x00100000 // LCD display (1MB) 32 - 7 0 0xe8000000 0x04000000>; // 64MB OneNAND 33 - 34 - flash@0,0 { 35 - compatible = "cfi-flash"; 36 - reg = <0 0 0x01000000>; 37 - bank-width = <2>; 38 - device-width = <2>; 39 - #address-cells = <1>; 40 - #size-cells = <1>; 41 - partition@0 { 42 - label = "dtb"; 43 - reg = <0x00000000 0x00100000>; 44 - read-only; 45 - }; 46 - partition@300000 { 47 - label = "kernel"; 48 - reg = <0x00100000 0x00400000>; 49 - read-only; 50 - }; 51 - partition@400000 { 52 - label = "fs"; 53 - reg = <0x00500000 0x00a00000>; 54 - }; 55 - partition@700000 { 56 - label = "firmware"; 57 - reg = <0x00f00000 0x00100000>; 58 - read-only; 59 - }; 60 - }; 61 - 62 - epld@2,0 { 63 - compatible = "wrs,epld-localbus"; 64 - #address-cells = <2>; 65 - #size-cells = <1>; 66 - reg = <2 0 0x100000>; 67 - ranges = <0 0 5 0 1 // User switches 68 - 1 0 5 1 1 // Board ID/Rev 69 - 3 0 5 3 1>; // LEDs 70 - }; 71 - }; 72 - 73 - soc: soc@f8000000 { 74 - ranges = <0x00000000 0xf8000000 0x00100000>; 75 - 76 - enet0: ethernet@24000 { 77 - tbi-handle = <&tbi0>; 78 - phy-handle = <&phy0>; 79 - phy-connection-type = 
"rgmii-id"; 80 - }; 81 - 82 - mdio@24520 { 83 - phy0: ethernet-phy@1f { 84 - reg = <0x1f>; 85 - }; 86 - phy1: ethernet-phy@0 { 87 - reg = <0>; 88 - }; 89 - phy2: ethernet-phy@1 { 90 - reg = <1>; 91 - }; 92 - phy3: ethernet-phy@2 { 93 - reg = <2>; 94 - }; 95 - tbi0: tbi-phy@11 { 96 - reg = <0x11>; 97 - device_type = "tbi-phy"; 98 - }; 99 - }; 100 - 101 - enet1: ethernet@25000 { 102 - tbi-handle = <&tbi1>; 103 - phy-handle = <&phy1>; 104 - phy-connection-type = "rgmii-id"; 105 - }; 106 - 107 - mdio@25520 { 108 - tbi1: tbi-phy@11 { 109 - reg = <0x11>; 110 - device_type = "tbi-phy"; 111 - }; 112 - }; 113 - 114 - enet2: ethernet@26000 { 115 - tbi-handle = <&tbi2>; 116 - phy-handle = <&phy2>; 117 - phy-connection-type = "rgmii-id"; 118 - }; 119 - 120 - mdio@26520 { 121 - tbi2: tbi-phy@11 { 122 - reg = <0x11>; 123 - device_type = "tbi-phy"; 124 - }; 125 - }; 126 - 127 - enet3: ethernet@27000 { 128 - tbi-handle = <&tbi3>; 129 - phy-handle = <&phy3>; 130 - phy-connection-type = "rgmii-id"; 131 - }; 132 - 133 - mdio@27520 { 134 - tbi3: tbi-phy@11 { 135 - reg = <0x11>; 136 - device_type = "tbi-phy"; 137 - }; 138 - }; 139 - }; 140 - 141 - pci0: pcie@f8008000 { 142 - reg = <0xf8008000 0x1000>; 143 - ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x20000000 144 - 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00100000>; 145 - interrupt-map-mask = <0xff00 0 0 7>; 146 - 147 - pcie@0 { 148 - ranges = <0x02000000 0x0 0x80000000 149 - 0x02000000 0x0 0x80000000 150 - 0x0 0x20000000 151 - 152 - 0x01000000 0x0 0x00000000 153 - 0x01000000 0x0 0x00000000 154 - 0x0 0x00100000>; 155 - }; 156 - 157 - }; 158 - 159 - pci1: pcie@f8009000 { 160 - reg = <0xf8009000 0x1000>; 161 - ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x20000000 162 - 0x01000000 0x0 0x00000000 0xe3000000 0x0 0x00100000>; 163 - 164 - pcie@0 { 165 - ranges = <0x02000000 0x0 0xa0000000 166 - 0x02000000 0x0 0xa0000000 167 - 0x0 0x20000000 168 - 169 - 0x01000000 0x0 0x00000000 170 - 0x01000000 0x0 0x00000000 171 - 0x0 
0x00100000>; 172 - }; 173 - }; 174 - }; 175 - 176 - /include/ "mpc8641si-post.dtsi"
+12
arch/powerpc/boot/dts/microwatt.dts
···
          fifo-size = <16>;
          interrupts = <0x10 0x1>;
      };
+
+     ethernet@8020000 {
+         compatible = "litex,liteeth";
+         reg = <0x8021000 0x100
+                0x8020800 0x100
+                0x8030000 0x2000>;
+         reg-names = "mac", "mido", "buffer";
+         litex,rx-slots = <2>;
+         litex,tx-slots = <2>;
+         litex,slot-size = <0x800>;
+         interrupts = <0x11 0x1>;
+     };
  };

  chosen {
-111
arch/powerpc/boot/dts/sbc8548-altflash.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Configured for booting off the alternate (64MB SODIMM) flash. 6 - * Requires switching JP12 jumpers and changing SW2.8 setting. 7 - * 8 - * Copyright 2013 Wind River Systems Inc. 9 - * 10 - * Paul Gortmaker (see MAINTAINERS for contact information) 11 - */ 12 - 13 - 14 - /dts-v1/; 15 - 16 - /include/ "sbc8548-pre.dtsi" 17 - 18 - /{ 19 - localbus@e0000000 { 20 - #address-cells = <2>; 21 - #size-cells = <1>; 22 - compatible = "simple-bus"; 23 - reg = <0xe0000000 0x5000>; 24 - interrupt-parent = <&mpic>; 25 - 26 - ranges = <0x0 0x0 0xfc000000 0x04000000 /*64MB Flash*/ 27 - 0x3 0x0 0xf0000000 0x04000000 /*64MB SDRAM*/ 28 - 0x4 0x0 0xf4000000 0x04000000 /*64MB SDRAM*/ 29 - 0x5 0x0 0xf8000000 0x00b10000 /* EPLD */ 30 - 0x6 0x0 0xef800000 0x00800000>; /*8MB Flash*/ 31 - 32 - flash@0,0 { 33 - #address-cells = <1>; 34 - #size-cells = <1>; 35 - reg = <0x0 0x0 0x04000000>; 36 - compatible = "intel,JS28F128", "cfi-flash"; 37 - bank-width = <4>; 38 - device-width = <1>; 39 - partition@0 { 40 - label = "space"; 41 - /* FC000000 -> FFEFFFFF */ 42 - reg = <0x00000000 0x03f00000>; 43 - }; 44 - partition@3f00000 { 45 - label = "bootloader"; 46 - /* FFF00000 -> FFFFFFFF */ 47 - reg = <0x03f00000 0x00100000>; 48 - read-only; 49 - }; 50 - }; 51 - 52 - 53 - epld@5,0 { 54 - compatible = "wrs,epld-localbus"; 55 - #address-cells = <2>; 56 - #size-cells = <1>; 57 - reg = <0x5 0x0 0x00b10000>; 58 - ranges = < 59 - 0x0 0x0 0x5 0x000000 0x1fff /* LED */ 60 - 0x1 0x0 0x5 0x100000 0x1fff /* Switches */ 61 - 0x3 0x0 0x5 0x300000 0x1fff /* HW Rev. 
*/ 62 - 0xb 0x0 0x5 0xb00000 0x1fff /* EEPROM */ 63 - >; 64 - 65 - led@0,0 { 66 - compatible = "led"; 67 - reg = <0x0 0x0 0x1fff>; 68 - }; 69 - 70 - switches@1,0 { 71 - compatible = "switches"; 72 - reg = <0x1 0x0 0x1fff>; 73 - }; 74 - 75 - hw-rev@3,0 { 76 - compatible = "hw-rev"; 77 - reg = <0x3 0x0 0x1fff>; 78 - }; 79 - 80 - eeprom@b,0 { 81 - compatible = "eeprom"; 82 - reg = <0xb 0 0x1fff>; 83 - }; 84 - 85 - }; 86 - 87 - alt-flash@6,0 { 88 - #address-cells = <1>; 89 - #size-cells = <1>; 90 - compatible = "intel,JS28F640", "cfi-flash"; 91 - reg = <0x6 0x0 0x800000>; 92 - bank-width = <1>; 93 - device-width = <1>; 94 - partition@0 { 95 - label = "space"; 96 - /* EF800000 -> EFF9FFFF */ 97 - reg = <0x00000000 0x007a0000>; 98 - }; 99 - partition@7a0000 { 100 - label = "bootloader"; 101 - /* EFFA0000 -> EFFFFFFF */ 102 - reg = <0x007a0000 0x00060000>; 103 - read-only; 104 - }; 105 - }; 106 - 107 - 108 - }; 109 - }; 110 - 111 - /include/ "sbc8548-post.dtsi"
-289
arch/powerpc/boot/dts/sbc8548-post.dtsi
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Copyright 2007 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - */ 9 - 10 - /{ 11 - soc8548@e0000000 { 12 - #address-cells = <1>; 13 - #size-cells = <1>; 14 - device_type = "soc"; 15 - ranges = <0x00000000 0xe0000000 0x00100000>; 16 - bus-frequency = <0>; 17 - compatible = "simple-bus"; 18 - 19 - ecm-law@0 { 20 - compatible = "fsl,ecm-law"; 21 - reg = <0x0 0x1000>; 22 - fsl,num-laws = <10>; 23 - }; 24 - 25 - ecm@1000 { 26 - compatible = "fsl,mpc8548-ecm", "fsl,ecm"; 27 - reg = <0x1000 0x1000>; 28 - interrupts = <17 2>; 29 - interrupt-parent = <&mpic>; 30 - }; 31 - 32 - memory-controller@2000 { 33 - compatible = "fsl,mpc8548-memory-controller"; 34 - reg = <0x2000 0x1000>; 35 - interrupt-parent = <&mpic>; 36 - interrupts = <0x12 0x2>; 37 - }; 38 - 39 - L2: l2-cache-controller@20000 { 40 - compatible = "fsl,mpc8548-l2-cache-controller"; 41 - reg = <0x20000 0x1000>; 42 - cache-line-size = <0x20>; // 32 bytes 43 - cache-size = <0x80000>; // L2, 512K 44 - interrupt-parent = <&mpic>; 45 - interrupts = <0x10 0x2>; 46 - }; 47 - 48 - i2c@3000 { 49 - #address-cells = <1>; 50 - #size-cells = <0>; 51 - cell-index = <0>; 52 - compatible = "fsl-i2c"; 53 - reg = <0x3000 0x100>; 54 - interrupts = <0x2b 0x2>; 55 - interrupt-parent = <&mpic>; 56 - dfsrr; 57 - }; 58 - 59 - i2c@3100 { 60 - #address-cells = <1>; 61 - #size-cells = <0>; 62 - cell-index = <1>; 63 - compatible = "fsl-i2c"; 64 - reg = <0x3100 0x100>; 65 - interrupts = <0x2b 0x2>; 66 - interrupt-parent = <&mpic>; 67 - dfsrr; 68 - }; 69 - 70 - dma@21300 { 71 - #address-cells = <1>; 72 - #size-cells = <1>; 73 - compatible = "fsl,mpc8548-dma", "fsl,eloplus-dma"; 74 - reg = <0x21300 0x4>; 75 - ranges = <0x0 0x21100 0x200>; 76 - cell-index = <0>; 77 - dma-channel@0 { 78 - compatible = "fsl,mpc8548-dma-channel", 79 - "fsl,eloplus-dma-channel"; 80 - reg = <0x0 0x80>; 81 - 
cell-index = <0>; 82 - interrupt-parent = <&mpic>; 83 - interrupts = <20 2>; 84 - }; 85 - dma-channel@80 { 86 - compatible = "fsl,mpc8548-dma-channel", 87 - "fsl,eloplus-dma-channel"; 88 - reg = <0x80 0x80>; 89 - cell-index = <1>; 90 - interrupt-parent = <&mpic>; 91 - interrupts = <21 2>; 92 - }; 93 - dma-channel@100 { 94 - compatible = "fsl,mpc8548-dma-channel", 95 - "fsl,eloplus-dma-channel"; 96 - reg = <0x100 0x80>; 97 - cell-index = <2>; 98 - interrupt-parent = <&mpic>; 99 - interrupts = <22 2>; 100 - }; 101 - dma-channel@180 { 102 - compatible = "fsl,mpc8548-dma-channel", 103 - "fsl,eloplus-dma-channel"; 104 - reg = <0x180 0x80>; 105 - cell-index = <3>; 106 - interrupt-parent = <&mpic>; 107 - interrupts = <23 2>; 108 - }; 109 - }; 110 - 111 - enet0: ethernet@24000 { 112 - #address-cells = <1>; 113 - #size-cells = <1>; 114 - cell-index = <0>; 115 - device_type = "network"; 116 - model = "eTSEC"; 117 - compatible = "gianfar"; 118 - reg = <0x24000 0x1000>; 119 - ranges = <0x0 0x24000 0x1000>; 120 - local-mac-address = [ 00 00 00 00 00 00 ]; 121 - interrupts = <0x1d 0x2 0x1e 0x2 0x22 0x2>; 122 - interrupt-parent = <&mpic>; 123 - tbi-handle = <&tbi0>; 124 - phy-handle = <&phy0>; 125 - 126 - mdio@520 { 127 - #address-cells = <1>; 128 - #size-cells = <0>; 129 - compatible = "fsl,gianfar-mdio"; 130 - reg = <0x520 0x20>; 131 - 132 - phy0: ethernet-phy@19 { 133 - interrupt-parent = <&mpic>; 134 - interrupts = <0x6 0x1>; 135 - reg = <0x19>; 136 - }; 137 - phy1: ethernet-phy@1a { 138 - interrupt-parent = <&mpic>; 139 - interrupts = <0x7 0x1>; 140 - reg = <0x1a>; 141 - }; 142 - tbi0: tbi-phy@11 { 143 - reg = <0x11>; 144 - device_type = "tbi-phy"; 145 - }; 146 - }; 147 - }; 148 - 149 - enet1: ethernet@25000 { 150 - #address-cells = <1>; 151 - #size-cells = <1>; 152 - cell-index = <1>; 153 - device_type = "network"; 154 - model = "eTSEC"; 155 - compatible = "gianfar"; 156 - reg = <0x25000 0x1000>; 157 - ranges = <0x0 0x25000 0x1000>; 158 - local-mac-address = [ 00 00 00 00 
00 00 ]; 159 - interrupts = <0x23 0x2 0x24 0x2 0x28 0x2>; 160 - interrupt-parent = <&mpic>; 161 - tbi-handle = <&tbi1>; 162 - phy-handle = <&phy1>; 163 - 164 - mdio@520 { 165 - #address-cells = <1>; 166 - #size-cells = <0>; 167 - compatible = "fsl,gianfar-tbi"; 168 - reg = <0x520 0x20>; 169 - 170 - tbi1: tbi-phy@11 { 171 - reg = <0x11>; 172 - device_type = "tbi-phy"; 173 - }; 174 - }; 175 - }; 176 - 177 - serial0: serial@4500 { 178 - cell-index = <0>; 179 - device_type = "serial"; 180 - compatible = "fsl,ns16550", "ns16550"; 181 - reg = <0x4500 0x100>; // reg base, size 182 - clock-frequency = <0>; // should we fill in in uboot? 183 - interrupts = <0x2a 0x2>; 184 - interrupt-parent = <&mpic>; 185 - }; 186 - 187 - serial1: serial@4600 { 188 - cell-index = <1>; 189 - device_type = "serial"; 190 - compatible = "fsl,ns16550", "ns16550"; 191 - reg = <0x4600 0x100>; // reg base, size 192 - clock-frequency = <0>; // should we fill in in uboot? 193 - interrupts = <0x2a 0x2>; 194 - interrupt-parent = <&mpic>; 195 - }; 196 - 197 - global-utilities@e0000 { //global utilities reg 198 - compatible = "fsl,mpc8548-guts"; 199 - reg = <0xe0000 0x1000>; 200 - fsl,has-rstcr; 201 - }; 202 - 203 - crypto@30000 { 204 - compatible = "fsl,sec2.1", "fsl,sec2.0"; 205 - reg = <0x30000 0x10000>; 206 - interrupts = <45 2>; 207 - interrupt-parent = <&mpic>; 208 - fsl,num-channels = <4>; 209 - fsl,channel-fifo-len = <24>; 210 - fsl,exec-units-mask = <0xfe>; 211 - fsl,descriptor-types-mask = <0x12b0ebf>; 212 - }; 213 - 214 - mpic: pic@40000 { 215 - interrupt-controller; 216 - #address-cells = <0>; 217 - #interrupt-cells = <2>; 218 - reg = <0x40000 0x40000>; 219 - compatible = "chrp,open-pic"; 220 - device_type = "open-pic"; 221 - }; 222 - }; 223 - 224 - pci0: pci@e0008000 { 225 - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; 226 - interrupt-map = < 227 - /* IDSEL 0x01 (PCI-X slot) @66MHz */ 228 - 0x0800 0x0 0x0 0x1 &mpic 0x2 0x1 229 - 0x0800 0x0 0x0 0x2 &mpic 0x3 0x1 230 - 0x0800 0x0 0x0 0x3 &mpic 
0x4 0x1 231 - 0x0800 0x0 0x0 0x4 &mpic 0x1 0x1 232 - 233 - /* IDSEL 0x11 (PCI, 3.3V 32bit) @33MHz */ 234 - 0x8800 0x0 0x0 0x1 &mpic 0x2 0x1 235 - 0x8800 0x0 0x0 0x2 &mpic 0x3 0x1 236 - 0x8800 0x0 0x0 0x3 &mpic 0x4 0x1 237 - 0x8800 0x0 0x0 0x4 &mpic 0x1 0x1>; 238 - 239 - interrupt-parent = <&mpic>; 240 - interrupts = <0x18 0x2>; 241 - bus-range = <0 0>; 242 - ranges = <0x02000000 0x0 0x80000000 0x80000000 0x0 0x10000000 243 - 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00800000>; 244 - clock-frequency = <66000000>; 245 - #interrupt-cells = <1>; 246 - #size-cells = <2>; 247 - #address-cells = <3>; 248 - reg = <0xe0008000 0x1000>; 249 - compatible = "fsl,mpc8540-pcix", "fsl,mpc8540-pci"; 250 - device_type = "pci"; 251 - }; 252 - 253 - pci1: pcie@e000a000 { 254 - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; 255 - interrupt-map = < 256 - 257 - /* IDSEL 0x0 (PEX) */ 258 - 0x0000 0x0 0x0 0x1 &mpic 0x0 0x1 259 - 0x0000 0x0 0x0 0x2 &mpic 0x1 0x1 260 - 0x0000 0x0 0x0 0x3 &mpic 0x2 0x1 261 - 0x0000 0x0 0x0 0x4 &mpic 0x3 0x1>; 262 - 263 - interrupt-parent = <&mpic>; 264 - interrupts = <0x1a 0x2>; 265 - bus-range = <0x0 0xff>; 266 - ranges = <0x02000000 0x0 0xa0000000 0xa0000000 0x0 0x10000000 267 - 0x01000000 0x0 0x00000000 0xe2800000 0x0 0x08000000>; 268 - clock-frequency = <33000000>; 269 - #interrupt-cells = <1>; 270 - #size-cells = <2>; 271 - #address-cells = <3>; 272 - reg = <0xe000a000 0x1000>; 273 - compatible = "fsl,mpc8548-pcie"; 274 - device_type = "pci"; 275 - pcie@0 { 276 - reg = <0x0 0x0 0x0 0x0 0x0>; 277 - #size-cells = <2>; 278 - #address-cells = <3>; 279 - device_type = "pci"; 280 - ranges = <0x02000000 0x0 0xa0000000 281 - 0x02000000 0x0 0xa0000000 282 - 0x0 0x10000000 283 - 284 - 0x01000000 0x0 0x00000000 285 - 0x01000000 0x0 0x00000000 286 - 0x0 0x00800000>; 287 - }; 288 - }; 289 - };
-48
arch/powerpc/boot/dts/sbc8548-pre.dtsi
···
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * SBC8548 Device Tree Source
-  *
-  * Copyright 2007 Wind River Systems Inc.
-  *
-  * Paul Gortmaker (see MAINTAINERS for contact information)
-  */
-
- /{
-     model = "SBC8548";
-     compatible = "SBC8548";
-     #address-cells = <1>;
-     #size-cells = <1>;
-
-     aliases {
-         ethernet0 = &enet0;
-         ethernet1 = &enet1;
-         serial0 = &serial0;
-         serial1 = &serial1;
-         pci0 = &pci0;
-         pci1 = &pci1;
-     };
-
-     cpus {
-         #address-cells = <1>;
-         #size-cells = <0>;
-
-         PowerPC,8548@0 {
-             device_type = "cpu";
-             reg = <0>;
-             d-cache-line-size = <0x20>; // 32 bytes
-             i-cache-line-size = <0x20>; // 32 bytes
-             d-cache-size = <0x8000>; // L1, 32K
-             i-cache-size = <0x8000>; // L1, 32K
-             timebase-frequency = <0>; // From uboot
-             bus-frequency = <0>;
-             clock-frequency = <0>;
-             next-level-cache = <&L2>;
-         };
-     };
-
-     memory {
-         device_type = "memory";
-         reg = <0x00000000 0x10000000>;
-     };
-
- };
-106
arch/powerpc/boot/dts/sbc8548.dts
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * SBC8548 Device Tree Source 4 - * 5 - * Copyright 2007 Wind River Systems Inc. 6 - * 7 - * Paul Gortmaker (see MAINTAINERS for contact information) 8 - */ 9 - 10 - 11 - /dts-v1/; 12 - 13 - /include/ "sbc8548-pre.dtsi" 14 - 15 - /{ 16 - localbus@e0000000 { 17 - #address-cells = <2>; 18 - #size-cells = <1>; 19 - compatible = "simple-bus"; 20 - reg = <0xe0000000 0x5000>; 21 - interrupt-parent = <&mpic>; 22 - 23 - ranges = <0x0 0x0 0xff800000 0x00800000 /*8MB Flash*/ 24 - 0x3 0x0 0xf0000000 0x04000000 /*64MB SDRAM*/ 25 - 0x4 0x0 0xf4000000 0x04000000 /*64MB SDRAM*/ 26 - 0x5 0x0 0xf8000000 0x00b10000 /* EPLD */ 27 - 0x6 0x0 0xec000000 0x04000000>; /*64MB Flash*/ 28 - 29 - 30 - flash@0,0 { 31 - #address-cells = <1>; 32 - #size-cells = <1>; 33 - compatible = "intel,JS28F640", "cfi-flash"; 34 - reg = <0x0 0x0 0x800000>; 35 - bank-width = <1>; 36 - device-width = <1>; 37 - partition@0 { 38 - label = "space"; 39 - /* FF800000 -> FFF9FFFF */ 40 - reg = <0x00000000 0x007a0000>; 41 - }; 42 - partition@7a0000 { 43 - label = "bootloader"; 44 - /* FFFA0000 -> FFFFFFFF */ 45 - reg = <0x007a0000 0x00060000>; 46 - read-only; 47 - }; 48 - }; 49 - 50 - epld@5,0 { 51 - compatible = "wrs,epld-localbus"; 52 - #address-cells = <2>; 53 - #size-cells = <1>; 54 - reg = <0x5 0x0 0x00b10000>; 55 - ranges = < 56 - 0x0 0x0 0x5 0x000000 0x1fff /* LED */ 57 - 0x1 0x0 0x5 0x100000 0x1fff /* Switches */ 58 - 0x3 0x0 0x5 0x300000 0x1fff /* HW Rev. 
*/ 59 - 0xb 0x0 0x5 0xb00000 0x1fff /* EEPROM */ 60 - >; 61 - 62 - led@0,0 { 63 - compatible = "led"; 64 - reg = <0x0 0x0 0x1fff>; 65 - }; 66 - 67 - switches@1,0 { 68 - compatible = "switches"; 69 - reg = <0x1 0x0 0x1fff>; 70 - }; 71 - 72 - hw-rev@3,0 { 73 - compatible = "hw-rev"; 74 - reg = <0x3 0x0 0x1fff>; 75 - }; 76 - 77 - eeprom@b,0 { 78 - compatible = "eeprom"; 79 - reg = <0xb 0 0x1fff>; 80 - }; 81 - 82 - }; 83 - 84 - alt-flash@6,0 { 85 - #address-cells = <1>; 86 - #size-cells = <1>; 87 - reg = <0x6 0x0 0x04000000>; 88 - compatible = "intel,JS28F128", "cfi-flash"; 89 - bank-width = <4>; 90 - device-width = <1>; 91 - partition@0 { 92 - label = "space"; 93 - /* EC000000 -> EFEFFFFF */ 94 - reg = <0x00000000 0x03f00000>; 95 - }; 96 - partition@3f00000 { 97 - label = "bootloader"; 98 - /* EFF00000 -> EFFFFFFF */ 99 - reg = <0x03f00000 0x00100000>; 100 - read-only; 101 - }; 102 - }; 103 - }; 104 - }; 105 - 106 - /include/ "sbc8548-post.dtsi"
+12 -1
arch/powerpc/boot/dts/wii.dts
···

  control@d800100 {
      compatible = "nintendo,hollywood-control";
-     reg = <0x0d800100 0x300>;
+     /*
+      * Both the address and length are wrong, according to
+      * Wiibrew this should be <0x0d800000 0x400>, but it
+      * requires refactoring the PIC1, GPIO and OTP nodes
+      * before changing that.
+      */
+     reg = <0x0d800100 0xa0>;
+ };
+
+ otp@d8001ec {
+     compatible = "nintendo,hollywood-otp";
+     reg = <0x0d8001ec 0x8>;
  };

  disk@d806000 {
+14 -13
arch/powerpc/boot/install.sh
···
  # $2 - kernel image file
  # $3 - kernel map file
  # $4 - default install path (blank if root directory)
- # $5 and more - kernel boot files; zImage*, uImage, cuImage.*, etc.
  #

  # Bail with error code if anything goes wrong
  set -e
+
+ verify () {
+     if [ ! -f "$1" ]; then
+         echo ""                                                   1>&2
+         echo " *** Missing file: $1"                              1>&2
+         echo ' *** You need to run "make" before "make install".' 1>&2
+         echo ""                                                   1>&2
+         exit 1
+     fi
+ }
+
+ # Make sure the files actually exist
+ verify "$2"
+ verify "$3"

  # User may have a custom install script
···

  cat $2 > $4/$image_name
  cp $3 $4/System.map
-
- # Copy all the bootable image files
- path=$4
- shift 4
- while [ $# -ne 0 ]; do
-     image_name=`basename $1`
-     if [ -f $path/$image_name ]; then
-         mv $path/$image_name $path/$image_name.old
-     fi
-     cat $1 > $path/$image_name
-     shift
- done;
+1 -1
arch/powerpc/boot/wrapper
··· 298 298 *-tqm8541|*-mpc8560*|*-tqm8560|*-tqm8555|*-ksi8560*) 299 299 platformo=$object/cuboot-85xx-cpm2.o 300 300 ;; 301 - *-mpc85*|*-tqm85*|*-sbc85*) 301 + *-mpc85*|*-tqm85*) 302 302 platformo=$object/cuboot-85xx.o 303 303 ;; 304 304 *-amigaone)
-50
arch/powerpc/configs/85xx/sbc8548_defconfig
··· 1 - CONFIG_PPC_85xx=y 2 - CONFIG_SYSVIPC=y 3 - CONFIG_LOG_BUF_SHIFT=14 4 - CONFIG_BLK_DEV_INITRD=y 5 - CONFIG_EXPERT=y 6 - CONFIG_SLAB=y 7 - # CONFIG_BLK_DEV_BSG is not set 8 - CONFIG_SBC8548=y 9 - CONFIG_GEN_RTC=y 10 - CONFIG_BINFMT_MISC=y 11 - CONFIG_MATH_EMULATION=y 12 - # CONFIG_SECCOMP is not set 13 - CONFIG_PCI=y 14 - CONFIG_NET=y 15 - CONFIG_PACKET=y 16 - CONFIG_UNIX=y 17 - CONFIG_XFRM_USER=y 18 - CONFIG_INET=y 19 - CONFIG_IP_MULTICAST=y 20 - CONFIG_IP_PNP=y 21 - CONFIG_IP_PNP_DHCP=y 22 - CONFIG_IP_PNP_BOOTP=y 23 - CONFIG_SYN_COOKIES=y 24 - # CONFIG_IPV6 is not set 25 - # CONFIG_FW_LOADER is not set 26 - CONFIG_MTD=y 27 - CONFIG_MTD_BLOCK=y 28 - CONFIG_MTD_CFI=y 29 - CONFIG_MTD_CFI_ADV_OPTIONS=y 30 - CONFIG_MTD_CFI_GEOMETRY=y 31 - CONFIG_MTD_CFI_I4=y 32 - CONFIG_MTD_CFI_INTELEXT=y 33 - CONFIG_MTD_PHYSMAP_OF=y 34 - CONFIG_BLK_DEV_LOOP=y 35 - CONFIG_BLK_DEV_RAM=y 36 - CONFIG_NETDEVICES=y 37 - CONFIG_GIANFAR=y 38 - CONFIG_BROADCOM_PHY=y 39 - # CONFIG_INPUT_KEYBOARD is not set 40 - # CONFIG_INPUT_MOUSE is not set 41 - # CONFIG_SERIO is not set 42 - # CONFIG_VT is not set 43 - CONFIG_SERIAL_8250=y 44 - CONFIG_SERIAL_8250_CONSOLE=y 45 - # CONFIG_HW_RANDOM is not set 46 - # CONFIG_USB_SUPPORT is not set 47 - CONFIG_PROC_KCORE=y 48 - CONFIG_TMPFS=y 49 - CONFIG_NFS_FS=y 50 - CONFIG_ROOT_NFS=y
+6 -1
arch/powerpc/configs/microwatt_defconfig
··· 5 5 CONFIG_TICK_CPU_ACCOUNTING=y 6 6 CONFIG_LOG_BUF_SHIFT=16 7 7 CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=12 8 + CONFIG_CGROUPS=y 8 9 CONFIG_BLK_DEV_INITRD=y 9 10 CONFIG_CC_OPTIMIZE_FOR_SIZE=y 10 11 CONFIG_KALLSYMS_ALL=y ··· 54 53 CONFIG_BLK_DEV_LOOP=y 55 54 CONFIG_BLK_DEV_RAM=y 56 55 CONFIG_NETDEVICES=y 56 + CONFIG_LITEX_LITEETH=y 57 57 # CONFIG_WLAN is not set 58 58 # CONFIG_INPUT is not set 59 59 # CONFIG_SERIO is not set 60 60 # CONFIG_VT is not set 61 + # CONFIG_LEGACY_PTYS is not set 61 62 CONFIG_SERIAL_8250=y 62 63 # CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set 63 64 CONFIG_SERIAL_8250_CONSOLE=y ··· 79 76 CONFIG_EXT4_FS=y 80 77 # CONFIG_FILE_LOCKING is not set 81 78 # CONFIG_DNOTIFY is not set 82 - # CONFIG_INOTIFY_USER is not set 79 + CONFIG_AUTOFS_FS=y 80 + CONFIG_TMPFS=y 83 81 # CONFIG_MISC_FILESYSTEMS is not set 82 + CONFIG_CRYPTO_SHA256=y 84 83 # CONFIG_CRYPTO_HW is not set 85 84 # CONFIG_XZ_DEC_X86 is not set 86 85 # CONFIG_XZ_DEC_IA64 is not set
-1
arch/powerpc/configs/mpc85xx_base.config
··· 13 13 CONFIG_P1022_RDK=y 14 14 CONFIG_P1023_RDB=y 15 15 CONFIG_TWR_P102x=y 16 - CONFIG_SBC8548=y 17 16 CONFIG_SOCRATES=y 18 17 CONFIG_STX_GP3=y 19 18 CONFIG_TQM8540=y
-1
arch/powerpc/configs/mpc86xx_base.config
··· 1 1 CONFIG_PPC_86xx=y 2 2 CONFIG_MPC8641_HPCN=y 3 - CONFIG_SBC8641D=y 4 3 CONFIG_MPC8610_HPCD=y 5 4 CONFIG_GEF_PPC9A=y 6 5 CONFIG_GEF_SBC310=y
+23 -26
arch/powerpc/configs/mpc885_ads_defconfig
··· 1 - CONFIG_PPC_8xx=y 2 1 # CONFIG_SWAP is not set 3 2 CONFIG_SYSVIPC=y 4 3 CONFIG_NO_HZ=y 5 4 CONFIG_HIGH_RES_TIMERS=y 5 + CONFIG_BPF_JIT=y 6 + CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y 6 7 CONFIG_LOG_BUF_SHIFT=14 7 8 CONFIG_EXPERT=y 8 9 # CONFIG_ELF_CORE is not set 9 10 # CONFIG_BASE_FULL is not set 10 11 # CONFIG_FUTEX is not set 12 + CONFIG_PERF_EVENTS=y 11 13 # CONFIG_VM_EVENT_COUNTERS is not set 12 - # CONFIG_BLK_DEV_BSG is not set 13 - CONFIG_PARTITION_ADVANCED=y 14 + CONFIG_PPC_8xx=y 15 + CONFIG_8xx_GPIO=y 16 + CONFIG_SMC_UCODE_PATCH=y 17 + CONFIG_PIN_TLB=y 14 18 CONFIG_GEN_RTC=y 15 19 CONFIG_HZ_100=y 20 + CONFIG_MATH_EMULATION=y 21 + CONFIG_PPC_16K_PAGES=y 22 + CONFIG_ADVANCED_OPTIONS=y 16 23 # CONFIG_SECCOMP is not set 24 + CONFIG_STRICT_KERNEL_RWX=y 25 + CONFIG_MODULES=y 26 + # CONFIG_BLK_DEV_BSG is not set 27 + CONFIG_PARTITION_ADVANCED=y 17 28 CONFIG_NET=y 18 29 CONFIG_PACKET=y 19 30 CONFIG_UNIX=y ··· 32 21 CONFIG_IP_MULTICAST=y 33 22 CONFIG_IP_PNP=y 34 23 CONFIG_SYN_COOKIES=y 35 - # CONFIG_IPV6 is not set 36 24 # CONFIG_FW_LOADER is not set 37 25 CONFIG_MTD=y 38 26 CONFIG_MTD_BLOCK=y ··· 44 34 # CONFIG_MTD_CFI_I2 is not set 45 35 CONFIG_MTD_CFI_I4=y 46 36 CONFIG_MTD_CFI_AMDSTD=y 37 + CONFIG_MTD_PHYSMAP=y 47 38 CONFIG_MTD_PHYSMAP_OF=y 48 39 # CONFIG_BLK_DEV is not set 49 40 CONFIG_NETDEVICES=y ··· 57 46 # CONFIG_LEGACY_PTYS is not set 58 47 CONFIG_SERIAL_CPM=y 59 48 CONFIG_SERIAL_CPM_CONSOLE=y 49 + CONFIG_SPI=y 50 + CONFIG_SPI_FSL_SPI=y 60 51 # CONFIG_HWMON is not set 52 + CONFIG_WATCHDOG=y 53 + CONFIG_8xxx_WDT=y 61 54 # CONFIG_USB_SUPPORT is not set 62 55 # CONFIG_DNOTIFY is not set 63 56 CONFIG_TMPFS=y 64 57 CONFIG_CRAMFS=y 65 58 CONFIG_NFS_FS=y 66 59 CONFIG_ROOT_NFS=y 60 + CONFIG_CRYPTO=y 61 + CONFIG_CRYPTO_DEV_TALITOS=y 67 62 CONFIG_CRC32_SLICEBY4=y 68 63 CONFIG_DEBUG_INFO=y 69 64 CONFIG_MAGIC_SYSRQ=y 70 - CONFIG_DETECT_HUNG_TASK=y 71 - CONFIG_PPC_16K_PAGES=y 72 - CONFIG_DEBUG_KERNEL=y 73 65 CONFIG_DEBUG_FS=y 74 - CONFIG_PPC_PTDUMP=y 75 - CONFIG_MODULES=y 76 - CONFIG_SPI=y 77 - CONFIG_SPI_FSL_SPI=y 78 - CONFIG_CRYPTO=y 79 - CONFIG_CRYPTO_DEV_TALITOS=y 80 - CONFIG_8xx_GPIO=y 81 - CONFIG_WATCHDOG=y 82 - CONFIG_8xxx_WDT=y 83 - CONFIG_SMC_UCODE_PATCH=y 84 - CONFIG_ADVANCED_OPTIONS=y 85 - CONFIG_PIN_TLB=y 86 - CONFIG_PERF_EVENTS=y 87 - CONFIG_MATH_EMULATION=y 88 - CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y 89 - CONFIG_STRICT_KERNEL_RWX=y 90 - CONFIG_IPV6=y 91 - CONFIG_BPF_JIT=y 92 66 CONFIG_DEBUG_VM_PGTABLE=y 67 + CONFIG_DETECT_HUNG_TASK=y 93 68 CONFIG_BDI_SWITCH=y 94 69 CONFIG_PPC_EARLY_DEBUG=y 95 - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 70 + CONFIG_PPC_PTDUMP=y
-1
arch/powerpc/configs/ppc6xx_defconfig
··· 55 55 CONFIG_ASP834x=y 56 56 CONFIG_PPC_86xx=y 57 57 CONFIG_MPC8641_HPCN=y 58 - CONFIG_SBC8641D=y 59 58 CONFIG_MPC8610_HPCD=y 60 59 CONFIG_GEF_SBC610=y 61 60 CONFIG_CPU_FREQ=y
+1
arch/powerpc/configs/wii_defconfig
··· 99 99 CONFIG_LEDS_TRIGGER_PANIC=y 100 100 CONFIG_RTC_CLASS=y 101 101 CONFIG_RTC_DRV_GENERIC=y 102 + CONFIG_NVMEM_NINTENDO_OTP=y 102 103 CONFIG_EXT2_FS=y 103 104 CONFIG_EXT4_FS=y 104 105 CONFIG_FUSE_FS=m
+2 -2
arch/powerpc/include/asm/asm-compat.h
··· 17 17 #define PPC_LONG stringify_in_c(.8byte) 18 18 #define PPC_LONG_ALIGN stringify_in_c(.balign 8) 19 19 #define PPC_TLNEI stringify_in_c(tdnei) 20 - #define PPC_LLARX(t, a, b, eh) PPC_LDARX(t, a, b, eh) 20 + #define PPC_LLARX stringify_in_c(ldarx) 21 21 #define PPC_STLCX stringify_in_c(stdcx.) 22 22 #define PPC_CNTLZL stringify_in_c(cntlzd) 23 23 #define PPC_MTOCRF(FXM, RS) MTOCRF((FXM), RS) ··· 50 50 #define PPC_LONG stringify_in_c(.long) 51 51 #define PPC_LONG_ALIGN stringify_in_c(.balign 4) 52 52 #define PPC_TLNEI stringify_in_c(twnei) 53 - #define PPC_LLARX(t, a, b, eh) PPC_LWARX(t, a, b, eh) 53 + #define PPC_LLARX stringify_in_c(lwarx) 54 54 #define PPC_STLCX stringify_in_c(stwcx.) 55 55 #define PPC_CNTLZL stringify_in_c(cntlzw) 56 56 #define PPC_MTOCRF stringify_in_c(mtcrf)
+2 -2
arch/powerpc/include/asm/atomic.h
··· 207 207 int r, o = *old; 208 208 209 209 __asm__ __volatile__ ( 210 - "1:\t" PPC_LWARX(%0,0,%2,1) " # atomic_try_cmpxchg_acquire \n" 210 + "1: lwarx %0,0,%2,%5 # atomic_try_cmpxchg_acquire \n" 211 211 " cmpw 0,%0,%3 \n" 212 212 " bne- 2f \n" 213 213 " stwcx. %4,0,%2 \n" ··· 215 215 "\t" PPC_ACQUIRE_BARRIER " \n" 216 216 "2: \n" 217 217 : "=&r" (r), "+m" (v->counter) 218 - : "r" (&v->counter), "r" (o), "r" (new) 218 + : "r" (&v->counter), "r" (o), "r" (new), "i" (IS_ENABLED(CONFIG_PPC64) ? 1 : 0) 219 219 : "cr0", "memory"); 220 220 221 221 if (unlikely(r != o))
+4 -4
arch/powerpc/include/asm/bitops.h
··· 70 70 unsigned long *p = (unsigned long *)_p; \ 71 71 __asm__ __volatile__ ( \ 72 72 prefix \ 73 - "1:" PPC_LLARX(%0,0,%3,0) "\n" \ 73 + "1:" PPC_LLARX "%0,0,%3,0\n" \ 74 74 stringify_in_c(op) "%0,%0,%2\n" \ 75 75 PPC_STLCX "%0,0,%3\n" \ 76 76 "bne- 1b\n" \ ··· 115 115 unsigned long *p = (unsigned long *)_p; \ 116 116 __asm__ __volatile__ ( \ 117 117 prefix \ 118 - "1:" PPC_LLARX(%0,0,%3,eh) "\n" \ 118 + "1:" PPC_LLARX "%0,0,%3,%4\n" \ 119 119 stringify_in_c(op) "%1,%0,%2\n" \ 120 120 PPC_STLCX "%1,0,%3\n" \ 121 121 "bne- 1b\n" \ 122 122 postfix \ 123 123 : "=&r" (old), "=&r" (t) \ 124 - : "r" (mask), "r" (p) \ 124 + : "r" (mask), "r" (p), "i" (IS_ENABLED(CONFIG_PPC64) ? eh : 0) \ 125 125 : "cc", "memory"); \ 126 126 return (old & mask); \ 127 127 } ··· 170 170 171 171 __asm__ __volatile__ ( 172 172 PPC_RELEASE_BARRIER 173 - "1:" PPC_LLARX(%0,0,%3,0) "\n" 173 + "1:" PPC_LLARX "%0,0,%3,0\n" 174 174 "andc %1,%0,%2\n" 175 175 PPC_STLCX "%1,0,%3\n" 176 176 "bne- 1b\n"
+1 -1
arch/powerpc/include/asm/book3s/64/kup.h
··· 90 90 /* Prevent access to userspace using any key values */ 91 91 LOAD_REG_IMMEDIATE(\gpr2, AMR_KUAP_BLOCKED) 92 92 999: tdne \gpr1, \gpr2 93 - EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) 93 + EMIT_WARN_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) 94 94 END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_BOOK3S_KUAP, 67) 95 95 #endif 96 96 .endm
+51 -11
arch/powerpc/include/asm/bug.h
··· 4 4 #ifdef __KERNEL__ 5 5 6 6 #include <asm/asm-compat.h> 7 + #include <asm/extable.h> 7 8 8 9 #ifdef CONFIG_BUG 9 10 ··· 30 29 .previous 31 30 .endm 32 31 #endif /* verbose */ 32 + 33 + .macro EMIT_WARN_ENTRY addr,file,line,flags 34 + EX_TABLE(\addr,\addr+4) 35 + EMIT_BUG_ENTRY \addr,\file,\line,\flags 36 + .endm 33 37 34 38 #else /* !__ASSEMBLY__ */ 35 39 /* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and ··· 64 58 "i" (sizeof(struct bug_entry)), \ 65 59 ##__VA_ARGS__) 66 60 61 + #define WARN_ENTRY(insn, flags, label, ...) \ 62 + asm_volatile_goto( \ 63 + "1: " insn "\n" \ 64 + EX_TABLE(1b, %l[label]) \ 65 + _EMIT_BUG_ENTRY \ 66 + : : "i" (__FILE__), "i" (__LINE__), \ 67 + "i" (flags), \ 68 + "i" (sizeof(struct bug_entry)), \ 69 + ##__VA_ARGS__ : : label) 70 + 67 71 /* 68 72 * BUG_ON() and WARN_ON() do their best to cooperate with compile-time 69 73 * optimisations. However depending on the complexity of the condition ··· 84 68 BUG_ENTRY("twi 31, 0, 0", 0); \ 85 69 unreachable(); \ 86 70 } while (0) 71 + #define HAVE_ARCH_BUG 87 72 73 + #define __WARN_FLAGS(flags) do { \ 74 + __label__ __label_warn_on; \ 75 + \ 76 + WARN_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags), __label_warn_on); \ 77 + unreachable(); \ 78 + \ 79 + __label_warn_on: \ 80 + break; \ 81 + } while (0) 82 + 83 + #ifdef CONFIG_PPC64 88 84 #define BUG_ON(x) do { \ 89 85 if (__builtin_constant_p(x)) { \ 90 86 if (x) \ ··· 106 78 } \ 107 79 } while (0) 108 80 109 - #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) 110 - 111 81 #define WARN_ON(x) ({ \ 112 - int __ret_warn_on = !!(x); \ 113 - if (__builtin_constant_p(__ret_warn_on)) { \ 114 - if (__ret_warn_on) \ 82 bool __ret_warn_on = false; \ 83 do { \ 84 if (__builtin_constant_p((x))) { \ 85 if (!(x)) \ 86 break; \ 115 87 __WARN(); \ 116 - } else { \ 117 - BUG_ENTRY(PPC_TLNEI " %4, 0", \ 118 - BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ 119 - "r" (__ret_warn_on)); \ 120 - } \ 88 + __ret_warn_on = true; \ 89 + } else { \ 90 + __label__ __label_warn_on; \ 91 + \ 92 + WARN_ENTRY(PPC_TLNEI " %4, 0", \ 93 + BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ 94 + __label_warn_on, \ 95 + "r" ((__force long)(x))); \ 96 + break; \ 97 + __label_warn_on: \ 98 + __ret_warn_on = true; \ 99 + } \ 100 + } while (0); \ 121 101 unlikely(__ret_warn_on); \ 122 102 }) 123 103 124 - #define HAVE_ARCH_BUG 125 104 #define HAVE_ARCH_BUG_ON 126 105 #define HAVE_ARCH_WARN_ON 106 + #endif 107 + 127 108 #endif /* __ASSEMBLY __ */ 128 109 #else 129 110 #ifdef __ASSEMBLY__ 130 111 .macro EMIT_BUG_ENTRY addr,file,line,flags 131 112 .endm 113 + .macro EMIT_WARN_ENTRY addr,file,line,flags 114 + .endm 132 115 #else /* !__ASSEMBLY__ */ 133 116 #define _EMIT_BUG_ENTRY 117 + #define _EMIT_WARN_ENTRY 134 118 #endif 135 119 #endif /* CONFIG_BUG */ 136 120
-13
arch/powerpc/include/asm/debugfs.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 - #ifndef _ASM_POWERPC_DEBUGFS_H 3 - #define _ASM_POWERPC_DEBUGFS_H 4 - 5 - /* 6 - * Copyright 2017, Michael Ellerman, IBM Corporation. 7 - */ 8 - 9 - #include <linux/debugfs.h> 10 - 11 - extern struct dentry *powerpc_debugfs_root; 12 - 13 - #endif /* _ASM_POWERPC_DEBUGFS_H */
+1
arch/powerpc/include/asm/drmem.h
··· 111 111 int __init 112 112 walk_drmem_lmbs_early(unsigned long node, void *data, 113 113 int (*func)(struct drmem_lmb *, const __be32 **, void *)); 114 + void drmem_update_lmbs(struct property *prop); 114 115 #endif 115 116 116 117 static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
+14
arch/powerpc/include/asm/extable.h
··· 17 17 18 18 #define ARCH_HAS_RELATIVE_EXTABLE 19 19 20 + #ifndef __ASSEMBLY__ 21 + 20 22 struct exception_table_entry { 21 23 int insn; 22 24 int fixup; ··· 28 26 { 29 27 return (unsigned long)&x->fixup + x->fixup; 30 28 } 29 + 30 + #endif 31 + 32 + /* 33 + * Helper macro for exception table entries 34 + */ 35 + #define EX_TABLE(_fault, _target) \ 36 + stringify_in_c(.section __ex_table,"a";)\ 37 + stringify_in_c(.balign 4;) \ 38 + stringify_in_c(.long (_fault) - . ;) \ 39 + stringify_in_c(.long (_target) - . ;) \ 40 + stringify_in_c(.previous) 31 41 32 42 #endif
+4 -3
arch/powerpc/include/asm/firmware.h
··· 44 44 #define FW_FEATURE_OPAL ASM_CONST(0x0000000010000000) 45 45 #define FW_FEATURE_SET_MODE ASM_CONST(0x0000000040000000) 46 46 #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x0000000080000000) 47 - #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0000000100000000) 47 + #define FW_FEATURE_FORM1_AFFINITY ASM_CONST(0x0000000100000000) 48 48 #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) 49 49 #define FW_FEATURE_DRMEM_V2 ASM_CONST(0x0000000400000000) 50 50 #define FW_FEATURE_DRC_INFO ASM_CONST(0x0000000800000000) ··· 53 53 #define FW_FEATURE_ULTRAVISOR ASM_CONST(0x0000004000000000) 54 54 #define FW_FEATURE_STUFF_TCE ASM_CONST(0x0000008000000000) 55 55 #define FW_FEATURE_RPT_INVALIDATE ASM_CONST(0x0000010000000000) 56 + #define FW_FEATURE_FORM2_AFFINITY ASM_CONST(0x0000020000000000) 56 57 57 58 #ifndef __ASSEMBLY__ 58 59 ··· 70 69 FW_FEATURE_SPLPAR | FW_FEATURE_LPAR | 71 70 FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO | 72 71 FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | 73 - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | 72 + FW_FEATURE_FORM1_AFFINITY | FW_FEATURE_PRRN | 74 73 FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | 75 74 FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE | 76 75 FW_FEATURE_PAPR_SCM | FW_FEATURE_ULTRAVISOR | 77 - FW_FEATURE_RPT_INVALIDATE, 76 + FW_FEATURE_RPT_INVALIDATE | FW_FEATURE_FORM2_AFFINITY, 78 77 FW_FEATURE_PSERIES_ALWAYS = 0, 79 78 FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_ULTRAVISOR, 80 79 FW_FEATURE_POWERNV_ALWAYS = 0,
+1
arch/powerpc/include/asm/iommu.h
··· 154 154 */ 155 155 extern struct iommu_table *iommu_init_table(struct iommu_table *tbl, 156 156 int nid, unsigned long res_start, unsigned long res_end); 157 + bool iommu_table_in_use(struct iommu_table *tbl); 157 158 158 159 #define IOMMU_TABLE_GROUP_MAX_TABLES 2 159 160
+1
arch/powerpc/include/asm/kvm_book3s_64.h
··· 39 39 pgd_t *shadow_pgtable; /* our page table for this guest */ 40 40 u64 l1_gr_to_hr; /* L1's addr of part'n-scoped table */ 41 41 u64 process_table; /* process table entry for this guest */ 42 + u64 hfscr; /* HFSCR that the L1 requested for this nested guest */ 42 43 long refcnt; /* number of pointers to this struct */ 43 44 struct mutex tlb_lock; /* serialize page faults and tlbies */ 44 45 struct kvm_nested_guest *next;
+2
arch/powerpc/include/asm/kvm_host.h
··· 811 811 812 812 u32 online; 813 813 814 + u64 hfscr_permitted; /* A mask of permitted HFSCR facilities */ 815 + 814 816 /* For support of nested guests */ 815 817 struct kvm_nested_guest *nested; 816 818 u32 nested_vcpu_id;
+2 -2
arch/powerpc/include/asm/kvm_ppc.h
··· 664 664 struct kvm_vcpu *vcpu, u32 cpu); 665 665 extern void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu); 666 666 extern int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq, 667 - struct irq_desc *host_desc); 667 + unsigned long host_irq); 668 668 extern int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq, 669 - struct irq_desc *host_desc); 669 + unsigned long host_irq); 670 670 extern u64 kvmppc_xive_get_icp(struct kvm_vcpu *vcpu); 671 671 extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval); 672 672
+2 -1
arch/powerpc/include/asm/membarrier.h
··· 12 12 * when switching from userspace to kernel is not needed after 13 13 * store to rq->curr. 14 14 */ 15 - if (likely(!(atomic_read(&next->membarrier_state) & 15 + if (IS_ENABLED(CONFIG_SMP) && 16 + likely(!(atomic_read(&next->membarrier_state) & 16 17 (MEMBARRIER_STATE_PRIVATE_EXPEDITED | 17 18 MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) 18 19 return;
+1 -1
arch/powerpc/include/asm/mmu.h
··· 324 324 } 325 325 #endif /* !CONFIG_DEBUG_VM */ 326 326 327 - static inline bool radix_enabled(void) 327 + static __always_inline bool radix_enabled(void) 328 328 { 329 329 return mmu_has_feature(MMU_FTR_TYPE_RADIX); 330 330 }
+5
arch/powerpc/include/asm/pci-bridge.h
··· 126 126 #endif /* CONFIG_PPC64 */ 127 127 128 128 void *private_data; 129 + 130 + /* IRQ domain hierarchy */ 131 + struct irq_domain *dev_domain; 132 + struct irq_domain *msi_domain; 133 + struct fwnode_handle *fwnode; 129 134 }; 130 135 131 136 /* These are used for config access before all the PCI probing
+7
arch/powerpc/include/asm/pmc.h
··· 34 34 #endif 35 35 } 36 36 37 + #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE 38 + static inline int ppc_get_pmu_inuse(void) 39 + { 40 + return get_paca()->pmcregs_in_use; 41 + } 42 + #endif 43 + 37 44 extern void power4_enable_pmcs(void); 38 45 39 46 #else /* CONFIG_PPC64 */
+1 -1
arch/powerpc/include/asm/pnv-pci.h
··· 33 33 void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num); 34 34 int pnv_cxl_get_irq_count(struct pci_dev *dev); 35 35 struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev); 36 - int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq); 36 + int64_t pnv_opal_pci_msi_eoi(struct irq_data *d); 37 37 bool is_pnv_opal_msi(struct irq_chip *chip); 38 38 39 39 #ifdef CONFIG_CXL_BASE
-2
arch/powerpc/include/asm/ppc-opcode.h
··· 576 576 #define PPC_DIVDE(t, a, b) stringify_in_c(.long PPC_RAW_DIVDE(t, a, b)) 577 577 #define PPC_DIVDEU(t, a, b) stringify_in_c(.long PPC_RAW_DIVDEU(t, a, b)) 578 578 #define PPC_LQARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LQARX(t, a, b, eh)) 579 - #define PPC_LDARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LDARX(t, a, b, eh)) 580 - #define PPC_LWARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LWARX(t, a, b, eh)) 581 579 #define PPC_STQCX(t, a, b) stringify_in_c(.long PPC_RAW_STQCX(t, a, b)) 582 580 #define PPC_MADDHD(t, a, b, c) stringify_in_c(.long PPC_RAW_MADDHD(t, a, b, c)) 583 581 #define PPC_MADDHDU(t, a, b, c) stringify_in_c(.long PPC_RAW_MADDHDU(t, a, b, c))
+2 -11
arch/powerpc/include/asm/ppc_asm.h
··· 10 10 #include <asm/ppc-opcode.h> 11 11 #include <asm/firmware.h> 12 12 #include <asm/feature-fixups.h> 13 + #include <asm/extable.h> 13 14 14 15 #ifdef __ASSEMBLY__ 15 16 ··· 260 259 261 260 /* Be careful, this will clobber the lr register. */ 262 261 #define LOAD_REG_ADDR_PIC(reg, name) \ 263 - bl 0f; \ 262 + bcl 20,31,$+4; \ 264 263 0: mflr reg; \ 265 264 addis reg,reg,(name - 0b)@ha; \ 266 265 addi reg,reg,(name - 0b)@l; ··· 752 751 #endif /* !CONFIG_PPC_BOOK3E */ 753 752 754 753 #endif /* __ASSEMBLY__ */ 755 - 756 - /* 757 - * Helper macro for exception table entries 758 - */ 759 - #define EX_TABLE(_fault, _target) \ 760 - stringify_in_c(.section __ex_table,"a";)\ 761 - stringify_in_c(.balign 4;) \ 762 - stringify_in_c(.long (_fault) - . ;) \ 763 - stringify_in_c(.long (_target) - . ;) \ 764 - stringify_in_c(.previous) 765 754 766 755 #define SOFT_MASK_TABLE(_start, _end) \ 767 756 stringify_in_c(.section __soft_mask_table,"a";)\
+2 -1
arch/powerpc/include/asm/prom.h
··· 147 147 #define OV5_MSI 0x0201 /* PCIe/MSI support */ 148 148 #define OV5_CMO 0x0480 /* Cooperative Memory Overcommitment */ 149 149 #define OV5_XCMO 0x0440 /* Page Coalescing */ 150 - #define OV5_TYPE1_AFFINITY 0x0580 /* Type 1 NUMA affinity */ 150 + #define OV5_FORM1_AFFINITY 0x0580 /* FORM1 NUMA affinity */ 151 151 #define OV5_PRRN 0x0540 /* Platform Resource Reassignment */ 152 + #define OV5_FORM2_AFFINITY 0x0520 /* Form2 NUMA affinity */ 152 153 #define OV5_HP_EVT 0x0604 /* Hot Plug Event support */ 153 154 #define OV5_RESIZE_HPT 0x0601 /* Hash Page Table resizing */ 154 155 #define OV5_PFO_HW_RNG 0x1180 /* PFO Random Number Generator */
+31 -6
arch/powerpc/include/asm/ptrace.h
··· 22 22 #include <linux/err.h> 23 23 #include <uapi/asm/ptrace.h> 24 24 #include <asm/asm-const.h> 25 + #include <asm/reg.h> 25 26 26 27 #ifndef __ASSEMBLY__ 27 28 struct pt_regs ··· 44 43 unsigned long mq; 45 44 #endif 46 45 unsigned long trap; 47 - unsigned long dar; 48 - unsigned long dsisr; 46 + union { 47 + unsigned long dar; 48 + unsigned long dear; 49 + }; 50 + union { 51 + unsigned long dsisr; 52 + unsigned long esr; 53 + }; 49 54 unsigned long result; 50 55 }; 51 56 }; ··· 204 197 return 0; 205 198 } 206 199 207 - #ifdef __powerpc64__ 208 - #define user_mode(regs) ((((regs)->msr) >> MSR_PR_LG) & 0x1) 209 - #else 210 200 #define user_mode(regs) (((regs)->msr & MSR_PR) != 0) 211 - #endif 212 201 213 202 #define force_successful_syscall_return() \ 214 203 do { \ ··· 287 284 static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc) 288 285 { 289 286 regs->gpr[3] = rc; 287 + } 288 + 289 + static inline bool cpu_has_msr_ri(void) 290 + { 291 + return !IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x); 292 + } 293 + 294 + static inline bool regs_is_unrecoverable(struct pt_regs *regs) 295 + { 296 + return unlikely(cpu_has_msr_ri() && !(regs->msr & MSR_RI)); 297 + } 298 + 299 + static inline void regs_set_recoverable(struct pt_regs *regs) 300 + { 301 + if (cpu_has_msr_ri()) 302 + regs_set_return_msr(regs, regs->msr | MSR_RI); 303 + } 304 + 305 + static inline void regs_set_unrecoverable(struct pt_regs *regs) 306 + { 307 + if (cpu_has_msr_ri()) 308 + regs_set_return_msr(regs, regs->msr & ~MSR_RI); 290 309 } 291 310 292 311 #define arch_has_single_step() (1)
+2 -1
arch/powerpc/include/asm/reg.h
··· 415 415 #define FSCR_TAR __MASK(FSCR_TAR_LG) 416 416 #define FSCR_EBB __MASK(FSCR_EBB_LG) 417 417 #define FSCR_DSCR __MASK(FSCR_DSCR_LG) 418 + #define FSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ 418 419 #define SPRN_HFSCR 0xbe /* HV=1 Facility Status & Control Register */ 419 420 #define HFSCR_PREFIX __MASK(FSCR_PREFIX_LG) 420 421 #define HFSCR_MSGP __MASK(FSCR_MSGP_LG) ··· 427 426 #define HFSCR_DSCR __MASK(FSCR_DSCR_LG) 428 427 #define HFSCR_VECVSX __MASK(FSCR_VECVSX_LG) 429 428 #define HFSCR_FP __MASK(FSCR_FP_LG) 430 - #define HFSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ 429 + #define HFSCR_INTR_CAUSE FSCR_INTR_CAUSE 431 430 #define SPRN_TAR 0x32f /* Target Address Register */ 432 431 #define SPRN_LPCR 0x13E /* LPAR Control Register */ 433 432 #define LPCR_VPM0 ASM_CONST(0x8000000000000000)
-8
arch/powerpc/include/asm/sections.h
··· 38 38 extern char end_virt_trampolines[]; 39 39 #endif 40 40 41 - static inline int in_kernel_text(unsigned long addr) 42 - { 43 - if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end) 44 - return 1; 45 - 46 - return 0; 47 - } 48 - 49 41 static inline unsigned long kernel_toc_addr(void) 50 42 { 51 43 /* Defined by the linker, see vmlinux.lds.S */
+3 -3
arch/powerpc/include/asm/simple_spinlock.h
··· 51 51 52 52 token = LOCK_TOKEN; 53 53 __asm__ __volatile__( 54 - "1: " PPC_LWARX(%0,0,%2,1) "\n\ 54 + "1: lwarx %0,0,%2,1\n\ 55 55 cmpwi 0,%0,0\n\ 56 56 bne- 2f\n\ 57 57 stwcx. %1,0,%2\n\ ··· 179 179 long tmp; 180 180 181 181 __asm__ __volatile__( 182 - "1: " PPC_LWARX(%0,0,%1,1) "\n" 182 + "1: lwarx %0,0,%1,1\n" 183 183 __DO_SIGN_EXTEND 184 184 " addic. %0,%0,1\n\ 185 185 ble- 2f\n" ··· 203 203 204 204 token = WRLOCK_TOKEN; 205 205 __asm__ __volatile__( 206 - "1: " PPC_LWARX(%0,0,%2,1) "\n\ 206 + "1: lwarx %0,0,%2,1\n\ 207 207 cmpwi 0,%0,0\n\ 208 208 bne- 2f\n" 209 209 " stwcx. %1,0,%2\n\
+6
arch/powerpc/include/asm/smp.h
··· 33 33 extern int cpu_to_chip_id(int cpu); 34 34 extern int *chip_id_lookup_table; 35 35 36 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 37 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 38 + DECLARE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map); 39 + 36 40 #ifdef CONFIG_SMP 37 41 38 42 struct smp_ops_t { ··· 145 141 146 142 extern bool has_big_cores; 147 143 extern bool thread_group_shares_l2; 144 + extern bool thread_group_shares_l3; 148 145 149 146 #define cpu_smt_mask cpu_smt_mask 150 147 #ifdef CONFIG_SCHED_SMT ··· 200 195 #define hard_smp_processor_id() get_hard_smp_processor_id(0) 201 196 #define smp_setup_cpu_maps() 202 197 #define thread_group_shares_l2 0 198 + #define thread_group_shares_l3 0 203 199 static inline void inhibit_secondary_onlining(void) {} 204 200 static inline void uninhibit_secondary_onlining(void) {} 205 201 static inline const struct cpumask *cpu_sibling_mask(int cpu)
+7 -13
arch/powerpc/include/asm/syscall.h
··· 90 90 unsigned long val, mask = -1UL; 91 91 unsigned int n = 6; 92 92 93 - #ifdef CONFIG_COMPAT 94 - if (test_tsk_thread_flag(task, TIF_32BIT)) 93 + if (is_32bit_task()) 95 94 mask = 0xffffffff; 96 - #endif 95 + 97 96 while (n--) { 98 97 if (n == 0) 99 98 val = regs->orig_gpr3; ··· 115 116 116 117 static inline int syscall_get_arch(struct task_struct *task) 117 118 { 118 - int arch; 119 - 120 - if (IS_ENABLED(CONFIG_PPC64) && !test_tsk_thread_flag(task, TIF_32BIT)) 121 - arch = AUDIT_ARCH_PPC64; 119 + if (is_32bit_task()) 120 + return AUDIT_ARCH_PPC; 121 + else if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN)) 122 + return AUDIT_ARCH_PPC64LE; 122 123 else 123 - arch = AUDIT_ARCH_PPC; 124 - 125 - #ifdef __LITTLE_ENDIAN__ 126 - arch |= __AUDIT_ARCH_LE; 127 - #endif 128 - return arch; 124 + return AUDIT_ARCH_PPC64; 129 125 } 130 126 #endif /* _ASM_SYSCALL_H */
+30
arch/powerpc/include/asm/syscalls.h
··· 6 6 #include <linux/compiler.h> 7 7 #include <linux/linkage.h> 8 8 #include <linux/types.h> 9 + #include <linux/compat.h> 9 10 10 11 struct rtas_args; 11 12 ··· 18 17 unsigned long fd, unsigned long pgoff); 19 18 asmlinkage long ppc64_personality(unsigned long personality); 20 19 asmlinkage long sys_rtas(struct rtas_args __user *uargs); 20 + 21 + #ifdef CONFIG_COMPAT 22 + unsigned long compat_sys_mmap2(unsigned long addr, size_t len, 23 + unsigned long prot, unsigned long flags, 24 + unsigned long fd, unsigned long pgoff); 25 + 26 + compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, 27 + u32 reg6, u32 pos1, u32 pos2); 28 + 29 + compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, 30 + u32 reg6, u32 pos1, u32 pos2); 31 + 32 + compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count); 33 + 34 + int compat_sys_truncate64(const char __user *path, u32 reg4, 35 + unsigned long len1, unsigned long len2); 36 + 37 + long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2); 38 + 39 + int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, 40 + unsigned long len2); 41 + 42 + long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, 43 + size_t len, int advice); 44 + 45 + long compat_sys_sync_file_range2(int fd, unsigned int flags, 46 + unsigned int offset1, unsigned int offset2, 47 + unsigned int nbytes1, unsigned int nbytes2); 48 + #endif 21 49 22 50 #endif /* __KERNEL__ */ 23 51 #endif /* __ASM_POWERPC_SYSCALLS_H */
-8
arch/powerpc/include/asm/tce.h
··· 19 19 #define TCE_VB 0 20 20 #define TCE_PCI 1 21 21 22 - /* TCE page size is 4096 bytes (1 << 12) */ 23 - 24 - #define TCE_SHIFT 12 25 - #define TCE_PAGE_SIZE (1 << TCE_SHIFT) 26 - 27 22 #define TCE_ENTRY_SIZE 8 /* each TCE is 64 bits */ 28 - 29 - #define TCE_RPN_MASK 0xfffffffffful /* 40-bit RPN (4K pages) */ 30 - #define TCE_RPN_SHIFT 12 31 23 #define TCE_VALID 0x800 /* TCE valid */ 32 24 #define TCE_ALLIO 0x400 /* TCE valid for all lpars */ 33 25 #define TCE_PCI_WRITE 0x2 /* write from PCI allowed */
+17 -2
arch/powerpc/include/asm/topology.h
··· 36 36 cpu_all_mask : \ 37 37 cpumask_of_node(pcibus_to_node(bus))) 38 38 39 - extern int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc); 39 + int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc); 40 40 extern int __node_distance(int, int); 41 41 #define node_distance(a, b) __node_distance(a, b) 42 42 ··· 64 64 } 65 65 66 66 int of_drconf_to_nid_single(struct drmem_lmb *lmb); 67 + void update_numa_distance(struct device_node *node); 68 + 69 + extern void map_cpu_to_node(int cpu, int node); 70 + #ifdef CONFIG_HOTPLUG_CPU 71 + extern void unmap_cpu_from_node(unsigned long cpu); 72 + #endif /* CONFIG_HOTPLUG_CPU */ 67 73 68 74 #else 69 75 ··· 89 83 90 84 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {} 91 85 92 - static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc) 86 + static inline int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc) 93 87 { 94 88 return 0; 95 89 } ··· 98 92 { 99 93 return first_online_node; 100 94 } 95 + 96 + static inline void update_numa_distance(struct device_node *node) {} 97 + 98 + #ifdef CONFIG_SMP 99 + static inline void map_cpu_to_node(int cpu, int node) {} 100 + #ifdef CONFIG_HOTPLUG_CPU 101 + static inline void unmap_cpu_from_node(unsigned long cpu) {} 102 + #endif /* CONFIG_HOTPLUG_CPU */ 103 + #endif /* CONFIG_SMP */ 101 104 102 105 #endif /* CONFIG_NUMA */ 103 106
-2
arch/powerpc/include/asm/unistd.h
··· 9 9 10 10 #define NR_syscalls __NR_syscalls 11 11 12 - #define __NR__exit __NR_exit 13 - 14 12 #ifndef __ASSEMBLY__ 15 13 16 14 #include <linux/types.h>
+9
arch/powerpc/include/asm/vdso/processor.h
··· 5 5 #ifndef __ASSEMBLY__ 6 6 7 7 /* Macros for adjusting thread priority (hardware multi-threading) */ 8 + #ifdef CONFIG_PPC64 8 9 #define HMT_very_low() asm volatile("or 31, 31, 31 # very low priority") 9 10 #define HMT_low() asm volatile("or 1, 1, 1 # low priority") 10 11 #define HMT_medium_low() asm volatile("or 6, 6, 6 # medium low priority") 11 12 #define HMT_medium() asm volatile("or 2, 2, 2 # medium priority") 12 13 #define HMT_medium_high() asm volatile("or 5, 5, 5 # medium high priority") 13 14 #define HMT_high() asm volatile("or 3, 3, 3 # high priority") 15 + #else 16 + #define HMT_very_low() 17 + #define HMT_low() 18 + #define HMT_medium_low() 19 + #define HMT_medium() 20 + #define HMT_medium_high() 21 + #define HMT_high() 22 + #endif 14 23 15 24 #ifdef CONFIG_PPC64 16 25 #define cpu_relax() do { HMT_low(); HMT_medium(); barrier(); } while (0)
+2 -1
arch/powerpc/include/asm/xics.h
··· 89 89 /* ICS instance, hooked up to chip_data of an irq */ 90 90 struct ics { 91 91 struct list_head link; 92 - int (*map)(struct ics *ics, unsigned int virq); 92 + int (*check)(struct ics *ics, unsigned int hwirq); 93 93 void (*mask_unknown)(struct ics *ics, unsigned long vec); 94 94 long (*get_server)(struct ics *ics, unsigned long vec); 95 95 int (*host_match)(struct ics *ics, struct device_node *node); 96 + struct irq_chip *chip; 96 97 char data[]; 97 98 }; 98 99
+3
arch/powerpc/include/asm/xive-regs.h
··· 80 80 #define TM_QW0W2_VU PPC_BIT32(0) 81 81 #define TM_QW0W2_LOGIC_SERV PPC_BITMASK32(1,31) // XX 2,31 ? 82 82 #define TM_QW1W2_VO PPC_BIT32(0) 83 + #define TM_QW1W2_HO PPC_BIT32(1) /* P10 XIVE2 */ 83 84 #define TM_QW1W2_OS_CAM PPC_BITMASK32(8,31) 84 85 #define TM_QW2W2_VP PPC_BIT32(0) 86 + #define TM_QW2W2_HP PPC_BIT32(1) /* P10 XIVE2 */ 85 87 #define TM_QW2W2_POOL_CAM PPC_BITMASK32(8,31) 86 88 #define TM_QW3W2_VT PPC_BIT32(0) 89 + #define TM_QW3W2_HT PPC_BIT32(1) /* P10 XIVE2 */ 87 90 #define TM_QW3W2_LP PPC_BIT32(6) 88 91 #define TM_QW3W2_LE PPC_BIT32(7) 89 92 #define TM_QW3W2_T PPC_BIT32(31)
+2
arch/powerpc/include/asm/xive.h
··· 111 111 int xive_native_populate_irq_data(u32 hw_irq, 112 112 struct xive_irq_data *data); 113 113 void xive_cleanup_irq_data(struct xive_irq_data *xd); 114 + void xive_irq_free_data(unsigned int virq); 114 115 void xive_native_free_irq(u32 irq); 115 116 int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq); 116 117 ··· 126 125 int xive_native_disable_vp(u32 vp_id); 127 126 int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id); 128 127 bool xive_native_has_single_escalation(void); 128 + bool xive_native_has_save_restore(void); 129 129 130 130 int xive_native_get_queue_info(u32 vp_id, uint32_t prio, 131 131 u64 *out_qpage,
+2 -1
arch/powerpc/kernel/Makefile
··· 46 46 prom.o traps.o setup-common.o \ 47 47 udbg.o misc.o io.o misc_$(BITS).o \ 48 48 of_platform.o prom_parse.o firmware.o \ 49 - hw_breakpoint_constraints.o interrupt.o 49 + hw_breakpoint_constraints.o interrupt.o \ 50 + kdebugfs.o 50 51 obj-y += ptrace/ 51 52 obj-$(CONFIG_PPC64) += setup_64.o \ 52 53 paca.o nvram_64.o note.o
+4 -11
arch/powerpc/kernel/asm-offsets.c
··· 286 286 STACK_PT_REGS_OFFSET(_CCR, ccr); 287 287 STACK_PT_REGS_OFFSET(_XER, xer); 288 288 STACK_PT_REGS_OFFSET(_DAR, dar); 289 + STACK_PT_REGS_OFFSET(_DEAR, dear); 289 290 STACK_PT_REGS_OFFSET(_DSISR, dsisr); 291 + STACK_PT_REGS_OFFSET(_ESR, esr); 290 292 STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3); 291 293 STACK_PT_REGS_OFFSET(RESULT, result); 292 294 STACK_PT_REGS_OFFSET(_TRAP, trap); 293 - #ifndef CONFIG_PPC64 294 - /* 295 - * The PowerPC 400-class & Book-E processors have neither the DAR 296 - * nor the DSISR SPRs. Hence, we overload them to hold the similar 297 - * DEAR and ESR SPRs for such processors. For critical interrupts 298 - * we use them to hold SRR0 and SRR1. 299 - */ 300 - STACK_PT_REGS_OFFSET(_DEAR, dar); 301 - STACK_PT_REGS_OFFSET(_ESR, dsisr); 302 - #else /* CONFIG_PPC64 */ 295 + #ifdef CONFIG_PPC64 303 296 STACK_PT_REGS_OFFSET(SOFTE, softe); 304 297 STACK_PT_REGS_OFFSET(_PPR, ppr); 305 - #endif /* CONFIG_PPC64 */ 298 + #endif 306 299 307 300 #ifdef CONFIG_PPC_PKEY 308 301 STACK_PT_REGS_OFFSET(STACK_REGS_AMR, amr);
+62 -62
arch/powerpc/kernel/cacheinfo.c
··· 120 120 struct cpumask shared_cpu_map; /* online CPUs using this cache */ 121 121 int type; /* split cache disambiguation */ 122 122 int level; /* level not explicit in device tree */ 123 + int group_id; /* id of the group of threads that share this cache */ 123 124 struct list_head list; /* global list of cache objects */ 124 125 struct cache *next_local; /* next cache of >= level */ 125 126 }; ··· 143 142 } 144 143 145 144 static void cache_init(struct cache *cache, int type, int level, 146 - struct device_node *ofnode) 145 + struct device_node *ofnode, int group_id) 147 146 { 148 147 cache->type = type; 149 148 cache->level = level; 150 149 cache->ofnode = of_node_get(ofnode); 150 + cache->group_id = group_id; 151 151 INIT_LIST_HEAD(&cache->list); 152 152 list_add(&cache->list, &cache_list); 153 153 } 154 154 155 - static struct cache *new_cache(int type, int level, struct device_node *ofnode) 155 + static struct cache *new_cache(int type, int level, 156 + struct device_node *ofnode, int group_id) 156 157 { 157 158 struct cache *cache; 158 159 159 160 cache = kzalloc(sizeof(*cache), GFP_KERNEL); 160 161 if (cache) 161 - cache_init(cache, type, level, ofnode); 162 + cache_init(cache, type, level, ofnode, group_id); 162 163 163 164 return cache; 164 165 } ··· 312 309 return cache; 313 310 314 311 list_for_each_entry(iter, &cache_list, list) 315 - if (iter->ofnode == cache->ofnode && iter->next_local == cache) 312 + if (iter->ofnode == cache->ofnode && 313 + iter->group_id == cache->group_id && 314 + iter->next_local == cache) 316 315 return iter; 317 316 318 317 return cache; 319 318 } 320 319 321 - /* return the first cache on a local list matching node */ 322 - static struct cache *cache_lookup_by_node(const struct device_node *node) 320 + /* return the first cache on a local list matching node and thread-group id */ 321 + static struct cache *cache_lookup_by_node_group(const struct device_node *node, 322 + int group_id) 323 323 { 324 324 struct cache *cache 
= NULL; 325 325 struct cache *iter; 326 326 327 327 list_for_each_entry(iter, &cache_list, list) { 328 - if (iter->ofnode != node) 328 + if (iter->ofnode != node || 329 + iter->group_id != group_id) 329 330 continue; 330 331 cache = cache_find_first_sibling(iter); 331 332 break; ··· 359 352 CACHE_TYPE_UNIFIED_D : CACHE_TYPE_UNIFIED; 360 353 } 361 354 362 - static struct cache *cache_do_one_devnode_unified(struct device_node *node, int level) 355 + static struct cache *cache_do_one_devnode_unified(struct device_node *node, int group_id, 356 + int level) 363 357 { 364 358 pr_debug("creating L%d ucache for %pOFP\n", level, node); 365 359 366 - return new_cache(cache_is_unified_d(node), level, node); 360 + return new_cache(cache_is_unified_d(node), level, node, group_id); 367 361 } 368 362 369 - static struct cache *cache_do_one_devnode_split(struct device_node *node, 363 + static struct cache *cache_do_one_devnode_split(struct device_node *node, int group_id, 370 364 int level) 371 365 { 372 366 struct cache *dcache, *icache; ··· 375 367 pr_debug("creating L%d dcache and icache for %pOFP\n", level, 376 368 node); 377 369 378 - dcache = new_cache(CACHE_TYPE_DATA, level, node); 379 - icache = new_cache(CACHE_TYPE_INSTRUCTION, level, node); 370 + dcache = new_cache(CACHE_TYPE_DATA, level, node, group_id); 371 + icache = new_cache(CACHE_TYPE_INSTRUCTION, level, node, group_id); 380 372 381 373 if (!dcache || !icache) 382 374 goto err; ··· 390 382 return NULL; 391 383 } 392 384 393 - static struct cache *cache_do_one_devnode(struct device_node *node, int level) 385 + static struct cache *cache_do_one_devnode(struct device_node *node, int group_id, int level) 394 386 { 395 387 struct cache *cache; 396 388 397 389 if (cache_node_is_unified(node)) 398 - cache = cache_do_one_devnode_unified(node, level); 390 + cache = cache_do_one_devnode_unified(node, group_id, level); 399 391 else 400 - cache = cache_do_one_devnode_split(node, level); 392 + cache = 
cache_do_one_devnode_split(node, group_id, level); 401 393 402 394 return cache; 403 395 } 404 396 405 397 static struct cache *cache_lookup_or_instantiate(struct device_node *node, 398 + int group_id, 406 399 int level) 407 400 { 408 401 struct cache *cache; 409 402 410 - cache = cache_lookup_by_node(node); 403 + cache = cache_lookup_by_node_group(node, group_id); 411 404 412 405 WARN_ONCE(cache && cache->level != level, 413 406 "cache level mismatch on lookup (got %d, expected %d)\n", 414 407 cache->level, level); 415 408 416 409 if (!cache) 417 - cache = cache_do_one_devnode(node, level); 410 + cache = cache_do_one_devnode(node, group_id, level); 418 411 419 412 return cache; 420 413 } ··· 452 443 of_node_get_device_type(cache->ofnode)); 453 444 } 454 445 455 - static void do_subsidiary_caches(struct cache *cache) 446 + /* 447 + * If sub-groups of threads in a core containing @cpu_id share the 448 + * L@level-cache (information obtained via "ibm,thread-groups" 449 + * device-tree property), then we identify the group by the first 450 + * thread-sibling in the group. We define this to be the group-id. 451 + * 452 + * In the absence of any thread-group information for L@level-cache, 453 + * this function returns -1. 
454 + */ 455 + static int get_group_id(unsigned int cpu_id, int level) 456 + { 457 + if (has_big_cores && level == 1) 458 + return cpumask_first(per_cpu(thread_group_l1_cache_map, 459 + cpu_id)); 460 + else if (thread_group_shares_l2 && level == 2) 461 + return cpumask_first(per_cpu(thread_group_l2_cache_map, 462 + cpu_id)); 463 + else if (thread_group_shares_l3 && level == 3) 464 + return cpumask_first(per_cpu(thread_group_l3_cache_map, 465 + cpu_id)); 466 + return -1; 467 + } 468 + 469 + static void do_subsidiary_caches(struct cache *cache, unsigned int cpu_id) 456 470 { 457 471 struct device_node *subcache_node; 458 472 int level = cache->level; ··· 484 452 485 453 while ((subcache_node = of_find_next_cache_node(cache->ofnode))) { 486 454 struct cache *subcache; 455 + int group_id; 487 456 488 457 level++; 489 - subcache = cache_lookup_or_instantiate(subcache_node, level); 458 + group_id = get_group_id(cpu_id, level); 459 + subcache = cache_lookup_or_instantiate(subcache_node, group_id, level); 490 460 of_node_put(subcache_node); 491 461 if (!subcache) 492 462 break; ··· 502 468 { 503 469 struct device_node *cpu_node; 504 470 struct cache *cpu_cache = NULL; 471 + int group_id; 505 472 506 473 pr_debug("creating cache object(s) for CPU %i\n", cpu_id); 507 474 ··· 511 476 if (!cpu_node) 512 477 goto out; 513 478 514 - cpu_cache = cache_lookup_or_instantiate(cpu_node, 1); 479 + group_id = get_group_id(cpu_id, 1); 480 + 481 + cpu_cache = cache_lookup_or_instantiate(cpu_node, group_id, 1); 515 482 if (!cpu_cache) 516 483 goto out; 517 484 518 - do_subsidiary_caches(cpu_cache); 485 + do_subsidiary_caches(cpu_cache, cpu_id); 519 486 520 487 cache_cpu_set(cpu_cache, cpu_id); 521 488 out: ··· 678 641 static struct kobj_attribute cache_level_attr = 679 642 __ATTR(level, 0444, level_show, NULL); 680 643 681 - static unsigned int index_dir_to_cpu(struct cache_index_dir *index) 682 - { 683 - struct kobject *index_dir_kobj = &index->kobj; 684 - struct kobject *cache_dir_kobj 
= index_dir_kobj->parent; 685 - struct kobject *cpu_dev_kobj = cache_dir_kobj->parent; 686 - struct device *dev = kobj_to_dev(cpu_dev_kobj); 687 - 688 - return dev->id; 689 - } 690 - 691 - /* 692 - * On big-core systems, each core has two groups of CPUs each of which 693 - * has its own L1-cache. The thread-siblings which share l1-cache with 694 - * @cpu can be obtained via cpu_smallcore_mask(). 695 - * 696 - * On some big-core systems, the L2 cache is shared only between some 697 - * groups of siblings. This is already parsed and encoded in 698 - * cpu_l2_cache_mask(). 699 - * 700 - * TODO: cache_lookup_or_instantiate() needs to be made aware of the 701 - * "ibm,thread-groups" property so that cache->shared_cpu_map 702 - * reflects the correct siblings on platforms that have this 703 - * device-tree property. This helper function is only a stop-gap 704 - * solution so that we report the correct siblings to the 705 - * userspace via sysfs. 706 - */ 707 - static const struct cpumask *get_shared_cpu_map(struct cache_index_dir *index, struct cache *cache) 708 - { 709 - if (has_big_cores) { 710 - int cpu = index_dir_to_cpu(index); 711 - if (cache->level == 1) 712 - return cpu_smallcore_mask(cpu); 713 - if (cache->level == 2 && thread_group_shares_l2) 714 - return cpu_l2_cache_mask(cpu); 715 - } 716 - 717 - return &cache->shared_cpu_map; 718 - } 719 - 720 644 static ssize_t 721 645 show_shared_cpumap(struct kobject *k, struct kobj_attribute *attr, char *buf, bool list) 722 646 { ··· 688 690 index = kobj_to_cache_index_dir(k); 689 691 cache = index->cache; 690 692 691 - mask = get_shared_cpu_map(index, cache); 693 + mask = &cache->shared_cpu_map; 692 694 693 695 return cpumap_print_to_pagebuf(list, buf, mask); 694 696 } ··· 846 848 { 847 849 struct device_node *cpu_node; 848 850 struct cache *cache; 851 + int group_id; 849 852 850 853 cpu_node = of_get_cpu_node(cpu_id, NULL); 851 854 WARN_ONCE(!cpu_node, "no OF node found for CPU %i\n", cpu_id); 852 855 if (!cpu_node) 
853 856 return NULL; 854 857 855 - cache = cache_lookup_by_node(cpu_node); 858 + group_id = get_group_id(cpu_id, 1); 859 + cache = cache_lookup_by_node_group(cpu_node, group_id); 856 860 of_node_put(cpu_node); 857 861 858 862 return cache;
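The cacheinfo.c hunks above re-key cache objects by the pair (device-tree node, thread-group id): get_group_id() returns the first thread-sibling of the group sharing the cache at that level (from "ibm,thread-groups"), or -1 when no such information exists, and cache_lookup_by_node_group() matches on both fields. A rough userspace model of that lookup (illustrative names, not the kernel's structures; error handling elided):

```c
#include <stddef.h>
#include <stdlib.h>

/* Model of the change: cache objects are looked up by (ofnode, group_id)
 * instead of ofnode alone, so two thread groups behind the same
 * device-tree node that do not share the physical cache get distinct
 * objects.  group_id is -1 when "ibm,thread-groups" says nothing. */
struct model_cache {
	const void *ofnode;		/* device-tree node identity */
	int group_id;			/* first thread of the sharing group, or -1 */
	int level;
	struct model_cache *next;	/* global list, like cache_list */
};

static struct model_cache *cache_list;
static int node_a, node_b;		/* stand-ins for two device-tree nodes */

static struct model_cache *lookup(const void *ofnode, int group_id)
{
	struct model_cache *it;

	for (it = cache_list; it; it = it->next)
		if (it->ofnode == ofnode && it->group_id == group_id)
			return it;
	return NULL;
}

static struct model_cache *lookup_or_instantiate(const void *ofnode,
						 int group_id, int level)
{
	struct model_cache *c = lookup(ofnode, group_id);

	if (c)
		return c;
	c = calloc(1, sizeof(*c));
	c->ofnode = ofnode;
	c->group_id = group_id;
	c->level = level;
	c->next = cache_list;
	cache_list = c;
	return c;
}
```

With distinct objects per (node, group), shared_cpu_map can be populated correctly up front, which is what lets the series drop the old get_shared_cpu_map() sysfs stop-gap visible in the deletions above.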
+1 -2
arch/powerpc/kernel/dawr.c
··· 9 9 #include <linux/export.h> 10 10 #include <linux/fs.h> 11 11 #include <linux/debugfs.h> 12 - #include <asm/debugfs.h> 13 12 #include <asm/machdep.h> 14 13 #include <asm/hvcall.h> 15 14 ··· 100 101 if (PVR_VER(mfspr(SPRN_PVR)) == PVR_POWER9) { 101 102 /* Turn DAWR off by default, but allow admin to turn it on */ 102 103 debugfs_create_file_unsafe("dawr_enable_dangerous", 0600, 103 - powerpc_debugfs_root, 104 + arch_debugfs_dir, 104 105 &dawr_force_enable, 105 106 &dawr_enable_fops); 106 107 }
+8 -8
arch/powerpc/kernel/eeh.c
··· 21 21 #include <linux/spinlock.h> 22 22 #include <linux/export.h> 23 23 #include <linux/of.h> 24 + #include <linux/debugfs.h> 24 25 25 26 #include <linux/atomic.h> 26 - #include <asm/debugfs.h> 27 27 #include <asm/eeh.h> 28 28 #include <asm/eeh_event.h> 29 29 #include <asm/io.h> ··· 1901 1901 proc_create_single("powerpc/eeh", 0, NULL, proc_eeh_show); 1902 1902 #ifdef CONFIG_DEBUG_FS 1903 1903 debugfs_create_file_unsafe("eeh_enable", 0600, 1904 - powerpc_debugfs_root, NULL, 1904 + arch_debugfs_dir, NULL, 1905 1905 &eeh_enable_dbgfs_ops); 1906 1906 debugfs_create_u32("eeh_max_freezes", 0600, 1907 - powerpc_debugfs_root, &eeh_max_freezes); 1907 + arch_debugfs_dir, &eeh_max_freezes); 1908 1908 debugfs_create_bool("eeh_disable_recovery", 0600, 1909 - powerpc_debugfs_root, 1909 + arch_debugfs_dir, 1910 1910 &eeh_debugfs_no_recover); 1911 1911 debugfs_create_file_unsafe("eeh_dev_check", 0600, 1912 - powerpc_debugfs_root, NULL, 1912 + arch_debugfs_dir, NULL, 1913 1913 &eeh_dev_check_fops); 1914 1914 debugfs_create_file_unsafe("eeh_dev_break", 0600, 1915 - powerpc_debugfs_root, NULL, 1915 + arch_debugfs_dir, NULL, 1916 1916 &eeh_dev_break_fops); 1917 1917 debugfs_create_file_unsafe("eeh_force_recover", 0600, 1918 - powerpc_debugfs_root, NULL, 1918 + arch_debugfs_dir, NULL, 1919 1919 &eeh_force_recover_fops); 1920 1920 debugfs_create_file_unsafe("eeh_dev_can_recover", 0600, 1921 - powerpc_debugfs_root, NULL, 1921 + arch_debugfs_dir, NULL, 1922 1922 &eeh_dev_can_recover_fops); 1923 1923 eeh_cache_debugfs_init(); 1924 1924 #endif
+2 -2
arch/powerpc/kernel/eeh_cache.c
··· 12 12 #include <linux/slab.h> 13 13 #include <linux/spinlock.h> 14 14 #include <linux/atomic.h> 15 + #include <linux/debugfs.h> 15 16 #include <asm/pci-bridge.h> 16 - #include <asm/debugfs.h> 17 17 #include <asm/ppc-pci.h> 18 18 19 19 ··· 283 283 void eeh_cache_debugfs_init(void) 284 284 { 285 285 debugfs_create_file_unsafe("eeh_address_cache", 0400, 286 - powerpc_debugfs_root, NULL, 286 + arch_debugfs_dir, NULL, 287 287 &eeh_addr_cache_fops); 288 288 }
+2 -2
arch/powerpc/kernel/entry_32.S
··· 161 161 ret_from_kernel_thread: 162 162 REST_NVGPRS(r1) 163 163 bl schedule_tail 164 - mtlr r14 164 + mtctr r14 165 165 mr r3,r15 166 166 PPC440EP_ERR42 167 - blrl 167 + bctrl 168 168 li r3,0 169 169 b ret_from_syscall 170 170
+1 -1
arch/powerpc/kernel/entry_64.S
··· 309 309 */ 310 310 lbz r0,PACAIRQSOFTMASK(r13) 311 311 1: tdeqi r0,IRQS_ENABLED 312 - EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING 312 + EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING 313 313 #endif 314 314 315 315 /* Hard-disable interrupts */
+12 -12
arch/powerpc/kernel/exceptions-64e.S
··· 545 545 PROLOG_ADDITION_2REGS) 546 546 mfspr r14,SPRN_DEAR 547 547 mfspr r15,SPRN_ESR 548 - std r14,_DAR(r1) 549 - std r15,_DSISR(r1) 548 + std r14,_DEAR(r1) 549 + std r15,_ESR(r1) 550 550 ld r14,PACA_EXGEN+EX_R14(r13) 551 551 ld r15,PACA_EXGEN+EX_R15(r13) 552 552 EXCEPTION_COMMON(0x300) ··· 558 558 PROLOG_ADDITION_2REGS) 559 559 li r15,0 560 560 mr r14,r10 561 - std r14,_DAR(r1) 562 - std r15,_DSISR(r1) 561 + std r14,_DEAR(r1) 562 + std r15,_ESR(r1) 563 563 ld r14,PACA_EXGEN+EX_R14(r13) 564 564 ld r15,PACA_EXGEN+EX_R15(r13) 565 565 EXCEPTION_COMMON(0x400) ··· 575 575 PROLOG_ADDITION_2REGS) 576 576 mfspr r14,SPRN_DEAR 577 577 mfspr r15,SPRN_ESR 578 - std r14,_DAR(r1) 579 - std r15,_DSISR(r1) 578 + std r14,_DEAR(r1) 579 + std r15,_ESR(r1) 580 580 ld r14,PACA_EXGEN+EX_R14(r13) 581 581 ld r15,PACA_EXGEN+EX_R15(r13) 582 582 EXCEPTION_COMMON(0x600) ··· 587 587 NORMAL_EXCEPTION_PROLOG(0x700, BOOKE_INTERRUPT_PROGRAM, 588 588 PROLOG_ADDITION_1REG) 589 589 mfspr r14,SPRN_ESR 590 - std r14,_DSISR(r1) 590 + std r14,_ESR(r1) 591 591 ld r14,PACA_EXGEN+EX_R14(r13) 592 592 EXCEPTION_COMMON(0x700) 593 593 addi r3,r1,STACK_FRAME_OVERHEAD ··· 1057 1057 std r11,_CCR(r1) 1058 1058 mfspr r10,SPRN_DEAR 1059 1059 mfspr r11,SPRN_ESR 1060 - std r10,_DAR(r1) 1061 - std r11,_DSISR(r1) 1060 + std r10,_DEAR(r1) 1061 + std r11,_ESR(r1) 1062 1062 std r0,GPR0(r1); /* save r0 in stackframe */ \ 1063 1063 std r2,GPR2(r1); /* save r2 in stackframe */ \ 1064 1064 SAVE_4GPRS(3, r1); /* save r3 - r6 in stackframe */ \ ··· 1127 1127 * r3 = MAS0_TLBSEL (for the iprot array) 1128 1128 * r4 = SPRN_TLBnCFG 1129 1129 */ 1130 - bl invstr /* Find our address */ 1130 + bcl 20,31,$+4 /* Find our address */ 1131 1131 invstr: mflr r6 /* Make it accessible */ 1132 1132 mfmsr r7 1133 1133 rlwinm r5,r7,27,31,31 /* extract MSR[IS] */ ··· 1196 1196 mfmsr r6 1197 1197 xori r6,r6,MSR_IS 1198 1198 mtspr SPRN_SRR1,r6 1199 - bl 1f /* Find our address */ 1199 + bcl 20,31,$+4 /* Find our address */ 1200 1200 1: mflr r6 
1201 1201 addi r6,r6,(2f - 1b) 1202 1202 mtspr SPRN_SRR0,r6 ··· 1256 1256 * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping 1257 1257 */ 1258 1258 /* Now we branch the new virtual address mapped by this entry */ 1259 - bl 1f /* Find our address */ 1259 + bcl 20,31,$+4 /* Find our address */ 1260 1260 1: mflr r6 1261 1261 addi r6,r6,(2f - 1b) 1262 1262 tovirt(r6,r6)
+2 -2
arch/powerpc/kernel/fadump.c
··· 24 24 #include <linux/slab.h> 25 25 #include <linux/cma.h> 26 26 #include <linux/hugetlb.h> 27 + #include <linux/debugfs.h> 27 28 28 - #include <asm/debugfs.h> 29 29 #include <asm/page.h> 30 30 #include <asm/prom.h> 31 31 #include <asm/fadump.h> ··· 1557 1557 return; 1558 1558 } 1559 1559 1560 - debugfs_create_file("fadump_region", 0444, powerpc_debugfs_root, NULL, 1560 + debugfs_create_file("fadump_region", 0444, arch_debugfs_dir, NULL, 1561 1561 &fadump_region_fops); 1562 1562 1563 1563 if (fw_dump.dump_active) {
+1 -2
arch/powerpc/kernel/fpu.S
··· 91 91 isync 92 92 /* enable use of FP after return */ 93 93 #ifdef CONFIG_PPC32 94 - mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */ 95 - tovirt(r5, r5) 94 + addi r5,r2,THREAD 96 95 lwz r4,THREAD_FPEXC_MODE(r5) 97 96 ori r9,r9,MSR_FP /* enable FP for current */ 98 97 or r9,r9,r4
+4 -4
arch/powerpc/kernel/fsl_booke_entry_mapping.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0 */ 2 2 3 3 /* 1. Find the index of the entry we're executing in */ 4 - bl invstr /* Find our address */ 4 + bcl 20,31,$+4 /* Find our address */ 5 5 invstr: mflr r6 /* Make it accessible */ 6 6 mfmsr r7 7 7 rlwinm r4,r7,27,31,31 /* extract MSR[IS] */ ··· 85 85 addi r6,r6,10 86 86 slw r6,r8,r6 /* convert to mask */ 87 87 88 - bl 1f /* Find our address */ 88 + bcl 20,31,$+4 /* Find our address */ 89 89 1: mflr r7 90 90 91 91 mfspr r8,SPRN_MAS3 ··· 117 117 118 118 xori r6,r4,1 119 119 slwi r6,r6,5 /* setup new context with other address space */ 120 - bl 1f /* Find our address */ 120 + bcl 20,31,$+4 /* Find our address */ 121 121 1: mflr r9 122 122 rlwimi r7,r9,0,20,31 123 123 addi r7,r7,(2f - 1b) ··· 207 207 208 208 lis r7,MSR_KERNEL@h 209 209 ori r7,r7,MSR_KERNEL@l 210 - bl 1f /* Find our address */ 210 + bcl 20,31,$+4 /* Find our address */ 211 211 1: mflr r9 212 212 rlwimi r6,r9,0,20,31 213 213 addi r6,r6,(2f - 1b)
+3 -3
arch/powerpc/kernel/head_44x.S
··· 70 70 * address. 71 71 * r21 will be loaded with the physical runtime address of _stext 72 72 */ 73 - bl 0f /* Get our runtime address */ 73 + bcl 20,31,$+4 /* Get our runtime address */ 74 74 0: mflr r21 /* Make it accessible */ 75 75 addis r21,r21,(_stext - 0b)@ha 76 76 addi r21,r21,(_stext - 0b)@l /* Get our current runtime base */ ··· 853 853 wmmucr: mtspr SPRN_MMUCR,r3 /* Put MMUCR */ 854 854 sync 855 855 856 - bl invstr /* Find our address */ 856 + bcl 20,31,$+4 /* Find our address */ 857 857 invstr: mflr r5 /* Make it accessible */ 858 858 tlbsx r23,0,r5 /* Find entry we are in */ 859 859 li r4,0 /* Start at TLB entry 0 */ ··· 1045 1045 sync 1046 1046 1047 1047 /* Find the entry we are running from */ 1048 - bl 1f 1048 + bcl 20,31,$+4 1049 1049 1: mflr r23 1050 1050 tlbsx r23,0,r23 1051 1051 tlbre r24,r23,0
+2
arch/powerpc/kernel/head_64.S
··· 712 712 isync 713 713 blr 714 714 715 + _ASM_NOKPROBE_SYMBOL(copy_and_flush); /* Called in real mode */ 716 + 715 717 .align 8 716 718 copy_to_here: 717 719
+3 -3
arch/powerpc/kernel/head_fsl_booke.S
··· 79 79 mr r23,r3 80 80 mr r25,r4 81 81 82 - bl 0f 82 + bcl 20,31,$+4 83 83 0: mflr r8 84 84 addis r3,r8,(is_second_reloc - 0b)@ha 85 85 lwz r19,(is_second_reloc - 0b)@l(r3) ··· 1132 1132 bne 1b 1133 1133 1134 1134 /* Get the tlb entry used by the current running code */ 1135 - bl 0f 1135 + bcl 20,31,$+4 1136 1136 0: mflr r4 1137 1137 tlbsx 0,r4 1138 1138 ··· 1166 1166 _GLOBAL(restore_to_as0) 1167 1167 mflr r0 1168 1168 1169 - bl 0f 1169 + bcl 20,31,$+4 1170 1170 0: mflr r9 1171 1171 addi r9,r9,1f - 0b 1172 1172
-1
arch/powerpc/kernel/hw_breakpoint.c
··· 22 22 #include <asm/processor.h> 23 23 #include <asm/sstep.h> 24 24 #include <asm/debug.h> 25 - #include <asm/debugfs.h> 26 25 #include <asm/hvcall.h> 27 26 #include <asm/inst.h> 28 27 #include <linux/uaccess.h>
+3 -9
arch/powerpc/kernel/interrupt.c
··· 8 8 #include <asm/asm-prototypes.h> 9 9 #include <asm/kup.h> 10 10 #include <asm/cputime.h> 11 - #include <asm/interrupt.h> 12 11 #include <asm/hw_irq.h> 13 12 #include <asm/interrupt.h> 14 13 #include <asm/kprobes.h> ··· 92 93 CT_WARN_ON(ct_state() == CONTEXT_KERNEL); 93 94 user_exit_irqoff(); 94 95 95 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x)) 96 - BUG_ON(!(regs->msr & MSR_RI)); 96 + BUG_ON(regs_is_unrecoverable(regs)); 97 97 BUG_ON(!(regs->msr & MSR_PR)); 98 98 BUG_ON(arch_irq_disabled_regs(regs)); 99 99 ··· 461 463 { 462 464 unsigned long ret; 463 465 464 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x)) 465 - BUG_ON(!(regs->msr & MSR_RI)); 466 - BUG_ON(!(regs->msr & MSR_PR)); 466 + BUG_ON(regs_is_unrecoverable(regs)); 467 467 BUG_ON(arch_irq_disabled_regs(regs)); 468 468 CT_WARN_ON(ct_state() == CONTEXT_USER); 469 469 ··· 492 496 bool stack_store = current_thread_info()->flags & 493 497 _TIF_EMULATE_STACK_STORE; 494 498 495 - if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x) && 496 - unlikely(!(regs->msr & MSR_RI))) 499 + if (regs_is_unrecoverable(regs)) 497 500 unrecoverable_exception(regs); 498 - BUG_ON(regs->msr & MSR_PR); 499 501 /* 500 502 * CT_WARN_ON comes here via program_check_exception, 501 503 * so avoid recursion.
+31 -30
arch/powerpc/kernel/iommu.c
··· 688 688 if (tbl->it_offset == 0) 689 689 set_bit(0, tbl->it_map); 690 690 691 + if (res_start < tbl->it_offset) 692 + res_start = tbl->it_offset; 693 + 694 + if (res_end > (tbl->it_offset + tbl->it_size)) 695 + res_end = tbl->it_offset + tbl->it_size; 696 + 697 + /* Check if res_start..res_end is a valid range in the table */ 698 + if (res_start >= res_end) { 699 + tbl->it_reserved_start = tbl->it_offset; 700 + tbl->it_reserved_end = tbl->it_offset; 701 + return; 702 + } 703 + 691 704 tbl->it_reserved_start = res_start; 692 705 tbl->it_reserved_end = res_end; 693 706 694 - /* Check if res_start..res_end isn't empty and overlaps the table */ 695 - if (res_start && res_end && 696 - (tbl->it_offset + tbl->it_size < res_start || 697 - res_end < tbl->it_offset)) 698 - return; 699 - 700 707 for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i) 701 708 set_bit(i - tbl->it_offset, tbl->it_map); 702 - } 703 - 704 - static void iommu_table_release_pages(struct iommu_table *tbl) 705 - { 706 - int i; 707 - 708 - /* 709 - * In case we have reserved the first bit, we should not emit 710 - * the warning below. 
711 - */ 712 - if (tbl->it_offset == 0) 713 - clear_bit(0, tbl->it_map); 714 - 715 - for (i = tbl->it_reserved_start; i < tbl->it_reserved_end; ++i) 716 - clear_bit(i - tbl->it_offset, tbl->it_map); 717 709 } 718 710 719 711 /* ··· 769 777 return tbl; 770 778 } 771 779 780 + bool iommu_table_in_use(struct iommu_table *tbl) 781 + { 782 + unsigned long start = 0, end; 783 + 784 + /* ignore reserved bit0 */ 785 + if (tbl->it_offset == 0) 786 + start = 1; 787 + end = tbl->it_reserved_start - tbl->it_offset; 788 + if (find_next_bit(tbl->it_map, end, start) != end) 789 + return true; 790 + 791 + start = tbl->it_reserved_end - tbl->it_offset; 792 + end = tbl->it_size; 793 + return find_next_bit(tbl->it_map, end, start) != end; 794 + } 795 + 772 796 static void iommu_table_free(struct kref *kref) 773 797 { 774 798 struct iommu_table *tbl; ··· 801 793 802 794 iommu_debugfs_del(tbl); 803 795 804 - iommu_table_release_pages(tbl); 805 - 806 796 /* verify that table contains no entries */ 807 - if (!bitmap_empty(tbl->it_map, tbl->it_size)) 797 + if (iommu_table_in_use(tbl)) 808 798 pr_warn("%s: Unexpected TCEs\n", __func__); 809 799 810 800 /* free bitmap */ ··· 1103 1097 for (i = 0; i < tbl->nr_pools; i++) 1104 1098 spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock); 1105 1099 1106 - iommu_table_release_pages(tbl); 1107 - 1108 - if (!bitmap_empty(tbl->it_map, tbl->it_size)) { 1100 + if (iommu_table_in_use(tbl)) { 1109 1101 pr_err("iommu_tce: it_map is not empty"); 1110 1102 ret = -EBUSY; 1111 - /* Undo iommu_table_release_pages, i.e. restore bit#0, etc */ 1112 - iommu_table_reserve_pages(tbl, tbl->it_reserved_start, 1113 - tbl->it_reserved_end); 1114 1103 } else { 1115 1104 memset(tbl->it_map, 0xff, sz); 1116 1105 }
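The iommu.c hunks replace the old release-then-recheck dance with a clamped reserved window plus iommu_table_in_use(), which scans only the bits outside [it_reserved_start, it_reserved_end) and skips the always-reserved bit 0. A small userspace model of that logic, using a byte-per-TCE map instead of a real bitmap (sizes and names are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Model: the reservation is clamped to the table's range, an empty or
 * out-of-range reservation collapses to an empty window at it_offset,
 * and "in use" means any bit set outside the reserved window. */
#define IT_SIZE 64

struct model_tbl {
	unsigned long offset;			/* it_offset */
	unsigned long res_start, res_end;	/* clamped reservation */
	unsigned char map[IT_SIZE];		/* one byte per TCE */
};

static void reserve_pages(struct model_tbl *t,
			  unsigned long res_start, unsigned long res_end)
{
	unsigned long i;

	if (t->offset == 0)
		t->map[0] = 1;			/* bit 0 is always reserved */

	if (res_start < t->offset)
		res_start = t->offset;
	if (res_end > t->offset + IT_SIZE)
		res_end = t->offset + IT_SIZE;

	if (res_start >= res_end) {		/* empty/invalid range */
		t->res_start = t->res_end = t->offset;
		return;
	}
	t->res_start = res_start;
	t->res_end = res_end;
	for (i = res_start; i < res_end; i++)
		t->map[i - t->offset] = 1;
}

static bool table_in_use(const struct model_tbl *t)
{
	unsigned long i, start = (t->offset == 0) ? 1 : 0;

	for (i = start; i < t->res_start - t->offset; i++)	/* below window */
		if (t->map[i])
			return true;
	for (i = t->res_end - t->offset; i < IT_SIZE; i++)	/* above window */
		if (t->map[i])
			return true;
	return false;
}

static bool demo_fresh_table_unused(void)
{
	struct model_tbl t = { .offset = 0 };

	reserve_pages(&t, 10, 20);
	return !table_in_use(&t);	/* reserved bits don't count as in use */
}

static bool demo_mapping_detected(void)
{
	struct model_tbl t = { .offset = 0 };

	reserve_pages(&t, 10, 20);
	t.map[30] = 1;			/* a mapping outside the window */
	return table_in_use(&t);
}
```

Checking around the window rather than clearing and re-setting the reserved bits is what lets iommu_table_free() and the take-ownership path drop the old iommu_table_release_pages() helper.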
+14
arch/powerpc/kernel/kdebugfs.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/debugfs.h> 3 + #include <linux/export.h> 4 + #include <linux/init.h> 5 + 6 + struct dentry *arch_debugfs_dir; 7 + EXPORT_SYMBOL(arch_debugfs_dir); 8 + 9 + static int __init arch_kdebugfs_init(void) 10 + { 11 + arch_debugfs_dir = debugfs_create_dir("powerpc", NULL); 12 + return 0; 13 + } 14 + arch_initcall(arch_kdebugfs_init);
+1 -1
arch/powerpc/kernel/misc.S
··· 29 29 li r3, 0 30 30 _GLOBAL(add_reloc_offset) 31 31 mflr r0 32 - bl 1f 32 + bcl 20,31,$+4 33 33 1: mflr r5 34 34 PPC_LL r4,(2f-1b)(r5) 35 35 subf r5,r4,r5
+2 -2
arch/powerpc/kernel/misc_32.S
··· 67 67 srwi. r8,r8,2 68 68 beqlr 69 69 mtctr r8 70 - bl 1f 70 + bcl 20,31,$+4 71 71 1: mflr r0 72 72 lis r4,1b@ha 73 73 addi r4,r4,1b@l ··· 237 237 addi r3,r3,-4 238 238 239 239 0: twnei r5, 0 /* WARN if r3 is not cache aligned */ 240 - EMIT_BUG_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING 240 + EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING 241 241 242 242 addi r4,r4,-4 243 243
+1 -1
arch/powerpc/kernel/misc_64.S
··· 255 255 * Physical (hardware) cpu id should be in r3. 256 256 */ 257 257 _GLOBAL(kexec_wait) 258 - bl 1f 258 + bcl 20,31,$+4 259 259 1: mflr r5 260 260 addi r5,r5,kexec_flag-1b 261 261
+6
arch/powerpc/kernel/pci-common.c
··· 29 29 #include <linux/slab.h> 30 30 #include <linux/vgaarb.h> 31 31 #include <linux/numa.h> 32 + #include <linux/msi.h> 32 33 33 34 #include <asm/processor.h> 34 35 #include <asm/io.h> ··· 1061 1060 1062 1061 int pcibios_add_device(struct pci_dev *dev) 1063 1062 { 1063 + struct irq_domain *d; 1064 + 1064 1065 #ifdef CONFIG_PCI_IOV 1065 1066 if (ppc_md.pcibios_fixup_sriov) 1066 1067 ppc_md.pcibios_fixup_sriov(dev); 1067 1068 #endif /* CONFIG_PCI_IOV */ 1068 1069 1070 + d = dev_get_msi_domain(&dev->bus->dev); 1071 + if (d) 1072 + dev_set_msi_domain(&dev->dev, d); 1069 1073 return 0; 1070 1074 } 1071 1075
+1 -1
arch/powerpc/kernel/process.c
··· 1499 1499 trap == INTERRUPT_DATA_STORAGE || 1500 1500 trap == INTERRUPT_ALIGNMENT) { 1501 1501 if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE)) 1502 - pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr); 1502 + pr_cont("DEAR: "REG" ESR: "REG" ", regs->dear, regs->esr); 1503 1503 else 1504 1504 pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr); 1505 1505 }
+3 -2
arch/powerpc/kernel/prom.c
··· 640 640 } 641 641 #endif /* CONFIG_BLK_DEV_INITRD */ 642 642 643 - #ifdef CONFIG_PPC32 643 + if (!IS_ENABLED(CONFIG_PPC32)) 644 + return; 645 + 644 646 /* 645 647 * Handle the case where we might be booting from an old kexec 646 648 * image that setup the mem_rsvmap as pairs of 32-bit values ··· 663 661 } 664 662 return; 665 663 } 666 - #endif 667 664 } 668 665 669 666 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+2 -1
arch/powerpc/kernel/prom_init.c
··· 1096 1096 #else 1097 1097 0, 1098 1098 #endif 1099 - .associativity = OV5_FEAT(OV5_TYPE1_AFFINITY) | OV5_FEAT(OV5_PRRN), 1099 + .associativity = OV5_FEAT(OV5_FORM1_AFFINITY) | OV5_FEAT(OV5_PRRN) | 1100 + OV5_FEAT(OV5_FORM2_AFFINITY), 1100 1101 .bin_opts = OV5_FEAT(OV5_RESIZE_HPT) | OV5_FEAT(OV5_HP_EVT), 1101 1102 .micro_checkpoint = 0, 1102 1103 .reserved0 = 0,
+4
arch/powerpc/kernel/ptrace/ptrace.c
··· 373 373 offsetof(struct user_pt_regs, trap)); 374 374 BUILD_BUG_ON(offsetof(struct pt_regs, dar) != 375 375 offsetof(struct user_pt_regs, dar)); 376 + BUILD_BUG_ON(offsetof(struct pt_regs, dear) != 377 + offsetof(struct user_pt_regs, dar)); 376 378 BUILD_BUG_ON(offsetof(struct pt_regs, dsisr) != 379 + offsetof(struct user_pt_regs, dsisr)); 380 + BUILD_BUG_ON(offsetof(struct pt_regs, esr) != 377 381 offsetof(struct user_pt_regs, dsisr)); 378 382 BUILD_BUG_ON(offsetof(struct pt_regs, result) != 379 383 offsetof(struct user_pt_regs, result));
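The new BUILD_BUG_ONs pin pt_regs::dear and ::esr to the same offsets as dar and dsisr in user_pt_regs, matching the asm-offsets.c and exceptions-64e.S hunks that stop overloading DAR/DSISR to hold Book-E's DEAR/ESR. The effect can be modelled with anonymous unions (a sketch only; model_regs is not the kernel's struct):

```c
#include <stddef.h>

/* 4xx/Book-E parts report fault state in DEAR/ESR rather than DAR/DSISR;
 * giving pt_regs both names for the same storage keeps the user-visible
 * layout (and the ptrace ABI) unchanged while letting kernel code use
 * the architecturally correct name. */
struct model_regs {
	unsigned long nip;
	union { unsigned long dar;   unsigned long dear; };
	union { unsigned long dsisr; unsigned long esr;  };
};

static int offsets_alias(void)
{
	/* mirrors what the BUILD_BUG_ON checks above enforce */
	return offsetof(struct model_regs, dar) ==
		       offsetof(struct model_regs, dear) &&
	       offsetof(struct model_regs, dsisr) ==
		       offsetof(struct model_regs, esr);
}

static unsigned long read_back_as_dar(unsigned long fault_addr)
{
	struct model_regs regs = { 0 };

	regs.dear = fault_addr;		/* written under the Book-E name... */
	return regs.dar;		/* ...readable under the classic name */
}
```

This is also why the exceptions-64e.S hunks can switch from `std r14,_DAR(r1)` to `std r14,_DEAR(r1)` without changing the stack frame layout: both offsets resolve to the same slot.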
+1 -1
arch/powerpc/kernel/reloc_32.S
··· 30 30 _GLOBAL(relocate) 31 31 32 32 mflr r0 /* Save our LR */ 33 - bl 0f /* Find our current runtime address */ 33 + bcl 20,31,$+4 /* Find our current runtime address */ 34 34 0: mflr r12 /* Make it accessible */ 35 35 mtlr r0 36 36
+2 -2
arch/powerpc/kernel/rtasd.c
··· 429 429 430 430 do_event_scan(); 431 431 432 - get_online_cpus(); 432 + cpus_read_lock(); 433 433 434 434 /* raw_ OK because just using CPU as starting point. */ 435 435 cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); ··· 451 451 schedule_delayed_work_on(cpu, &event_scan_work, 452 452 __round_jiffies_relative(event_scan_delay, cpu)); 453 453 454 - put_online_cpus(); 454 + cpus_read_unlock(); 455 455 } 456 456 457 457 #ifdef CONFIG_PPC64
+8 -8
arch/powerpc/kernel/security.c
··· 11 11 #include <linux/nospec.h> 12 12 #include <linux/prctl.h> 13 13 #include <linux/seq_buf.h> 14 + #include <linux/debugfs.h> 14 15 15 16 #include <asm/asm-prototypes.h> 16 17 #include <asm/code-patching.h> 17 - #include <asm/debugfs.h> 18 18 #include <asm/security_features.h> 19 19 #include <asm/setup.h> 20 20 #include <asm/inst.h> ··· 106 106 static __init int barrier_nospec_debugfs_init(void) 107 107 { 108 108 debugfs_create_file_unsafe("barrier_nospec", 0600, 109 - powerpc_debugfs_root, NULL, 109 + arch_debugfs_dir, NULL, 110 110 &fops_barrier_nospec); 111 111 return 0; 112 112 } ··· 114 114 115 115 static __init int security_feature_debugfs_init(void) 116 116 { 117 - debugfs_create_x64("security_features", 0400, powerpc_debugfs_root, 117 + debugfs_create_x64("security_features", 0400, arch_debugfs_dir, 118 118 &powerpc_security_features); 119 119 return 0; 120 120 } ··· 420 420 421 421 static __init int stf_barrier_debugfs_init(void) 422 422 { 423 - debugfs_create_file_unsafe("stf_barrier", 0600, powerpc_debugfs_root, 423 + debugfs_create_file_unsafe("stf_barrier", 0600, arch_debugfs_dir, 424 424 NULL, &fops_stf_barrier); 425 425 return 0; 426 426 } ··· 748 748 static __init int count_cache_flush_debugfs_init(void) 749 749 { 750 750 debugfs_create_file_unsafe("count_cache_flush", 0600, 751 - powerpc_debugfs_root, NULL, 751 + arch_debugfs_dir, NULL, 752 752 &fops_count_cache_flush); 753 753 return 0; 754 754 } ··· 834 834 835 835 static __init int rfi_flush_debugfs_init(void) 836 836 { 837 - debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush); 838 - debugfs_create_file("entry_flush", 0600, powerpc_debugfs_root, NULL, &fops_entry_flush); 839 - debugfs_create_file("uaccess_flush", 0600, powerpc_debugfs_root, NULL, &fops_uaccess_flush); 837 + debugfs_create_file("rfi_flush", 0600, arch_debugfs_dir, NULL, &fops_rfi_flush); 838 + debugfs_create_file("entry_flush", 0600, arch_debugfs_dir, NULL, &fops_entry_flush); 839 + 
debugfs_create_file("uaccess_flush", 0600, arch_debugfs_dir, NULL, &fops_uaccess_flush); 840 840 return 0; 841 841 } 842 842 device_initcall(rfi_flush_debugfs_init);
-13
arch/powerpc/kernel/setup-common.c
··· 33 33 #include <linux/of_platform.h> 34 34 #include <linux/hugetlb.h> 35 35 #include <linux/pgtable.h> 36 - #include <asm/debugfs.h> 37 36 #include <asm/io.h> 38 37 #include <asm/paca.h> 39 38 #include <asm/prom.h> ··· 771 772 772 773 late_initcall(check_cache_coherency); 773 774 #endif /* CONFIG_CHECK_CACHE_COHERENCY */ 774 - 775 - #ifdef CONFIG_DEBUG_FS 776 - struct dentry *powerpc_debugfs_root; 777 - EXPORT_SYMBOL(powerpc_debugfs_root); 778 - 779 - static int powerpc_debugfs_init(void) 780 - { 781 - powerpc_debugfs_root = debugfs_create_dir("powerpc", NULL); 782 - return 0; 783 - } 784 - arch_initcall(powerpc_debugfs_init); 785 - #endif 786 775 787 776 void ppc_printk_progress(char *s, unsigned short hex) 788 777 {
-1
arch/powerpc/kernel/setup_64.c
··· 32 32 #include <linux/nmi.h> 33 33 #include <linux/pgtable.h> 34 34 35 - #include <asm/debugfs.h> 36 35 #include <asm/kvm_guest.h> 37 36 #include <asm/io.h> 38 37 #include <asm/kdump.h>
+64 -38
arch/powerpc/kernel/smp.c
··· 78 78 bool has_big_cores; 79 79 bool coregroup_enabled; 80 80 bool thread_group_shares_l2; 81 + bool thread_group_shares_l3; 81 82 82 83 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map); 83 84 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map); ··· 102 101 103 102 #define MAX_THREAD_LIST_SIZE 8 104 103 #define THREAD_GROUP_SHARE_L1 1 105 - #define THREAD_GROUP_SHARE_L2 2 104 + #define THREAD_GROUP_SHARE_L2_L3 2 106 105 struct thread_groups { 107 106 unsigned int property; 108 107 unsigned int nr_groups; ··· 123 122 * On big-cores system, thread_group_l1_cache_map for each CPU corresponds to 124 123 * the set its siblings that share the L1-cache. 125 124 */ 126 - static DEFINE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 125 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); 127 126 128 127 /* 129 128 * On some big-cores system, thread_group_l2_cache_map for each CPU 130 129 * corresponds to the set its siblings within the core that share the 131 130 * L2-cache. 132 131 */ 133 - static DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 132 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); 133 + 134 + /* 135 + * On P10, thread_group_l3_cache_map for each CPU is equal to the 136 + * thread_group_l2_cache_map 137 + */ 138 + DEFINE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map); 134 139 135 140 /* SMP operations for this machine */ 136 141 struct smp_ops_t *smp_ops; ··· 896 889 return tg; 897 890 } 898 891 899 - static int __init init_thread_group_cache_map(int cpu, int cache_property) 900 - 892 + static int update_mask_from_threadgroup(cpumask_var_t *mask, struct thread_groups *tg, int cpu, int cpu_group_start) 901 893 { 902 894 int first_thread = cpu_first_thread_sibling(cpu); 903 - int i, cpu_group_start = -1, err = 0; 904 - struct thread_groups *tg = NULL; 905 - cpumask_var_t *mask = NULL; 906 - 907 - if (cache_property != THREAD_GROUP_SHARE_L1 && 908 - cache_property != THREAD_GROUP_SHARE_L2) 909 - return -EINVAL; 910 - 911 - tg = 
get_thread_groups(cpu, cache_property, &err); 912 - if (!tg) 913 - return err; 914 - 915 - cpu_group_start = get_cpu_thread_group_start(cpu, tg); 916 - 917 - if (unlikely(cpu_group_start == -1)) { 918 - WARN_ON_ONCE(1); 919 - return -ENODATA; 920 - } 921 - 922 - if (cache_property == THREAD_GROUP_SHARE_L1) 923 - mask = &per_cpu(thread_group_l1_cache_map, cpu); 924 - else if (cache_property == THREAD_GROUP_SHARE_L2) 925 - mask = &per_cpu(thread_group_l2_cache_map, cpu); 895 + int i; 926 896 927 897 zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu)); 928 898 ··· 914 930 if (i_group_start == cpu_group_start) 915 931 cpumask_set_cpu(i, *mask); 916 932 } 933 + 934 + return 0; 935 + } 936 + 937 + static int __init init_thread_group_cache_map(int cpu, int cache_property) 938 + 939 + { 940 + int cpu_group_start = -1, err = 0; 941 + struct thread_groups *tg = NULL; 942 + cpumask_var_t *mask = NULL; 943 + 944 + if (cache_property != THREAD_GROUP_SHARE_L1 && 945 + cache_property != THREAD_GROUP_SHARE_L2_L3) 946 + return -EINVAL; 947 + 948 + tg = get_thread_groups(cpu, cache_property, &err); 949 + 950 + if (!tg) 951 + return err; 952 + 953 + cpu_group_start = get_cpu_thread_group_start(cpu, tg); 954 + 955 + if (unlikely(cpu_group_start == -1)) { 956 + WARN_ON_ONCE(1); 957 + return -ENODATA; 958 + } 959 + 960 + if (cache_property == THREAD_GROUP_SHARE_L1) { 961 + mask = &per_cpu(thread_group_l1_cache_map, cpu); 962 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 963 + } 964 + else if (cache_property == THREAD_GROUP_SHARE_L2_L3) { 965 + mask = &per_cpu(thread_group_l2_cache_map, cpu); 966 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 967 + mask = &per_cpu(thread_group_l3_cache_map, cpu); 968 + update_mask_from_threadgroup(mask, tg, cpu, cpu_group_start); 969 + } 970 + 917 971 918 972 return 0; 919 973 } ··· 1042 1020 has_big_cores = true; 1043 1021 1044 1022 for_each_possible_cpu(cpu) { 1045 - int err = init_thread_group_cache_map(cpu, 
THREAD_GROUP_SHARE_L2); 1023 + int err = init_thread_group_cache_map(cpu, THREAD_GROUP_SHARE_L2_L3); 1046 1024 1047 1025 if (err) 1048 1026 return err; 1049 1027 } 1050 1028 1051 1029 thread_group_shares_l2 = true; 1052 - pr_debug("L2 cache only shared by the threads in the small core\n"); 1030 + thread_group_shares_l3 = true; 1031 + pr_debug("L2/L3 cache only shared by the threads in the small core\n"); 1032 + 1053 1033 return 0; 1054 1034 } 1055 1035 ··· 1109 1085 } 1110 1086 1111 1087 if (cpu_to_chip_id(boot_cpuid) != -1) { 1112 - int idx = num_possible_cpus() / threads_per_core; 1088 + int idx = DIV_ROUND_UP(num_possible_cpus(), threads_per_core); 1113 1089 1114 1090 /* 1115 1091 * All threads of a core will all belong to the same core, ··· 1400 1376 l2_cache = cpu_to_l2cache(cpu); 1401 1377 if (!l2_cache || !*mask) { 1402 1378 /* Assume only core siblings share cache with this CPU */ 1403 - for_each_cpu(i, submask_fn(cpu)) 1379 + for_each_cpu(i, cpu_sibling_mask(cpu)) 1404 1380 set_cpus_related(cpu, i, cpu_l2_cache_mask); 1405 1381 1406 1382 return false; ··· 1441 1417 { 1442 1418 struct cpumask *(*mask_fn)(int) = cpu_sibling_mask; 1443 1419 int i; 1420 + 1421 + unmap_cpu_from_node(cpu); 1444 1422 1445 1423 if (shared_caches) 1446 1424 mask_fn = cpu_l2_cache_mask; ··· 1528 1502 * This CPU will not be in the online mask yet so we need to manually 1529 1503 * add it to it's own thread sibling mask. 
1530 1504 */ 1505 + map_cpu_to_node(cpu, cpu_to_node(cpu)); 1531 1506 cpumask_set_cpu(cpu, cpu_sibling_mask(cpu)); 1507 + cpumask_set_cpu(cpu, cpu_core_mask(cpu)); 1532 1508 1533 1509 for (i = first_thread; i < first_thread + threads_per_core; i++) 1534 1510 if (cpu_online(i)) ··· 1548 1520 if (chip_id_lookup_table && ret) 1549 1521 chip_id = cpu_to_chip_id(cpu); 1550 1522 1551 - if (chip_id == -1) { 1552 - cpumask_copy(per_cpu(cpu_core_map, cpu), cpu_cpu_mask(cpu)); 1553 - goto out; 1554 - } 1555 - 1556 1523 if (shared_caches) 1557 1524 submask_fn = cpu_l2_cache_mask; 1558 1525 ··· 1556 1533 1557 1534 /* Skip all CPUs already part of current CPU core mask */ 1558 1535 cpumask_andnot(mask, cpu_online_mask, cpu_core_mask(cpu)); 1536 + 1537 + /* If chip_id is -1; limit the cpu_core_mask to within DIE*/ 1538 + if (chip_id == -1) 1539 + cpumask_and(mask, mask, cpu_cpu_mask(cpu)); 1559 1540 1560 1541 for_each_cpu(i, mask) { 1561 1542 if (chip_id == cpu_to_chip_id(i)) { ··· 1570 1543 } 1571 1544 } 1572 1545 1573 - out: 1574 1546 free_cpumask_var(mask); 1575 1547 } 1576 1548
+1
arch/powerpc/kernel/stacktrace.c
··· 8 8 * Copyright 2018 Nick Piggin, Michael Ellerman, IBM Corp. 9 9 */ 10 10 11 + #include <linux/delay.h> 11 12 #include <linux/export.h> 12 13 #include <linux/kallsyms.h> 13 14 #include <linux/module.h>
+4 -11
arch/powerpc/kernel/syscalls.c
··· 41 41 unsigned long prot, unsigned long flags, 42 42 unsigned long fd, unsigned long off, int shift) 43 43 { 44 - long ret = -EINVAL; 45 - 46 44 if (!arch_validate_prot(prot, addr)) 47 - goto out; 45 + return -EINVAL; 48 46 49 - if (shift) { 50 - if (off & ((1 << shift) - 1)) 51 - goto out; 52 - off >>= shift; 53 - } 47 + if (!IS_ALIGNED(off, 1 << shift)) 48 + return -EINVAL; 54 49 55 - ret = ksys_mmap_pgoff(addr, len, prot, flags, fd, off); 56 - out: 57 - return ret; 50 + return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> shift); 58 51 } 59 52 60 53 SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
+1 -1
arch/powerpc/kernel/tau_6xx.c
··· 164 164 queue_work(tau_workq, work); 165 165 } 166 166 167 - DECLARE_WORK(tau_work, tau_work_func); 167 + static DECLARE_WORK(tau_work, tau_work_func); 168 168 169 169 /* 170 170 * setup the TAU
+1 -2
arch/powerpc/kernel/time.c
··· 31 31 #include <linux/export.h> 32 32 #include <linux/sched.h> 33 33 #include <linux/sched/clock.h> 34 + #include <linux/sched/cputime.h> 34 35 #include <linux/kernel.h> 35 36 #include <linux/param.h> 36 37 #include <linux/string.h> ··· 53 52 #include <linux/irq_work.h> 54 53 #include <linux/of_clk.h> 55 54 #include <linux/suspend.h> 56 - #include <linux/sched/cputime.h> 57 - #include <linux/sched/clock.h> 58 55 #include <linux/processor.h> 59 56 #include <asm/trace.h> 60 57
+14 -9
arch/powerpc/kernel/traps.c
··· 37 37 #include <linux/smp.h> 38 38 #include <linux/console.h> 39 39 #include <linux/kmsg_dump.h> 40 + #include <linux/debugfs.h> 40 41 41 42 #include <asm/emulated_ops.h> 42 43 #include <linux/uaccess.h> 43 - #include <asm/debugfs.h> 44 44 #include <asm/interrupt.h> 45 45 #include <asm/io.h> 46 46 #include <asm/machdep.h> ··· 427 427 return; 428 428 429 429 nonrecoverable: 430 - regs_set_return_msr(regs, regs->msr & ~MSR_RI); 430 + regs_set_unrecoverable(regs); 431 431 #endif 432 432 } 433 433 DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception) ··· 497 497 die("Unrecoverable nested System Reset", regs, SIGABRT); 498 498 #endif 499 499 /* Must die if the interrupt is not recoverable */ 500 - if (!(regs->msr & MSR_RI)) { 500 + if (regs_is_unrecoverable(regs)) { 501 501 /* For the reason explained in die_mce, nmi_exit before die */ 502 502 nmi_exit(); 503 503 die("Unrecoverable System Reset", regs, SIGABRT); ··· 549 549 printk(KERN_DEBUG "%s bad port %lx at %p\n", 550 550 (*nip & 0x100)? "OUT to": "IN from", 551 551 regs->gpr[rb] - _IO_BASE, nip); 552 - regs_set_return_msr(regs, regs->msr | MSR_RI); 552 + regs_set_recoverable(regs); 553 553 regs_set_return_ip(regs, extable_fixup(entry)); 554 554 return 1; 555 555 } ··· 561 561 #ifdef CONFIG_PPC_ADV_DEBUG_REGS 562 562 /* On 4xx, the reason for the machine check or program exception 563 563 is in the ESR. 
*/ 564 - #define get_reason(regs) ((regs)->dsisr) 564 + #define get_reason(regs) ((regs)->esr) 565 565 #define REASON_FP ESR_FP 566 566 #define REASON_ILLEGAL (ESR_PIL | ESR_PUO) 567 567 #define REASON_PRIVILEGED ESR_PPR ··· 839 839 840 840 bail: 841 841 /* Must die if the interrupt is not recoverable */ 842 - if (!(regs->msr & MSR_RI)) 842 + if (regs_is_unrecoverable(regs)) 843 843 die_mce("Unrecoverable Machine check", regs, SIGBUS); 844 844 845 845 #ifdef CONFIG_PPC_BOOK3S_64 ··· 1481 1481 1482 1482 if (!(regs->msr & MSR_PR) && /* not user-mode */ 1483 1483 report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) { 1484 - regs_add_return_ip(regs, 4); 1485 - return; 1484 + const struct exception_table_entry *entry; 1485 + 1486 + entry = search_exception_tables(bugaddr); 1487 + if (entry) { 1488 + regs_set_return_ip(regs, extable_fixup(entry) + regs->nip - bugaddr); 1489 + return; 1490 + } 1486 1491 } 1487 1492 _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); 1488 1493 return; ··· 2276 2271 struct ppc_emulated_entry *entries = (void *)&ppc_emulated; 2277 2272 2278 2273 dir = debugfs_create_dir("emulated_instructions", 2279 - powerpc_debugfs_root); 2274 + arch_debugfs_dir); 2280 2275 2281 2276 debugfs_create_u32("do_warn", 0644, dir, &ppc_warn_emulated); 2282 2277
+1 -3
arch/powerpc/kernel/vector.S
··· 65 65 1: 66 66 /* enable use of VMX after return */ 67 67 #ifdef CONFIG_PPC32 68 - mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */ 68 + addi r5,r2,THREAD 69 69 oris r9,r9,MSR_VEC@h 70 - tovirt(r5, r5) 71 70 #else 72 71 ld r4,PACACURRENT(r13) 73 72 addi r5,r4,THREAD /* Get THREAD */ ··· 80 81 li r4,1 81 82 stb r4,THREAD_LOAD_VEC(r5) 82 83 addi r6,r5,THREAD_VRSTATE 83 - li r4,1 84 84 li r10,VRSTATE_VSCR 85 85 stw r4,THREAD_USED_VR(r5) 86 86 lvx v0,r10,r6
+7 -3
arch/powerpc/kexec/core_64.c
··· 64 64 begin = image->segment[i].mem; 65 65 end = begin + image->segment[i].memsz; 66 66 67 - if ((begin < high) && (end > low)) 67 + if ((begin < high) && (end > low)) { 68 + of_node_put(node); 68 69 return -ETXTBSY; 70 + } 69 71 } 70 72 } 71 73 72 74 return 0; 73 75 } 74 76 75 - static void copy_segments(unsigned long ind) 77 + /* Called during kexec sequence with MMU off */ 78 + static notrace void copy_segments(unsigned long ind) 76 79 { 77 80 unsigned long entry; 78 81 unsigned long *ptr; ··· 108 105 } 109 106 } 110 107 111 - void kexec_copy_flush(struct kimage *image) 108 + /* Called during kexec sequence with MMU off */ 109 + notrace void kexec_copy_flush(struct kimage *image) 112 110 { 113 111 long i, nr_segments = image->nr_segments; 114 112 struct kexec_segment ranges[KEXEC_SEGMENT_MAX];
+6 -6
arch/powerpc/kexec/relocate_32.S
··· 93 93 * Invalidate all the TLB entries except the current entry 94 94 * where we are running from 95 95 */ 96 - bl 0f /* Find our address */ 96 + bcl 20,31,$+4 /* Find our address */ 97 97 0: mflr r5 /* Make it accessible */ 98 98 tlbsx r23,0,r5 /* Find entry we are in */ 99 99 li r4,0 /* Start at TLB entry 0 */ ··· 158 158 /* Switch to other address space in MSR */ 159 159 insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */ 160 160 161 - bl 1f 161 + bcl 20,31,$+4 162 162 1: mflr r8 163 163 addi r8, r8, (2f-1b) /* Find the target offset */ 164 164 ··· 202 202 li r9,0 203 203 insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */ 204 204 205 - bl 1f 205 + bcl 20,31,$+4 206 206 1: mflr r8 207 207 and r8, r8, r11 /* Get our offset within page */ 208 208 addi r8, r8, (2f-1b) ··· 240 240 sync 241 241 242 242 /* Find the entry we are running from */ 243 - bl 2f 243 + bcl 20,31,$+4 244 244 2: mflr r23 245 245 tlbsx r23, 0, r23 246 246 tlbre r24, r23, 0 /* TLB Word 0 */ ··· 296 296 /* Update the msr to the new TS */ 297 297 insrwi r5, r7, 1, 26 298 298 299 - bl 1f 299 + bcl 20,31,$+4 300 300 1: mflr r6 301 301 addi r6, r6, (2f-1b) 302 302 ··· 355 355 /* Defaults to 256M */ 356 356 lis r10, 0x1000 357 357 358 - bl 1f 358 + bcl 20,31,$+4 359 359 1: mflr r4 360 360 addi r4, r4, (2f-1b) /* virtual address of 2f */ 361 361
-1
arch/powerpc/kvm/Kconfig
··· 38 38 config KVM_BOOK3S_64_HANDLER 39 39 bool 40 40 select KVM_BOOK3S_HANDLER 41 - select PPC_DAWR_FORCE_ENABLE 42 41 43 42 config KVM_BOOK3S_PR_POSSIBLE 44 43 bool
+2 -1
arch/powerpc/kvm/book3s.h
··· 23 23 extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, 24 24 int sprn, ulong *spr_val); 25 25 extern int kvmppc_book3s_init_pr(void); 26 - extern void kvmppc_book3s_exit_pr(void); 26 + void kvmppc_book3s_exit_pr(void); 27 + extern int kvmppc_handle_exit_pr(struct kvm_vcpu *vcpu, unsigned int exit_nr); 27 28 28 29 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 29 30 extern void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val);
+1 -2
arch/powerpc/kvm/book3s_64_mmu.c
··· 196 196 hva_t ptegp; 197 197 u64 pteg[16]; 198 198 u64 avpn = 0; 199 - u64 v, r; 199 + u64 r; 200 200 u64 v_val, v_mask; 201 201 u64 eaddr_mask; 202 202 int i; ··· 285 285 goto do_second; 286 286 } 287 287 288 - v = be64_to_cpu(pteg[i]); 289 288 r = be64_to_cpu(pteg[i+1]); 290 289 pp = (r & HPTE_R_PP) | key; 291 290 if (r & HPTE_R_PP0)
+7 -5
arch/powerpc/kvm/book3s_64_mmu_radix.c
··· 44 44 (to != NULL) ? __pa(to): 0, 45 45 (from != NULL) ? __pa(from): 0, n); 46 46 47 + if (eaddr & (0xFFFUL << 52)) 48 + return ret; 49 + 47 50 quadrant = 1; 48 51 if (!pid) 49 52 quadrant = 2; ··· 68 65 } 69 66 isync(); 70 67 68 + pagefault_disable(); 71 69 if (is_load) 72 - ret = copy_from_user_nofault(to, (const void __user *)from, n); 70 + ret = __copy_from_user_inatomic(to, (const void __user *)from, n); 73 71 else 74 - ret = copy_to_user_nofault((void __user *)to, from, n); 72 + ret = __copy_to_user_inatomic((void __user *)to, from, n); 73 + pagefault_enable(); 75 74 76 75 /* switch the pid first to avoid running host with unallocated pid */ 77 76 if (quadrant == 1 && pid != old_pid) ··· 86 81 87 82 return ret; 88 83 } 89 - EXPORT_SYMBOL_GPL(__kvmhv_copy_tofrom_guest_radix); 90 84 91 85 static long kvmhv_copy_tofrom_guest_radix(struct kvm_vcpu *vcpu, gva_t eaddr, 92 86 void *to, void *from, unsigned long n) ··· 121 117 122 118 return ret; 123 119 } 124 - EXPORT_SYMBOL_GPL(kvmhv_copy_from_guest_radix); 125 120 126 121 long kvmhv_copy_to_guest_radix(struct kvm_vcpu *vcpu, gva_t eaddr, void *from, 127 122 unsigned long n) 128 123 { 129 124 return kvmhv_copy_tofrom_guest_radix(vcpu, eaddr, NULL, from, n); 130 125 } 131 - EXPORT_SYMBOL_GPL(kvmhv_copy_to_guest_radix); 132 126 133 127 int kvmppc_mmu_walk_radix_tree(struct kvm_vcpu *vcpu, gva_t eaddr, 134 128 struct kvmppc_pte *gpte, u64 root,
+6 -3
arch/powerpc/kvm/book3s_64_vio_hv.c
··· 173 173 idx -= stt->offset; 174 174 page = stt->pages[idx / TCES_PER_PAGE]; 175 175 /* 176 - * page must not be NULL in real mode, 177 - * kvmppc_rm_ioba_validate() must have taken care of this. 176 + * kvmppc_rm_ioba_validate() allows pages not be allocated if TCE is 177 + * being cleared, otherwise it returns H_TOO_HARD and we skip this. 178 178 */ 179 - WARN_ON_ONCE_RM(!page); 179 + if (!page) { 180 + WARN_ON_ONCE_RM(tce != 0); 181 + return; 182 + } 180 183 tbl = kvmppc_page_address(page); 181 184 182 185 tbl[idx % TCES_PER_PAGE] = tce;
+86 -22
arch/powerpc/kvm/book3s_hv.c
··· 59 59 #include <asm/kvm_book3s.h> 60 60 #include <asm/mmu_context.h> 61 61 #include <asm/lppaca.h> 62 + #include <asm/pmc.h> 62 63 #include <asm/processor.h> 63 64 #include <asm/cputhreads.h> 64 65 #include <asm/page.h> ··· 1166 1165 break; 1167 1166 #endif 1168 1167 case H_RANDOM: 1169 - if (!powernv_get_random_long(&vcpu->arch.regs.gpr[4])) 1168 + if (!arch_get_random_seed_long(&vcpu->arch.regs.gpr[4])) 1170 1169 ret = H_HARDWARE; 1171 1170 break; 1172 1171 case H_RPT_INVALIDATE: ··· 1680 1679 r = RESUME_GUEST; 1681 1680 } 1682 1681 break; 1682 + 1683 + #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 1684 + case BOOK3S_INTERRUPT_HV_SOFTPATCH: 1685 + /* 1686 + * This occurs for various TM-related instructions that 1687 + * we need to emulate on POWER9 DD2.2. We have already 1688 + * handled the cases where the guest was in real-suspend 1689 + * mode and was transitioning to transactional state. 1690 + */ 1691 + r = kvmhv_p9_tm_emulation(vcpu); 1692 + if (r != -1) 1693 + break; 1694 + fallthrough; /* go to facility unavailable handler */ 1695 + #endif 1696 + 1683 1697 /* 1684 1698 * This occurs if the guest (kernel or userspace), does something that 1685 1699 * is prohibited by HFSCR. ··· 1712 1696 r = RESUME_GUEST; 1713 1697 } 1714 1698 break; 1715 - 1716 - #ifdef CONFIG_PPC_TRANSACTIONAL_MEM 1717 - case BOOK3S_INTERRUPT_HV_SOFTPATCH: 1718 - /* 1719 - * This occurs for various TM-related instructions that 1720 - * we need to emulate on POWER9 DD2.2. We have already 1721 - * handled the cases where the guest was in real-suspend 1722 - * mode and was transitioning to transactional state. 
1723 - */ 1724 - r = kvmhv_p9_tm_emulation(vcpu); 1725 - break; 1726 - #endif 1727 1699 1728 1700 case BOOK3S_INTERRUPT_HV_RM_HARD: 1729 1701 r = RESUME_PASSTHROUGH; ··· 1731 1727 1732 1728 static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu) 1733 1729 { 1730 + struct kvm_nested_guest *nested = vcpu->arch.nested; 1734 1731 int r; 1735 1732 int srcu_idx; 1736 1733 ··· 1816 1811 * mode and was transitioning to transactional state. 1817 1812 */ 1818 1813 r = kvmhv_p9_tm_emulation(vcpu); 1819 - break; 1814 + if (r != -1) 1815 + break; 1816 + fallthrough; /* go to facility unavailable handler */ 1820 1817 #endif 1818 + 1819 + case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: { 1820 + u64 cause = vcpu->arch.hfscr >> 56; 1821 + 1822 + /* 1823 + * Only pass HFU interrupts to the L1 if the facility is 1824 + * permitted but disabled by the L1's HFSCR, otherwise 1825 + * the interrupt does not make sense to the L1 so turn 1826 + * it into a HEAI. 1827 + */ 1828 + if (!(vcpu->arch.hfscr_permitted & (1UL << cause)) || 1829 + (nested->hfscr & (1UL << cause))) { 1830 + vcpu->arch.trap = BOOK3S_INTERRUPT_H_EMUL_ASSIST; 1831 + 1832 + /* 1833 + * If the fetch failed, return to guest and 1834 + * try executing it again. 
1835 + */ 1836 + r = kvmppc_get_last_inst(vcpu, INST_GENERIC, 1837 + &vcpu->arch.emul_inst); 1838 + if (r != EMULATE_DONE) 1839 + r = RESUME_GUEST; 1840 + else 1841 + r = RESUME_HOST; 1842 + } else { 1843 + r = RESUME_HOST; 1844 + } 1845 + 1846 + break; 1847 + } 1821 1848 1822 1849 case BOOK3S_INTERRUPT_HV_RM_HARD: 1823 1850 vcpu->arch.trap = 0; ··· 2721 2684 spin_lock_init(&vcpu->arch.vpa_update_lock); 2722 2685 spin_lock_init(&vcpu->arch.tbacct_lock); 2723 2686 vcpu->arch.busy_preempt = TB_NIL; 2687 + vcpu->arch.shregs.msr = MSR_ME; 2724 2688 vcpu->arch.intr_msr = MSR_SF | MSR_ME; 2725 2689 2726 2690 /* ··· 2742 2704 } 2743 2705 if (cpu_has_feature(CPU_FTR_TM_COMP)) 2744 2706 vcpu->arch.hfscr |= HFSCR_TM; 2707 + 2708 + vcpu->arch.hfscr_permitted = vcpu->arch.hfscr; 2745 2709 2746 2710 kvmppc_mmu_book3s_hv_init(vcpu); 2747 2711 ··· 3767 3727 mtspr(SPRN_EBBHR, vcpu->arch.ebbhr); 3768 3728 mtspr(SPRN_EBBRR, vcpu->arch.ebbrr); 3769 3729 mtspr(SPRN_BESCR, vcpu->arch.bescr); 3770 - mtspr(SPRN_WORT, vcpu->arch.wort); 3771 3730 mtspr(SPRN_TIDR, vcpu->arch.tid); 3772 3731 mtspr(SPRN_AMR, vcpu->arch.amr); 3773 3732 mtspr(SPRN_UAMOR, vcpu->arch.uamor); ··· 3793 3754 vcpu->arch.ebbhr = mfspr(SPRN_EBBHR); 3794 3755 vcpu->arch.ebbrr = mfspr(SPRN_EBBRR); 3795 3756 vcpu->arch.bescr = mfspr(SPRN_BESCR); 3796 - vcpu->arch.wort = mfspr(SPRN_WORT); 3797 3757 vcpu->arch.tid = mfspr(SPRN_TIDR); 3798 3758 vcpu->arch.amr = mfspr(SPRN_AMR); 3799 3759 vcpu->arch.uamor = mfspr(SPRN_UAMOR); ··· 3824 3786 struct p9_host_os_sprs *host_os_sprs) 3825 3787 { 3826 3788 mtspr(SPRN_PSPB, 0); 3827 - mtspr(SPRN_WORT, 0); 3828 3789 mtspr(SPRN_UAMOR, 0); 3829 3790 3830 3791 mtspr(SPRN_DSCR, host_os_sprs->dscr); ··· 3889 3852 cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) 3890 3853 kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true); 3891 3854 3855 + #ifdef CONFIG_PPC_PSERIES 3856 + if (kvmhv_on_pseries()) { 3857 + barrier(); 3858 + if (vcpu->arch.vpa.pinned_addr) { 3859 + struct lppaca *lp = 
vcpu->arch.vpa.pinned_addr; 3860 + get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use; 3861 + } else { 3862 + get_lppaca()->pmcregs_in_use = 1; 3863 + } 3864 + barrier(); 3865 + } 3866 + #endif 3892 3867 kvmhv_load_guest_pmu(vcpu); 3893 3868 3894 3869 msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX); ··· 4035 3986 save_pmu |= nesting_enabled(vcpu->kvm); 4036 3987 4037 3988 kvmhv_save_guest_pmu(vcpu, save_pmu); 3989 + #ifdef CONFIG_PPC_PSERIES 3990 + if (kvmhv_on_pseries()) { 3991 + barrier(); 3992 + get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse(); 3993 + barrier(); 3994 + } 3995 + #endif 4038 3996 4039 3997 vc->entry_exit_map = 0x101; 4040 3998 vc->in_guest = 0; ··· 5384 5328 struct kvmppc_passthru_irqmap *pimap; 5385 5329 struct irq_chip *chip; 5386 5330 int i, rc = 0; 5331 + struct irq_data *host_data; 5387 5332 5388 5333 if (!kvm_irq_bypass) 5389 5334 return 1; ··· 5412 5355 * what our real-mode EOI code does, or a XIVE interrupt 5413 5356 */ 5414 5357 chip = irq_data_get_irq_chip(&desc->irq_data); 5415 - if (!chip || !(is_pnv_opal_msi(chip) || is_xive_irq(chip))) { 5358 + if (!chip || !is_pnv_opal_msi(chip)) { 5416 5359 pr_warn("kvmppc_set_passthru_irq_hv: Could not assign IRQ map for (%d,%d)\n", 5417 5360 host_irq, guest_gsi); 5418 5361 mutex_unlock(&kvm->lock); ··· 5449 5392 * the KVM real mode handler. 5450 5393 */ 5451 5394 smp_wmb(); 5452 - irq_map->r_hwirq = desc->irq_data.hwirq; 5395 + 5396 + /* 5397 + * The 'host_irq' number is mapped in the PCI-MSI domain but 5398 + * the underlying calls, which will EOI the interrupt in real 5399 + * mode, need an HW IRQ number mapped in the XICS IRQ domain. 
5400 + */ 5401 + host_data = irq_domain_get_irq_data(irq_get_default_host(), host_irq); 5402 + irq_map->r_hwirq = (unsigned int)irqd_to_hwirq(host_data); 5453 5403 5454 5404 if (i == pimap->n_mapped) 5455 5405 pimap->n_mapped++; 5456 5406 5457 5407 if (xics_on_xive()) 5458 - rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc); 5408 + rc = kvmppc_xive_set_mapped(kvm, guest_gsi, host_irq); 5459 5409 else 5460 - kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq); 5410 + kvmppc_xics_set_mapped(kvm, guest_gsi, irq_map->r_hwirq); 5461 5411 if (rc) 5462 5412 irq_map->r_hwirq = 0; 5463 5413 ··· 5503 5439 } 5504 5440 5505 5441 if (xics_on_xive()) 5506 - rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, pimap->mapped[i].desc); 5442 + rc = kvmppc_xive_clr_mapped(kvm, guest_gsi, host_irq); 5507 5443 else 5508 5444 kvmppc_xics_clr_mapped(kvm, guest_gsi, pimap->mapped[i].r_hwirq); 5509 5445
+5 -5
arch/powerpc/kvm/book3s_hv_builtin.c
··· 137 137 * exist in the system. We use a counter of VMs to track this. 138 138 * 139 139 * One of the operations we need to block is onlining of secondaries, so we 140 - * protect hv_vm_count with get/put_online_cpus(). 140 + * protect hv_vm_count with cpus_read_lock/unlock(). 141 141 */ 142 142 static atomic_t hv_vm_count; 143 143 144 144 void kvm_hv_vm_activated(void) 145 145 { 146 - get_online_cpus(); 146 + cpus_read_lock(); 147 147 atomic_inc(&hv_vm_count); 148 - put_online_cpus(); 148 + cpus_read_unlock(); 149 149 } 150 150 EXPORT_SYMBOL_GPL(kvm_hv_vm_activated); 151 151 152 152 void kvm_hv_vm_deactivated(void) 153 153 { 154 - get_online_cpus(); 154 + cpus_read_lock(); 155 155 atomic_dec(&hv_vm_count); 156 - put_online_cpus(); 156 + cpus_read_unlock(); 157 157 } 158 158 EXPORT_SYMBOL_GPL(kvm_hv_vm_deactivated); 159 159
+50 -51
arch/powerpc/kvm/book3s_hv_nested.c
··· 99 99 hr->dawrx1 = swab64(hr->dawrx1); 100 100 } 101 101 102 - static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap, 102 + static void save_hv_return_state(struct kvm_vcpu *vcpu, 103 103 struct hv_guest_state *hr) 104 104 { 105 105 struct kvmppc_vcore *vc = vcpu->arch.vcore; 106 106 107 107 hr->dpdes = vc->dpdes; 108 - hr->hfscr = vcpu->arch.hfscr; 109 108 hr->purr = vcpu->arch.purr; 110 109 hr->spurr = vcpu->arch.spurr; 111 110 hr->ic = vcpu->arch.ic; ··· 118 119 hr->pidr = vcpu->arch.pid; 119 120 hr->cfar = vcpu->arch.cfar; 120 121 hr->ppr = vcpu->arch.ppr; 121 - switch (trap) { 122 + switch (vcpu->arch.trap) { 122 123 case BOOK3S_INTERRUPT_H_DATA_STORAGE: 123 124 hr->hdar = vcpu->arch.fault_dar; 124 125 hr->hdsisr = vcpu->arch.fault_dsisr; ··· 127 128 case BOOK3S_INTERRUPT_H_INST_STORAGE: 128 129 hr->asdr = vcpu->arch.fault_gpa; 129 130 break; 131 + case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: 132 + hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) | 133 + (HFSCR_INTR_CAUSE & vcpu->arch.hfscr)); 134 + break; 130 135 case BOOK3S_INTERRUPT_H_EMUL_ASSIST: 131 136 hr->heir = vcpu->arch.emul_inst; 132 137 break; 133 138 } 134 139 } 135 140 136 - /* 137 - * This can result in some L0 HV register state being leaked to an L1 138 - * hypervisor when the hv_guest_state is copied back to the guest after 139 - * being modified here. 140 - * 141 - * There is no known problem with such a leak, and in many cases these 142 - * register settings could be derived by the guest by observing behaviour 143 - * and timing, interrupts, etc., but it is an issue to consider. 
144 - */ 145 - static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr) 146 - { 147 - struct kvmppc_vcore *vc = vcpu->arch.vcore; 148 - u64 mask; 149 - 150 - /* 151 - * Don't let L1 change LPCR bits for the L2 except these: 152 - */ 153 - mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | 154 - LPCR_LPES | LPCR_MER; 155 - 156 - /* 157 - * Additional filtering is required depending on hardware 158 - * and configuration. 159 - */ 160 - hr->lpcr = kvmppc_filter_lpcr_hv(vcpu->kvm, 161 - (vc->lpcr & ~mask) | (hr->lpcr & mask)); 162 - 163 - /* 164 - * Don't let L1 enable features for L2 which we've disabled for L1, 165 - * but preserve the interrupt cause field. 166 - */ 167 - hr->hfscr &= (HFSCR_INTR_CAUSE | vcpu->arch.hfscr); 168 - 169 - /* Don't let data address watchpoint match in hypervisor state */ 170 - hr->dawrx0 &= ~DAWRX_HYP; 171 - hr->dawrx1 &= ~DAWRX_HYP; 172 - 173 - /* Don't let completed instruction address breakpt match in HV state */ 174 - if ((hr->ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER) 175 - hr->ciabr &= ~CIABR_PRIV; 176 - } 177 - 178 - static void restore_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr) 141 + static void restore_hv_regs(struct kvm_vcpu *vcpu, const struct hv_guest_state *hr) 179 142 { 180 143 struct kvmppc_vcore *vc = vcpu->arch.vcore; 181 144 ··· 249 288 sizeof(struct pt_regs)); 250 289 } 251 290 291 + static void load_l2_hv_regs(struct kvm_vcpu *vcpu, 292 + const struct hv_guest_state *l2_hv, 293 + const struct hv_guest_state *l1_hv, u64 *lpcr) 294 + { 295 + struct kvmppc_vcore *vc = vcpu->arch.vcore; 296 + u64 mask; 297 + 298 + restore_hv_regs(vcpu, l2_hv); 299 + 300 + /* 301 + * Don't let L1 change LPCR bits for the L2 except these: 302 + */ 303 + mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | 304 + LPCR_LPES | LPCR_MER; 305 + 306 + /* 307 + * Additional filtering is required depending on hardware 308 + * and configuration. 
309 + */ 310 + *lpcr = kvmppc_filter_lpcr_hv(vcpu->kvm, 311 + (vc->lpcr & ~mask) | (*lpcr & mask)); 312 + 313 + /* 314 + * Don't let L1 enable features for L2 which we don't allow for L1, 315 + * but preserve the interrupt cause field. 316 + */ 317 + vcpu->arch.hfscr = l2_hv->hfscr & (HFSCR_INTR_CAUSE | vcpu->arch.hfscr_permitted); 318 + 319 + /* Don't let data address watchpoint match in hypervisor state */ 320 + vcpu->arch.dawrx0 = l2_hv->dawrx0 & ~DAWRX_HYP; 321 + vcpu->arch.dawrx1 = l2_hv->dawrx1 & ~DAWRX_HYP; 322 + 323 + /* Don't let completed instruction address breakpt match in HV state */ 324 + if ((l2_hv->ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER) 325 + vcpu->arch.ciabr = l2_hv->ciabr & ~CIABR_PRIV; 326 + } 327 + 252 328 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) 253 329 { 254 330 long int err, r; ··· 294 296 struct hv_guest_state l2_hv = {0}, saved_l1_hv; 295 297 struct kvmppc_vcore *vc = vcpu->arch.vcore; 296 298 u64 hv_ptr, regs_ptr; 297 - u64 hdec_exp; 299 + u64 hdec_exp, lpcr; 298 300 s64 delta_purr, delta_spurr, delta_ic, delta_vtb; 299 301 300 302 if (vcpu->kvm->arch.l1_ptcr == 0) ··· 362 364 /* set L1 state to L2 state */ 363 365 vcpu->arch.nested = l2; 364 366 vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token; 367 + l2->hfscr = l2_hv.hfscr; 365 368 vcpu->arch.regs = l2_regs; 366 369 367 370 /* Guest must always run with ME enabled, HV disabled. 
*/ 368 371 vcpu->arch.shregs.msr = (vcpu->arch.regs.msr | MSR_ME) & ~MSR_HV; 369 372 370 - sanitise_hv_regs(vcpu, &l2_hv); 371 - restore_hv_regs(vcpu, &l2_hv); 373 + lpcr = l2_hv.lpcr; 374 + load_l2_hv_regs(vcpu, &l2_hv, &saved_l1_hv, &lpcr); 372 375 373 376 vcpu->arch.ret = RESUME_GUEST; 374 377 vcpu->arch.trap = 0; ··· 379 380 r = RESUME_HOST; 380 381 break; 381 382 } 382 - r = kvmhv_run_single_vcpu(vcpu, hdec_exp, l2_hv.lpcr); 383 + r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr); 383 384 } while (is_kvmppc_resume_guest(r)); 384 385 385 386 /* save L2 state for return */ ··· 389 390 delta_spurr = vcpu->arch.spurr - l2_hv.spurr; 390 391 delta_ic = vcpu->arch.ic - l2_hv.ic; 391 392 delta_vtb = vc->vtb - l2_hv.vtb; 392 - save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv); 393 + save_hv_return_state(vcpu, &l2_hv); 393 394 394 395 /* restore L1 state */ 395 396 vcpu->arch.nested = NULL;
+4 -4
arch/powerpc/kvm/book3s_hv_rm_xics.c
··· 706 706 icp->rm_eoied_irq = irq; 707 707 } 708 708 709 + /* Handle passthrough interrupts */ 709 710 if (state->host_irq) { 710 711 ++vcpu->stat.pthru_all; 711 712 if (state->intr_cpu != -1) { ··· 760 759 761 760 static unsigned long eoi_rc; 762 761 763 - static void icp_eoi(struct irq_chip *c, u32 hwirq, __be32 xirr, bool *again) 762 + static void icp_eoi(struct irq_data *d, u32 hwirq, __be32 xirr, bool *again) 764 763 { 765 764 void __iomem *xics_phys; 766 765 int64_t rc; 767 766 768 - rc = pnv_opal_pci_msi_eoi(c, hwirq); 767 + rc = pnv_opal_pci_msi_eoi(d); 769 768 770 769 if (rc) 771 770 eoi_rc = rc; ··· 873 872 icp_rm_deliver_irq(xics, icp, irq, false); 874 873 875 874 /* EOI the interrupt */ 876 - icp_eoi(irq_desc_get_chip(irq_map->desc), irq_map->r_hwirq, xirr, 877 - again); 875 + icp_eoi(irq_desc_get_irq_data(irq_map->desc), irq_map->r_hwirq, xirr, again); 878 876 879 877 if (check_too_hard(xics, icp) == H_TOO_HARD) 880 878 return 2;
arch/powerpc/kvm/book3s_hv_rmhandlers.S (-42)
···
 	cmpwi	r12, BOOK3S_INTERRUPT_H_INST_STORAGE
 	beq	kvmppc_hisi
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-	/* For softpatch interrupt, go off and do TM instruction emulation */
-	cmpwi	r12, BOOK3S_INTERRUPT_HV_SOFTPATCH
-	beq	kvmppc_tm_emul
-#endif
-
 	/* See if this is a leftover HDEC interrupt */
 	cmpwi	r12,BOOK3S_INTERRUPT_HV_DECREMENTER
 	bne	2f
···
 	mr	r4, r9
 	blt	deliver_guest_interrupt
 	b	guest_exit_cont
-
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-/*
- * Softpatch interrupt for transactional memory emulation cases
- * on POWER9 DD2.2. This is early in the guest exit path - we
- * haven't saved registers or done a treclaim yet.
- */
-kvmppc_tm_emul:
-	/* Save instruction image in HEIR */
-	mfspr	r3, SPRN_HEIR
-	stw	r3, VCPU_HEIR(r9)
-
-	/*
-	 * The cases we want to handle here are those where the guest
-	 * is in real suspend mode and is trying to transition to
-	 * transactional mode.
-	 */
-	lbz	r0, HSTATE_FAKE_SUSPEND(r13)
-	cmpwi	r0, 0		/* keep exiting guest if in fake suspend */
-	bne	guest_exit_cont
-	rldicl	r3, r11, 64 - MSR_TS_S_LG, 62
-	cmpwi	r3, 1		/* or if not in suspend state */
-	bne	guest_exit_cont
-
-	/* Call C code to do the emulation */
-	mr	r3, r9
-	bl	kvmhv_p9_tm_emulation_early
-	nop
-	ld	r9, HSTATE_KVM_VCPU(r13)
-	li	r12, BOOK3S_INTERRUPT_HV_SOFTPATCH
-	cmpwi	r3, 0
-	beq	guest_exit_cont		/* continue exiting if not handled */
-	ld	r10, VCPU_PC(r9)
-	ld	r11, VCPU_MSR(r9)
-	b	fast_interrupt_c_return	/* go back to guest if handled */
-#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
 /*
  * Check whether an HDSI is an HPTE not found fault or something else.
arch/powerpc/kvm/book3s_hv_tm.c (+39 -22)
···
 	int ra, rs;
 
 	/*
+	 * The TM softpatch interrupt sets NIP to the instruction following
+	 * the faulting instruction, which is not executed. Rewind nip to the
+	 * faulting instruction so it looks like a normal synchronous
+	 * interrupt, then update nip in the places where the instruction is
+	 * emulated.
+	 */
+	vcpu->arch.regs.nip -= 4;
+
+	/*
 	 * rfid, rfebb, and mtmsrd encode bit 31 = 0 since it's a reserved bit
 	 * in these instructions, so masking bit 31 out doesn't change these
 	 * instructions. For treclaim., tsr., and trechkpt. instructions if bit
···
 			       (newmsr & MSR_TM)));
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
-		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.cfar = vcpu->arch.regs.nip;
 		vcpu->arch.regs.nip = vcpu->arch.shregs.srr0;
 		return RESUME_GUEST;
···
 		}
 		/* check EBB facility is available */
 		if (!(vcpu->arch.hfscr & HFSCR_EBB)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_EBB_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if ((msr & MSR_PR) && !(vcpu->arch.fscr & FSCR_EBB)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_EBB_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_EBB_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
 		}
···
 		vcpu->arch.bescr = bescr;
 		msr = (msr & ~MSR_TS_MASK) | MSR_TS_T;
 		vcpu->arch.shregs.msr = msr;
-		vcpu->arch.cfar = vcpu->arch.regs.nip - 4;
+		vcpu->arch.cfar = vcpu->arch.regs.nip;
 		vcpu->arch.regs.nip = vcpu->arch.ebbrr;
 		return RESUME_GUEST;
···
 		newmsr = (newmsr & ~MSR_LE) | (msr & MSR_LE);
 		newmsr = sanitize_msr(newmsr);
 		vcpu->arch.shregs.msr = newmsr;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
···
 		}
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 			msr = (msr & ~MSR_TS_MASK) | MSR_TS_S;
 		}
 		vcpu->arch.shregs.msr = msr;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
 	case (PPC_INST_TRECLAIM & PO_XOP_OPCODE_MASK):
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 		vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) |
 			(((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29);
 		vcpu->arch.shregs.msr &= ~MSR_TS_MASK;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 
 	/* ignore bit 31, see comment above */
···
 		/* XXX do we need to check for PR=0 here? */
 		/* check for TM disabled in the HFSCR or MSR */
 		if (!(vcpu->arch.hfscr & HFSCR_TM)) {
-			/* generate an illegal instruction interrupt */
-			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-			return RESUME_GUEST;
+			vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE;
+			vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56;
+			vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL;
+			return -1; /* rerun host interrupt handler */
 		}
 		if (!(msr & MSR_TM)) {
 			/* generate a facility unavailable interrupt */
-			vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) |
-				((u64)FSCR_TM_LG << 56);
+			vcpu->arch.fscr &= ~FSCR_INTR_CAUSE;
+			vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56;
 			kvmppc_book3s_queue_irqprio(vcpu,
 					BOOK3S_INTERRUPT_FAC_UNAVAIL);
 			return RESUME_GUEST;
···
 		vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) |
 			(((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29);
 		vcpu->arch.shregs.msr = msr | MSR_TS_S;
+		vcpu->arch.regs.nip += 4;
 		return RESUME_GUEST;
 	}
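The file above adopts a rewind-then-advance convention: nip is stepped back by one 4-byte instruction on entry, and stepped forward again only on the paths that actually emulate the instruction, so failure paths naturally report the faulting address. A standalone sketch of that convention; `demo_softpatch` and its argument are hypothetical stand-ins:

```c
#include <assert.h>
#include <stdint.h>

struct demo_vcpu { uint64_t nip; };

/* The interrupt hardware left nip past the faulting instruction.
 * Rewind first, so any bail-out path names the culprit; advance
 * only when the instruction was actually emulated. */
static int demo_softpatch(struct demo_vcpu *v, int can_emulate)
{
	v->nip -= 4;		/* point at the faulting instruction */
	if (!can_emulate)
		return -1;	/* caller sees the faulting address */
	/* ... emulation of the instruction would happen here ... */
	v->nip += 4;		/* step past it, like normal execution */
	return 0;
}
```

This is why the `cfar` assignments in the hunk drop their `- 4`: after the up-front rewind, nip already points at the faulting instruction.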
arch/powerpc/kvm/book3s_xics.c (+3 -3)
···
 #include <linux/gfp.h>
 #include <linux/anon_inodes.h>
 #include <linux/spinlock.h>
-
+#include <linux/debugfs.h>
 #include <linux/uaccess.h>
+
 #include <asm/kvm_book3s.h>
 #include <asm/kvm_ppc.h>
 #include <asm/hvcall.h>
 #include <asm/xics.h>
-#include <asm/debugfs.h>
 #include <asm/time.h>
 
 #include <linux/seq_file.h>
···
 		return;
 	}
 
-	xics->dentry = debugfs_create_file(name, 0444, powerpc_debugfs_root,
+	xics->dentry = debugfs_create_file(name, 0444, arch_debugfs_dir,
 					   xics, &xics_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/kvm/book3s_xive.c (+52 -22)
···
 #include <asm/xive.h>
 #include <asm/xive-regs.h>
 #include <asm/debug.h>
-#include <asm/debugfs.h>
 #include <asm/time.h>
 #include <asm/opal.h>
···
  */
 #define XIVE_Q_GAP	2
 
+static bool kvmppc_xive_vcpu_has_save_restore(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
+
+	/* Check enablement at VP level */
+	return xc->vp_cam & TM_QW1W2_HO;
+}
+
+bool kvmppc_xive_check_save_restore(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
+	struct kvmppc_xive *xive = xc->xive;
+
+	if (xive->flags & KVMPPC_XIVE_FLAG_SAVE_RESTORE)
+		return kvmppc_xive_vcpu_has_save_restore(vcpu);
+
+	return true;
+}
+
 /*
  * Push a vcpu's context to the XIVE on guest entry.
  * This assumes we are in virtual mode (MMU on)
···
 		return;
 
 	eieio();
-	__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
+	if (!kvmppc_xive_vcpu_has_save_restore(vcpu))
+		__raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS);
 	__raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2);
 	vcpu->arch.xive_pushed = 1;
 	eieio();
···
 	/* First load to pull the context, we ignore the value */
 	__raw_readl(tima + TM_SPC_PULL_OS_CTX);
 	/* Second load to recover the context state (Words 0 and 1) */
-	vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);
+	if (!kvmppc_xive_vcpu_has_save_restore(vcpu))
+		vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);
 
 	/* Fixup some of the state for the next load */
 	vcpu->arch.xive_saved_state.lsmfb = 0;
···
 		if (!vcpu->arch.xive_vcpu)
 			continue;
 		rc = xive_provision_queue(vcpu, prio);
-		if (rc == 0 && !xive->single_escalation)
+		if (rc == 0 && !kvmppc_xive_has_single_escalation(xive))
 			kvmppc_xive_attach_escalation(vcpu, prio,
-						      xive->single_escalation);
+						      kvmppc_xive_has_single_escalation(xive));
 		if (rc)
 			return rc;
 	}
···
 }
 
 int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
-			   struct irq_desc *host_desc)
+			   unsigned long host_irq)
 {
 	struct kvmppc_xive *xive = kvm->arch.xive;
 	struct kvmppc_xive_src_block *sb;
 	struct kvmppc_xive_irq_state *state;
-	struct irq_data *host_data = irq_desc_get_irq_data(host_desc);
-	unsigned int host_irq = irq_desc_get_irq(host_desc);
+	struct irq_data *host_data =
+		irq_domain_get_irq_data(irq_get_default_host(), host_irq);
 	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(host_data);
 	u16 idx;
 	u8 prio;
···
 	if (!xive)
 		return -ENODEV;
 
-	pr_devel("set_mapped girq 0x%lx host HW irq 0x%x...\n",guest_irq, hw_irq);
+	pr_debug("%s: GIRQ 0x%lx host IRQ %ld XIVE HW IRQ 0x%x\n",
+		 __func__, guest_irq, host_irq, hw_irq);
 
 	sb = kvmppc_xive_find_source(xive, guest_irq, &idx);
 	if (!sb)
···
 	 */
 	rc = irq_set_vcpu_affinity(host_irq, state);
 	if (rc) {
-		pr_err("Failed to set VCPU affinity for irq %d\n", host_irq);
+		pr_err("Failed to set VCPU affinity for host IRQ %ld\n", host_irq);
 		return rc;
 	}
···
 EXPORT_SYMBOL_GPL(kvmppc_xive_set_mapped);
 
 int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
-			   struct irq_desc *host_desc)
+			   unsigned long host_irq)
 {
 	struct kvmppc_xive *xive = kvm->arch.xive;
 	struct kvmppc_xive_src_block *sb;
 	struct kvmppc_xive_irq_state *state;
-	unsigned int host_irq = irq_desc_get_irq(host_desc);
 	u16 idx;
 	u8 prio;
 	int rc;
···
 	if (!xive)
 		return -ENODEV;
 
-	pr_devel("clr_mapped girq 0x%lx...\n", guest_irq);
+	pr_debug("%s: GIRQ 0x%lx host IRQ %ld\n", __func__, guest_irq, host_irq);
 
 	sb = kvmppc_xive_find_source(xive, guest_irq, &idx);
 	if (!sb)
···
 	/* Release the passed-through interrupt to the host */
 	rc = irq_set_vcpu_affinity(host_irq, NULL);
 	if (rc) {
-		pr_err("Failed to clr VCPU affinity for irq %d\n", host_irq);
+		pr_err("Failed to clr VCPU affinity for host IRQ %ld\n", host_irq);
 		return rc;
 	}
···
 	/* Free escalations */
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		if (xc->esc_virq[i]) {
-			if (xc->xive->single_escalation)
+			if (kvmppc_xive_has_single_escalation(xc->xive))
 				xive_cleanup_single_escalation(vcpu, xc,
 							       xc->esc_virq[i]);
 			free_irq(xc->esc_virq[i], vcpu);
···
 	if (r)
 		goto bail;
 
+	if (!kvmppc_xive_check_save_restore(vcpu)) {
+		pr_err("inconsistent save-restore setup for VCPU %d\n", cpu);
+		r = -EIO;
+		goto bail;
+	}
+
 	/* Configure VCPU fields for use by assembly push/pull */
 	vcpu->arch.xive_saved_state.w01 = cpu_to_be64(0xff000000);
 	vcpu->arch.xive_cam_word = cpu_to_be32(xc->vp_cam | TM_QW1W2_VO);
···
 	 * Enable the VP first as the single escalation mode will
 	 * affect escalation interrupts numbering
 	 */
-	r = xive_native_enable_vp(xc->vp_id, xive->single_escalation);
+	r = xive_native_enable_vp(xc->vp_id, kvmppc_xive_has_single_escalation(xive));
 	if (r) {
 		pr_err("Failed to enable VP in OPAL, err %d\n", r);
 		goto bail;
···
 		struct xive_q *q = &xc->queues[i];
 
 		/* Single escalation, no queue 7 */
-		if (i == 7 && xive->single_escalation)
+		if (i == 7 && kvmppc_xive_has_single_escalation(xive))
 			break;
 
 		/* Is queue already enabled ? Provision it */
 		if (xive->qmap & (1 << i)) {
 			r = xive_provision_queue(vcpu, i);
-			if (r == 0 && !xive->single_escalation)
+			if (r == 0 && !kvmppc_xive_has_single_escalation(xive))
 				kvmppc_xive_attach_escalation(
-					vcpu, i, xive->single_escalation);
+					vcpu, i, kvmppc_xive_has_single_escalation(xive));
 			if (r)
 				goto bail;
 		} else {
···
 	}
 
 	/* If not done above, attach priority 0 escalation */
-	r = kvmppc_xive_attach_escalation(vcpu, 0, xive->single_escalation);
+	r = kvmppc_xive_attach_escalation(vcpu, 0, kvmppc_xive_has_single_escalation(xive));
 	if (r)
 		goto bail;
···
 	 */
 	xive->nr_servers = KVM_MAX_VCPUS;
 
-	xive->single_escalation = xive_native_has_single_escalation();
+	if (xive_native_has_single_escalation())
+		xive->flags |= KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+
+	if (xive_native_has_save_restore())
+		xive->flags |= KVMPPC_XIVE_FLAG_SAVE_RESTORE;
 
 	kvm->arch.xive = xive;
 	return 0;
···
 		return;
 	}
 
-	xive->dentry = debugfs_create_file(name, S_IRUGO, powerpc_debugfs_root,
+	xive->dentry = debugfs_create_file(name, S_IRUGO, arch_debugfs_dir,
 					   xive, &xive_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/kvm/book3s_xive.h (+10 -1)
···
 	int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
 };
 
+#define KVMPPC_XIVE_FLAG_SINGLE_ESCALATION 0x1
+#define KVMPPC_XIVE_FLAG_SAVE_RESTORE 0x2
+
 struct kvmppc_xive {
 	struct kvm *kvm;
 	struct kvm_device *dev;
···
 	u32 q_page_order;
 
 	/* Flags */
-	u8 single_escalation;
+	u8 flags;
 
 	/* Number of entries in the VP block */
 	u32 nr_servers;
···
 				    struct kvmppc_xive_vcpu *xc, int irq);
 int kvmppc_xive_compute_vp_id(struct kvmppc_xive *xive, u32 cpu, u32 *vp);
 int kvmppc_xive_set_nr_servers(struct kvmppc_xive *xive, u64 addr);
+bool kvmppc_xive_check_save_restore(struct kvm_vcpu *vcpu);
+
+static inline bool kvmppc_xive_has_single_escalation(struct kvmppc_xive *xive)
+{
+	return xive->flags & KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+}
 
 #endif /* CONFIG_KVM_XICS */
 #endif /* _KVM_PPC_BOOK3S_XICS_H */
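The header change replaces the lone `single_escalation` byte with a `flags` bitfield plus an inline predicate, the usual refactor once a second boolean (save-restore) arrives. The shape in isolation, with `demo_` names standing in for the kernel types:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLAG_SINGLE_ESCALATION 0x1
#define DEMO_FLAG_SAVE_RESTORE      0x2

struct demo_xive { uint8_t flags; };

/* Call sites test the predicate instead of poking the raw field,
 * so adding a third flag later touches only this header. */
static inline int demo_has_single_escalation(const struct demo_xive *x)
{
	return x->flags & DEMO_FLAG_SINGLE_ESCALATION;
}
```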
arch/powerpc/kvm/book3s_xive_native.c (+17 -7)
···
 #include <asm/xive.h>
 #include <asm/xive-regs.h>
 #include <asm/debug.h>
-#include <asm/debugfs.h>
 #include <asm/opal.h>
 
 #include <linux/debugfs.h>
···
 	for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
 		/* Free the escalation irq */
 		if (xc->esc_virq[i]) {
-			if (xc->xive->single_escalation)
+			if (kvmppc_xive_has_single_escalation(xc->xive))
 				xive_cleanup_single_escalation(vcpu, xc,
 							       xc->esc_virq[i]);
 			free_irq(xc->esc_virq[i], vcpu);
···
 		goto bail;
 	}
 
+	if (!kvmppc_xive_check_save_restore(vcpu)) {
+		pr_err("inconsistent save-restore setup for VCPU %d\n", server_num);
+		rc = -EIO;
+		goto bail;
+	}
+
 	/*
 	 * Enable the VP first as the single escalation mode will
 	 * affect escalation interrupts numbering
 	 */
-	rc = xive_native_enable_vp(xc->vp_id, xive->single_escalation);
+	rc = xive_native_enable_vp(xc->vp_id, kvmppc_xive_has_single_escalation(xive));
 	if (rc) {
 		pr_err("Failed to enable VP in OPAL: %d\n", rc);
 		goto bail;
···
 	}
 
 	rc = kvmppc_xive_attach_escalation(vcpu, priority,
-					   xive->single_escalation);
+					   kvmppc_xive_has_single_escalation(xive));
 error:
 	if (rc)
 		kvmppc_xive_native_cleanup_queue(vcpu, priority);
···
 	for (prio = 0; prio < KVMPPC_XIVE_Q_COUNT; prio++) {
 
 		/* Single escalation, no queue 7 */
-		if (prio == 7 && xive->single_escalation)
+		if (prio == 7 && kvmppc_xive_has_single_escalation(xive))
 			break;
 
 		if (xc->esc_virq[prio]) {
···
 	 */
 	xive->nr_servers = KVM_MAX_VCPUS;
 
-	xive->single_escalation = xive_native_has_single_escalation();
+	if (xive_native_has_single_escalation())
+		xive->flags |= KVMPPC_XIVE_FLAG_SINGLE_ESCALATION;
+
+	if (xive_native_has_save_restore())
+		xive->flags |= KVMPPC_XIVE_FLAG_SAVE_RESTORE;
+
 	xive->ops = &kvmppc_xive_native_ops;
 
 	kvm->arch.xive = xive;
···
 		return;
 	}
 
-	xive->dentry = debugfs_create_file(name, 0444, powerpc_debugfs_root,
+	xive->dentry = debugfs_create_file(name, 0444, arch_debugfs_dir,
 					   xive, &xive_native_debug_fops);
 
 	pr_debug("%s: created %s\n", __func__, name);
arch/powerpc/mm/Makefile (+1 -1)
···
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
-obj-$(CONFIG_PPC_PTDUMP)	+= ptdump/
+obj-$(CONFIG_PTDUMP_CORE)	+= ptdump/
 obj-$(CONFIG_KASAN)		+= kasan/
arch/powerpc/mm/book3s64/hash_native.c (+1 -1)
···
  * TODO: add batching support when enabled.  remember, no dynamic memory here,
  * although there is the control page available...
  */
-static void native_hpte_clear(void)
+static notrace void native_hpte_clear(void)
 {
 	unsigned long vpn = 0;
 	unsigned long slot, slots;
arch/powerpc/mm/book3s64/hash_utils.c (+2 -2)
···
 #include <linux/hugetlb.h>
 #include <linux/cpu.h>
 #include <linux/pgtable.h>
+#include <linux/debugfs.h>
 
-#include <asm/debugfs.h>
 #include <asm/interrupt.h>
 #include <asm/processor.h>
 #include <asm/mmu.h>
···
 
 static int __init hash64_debugfs(void)
 {
-	debugfs_create_file("hpt_order", 0600, powerpc_debugfs_root, NULL,
+	debugfs_create_file("hpt_order", 0600, arch_debugfs_dir, NULL,
 			    &fops_hpt_order);
 	return 0;
 }
arch/powerpc/mm/book3s64/pgtable.c (+4 -4)
···
 #include <linux/sched.h>
 #include <linux/mm_types.h>
 #include <linux/memblock.h>
+#include <linux/debugfs.h>
 #include <misc/cxl-base.h>
 
-#include <asm/debugfs.h>
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
 #include <asm/trace.h>
···
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-/* For use by kexec */
-void mmu_cleanup_all(void)
+/* For use by kexec, called with MMU off */
+notrace void mmu_cleanup_all(void)
 {
 	if (radix_enabled())
 		radix__mmu_cleanup_all();
···
 	 * invalidated as expected.
 	 */
 	debugfs_create_bool("tlbie_enabled", 0600,
-			    powerpc_debugfs_root,
+			    arch_debugfs_dir,
 			    &tlbie_enabled);
 
 	return 0;
arch/powerpc/mm/book3s64/radix_pgtable.c (+2 -1)
···
 	mtspr(SPRN_UAMOR, 0);
 }
 
-void radix__mmu_cleanup_all(void)
+/* Called during kexec sequence with MMU off */
+notrace void radix__mmu_cleanup_all(void)
 {
 	unsigned long lpcr;
arch/powerpc/mm/book3s64/radix_tlb.c (+14 -2)
···
 #include <linux/memblock.h>
 #include <linux/mmu_context.h>
 #include <linux/sched/mm.h>
+#include <linux/debugfs.h>
 
 #include <asm/ppc-opcode.h>
 #include <asm/tlb.h>
···
 * invalidating a full PID, so it has a far lower threshold to change from
 * individual page flushes to full-pid flushes.
 */
-static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
-static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
+static u32 tlb_single_page_flush_ceiling __read_mostly = 33;
+static u32 tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 					    unsigned long start, unsigned long end)
···
 EXPORT_SYMBOL_GPL(do_h_rpt_invalidate_prt);
 
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
+
+static int __init create_tlb_single_page_flush_ceiling(void)
+{
+	debugfs_create_u32("tlb_single_page_flush_ceiling", 0600,
+			   arch_debugfs_dir, &tlb_single_page_flush_ceiling);
+	debugfs_create_u32("tlb_local_single_page_flush_ceiling", 0600,
+			   arch_debugfs_dir, &tlb_local_single_page_flush_ceiling);
+	return 0;
+}
+late_initcall(create_tlb_single_page_flush_ceiling);
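The ceilings above pick the point at which a ranged TLB flush degenerates into a single full-PID invalidation; narrowing them to `u32` lets them be exposed directly via `debugfs_create_u32`. The decision itself is just a threshold compare, sketched here with the new tunable type (numbers are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* At or below the ceiling, flush page by page; above it,
 * one full-PID invalidation is cheaper than many tlbies. */
static int demo_use_full_flush(uint64_t nr_pages, uint32_t ceiling)
{
	return nr_pages > ceiling;
}
```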
arch/powerpc/mm/book3s64/slb.c (+1 -1)
···
 	/* IRQs are not reconciled here, so can't check irqs_disabled */
 	VM_WARN_ON(mfmsr() & MSR_EE);
 
-	if (unlikely(!(regs->msr & MSR_RI)))
+	if (regs_is_unrecoverable(regs))
 		return -EINVAL;
 
 	/*
arch/powerpc/mm/drmem.c (+46)
···
 
 static struct drmem_lmb_info __drmem_info;
 struct drmem_lmb_info *drmem_info = &__drmem_info;
+static bool in_drmem_update;
 
 u64 drmem_lmb_memory_max(void)
 {
···
 	if (!memory)
 		return -1;
 
+	/*
+	 * Set in_drmem_update to prevent the notifier callback to process the
+	 * DT property back since the change is coming from the LMB tree.
+	 */
+	in_drmem_update = true;
 	prop = of_find_property(memory, "ibm,dynamic-memory", NULL);
 	if (prop) {
 		rc = drmem_update_dt_v1(memory, prop);
···
 		if (prop)
 			rc = drmem_update_dt_v2(memory, prop);
 	}
+	in_drmem_update = false;
 
 	of_node_put(memory);
 	return rc;
···
 	return ret;
 }
 
+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+		      __maybe_unused const __be32 **usm,
+		      __maybe_unused void *data)
+{
+	struct drmem_lmb *lmb;
+
+	for_each_drmem_lmb(lmb) {
+		if (lmb->drc_index != updated_lmb->drc_index)
+			continue;
+
+		lmb->aa_index = updated_lmb->aa_index;
+		break;
+	}
+	return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+	/*
+	 * Don't update the LMBs if triggered by the update done in
+	 * drmem_update_dt(), the LMB values have been used to the update the DT
+	 * property in that case.
+	 */
+	if (in_drmem_update)
+		return;
+	if (!strcmp(prop->name, "ibm,dynamic-memory"))
+		__walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+	else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+		__walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
 #endif
 
 static int init_drmem_lmb_size(struct device_node *dn)
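`in_drmem_update` is a simple reentrancy guard: when drmem itself rewrites the device-tree property, the property-change notifier fires, and the flag suppresses that echo so the LMB tree is not rebuilt from values it just produced. The pattern in miniature, with hypothetical `demo_` names:

```c
#include <assert.h>

static int guard_active;
static int notifier_work_done;

/* Notifier callback: skip work that merely echoes our own update. */
static void demo_notifier(void)
{
	if (guard_active)
		return;
	notifier_work_done++;
}

/* Updater: raise the guard, make the change (which re-triggers the
 * notifier synchronously here), then drop the guard again. */
static void demo_update(void)
{
	guard_active = 1;
	demo_notifier();	/* stands in for the DT-change callback */
	guard_active = 0;
}
```

A plain bool suffices in the kernel code because the update path runs serialized; a concurrent design would need a lock or atomic instead.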
arch/powerpc/mm/mmu_decl.h (+1 -1)
···
 void __init mmu_mapin_immr(void);
 #endif
 
-#ifdef CONFIG_PPC_DEBUG_WX
+#ifdef CONFIG_DEBUG_WX
 void ptdump_check_wx(void);
 #else
 static inline void ptdump_check_wx(void) { }
arch/powerpc/mm/nohash/tlb_low.S (+2 -2)
···
  * Touch enough instruction cache lines to ensure cache hits
  */
 1:	mflr	r9
-	bl	2f
+	bcl	20,31,$+4
 2:	mflr	r6
 	li	r7,32
 	PPC_ICBT(0,R6,R7)		/* touch next cache line */
···
  * Set up temporary TLB entry that is the same as what we're
  * running from, but in AS=1.
  */
-	bl	1f
+	bcl	20,31,$+4
 1:	mflr	r6
 	tlbsx	0,r8
 	mfspr	r6,SPRN_MAS1
arch/powerpc/mm/numa.c (+357 -134)
···
 
 static char *cmdline __initdata;
 
-static int numa_debug;
-#define dbg(args...) if (numa_debug) { printk(KERN_INFO args); }
-
 int numa_cpu_lookup_table[NR_CPUS];
 cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
 struct pglist_data *node_data[MAX_NUMNODES];
···
 EXPORT_SYMBOL(node_to_cpumask_map);
 EXPORT_SYMBOL(node_data);
 
-static int min_common_depth;
+static int primary_domain_index;
 static int n_mem_addr_cells, n_mem_size_cells;
-static int form1_affinity;
+
+#define FORM0_AFFINITY 0
+#define FORM1_AFFINITY 1
+#define FORM2_AFFINITY 2
+static int affinity_form;
 
 #define MAX_DISTANCE_REF_POINTS 4
 static int distance_ref_points_depth;
 static const __be32 *distance_ref_points;
 static int distance_lookup_table[MAX_NUMNODES][MAX_DISTANCE_REF_POINTS];
+static int numa_distance_table[MAX_NUMNODES][MAX_NUMNODES] = {
+	[0 ... MAX_NUMNODES - 1] = { [0 ... MAX_NUMNODES - 1] = -1 }
+};
+static int numa_id_index_table[MAX_NUMNODES] = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE };
 
 /*
  * Allocate node_to_cpumask_map based on number of available nodes
···
 		alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
 
 	/* cpumask_of_node() will now work */
-	dbg("Node to cpumask map for %u nodes\n", nr_node_ids);
+	pr_debug("Node to cpumask map for %u nodes\n", nr_node_ids);
 }
 
 static int __init fake_numa_create_new_node(unsigned long end_pfn,
···
 			cmdline = p;
 		fake_nid++;
 		*nid = fake_nid;
-		dbg("created new fake_node with id %d\n", fake_nid);
+		pr_debug("created new fake_node with id %d\n", fake_nid);
 		return 1;
 	}
 	return 0;
···
 	numa_cpu_lookup_table[cpu] = -1;
 }
 
-static void map_cpu_to_node(int cpu, int node)
+void map_cpu_to_node(int cpu, int node)
 {
 	update_numa_cpu_lookup_table(cpu, node);
 
-	dbg("adding cpu %d to node %d\n", cpu, node);
-
-	if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node])))
+	if (!(cpumask_test_cpu(cpu, node_to_cpumask_map[node]))) {
+		pr_debug("adding cpu %d to node %d\n", cpu, node);
 		cpumask_set_cpu(cpu, node_to_cpumask_map[node]);
+	}
 }
 
 #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PPC_SPLPAR)
-static void unmap_cpu_from_node(unsigned long cpu)
+void unmap_cpu_from_node(unsigned long cpu)
 {
 	int node = numa_cpu_lookup_table[cpu];
 
-	dbg("removing cpu %lu from node %d\n", cpu, node);
-
 	if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
 		cpumask_clear_cpu(cpu, node_to_cpumask_map[node]);
+		pr_debug("removing cpu %lu from node %d\n", cpu, node);
 	} else {
-		printk(KERN_ERR "WARNING: cpu %lu not found in node %d\n",
-		       cpu, node);
+		pr_warn("Warning: cpu %lu not found in node %d\n", cpu, node);
 	}
 }
 #endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */
 
-int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+static int __associativity_to_nid(const __be32 *associativity,
+				  int max_array_sz)
+{
+	int nid;
+	/*
+	 * primary_domain_index is 1 based array index.
+	 */
+	int index = primary_domain_index - 1;
+
+	if (!numa_enabled || index >= max_array_sz)
+		return NUMA_NO_NODE;
+
+	nid = of_read_number(&associativity[index], 1);
+
+	/* POWER4 LPAR uses 0xffff as invalid node */
+	if (nid == 0xffff || nid >= nr_node_ids)
+		nid = NUMA_NO_NODE;
+	return nid;
+}
+/*
+ * Returns nid in the range [0..nr_node_ids], or -1 if no useful NUMA
+ * info is found.
+ */
+static int associativity_to_nid(const __be32 *associativity)
+{
+	int array_sz = of_read_number(associativity, 1);
+
+	/* Skip the first element in the associativity array */
+	return __associativity_to_nid((associativity + 1), array_sz);
+}
+
+static int __cpu_form2_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+	int dist;
+	int node1, node2;
+
+	node1 = associativity_to_nid(cpu1_assoc);
+	node2 = associativity_to_nid(cpu2_assoc);
+
+	dist = numa_distance_table[node1][node2];
+	if (dist <= LOCAL_DISTANCE)
+		return 0;
+	else if (dist <= REMOTE_DISTANCE)
+		return 1;
+	else
+		return 2;
+}
+
+static int __cpu_form1_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 {
 	int dist = 0;
 
···
 	return dist;
 }
 
+int cpu_relative_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+	/* We should not get called with FORM0 */
+	VM_WARN_ON(affinity_form == FORM0_AFFINITY);
+	if (affinity_form == FORM1_AFFINITY)
+		return __cpu_form1_relative_distance(cpu1_assoc, cpu2_assoc);
+	return __cpu_form2_relative_distance(cpu1_assoc, cpu2_assoc);
+}
+
 /* must hold reference to node during call */
 static const __be32 *of_get_associativity(struct device_node *dev)
 {
···
 	int i;
 	int distance = LOCAL_DISTANCE;
 
-	if (!form1_affinity)
+	if (affinity_form == FORM2_AFFINITY)
+		return numa_distance_table[a][b];
+	else if (affinity_form == FORM0_AFFINITY)
 		return ((a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE);
 
 	for (i = 0; i < distance_ref_points_depth; i++) {
···
 	return distance;
 }
 EXPORT_SYMBOL(__node_distance);
-
-static void initialize_distance_lookup_table(int nid,
-		const __be32 *associativity)
-{
-	int i;
-
-	if (!form1_affinity)
-		return;
-
-	for (i = 0; i < distance_ref_points_depth; i++) {
-		const __be32 *entry;
-
-		entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
-		distance_lookup_table[nid][i] = of_read_number(entry, 1);
-	}
-}
-
-/*
- * Returns nid in the range [0..nr_node_ids], or -1 if no useful NUMA
- * info is found.
- */
-static int associativity_to_nid(const __be32 *associativity)
-{
-	int nid = NUMA_NO_NODE;
-
-	if (!numa_enabled)
-		goto out;
-
-	if (of_read_number(associativity, 1) >= min_common_depth)
-		nid = of_read_number(&associativity[min_common_depth], 1);
-
-	/* POWER4 LPAR uses 0xffff as invalid node */
-	if (nid == 0xffff || nid >= nr_node_ids)
-		nid = NUMA_NO_NODE;
-
-	if (nid > 0 &&
-		of_read_number(associativity, 1) >= distance_ref_points_depth) {
-		/*
-		 * Skip the length field and send start of associativity array
-		 */
-		initialize_distance_lookup_table(nid, associativity + 1);
-	}
-
-out:
-	return nid;
-}
 
 /* Returns the nid associated with the given device tree node,
  * or -1 if not found.
···
 }
 EXPORT_SYMBOL(of_node_to_nid);
 
-static int __init find_min_common_depth(void)
+static void __initialize_form1_numa_distance(const __be32 *associativity,
+					     int max_array_sz)
 {
-	int depth;
+	int i, nid;
+
+	if (affinity_form != FORM1_AFFINITY)
+		return;
+
+	nid = __associativity_to_nid(associativity, max_array_sz);
+	if (nid != NUMA_NO_NODE) {
+		for (i = 0; i < distance_ref_points_depth; i++) {
+			const __be32 *entry;
+			int index = be32_to_cpu(distance_ref_points[i]) - 1;
+
+			/*
+			 * broken hierarchy, return with broken distance table
+			 */
+			if (WARN(index >= max_array_sz, "Broken ibm,associativity property"))
+				return;
+
+			entry = &associativity[index];
+			distance_lookup_table[nid][i] = of_read_number(entry, 1);
+		}
+	}
+}
+
+static void initialize_form1_numa_distance(const __be32 *associativity)
+{
+	int array_sz;
+
+	array_sz = of_read_number(associativity, 1);
+	/* Skip the first element in the associativity array */
+	__initialize_form1_numa_distance(associativity + 1, array_sz);
+}
+
+/*
+ * Used to update distance information w.r.t newly added node.
+ */
+void update_numa_distance(struct device_node *node)
+{
+	int nid;
+
+	if (affinity_form == FORM0_AFFINITY)
+		return;
+	else if (affinity_form == FORM1_AFFINITY) {
+		const __be32 *associativity;
+
+		associativity = of_get_associativity(node);
+		if (!associativity)
+			return;
+
+		initialize_form1_numa_distance(associativity);
+		return;
+	}
+
+	/* FORM2 affinity */
+	nid = of_node_to_nid_single(node);
+	if (nid == NUMA_NO_NODE)
+		return;
+
+	/*
+	 * With FORM2 we expect NUMA distance of all possible NUMA
+	 * nodes to be provided during boot.
+	 */
+	WARN(numa_distance_table[nid][nid] == -1,
+	     "NUMA distance details for node %d not provided\n", nid);
+}
+
+/*
+ * ibm,numa-lookup-index-table= {N, domainid1, domainid2, ..... domainidN}
+ * ibm,numa-distance-table = { N, 1, 2, 4, 5, 1, 6, .... N elements}
+ */
+static void initialize_form2_numa_distance_lookup_table(void)
+{
+	int i, j;
 	struct device_node *root;
+	const __u8 *numa_dist_table;
+	const __be32 *numa_lookup_index;
+	int numa_dist_table_length;
+	int max_numa_index, distance_index;
+
+	if (firmware_has_feature(FW_FEATURE_OPAL))
+		root = of_find_node_by_path("/ibm,opal");
+	else
+		root = of_find_node_by_path("/rtas");
+	if (!root)
+		root = of_find_node_by_path("/");
+
+	numa_lookup_index = of_get_property(root, "ibm,numa-lookup-index-table", NULL);
+	max_numa_index = of_read_number(&numa_lookup_index[0], 1);
+
+	/* first element of the array is the size and is encode-int */
+	numa_dist_table = of_get_property(root, "ibm,numa-distance-table", NULL);
+	numa_dist_table_length = of_read_number((const __be32 *)&numa_dist_table[0], 1);
+	/* Skip the size which is encoded int */
+	numa_dist_table += sizeof(__be32);
+
+	pr_debug("numa_dist_table_len = %d, numa_dist_indexes_len = %d\n",
+		 numa_dist_table_length, max_numa_index);
+
+	for (i = 0; i < max_numa_index; i++)
+		/* +1 skip the max_numa_index in the property */
+		numa_id_index_table[i] = of_read_number(&numa_lookup_index[i + 1], 1);
+
+	if (numa_dist_table_length != max_numa_index * max_numa_index) {
+		WARN(1, "Wrong NUMA distance information\n");
+		/* consider everybody else just remote.
*/ 395 + for (i = 0; i < max_numa_index; i++) { 396 + for (j = 0; j < max_numa_index; j++) { 397 + int nodeA = numa_id_index_table[i]; 398 + int nodeB = numa_id_index_table[j]; 399 + 400 + if (nodeA == nodeB) 401 + numa_distance_table[nodeA][nodeB] = LOCAL_DISTANCE; 402 + else 403 + numa_distance_table[nodeA][nodeB] = REMOTE_DISTANCE; 404 + } 405 + } 406 + } 407 + 408 + distance_index = 0; 409 + for (i = 0; i < max_numa_index; i++) { 410 + for (j = 0; j < max_numa_index; j++) { 411 + int nodeA = numa_id_index_table[i]; 412 + int nodeB = numa_id_index_table[j]; 413 + 414 + numa_distance_table[nodeA][nodeB] = numa_dist_table[distance_index++]; 415 + pr_debug("dist[%d][%d]=%d ", nodeA, nodeB, numa_distance_table[nodeA][nodeB]); 416 + } 417 + } 418 + of_node_put(root); 419 + } 420 + 421 + static int __init find_primary_domain_index(void) 422 + { 423 + int index; 424 + struct device_node *root; 425 + 426 + /* 427 + * Check for which form of affinity. 428 + */ 429 + if (firmware_has_feature(FW_FEATURE_OPAL)) { 430 + affinity_form = FORM1_AFFINITY; 431 + } else if (firmware_has_feature(FW_FEATURE_FORM2_AFFINITY)) { 432 + pr_debug("Using form 2 affinity\n"); 433 + affinity_form = FORM2_AFFINITY; 434 + } else if (firmware_has_feature(FW_FEATURE_FORM1_AFFINITY)) { 435 + pr_debug("Using form 1 affinity\n"); 436 + affinity_form = FORM1_AFFINITY; 437 + } else 438 + affinity_form = FORM0_AFFINITY; 307 439 308 440 if (firmware_has_feature(FW_FEATURE_OPAL)) 309 441 root = of_find_node_by_path("/ibm,opal"); ··· 477 313 &distance_ref_points_depth); 478 314 479 315 if (!distance_ref_points) { 480 - dbg("NUMA: ibm,associativity-reference-points not found.\n"); 316 + pr_debug("ibm,associativity-reference-points not found.\n"); 481 317 goto err; 482 318 } 483 319 484 320 distance_ref_points_depth /= sizeof(int); 485 - 486 - if (firmware_has_feature(FW_FEATURE_OPAL) || 487 - firmware_has_feature(FW_FEATURE_TYPE1_AFFINITY)) { 488 - dbg("Using form 1 affinity\n"); 489 - form1_affinity = 1; 
490 - } 491 - 492 - if (form1_affinity) { 493 - depth = of_read_number(distance_ref_points, 1); 494 - } else { 321 + if (affinity_form == FORM0_AFFINITY) { 495 322 if (distance_ref_points_depth < 2) { 496 - printk(KERN_WARNING "NUMA: " 497 - "short ibm,associativity-reference-points\n"); 323 + pr_warn("short ibm,associativity-reference-points\n"); 498 324 goto err; 499 325 } 500 326 501 - depth = of_read_number(&distance_ref_points[1], 1); 327 + index = of_read_number(&distance_ref_points[1], 1); 328 + } else { 329 + /* 330 + * Both FORM1 and FORM2 affinity find the primary domain details 331 + * at the same offset. 332 + */ 333 + index = of_read_number(distance_ref_points, 1); 502 334 } 503 - 504 335 /* 505 336 * Warn and cap if the hardware supports more than 506 337 * MAX_DISTANCE_REF_POINTS domains. 507 338 */ 508 339 if (distance_ref_points_depth > MAX_DISTANCE_REF_POINTS) { 509 - printk(KERN_WARNING "NUMA: distance array capped at " 510 - "%d entries\n", MAX_DISTANCE_REF_POINTS); 340 + pr_warn("distance array capped at %d entries\n", 341 + MAX_DISTANCE_REF_POINTS); 511 342 distance_ref_points_depth = MAX_DISTANCE_REF_POINTS; 512 343 } 513 344 514 345 of_node_put(root); 515 - return depth; 346 + return index; 516 347 517 348 err: 518 349 of_node_put(root); ··· 585 426 return 0; 586 427 } 587 428 429 + static int get_nid_and_numa_distance(struct drmem_lmb *lmb) 430 + { 431 + struct assoc_arrays aa = { .arrays = NULL }; 432 + int default_nid = NUMA_NO_NODE; 433 + int nid = default_nid; 434 + int rc, index; 435 + 436 + if ((primary_domain_index < 0) || !numa_enabled) 437 + return default_nid; 438 + 439 + rc = of_get_assoc_arrays(&aa); 440 + if (rc) 441 + return default_nid; 442 + 443 + if (primary_domain_index <= aa.array_sz && 444 + !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) { 445 + const __be32 *associativity; 446 + 447 + index = lmb->aa_index * aa.array_sz; 448 + associativity = &aa.arrays[index]; 449 + nid = 
__associativity_to_nid(associativity, aa.array_sz); 450 + if (nid > 0 && affinity_form == FORM1_AFFINITY) { 451 + /* 452 + * lookup array associativity entries have 453 + * no length of the array as the first element. 454 + */ 455 + __initialize_form1_numa_distance(associativity, aa.array_sz); 456 + } 457 + } 458 + return nid; 459 + } 460 + 588 461 /* 589 462 * This is like of_node_to_nid_single() for memory represented in the 590 463 * ibm,dynamic-reconfiguration-memory node. ··· 628 437 int nid = default_nid; 629 438 int rc, index; 630 439 631 - if ((min_common_depth < 0) || !numa_enabled) 440 + if ((primary_domain_index < 0) || !numa_enabled) 632 441 return default_nid; 633 442 634 443 rc = of_get_assoc_arrays(&aa); 635 444 if (rc) 636 445 return default_nid; 637 446 638 - if (min_common_depth <= aa.array_sz && 447 + if (primary_domain_index <= aa.array_sz && 639 448 !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) { 640 - index = lmb->aa_index * aa.array_sz + min_common_depth - 1; 641 - nid = of_read_number(&aa.arrays[index], 1); 449 + const __be32 *associativity; 642 450 643 - if (nid == 0xffff || nid >= nr_node_ids) 644 - nid = default_nid; 645 - 646 - if (nid > 0) { 647 - index = lmb->aa_index * aa.array_sz; 648 - initialize_distance_lookup_table(nid, 649 - &aa.arrays[index]); 650 - } 451 + index = lmb->aa_index * aa.array_sz; 452 + associativity = &aa.arrays[index]; 453 + nid = __associativity_to_nid(associativity, aa.array_sz); 651 454 } 652 - 653 455 return nid; 654 456 } 655 457 656 458 #ifdef CONFIG_PPC_SPLPAR 657 - static int vphn_get_nid(long lcpu) 459 + 460 + static int __vphn_get_associativity(long lcpu, __be32 *associativity) 658 461 { 659 - __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; 660 462 long rc, hwid; 661 463 662 464 /* ··· 669 485 670 486 rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity); 671 487 if (rc == H_SUCCESS) 672 - return associativity_to_nid(associativity); 488 + return 0; 673 489 } 674 490 491 + return 
-1; 492 + } 493 + 494 + static int vphn_get_nid(long lcpu) 495 + { 496 + __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; 497 + 498 + 499 + if (!__vphn_get_associativity(lcpu, associativity)) 500 + return associativity_to_nid(associativity); 501 + 675 502 return NUMA_NO_NODE; 503 + 676 504 } 677 505 #else 506 + 507 + static int __vphn_get_associativity(long lcpu, __be32 *associativity) 508 + { 509 + return -1; 510 + } 511 + 678 512 static int vphn_get_nid(long unused) 679 513 { 680 514 return NUMA_NO_NODE; ··· 800 598 801 599 static int ppc_numa_cpu_dead(unsigned int cpu) 802 600 { 803 - #ifdef CONFIG_HOTPLUG_CPU 804 - unmap_cpu_from_node(cpu); 805 - #endif 806 601 return 0; 807 602 } 808 603 ··· 884 685 size = read_n_cells(n_mem_size_cells, usm); 885 686 } 886 687 887 - nid = of_drconf_to_nid_single(lmb); 688 + nid = get_nid_and_numa_distance(lmb); 888 689 fake_numa_create_new_node(((base + size) >> PAGE_SHIFT), 889 690 &nid); 890 691 node_set_online(nid); ··· 901 702 struct device_node *memory; 902 703 int default_nid = 0; 903 704 unsigned long i; 705 + const __be32 *associativity; 904 706 905 707 if (numa_enabled == 0) { 906 - printk(KERN_WARNING "NUMA disabled by user\n"); 708 + pr_warn("disabled by user\n"); 907 709 return -1; 908 710 } 909 711 910 - min_common_depth = find_min_common_depth(); 712 + primary_domain_index = find_primary_domain_index(); 911 713 912 - if (min_common_depth < 0) { 714 + if (primary_domain_index < 0) { 913 715 /* 914 - * if we fail to parse min_common_depth from device tree 716 + * if we fail to parse primary_domain_index from device tree 915 717 * mark the numa disabled, boot with numa disabled. 
916 718 */ 917 719 numa_enabled = false; 918 - return min_common_depth; 720 + return primary_domain_index; 919 721 } 920 722 921 - dbg("NUMA associativity depth for CPU/Memory: %d\n", min_common_depth); 723 + pr_debug("associativity depth for CPU/Memory: %d\n", primary_domain_index); 724 + 725 + /* 726 + * If it is FORM2 initialize the distance table here. 727 + */ 728 + if (affinity_form == FORM2_AFFINITY) 729 + initialize_form2_numa_distance_lookup_table(); 922 730 923 731 /* 924 732 * Even though we connect cpus to numa domains later in SMP ··· 933 727 * each node to be onlined must have NODE_DATA etc backing it. 934 728 */ 935 729 for_each_present_cpu(i) { 730 + __be32 vphn_assoc[VPHN_ASSOC_BUFSIZE]; 936 731 struct device_node *cpu; 937 - int nid = vphn_get_nid(i); 732 + int nid = NUMA_NO_NODE; 938 733 939 - /* 940 - * Don't fall back to default_nid yet -- we will plug 941 - * cpus into nodes once the memory scan has discovered 942 - * the topology. 943 - */ 944 - if (nid == NUMA_NO_NODE) { 734 + memset(vphn_assoc, 0, VPHN_ASSOC_BUFSIZE * sizeof(__be32)); 735 + 736 + if (__vphn_get_associativity(i, vphn_assoc) == 0) { 737 + nid = associativity_to_nid(vphn_assoc); 738 + initialize_form1_numa_distance(vphn_assoc); 739 + } else { 740 + 741 + /* 742 + * Don't fall back to default_nid yet -- we will plug 743 + * cpus into nodes once the memory scan has discovered 744 + * the topology. 745 + */ 945 746 cpu = of_get_cpu_node(i, NULL); 946 747 BUG_ON(!cpu); 947 - nid = of_node_to_nid_single(cpu); 748 + 749 + associativity = of_get_associativity(cpu); 750 + if (associativity) { 751 + nid = associativity_to_nid(associativity); 752 + initialize_form1_numa_distance(associativity); 753 + } 948 754 of_node_put(cpu); 949 755 } 950 756 ··· 992 774 * have associativity properties. If none, then 993 775 * everything goes to default_nid. 
994 776 */ 995 - nid = of_node_to_nid_single(memory); 996 - if (nid < 0) 777 + associativity = of_get_associativity(memory); 778 + if (associativity) { 779 + nid = associativity_to_nid(associativity); 780 + initialize_form1_numa_distance(associativity); 781 + } else 997 782 nid = default_nid; 998 783 999 784 fake_numa_create_new_node(((start + size) >> PAGE_SHIFT), &nid); ··· 1032 811 unsigned int nid = 0; 1033 812 int i; 1034 813 1035 - printk(KERN_DEBUG "Top of RAM: 0x%lx, Total RAM: 0x%lx\n", 1036 - top_of_ram, total_ram); 1037 - printk(KERN_DEBUG "Memory hole size: %ldMB\n", 1038 - (top_of_ram - total_ram) >> 20); 814 + pr_debug("Top of RAM: 0x%lx, Total RAM: 0x%lx\n", top_of_ram, total_ram); 815 + pr_debug("Memory hole size: %ldMB\n", (top_of_ram - total_ram) >> 20); 1039 816 1040 817 for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) { 1041 818 fake_numa_create_new_node(end_pfn, &nid); ··· 1112 893 static void __init find_possible_nodes(void) 1113 894 { 1114 895 struct device_node *rtas; 1115 - const __be32 *domains; 896 + const __be32 *domains = NULL; 1116 897 int prop_length, max_nodes; 1117 898 u32 i; 1118 899 ··· 1128 909 * it doesn't exist, then fallback on ibm,max-associativity-domains. 1129 910 * Current denotes what the platform can support compared to max 1130 911 * which denotes what the Hypervisor can support. 912 + * 913 + * If the LPAR is migratable, new nodes might be activated after a LPM, 914 + * so we should consider the max number in that case. 
1131 915 */ 1132 - domains = of_get_property(rtas, "ibm,current-associativity-domains", 1133 - &prop_length); 916 + if (!of_get_property(of_root, "ibm,migratable-partition", NULL)) 917 + domains = of_get_property(rtas, 918 + "ibm,current-associativity-domains", 919 + &prop_length); 1134 920 if (!domains) { 1135 921 domains = of_get_property(rtas, "ibm,max-associativity-domains", 1136 922 &prop_length); ··· 1143 919 goto out; 1144 920 } 1145 921 1146 - max_nodes = of_read_number(&domains[min_common_depth], 1); 922 + max_nodes = of_read_number(&domains[primary_domain_index], 1); 923 + pr_info("Partition configured for %d NUMA nodes.\n", max_nodes); 924 + 1147 925 for (i = 0; i < max_nodes; i++) { 1148 926 if (!node_possible(i)) 1149 927 node_set(i, node_possible_map); 1150 928 } 1151 929 1152 930 prop_length /= sizeof(int); 1153 - if (prop_length > min_common_depth + 2) 931 + if (prop_length > primary_domain_index + 2) 1154 932 coregroup_enabled = 1; 1155 933 1156 934 out: ··· 1239 1013 1240 1014 if (strstr(p, "off")) 1241 1015 numa_enabled = 0; 1242 - 1243 - if (strstr(p, "debug")) 1244 - numa_debug = 1; 1245 1016 1246 1017 p = strstr(p, "fake="); 1247 1018 if (p) ··· 1402 1179 1403 1180 switch (rc) { 1404 1181 case H_SUCCESS: 1405 - dbg("VPHN hcall succeeded. Reset polling...\n"); 1182 + pr_debug("VPHN hcall succeeded. Reset polling...\n"); 1406 1183 goto out; 1407 1184 1408 1185 case H_FUNCTION: ··· 1482 1259 goto out; 1483 1260 1484 1261 index = of_read_number(associativity, 1); 1485 - if (index > min_common_depth + 1) 1262 + if (index > primary_domain_index + 1) 1486 1263 return of_read_number(&associativity[index - 1], 1); 1487 1264 1488 1265 out:
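The FORM2 layout that `initialize_form2_numa_distance_lookup_table()` consumes above — one encode-int element count followed by one byte per distance, row-major over the `ibm,numa-lookup-index-table` order — can be sketched in plain userspace C. This is an illustration only, not the kernel code; the names (`parse_form2_distance`, `read_be32`, `MAX_NODES`) and the flat `lookup[]` argument are invented for the sketch:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8

static int numa_distance[MAX_NODES][MAX_NODES];

/* Device-tree cells are big-endian; decode one encode-int. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/*
 * prop: raw "ibm,numa-distance-table" bytes (length cell + N*N distances).
 * lookup[]: matrix index -> NUMA node id, as given by the
 * "ibm,numa-lookup-index-table" property.  Returns 0 on success, -1 when
 * the element count does not match max_index * max_index (the case the
 * kernel WARNs about and falls back to LOCAL/REMOTE distances for).
 */
static int parse_form2_distance(const uint8_t *prop, const int *lookup,
				int max_index)
{
	uint32_t nelems = read_be32(prop);
	const uint8_t *dist = prop + 4;	/* skip the encode-int length */
	int i, j, k = 0;

	if (nelems != (uint32_t)(max_index * max_index))
		return -1;

	for (i = 0; i < max_index; i++)
		for (j = 0; j < max_index; j++)
			numa_distance[lookup[i]][lookup[j]] = dist[k++];
	return 0;
}
```

Note the indirection through `lookup[]`: FORM2 domain IDs need not be dense, so the i-th row/column of the flat table belongs to the i-th entry of the lookup-index table, not to node i.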
+4 -2
arch/powerpc/mm/ptdump/8xx.c
··· 75 75 }; 76 76 77 77 struct pgtable_level pg_level[5] = { 78 - { 79 - }, { /* pgd */ 78 + { /* pgd */ 79 + .flag = flag_array, 80 + .num = ARRAY_SIZE(flag_array), 81 + }, { /* p4d */ 80 82 .flag = flag_array, 81 83 .num = ARRAY_SIZE(flag_array), 82 84 }, { /* pud */
+7 -2
arch/powerpc/mm/ptdump/Makefile
··· 5 5 obj-$(CONFIG_4xx) += shared.o 6 6 obj-$(CONFIG_PPC_8xx) += 8xx.o 7 7 obj-$(CONFIG_PPC_BOOK3E_MMU) += shared.o 8 - obj-$(CONFIG_PPC_BOOK3S_32) += shared.o bats.o segment_regs.o 9 - obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o hashpagetable.o 8 + obj-$(CONFIG_PPC_BOOK3S_32) += shared.o 9 + obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o 10 + 11 + ifdef CONFIG_PTDUMP_DEBUGFS 12 + obj-$(CONFIG_PPC_BOOK3S_32) += bats.o segment_regs.o 13 + obj-$(CONFIG_PPC_BOOK3S_64) += hashpagetable.o 14 + endif
+4 -14
arch/powerpc/mm/ptdump/bats.c
··· 7 7 */ 8 8 9 9 #include <linux/pgtable.h> 10 - #include <asm/debugfs.h> 10 + #include <linux/debugfs.h> 11 11 #include <asm/cpu_has_feature.h> 12 12 13 13 #include "ptdump.h" ··· 57 57 58 58 #define BAT_SHOW_603(_m, _n, _l, _u, _d) bat_show_603(_m, _n, mfspr(_l), mfspr(_u), _d) 59 59 60 - static int bats_show_603(struct seq_file *m, void *v) 60 + static int bats_show(struct seq_file *m, void *v) 61 61 { 62 62 seq_puts(m, "---[ Instruction Block Address Translation ]---\n"); 63 63 ··· 88 88 return 0; 89 89 } 90 90 91 - static int bats_open(struct inode *inode, struct file *file) 92 - { 93 - return single_open(file, bats_show_603, NULL); 94 - } 95 - 96 - static const struct file_operations bats_fops = { 97 - .open = bats_open, 98 - .read = seq_read, 99 - .llseek = seq_lseek, 100 - .release = single_release, 101 - }; 91 + DEFINE_SHOW_ATTRIBUTE(bats); 102 92 103 93 static int __init bats_init(void) 104 94 { 105 95 debugfs_create_file("block_address_translation", 0400, 106 - powerpc_debugfs_root, NULL, &bats_fops); 96 + arch_debugfs_dir, NULL, &bats_fops); 107 97 return 0; 108 98 } 109 99 device_initcall(bats_init);
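The conversions in this and the following ptdump files all rely on the same trick: `DEFINE_SHOW_ATTRIBUTE(name)` token-pastes `name_show` into generated `open`/`fops` boilerplate, so each debugfs file no longer hand-writes its own `*_open()` and `file_operations`. A toy userspace imitation of the pattern (the structs here are stand-ins, not the kernel's `seq_file` machinery):

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the kernel's struct file_operations. */
struct file_operations {
	int (*show)(void);
	const char *name;
};

/*
 * Toy version of the kernel macro: given "bats", generate a static
 * "bats_fops" wired to the existing "bats_show" function.  The real
 * macro additionally generates a <name>_open() wrapper around
 * single_open() and fills in seq_read/seq_lseek/single_release.
 */
#define DEFINE_SHOW_ATTRIBUTE(__name)				\
	static const struct file_operations __name##_fops = {	\
		.show = __name##_show,				\
		.name = #__name,				\
	}

static int bats_show(void)
{
	return 42;	/* the real one prints BAT registers to a seq_file */
}
DEFINE_SHOW_ATTRIBUTE(bats);
```

This is why the diff can delete ten lines of `bats_open()`/`bats_fops` per file and keep only the `*_show()` body.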
+4 -2
arch/powerpc/mm/ptdump/book3s64.c
··· 103 103 }; 104 104 105 105 struct pgtable_level pg_level[5] = { 106 - { 107 - }, { /* pgd */ 106 + { /* pgd */ 107 + .flag = flag_array, 108 + .num = ARRAY_SIZE(flag_array), 109 + }, { /* p4d */ 108 110 .flag = flag_array, 109 111 .num = ARRAY_SIZE(flag_array), 110 112 }, { /* pud */
+1 -11
arch/powerpc/mm/ptdump/hashpagetable.c
··· 526 526 return 0; 527 527 } 528 528 529 - static int ptdump_open(struct inode *inode, struct file *file) 530 - { 531 - return single_open(file, ptdump_show, NULL); 532 - } 533 - 534 - static const struct file_operations ptdump_fops = { 535 - .open = ptdump_open, 536 - .read = seq_read, 537 - .llseek = seq_lseek, 538 - .release = single_release, 539 - }; 529 + DEFINE_SHOW_ATTRIBUTE(ptdump); 540 530 541 531 static int ptdump_init(void) 542 532 {
+48 -130
arch/powerpc/mm/ptdump/ptdump.c
··· 16 16 #include <linux/io.h> 17 17 #include <linux/mm.h> 18 18 #include <linux/highmem.h> 19 + #include <linux/ptdump.h> 19 20 #include <linux/sched.h> 20 21 #include <linux/seq_file.h> 21 22 #include <asm/fixmap.h> ··· 55 54 * 56 55 */ 57 56 struct pg_state { 57 + struct ptdump_state ptdump; 58 58 struct seq_file *seq; 59 59 const struct addr_marker *marker; 60 60 unsigned long start_address; 61 61 unsigned long start_pa; 62 - unsigned int level; 62 + int level; 63 63 u64 current_flags; 64 64 bool check_wx; 65 65 unsigned long wx_pages; ··· 102 100 { 0, "kasan shadow mem end" }, 103 101 #endif 104 102 { -1, NULL }, 103 + }; 104 + 105 + static struct ptdump_range ptdump_range[] __ro_after_init = { 106 + {TASK_SIZE_MAX, ~0UL}, 107 + {0, 0} 105 108 }; 106 109 107 110 #define pt_dump_seq_printf(m, fmt, args...) \ ··· 195 188 st->wx_pages += (addr - st->start_address) / PAGE_SIZE; 196 189 } 197 190 198 - static void note_page_update_state(struct pg_state *st, unsigned long addr, 199 - unsigned int level, u64 val, unsigned long page_size) 191 + static void note_page_update_state(struct pg_state *st, unsigned long addr, int level, u64 val) 200 192 { 201 - u64 flag = val & pg_level[level].mask; 193 + u64 flag = level >= 0 ? val & pg_level[level].mask : 0; 202 194 u64 pa = val & PTE_RPN_MASK; 203 195 204 196 st->level = level; ··· 211 205 } 212 206 } 213 207 214 - static void note_page(struct pg_state *st, unsigned long addr, 215 - unsigned int level, u64 val, unsigned long page_size) 208 + static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level, u64 val) 216 209 { 217 - u64 flag = val & pg_level[level].mask; 210 + u64 flag = level >= 0 ? 
val & pg_level[level].mask : 0; 211 + struct pg_state *st = container_of(pt_st, struct pg_state, ptdump); 218 212 219 213 /* At first no level is set */ 220 - if (!st->level) { 214 + if (st->level == -1) { 221 215 pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name); 222 - note_page_update_state(st, addr, level, val, page_size); 216 + note_page_update_state(st, addr, level, val); 223 217 /* 224 218 * Dump the section of virtual memory when: 225 219 * - the PTE flags from one entry to the next differs. ··· 248 242 * Address indicates we have passed the end of the 249 243 * current section of virtual memory 250 244 */ 251 - note_page_update_state(st, addr, level, val, page_size); 252 - } 253 - } 254 - 255 - static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start) 256 - { 257 - pte_t *pte = pte_offset_kernel(pmd, 0); 258 - unsigned long addr; 259 - unsigned int i; 260 - 261 - for (i = 0; i < PTRS_PER_PTE; i++, pte++) { 262 - addr = start + i * PAGE_SIZE; 263 - note_page(st, addr, 4, pte_val(*pte), PAGE_SIZE); 264 - 265 - } 266 - } 267 - 268 - static void walk_hugepd(struct pg_state *st, hugepd_t *phpd, unsigned long start, 269 - int pdshift, int level) 270 - { 271 - #ifdef CONFIG_ARCH_HAS_HUGEPD 272 - unsigned int i; 273 - int shift = hugepd_shift(*phpd); 274 - int ptrs_per_hpd = pdshift - shift > 0 ? 
1 << (pdshift - shift) : 1; 275 - 276 - if (start & ((1 << shift) - 1)) 277 - return; 278 - 279 - for (i = 0; i < ptrs_per_hpd; i++) { 280 - unsigned long addr = start + (i << shift); 281 - pte_t *pte = hugepte_offset(*phpd, addr, pdshift); 282 - 283 - note_page(st, addr, level + 1, pte_val(*pte), 1 << shift); 284 - } 285 - #endif 286 - } 287 - 288 - static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) 289 - { 290 - pmd_t *pmd = pmd_offset(pud, 0); 291 - unsigned long addr; 292 - unsigned int i; 293 - 294 - for (i = 0; i < PTRS_PER_PMD; i++, pmd++) { 295 - addr = start + i * PMD_SIZE; 296 - if (!pmd_none(*pmd) && !pmd_is_leaf(*pmd)) 297 - /* pmd exists */ 298 - walk_pte(st, pmd, addr); 299 - else 300 - note_page(st, addr, 3, pmd_val(*pmd), PMD_SIZE); 301 - } 302 - } 303 - 304 - static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start) 305 - { 306 - pud_t *pud = pud_offset(p4d, 0); 307 - unsigned long addr; 308 - unsigned int i; 309 - 310 - for (i = 0; i < PTRS_PER_PUD; i++, pud++) { 311 - addr = start + i * PUD_SIZE; 312 - if (!pud_none(*pud) && !pud_is_leaf(*pud)) 313 - /* pud exists */ 314 - walk_pmd(st, pud, addr); 315 - else 316 - note_page(st, addr, 2, pud_val(*pud), PUD_SIZE); 317 - } 318 - } 319 - 320 - static void walk_pagetables(struct pg_state *st) 321 - { 322 - unsigned int i; 323 - unsigned long addr = st->start_address & PGDIR_MASK; 324 - pgd_t *pgd = pgd_offset_k(addr); 325 - 326 - /* 327 - * Traverse the linux pagetable structure and dump pages that are in 328 - * the hash pagetable. 
329 - */ 330 - for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) { 331 - p4d_t *p4d = p4d_offset(pgd, 0); 332 - 333 - if (p4d_none(*p4d) || p4d_is_leaf(*p4d)) 334 - note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE); 335 - else if (is_hugepd(__hugepd(p4d_val(*p4d)))) 336 - walk_hugepd(st, (hugepd_t *)p4d, addr, PGDIR_SHIFT, 1); 337 - else 338 - /* p4d exists */ 339 - walk_pud(st, p4d, addr); 245 + note_page_update_state(st, addr, level, val); 340 246 } 341 247 } 342 248 ··· 301 383 struct pg_state st = { 302 384 .seq = m, 303 385 .marker = address_markers, 304 - .start_address = IS_ENABLED(CONFIG_PPC64) ? PAGE_OFFSET : TASK_SIZE, 386 + .level = -1, 387 + .ptdump = { 388 + .note_page = note_page, 389 + .range = ptdump_range, 390 + } 305 391 }; 306 392 307 - #ifdef CONFIG_PPC64 308 - if (!radix_enabled()) 309 - st.start_address = KERN_VIRT_START; 310 - #endif 311 - 312 393 /* Traverse kernel page tables */ 313 - walk_pagetables(&st); 314 - note_page(&st, 0, 0, 0, 0); 394 + ptdump_walk_pgd(&st.ptdump, &init_mm, NULL); 315 395 return 0; 316 396 } 317 397 318 - 319 - static int ptdump_open(struct inode *inode, struct file *file) 320 - { 321 - return single_open(file, ptdump_show, NULL); 322 - } 323 - 324 - static const struct file_operations ptdump_fops = { 325 - .open = ptdump_open, 326 - .read = seq_read, 327 - .llseek = seq_lseek, 328 - .release = single_release, 329 - }; 398 + DEFINE_SHOW_ATTRIBUTE(ptdump); 330 399 331 400 static void build_pgtable_complete_mask(void) 332 401 { ··· 325 420 pg_level[i].mask |= pg_level[i].flag[j].mask; 326 421 } 327 422 328 - #ifdef CONFIG_PPC_DEBUG_WX 423 + #ifdef CONFIG_DEBUG_WX 329 424 void ptdump_check_wx(void) 330 425 { 331 426 struct pg_state st = { 332 427 .seq = NULL, 333 - .marker = address_markers, 428 + .marker = (struct addr_marker[]) { 429 + { 0, NULL}, 430 + { -1, NULL}, 431 + }, 432 + .level = -1, 334 433 .check_wx = true, 335 - .start_address = IS_ENABLED(CONFIG_PPC64) ? 
PAGE_OFFSET : TASK_SIZE, 434 + .ptdump = { 435 + .note_page = note_page, 436 + .range = ptdump_range, 437 + } 336 438 }; 337 439 338 - #ifdef CONFIG_PPC64 339 - if (!radix_enabled()) 340 - st.start_address = KERN_VIRT_START; 341 - #endif 342 - 343 - walk_pagetables(&st); 440 + ptdump_walk_pgd(&st.ptdump, &init_mm, NULL); 344 441 345 442 if (st.wx_pages) 346 443 pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found\n", ··· 352 445 } 353 446 #endif 354 447 355 - static int ptdump_init(void) 448 + static int __init ptdump_init(void) 356 449 { 450 + #ifdef CONFIG_PPC64 451 + if (!radix_enabled()) 452 + ptdump_range[0].start = KERN_VIRT_START; 453 + else 454 + ptdump_range[0].start = PAGE_OFFSET; 455 + 456 + ptdump_range[0].end = PAGE_OFFSET + (PGDIR_SIZE * PTRS_PER_PGD); 457 + #endif 458 + 357 459 populate_markers(); 358 460 build_pgtable_complete_mask(); 359 - debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, 360 - &ptdump_fops); 461 + 462 + if (IS_ENABLED(CONFIG_PTDUMP_DEBUGFS)) 463 + debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops); 464 + 361 465 return 0; 362 466 } 363 467 device_initcall(ptdump_init);
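The state machine that survives the conversion above — `note_page()` folding consecutive pages with identical protection flags into one printed range — can be sketched in isolation. This is a simplified userspace model, not the kernel code; `struct range`, `coalesce()`, and the flat `flags[]` input are invented for illustration (the real walker gets callbacks per level from `ptdump_walk_pgd()` and also tracks physical addresses):

```c
#include <assert.h>
#include <stddef.h>

struct range {
	unsigned long start, end;
	unsigned long flags;
};

/*
 * Fold npages consecutive pages into maximal runs of identical flags.
 * A new output range starts exactly when the flags differ from the
 * current run's flags -- the same condition note_page() uses to decide
 * when to print a line of /sys/kernel/debug/kernel_page_tables.
 * Returns the number of ranges written to out[].
 */
static size_t coalesce(const unsigned long *flags, size_t npages,
		       unsigned long page_size, struct range *out)
{
	size_t n = 0, i;

	if (!npages)
		return 0;

	out[0].start = 0;
	out[0].flags = flags[0];
	for (i = 1; i < npages; i++) {
		if (flags[i] != out[n].flags) {
			out[n].end = i * page_size;
			n++;
			out[n].start = i * page_size;
			out[n].flags = flags[i];
		}
	}
	out[n].end = npages * page_size;
	return n + 1;
}
```

The W+X check in the patch is the same walk with printing suppressed: it only counts pages whose coalesced flags contain both write and execute.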
+3 -13
arch/powerpc/mm/ptdump/segment_regs.c
··· 6 6 * This dumps the content of Segment Registers 7 7 */ 8 8 9 - #include <asm/debugfs.h> 9 + #include <linux/debugfs.h> 10 10 11 11 static void seg_show(struct seq_file *m, int i) 12 12 { ··· 41 41 return 0; 42 42 } 43 43 44 - static int sr_open(struct inode *inode, struct file *file) 45 - { 46 - return single_open(file, sr_show, NULL); 47 - } 48 - 49 - static const struct file_operations sr_fops = { 50 - .open = sr_open, 51 - .read = seq_read, 52 - .llseek = seq_lseek, 53 - .release = single_release, 54 - }; 44 + DEFINE_SHOW_ATTRIBUTE(sr); 55 45 56 46 static int __init sr_init(void) 57 47 { 58 - debugfs_create_file("segment_registers", 0400, powerpc_debugfs_root, 48 + debugfs_create_file("segment_registers", 0400, arch_debugfs_dir, 59 49 NULL, &sr_fops); 60 50 return 0; 61 51 }
+4 -2
arch/powerpc/mm/ptdump/shared.c
··· 68 68 }; 69 69 70 70 struct pgtable_level pg_level[5] = { 71 - { 72 - }, { /* pgd */ 71 + { /* pgd */ 72 + .flag = flag_array, 73 + .num = ARRAY_SIZE(flag_array), 74 + }, { /* p4d */ 73 75 .flag = flag_array, 74 76 .num = ARRAY_SIZE(flag_array), 75 77 }, { /* pud */
+11 -10
arch/powerpc/perf/core-book3s.c
··· 340 340 * If the PMU doesn't update the SIAR for non marked events use 341 341 * pt_regs. 342 342 * 343 + * If regs is a kernel interrupt, always use SIAR. Some PMUs have an 344 + * issue with regs_sipr not being in synch with SIAR in interrupt entry 345 + * and return sequences, which can result in regs_sipr being true for 346 + * kernel interrupts and SIAR, which has the effect of causing samples 347 + * to pile up at mtmsrd MSR[EE] 0->1 or pending irq replay around 348 + * interrupt entry/exit. 349 + * 343 350 * If the PMU has HV/PR flags then check to see if they 344 351 * place the exception in userspace. If so, use pt_regs. In 345 352 * continuous sampling mode the SIAR and the PMU exception are ··· 363 356 use_siar = 1; 364 357 else if ((ppmu->flags & PPMU_NO_CONT_SAMPLING)) 365 358 use_siar = 0; 359 + else if (!user_mode(regs)) 360 + use_siar = 1; 366 361 else if (!(ppmu->flags & PPMU_NO_SIPR) && regs_sipr(regs)) 367 362 use_siar = 0; 368 363 else ··· 2260 2251 */ 2261 2252 unsigned long perf_instruction_pointer(struct pt_regs *regs) 2262 2253 { 2263 - bool use_siar = regs_use_siar(regs); 2264 2254 unsigned long siar = mfspr(SPRN_SIAR); 2265 2255 2266 - if (ppmu && (ppmu->flags & PPMU_P10_DD1)) { 2267 - if (siar) 2268 - return siar; 2269 - else 2270 - return regs->nip; 2271 - } else if (use_siar && siar_valid(regs)) 2272 - return mfspr(SPRN_SIAR) + perf_ip_adjust(regs); 2273 - else if (use_siar) 2274 - return 0; // no valid instruction pointer 2256 + if (regs_use_siar(regs) && siar_valid(regs) && siar) 2257 + return siar + perf_ip_adjust(regs); 2275 2258 else 2276 2259 return regs->nip; 2277 2260 }
+1 -1
arch/powerpc/perf/hv-gpci.c
··· 175 175 */ 176 176 count = 0; 177 177 for (i = offset; i < offset + length; i++) 178 - count |= arg->bytes[i] << (i - offset); 178 + count |= (u64)(arg->bytes[i]) << ((length - 1 - (i - offset)) * 8); 179 179 180 180 *value = count; 181 181 out:
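The hv-gpci one-liner above is worth unpacking: the counter bytes arrive most-significant first, so byte `(i - offset)` must land at bit position `(length - 1 - (i - offset)) * 8`. The old expression shifted by `(i - offset)` *bits*, so every byte past the first was shifted by 1, 2, 3... bits and the assembled value was garbage. A userspace illustration of the corrected assembly (the function name is invented here):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assemble a big-endian counter of `length` bytes starting at
 * bytes[offset], exactly as the fixed hv-gpci loop does: the first
 * byte is the most significant, and each byte is shifted by a whole
 * number of bytes, not bits.
 */
static uint64_t be_bytes_to_u64(const uint8_t *bytes, int offset, int length)
{
	uint64_t count = 0;
	int i;

	for (i = offset; i < offset + length; i++)
		count |= (uint64_t)bytes[i] << ((length - 1 - (i - offset)) * 8);
	return count;
}
```

The cast to `uint64_t` before shifting also matters: without it, a byte shifted by 24 or more would be promoted only to `int` and overflow for counters wider than 4 bytes.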
+2 -2
arch/powerpc/platforms/44x/machine_check.c
··· 11 11 12 12 int machine_check_440A(struct pt_regs *regs) 13 13 { 14 - unsigned long reason = regs->dsisr; 14 + unsigned long reason = regs->esr; 15 15 16 16 printk("Machine check in kernel mode.\n"); 17 17 if (reason & ESR_IMCP){ ··· 48 48 #ifdef CONFIG_PPC_47x 49 49 int machine_check_47x(struct pt_regs *regs) 50 50 { 51 - unsigned long reason = regs->dsisr; 51 + unsigned long reason = regs->esr; 52 52 u32 mcsr; 53 53 54 54 printk(KERN_ERR "Machine check in kernel mode.\n");
+1 -1
arch/powerpc/platforms/4xx/machine_check.c
··· 10 10 11 11 int machine_check_4xx(struct pt_regs *regs) 12 12 { 13 - unsigned long reason = regs->dsisr; 13 + unsigned long reason = regs->esr; 14 14 15 15 if (reason & ESR_IMCP) { 16 16 printk("Instruction");
-6
arch/powerpc/platforms/85xx/Kconfig
··· 208 208 select TQM85xx 209 209 select CPM2 210 210 211 - config SBC8548 212 - bool "Wind River SBC8548" 213 - select DEFAULT_UIMAGE 214 - help 215 - This option enables support for the Wind River SBC8548 board 216 - 217 211 config PPA8548 218 212 bool "Prodrive PPA8548" 219 213 help
-1
arch/powerpc/platforms/85xx/Makefile
··· 26 26 obj-$(CONFIG_FB_FSL_DIU) += t1042rdb_diu.o 27 27 obj-$(CONFIG_STX_GP3) += stx_gp3.o 28 28 obj-$(CONFIG_TQM85xx) += tqm85xx.o 29 - obj-$(CONFIG_SBC8548) += sbc8548.o 30 29 obj-$(CONFIG_PPA8548) += ppa8548.o 31 30 obj-$(CONFIG_SOCRATES) += socrates.o socrates_fpga_pic.o 32 31 obj-$(CONFIG_KSI8560) += ksi8560.o
-134
arch/powerpc/platforms/85xx/sbc8548.c
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * Wind River SBC8548 setup and early boot code.
-  *
-  * Copyright 2007 Wind River Systems Inc.
-  *
-  * By Paul Gortmaker (see MAINTAINERS for contact information)
-  *
-  * Based largely on the MPC8548CDS support - Copyright 2005 Freescale Inc.
-  */
-
- #include <linux/stddef.h>
- #include <linux/kernel.h>
- #include <linux/init.h>
- #include <linux/errno.h>
- #include <linux/reboot.h>
- #include <linux/pci.h>
- #include <linux/kdev_t.h>
- #include <linux/major.h>
- #include <linux/console.h>
- #include <linux/delay.h>
- #include <linux/seq_file.h>
- #include <linux/initrd.h>
- #include <linux/interrupt.h>
- #include <linux/fsl_devices.h>
- #include <linux/of_platform.h>
- #include <linux/pgtable.h>
-
- #include <asm/page.h>
- #include <linux/atomic.h>
- #include <asm/time.h>
- #include <asm/io.h>
- #include <asm/machdep.h>
- #include <asm/ipic.h>
- #include <asm/pci-bridge.h>
- #include <asm/irq.h>
- #include <mm/mmu_decl.h>
- #include <asm/prom.h>
- #include <asm/udbg.h>
- #include <asm/mpic.h>
-
- #include <sysdev/fsl_soc.h>
- #include <sysdev/fsl_pci.h>
-
- #include "mpc85xx.h"
-
- static int sbc_rev;
-
- static void __init sbc8548_pic_init(void)
- {
-         struct mpic *mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN,
-                         0, 256, " OpenPIC  ");
-         BUG_ON(mpic == NULL);
-         mpic_init(mpic);
- }
-
- /* Extract the HW Rev from the EPLD on the board */
- static int __init sbc8548_hw_rev(void)
- {
-         struct device_node *np;
-         struct resource res;
-         unsigned int *rev;
-         int board_rev = 0;
-
-         np = of_find_compatible_node(NULL, NULL, "hw-rev");
-         if (np == NULL) {
-                 printk("No HW-REV found in DTB.\n");
-                 return -ENODEV;
-         }
-
-         of_address_to_resource(np, 0, &res);
-         of_node_put(np);
-
-         rev = ioremap(res.start, sizeof(unsigned int));
-         board_rev = (*rev) >> 28;
-         iounmap(rev);
-
-         return board_rev;
- }
-
- /*
-  * Setup the architecture
-  */
- static void __init sbc8548_setup_arch(void)
- {
-         if (ppc_md.progress)
-                 ppc_md.progress("sbc8548_setup_arch()", 0);
-
-         fsl_pci_assign_primary();
-
-         sbc_rev = sbc8548_hw_rev();
- }
-
- static void sbc8548_show_cpuinfo(struct seq_file *m)
- {
-         uint pvid, svid, phid1;
-
-         pvid = mfspr(SPRN_PVR);
-         svid = mfspr(SPRN_SVR);
-
-         seq_printf(m, "Vendor\t\t: Wind River\n");
-         seq_printf(m, "Machine\t\t: SBC8548 v%d\n", sbc_rev);
-         seq_printf(m, "PVR\t\t: 0x%x\n", pvid);
-         seq_printf(m, "SVR\t\t: 0x%x\n", svid);
-
-         /* Display cpu Pll setting */
-         phid1 = mfspr(SPRN_HID1);
-         seq_printf(m, "PLL setting\t: 0x%x\n", ((phid1 >> 24) & 0x3f));
- }
-
- machine_arch_initcall(sbc8548, mpc85xx_common_publish_devices);
-
- /*
-  * Called very early, device-tree isn't unflattened
-  */
- static int __init sbc8548_probe(void)
- {
-         return of_machine_is_compatible("SBC8548");
- }
-
- define_machine(sbc8548) {
-         .name           = "SBC8548",
-         .probe          = sbc8548_probe,
-         .setup_arch     = sbc8548_setup_arch,
-         .init_IRQ       = sbc8548_pic_init,
-         .show_cpuinfo   = sbc8548_show_cpuinfo,
-         .get_irq        = mpic_get_irq,
- #ifdef CONFIG_PCI
-         .pcibios_fixup_bus = fsl_pcibios_fixup_bus,
-         .pcibios_fixup_phb = fsl_pcibios_fixup_phb,
- #endif
-         .calibrate_decr = generic_calibrate_decr,
-         .progress       = udbg_progress,
- };
+1 -7
arch/powerpc/platforms/86xx/Kconfig
··· 20 20
          help
            This option enables support for the MPC8641 HPCN board.

- config SBC8641D
-         bool "Wind River SBC8641D"
-         select DEFAULT_UIMAGE
-         help
-           This option enables support for the WRS SBC8641D board.
-
  config MPC8610_HPCD
          bool "Freescale MPC8610 HPCD"
          select DEFAULT_UIMAGE
··· 68 74
          select FSL_PCI if PCI
          select PPC_UDBG_16550
          select MPIC
-         default y if MPC8641_HPCN || SBC8641D || GEF_SBC610 || GEF_SBC310 || GEF_PPC9A \
+         default y if MPC8641_HPCN || GEF_SBC610 || GEF_SBC310 || GEF_PPC9A \
                  || MVME7100

  config MPC8610
-1
arch/powerpc/platforms/86xx/Makefile
··· 6 6
  obj-y := pic.o common.o
  obj-$(CONFIG_SMP) += mpc86xx_smp.o
  obj-$(CONFIG_MPC8641_HPCN) += mpc86xx_hpcn.o
- obj-$(CONFIG_SBC8641D) += sbc8641d.o
  obj-$(CONFIG_MPC8610_HPCD) += mpc8610_hpcd.o
  obj-$(CONFIG_GEF_SBC610) += gef_sbc610.o
  obj-$(CONFIG_GEF_SBC310) += gef_sbc310.o
-87
arch/powerpc/platforms/86xx/sbc8641d.c
- // SPDX-License-Identifier: GPL-2.0-or-later
- /*
-  * SBC8641D board specific routines
-  *
-  * Copyright 2008 Wind River Systems Inc.
-  *
-  * By Paul Gortmaker (see MAINTAINERS for contact information)
-  *
-  * Based largely on the 8641 HPCN support by Freescale Semiconductor Inc.
-  */
-
- #include <linux/stddef.h>
- #include <linux/kernel.h>
- #include <linux/pci.h>
- #include <linux/kdev_t.h>
- #include <linux/delay.h>
- #include <linux/seq_file.h>
- #include <linux/of_platform.h>
-
- #include <asm/time.h>
- #include <asm/machdep.h>
- #include <asm/pci-bridge.h>
- #include <asm/prom.h>
- #include <mm/mmu_decl.h>
- #include <asm/udbg.h>
-
- #include <asm/mpic.h>
-
- #include <sysdev/fsl_pci.h>
- #include <sysdev/fsl_soc.h>
-
- #include "mpc86xx.h"
-
- static void __init
- sbc8641_setup_arch(void)
- {
-         if (ppc_md.progress)
-                 ppc_md.progress("sbc8641_setup_arch()", 0);
-
-         printk("SBC8641 board from Wind River\n");
-
- #ifdef CONFIG_SMP
-         mpc86xx_smp_init();
- #endif
-
-         fsl_pci_assign_primary();
- }
-
-
- static void
- sbc8641_show_cpuinfo(struct seq_file *m)
- {
-         uint svid = mfspr(SPRN_SVR);
-
-         seq_printf(m, "Vendor\t\t: Wind River Systems\n");
-
-         seq_printf(m, "SVR\t\t: 0x%x\n", svid);
- }
-
-
- /*
-  * Called very early, device-tree isn't unflattened
-  */
- static int __init sbc8641_probe(void)
- {
-         if (of_machine_is_compatible("wind,sbc8641"))
-                 return 1;       /* Looks good */
-
-         return 0;
- }
-
- machine_arch_initcall(sbc8641, mpc86xx_common_publish_devices);
-
- define_machine(sbc8641) {
-         .name                   = "SBC8641D",
-         .probe                  = sbc8641_probe,
-         .setup_arch             = sbc8641_setup_arch,
-         .init_IRQ               = mpc86xx_init_irq,
-         .show_cpuinfo           = sbc8641_show_cpuinfo,
-         .get_irq                = mpic_get_irq,
-         .time_init              = mpc86xx_time_init,
-         .calibrate_decr         = generic_calibrate_decr,
-         .progress               = udbg_progress,
- #ifdef CONFIG_PCI
-         .pcibios_fixup_bus      = fsl_pcibios_fixup_bus,
- #endif
- };
+2 -2
arch/powerpc/platforms/cell/axon_msi.c
··· 12 12
  #include <linux/export.h>
  #include <linux/of_platform.h>
  #include <linux/slab.h>
+ #include <linux/debugfs.h>

- #include <asm/debugfs.h>
  #include <asm/dcr.h>
  #include <asm/machdep.h>
  #include <asm/prom.h>
··· 480 480

          snprintf(name, sizeof(name), "msic_%d", of_node_to_nid(dn));

-         debugfs_create_file(name, 0600, powerpc_debugfs_root, msic, &fops_msic);
+         debugfs_create_file(name, 0600, arch_debugfs_dir, msic, &fops_msic);
  }
  #endif /* DEBUG */
+1 -1
arch/powerpc/platforms/embedded6xx/holly.c
··· 251 251
          /* Are we prepared to handle this fault */
          if ((entry = search_exception_tables(regs->nip)) != NULL) {
                  tsi108_clear_pci_cfg_error();
-                 regs_set_return_msr(regs, regs->msr | MSR_RI);
+                 regs_set_recoverable(regs);
                  regs_set_return_ip(regs, extable_fixup(entry));
                  return 1;
          }
+1 -1
arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
··· 173 173
          /* Are we prepared to handle this fault */
          if ((entry = search_exception_tables(regs->nip)) != NULL) {
                  tsi108_clear_pci_cfg_error();
-                 regs_set_return_msr(regs, regs->msr | MSR_RI);
+                 regs_set_recoverable(regs);
                  regs_set_return_ip(regs, extable_fixup(entry));
                  return 1;
          }
+1 -1
arch/powerpc/platforms/pasemi/idle.c
··· 59 59
                  restore_astate(hard_smp_processor_id());

          /* everything handled */
-         regs_set_return_msr(regs, regs->msr | MSR_RI);
+         regs_set_recoverable(regs);
          return 1;
  }
+2 -4
arch/powerpc/platforms/powernv/idle.c
··· 199 199
           */
          power7_fastsleep_workaround_exit = false;

-         get_online_cpus();
+         cpus_read_lock();
          primary_thread_mask = cpu_online_cores_map();
          on_each_cpu_mask(&primary_thread_mask,
                          pnv_fastsleep_workaround_apply,
                          &err, 1);
-         put_online_cpus();
+         cpus_read_unlock();
          if (err) {
                  pr_err("fastsleep_workaround_applyonce change failed while running pnv_fastsleep_workaround_apply");
                  goto fail;
··· 667 667
          sprs.purr       = mfspr(SPRN_PURR);
          sprs.spurr      = mfspr(SPRN_SPURR);
          sprs.dscr       = mfspr(SPRN_DSCR);
-         sprs.wort       = mfspr(SPRN_WORT);
          sprs.ciabr      = mfspr(SPRN_CIABR);

          sprs.mmcra      = mfspr(SPRN_MMCRA);
··· 784 785
          mtspr(SPRN_PURR,        sprs.purr);
          mtspr(SPRN_SPURR,       sprs.spurr);
          mtspr(SPRN_DSCR,        sprs.dscr);
-         mtspr(SPRN_WORT,        sprs.wort);
          mtspr(SPRN_CIABR,       sprs.ciabr);

          mtspr(SPRN_MMCRA,       sprs.mmcra);
+1 -2
arch/powerpc/platforms/powernv/memtrace.c
··· 18 18
  #include <linux/memory_hotplug.h>
  #include <linux/numa.h>
  #include <asm/machdep.h>
- #include <asm/debugfs.h>
  #include <asm/cacheflush.h>

  /* This enables us to keep track of the memory removed from each node. */
··· 329 330
  static int memtrace_init(void)
  {
          memtrace_debugfs_dir = debugfs_create_dir("memtrace",
-                                                   powerpc_debugfs_root);
+                                                   arch_debugfs_dir);

          debugfs_create_file("enable", 0600, memtrace_debugfs_dir,
                              NULL, &memtrace_init_fops);
+6 -6
arch/powerpc/platforms/powernv/opal-imc.c
··· 13 13
  #include <linux/of_address.h>
  #include <linux/of_platform.h>
  #include <linux/crash_dump.h>
+ #include <linux/debugfs.h>
  #include <asm/opal.h>
  #include <asm/io.h>
  #include <asm/imc-pmu.h>
  #include <asm/cputhreads.h>
- #include <asm/debugfs.h>

  static struct dentry *imc_debugfs_parent;

··· 56 56
          u32 cb_offset;
          struct imc_mem_info *ptr = pmu_ptr->mem_info;

-         imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root);
+         imc_debugfs_parent = debugfs_create_dir("imc", arch_debugfs_dir);

          if (of_property_read_u32(node, "cb_offset", &cb_offset))
                  cb_offset = IMC_CNTL_BLK_OFFSET;
··· 186 186
          int nid, cpu;
          const struct cpumask *l_cpumask;

-         get_online_cpus();
+         cpus_read_lock();
          for_each_node_with_cpus(nid) {
                  l_cpumask = cpumask_of_node(nid);
                  cpu = cpumask_first_and(l_cpumask, cpu_online_mask);
··· 195 195
                  opal_imc_counters_stop(OPAL_IMC_COUNTERS_NEST,
                                         get_hard_smp_processor_id(cpu));
          }
-         put_online_cpus();
+         cpus_read_unlock();
  }

  static void disable_core_pmu_counters(void)
··· 203 203
          cpumask_t cores_map;
          int cpu, rc;

-         get_online_cpus();
+         cpus_read_lock();
          /* Disable the IMC Core functions */
          cores_map = cpu_online_cores_map();
          for_each_cpu(cpu, &cores_map) {
··· 213 213
                  pr_err("%s: Failed to stop Core (cpu = %d)\n",
                          __FUNCTION__, cpu);
          }
-         put_online_cpus();
+         cpus_read_unlock();
  }

  int get_max_nest_dev(void)
+2 -2
arch/powerpc/platforms/powernv/opal-lpc.c
··· 10 10
  #include <linux/bug.h>
  #include <linux/io.h>
  #include <linux/slab.h>
+ #include <linux/debugfs.h>

  #include <asm/machdep.h>
  #include <asm/firmware.h>
  #include <asm/opal.h>
  #include <asm/prom.h>
  #include <linux/uaccess.h>
- #include <asm/debugfs.h>
  #include <asm/isa-bridge.h>

  static int opal_lpc_chip_id = -1;
··· 371 371
          if (opal_lpc_chip_id < 0)
                  return -ENODEV;

-         root = debugfs_create_dir("lpc", powerpc_debugfs_root);
+         root = debugfs_create_dir("lpc", arch_debugfs_dir);

          rc |= opal_lpc_debugfs_create_type(root, "io", OPAL_LPC_IO);
          rc |= opal_lpc_debugfs_create_type(root, "mem", OPAL_LPC_MEM);
+2 -2
arch/powerpc/platforms/powernv/opal-xscom.c
··· 14 14
  #include <linux/gfp.h>
  #include <linux/slab.h>
  #include <linux/uaccess.h>
+ #include <linux/debugfs.h>

  #include <asm/machdep.h>
  #include <asm/firmware.h>
  #include <asm/opal.h>
- #include <asm/debugfs.h>
  #include <asm/prom.h>

  static u64 opal_scom_unmangle(u64 addr)
··· 189 189
          if (!firmware_has_feature(FW_FEATURE_OPAL))
                  return 0;

-         root = debugfs_create_dir("scom", powerpc_debugfs_root);
+         root = debugfs_create_dir("scom", arch_debugfs_dir);
          if (!root)
                  return -1;
+1 -1
arch/powerpc/platforms/powernv/opal.c
··· 588 588
  {
          int recovered = 0;

-         if (!(regs->msr & MSR_RI)) {
+         if (regs_is_unrecoverable(regs)) {
                  /* If MSR_RI isn't set, we cannot recover */
                  pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");
                  recovered = 0;
+237 -23
arch/powerpc/platforms/powernv/pci-ioda.c
··· 20 20
  #include <linux/iommu.h>
  #include <linux/rculist.h>
  #include <linux/sizes.h>
+ #include <linux/debugfs.h>

  #include <asm/sections.h>
  #include <asm/io.h>
··· 33 32
  #include <asm/iommu.h>
  #include <asm/tce.h>
  #include <asm/xics.h>
- #include <asm/debugfs.h>
  #include <asm/firmware.h>
  #include <asm/pnv-pci.h>
  #include <asm/mmzone.h>
+ #include <asm/xive.h>

  #include <misc/cxl-base.h>

··· 1963 1962
          pe->dma_setup_done = true;
  }

- int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq)
+ /*
+  * Called from KVM in real mode to EOI passthru interrupts. The ICP
+  * EOI is handled directly in KVM in kvmppc_deliver_irq_passthru().
+  *
+  * The IRQ data is mapped in the PCI-MSI domain and the EOI OPAL call
+  * needs an HW IRQ number mapped in the XICS IRQ domain. The HW IRQ
+  * numbers of the in-the-middle MSI domain are vector numbers and it's
+  * good enough for OPAL. Use that.
+  */
+ int64_t pnv_opal_pci_msi_eoi(struct irq_data *d)
  {
-         struct pnv_phb *phb = container_of(chip, struct pnv_phb,
-                                            ioda.irq_chip);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d->parent_data);
+         struct pnv_phb *phb = hose->private_data;

-         return opal_pci_msi_eoi(phb->opal_id, hw_irq);
+         return opal_pci_msi_eoi(phb->opal_id, d->parent_data->hwirq);
  }

+ /*
+  * The IRQ data is mapped in the XICS domain, with OPAL HW IRQ numbers
+  */
  static void pnv_ioda2_msi_eoi(struct irq_data *d)
  {
          int64_t rc;
          unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
-         struct irq_chip *chip = irq_data_get_irq_chip(d);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;

-         rc = pnv_opal_pci_msi_eoi(chip, hw_irq);
+         rc = opal_pci_msi_eoi(phb->opal_id, hw_irq);
          WARN_ON_ONCE(rc);

          icp_native_eoi(d);
  }

-
+ /* P8/CXL only */
  void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
  {
          struct irq_data *idata;
··· 2018 2004
                  phb->ioda.irq_chip.irq_eoi = pnv_ioda2_msi_eoi;
          }
          irq_set_chip(virq, &phb->ioda.irq_chip);
+         irq_set_chip_data(virq, phb->hose);
  }
+
+ static struct irq_chip pnv_pci_msi_irq_chip;

  /*
   * Returns true iff chip is something that we could call
··· 2029 2012
   */
  bool is_pnv_opal_msi(struct irq_chip *chip)
  {
-         return chip->irq_eoi == pnv_ioda2_msi_eoi;
+         return chip == &pnv_pci_msi_irq_chip;
  }
  EXPORT_SYMBOL_GPL(is_pnv_opal_msi);

- static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
-                                   unsigned int hwirq, unsigned int virq,
-                                   unsigned int is_64, struct msi_msg *msg)
+ static int __pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
+                                     unsigned int xive_num,
+                                     unsigned int is_64, struct msi_msg *msg)
  {
          struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
-         unsigned int xive_num = hwirq - phb->msi_base;
          __be32 data;
          int rc;
+
+         dev_dbg(&dev->dev, "%s: setup %s-bit MSI for vector #%d\n", __func__,
+                 is_64 ? "64" : "32", xive_num);

          /* No PE assigned ? bail out ... no MSI for you ! */
          if (pe == NULL)
··· 2091 2072
          }
          msg->data = be32_to_cpu(data);

-         pnv_set_msi_irq_chip(phb, virq);
+         return 0;
+ }

-         pr_devel("%s: %s-bit MSI on hwirq %x (xive #%d),"
-                  " address=%x_%08x data=%x PE# %x\n",
-                  pci_name(dev), is_64 ? "64" : "32", hwirq, xive_num,
-                  msg->address_hi, msg->address_lo, data, pe->pe_number);
+ /*
+  * The msi_free() op is called before irq_domain_free_irqs_top() when
+  * the handler data is still available. Use that to clear the XIVE
+  * controller.
+  */
+ static void pnv_msi_ops_msi_free(struct irq_domain *domain,
+                                  struct msi_domain_info *info,
+                                  unsigned int irq)
+ {
+         if (xive_enabled())
+                 xive_irq_free_data(irq);
+ }
+
+ static struct msi_domain_ops pnv_pci_msi_domain_ops = {
+         .msi_free       = pnv_msi_ops_msi_free,
+ };
+
+ static void pnv_msi_shutdown(struct irq_data *d)
+ {
+         d = d->parent_data;
+         if (d->chip->irq_shutdown)
+                 d->chip->irq_shutdown(d);
+ }
+
+ static void pnv_msi_mask(struct irq_data *d)
+ {
+         pci_msi_mask_irq(d);
+         irq_chip_mask_parent(d);
+ }
+
+ static void pnv_msi_unmask(struct irq_data *d)
+ {
+         pci_msi_unmask_irq(d);
+         irq_chip_unmask_parent(d);
+ }
+
+ static struct irq_chip pnv_pci_msi_irq_chip = {
+         .name           = "PNV-PCI-MSI",
+         .irq_shutdown   = pnv_msi_shutdown,
+         .irq_mask       = pnv_msi_mask,
+         .irq_unmask     = pnv_msi_unmask,
+         .irq_eoi        = irq_chip_eoi_parent,
+ };
+
+ static struct msi_domain_info pnv_msi_domain_info = {
+         .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+                   MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
+         .ops   = &pnv_pci_msi_domain_ops,
+         .chip  = &pnv_pci_msi_irq_chip,
+ };
+
+ static void pnv_msi_compose_msg(struct irq_data *d, struct msi_msg *msg)
+ {
+         struct msi_desc *entry = irq_data_get_msi_desc(d);
+         struct pci_dev *pdev = msi_desc_to_pci_dev(entry);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+         int rc;
+
+         rc = __pnv_pci_ioda_msi_setup(phb, pdev, d->hwirq,
+                                       entry->msi_attrib.is_64, msg);
+         if (rc)
+                 dev_err(&pdev->dev, "Failed to setup %s-bit MSI #%ld : %d\n",
+                         entry->msi_attrib.is_64 ? "64" : "32", d->hwirq, rc);
+ }
+
+ /*
+  * The IRQ data is mapped in the MSI domain in which HW IRQ numbers
+  * correspond to vector numbers.
+  */
+ static void pnv_msi_eoi(struct irq_data *d)
+ {
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+
+         if (phb->model == PNV_PHB_MODEL_PHB3) {
+                 /*
+                  * The EOI OPAL call takes an OPAL HW IRQ number but
+                  * since it is translated into a vector number in
+                  * OPAL, use that directly.
+                  */
+                 WARN_ON_ONCE(opal_pci_msi_eoi(phb->opal_id, d->hwirq));
+         }
+
+         irq_chip_eoi_parent(d);
+ }
+
+ static struct irq_chip pnv_msi_irq_chip = {
+         .name                   = "PNV-MSI",
+         .irq_shutdown           = pnv_msi_shutdown,
+         .irq_mask               = irq_chip_mask_parent,
+         .irq_unmask             = irq_chip_unmask_parent,
+         .irq_eoi                = pnv_msi_eoi,
+         .irq_set_affinity       = irq_chip_set_affinity_parent,
+         .irq_compose_msi_msg    = pnv_msi_compose_msg,
+ };
+
+ static int pnv_irq_parent_domain_alloc(struct irq_domain *domain,
+                                        unsigned int virq, int hwirq)
+ {
+         struct irq_fwspec parent_fwspec;
+         int ret;
+
+         parent_fwspec.fwnode = domain->parent->fwnode;
+         parent_fwspec.param_count = 2;
+         parent_fwspec.param[0] = hwirq;
+         parent_fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
+
+         ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &parent_fwspec);
+         if (ret)
+                 return ret;
+
+         return 0;
+ }
+
+ static int pnv_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+                                 unsigned int nr_irqs, void *arg)
+ {
+         struct pci_controller *hose = domain->host_data;
+         struct pnv_phb *phb = hose->private_data;
+         msi_alloc_info_t *info = arg;
+         struct pci_dev *pdev = msi_desc_to_pci_dev(info->desc);
+         int hwirq;
+         int i, ret;
+
+         hwirq = msi_bitmap_alloc_hwirqs(&phb->msi_bmp, nr_irqs);
+         if (hwirq < 0) {
+                 dev_warn(&pdev->dev, "failed to find a free MSI\n");
+                 return -ENOSPC;
+         }
+
+         dev_dbg(&pdev->dev, "%s bridge %pOF %d/%x #%d\n", __func__,
+                 hose->dn, virq, hwirq, nr_irqs);
+
+         for (i = 0; i < nr_irqs; i++) {
+                 ret = pnv_irq_parent_domain_alloc(domain, virq + i,
+                                                   phb->msi_base + hwirq + i);
+                 if (ret)
+                         goto out;
+
+                 irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+                                               &pnv_msi_irq_chip, hose);
+         }
+
+         return 0;
+
+ out:
+         irq_domain_free_irqs_parent(domain, virq, i - 1);
+         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, nr_irqs);
+         return ret;
+ }
+
+ static void pnv_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+                                 unsigned int nr_irqs)
+ {
+         struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+         struct pci_controller *hose = irq_data_get_irq_chip_data(d);
+         struct pnv_phb *phb = hose->private_data;
+
+         pr_debug("%s bridge %pOF %d/%lx #%d\n", __func__, hose->dn,
+                  virq, d->hwirq, nr_irqs);
+
+         msi_bitmap_free_hwirqs(&phb->msi_bmp, d->hwirq, nr_irqs);
+         /* XIVE domain is cleared through ->msi_free() */
+ }
+
+ static const struct irq_domain_ops pnv_irq_domain_ops = {
+         .alloc  = pnv_irq_domain_alloc,
+         .free   = pnv_irq_domain_free,
+ };
+
+ static int pnv_msi_allocate_domains(struct pci_controller *hose, unsigned int count)
+ {
+         struct pnv_phb *phb = hose->private_data;
+         struct irq_domain *parent = irq_get_default_host();
+
+         hose->fwnode = irq_domain_alloc_named_id_fwnode("PNV-MSI", phb->opal_id);
+         if (!hose->fwnode)
+                 return -ENOMEM;
+
+         hose->dev_domain = irq_domain_create_hierarchy(parent, 0, count,
+                                                        hose->fwnode,
+                                                        &pnv_irq_domain_ops, hose);
+         if (!hose->dev_domain) {
+                 pr_err("PCI: failed to create IRQ domain bridge %pOF (domain %d)\n",
+                        hose->dn, hose->global_number);
+                 irq_domain_free_fwnode(hose->fwnode);
+                 return -ENOMEM;
+         }
+
+         hose->msi_domain = pci_msi_create_irq_domain(of_node_to_fwnode(hose->dn),
+                                                      &pnv_msi_domain_info,
+                                                      hose->dev_domain);
+         if (!hose->msi_domain) {
+                 pr_err("PCI: failed to create MSI IRQ domain bridge %pOF (domain %d)\n",
+                        hose->dn, hose->global_number);
+                 irq_domain_free_fwnode(hose->fwnode);
+                 irq_domain_remove(hose->dev_domain);
+                 return -ENOMEM;
+         }

          return 0;
  }
··· 2318 2102
                  return;
          }

-         phb->msi_setup = pnv_pci_ioda_msi_setup;
-         phb->msi32_support = 1;
          pr_info("  Allocated bitmap for %d MSIs (base IRQ 0x%x)\n",
                  count, phb->msi_base);
+
+         pnv_msi_allocate_domains(phb->hose, count);
  }

  static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
··· 2475 2259
          phb = hose->private_data;

          sprintf(name, "PCI%04x", hose->global_number);
-         phb->dbgfs = debugfs_create_dir(name, powerpc_debugfs_root);
+         phb->dbgfs = debugfs_create_dir(name, arch_debugfs_dir);

          debugfs_create_file_unsafe("dump_diag_regs", 0200, phb->dbgfs,
                                     phb, &pnv_pci_diag_data_fops);
··· 2925 2709
          .dma_dev_setup          = pnv_pci_ioda_dma_dev_setup,
          .dma_bus_setup          = pnv_pci_ioda_dma_bus_setup,
          .iommu_bypass_supported = pnv_pci_ioda_iommu_bypass_supported,
-         .setup_msi_irqs         = pnv_setup_msi_irqs,
-         .teardown_msi_irqs      = pnv_teardown_msi_irqs,
          .enable_device_hook     = pnv_pci_enable_device_hook,
          .release_device         = pnv_pci_release_device,
          .window_alignment       = pnv_pci_window_alignment,
-67
arch/powerpc/platforms/powernv/pci.c
··· 160 160
  }
  EXPORT_SYMBOL_GPL(pnv_pci_set_power_state);

- int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
- {
-         struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-         struct msi_desc *entry;
-         struct msi_msg msg;
-         int hwirq;
-         unsigned int virq;
-         int rc;
-
-         if (WARN_ON(!phb) || !phb->msi_bmp.bitmap)
-                 return -ENODEV;
-
-         if (pdev->no_64bit_msi && !phb->msi32_support)
-                 return -ENODEV;
-
-         for_each_pci_msi_entry(entry, pdev) {
-                 if (!entry->msi_attrib.is_64 && !phb->msi32_support) {
-                         pr_warn("%s: Supports only 64-bit MSIs\n",
-                                 pci_name(pdev));
-                         return -ENXIO;
-                 }
-                 hwirq = msi_bitmap_alloc_hwirqs(&phb->msi_bmp, 1);
-                 if (hwirq < 0) {
-                         pr_warn("%s: Failed to find a free MSI\n",
-                                 pci_name(pdev));
-                         return -ENOSPC;
-                 }
-                 virq = irq_create_mapping(NULL, phb->msi_base + hwirq);
-                 if (!virq) {
-                         pr_warn("%s: Failed to map MSI to linux irq\n",
-                                 pci_name(pdev));
-                         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, 1);
-                         return -ENOMEM;
-                 }
-                 rc = phb->msi_setup(phb, pdev, phb->msi_base + hwirq,
-                                     virq, entry->msi_attrib.is_64, &msg);
-                 if (rc) {
-                         pr_warn("%s: Failed to setup MSI\n", pci_name(pdev));
-                         irq_dispose_mapping(virq);
-                         msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, 1);
-                         return rc;
-                 }
-                 irq_set_msi_desc(virq, entry);
-                 pci_write_msi_msg(virq, &msg);
-         }
-         return 0;
- }
-
- void pnv_teardown_msi_irqs(struct pci_dev *pdev)
- {
-         struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
-         struct msi_desc *entry;
-         irq_hw_number_t hwirq;
-
-         if (WARN_ON(!phb))
-                 return;
-
-         for_each_pci_msi_entry(entry, pdev) {
-                 if (!entry->irq)
-                         continue;
-                 hwirq = virq_to_hw(entry->irq);
-                 irq_set_msi_desc(entry->irq, NULL);
-                 irq_dispose_mapping(entry->irq);
-                 msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq - phb->msi_base, 1);
-         }
- }
-
  /* Nicely print the contents of the PE State Tables (PEST). */
  static void pnv_pci_dump_pest(__be64 pestA[], __be64 pestB[], int pest_size)
  {
-6
arch/powerpc/platforms/powernv/pci.h
··· 123 123
  #endif

          unsigned int            msi_base;
-         unsigned int            msi32_support;
          struct msi_bitmap       msi_bmp;
-         int (*msi_setup)(struct pnv_phb *phb, struct pci_dev *dev,
-                          unsigned int hwirq, unsigned int virq,
-                          unsigned int is_64, struct msi_msg *msg);
          int (*init_m64)(struct pnv_phb *phb);
          int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
          void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
··· 285 289
  extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
  extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);

- extern int pnv_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type);
- extern void pnv_teardown_msi_irqs(struct pci_dev *pdev);
  extern struct pnv_ioda_pe *pnv_pci_bdfn_to_pe(struct pnv_phb *phb, u16 bdfn);
  extern struct pnv_ioda_pe *pnv_ioda_get_pe(struct pci_dev *dev);
  extern void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq);
+2 -1
arch/powerpc/platforms/ps3/htab.c
··· 169 169
          spin_unlock_irqrestore(&ps3_htab_lock, flags);
  }

- static void ps3_hpte_clear(void)
+ /* Called during kexec sequence with MMU off */
+ static notrace void ps3_hpte_clear(void)
  {
          unsigned long hpte_count = (1UL << ppc64_pft_size) >> 4;
          u64 i;
+6 -2
arch/powerpc/platforms/ps3/mm.c
··· 195 195

  /**
   * ps3_mm_vas_destroy -
+  *
+  * called during kexec sequence with MMU off.
   */

- void ps3_mm_vas_destroy(void)
+ notrace void ps3_mm_vas_destroy(void)
  {
          int result;

··· 1245 1243

  /**
   * ps3_mm_shutdown - final cleanup of address space
+  *
+  * called during kexec sequence with MMU off.
   */

- void ps3_mm_shutdown(void)
+ notrace void ps3_mm_shutdown(void)
  {
          ps3_mm_region_destroy(&map.r1);
  }
+2 -2
arch/powerpc/platforms/pseries/dtl.c
··· 11 11
  #include <linux/spinlock.h>
  #include <asm/smp.h>
  #include <linux/uaccess.h>
+ #include <linux/debugfs.h>
  #include <asm/firmware.h>
  #include <asm/dtl.h>
  #include <asm/lppaca.h>
- #include <asm/debugfs.h>
  #include <asm/plpar_wrappers.h>
  #include <asm/machdep.h>

··· 338 338

          /* set up common debugfs structure */

-         dtl_dir = debugfs_create_dir("dtl", powerpc_debugfs_root);
+         dtl_dir = debugfs_create_dir("dtl", arch_debugfs_dir);

          debugfs_create_x8("dtl_event_mask", 0600, dtl_dir, &dtl_event_mask);
          debugfs_create_u32("dtl_buf_entries", 0400, dtl_dir, &dtl_buf_entries);
+2 -1
arch/powerpc/platforms/pseries/firmware.c
··· 119 119

  static __initdata struct vec5_fw_feature
  vec5_fw_features_table[] = {
-         {FW_FEATURE_TYPE1_AFFINITY,     OV5_TYPE1_AFFINITY},
+         {FW_FEATURE_FORM1_AFFINITY,     OV5_FORM1_AFFINITY},
          {FW_FEATURE_PRRN,               OV5_PRRN},
          {FW_FEATURE_DRMEM_V2,           OV5_DRMEM_V2},
          {FW_FEATURE_DRC_INFO,           OV5_DRC_INFO},
+         {FW_FEATURE_FORM2_AFFINITY,     OV5_FORM2_AFFINITY},
  };

  static void __init fw_vec5_feature_init(const char *vec5, unsigned long len)
+134 -39
arch/powerpc/platforms/pseries/hotplug-cpu.c
··· 39 39 /* This version can't take the spinlock, because it never returns */ 40 40 static int rtas_stop_self_token = RTAS_UNKNOWN_SERVICE; 41 41 42 + /* 43 + * Record the CPU ids used on each nodes. 44 + * Protected by cpu_add_remove_lock. 45 + */ 46 + static cpumask_var_t node_recorded_ids_map[MAX_NUMNODES]; 47 + 42 48 static void rtas_stop_self(void) 43 49 { 44 50 static struct rtas_args args; ··· 145 139 paca_ptrs[cpu]->cpu_start = 0; 146 140 } 147 141 142 + /** 143 + * find_cpu_id_range - found a linear ranger of @nthreads free CPU ids. 144 + * @nthreads : the number of threads (cpu ids) 145 + * @assigned_node : the node it belongs to or NUMA_NO_NODE if free ids from any 146 + * node can be peek. 147 + * @cpu_mask: the returned CPU mask. 148 + * 149 + * Returns 0 on success. 150 + */ 151 + static int find_cpu_id_range(unsigned int nthreads, int assigned_node, 152 + cpumask_var_t *cpu_mask) 153 + { 154 + cpumask_var_t candidate_mask; 155 + unsigned int cpu, node; 156 + int rc = -ENOSPC; 157 + 158 + if (!zalloc_cpumask_var(&candidate_mask, GFP_KERNEL)) 159 + return -ENOMEM; 160 + 161 + cpumask_clear(*cpu_mask); 162 + for (cpu = 0; cpu < nthreads; cpu++) 163 + cpumask_set_cpu(cpu, *cpu_mask); 164 + 165 + BUG_ON(!cpumask_subset(cpu_present_mask, cpu_possible_mask)); 166 + 167 + /* Get a bitmap of unoccupied slots. */ 168 + cpumask_xor(candidate_mask, cpu_possible_mask, cpu_present_mask); 169 + 170 + if (assigned_node != NUMA_NO_NODE) { 171 + /* 172 + * Remove free ids previously assigned on the other nodes. We 173 + * can walk only online nodes because once a node became online 174 + * it is not turned offlined back. 
175 + */ 176 + for_each_online_node(node) { 177 + if (node == assigned_node) 178 + continue; 179 + cpumask_andnot(candidate_mask, candidate_mask, 180 + node_recorded_ids_map[node]); 181 + } 182 + } 183 + 184 + if (cpumask_empty(candidate_mask)) 185 + goto out; 186 + 187 + while (!cpumask_empty(*cpu_mask)) { 188 + if (cpumask_subset(*cpu_mask, candidate_mask)) 189 + /* Found a range where we can insert the new cpu(s) */ 190 + break; 191 + cpumask_shift_left(*cpu_mask, *cpu_mask, nthreads); 192 + } 193 + 194 + if (!cpumask_empty(*cpu_mask)) 195 + rc = 0; 196 + 197 + out: 198 + free_cpumask_var(candidate_mask); 199 + return rc; 200 + } 201 + 148 202 /* 149 203 * Update cpu_present_mask and paca(s) for a new cpu node. The wrinkle 150 - * here is that a cpu device node may represent up to two logical cpus 204 + * here is that a cpu device node may represent multiple logical cpus 151 205 * in the SMT case. We must honor the assumption in other code that 152 206 * the logical ids for sibling SMT threads x and y are adjacent, such 153 207 * that x^1 == y and y^1 == x. 154 208 */ 155 209 static int pseries_add_processor(struct device_node *np) 156 210 { 157 - unsigned int cpu; 158 - cpumask_var_t candidate_mask, tmp; 159 - int err = -ENOSPC, len, nthreads, i; 211 + int len, nthreads, node, cpu, assigned_node; 212 + int rc = 0; 213 + cpumask_var_t cpu_mask; 160 214 const __be32 *intserv; 161 215 162 216 intserv = of_get_property(np, "ibm,ppc-interrupt-server#s", &len); 163 217 if (!intserv) 164 218 return 0; 165 219 166 - zalloc_cpumask_var(&candidate_mask, GFP_KERNEL); 167 - zalloc_cpumask_var(&tmp, GFP_KERNEL); 168 - 169 220 nthreads = len / sizeof(u32); 170 - for (i = 0; i < nthreads; i++) 171 - cpumask_set_cpu(i, tmp); 221 + 222 + if (!alloc_cpumask_var(&cpu_mask, GFP_KERNEL)) 223 + return -ENOMEM; 224 + 225 + /* 226 + * Fetch from the DT nodes read by dlpar_configure_connector() the NUMA 227 + * node id the added CPU belongs to. 
228 + */ 229 + node = of_node_to_nid(np); 230 + if (node < 0 || !node_possible(node)) 231 + node = first_online_node; 232 + 233 + BUG_ON(node == NUMA_NO_NODE); 234 + assigned_node = node; 172 235 173 236 cpu_maps_update_begin(); 174 237 175 - BUG_ON(!cpumask_subset(cpu_present_mask, cpu_possible_mask)); 176 - 177 - /* Get a bitmap of unoccupied slots. */ 178 - cpumask_xor(candidate_mask, cpu_possible_mask, cpu_present_mask); 179 - if (cpumask_empty(candidate_mask)) { 180 - /* If we get here, it most likely means that NR_CPUS is 181 - * less than the partition's max processors setting. 238 + rc = find_cpu_id_range(nthreads, node, &cpu_mask); 239 + if (rc && nr_node_ids > 1) { 240 + /* 241 + * Try again, considering the free CPU ids from the other nodes. 182 242 */ 183 - printk(KERN_ERR "Cannot add cpu %pOF; this system configuration" 184 - " supports %d logical cpus.\n", np, 185 - num_possible_cpus()); 186 - goto out_unlock; 243 + node = NUMA_NO_NODE; 244 + rc = find_cpu_id_range(nthreads, NUMA_NO_NODE, &cpu_mask); 187 245 } 188 246 189 - while (!cpumask_empty(tmp)) 190 - if (cpumask_subset(tmp, candidate_mask)) 191 - /* Found a range where we can insert the new cpu(s) */ 192 - break; 193 - else 194 - cpumask_shift_left(tmp, tmp, nthreads); 195 - 196 - if (cpumask_empty(tmp)) { 197 - printk(KERN_ERR "Unable to find space in cpu_present_mask for" 198 - " processor %pOFn with %d thread(s)\n", np, 199 - nthreads); 200 - goto out_unlock; 247 + if (rc) { 248 + pr_err("Cannot add cpu %pOF; this system configuration" 249 + " supports %d logical cpus.\n", np, num_possible_cpus()); 250 + goto out; 201 251 } 202 252 203 - for_each_cpu(cpu, tmp) { 253 + for_each_cpu(cpu, cpu_mask) { 204 254 BUG_ON(cpu_present(cpu)); 205 255 set_cpu_present(cpu, true); 206 256 set_hard_smp_processor_id(cpu, be32_to_cpu(*intserv++)); 207 257 } 208 - err = 0; 209 - out_unlock: 258 + 259 + /* Record the newly used CPU ids for the associated node. 
*/ 260 + cpumask_or(node_recorded_ids_map[assigned_node], 261 + node_recorded_ids_map[assigned_node], cpu_mask); 262 + 263 + /* 264 + * If node is set to NUMA_NO_NODE, CPU ids have been reused from 265 + * another node, remove them from its mask. 266 + */ 267 + if (node == NUMA_NO_NODE) { 268 + cpu = cpumask_first(cpu_mask); 269 + pr_warn("Reusing free CPU ids %d-%d from another node\n", 270 + cpu, cpu + nthreads - 1); 271 + for_each_online_node(node) { 272 + if (node == assigned_node) 273 + continue; 274 + cpumask_andnot(node_recorded_ids_map[node], 275 + node_recorded_ids_map[node], 276 + cpu_mask); 277 + } 278 + } 279 + 280 + out: 210 281 cpu_maps_update_done(); 211 - free_cpumask_var(candidate_mask); 212 - free_cpumask_var(tmp); 213 - return err; 282 + free_cpumask_var(cpu_mask); 283 + return rc; 214 284 } 215 285 216 286 /* ··· 579 497 580 498 return saved_rc; 581 499 } 500 + 501 + update_numa_distance(dn); 582 502 583 503 rc = dlpar_online_cpu(dn); 584 504 if (rc) { ··· 992 908 static int __init pseries_cpu_hotplug_init(void) 993 909 { 994 910 int qcss_tok; 911 + unsigned int node; 995 912 996 913 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE 997 914 ppc_md.cpu_probe = dlpar_cpu_probe; ··· 1014 929 smp_ops->cpu_die = pseries_cpu_die; 1015 930 1016 931 /* Processors can be added/removed only on LPAR */ 1017 - if (firmware_has_feature(FW_FEATURE_LPAR)) 932 + if (firmware_has_feature(FW_FEATURE_LPAR)) { 933 + for_each_node(node) { 934 + alloc_bootmem_cpumask_var(&node_recorded_ids_map[node]); 935 + 936 + /* Record ids of CPUs added at boot time */ 937 + cpumask_or(node_recorded_ids_map[node], 938 + node_recorded_ids_map[node], 939 + cpumask_of_node(node)); 940 + } 941 + 1018 942 of_reconfig_notifier_register(&pseries_smp_nb); 943 + } 1019 944 1020 945 return 0; 1021 946 }
+6
arch/powerpc/platforms/pseries/hotplug-memory.c
··· 180 180 return -ENODEV; 181 181 } 182 182 183 + update_numa_distance(lmb_node); 184 + 183 185 dr_node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory"); 184 186 if (!dr_node) { 185 187 dlpar_free_cc_nodes(lmb_node); ··· 979 977 case OF_RECONFIG_DETACH_NODE: 980 978 err = pseries_remove_mem_node(rd->dn); 981 979 break; 980 + case OF_RECONFIG_UPDATE_PROPERTY: 981 + if (!strcmp(rd->dn->name, 982 + "ibm,dynamic-reconfiguration-memory")) 983 + drmem_update_lmbs(rd->prop); 982 984 } 983 985 return notifier_from_errno(err); 984 986 }
+326 -194
arch/powerpc/platforms/pseries/iommu.c
··· 53 53 DDW_EXT_QUERY_OUT_SIZE = 2 54 54 }; 55 55 56 - static struct iommu_table_group *iommu_pseries_alloc_group(int node) 56 + static struct iommu_table *iommu_pseries_alloc_table(int node) 57 57 { 58 - struct iommu_table_group *table_group; 59 58 struct iommu_table *tbl; 60 - 61 - table_group = kzalloc_node(sizeof(struct iommu_table_group), GFP_KERNEL, 62 - node); 63 - if (!table_group) 64 - return NULL; 65 59 66 60 tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL, node); 67 61 if (!tbl) 68 - goto free_group; 62 + return NULL; 69 63 70 64 INIT_LIST_HEAD_RCU(&tbl->it_group_list); 71 65 kref_init(&tbl->it_kref); 66 + return tbl; 67 + } 72 68 73 - table_group->tables[0] = tbl; 69 + static struct iommu_table_group *iommu_pseries_alloc_group(int node) 70 + { 71 + struct iommu_table_group *table_group; 74 72 75 - return table_group; 73 + table_group = kzalloc_node(sizeof(*table_group), GFP_KERNEL, node); 74 + if (!table_group) 75 + return NULL; 76 76 77 - free_group: 77 + table_group->tables[0] = iommu_pseries_alloc_table(node); 78 + if (table_group->tables[0]) 79 + return table_group; 80 + 78 81 kfree(table_group); 79 82 return NULL; 80 83 } ··· 110 107 u64 proto_tce; 111 108 __be64 *tcep; 112 109 u64 rpn; 110 + const unsigned long tceshift = tbl->it_page_shift; 111 + const unsigned long pagesize = IOMMU_PAGE_SIZE(tbl); 113 112 114 113 proto_tce = TCE_PCI_READ; // Read allowed 115 114 ··· 122 117 123 118 while (npages--) { 124 119 /* can't move this out since we might cross MEMBLOCK boundary */ 125 - rpn = __pa(uaddr) >> TCE_SHIFT; 126 - *tcep = cpu_to_be64(proto_tce | (rpn & TCE_RPN_MASK) << TCE_RPN_SHIFT); 120 + rpn = __pa(uaddr) >> tceshift; 121 + *tcep = cpu_to_be64(proto_tce | rpn << tceshift); 127 122 128 - uaddr += TCE_PAGE_SIZE; 123 + uaddr += pagesize; 129 124 tcep++; 130 125 } 131 126 return 0; ··· 151 146 return be64_to_cpu(*tcep); 152 147 } 153 148 154 - static void tce_free_pSeriesLP(unsigned long liobn, long, long); 149 + static void 
tce_free_pSeriesLP(unsigned long liobn, long, long, long); 155 150 static void tce_freemulti_pSeriesLP(struct iommu_table*, long, long); 156 151 157 152 static int tce_build_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, ··· 171 166 proto_tce |= TCE_PCI_WRITE; 172 167 173 168 while (npages--) { 174 - tce = proto_tce | (rpn & TCE_RPN_MASK) << tceshift; 169 + tce = proto_tce | rpn << tceshift; 175 170 rc = plpar_tce_put((u64)liobn, (u64)tcenum << tceshift, tce); 176 171 177 172 if (unlikely(rc == H_NOT_ENOUGH_RESOURCES)) { 178 173 ret = (int)rc; 179 - tce_free_pSeriesLP(liobn, tcenum_start, 174 + tce_free_pSeriesLP(liobn, tcenum_start, tceshift, 180 175 (npages_start - (npages + 1))); 181 176 break; 182 177 } ··· 210 205 long tcenum_start = tcenum, npages_start = npages; 211 206 int ret = 0; 212 207 unsigned long flags; 208 + const unsigned long tceshift = tbl->it_page_shift; 213 209 214 210 if ((npages == 1) || !firmware_has_feature(FW_FEATURE_PUT_TCE_IND)) { 215 211 return tce_build_pSeriesLP(tbl->it_index, tcenum, 216 - tbl->it_page_shift, npages, uaddr, 212 + tceshift, npages, uaddr, 217 213 direction, attrs); 218 214 } 219 215 ··· 231 225 if (!tcep) { 232 226 local_irq_restore(flags); 233 227 return tce_build_pSeriesLP(tbl->it_index, tcenum, 234 - tbl->it_page_shift, 228 + tceshift, 235 229 npages, uaddr, direction, attrs); 236 230 } 237 231 __this_cpu_write(tce_page, tcep); 238 232 } 239 233 240 - rpn = __pa(uaddr) >> TCE_SHIFT; 234 + rpn = __pa(uaddr) >> tceshift; 241 235 proto_tce = TCE_PCI_READ; 242 236 if (direction != DMA_TO_DEVICE) 243 237 proto_tce |= TCE_PCI_WRITE; ··· 251 245 limit = min_t(long, npages, 4096/TCE_ENTRY_SIZE); 252 246 253 247 for (l = 0; l < limit; l++) { 254 - tcep[l] = cpu_to_be64(proto_tce | (rpn & TCE_RPN_MASK) << TCE_RPN_SHIFT); 248 + tcep[l] = cpu_to_be64(proto_tce | rpn << tceshift); 255 249 rpn++; 256 250 } 257 251 258 252 rc = plpar_tce_put_indirect((u64)tbl->it_index, 259 - (u64)tcenum << 12, 253 + (u64)tcenum << 
tceshift, 260 254 (u64)__pa(tcep), 261 255 limit); 262 256 ··· 283 277 return ret; 284 278 } 285 279 286 - static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long npages) 280 + static void tce_free_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, 281 + long npages) 287 282 { 288 283 u64 rc; 289 284 290 285 while (npages--) { 291 - rc = plpar_tce_put((u64)liobn, (u64)tcenum << 12, 0); 286 + rc = plpar_tce_put((u64)liobn, (u64)tcenum << tceshift, 0); 292 287 293 288 if (rc && printk_ratelimit()) { 294 289 printk("tce_free_pSeriesLP: plpar_tce_put failed. rc=%lld\n", rc); ··· 308 301 u64 rc; 309 302 310 303 if (!firmware_has_feature(FW_FEATURE_STUFF_TCE)) 311 - return tce_free_pSeriesLP(tbl->it_index, tcenum, npages); 304 + return tce_free_pSeriesLP(tbl->it_index, tcenum, 305 + tbl->it_page_shift, npages); 312 306 313 - rc = plpar_tce_stuff((u64)tbl->it_index, (u64)tcenum << 12, 0, npages); 307 + rc = plpar_tce_stuff((u64)tbl->it_index, 308 + (u64)tcenum << tbl->it_page_shift, 0, npages); 314 309 315 310 if (rc && printk_ratelimit()) { 316 311 printk("tce_freemulti_pSeriesLP: plpar_tce_stuff failed\n"); ··· 328 319 u64 rc; 329 320 unsigned long tce_ret; 330 321 331 - rc = plpar_tce_get((u64)tbl->it_index, (u64)tcenum << 12, &tce_ret); 322 + rc = plpar_tce_get((u64)tbl->it_index, 323 + (u64)tcenum << tbl->it_page_shift, &tce_ret); 332 324 333 325 if (rc && printk_ratelimit()) { 334 326 printk("tce_get_pSeriesLP: plpar_tce_get failed. 
rc=%lld\n", rc); ··· 349 339 __be32 window_shift; /* ilog2(tce_window_size) */ 350 340 }; 351 341 352 - struct direct_window { 342 + struct dma_win { 353 343 struct device_node *device; 354 344 const struct dynamic_dma_window_prop *prop; 355 345 struct list_head list; ··· 369 359 u32 addr_lo; 370 360 }; 371 361 372 - static LIST_HEAD(direct_window_list); 362 + static LIST_HEAD(dma_win_list); 373 363 /* prevents races between memory on/offline and window creation */ 374 - static DEFINE_SPINLOCK(direct_window_list_lock); 364 + static DEFINE_SPINLOCK(dma_win_list_lock); 375 365 /* protects initializing window twice for same device */ 376 - static DEFINE_MUTEX(direct_window_init_mutex); 366 + static DEFINE_MUTEX(dma_win_init_mutex); 377 367 #define DIRECT64_PROPNAME "linux,direct64-ddr-window-info" 368 + #define DMA64_PROPNAME "linux,dma64-ddr-window-info" 378 369 379 370 static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn, 380 371 unsigned long num_pfn, const void *arg) ··· 502 491 return tce_setrange_multi_pSeriesLP(start_pfn, num_pfn, arg); 503 492 } 504 493 494 + static void iommu_table_setparms_common(struct iommu_table *tbl, unsigned long busno, 495 + unsigned long liobn, unsigned long win_addr, 496 + unsigned long window_size, unsigned long page_shift, 497 + void *base, struct iommu_table_ops *table_ops) 498 + { 499 + tbl->it_busno = busno; 500 + tbl->it_index = liobn; 501 + tbl->it_offset = win_addr >> page_shift; 502 + tbl->it_size = window_size >> page_shift; 503 + tbl->it_page_shift = page_shift; 504 + tbl->it_base = (unsigned long)base; 505 + tbl->it_blocksize = 16; 506 + tbl->it_type = TCE_PCI; 507 + tbl->it_ops = table_ops; 508 + } 509 + 510 + struct iommu_table_ops iommu_table_pseries_ops; 511 + 505 512 static void iommu_table_setparms(struct pci_controller *phb, 506 513 struct device_node *dn, 507 514 struct iommu_table *tbl) ··· 528 499 const unsigned long *basep; 529 500 const u32 *sizep; 530 501 531 - node = phb->dn; 502 + /* Test if we 
are going over 2GB of DMA space */ 503 + if (phb->dma_window_base_cur + phb->dma_window_size > SZ_2G) { 504 + udbg_printf("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 505 + panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 506 + } 532 507 508 + node = phb->dn; 533 509 basep = of_get_property(node, "linux,tce-base", NULL); 534 510 sizep = of_get_property(node, "linux,tce-size", NULL); 535 511 if (basep == NULL || sizep == NULL) { ··· 543 509 return; 544 510 } 545 511 546 - tbl->it_base = (unsigned long)__va(*basep); 512 + iommu_table_setparms_common(tbl, phb->bus->number, 0, phb->dma_window_base_cur, 513 + phb->dma_window_size, IOMMU_PAGE_SHIFT_4K, 514 + __va(*basep), &iommu_table_pseries_ops); 547 515 548 516 if (!is_kdump_kernel()) 549 517 memset((void *)tbl->it_base, 0, *sizep); 550 518 551 - tbl->it_busno = phb->bus->number; 552 - tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K; 553 - 554 - /* Units of tce entries */ 555 - tbl->it_offset = phb->dma_window_base_cur >> tbl->it_page_shift; 556 - 557 - /* Test if we are going over 2GB of DMA space */ 558 - if (phb->dma_window_base_cur + phb->dma_window_size > 0x80000000ul) { 559 - udbg_printf("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 560 - panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); 561 - } 562 - 563 519 phb->dma_window_base_cur += phb->dma_window_size; 564 - 565 - /* Set the tce table size - measured in entries */ 566 - tbl->it_size = phb->dma_window_size >> tbl->it_page_shift; 567 - 568 - tbl->it_index = 0; 569 - tbl->it_blocksize = 16; 570 - tbl->it_type = TCE_PCI; 571 520 } 521 + 522 + struct iommu_table_ops iommu_table_lpar_multi_ops; 572 523 573 524 /* 574 525 * iommu_table_setparms_lpar ··· 566 547 struct iommu_table_group *table_group, 567 548 const __be32 *dma_window) 568 549 { 569 - unsigned long offset, size; 550 + unsigned long offset, size, liobn; 570 551 571 - of_parse_dma_window(dn, dma_window, &tbl->it_index, &offset, &size); 552 + 
of_parse_dma_window(dn, dma_window, &liobn, &offset, &size); 572 553 573 - tbl->it_busno = phb->bus->number; 574 - tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K; 575 - tbl->it_base = 0; 576 - tbl->it_blocksize = 16; 577 - tbl->it_type = TCE_PCI; 578 - tbl->it_offset = offset >> tbl->it_page_shift; 579 - tbl->it_size = size >> tbl->it_page_shift; 554 + iommu_table_setparms_common(tbl, phb->bus->number, liobn, offset, size, IOMMU_PAGE_SHIFT_4K, NULL, 555 + &iommu_table_lpar_multi_ops); 556 + 580 557 581 558 table_group->tce32_start = offset; 582 559 table_group->tce32_size = size; ··· 652 637 tbl = pci->table_group->tables[0]; 653 638 654 639 iommu_table_setparms(pci->phb, dn, tbl); 655 - tbl->it_ops = &iommu_table_pseries_ops; 640 + 656 641 if (!iommu_init_table(tbl, pci->phb->node, 0, 0)) 657 642 panic("Failed to initialize iommu table"); 658 643 ··· 713 698 pr_debug("pci_dma_bus_setup_pSeriesLP: setting up bus %pOF\n", 714 699 dn); 715 700 716 - /* Find nearest ibm,dma-window, walking up the device tree */ 701 + /* 702 + * Find nearest ibm,dma-window (default DMA window), walking up the 703 + * device tree 704 + */ 717 705 for (pdn = dn; pdn != NULL; pdn = pdn->parent) { 718 706 dma_window = of_get_property(pdn, "ibm,dma-window", NULL); 719 707 if (dma_window != NULL) ··· 738 720 tbl = ppci->table_group->tables[0]; 739 721 iommu_table_setparms_lpar(ppci->phb, pdn, tbl, 740 722 ppci->table_group, dma_window); 741 - tbl->it_ops = &iommu_table_lpar_multi_ops; 723 + 742 724 if (!iommu_init_table(tbl, ppci->phb->node, 0, 0)) 743 725 panic("Failed to initialize iommu table"); 744 726 iommu_register_group(ppci->table_group, ··· 768 750 PCI_DN(dn)->table_group = iommu_pseries_alloc_group(phb->node); 769 751 tbl = PCI_DN(dn)->table_group->tables[0]; 770 752 iommu_table_setparms(phb, dn, tbl); 771 - tbl->it_ops = &iommu_table_pseries_ops; 753 + 772 754 if (!iommu_init_table(tbl, phb->node, 0, 0)) 773 755 panic("Failed to initialize iommu table"); 774 756 ··· 803 785 804 786 
early_param("disable_ddw", disable_ddw_setup); 805 787 806 - static void remove_dma_window(struct device_node *np, u32 *ddw_avail, 807 - struct property *win) 788 + static void clean_dma_window(struct device_node *np, struct dynamic_dma_window_prop *dwp) 808 789 { 809 - struct dynamic_dma_window_prop *dwp; 810 - u64 liobn; 811 790 int ret; 812 791 813 - dwp = win->value; 814 - liobn = (u64)be32_to_cpu(dwp->liobn); 815 - 816 - /* clear the whole window, note the arg is in kernel pages */ 817 792 ret = tce_clearrange_multi_pSeriesLP(0, 818 793 1ULL << (be32_to_cpu(dwp->window_shift) - PAGE_SHIFT), dwp); 819 794 if (ret) ··· 815 804 else 816 805 pr_debug("%pOF successfully cleared tces in window.\n", 817 806 np); 807 + } 808 + 809 + /* 810 + * Call only if DMA window is clean. 811 + */ 812 + static void __remove_dma_window(struct device_node *np, u32 *ddw_avail, u64 liobn) 813 + { 814 + int ret; 818 815 819 816 ret = rtas_call(ddw_avail[DDW_REMOVE_PE_DMA_WIN], 1, 1, NULL, liobn); 820 817 if (ret) 821 - pr_warn("%pOF: failed to remove direct window: rtas returned " 818 + pr_warn("%pOF: failed to remove DMA window: rtas returned " 822 819 "%d to ibm,remove-pe-dma-window(%x) %llx\n", 823 820 np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn); 824 821 else 825 - pr_debug("%pOF: successfully removed direct window: rtas returned " 822 + pr_debug("%pOF: successfully removed DMA window: rtas returned " 826 823 "%d to ibm,remove-pe-dma-window(%x) %llx\n", 827 824 np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn); 828 825 } 829 826 830 - static void remove_ddw(struct device_node *np, bool remove_prop) 827 + static void remove_dma_window(struct device_node *np, u32 *ddw_avail, 828 + struct property *win) 829 + { 830 + struct dynamic_dma_window_prop *dwp; 831 + u64 liobn; 832 + 833 + dwp = win->value; 834 + liobn = (u64)be32_to_cpu(dwp->liobn); 835 + 836 + clean_dma_window(np, dwp); 837 + __remove_dma_window(np, ddw_avail, liobn); 838 + } 839 + 840 + static int remove_ddw(struct 
device_node *np, bool remove_prop, const char *win_name) 831 841 { 832 842 struct property *win; 833 843 u32 ddw_avail[DDW_APPLICABLE_SIZE]; 834 844 int ret = 0; 835 845 846 + win = of_find_property(np, win_name, NULL); 847 + if (!win) 848 + return -EINVAL; 849 + 836 850 ret = of_property_read_u32_array(np, "ibm,ddw-applicable", 837 851 &ddw_avail[0], DDW_APPLICABLE_SIZE); 838 852 if (ret) 839 - return; 853 + return 0; 840 854 841 - win = of_find_property(np, DIRECT64_PROPNAME, NULL); 842 - if (!win) 843 - return; 844 855 845 856 if (win->length >= sizeof(struct dynamic_dma_window_prop)) 846 857 remove_dma_window(np, ddw_avail, win); 847 858 848 859 if (!remove_prop) 849 - return; 860 + return 0; 850 861 851 862 ret = of_remove_property(np, win); 852 863 if (ret) 853 - pr_warn("%pOF: failed to remove direct window property: %d\n", 864 + pr_warn("%pOF: failed to remove DMA window property: %d\n", 854 865 np, ret); 866 + return 0; 855 867 } 856 868 857 - static u64 find_existing_ddw(struct device_node *pdn, int *window_shift) 869 + static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift) 858 870 { 859 - struct direct_window *window; 860 - const struct dynamic_dma_window_prop *direct64; 861 - u64 dma_addr = 0; 871 + struct dma_win *window; 872 + const struct dynamic_dma_window_prop *dma64; 873 + bool found = false; 862 874 863 - spin_lock(&direct_window_list_lock); 875 + spin_lock(&dma_win_list_lock); 864 876 /* check if we already created a window and dupe that config if so */ 865 - list_for_each_entry(window, &direct_window_list, list) { 877 + list_for_each_entry(window, &dma_win_list, list) { 866 878 if (window->device == pdn) { 867 - direct64 = window->prop; 868 - dma_addr = be64_to_cpu(direct64->dma_base); 869 - *window_shift = be32_to_cpu(direct64->window_shift); 879 + dma64 = window->prop; 880 + *dma_addr = be64_to_cpu(dma64->dma_base); 881 + *window_shift = be32_to_cpu(dma64->window_shift); 882 + found = true; 870 883 break; 871 
884 } 872 885 } 873 - spin_unlock(&direct_window_list_lock); 886 + spin_unlock(&dma_win_list_lock); 874 887 875 - return dma_addr; 888 + return found; 889 + } 890 + 891 + static struct dma_win *ddw_list_new_entry(struct device_node *pdn, 892 + const struct dynamic_dma_window_prop *dma64) 893 + { 894 + struct dma_win *window; 895 + 896 + window = kzalloc(sizeof(*window), GFP_KERNEL); 897 + if (!window) 898 + return NULL; 899 + 900 + window->device = pdn; 901 + window->prop = dma64; 902 + 903 + return window; 904 + } 905 + 906 + static void find_existing_ddw_windows_named(const char *name) 907 + { 908 + int len; 909 + struct device_node *pdn; 910 + struct dma_win *window; 911 + const struct dynamic_dma_window_prop *dma64; 912 + 913 + for_each_node_with_property(pdn, name) { 914 + dma64 = of_get_property(pdn, name, &len); 915 + if (!dma64 || len < sizeof(*dma64)) { 916 + remove_ddw(pdn, true, name); 917 + continue; 918 + } 919 + 920 + window = ddw_list_new_entry(pdn, dma64); 921 + if (!window) 922 + break; 923 + 924 + spin_lock(&dma_win_list_lock); 925 + list_add(&window->list, &dma_win_list); 926 + spin_unlock(&dma_win_list_lock); 927 + } 876 928 } 877 929 878 930 static int find_existing_ddw_windows(void) 879 931 { 880 - int len; 881 - struct device_node *pdn; 882 - struct direct_window *window; 883 - const struct dynamic_dma_window_prop *direct64; 884 - 885 932 if (!firmware_has_feature(FW_FEATURE_LPAR)) 886 933 return 0; 887 934 888 - for_each_node_with_property(pdn, DIRECT64_PROPNAME) { 889 - direct64 = of_get_property(pdn, DIRECT64_PROPNAME, &len); 890 - if (!direct64) 891 - continue; 892 - 893 - window = kzalloc(sizeof(*window), GFP_KERNEL); 894 - if (!window || len < sizeof(struct dynamic_dma_window_prop)) { 895 - kfree(window); 896 - remove_ddw(pdn, true); 897 - continue; 898 - } 899 - 900 - window->device = pdn; 901 - window->prop = direct64; 902 - spin_lock(&direct_window_list_lock); 903 - list_add(&window->list, &direct_window_list); 904 - 
spin_unlock(&direct_window_list_lock); 905 - } 935 + find_existing_ddw_windows_named(DIRECT64_PROPNAME); 936 + find_existing_ddw_windows_named(DMA64_PROPNAME); 906 937 907 938 return 0; 908 939 } ··· 1183 1130 return 0; 1184 1131 } 1185 1132 1133 + static struct property *ddw_property_create(const char *propname, u32 liobn, u64 dma_addr, 1134 + u32 page_shift, u32 window_shift) 1135 + { 1136 + struct dynamic_dma_window_prop *ddwprop; 1137 + struct property *win64; 1138 + 1139 + win64 = kzalloc(sizeof(*win64), GFP_KERNEL); 1140 + if (!win64) 1141 + return NULL; 1142 + 1143 + win64->name = kstrdup(propname, GFP_KERNEL); 1144 + ddwprop = kzalloc(sizeof(*ddwprop), GFP_KERNEL); 1145 + win64->value = ddwprop; 1146 + win64->length = sizeof(*ddwprop); 1147 + if (!win64->name || !win64->value) { 1148 + kfree(win64->name); 1149 + kfree(win64->value); 1150 + kfree(win64); 1151 + return NULL; 1152 + } 1153 + 1154 + ddwprop->liobn = cpu_to_be32(liobn); 1155 + ddwprop->dma_base = cpu_to_be64(dma_addr); 1156 + ddwprop->tce_shift = cpu_to_be32(page_shift); 1157 + ddwprop->window_shift = cpu_to_be32(window_shift); 1158 + 1159 + return win64; 1160 + } 1161 + 1186 1162 /* 1187 1163 * If the PE supports dynamic dma windows, and there is space for a table 1188 1164 * that can map all pages in a linear offset, then setup such a table, ··· 1221 1139 * pdn: the parent pe node with the ibm,dma_window property 1222 1140 * Future: also check if we can remap the base window for our base page size 1223 1141 * 1224 - * returns the dma offset for use by the direct mapped DMA code. 1142 + * returns true if it can map all pages (direct mapping), false otherwise. 
1225 1143 */ 1226 - static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) 1144 + static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) 1227 1145 { 1228 1146 int len = 0, ret; 1229 1147 int max_ram_len = order_base_2(ddw_memory_hotplug_max()); 1230 1148 struct ddw_query_response query; 1231 1149 struct ddw_create_response create; 1232 1150 int page_shift; 1233 - u64 dma_addr; 1151 + u64 win_addr; 1152 + const char *win_name; 1234 1153 struct device_node *dn; 1235 1154 u32 ddw_avail[DDW_APPLICABLE_SIZE]; 1236 - struct direct_window *window; 1155 + struct dma_win *window; 1237 1156 struct property *win64; 1238 - struct dynamic_dma_window_prop *ddwprop; 1157 + bool ddw_enabled = false; 1239 1158 struct failed_ddw_pdn *fpdn; 1240 - bool default_win_removed = false; 1159 + bool default_win_removed = false, direct_mapping = false; 1241 1160 bool pmem_present; 1161 + struct pci_dn *pci = PCI_DN(pdn); 1162 + struct iommu_table *tbl = pci->table_group->tables[0]; 1242 1163 1243 1164 dn = of_find_node_by_type(NULL, "ibm,pmemory"); 1244 1165 pmem_present = dn != NULL; 1245 1166 of_node_put(dn); 1246 1167 1247 - mutex_lock(&direct_window_init_mutex); 1168 + mutex_lock(&dma_win_init_mutex); 1248 1169 1249 - dma_addr = find_existing_ddw(pdn, &len); 1250 - if (dma_addr != 0) 1170 + if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { 1171 + direct_mapping = (len >= max_ram_len); 1172 + ddw_enabled = true; 1251 1173 goto out_unlock; 1174 + } 1252 1175 1253 1176 /* 1254 1177 * If we already went through this for a previous function of ··· 1327 1240 1328 1241 page_shift = iommu_get_page_shift(query.page_size); 1329 1242 if (!page_shift) { 1330 - dev_dbg(&dev->dev, "no supported direct page size in mask %x", 1331 - query.page_size); 1243 + dev_dbg(&dev->dev, "no supported page size in mask %x", 1244 + query.page_size); 1332 1245 goto out_failed; 1333 1246 } 1334 - /* verify the window * number of ptes will map the partition */ 1335 - /* check 
largest block * page size > max memory hotplug addr */ 1247 + 1248 + 1336 1249 /* 1337 1250 * The "ibm,pmemory" can appear anywhere in the address space. 1338 1251 * Assuming it is still backed by page structs, try MAX_PHYSMEM_BITS ··· 1348 1261 dev_info(&dev->dev, "Skipping ibm,pmemory"); 1349 1262 } 1350 1263 1264 + /* check if the available block * number of ptes will map everything */ 1351 1265 if (query.largest_available_block < (1ULL << (len - page_shift))) { 1352 1266 dev_dbg(&dev->dev, 1353 1267 "can't map partition max 0x%llx with %llu %llu-sized pages\n", 1354 1268 1ULL << len, 1355 1269 query.largest_available_block, 1356 1270 1ULL << page_shift); 1357 - goto out_failed; 1358 - } 1359 - win64 = kzalloc(sizeof(struct property), GFP_KERNEL); 1360 - if (!win64) { 1361 - dev_info(&dev->dev, 1362 - "couldn't allocate property for 64bit dma window\n"); 1363 - goto out_failed; 1364 - } 1365 - win64->name = kstrdup(DIRECT64_PROPNAME, GFP_KERNEL); 1366 - win64->value = ddwprop = kmalloc(sizeof(*ddwprop), GFP_KERNEL); 1367 - win64->length = sizeof(*ddwprop); 1368 - if (!win64->name || !win64->value) { 1369 - dev_info(&dev->dev, 1370 - "couldn't allocate property name and value\n"); 1371 - goto out_free_prop; 1271 + 1272 + /* DDW + IOMMU on single window may fail if there is any allocation */ 1273 + if (default_win_removed && iommu_table_in_use(tbl)) { 1274 + dev_dbg(&dev->dev, "current IOMMU table in use, can't be replaced.\n"); 1275 + goto out_failed; 1276 + } 1277 + 1278 + len = order_base_2(query.largest_available_block << page_shift); 1279 + win_name = DMA64_PROPNAME; 1280 + } else { 1281 + direct_mapping = true; 1282 + win_name = DIRECT64_PROPNAME; 1372 1283 } 1373 1284 1374 1285 ret = create_ddw(dev, ddw_avail, &create, page_shift, len); 1375 1286 if (ret != 0) 1376 - goto out_free_prop; 1377 - 1378 - ddwprop->liobn = cpu_to_be32(create.liobn); 1379 - ddwprop->dma_base = cpu_to_be64(((u64)create.addr_hi << 32) | 1380 - create.addr_lo); 1381 - 
ddwprop->tce_shift = cpu_to_be32(page_shift); 1382 - ddwprop->window_shift = cpu_to_be32(len); 1287 + goto out_failed; 1383 1288 1384 1289 dev_dbg(&dev->dev, "created tce table LIOBN 0x%x for %pOF\n", 1385 1290 create.liobn, dn); 1386 1291 1387 - window = kzalloc(sizeof(*window), GFP_KERNEL); 1388 - if (!window) 1389 - goto out_clear_window; 1292 + win_addr = ((u64)create.addr_hi << 32) | create.addr_lo; 1293 + win64 = ddw_property_create(win_name, create.liobn, win_addr, page_shift, len); 1390 1294 1391 - ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT, 1392 - win64->value, tce_setrange_multi_pSeriesLP_walk); 1393 - if (ret) { 1394 - dev_info(&dev->dev, "failed to map direct window for %pOF: %d\n", 1395 - dn, ret); 1396 - goto out_free_window; 1295 + if (!win64) { 1296 + dev_info(&dev->dev, 1297 + "couldn't allocate property, property name, or value\n"); 1298 + goto out_remove_win; 1397 1299 } 1398 1300 1399 1301 ret = of_add_property(pdn, win64); 1400 1302 if (ret) { 1401 - dev_err(&dev->dev, "unable to add dma window property for %pOF: %d", 1402 - pdn, ret); 1403 - goto out_free_window; 1303 + dev_err(&dev->dev, "unable to add DMA window property for %pOF: %d", 1304 + pdn, ret); 1305 + goto out_free_prop; 1404 1306 } 1405 1307 1406 - window->device = pdn; 1407 - window->prop = ddwprop; 1408 - spin_lock(&direct_window_list_lock); 1409 - list_add(&window->list, &direct_window_list); 1410 - spin_unlock(&direct_window_list_lock); 1308 + window = ddw_list_new_entry(pdn, win64->value); 1309 + if (!window) 1310 + goto out_del_prop; 1411 1311 1412 - dma_addr = be64_to_cpu(ddwprop->dma_base); 1312 + if (direct_mapping) { 1313 + /* DDW maps the whole partition, so enable direct DMA mapping */ 1314 + ret = walk_system_ram_range(0, memblock_end_of_DRAM() >> PAGE_SHIFT, 1315 + win64->value, tce_setrange_multi_pSeriesLP_walk); 1316 + if (ret) { 1317 + dev_info(&dev->dev, "failed to map DMA window for %pOF: %d\n", 1318 + dn, ret); 1319 + 1320 + /* Make sure 
to clean DDW if any TCE was set */ 1321 + clean_dma_window(pdn, win64->value); 1322 + goto out_del_list; 1323 + } 1324 + } else { 1325 + struct iommu_table *newtbl; 1326 + int i; 1327 + 1328 + for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources); i++) { 1329 + const unsigned long mask = IORESOURCE_MEM_64 | IORESOURCE_MEM; 1330 + 1331 + /* Look for MMIO32 */ 1332 + if ((pci->phb->mem_resources[i].flags & mask) == IORESOURCE_MEM) 1333 + break; 1334 + } 1335 + 1336 + if (i == ARRAY_SIZE(pci->phb->mem_resources)) 1337 + goto out_del_list; 1338 + 1339 + /* New table for using DDW instead of the default DMA window */ 1340 + newtbl = iommu_pseries_alloc_table(pci->phb->node); 1341 + if (!newtbl) { 1342 + dev_dbg(&dev->dev, "couldn't create new IOMMU table\n"); 1343 + goto out_del_list; 1344 + } 1345 + 1346 + iommu_table_setparms_common(newtbl, pci->phb->bus->number, create.liobn, win_addr, 1347 + 1UL << len, page_shift, NULL, &iommu_table_lpar_multi_ops); 1348 + iommu_init_table(newtbl, pci->phb->node, pci->phb->mem_resources[i].start, 1349 + pci->phb->mem_resources[i].end); 1350 + 1351 + pci->table_group->tables[1] = newtbl; 1352 + 1353 + /* Keep default DMA window struct if removed */ 1354 + if (default_win_removed) { 1355 + tbl->it_size = 0; 1356 + kfree(tbl->it_map); 1357 + } 1358 + 1359 + set_iommu_table_base(&dev->dev, newtbl); 1360 + } 1361 + 1362 + spin_lock(&dma_win_list_lock); 1363 + list_add(&window->list, &dma_win_list); 1364 + spin_unlock(&dma_win_list_lock); 1365 + 1366 + dev->dev.archdata.dma_offset = win_addr; 1367 + ddw_enabled = true; 1413 1368 goto out_unlock; 1414 1369 1415 - out_free_window: 1370 + out_del_list: 1416 1371 kfree(window); 1417 1372 1418 - out_clear_window: 1419 - remove_ddw(pdn, true); 1373 + out_del_prop: 1374 + of_remove_property(pdn, win64); 1420 1375 1421 1376 out_free_prop: 1422 1377 kfree(win64->name); 1423 1378 kfree(win64->value); 1424 1379 kfree(win64); 1380 + 1381 + out_remove_win: 1382 + /* DDW is clean, so it's ok to call this 
directly. */ 1383 + __remove_dma_window(pdn, ddw_avail, create.liobn); 1425 1384 1426 1385 out_failed: 1427 1386 if (default_win_removed) ··· 1480 1347 list_add(&fpdn->list, &failed_ddw_pdn_list); 1481 1348 1482 1349 out_unlock: 1483 - mutex_unlock(&direct_window_init_mutex); 1350 + mutex_unlock(&dma_win_init_mutex); 1484 1351 1485 1352 /* 1486 1353 * If we have persistent memory and the window size is only as big 1487 1354 * as RAM, then we failed to create a window to cover persistent 1488 1355 * memory and need to set the DMA limit. 1489 1356 */ 1490 - if (pmem_present && dma_addr && (len == max_ram_len)) 1491 - dev->dev.bus_dma_limit = dma_addr + (1ULL << len); 1357 + if (pmem_present && ddw_enabled && direct_mapping && len == max_ram_len) 1358 + dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len); 1492 1359 1493 - return dma_addr; 1360 + return ddw_enabled && direct_mapping; 1494 1361 } 1495 1362 1496 1363 static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) ··· 1532 1399 tbl = pci->table_group->tables[0]; 1533 1400 iommu_table_setparms_lpar(pci->phb, pdn, tbl, 1534 1401 pci->table_group, dma_window); 1535 - tbl->it_ops = &iommu_table_lpar_multi_ops; 1402 + 1536 1403 iommu_init_table(tbl, pci->phb->node, 0, 0); 1537 1404 iommu_register_group(pci->table_group, 1538 1405 pci_domain_nr(pci->phb->bus), 0); ··· 1569 1436 break; 1570 1437 } 1571 1438 1572 - if (pdn && PCI_DN(pdn)) { 1573 - pdev->dev.archdata.dma_offset = enable_ddw(pdev, pdn); 1574 - if (pdev->dev.archdata.dma_offset) 1575 - return true; 1576 - } 1439 + if (pdn && PCI_DN(pdn)) 1440 + return enable_ddw(pdev, pdn); 1577 1441 1578 1442 return false; 1579 1443 } ··· 1578 1448 static int iommu_mem_notifier(struct notifier_block *nb, unsigned long action, 1579 1449 void *data) 1580 1450 { 1581 - struct direct_window *window; 1451 + struct dma_win *window; 1582 1452 struct memory_notify *arg = data; 1583 1453 int ret = 0; 1584 1454 1585 1455 switch (action) { 1586 1456 case 
MEM_GOING_ONLINE: 1587 - spin_lock(&direct_window_list_lock); 1588 - list_for_each_entry(window, &direct_window_list, list) { 1457 + spin_lock(&dma_win_list_lock); 1458 + list_for_each_entry(window, &dma_win_list, list) { 1589 1459 ret |= tce_setrange_multi_pSeriesLP(arg->start_pfn, 1590 1460 arg->nr_pages, window->prop); 1591 1461 /* XXX log error */ 1592 1462 } 1593 - spin_unlock(&direct_window_list_lock); 1463 + spin_unlock(&dma_win_list_lock); 1594 1464 break; 1595 1465 case MEM_CANCEL_ONLINE: 1596 1466 case MEM_OFFLINE: 1597 - spin_lock(&direct_window_list_lock); 1598 - list_for_each_entry(window, &direct_window_list, list) { 1467 + spin_lock(&dma_win_list_lock); 1468 + list_for_each_entry(window, &dma_win_list, list) { 1599 1469 ret |= tce_clearrange_multi_pSeriesLP(arg->start_pfn, 1600 1470 arg->nr_pages, window->prop); 1601 1471 /* XXX log error */ 1602 1472 } 1603 - spin_unlock(&direct_window_list_lock); 1473 + spin_unlock(&dma_win_list_lock); 1604 1474 break; 1605 1475 default: 1606 1476 break; ··· 1621 1491 struct of_reconfig_data *rd = data; 1622 1492 struct device_node *np = rd->dn; 1623 1493 struct pci_dn *pci = PCI_DN(np); 1624 - struct direct_window *window; 1494 + struct dma_win *window; 1625 1495 1626 1496 switch (action) { 1627 1497 case OF_RECONFIG_DETACH_NODE: ··· 1632 1502 * we have to remove the property when releasing 1633 1503 * the device node. 
1634 1504 */ 1635 - remove_ddw(np, false); 1505 + if (remove_ddw(np, false, DIRECT64_PROPNAME)) 1506 + remove_ddw(np, false, DMA64_PROPNAME); 1507 + 1636 1508 if (pci && pci->table_group) 1637 1509 iommu_pseries_free_group(pci->table_group, 1638 1510 np->full_name); 1639 1511 1640 - spin_lock(&direct_window_list_lock); 1641 - list_for_each_entry(window, &direct_window_list, list) { 1512 + spin_lock(&dma_win_list_lock); 1513 + list_for_each_entry(window, &dma_win_list, list) { 1642 1514 if (window->device == np) { 1643 1515 list_del(&window->list); 1644 1516 kfree(window); 1645 1517 break; 1646 1518 } 1647 1519 } 1648 - spin_unlock(&direct_window_list_lock); 1520 + spin_unlock(&dma_win_list_lock); 1649 1521 break; 1650 1522 default: 1651 1523 err = NOTIFY_DONE;
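The hunk above scans `phb->mem_resources` for a 32-bit MMIO range: a resource whose flags have `IORESOURCE_MEM` set but not `IORESOURCE_MEM_64`. The selection logic can be sketched as a standalone userspace program; the flag values and names here are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IORESOURCE_MEM / IORESOURCE_MEM_64 */
#define RES_MEM     0x1UL
#define RES_MEM_64  0x2UL

struct toy_resource {
	unsigned long flags;
};

/*
 * Return the index of the first 32-bit MMIO resource (RES_MEM set,
 * RES_MEM_64 clear), or n if none is found -- modeling the loop in
 * the patched pseries DDW setup code.
 */
static size_t find_mmio32(const struct toy_resource *res, size_t n)
{
	const unsigned long mask = RES_MEM_64 | RES_MEM;
	size_t i;

	for (i = 0; i < n; i++)
		if ((res[i].flags & mask) == RES_MEM)
			break;
	return i;
}
```

The caller then treats an index equal to the array size as "no MMIO32 found" and bails out, exactly as the diff does with its `goto out_del_list`.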
+11 -7
arch/powerpc/platforms/pseries/lpar.c
···
 #include <linux/workqueue.h>
 #include <linux/proc_fs.h>
 #include <linux/pgtable.h>
+#include <linux/debugfs.h>
+
 #include <asm/processor.h>
 #include <asm/mmu.h>
 #include <asm/page.h>
···
 #include <asm/kexec.h>
 #include <asm/fadump.h>
 #include <asm/asm-prototypes.h>
-#include <asm/debugfs.h>
 #include <asm/dtl.h>
 
 #include "pseries.h"
···
 	if (!last_disp_cpu_assoc || !cur_disp_cpu_assoc)
 		return -EIO;
 
-	return cpu_distance(last_disp_cpu_assoc, cur_disp_cpu_assoc);
+	return cpu_relative_distance(last_disp_cpu_assoc, cur_disp_cpu_assoc);
 }
 
 static int cpu_home_node_dispatch_distance(int disp_cpu)
···
 	if (!disp_cpu_assoc || !vcpu_assoc)
 		return -EIO;
 
-	return cpu_distance(disp_cpu_assoc, vcpu_assoc);
+	return cpu_relative_distance(disp_cpu_assoc, vcpu_assoc);
 }
 
 static void update_vcpu_disp_stat(int disp_cpu)
···
 	return -1;
 }
 
-static void manual_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace void manual_hpte_clear_all(void)
 {
 	unsigned long size_bytes = 1UL << ppc64_pft_size;
 	unsigned long hpte_count = size_bytes >> 4;
···
 	}
 }
 
-static int hcall_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace int hcall_hpte_clear_all(void)
 {
 	int rc;
 
···
 	return rc;
 }
 
-static void pseries_hpte_clear_all(void)
+/* Called during kexec sequence with MMU off */
+static notrace void pseries_hpte_clear_all(void)
 {
 	int rc;
 
···
 	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
 		return 0;
 
-	vpa_dir = debugfs_create_dir("vpa", powerpc_debugfs_root);
+	vpa_dir = debugfs_create_dir("vpa", arch_debugfs_dir);
 
 	/* set up the per-cpu vpa file*/
 	for_each_possible_cpu(i) {
+225 -71
arch/powerpc/platforms/pseries/msi.c
···
 #include <asm/hw_irq.h>
 #include <asm/ppc-pci.h>
 #include <asm/machdep.h>
+#include <asm/xive.h>
 
 #include "pseries.h"
 
···
 	return rtas_ret[0];
 }
 
-static void rtas_teardown_msi_irqs(struct pci_dev *pdev)
-{
-	struct msi_desc *entry;
-
-	for_each_pci_msi_entry(entry, pdev) {
-		if (!entry->irq)
-			continue;
-
-		irq_set_msi_desc(entry->irq, NULL);
-		irq_dispose_mapping(entry->irq);
-	}
-
-	rtas_disable_msi(pdev);
-}
-
 static int check_req(struct pci_dev *pdev, int nvec, char *prop_name)
 {
 	struct device_node *dn;
···
 
 /* Quota calculation */
 
-static struct device_node *find_pe_total_msi(struct pci_dev *dev, int *total)
+static struct device_node *__find_pe_total_msi(struct device_node *node, int *total)
 {
 	struct device_node *dn;
 	const __be32 *p;
 
-	dn = of_node_get(pci_device_to_OF_node(dev));
+	dn = of_node_get(node);
 	while (dn) {
 		p = of_get_property(dn, "ibm,pe-total-#msi", NULL);
 		if (p) {
···
 	}
 
 	return NULL;
+}
+
+static struct device_node *find_pe_total_msi(struct pci_dev *dev, int *total)
+{
+	return __find_pe_total_msi(pci_device_to_OF_node(dev), total);
 }
 
 static struct device_node *find_pe_dn(struct pci_dev *dev, int *total)
···
 	pci_write_config_dword(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_HI, 0);
 }
 
-static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
+static int rtas_prepare_msi_irqs(struct pci_dev *pdev, int nvec_in, int type,
+				 msi_alloc_info_t *arg)
 {
 	struct pci_dn *pdn;
-	int hwirq, virq, i, quota, rc;
-	struct msi_desc *entry;
-	struct msi_msg msg;
+	int quota, rc;
 	int nvec = nvec_in;
 	int use_32bit_msi_hack = 0;
 
···
 		return rc;
 	}
 
-	i = 0;
-	for_each_pci_msi_entry(entry, pdev) {
-		hwirq = rtas_query_irq_number(pdn, i++);
-		if (hwirq < 0) {
-			pr_debug("rtas_msi: error (%d) getting hwirq\n", rc);
-			return hwirq;
-		}
+	return 0;
+}
 
-		/*
-		 * Depending on the number of online CPUs in the original
-		 * kernel, it is likely for CPU #0 to be offline in a kdump
-		 * kernel. The associated IRQs in the affinity mappings
-		 * provided by irq_create_affinity_masks() are thus not
-		 * started by irq_startup(), as per-design for managed IRQs.
-		 * This can be a problem with multi-queue block devices driven
-		 * by blk-mq : such a non-started IRQ is very likely paired
-		 * with the single queue enforced by blk-mq during kdump (see
-		 * blk_mq_alloc_tag_set()). This causes the device to remain
-		 * silent and likely hangs the guest at some point.
-		 *
-		 * We don't really care for fine-grained affinity when doing
-		 * kdump actually : simply ignore the pre-computed affinity
-		 * masks in this case and let the default mask with all CPUs
-		 * be used when creating the IRQ mappings.
-		 */
-		if (is_kdump_kernel())
-			virq = irq_create_mapping(NULL, hwirq);
-		else
-			virq = irq_create_mapping_affinity(NULL, hwirq,
-							   entry->affinity);
+static int pseries_msi_ops_prepare(struct irq_domain *domain, struct device *dev,
+				   int nvec, msi_alloc_info_t *arg)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct msi_desc *desc = first_pci_msi_entry(pdev);
+	int type = desc->msi_attrib.is_msix ? PCI_CAP_ID_MSIX : PCI_CAP_ID_MSI;
 
-		if (!virq) {
-			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-			return -ENOSPC;
-		}
+	return rtas_prepare_msi_irqs(pdev, nvec, type, arg);
+}
 
-		dev_dbg(&pdev->dev, "rtas_msi: allocated virq %d\n", virq);
-		irq_set_msi_desc(virq, entry);
+/*
+ * ->msi_free() is called before irq_domain_free_irqs_top() when the
+ * handler data is still available. Use that to clear the XIVE
+ * controller data.
+ */
+static void pseries_msi_ops_msi_free(struct irq_domain *domain,
+				     struct msi_domain_info *info,
+				     unsigned int irq)
+{
+	if (xive_enabled())
+		xive_irq_free_data(irq);
+}
 
-		/* Read config space back so we can restore after reset */
-		__pci_read_msi_msg(entry, &msg);
-		entry->msg = msg;
+/*
+ * RTAS can not disable one MSI at a time. It's all or nothing. Do it
+ * at the end after all IRQs have been freed.
+ */
+static void pseries_msi_domain_free_irqs(struct irq_domain *domain,
+					 struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return;
+
+	__msi_domain_free_irqs(domain, dev);
+
+	rtas_disable_msi(to_pci_dev(dev));
+}
+
+static struct msi_domain_ops pseries_pci_msi_domain_ops = {
+	.msi_prepare	= pseries_msi_ops_prepare,
+	.msi_free	= pseries_msi_ops_msi_free,
+	.domain_free_irqs = pseries_msi_domain_free_irqs,
+};
+
+static void pseries_msi_shutdown(struct irq_data *d)
+{
+	d = d->parent_data;
+	if (d->chip->irq_shutdown)
+		d->chip->irq_shutdown(d);
+}
+
+static void pseries_msi_mask(struct irq_data *d)
+{
+	pci_msi_mask_irq(d);
+	irq_chip_mask_parent(d);
+}
+
+static void pseries_msi_unmask(struct irq_data *d)
+{
+	pci_msi_unmask_irq(d);
+	irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip pseries_pci_msi_irq_chip = {
+	.name		= "pSeries-PCI-MSI",
+	.irq_shutdown	= pseries_msi_shutdown,
+	.irq_mask	= pseries_msi_mask,
+	.irq_unmask	= pseries_msi_unmask,
+	.irq_eoi	= irq_chip_eoi_parent,
+};
+
+static struct msi_domain_info pseries_msi_domain_info = {
+	.flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+		  MSI_FLAG_MULTI_PCI_MSI  | MSI_FLAG_PCI_MSIX),
+	.ops   = &pseries_pci_msi_domain_ops,
+	.chip  = &pseries_pci_msi_irq_chip,
+};
+
+static void pseries_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	__pci_read_msi_msg(irq_data_get_msi_desc(data), msg);
+}
+
+static struct irq_chip pseries_msi_irq_chip = {
+	.name			= "pSeries-MSI",
+	.irq_shutdown		= pseries_msi_shutdown,
+	.irq_mask		= irq_chip_mask_parent,
+	.irq_unmask		= irq_chip_unmask_parent,
+	.irq_eoi		= irq_chip_eoi_parent,
+	.irq_set_affinity	= irq_chip_set_affinity_parent,
+	.irq_compose_msi_msg	= pseries_msi_compose_msg,
+};
+
+static int pseries_irq_parent_domain_alloc(struct irq_domain *domain, unsigned int virq,
+					   irq_hw_number_t hwirq)
+{
+	struct irq_fwspec parent_fwspec;
+	int ret;
+
+	parent_fwspec.fwnode = domain->parent->fwnode;
+	parent_fwspec.param_count = 2;
+	parent_fwspec.param[0] = hwirq;
+	parent_fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &parent_fwspec);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int pseries_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs, void *arg)
+{
+	struct pci_controller *phb = domain->host_data;
+	msi_alloc_info_t *info = arg;
+	struct msi_desc *desc = info->desc;
+	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
+	int hwirq;
+	int i, ret;
+
+	hwirq = rtas_query_irq_number(pci_get_pdn(pdev), desc->msi_attrib.entry_nr);
+	if (hwirq < 0) {
+		dev_err(&pdev->dev, "Failed to query HW IRQ: %d\n", hwirq);
+		return hwirq;
+	}
+
+	dev_dbg(&pdev->dev, "%s bridge %pOF %d/%x #%d\n", __func__,
+		phb->dn, virq, hwirq, nr_irqs);
+
+	for (i = 0; i < nr_irqs; i++) {
+		ret = pseries_irq_parent_domain_alloc(domain, virq + i, hwirq + i);
+		if (ret)
+			goto out;
+
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      &pseries_msi_irq_chip, domain->host_data);
 	}
 
 	return 0;
+
+out:
+	/* TODO: handle RTAS cleanup in ->msi_finish() ? */
+	irq_domain_free_irqs_parent(domain, virq, i - 1);
+	return ret;
+}
+
+static void pseries_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+				    unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+	struct pci_controller *phb = irq_data_get_irq_chip_data(d);
+
+	pr_debug("%s bridge %pOF %d #%d\n", __func__, phb->dn, virq, nr_irqs);
+
+	/* XIVE domain data is cleared through ->msi_free() */
+}
+
+static const struct irq_domain_ops pseries_irq_domain_ops = {
+	.alloc  = pseries_irq_domain_alloc,
+	.free   = pseries_irq_domain_free,
+};
+
+static int __pseries_msi_allocate_domains(struct pci_controller *phb,
+					  unsigned int count)
+{
+	struct irq_domain *parent = irq_get_default_host();
+
+	phb->fwnode = irq_domain_alloc_named_id_fwnode("pSeries-MSI",
+						       phb->global_number);
+	if (!phb->fwnode)
+		return -ENOMEM;
+
+	phb->dev_domain = irq_domain_create_hierarchy(parent, 0, count,
+						      phb->fwnode,
+						      &pseries_irq_domain_ops, phb);
+	if (!phb->dev_domain) {
+		pr_err("PCI: failed to create IRQ domain bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		irq_domain_free_fwnode(phb->fwnode);
+		return -ENOMEM;
+	}
+
+	phb->msi_domain = pci_msi_create_irq_domain(of_node_to_fwnode(phb->dn),
+						    &pseries_msi_domain_info,
+						    phb->dev_domain);
+	if (!phb->msi_domain) {
+		pr_err("PCI: failed to create MSI IRQ domain bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		irq_domain_free_fwnode(phb->fwnode);
+		irq_domain_remove(phb->dev_domain);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+int pseries_msi_allocate_domains(struct pci_controller *phb)
+{
+	int count;
+
+	if (!__find_pe_total_msi(phb->dn, &count)) {
+		pr_err("PCI: failed to find MSIs for bridge %pOF (domain %d)\n",
+		       phb->dn, phb->global_number);
+		return -ENOSPC;
+	}
+
+	return __pseries_msi_allocate_domains(phb, count);
+}
+
+void pseries_msi_free_domains(struct pci_controller *phb)
+{
+	if (phb->msi_domain)
+		irq_domain_remove(phb->msi_domain);
+	if (phb->dev_domain)
+		irq_domain_remove(phb->dev_domain);
+	if (phb->fwnode)
+		irq_domain_free_fwnode(phb->fwnode);
 }
 
 static void rtas_msi_pci_irq_fixup(struct pci_dev *pdev)
···
 
 static int rtas_msi_init(void)
 {
-	struct pci_controller *phb;
-
 	query_token  = rtas_token("ibm,query-interrupt-source-number");
 	change_token = rtas_token("ibm,change-msi");
 
···
 	}
 
 	pr_debug("rtas_msi: Registering RTAS MSI callbacks.\n");
-
-	WARN_ON(pseries_pci_controller_ops.setup_msi_irqs);
-	pseries_pci_controller_ops.setup_msi_irqs = rtas_setup_msi_irqs;
-	pseries_pci_controller_ops.teardown_msi_irqs = rtas_teardown_msi_irqs;
-
-	list_for_each_entry(phb, &hose_list, list_node) {
-		WARN_ON(phb->controller_ops.setup_msi_irqs);
-		phb->controller_ops.setup_msi_irqs = rtas_setup_msi_irqs;
-		phb->controller_ops.teardown_msi_irqs = rtas_teardown_msi_irqs;
-	}
 
 	WARN_ON(ppc_md.pci_irq_fixup);
 	ppc_md.pci_irq_fixup = rtas_msi_pci_irq_fixup;
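The new `pseries_msi_mask()`/`pseries_msi_unmask()` callbacks above show the layered-chip pattern of hierarchical IRQ domains: mask the interrupt at the local (PCI MSI) level, then forward the same request to the parent controller. A toy userspace model of that chaining, with all names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a two-level mask: the MSI entry is masked at the PCI
 * level first, then the request is chained to the parent (interrupt
 * controller) level. Not kernel API -- just the control flow.
 */
struct toy_irq {
	bool pci_masked;     /* MSI mask bit in config space */
	bool parent_masked;  /* source masked at the controller */
};

static void toy_pci_msi_mask(struct toy_irq *d)
{
	d->pci_masked = true;
}

static void toy_parent_mask(struct toy_irq *d)
{
	d->parent_masked = true;
}

/* Analogue of pseries_msi_mask(): mask locally, then chain upward */
static void toy_msi_mask(struct toy_irq *d)
{
	toy_pci_msi_mask(d);
	toy_parent_mask(d);
}
```

Masking at both levels is what lets the MSI layer keep its bookkeeping (e.g. for restore after reset) while the underlying XIVE/XICS source is also quiesced.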
+4
arch/powerpc/platforms/pseries/pci_dlpar.c
···
 
 	pci_devs_phb_init_dynamic(phb);
 
+	pseries_msi_allocate_domains(phb);
+
 	/* Create EEH devices for the PHB */
 	eeh_phb_pe_create(phb);
 
···
 			return 1;
 		}
 	}
+
+	pseries_msi_free_domains(phb);
 
 	/* Remove the PCI bus and unregister the bridge device from sysfs */
 	phb->bus = NULL;
+2
arch/powerpc/platforms/pseries/pseries.h
···
 int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
 
 extern struct pci_controller_ops pseries_pci_controller_ops;
+int pseries_msi_allocate_domains(struct pci_controller *phb);
+void pseries_msi_free_domains(struct pci_controller *phb);
 
 unsigned long pseries_memory_block_size(void);
 
+1 -1
arch/powerpc/platforms/pseries/ras.c
···
 {
 	int recovered = 0;
 
-	if (!(regs->msr & MSR_RI)) {
+	if (regs_is_unrecoverable(regs)) {
 		/* If MSR_RI isn't set, we cannot recover */
 		pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");
 		recovered = 0;
+2
arch/powerpc/platforms/pseries/setup.c
···
 
 		/* create pci_dn's for DT nodes under this PHB */
 		pci_devs_phb_init_dynamic(phb);
+
+		pseries_msi_allocate_domains(phb);
 	}
 
 	of_node_put(root);
+1 -1
arch/powerpc/platforms/pseries/vas.c
···
  * Note: The hypervisor forwards an interrupt for each fault request.
  * So one fault CRB to process for each H_GET_NX_FAULT hcall.
  */
-irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
+static irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
 {
 	struct pseries_vas_window *txwin = data;
 	struct coprocessor_request_block crb;
+1 -1
arch/powerpc/sysdev/fsl_rio.c
···
 			__func__);
 		out_be32((u32 *)(rio_regs_win + RIO_LTLEDCSR),
 			 0);
-		regs_set_return_msr(regs, regs->msr | MSR_RI);
+		regs_set_recoverable(regs);
 		regs_set_return_ip(regs, extable_fixup(entry));
 		return 1;
 	}
+5 -8
arch/powerpc/sysdev/xics/ics-native.c
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_native_map(struct ics *ics, unsigned int virq)
+static int ics_native_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int vec = (unsigned int)virq_to_hw(virq);
 	struct ics_native *in = to_ics_native(ics);
 
-	pr_devel("%s: vec=0x%x\n", __func__, vec);
+	pr_devel("%s: hw_irq=0x%x\n", __func__, hw_irq);
 
-	if (vec < in->ibase || vec >= (in->ibase + in->icount))
+	if (hw_irq < in->ibase || hw_irq >= (in->ibase + in->icount))
 		return -EINVAL;
-
-	irq_set_chip_and_handler(virq, &ics_native_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, ics);
 
 	return 0;
 }
···
 }
 
 static struct ics ics_native_template = {
-	.map		= ics_native_map,
+	.check		= ics_native_check,
 	.mask_unknown	= ics_native_mask_unknown,
 	.get_server	= ics_native_get_server,
 	.host_match	= ics_native_host_match,
+	.chip		= &ics_native_irq_chip,
 };
 
 static int __init ics_native_add_one(struct device_node *np)
+11 -29
arch/powerpc/sysdev/xics/ics-opal.c
···
 
 static unsigned int ics_opal_startup(struct irq_data *d)
 {
-#ifdef CONFIG_PCI_MSI
-	/*
-	 * The generic MSI code returns with the interrupt disabled on the
-	 * card, using the MSI mask bits. Firmware doesn't appear to unmask
-	 * at that level, so we do it here by hand.
-	 */
-	if (irq_data_get_msi_desc(d))
-		pci_msi_unmask_irq(d);
-#endif
-
-	/* unmask it */
 	ics_opal_unmask_irq(d);
 	return 0;
 }
···
 	}
 	server = ics_opal_mangle_server(wanted_server);
 
-	pr_devel("ics-hal: set-affinity irq %d [hw 0x%x] server: 0x%x/0x%x\n",
+	pr_debug("ics-hal: set-affinity irq %d [hw 0x%x] server: 0x%x/0x%x\n",
 		 d->irq, hw_irq, wanted_server, server);
 
 	rc = opal_set_xive(hw_irq, server, priority);
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_opal_map(struct ics *ics, unsigned int virq);
-static void ics_opal_mask_unknown(struct ics *ics, unsigned long vec);
-static long ics_opal_get_server(struct ics *ics, unsigned long vec);
-
 static int ics_opal_host_match(struct ics *ics, struct device_node *node)
 {
 	return 1;
 }
 
-/* Only one global & state struct ics */
-static struct ics ics_hal = {
-	.map		= ics_opal_map,
-	.mask_unknown	= ics_opal_mask_unknown,
-	.get_server	= ics_opal_get_server,
-	.host_match	= ics_opal_host_match,
-};
-
-static int ics_opal_map(struct ics *ics, unsigned int virq)
+static int ics_opal_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int hw_irq = (unsigned int)virq_to_hw(virq);
 	int64_t rc;
 	__be16 server;
 	int8_t priority;
···
 	rc = opal_get_xive(hw_irq, &server, &priority);
 	if (rc != OPAL_SUCCESS)
 		return -ENXIO;
-
-	irq_set_chip_and_handler(virq, &ics_opal_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, &ics_hal);
 
 	return 0;
 }
···
 		return -1;
 	return ics_opal_unmangle_server(be16_to_cpu(server));
 }
+
+/* Only one global & state struct ics */
+static struct ics ics_hal = {
+	.check		= ics_opal_check,
+	.mask_unknown	= ics_opal_mask_unknown,
+	.get_server	= ics_opal_get_server,
+	.host_match	= ics_opal_host_match,
+	.chip		= &ics_opal_irq_chip,
+};
 
 int __init ics_opal_init(void)
 {
+13 -27
arch/powerpc/sysdev/xics/ics-rtas.c
···
 static int ibm_int_on;
 static int ibm_int_off;
 
-static int ics_rtas_map(struct ics *ics, unsigned int virq);
-static void ics_rtas_mask_unknown(struct ics *ics, unsigned long vec);
-static long ics_rtas_get_server(struct ics *ics, unsigned long vec);
-static int ics_rtas_host_match(struct ics *ics, struct device_node *node);
-
-/* Only one global & state struct ics */
-static struct ics ics_rtas = {
-	.map		= ics_rtas_map,
-	.mask_unknown	= ics_rtas_mask_unknown,
-	.get_server	= ics_rtas_get_server,
-	.host_match	= ics_rtas_host_match,
-};
-
 static void ics_rtas_unmask_irq(struct irq_data *d)
 {
 	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
···
 
 static unsigned int ics_rtas_startup(struct irq_data *d)
 {
-#ifdef CONFIG_PCI_MSI
-	/*
-	 * The generic MSI code returns with the interrupt disabled on the
-	 * card, using the MSI mask bits. Firmware doesn't appear to unmask
-	 * at that level, so we do it here by hand.
-	 */
-	if (irq_data_get_msi_desc(d))
-		pci_msi_unmask_irq(d);
-#endif
 	/* unmask it */
 	ics_rtas_unmask_irq(d);
 	return 0;
···
 		return -1;
 	}
 
+	pr_debug("%s: irq %d [hw 0x%x] server: 0x%x\n", __func__, d->irq,
+		 hw_irq, irq_server);
+
 	status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL,
 				     hw_irq, irq_server, xics_status[1]);
 
···
 	.irq_retrigger = xics_retrigger,
 };
 
-static int ics_rtas_map(struct ics *ics, unsigned int virq)
+static int ics_rtas_check(struct ics *ics, unsigned int hw_irq)
 {
-	unsigned int hw_irq = (unsigned int)virq_to_hw(virq);
 	int status[2];
 	int rc;
 
···
 	rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, hw_irq);
 	if (rc)
 		return -ENXIO;
-
-	irq_set_chip_and_handler(virq, &ics_rtas_irq_chip, handle_fasteoi_irq);
-	irq_set_chip_data(virq, &ics_rtas);
 
 	return 0;
 }
···
 	 */
 	return !of_device_is_compatible(node, "chrp,iic");
 }
+
+/* Only one global & state struct ics */
+static struct ics ics_rtas = {
+	.check		= ics_rtas_check,
+	.mask_unknown	= ics_rtas_mask_unknown,
+	.get_server	= ics_rtas_get_server,
+	.host_match	= ics_rtas_host_match,
+	.chip		= &ics_rtas_irq_chip,
+};
 
 __init int ics_rtas_init(void)
 {
+93 -38
arch/powerpc/sysdev/xics/xics-common.c
···
 
 struct irq_domain *xics_host;
 
-static LIST_HEAD(ics_list);
+static struct ics *xics_ics;
 
 void xics_update_irq_servers(void)
 {
···
 
 void xics_mask_unknown_vec(unsigned int vec)
 {
-	struct ics *ics;
-
 	pr_err("Interrupt 0x%x (real) is invalid, disabling it.\n", vec);
 
-	list_for_each_entry(ics, &ics_list, link)
-		ics->mask_unknown(ics, vec);
+	if (WARN_ON(!xics_ics))
+		return;
+	xics_ics->mask_unknown(xics_ics, vec);
 }
 
···
 	 * IPIs are marked IRQF_PERCPU. The handler was set in map.
 	 */
 	BUG_ON(request_irq(ipi, icp_ops->ipi_action,
-			   IRQF_PERCPU | IRQF_NO_THREAD, "IPI", NULL));
+			   IRQF_NO_DEBUG | IRQF_PERCPU | IRQF_NO_THREAD, "IPI", NULL));
 }
 
 void __init xics_smp_probe(void)
···
 	unsigned int irq, virq;
 	struct irq_desc *desc;
 
+	pr_debug("%s: CPU %u\n", __func__, cpu);
+
 	/* If we used to be the default server, move to the new "boot_cpuid" */
 	if (hw_cpu == xics_default_server)
 		xics_update_irq_servers();
···
 		struct irq_chip *chip;
 		long server;
 		unsigned long flags;
-		struct ics *ics;
+		struct irq_data *irqd;
 
 		/* We can't set affinity on ISA interrupts */
 		if (virq < NR_IRQS_LEGACY)
···
 		/* We only need to migrate enabled IRQS */
 		if (!desc->action)
 			continue;
-		if (desc->irq_data.domain != xics_host)
+		/* We need a mapping in the XICS IRQ domain */
+		irqd = irq_domain_get_irq_data(xics_host, virq);
+		if (!irqd)
 			continue;
-		irq = desc->irq_data.hwirq;
+		irq = irqd_to_hwirq(irqd);
 		/* We need to get IPIs still. */
 		if (irq == XICS_IPI || irq == XICS_IRQ_SPURIOUS)
 			continue;
···
 		raw_spin_lock_irqsave(&desc->lock, flags);
 
 		/* Locate interrupt server */
-		server = -1;
-		ics = irq_desc_get_chip_data(desc);
-		if (ics)
-			server = ics->get_server(ics, irq);
+		server = xics_ics->get_server(xics_ics, irq);
 		if (server < 0) {
-			printk(KERN_ERR "%s: Can't find server for irq %d\n",
-			       __func__, irq);
+			pr_err("%s: Can't find server for irq %d/%x\n",
+			       __func__, virq, irq);
 			goto unlock;
 		}
 
···
 static int xics_host_match(struct irq_domain *h, struct device_node *node,
 			   enum irq_domain_bus_token bus_token)
 {
-	struct ics *ics;
-
-	list_for_each_entry(ics, &ics_list, link)
-		if (ics->host_match(ics, node))
-			return 1;
-
-	return 0;
+	if (WARN_ON(!xics_ics))
+		return 0;
+	return xics_ics->host_match(xics_ics, node) ? 1 : 0;
 }
 
 /* Dummies */
···
 	.irq_unmask = xics_ipi_unmask,
 };
 
-static int xics_host_map(struct irq_domain *h, unsigned int virq,
-			 irq_hw_number_t hw)
+static int xics_host_map(struct irq_domain *domain, unsigned int virq,
+			 irq_hw_number_t hwirq)
 {
-	struct ics *ics;
-
-	pr_devel("xics: map virq %d, hwirq 0x%lx\n", virq, hw);
+	pr_devel("xics: map virq %d, hwirq 0x%lx\n", virq, hwirq);
 
 	/*
 	 * Mark interrupts as edge sensitive by default so that resend
···
 	irq_clear_status_flags(virq, IRQ_LEVEL);
 
 	/* Don't call into ICS for IPIs */
-	if (hw == XICS_IPI) {
+	if (hwirq == XICS_IPI) {
 		irq_set_chip_and_handler(virq, &xics_ipi_chip,
 					 handle_percpu_irq);
 		return 0;
 	}
 
-	/* Let the ICS setup the chip data */
-	list_for_each_entry(ics, &ics_list, link)
-		if (ics->map(ics, virq) == 0)
-			return 0;
+	if (WARN_ON(!xics_ics))
+		return -EINVAL;
 
-	return -EINVAL;
+	if (xics_ics->check(xics_ics, hwirq))
+		return -EINVAL;
+
+	/* No chip data for the XICS domain */
+	irq_domain_set_info(domain, virq, hwirq, xics_ics->chip,
+			    NULL, handle_fasteoi_irq, NULL, NULL);
+
+	return 0;
 }
 
 static int xics_host_xlate(struct irq_domain *h, struct device_node *ct,
···
 	return 0;
 }
 
+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
+static int xics_host_domain_translate(struct irq_domain *d, struct irq_fwspec *fwspec,
+				      unsigned long *hwirq, unsigned int *type)
+{
+	return xics_host_xlate(d, to_of_node(fwspec->fwnode), fwspec->param,
+			       fwspec->param_count, hwirq, type);
+}
+
+static int xics_host_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				  unsigned int nr_irqs, void *arg)
+{
+	struct irq_fwspec *fwspec = arg;
+	irq_hw_number_t hwirq;
+	unsigned int type = IRQ_TYPE_NONE;
+	int i, rc;
+
+	rc = xics_host_domain_translate(domain, fwspec, &hwirq, &type);
+	if (rc)
+		return rc;
+
+	pr_debug("%s %d/%lx #%d\n", __func__, virq, hwirq, nr_irqs);
+
+	for (i = 0; i < nr_irqs; i++)
+		irq_domain_set_info(domain, virq + i, hwirq + i, xics_ics->chip,
+				    xics_ics, handle_fasteoi_irq, NULL, NULL);
+
+	return 0;
+}
+
+static void xics_host_domain_free(struct irq_domain *domain,
+				  unsigned int virq, unsigned int nr_irqs)
+{
+	pr_debug("%s %d #%d\n", __func__, virq, nr_irqs);
+}
+#endif
+
 static const struct irq_domain_ops xics_host_ops = {
+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
+	.alloc	= xics_host_domain_alloc,
+	.free	= xics_host_domain_free,
+	.translate = xics_host_domain_translate,
+#endif
 	.match = xics_host_match,
 	.map = xics_host_map,
 	.xlate = xics_host_xlate,
 };
 
-static void __init xics_init_host(void)
+static int __init xics_allocate_domain(void)
 {
-	xics_host = irq_domain_add_tree(NULL, &xics_host_ops, NULL);
-	BUG_ON(xics_host == NULL);
+	struct fwnode_handle *fn;
+
+	fn = irq_domain_alloc_named_fwnode("XICS");
+	if (!fn)
+		return -ENOMEM;
+
+	xics_host = irq_domain_create_tree(fn, &xics_host_ops, NULL);
+	if (!xics_host) {
+		irq_domain_free_fwnode(fn);
+		return -ENOMEM;
+	}
+
 	irq_set_default_host(xics_host);
+	return 0;
 }
 
 void __init xics_register_ics(struct ics *ics)
 {
-	list_add(&ics->link, &ics_list);
+	if (WARN_ONCE(xics_ics, "XICS: Source Controller is already defined !"))
+		return;
+	xics_ics = ics;
 }
 
 static void __init xics_get_server_size(void)
···
 	/* Initialize common bits */
 	xics_get_server_size();
 	xics_update_irq_servers();
-	xics_init_host();
+	rc = xics_allocate_domain();
+	if (rc < 0)
+		pr_err("XICS: Failed to create IRQ domain");
 	xics_setup_cpu();
 }
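A recurring change in the XICS hunks above is replacing the `ics_list` linked list with a single `xics_ics` pointer, with `xics_register_ics()` refusing a second registration. The registration logic can be modeled in a few lines of userspace C; all names are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the single-controller registration pattern: the list of
 * ICS backends becomes one pointer, and a second registration is
 * refused (the kernel version emits WARN_ONCE and ignores it).
 */
struct toy_ics {
	const char *name;
};

static struct toy_ics *toy_xics_ics;

/* Returns 0 on success, -1 if a controller is already registered */
static int toy_register_ics(struct toy_ics *ics)
{
	if (toy_xics_ics)
		return -1;
	toy_xics_ics = ics;
	return 0;
}
```

Only one ICS backend (native, OPAL, or RTAS) can exist on a given platform, so a plain pointer removes the per-interrupt list walk that `xics_host_map()` previously needed.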
+77 -26
arch/powerpc/sysdev/xive/common.c
··· 21 21 #include <linux/msi.h> 22 22 #include <linux/vmalloc.h> 23 23 24 - #include <asm/debugfs.h> 25 24 #include <asm/prom.h> 26 25 #include <asm/io.h> 27 26 #include <asm/smp.h> ··· 312 313 struct irq_desc *desc; 313 314 314 315 for_each_irq_desc(i, desc) { 315 - struct irq_data *d = irq_desc_get_irq_data(desc); 316 - unsigned int hwirq = (unsigned int)irqd_to_hwirq(d); 316 + struct irq_data *d = irq_domain_get_irq_data(xive_irq_domain, i); 317 317 318 - if (d->domain == xive_irq_domain) 319 - xmon_xive_get_irq_config(hwirq, d); 318 + if (d) 319 + xmon_xive_get_irq_config(irqd_to_hwirq(d), d); 320 320 } 321 321 } 322 322 ··· 615 617 pr_devel("xive_irq_startup: irq %d [0x%x] data @%p\n", 616 618 d->irq, hw_irq, d); 617 619 618 - #ifdef CONFIG_PCI_MSI 619 - /* 620 - * The generic MSI code returns with the interrupt disabled on the 621 - * card, using the MSI mask bits. Firmware doesn't appear to unmask 622 - * at that level, so we do it here by hand. 623 - */ 624 - if (irq_data_get_msi_desc(d)) 625 - pci_msi_unmask_irq(d); 626 - #endif 627 - 628 620 /* Pick a target */ 629 621 target = xive_pick_irq_target(d, irq_data_get_affinity_mask(d)); 630 622 if (target == XIVE_INVALID_TARGET) { ··· 702 714 u32 target, old_target; 703 715 int rc = 0; 704 716 705 - pr_devel("xive_irq_set_affinity: irq %d\n", d->irq); 717 + pr_debug("%s: irq %d/%x\n", __func__, d->irq, hw_irq); 706 718 707 719 /* Is this valid ? */
708 720 if (cpumask_any_and(cpumask, cpu_online_mask) >= nr_cpu_ids) 709 721 return -EINVAL; 710 - 711 - /* Don't do anything if the interrupt isn't started */ 712 - if (!irqd_is_started(d)) 713 - return IRQ_SET_MASK_OK; 714 722 715 723 /* 716 724 * If existing target is already in the new mask, and is ··· 743 759 return rc; 744 760 } 745 761 746 - pr_devel(" target: 0x%x\n", target); 762 + pr_debug(" target: 0x%x\n", target); 747 763 xd->target = target; 748 764 749 765 /* Give up previous target */ ··· 974 990 975 991 void xive_cleanup_irq_data(struct xive_irq_data *xd) 976 992 { 993 + pr_debug("%s for HW %x\n", __func__, xd->hw_irq); 994 + 977 995 if (xd->eoi_mmio) { 978 996 iounmap(xd->eoi_mmio); 979 997 if (xd->eoi_mmio == xd->trig_mmio) ··· 1017 1031 return 0; 1018 1032 } 1019 1033 1020 - static void xive_irq_free_data(unsigned int virq) 1034 + void xive_irq_free_data(unsigned int virq) 1021 1035 { 1022 1036 struct xive_irq_data *xd = irq_get_handler_data(virq); 1023 1037 ··· 1027 1041 xive_cleanup_irq_data(xd); 1028 1042 kfree(xd); 1029 1043 } 1044 + EXPORT_SYMBOL_GPL(xive_irq_free_data); 1030 1045 1031 1046 #ifdef CONFIG_SMP 1032 1047 ··· 1166 1179 return 0; 1167 1180 1168 1181 ret = request_irq(xid->irq, xive_muxed_ipi_action, 1169 - IRQF_PERCPU | IRQF_NO_THREAD, 1182 + IRQF_NO_DEBUG | IRQF_PERCPU | IRQF_NO_THREAD, 1170 1183 xid->name, NULL); 1171 1184 1172 1185 WARN(ret < 0, "Failed to request IPI %d: %d\n", xid->irq, ret); ··· 1366 1379 } 1367 1380 #endif 1368 1381 1382 + #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY 1383 + static int xive_irq_domain_translate(struct irq_domain *d, 1384 + struct irq_fwspec *fwspec, 1385 + unsigned long *hwirq, 1386 + unsigned int *type) 1387 + { 1388 + return xive_irq_domain_xlate(d, to_of_node(fwspec->fwnode), 1389 + fwspec->param, fwspec->param_count, 1390 + hwirq, type); 1391 + } 1392 + 1393 + static int xive_irq_domain_alloc(struct irq_domain *domain, unsigned int virq, 1394 + unsigned int nr_irqs, void *arg) 1395 + { 1396 + struct irq_fwspec *fwspec = arg;
1397 + irq_hw_number_t hwirq; 1398 + unsigned int type = IRQ_TYPE_NONE; 1399 + int i, rc; 1400 + 1401 + rc = xive_irq_domain_translate(domain, fwspec, &hwirq, &type); 1402 + if (rc) 1403 + return rc; 1404 + 1405 + pr_debug("%s %d/%lx #%d\n", __func__, virq, hwirq, nr_irqs); 1406 + 1407 + for (i = 0; i < nr_irqs; i++) { 1408 + /* TODO: call xive_irq_domain_map() */ 1409 + 1410 + /* 1411 + * Mark interrupts as edge sensitive by default so that resend 1412 + * actually works. Will fix that up below if needed. 1413 + */ 1414 + irq_clear_status_flags(virq, IRQ_LEVEL); 1415 + 1416 + /* allocates and sets handler data */ 1417 + rc = xive_irq_alloc_data(virq + i, hwirq + i); 1418 + if (rc) 1419 + return rc; 1420 + 1421 + irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i, 1422 + &xive_irq_chip, domain->host_data); 1423 + irq_set_handler(virq + i, handle_fasteoi_irq); 1424 + } 1425 + 1426 + return 0; 1427 + } 1428 + 1429 + static void xive_irq_domain_free(struct irq_domain *domain, 1430 + unsigned int virq, unsigned int nr_irqs) 1431 + { 1432 + int i; 1433 + 1434 + pr_debug("%s %d #%d\n", __func__, virq, nr_irqs); 1435 + 1436 + for (i = 0; i < nr_irqs; i++) 1437 + xive_irq_free_data(virq + i); 1438 + } 1439 + #endif 1440 + 1369 1441 static const struct irq_domain_ops xive_irq_domain_ops = { 1442 + #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY 1443 + .alloc = xive_irq_domain_alloc, 1444 + .free = xive_irq_domain_free, 1445 + .translate = xive_irq_domain_translate, 1446 + #endif 1370 1447 .match = xive_irq_domain_match, 1371 1448 .map = xive_irq_domain_map, 1372 1449 .unmap = xive_irq_domain_unmap, ··· 1768 1717 xive_debug_show_cpu(m, cpu); 1769 1718 1770 1719 for_each_irq_desc(i, desc) { 1771 - struct irq_data *d = irq_desc_get_irq_data(desc); 1720 + struct irq_data *d = irq_domain_get_irq_data(xive_irq_domain, i); 1772 1721 1773 - if (d->domain == xive_irq_domain) 1722 + if (d) 1774 1723 xive_debug_show_irq(m, d); 1775 1724 } 1776 1725 return 0;
··· 1780 1729 int xive_core_debug_init(void) 1781 1730 { 1782 1731 if (xive_enabled()) 1783 - debugfs_create_file("xive", 0400, powerpc_debugfs_root, 1732 + debugfs_create_file("xive", 0400, arch_debugfs_dir, 1784 1733 NULL, &xive_core_debug_fops); 1785 1734 return 0; 1786 1735 }
+10
arch/powerpc/sysdev/xive/native.c
··· 41 41 static u32 xive_pool_vps = XIVE_INVALID_VP; 42 42 static struct kmem_cache *xive_provision_cache; 43 43 static bool xive_has_single_esc; 44 + static bool xive_has_save_restore; 44 45 45 46 int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data) 46 47 { ··· 589 588 if (of_get_property(np, "single-escalation-support", NULL) != NULL) 590 589 xive_has_single_esc = true; 591 590 591 + if (of_get_property(np, "vp-save-restore", NULL)) 592 + xive_has_save_restore = true; 593 + 592 594 /* Configure Thread Management areas for KVM */ 593 595 for_each_possible_cpu(cpu) 594 596 kvmppc_set_xive_tima(cpu, r.start, tima); ··· 755 751 return xive_has_single_esc; 756 752 } 757 753 EXPORT_SYMBOL_GPL(xive_native_has_single_escalation); 754 + 755 + bool xive_native_has_save_restore(void) 756 + { 757 + return xive_has_save_restore; 758 + } 759 + EXPORT_SYMBOL_GPL(xive_native_has_save_restore); 758 760 759 761 int xive_native_get_queue_info(u32 vp_id, u32 prio, 760 762 u64 *out_qpage,
+12 -12
arch/powerpc/tools/head_check.sh
··· 49 49 $nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a text_start$" -e " t start_text$" > .tmp_symbols.txt 50 50 51 51 52 - vma=$(cat .tmp_symbols.txt | grep -e " [TA] _stext$" | cut -d' ' -f1) 52 + vma=$(grep -e " [TA] _stext$" .tmp_symbols.txt | cut -d' ' -f1) 53 53 54 - expected_start_head_addr=$vma 54 + expected_start_head_addr="$vma" 55 55 56 - start_head_addr=$(cat .tmp_symbols.txt | grep " t start_first_256B$" | cut -d' ' -f1) 56 + start_head_addr=$(grep " t start_first_256B$" .tmp_symbols.txt | cut -d' ' -f1) 57 57 58 58 if [ "$start_head_addr" != "$expected_start_head_addr" ]; then 59 - echo "ERROR: head code starts at $start_head_addr, should be $expected_start_head_addr" 60 - echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 61 - echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 59 + echo "ERROR: head code starts at $start_head_addr, should be $expected_start_head_addr" 1>&2 60 + echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 1>&2 61 + echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 1>&2 62 62 63 63 exit 1 64 64 fi 65 65 66 - top_vma=$(echo $vma | cut -d'0' -f1) 66 + top_vma=$(echo "$vma" | cut -d'0' -f1) 67 67 68 - expected_start_text_addr=$(cat .tmp_symbols.txt | grep " a text_start$" | cut -d' ' -f1 | sed "s/^0/$top_vma/") 68 + expected_start_text_addr=$(grep " a text_start$" .tmp_symbols.txt | cut -d' ' -f1 | sed "s/^0/$top_vma/") 69 69 70 - start_text_addr=$(cat .tmp_symbols.txt | grep " t start_text$" | cut -d' ' -f1) 70 + start_text_addr=$(grep " t start_text$" .tmp_symbols.txt | cut -d' ' -f1) 71 71 72 72 if [ "$start_text_addr" != "$expected_start_text_addr" ]; then 73 - echo "ERROR: start_text address is $start_text_addr, should be $expected_start_text_addr" 74 - echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 75 - echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 73 + echo "ERROR: start_text address is $start_text_addr, should be $expected_start_text_addr" 1>&2 74 + echo "ERROR: try to enable LD_HEAD_STUB_CATCH config option" 1>&2 75 + echo "ERROR: see comments in arch/powerpc/tools/head_check.sh" 1>&2 76 76 77 77 exit 1 78 78 fi
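Two patterns recur in the head_check.sh cleanup: `grep pattern file` replaces the useless-use-of-cat pipeline `cat file | grep pattern`, and every ERROR message gains `1>&2` so it lands on stderr instead of stdout. A standalone illustration with a throwaway symbols file (the addresses are invented):

```shell
#!/bin/sh
tmp=$(mktemp)
printf '%s\n' 'c000000000000000 T _stext' \
              'c000000000000000 t start_first_256B' > "$tmp"

# grep reads the file directly -- no "cat | grep" pipeline needed
vma=$(grep -e " [TA] _stext$" "$tmp" | cut -d' ' -f1)
start_head_addr=$(grep " t start_first_256B$" "$tmp" | cut -d' ' -f1)

if [ "$start_head_addr" != "$vma" ]; then
	# diagnostics go to stderr, not stdout
	echo "ERROR: head code starts at $start_head_addr, should be $vma" 1>&2
fi
echo "head VMA: $vma"
rm -f "$tmp"
```

Routing errors to stderr matters when the script's stdout is consumed by the build system; quoting `"$vma"` likewise guards against word splitting.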
+6 -16
arch/powerpc/xmon/xmon.c
··· 26 26 #include <linux/ctype.h> 27 27 #include <linux/highmem.h> 28 28 #include <linux/security.h> 29 + #include <linux/debugfs.h> 29 30 30 - #include <asm/debugfs.h> 31 31 #include <asm/ptrace.h> 32 32 #include <asm/smp.h> 33 33 #include <asm/string.h> ··· 482 482 static inline void release_output_lock(void) {} 483 483 #endif 484 484 485 - static inline int unrecoverable_excp(struct pt_regs *regs) 486 - { 487 - #if defined(CONFIG_4xx) || defined(CONFIG_PPC_BOOK3E) 488 - /* We have no MSR_RI bit on 4xx or Book3e, so we simply return false */ 489 - return 0; 490 - #else 491 - return ((regs->msr & MSR_RI) == 0); 492 - #endif 493 - } 494 - 495 485 static void xmon_touch_watchdogs(void) 496 486 { 497 487 touch_softlockup_watchdog_sync(); ··· 555 565 bp = NULL; 556 566 if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) 557 567 bp = at_breakpoint(regs->nip); 558 - if (bp || unrecoverable_excp(regs)) 568 + if (bp || regs_is_unrecoverable(regs)) 559 569 fromipi = 0; 560 570 561 571 if (!fromipi) { ··· 567 577 cpu, BP_NUM(bp)); 568 578 xmon_print_symbol(regs->nip, " ", ")\n"); 569 579 } 570 - if (unrecoverable_excp(regs)) 580 + if (regs_is_unrecoverable(regs)) 571 581 printf("WARNING: exception is not recoverable, " 572 582 "can't continue\n"); 573 583 release_output_lock(); ··· 683 693 printf("Stopped at breakpoint %tx (", BP_NUM(bp)); 684 694 xmon_print_symbol(regs->nip, " ", ")\n"); 685 695 } 686 - if (unrecoverable_excp(regs)) 696 + if (regs_is_unrecoverable(regs)) 687 697 printf("WARNING: exception is not recoverable, " 688 698 "can't continue\n"); 689 699 remove_bpts(); ··· 4067 4077 4068 4078 static int __init setup_xmon_dbgfs(void) 4069 4079 { 4070 - debugfs_create_file("xmon", 0600, powerpc_debugfs_root, NULL, 4071 - &xmon_dbgfs_ops); 4080 + debugfs_create_file("xmon", 0600, arch_debugfs_dir, NULL, 4081 + &xmon_dbgfs_ops); 4072 4082 return 0; 4073 4083 } 4074 4084 device_initcall(setup_xmon_dbgfs);
+14 -2
drivers/cpufreq/powernv-cpufreq.c
··· 36 36 #define MAX_PSTATE_SHIFT 32 37 37 #define LPSTATE_SHIFT 48 38 38 #define GPSTATE_SHIFT 56 39 + #define MAX_NR_CHIPS 32 39 40 40 41 #define MAX_RAMP_DOWN_TIME 5120 41 42 /* ··· 1047 1046 unsigned int *chip; 1048 1047 unsigned int cpu, i; 1049 1048 unsigned int prev_chip_id = UINT_MAX; 1049 + cpumask_t *chip_cpu_mask; 1050 1050 int ret = 0; 1051 1051 1052 1052 chip = kcalloc(num_possible_cpus(), sizeof(*chip), GFP_KERNEL); 1053 1053 if (!chip) 1054 1054 return -ENOMEM; 1055 + 1056 + /* Allocate a chip cpu mask large enough to fit mask for all chips */ 1057 + chip_cpu_mask = kcalloc(MAX_NR_CHIPS, sizeof(cpumask_t), GFP_KERNEL); 1058 + if (!chip_cpu_mask) { 1059 + ret = -ENOMEM; 1060 + goto free_and_return; 1061 + } 1055 1062 1056 1063 for_each_possible_cpu(cpu) { 1057 1064 unsigned int id = cpu_to_chip_id(cpu); ··· 1068 1059 prev_chip_id = id; 1069 1060 chip[nr_chips++] = id; 1070 1061 } 1062 + cpumask_set_cpu(cpu, &chip_cpu_mask[nr_chips-1]); 1071 1063 } 1072 1064 1073 1065 chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL); 1074 1066 if (!chips) { 1075 1067 ret = -ENOMEM; 1076 - goto free_and_return; 1068 + goto out_free_chip_cpu_mask; 1077 1069 } 1078 1070 1079 1071 for (i = 0; i < nr_chips; i++) { 1080 1072 chips[i].id = chip[i]; 1081 - cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i])); 1073 + cpumask_copy(&chips[i].mask, &chip_cpu_mask[i]); 1082 1074 INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn); 1083 1075 for_each_cpu(cpu, &chips[i].mask) 1084 1076 per_cpu(chip_info, cpu) = &chips[i]; 1085 1077 } 1086 1078 1079 + out_free_chip_cpu_mask: 1080 + kfree(chip_cpu_mask); 1087 1081 free_and_return: 1088 1082 kfree(chip); 1089 1083 return ret;
+47 -32
drivers/cpuidle/cpuidle-pseries.c
··· 346 346 static void __init fixup_cede0_latency(void) 347 347 { 348 348 struct xcede_latency_payload *payload; 349 - u64 min_latency_us; 349 + u64 min_xcede_latency_us = UINT_MAX; 350 350 int i; 351 - 352 - min_latency_us = dedicated_states[1].exit_latency; // CEDE latency 353 351 354 352 if (parse_cede_parameters()) 355 353 return; ··· 356 358 nr_xcede_records); 357 359 358 360 payload = &xcede_latency_parameter.payload; 361 + 362 + /* 363 + * The CEDE idle state maps to CEDE(0). While the hypervisor 364 + * does not advertise CEDE(0) exit latency values, it does 365 + * advertise the latency values of the extended CEDE states. 366 + * We use the lowest advertised exit latency value as a proxy 367 + * for the exit latency of CEDE(0). 368 + */ 359 369 for (i = 0; i < nr_xcede_records; i++) { 360 370 struct xcede_latency_record *record = &payload->records[i]; 371 + u8 hint = record->hint; 361 372 u64 latency_tb = be64_to_cpu(record->latency_ticks); 362 373 u64 latency_us = DIV_ROUND_UP_ULL(tb_to_ns(latency_tb), NSEC_PER_USEC); 363 374 364 - if (latency_us == 0) 365 - pr_warn("cpuidle: xcede record %d has an unrealistic latency of 0us.\n", i); 375 + /* 376 + * We expect the exit latency of an extended CEDE 377 + * state to be non-zero, since it takes at least 378 + * a few nanoseconds to wakeup the idle CPU and 379 + * dispatch the virtual processor into the Linux 380 + * Guest. 381 + * 382 + * So we consider only non-zero value for performing 383 + * the fixup of CEDE(0) latency. 384 + */ 385 + if (latency_us == 0) { 386 + pr_warn("cpuidle: Skipping xcede record %d [hint=%d]. Exit latency = 0us\n", 387 + i, hint); 388 + continue; 389 + } 366 390 367 - if (latency_us < min_latency_us) 368 - min_latency_us = latency_us; 391 + if (latency_us < min_xcede_latency_us) 392 + min_xcede_latency_us = latency_us; 369 393 } 370 394 371 - /* 372 - * By default, we assume that CEDE(0) has exit latency 10us, 373 - * since there is no way for us to query from the platform. 
374 - * 375 - * However, if the wakeup latency of an Extended CEDE state is 376 - * smaller than 10us, then we can be sure that CEDE(0) 377 - * requires no more than that. 378 - * 379 - * Perform the fix-up. 380 - */ 381 - if (min_latency_us < dedicated_states[1].exit_latency) { 382 - /* 383 - * We set a minimum of 1us wakeup latency for cede0 to 384 - * distinguish it from snooze 385 - */ 386 - u64 cede0_latency = 1; 387 - 388 - if (min_latency_us > cede0_latency) 389 - cede0_latency = min_latency_us - 1; 390 - 391 - dedicated_states[1].exit_latency = cede0_latency; 392 - dedicated_states[1].target_residency = 10 * (cede0_latency); 395 + if (min_xcede_latency_us != UINT_MAX) { 396 + dedicated_states[1].exit_latency = min_xcede_latency_us; 397 + dedicated_states[1].target_residency = 10 * (min_xcede_latency_us); 393 398 pr_info("cpuidle: Fixed up CEDE exit latency to %llu us\n", 394 - cede0_latency); 399 + min_xcede_latency_us); 395 400 } 396 401 397 402 } ··· 403 402 * pseries_idle_probe() 404 403 * Choose state table for shared versus dedicated partition 405 404 */ 406 - static int pseries_idle_probe(void) 405 + static int __init pseries_idle_probe(void) 407 406 { 408 407 409 408 if (cpuidle_disable != IDLE_NO_OVERRIDE) ··· 420 419 cpuidle_state_table = shared_states; 421 420 max_idle_state = ARRAY_SIZE(shared_states); 422 421 } else { 423 - fixup_cede0_latency(); 422 + /* 423 + * Use firmware provided latency values 424 + * starting with POWER10 platforms. In the 425 + * case that we are running on a POWER10 426 + * platform but in an earlier compat mode, we 427 + * can still use the firmware provided values. 428 + * 429 + * However, on platforms prior to POWER10, we 430 + * cannot rely on the accuracy of the firmware 431 + * provided latency values. On such platforms, 432 + * go with the conservative default estimate 433 + * of 10us. 
434 + */ 435 + if (cpu_has_feature(CPU_FTR_ARCH_31) || pvr_version_is(PVR_POWER10)) 436 + fixup_cede0_latency(); 424 437 cpuidle_state_table = dedicated_states; 425 438 max_idle_state = NR_DEDICATED_STATES; 426 439 }
+1
kernel/irq/irqdomain.c
··· 491 491 { 492 492 return irq_default_domain; 493 493 } 494 + EXPORT_SYMBOL_GPL(irq_get_default_host); 494 495 495 496 static bool irq_domain_is_nomap(struct irq_domain *domain) 496 497 {
+1 -1
scripts/mod/modpost.c
··· 931 931 ".kprobes.text", ".cpuidle.text", ".noinstr.text" 932 932 #define OTHER_TEXT_SECTIONS ".ref.text", ".head.text", ".spinlock.text", \ 933 933 ".fixup", ".entry.text", ".exception.text", ".text.*", \ 934 - ".coldtext" 934 + ".coldtext", ".softirqentry.text" 935 935 936 936 #define INIT_SECTIONS ".init.*" 937 937 #define MEM_INIT_SECTIONS ".meminit.*"
+2 -1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
··· 57 57 : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), 58 58 [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (&a), 59 59 [flt_2] "b" (&b), [cptr1] "b" (&cptr[1]) 60 - : "memory", "r7", "r8", "r9", "r10", 60 + : "memory", "r0", "r7", "r8", "r9", "r10", 61 61 "r11", "r12", "r13", "r14", "r15", "r16", 62 62 "r17", "r18", "r19", "r20", "r21", "r22", 63 63 "r23", "r24", "r25", "r26", "r27", "r28", ··· 113 113 int ret, status; 114 114 115 115 SKIP_IF(!have_htm()); 116 + SKIP_IF(htm_is_synthetic()); 116 117 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 117 118 pid = fork(); 118 119 if (pid < 0) {
+2 -1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr.c
··· 65 65 : [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2), [gpr_4]"i"(GPR_4), 66 66 [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (&a), 67 67 [flt_4] "b" (&d) 68 - : "memory", "r5", "r6", "r7", 68 + : "memory", "r0", "r5", "r6", "r7", 69 69 "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", 70 70 "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", 71 71 "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31" ··· 119 119 int ret, status; 120 120 121 121 SKIP_IF(!have_htm()); 122 + SKIP_IF(htm_is_synthetic()); 122 123 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 123 124 pid = fork(); 124 125 if (pid < 0) {
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
··· 129 129 int ret, status; 130 130 131 131 SKIP_IF(!have_htm()); 132 + SKIP_IF(htm_is_synthetic()); 132 133 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 133 134 pid = fork(); 134 135 if (pid == 0)
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
··· 129 129 int ret, status, i; 130 130 131 131 SKIP_IF(!have_htm()); 132 + SKIP_IF(htm_is_synthetic()); 132 133 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 3, 0777|IPC_CREAT); 133 134 134 135 for (i = 0; i < 128; i++) {
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
··· 114 114 int ret, status; 115 115 116 116 SKIP_IF(!have_htm()); 117 + SKIP_IF(htm_is_synthetic()); 117 118 shm_id = shmget(IPC_PRIVATE, sizeof(struct shared), 0777|IPC_CREAT); 118 119 shm_id1 = shmget(IPC_PRIVATE, sizeof(int), 0777|IPC_CREAT); 119 120 pid = fork();
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c
··· 117 117 int ret, status; 118 118 119 119 SKIP_IF(!have_htm()); 120 + SKIP_IF(htm_is_synthetic()); 120 121 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 121 122 pid = fork(); 122 123 if (pid == 0)
+1
tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c
··· 113 113 int ret, status, i; 114 114 115 115 SKIP_IF(!have_htm()); 116 + SKIP_IF(htm_is_synthetic()); 116 117 shm_id = shmget(IPC_PRIVATE, sizeof(int) * 2, 0777|IPC_CREAT); 117 118 118 119 for (i = 0; i < 128; i++) {
+1
tools/testing/selftests/powerpc/signal/signal_tm.c
··· 56 56 } 57 57 58 58 SKIP_IF(!have_htm()); 59 + SKIP_IF(htm_is_synthetic()); 59 60 60 61 for (i = 0; i < MAX_ATTEMPT; i++) { 61 62 /*
+1
tools/testing/selftests/powerpc/tm/tm-exec.c
··· 27 27 static int test_exec(void) 28 28 { 29 29 SKIP_IF(!have_htm()); 30 + SKIP_IF(htm_is_synthetic()); 30 31 31 32 asm __volatile__( 32 33 "tbegin.;"
+1
tools/testing/selftests/powerpc/tm/tm-fork.c
··· 21 21 int test_fork(void) 22 22 { 23 23 SKIP_IF(!have_htm()); 24 + SKIP_IF(htm_is_synthetic()); 24 25 25 26 asm __volatile__( 26 27 "tbegin.;"
+1 -1
tools/testing/selftests/powerpc/tm/tm-poison.c
··· 20 20 #include <sched.h> 21 21 #include <sys/types.h> 22 22 #include <signal.h> 23 - #include <inttypes.h> 24 23 25 24 #include "tm.h" 26 25 ··· 33 34 bool fail_vr = false; 34 35 35 36 SKIP_IF(!have_htm()); 37 + SKIP_IF(htm_is_synthetic()); 36 38 37 39 cpu = pick_online_cpu(); 38 40 FAIL_IF(cpu < 0);
+1
tools/testing/selftests/powerpc/tm/tm-resched-dscr.c
··· 40 40 uint64_t rv, dscr1 = 1, dscr2, texasr; 41 41 42 42 SKIP_IF(!have_htm()); 43 + SKIP_IF(htm_is_synthetic()); 43 44 44 45 printf("Check DSCR TM context switch: "); 45 46 fflush(stdout);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-fpu.c
··· 79 79 pid_t pid = getpid(); 80 80 81 81 SKIP_IF(!have_htm()); 82 + SKIP_IF(htm_is_synthetic()); 82 83 83 84 act.sa_sigaction = signal_usr1; 84 85 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-gpr.c
··· 81 81 pid_t pid = getpid(); 82 82 83 83 SKIP_IF(!have_htm()); 84 + SKIP_IF(htm_is_synthetic()); 84 85 85 86 act.sa_sigaction = signal_usr1; 86 87 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vmx.c
··· 104 104 pid_t pid = getpid(); 105 105 106 106 SKIP_IF(!have_htm()); 107 + SKIP_IF(htm_is_synthetic()); 107 108 108 109 act.sa_sigaction = signal_usr1; 109 110 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-context-chk-vsx.c
··· 153 153 pid_t pid = getpid(); 154 154 155 155 SKIP_IF(!have_htm()); 156 + SKIP_IF(htm_is_synthetic()); 156 157 157 158 act.sa_sigaction = signal_usr1; 158 159 sigemptyset(&act.sa_mask);
+1
tools/testing/selftests/powerpc/tm/tm-signal-pagefault.c
··· 226 226 stack_t ss; 227 227 228 228 SKIP_IF(!have_htm()); 229 + SKIP_IF(htm_is_synthetic()); 229 230 SKIP_IF(!have_userfaultfd()); 230 231 231 232 setup_uf_mem();
+1
tools/testing/selftests/powerpc/tm/tm-signal-sigreturn-nt.c
··· 32 32 struct sigaction trap_sa; 33 33 34 34 SKIP_IF(!have_htm()); 35 + SKIP_IF(htm_is_synthetic()); 35 36 36 37 trap_sa.sa_flags = SA_SIGINFO; 37 38 trap_sa.sa_sigaction = trap_signal_handler;
+1
tools/testing/selftests/powerpc/tm/tm-signal-stack.c
··· 35 35 int pid; 36 36 37 37 SKIP_IF(!have_htm()); 38 + SKIP_IF(htm_is_synthetic()); 38 39 39 40 pid = fork(); 40 41 if (pid < 0)
+1
tools/testing/selftests/powerpc/tm/tm-sigreturn.c
··· 55 55 uint64_t ret = 0; 56 56 57 57 SKIP_IF(!have_htm()); 58 + SKIP_IF(htm_is_synthetic()); 58 59 SKIP_IF(!is_ppc64le()); 59 60 60 61 memset(&sa, 0, sizeof(sa));
+1 -1
tools/testing/selftests/powerpc/tm/tm-syscall.c
··· 25 25 unsigned retries = 0; 26 26 27 27 #define TEST_DURATION 10 /* seconds */ 28 - #define TM_RETRIES 100 29 28 30 29 pid_t getppid_tm(bool suspend) 31 30 { ··· 66 67 struct timeval end, now; 67 68 68 69 SKIP_IF(!have_htm_nosc()); 70 + SKIP_IF(htm_is_synthetic()); 69 71 70 72 setbuf(stdout, NULL); 71 73
+1
tools/testing/selftests/powerpc/tm/tm-tar.c
··· 26 26 int i; 27 27 28 28 SKIP_IF(!have_htm()); 29 + SKIP_IF(htm_is_synthetic()); 29 30 SKIP_IF(!is_ppc64le()); 30 31 31 32 for (i = 0; i < num_loops; i++)
+1
tools/testing/selftests/powerpc/tm/tm-tmspr.c
··· 96 96 unsigned long i; 97 97 98 98 SKIP_IF(!have_htm()); 99 + SKIP_IF(htm_is_synthetic()); 99 100 100 101 /* To cause some context switching */ 101 102 thread_num = 10 * sysconf(_SC_NPROCESSORS_ONLN);
+1
tools/testing/selftests/powerpc/tm/tm-trap.c
··· 255 255 struct sigaction trap_sa; 256 256 257 257 SKIP_IF(!have_htm()); 258 + SKIP_IF(htm_is_synthetic()); 258 259 259 260 trap_sa.sa_flags = SA_SIGINFO; 260 261 trap_sa.sa_sigaction = trap_signal_handler;
+1
tools/testing/selftests/powerpc/tm/tm-unavailable.c
··· 344 344 cpu_set_t cpuset; 345 345 346 346 SKIP_IF(!have_htm()); 347 + SKIP_IF(htm_is_synthetic()); 347 348 348 349 cpu = pick_online_cpu(); 349 350 FAIL_IF(cpu < 0);
+1
tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c
··· 91 91 pthread_t *thread; 92 92 93 93 SKIP_IF(!have_htm()); 94 + SKIP_IF(htm_is_synthetic()); 94 95 95 96 passed = 1; 96 97
+1
tools/testing/selftests/powerpc/tm/tm-vmxcopy.c
··· 46 46 uint64_t aborted = 0; 47 47 48 48 SKIP_IF(!have_htm()); 49 + SKIP_IF(htm_is_synthetic()); 49 50 SKIP_IF(!is_ppc64le()); 50 51 51 52 fd = mkstemp(tmpfile);
+36
tools/testing/selftests/powerpc/tm/tm.h
··· 10 10 #include <asm/tm.h> 11 11 12 12 #include "utils.h" 13 + #include "reg.h" 14 + 15 + #define TM_RETRIES 100 13 16 14 17 static inline bool have_htm(void) 15 18 { ··· 32 29 printf("PPC_FEATURE2_HTM_NOSC not defined, can't check AT_HWCAP2\n"); 33 30 return false; 34 31 #endif 32 + } 33 + 34 + /* 35 + * Transactional Memory was removed in ISA 3.1. A synthetic TM implementation 36 + * is provided on P10 for threads running in P8/P9 compatibility mode. The 37 + * synthetic implementation immediately fails after tbegin. This failure sets 38 + * Bit 7 (Failure Persistent) and Bit 15 (Implementation-specific). 39 + */ 40 + static inline bool htm_is_synthetic(void) 41 + { 42 + int i; 43 + 44 + /* 45 + * Per the ISA, the Failure Persistent bit may be incorrect. Try a few 46 + * times in case we got an Implementation-specific failure on a non ISA 47 + * v3.1 system. On these systems the Implementation-specific failure 48 + * should not be persistent. 49 + */ 50 + for (i = 0; i < TM_RETRIES; i++) { 51 + asm volatile( 52 + "tbegin.;" 53 + "beq 1f;" 54 + "tend.;" 55 + "1:" 56 + : 57 + : 58 + : "memory"); 59 + 60 + if ((__builtin_get_texasr() & (TEXASR_FP | TEXASR_IC)) != 61 + (TEXASR_FP | TEXASR_IC)) 62 + break; 63 + } 64 + return i == TM_RETRIES; 35 65 } 36 66 37 67 static inline long failure_code(void)