Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

- SMMU Updates from Will Deacon:

- SMMUv3:
- Support stalling faults for platform devices
- Decrease default sizes for the event and PRI queues
- SMMUv2:
- Support for a new '->probe_finalize' hook, needed by Nvidia
- Even more Qualcomm compatible strings
- Avoid Adreno TTBR1 quirk for DB820C platform

- Intel VT-d updates from Lu Baolu:

- Convert Intel IOMMU to use sva_lib helpers in iommu core
- ftrace and debugfs support for page fault handling
- Support asynchronous nested capabilities
- Various misc cleanups

- Support for new VIOT ACPI table to make the VirtIO IOMMU
available on x86

- Add the amd_iommu=force_enable command line option to enable
the IOMMU on platforms where it is known to cause problems

- Support for version 2 of the Rockchip IOMMU

- Various smaller fixes, cleanups and refactorings

* tag 'iommu-updates-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
iommu/virtio: Enable x86 support
iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()
ACPI: Add driver for the VIOT table
ACPI: Move IOMMU setup code out of IORT
ACPI: arm64: Move DMA setup operations out of IORT
iommu/vt-d: Fix dereference of pointer info before it is null checked
iommu: Update "iommu.strict" documentation
iommu/arm-smmu: Check smmu->impl pointer before dereferencing
iommu/arm-smmu-v3: Remove unnecessary oom message
iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation
iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
iommu/vt-d: Fix linker error on 32-bit
iommu/vt-d: No need to typecast
iommu/vt-d: Define counter explicitly as unsigned int
iommu/vt-d: Remove unnecessary braces
iommu/vt-d: Removed unused iommu_count in dmar domain
iommu/vt-d: Use bitfields for DMAR capabilities
iommu/vt-d: Use DEVICE_ATTR_RO macro
iommu/vt-d: Fix out-bounds-warning in intel/svm.c
iommu/vt-d: Add PRQ handling latency sampling
...

+2175 -784
+8 -1
Documentation/admin-guide/kernel-parameters.txt
··· 301 301 allowed anymore to lift isolation 302 302 requirements as needed. This option 303 303 does not override iommu=pt 304 + force_enable - Force enable the IOMMU on platforms known 305 + to be buggy with IOMMU enabled. Use this 306 + option with care. 304 307 305 308 amd_iommu_dump= [HW,X86-64] 306 309 Enable AMD IOMMU driver option to dump the ACPI table ··· 2034 2031 forcing Dual Address Cycle for PCI cards supporting 2035 2032 greater than 32-bit addressing. 2036 2033 2037 - iommu.strict= [ARM64] Configure TLB invalidation behaviour 2034 + iommu.strict= [ARM64, X86] Configure TLB invalidation behaviour 2038 2035 Format: { "0" | "1" } 2039 2036 0 - Lazy mode. 2040 2037 Request that DMA unmap operations use deferred ··· 2045 2042 1 - Strict mode (default). 2046 2043 DMA unmap operations invalidate IOMMU hardware TLBs 2047 2044 synchronously. 2045 + Note: on x86, the default behaviour depends on the 2046 + equivalent driver-specific parameters, but a strict 2047 + mode explicitly specified by either method takes 2048 + precedence. 2048 2049 2049 2050 iommu.passthrough= 2050 2051 [ARM64, X86] Configure DMA to bypass the IOMMU by default.
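Taken together, the two parameters documented above can be combined on a single kernel command line; as an illustration (bootloader configuration shown is GRUB's, and whether force_enable is appropriate depends entirely on the platform):

```
# Force-enable the AMD IOMMU despite a known platform quirk, and opt in
# to lazy (deferred) TLB invalidation. Per the documentation above, an
# explicitly requested strict mode would take precedence on x86.
GRUB_CMDLINE_LINUX="amd_iommu=force_enable iommu.strict=0"
```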
+18
Documentation/devicetree/bindings/iommu/iommu.txt
··· 92 92 tagging DMA transactions with an address space identifier. By default, 93 93 this is 0, which means that the device only has one address space. 94 94 95 + - dma-can-stall: When present, the master can wait for a transaction to 96 + complete for an indefinite amount of time. Upon translation fault some 97 + IOMMUs, instead of aborting the translation immediately, may first 98 + notify the driver and keep the transaction in flight. This allows the OS 99 + to inspect the fault and, for example, make physical pages resident 100 + before updating the mappings and completing the transaction. Such IOMMU 101 + accepts a limited number of simultaneous stalled transactions before 102 + having to either put back-pressure on the master, or abort new faulting 103 + transactions. 104 + 105 + Firmware has to opt-in stalling, because most buses and masters don't 106 + support it. In particular it isn't compatible with PCI, where 107 + transactions have to complete before a time limit. More generally it 108 + won't work in systems and masters that haven't been designed for 109 + stalling. For example the OS, in order to handle a stalled transaction, 110 + may attempt to retrieve pages from secondary storage in a stalled 111 + domain, leading to a deadlock. 112 + 95 113 96 114 Notes: 97 115 ======
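As an illustration of the new binding, a master that tolerates stalled transactions would add the flag alongside its iommus reference. The node, compatible string, and specifier values below are hypothetical:

```
master@a0000 {
	compatible = "vendor,stall-capable-master";  /* hypothetical */
	reg = <0xa0000 0x1000>;
	iommus = <&smmu 0x10>;
	/* Opt in to stall mode. Only valid when both the master and the
	 * IOMMU were designed for stalling; never set for PCI devices. */
	dma-can-stall;
};
```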
-38
Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
··· 1 - Rockchip IOMMU 2 - ============== 3 - 4 - A Rockchip DRM iommu translates io virtual addresses to physical addresses for 5 - its master device. Each slave device is bound to a single master device, and 6 - shares its clocks, power domain and irq. 7 - 8 - Required properties: 9 - - compatible : Should be "rockchip,iommu" 10 - - reg : Address space for the configuration registers 11 - - interrupts : Interrupt specifier for the IOMMU instance 12 - - interrupt-names : Interrupt name for the IOMMU instance 13 - - #iommu-cells : Should be <0>. This indicates the iommu is a 14 - "single-master" device, and needs no additional information 15 - to associate with its master device. See: 16 - Documentation/devicetree/bindings/iommu/iommu.txt 17 - - clocks : A list of clocks required for the IOMMU to be accessible by 18 - the host CPU. 19 - - clock-names : Should contain the following: 20 - "iface" - Main peripheral bus clock (PCLK/HCL) (required) 21 - "aclk" - AXI bus clock (required) 22 - 23 - Optional properties: 24 - - rockchip,disable-mmu-reset : Don't use the mmu reset operation. 25 - Some mmu instances may produce unexpected results 26 - when the reset operation is used. 27 - 28 - Example: 29 - 30 - vopl_mmu: iommu@ff940300 { 31 - compatible = "rockchip,iommu"; 32 - reg = <0xff940300 0x100>; 33 - interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>; 34 - interrupt-names = "vopl_mmu"; 35 - clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; 36 - clock-names = "aclk", "iface"; 37 - #iommu-cells = <0>; 38 - };
+85
Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + %YAML 1.2 3 + --- 4 + $id: http://devicetree.org/schemas/iommu/rockchip,iommu.yaml# 5 + $schema: http://devicetree.org/meta-schemas/core.yaml# 6 + 7 + title: Rockchip IOMMU 8 + 9 + maintainers: 10 + - Heiko Stuebner <heiko@sntech.de> 11 + 12 + description: |+ 13 + A Rockchip DRM iommu translates io virtual addresses to physical addresses for 14 + its master device. Each slave device is bound to a single master device and 15 + shares its clocks, power domain and irq. 16 + 17 + For information on assigning IOMMU controller to its peripheral devices, 18 + see generic IOMMU bindings. 19 + 20 + properties: 21 + compatible: 22 + enum: 23 + - rockchip,iommu 24 + - rockchip,rk3568-iommu 25 + 26 + reg: 27 + items: 28 + - description: configuration registers for MMU instance 0 29 + - description: configuration registers for MMU instance 1 30 + minItems: 1 31 + maxItems: 2 32 + 33 + interrupts: 34 + items: 35 + - description: interruption for MMU instance 0 36 + - description: interruption for MMU instance 1 37 + minItems: 1 38 + maxItems: 2 39 + 40 + clocks: 41 + items: 42 + - description: Core clock 43 + - description: Interface clock 44 + 45 + clock-names: 46 + items: 47 + - const: aclk 48 + - const: iface 49 + 50 + "#iommu-cells": 51 + const: 0 52 + 53 + power-domains: 54 + maxItems: 1 55 + 56 + rockchip,disable-mmu-reset: 57 + $ref: /schemas/types.yaml#/definitions/flag 58 + description: | 59 + Do not use the mmu reset operation. 60 + Some mmu instances may produce unexpected results 61 + when the reset operation is used. 
62 + 63 + required: 64 + - compatible 65 + - reg 66 + - interrupts 67 + - clocks 68 + - clock-names 69 + - "#iommu-cells" 70 + 71 + additionalProperties: false 72 + 73 + examples: 74 + - | 75 + #include <dt-bindings/clock/rk3399-cru.h> 76 + #include <dt-bindings/interrupt-controller/arm-gic.h> 77 + 78 + vopl_mmu: iommu@ff940300 { 79 + compatible = "rockchip,iommu"; 80 + reg = <0xff940300 0x100>; 81 + interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>; 82 + clocks = <&cru ACLK_VOP1>, <&cru HCLK_VOP1>; 83 + clock-names = "aclk", "iface"; 84 + #iommu-cells = <0>; 85 + };
+8
MAINTAINERS
··· 431 431 B: https://bugzilla.kernel.org 432 432 F: drivers/acpi/acpi_video.c 433 433 434 + ACPI VIOT DRIVER 435 + M: Jean-Philippe Brucker <jean-philippe@linaro.org> 436 + L: linux-acpi@vger.kernel.org 437 + L: iommu@lists.linux-foundation.org 438 + S: Maintained 439 + F: drivers/acpi/viot.c 440 + F: include/linux/acpi_viot.h 441 + 434 442 ACPI WMI DRIVER 435 443 L: platform-driver-x86@vger.kernel.org 436 444 S: Orphan
+1 -1
arch/arm64/boot/dts/qcom/msm8996.dtsi
··· 1136 1136 }; 1137 1137 1138 1138 adreno_smmu: iommu@b40000 { 1139 - compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2"; 1139 + compatible = "qcom,msm8996-smmu-v2", "qcom,adreno-smmu", "qcom,smmu-v2"; 1140 1140 reg = <0x00b40000 0x10000>; 1141 1141 1142 1142 #global-interrupts = <1>;
+1 -1
arch/arm64/mm/dma-mapping.c
··· 50 50 51 51 dev->dma_coherent = coherent; 52 52 if (iommu) 53 - iommu_setup_dma_ops(dev, dma_base, size); 53 + iommu_setup_dma_ops(dev, dma_base, dma_base + size - 1); 54 54 55 55 #ifdef CONFIG_XEN 56 56 if (xen_swiotlb_detect())
+3
drivers/acpi/Kconfig
··· 526 526 527 527 source "drivers/acpi/pmic/Kconfig" 528 528 529 + config ACPI_VIOT 530 + bool 531 + 529 532 endif # ACPI 530 533 531 534 config X86_PM_TIMER
+2
drivers/acpi/Makefile
··· 124 124 obj-y += dptf/ 125 125 126 126 obj-$(CONFIG_ARM64) += arm64/ 127 + 128 + obj-$(CONFIG_ACPI_VIOT) += viot.o
+1
drivers/acpi/arm64/Makefile
··· 1 1 # SPDX-License-Identifier: GPL-2.0-only 2 2 obj-$(CONFIG_ACPI_IORT) += iort.o 3 3 obj-$(CONFIG_ACPI_GTDT) += gtdt.o 4 + obj-y += dma.o
+50
drivers/acpi/arm64/dma.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + #include <linux/acpi.h> 3 + #include <linux/acpi_iort.h> 4 + #include <linux/device.h> 5 + #include <linux/dma-direct.h> 6 + 7 + void acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size) 8 + { 9 + int ret; 10 + u64 end, mask; 11 + u64 dmaaddr = 0, size = 0, offset = 0; 12 + 13 + /* 14 + * If @dev is expected to be DMA-capable then the bus code that created 15 + * it should have initialised its dma_mask pointer by this point. For 16 + * now, we'll continue the legacy behaviour of coercing it to the 17 + * coherent mask if not, but we'll no longer do so quietly. 18 + */ 19 + if (!dev->dma_mask) { 20 + dev_warn(dev, "DMA mask not set\n"); 21 + dev->dma_mask = &dev->coherent_dma_mask; 22 + } 23 + 24 + if (dev->coherent_dma_mask) 25 + size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1); 26 + else 27 + size = 1ULL << 32; 28 + 29 + ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size); 30 + if (ret == -ENODEV) 31 + ret = iort_dma_get_ranges(dev, &size); 32 + if (!ret) { 33 + /* 34 + * Limit coherent and dma mask based on size retrieved from 35 + * firmware. 36 + */ 37 + end = dmaaddr + size - 1; 38 + mask = DMA_BIT_MASK(ilog2(end) + 1); 39 + dev->bus_dma_limit = end; 40 + dev->coherent_dma_mask = min(dev->coherent_dma_mask, mask); 41 + *dev->dma_mask = min(*dev->dma_mask, mask); 42 + } 43 + 44 + *dma_addr = dmaaddr; 45 + *dma_size = size; 46 + 47 + ret = dma_direct_set_offset(dev, dmaaddr + offset, dmaaddr, size); 48 + 49 + dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : ""); 50 + }
+19 -113
drivers/acpi/arm64/iort.c
··· 806 806 return NULL; 807 807 } 808 808 809 - static inline const struct iommu_ops *iort_fwspec_iommu_ops(struct device *dev) 810 - { 811 - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 812 - 813 - return (fwspec && fwspec->ops) ? fwspec->ops : NULL; 814 - } 815 - 816 - static inline int iort_add_device_replay(struct device *dev) 817 - { 818 - int err = 0; 819 - 820 - if (dev->bus && !device_iommu_mapped(dev)) 821 - err = iommu_probe_device(dev); 822 - 823 - return err; 824 - } 825 - 826 809 /** 827 810 * iort_iommu_msi_get_resv_regions - Reserved region driver helper 828 811 * @dev: Device from iommu_get_resv_regions() ··· 883 900 } 884 901 } 885 902 886 - static int arm_smmu_iort_xlate(struct device *dev, u32 streamid, 887 - struct fwnode_handle *fwnode, 888 - const struct iommu_ops *ops) 889 - { 890 - int ret = iommu_fwspec_init(dev, fwnode, ops); 891 - 892 - if (!ret) 893 - ret = iommu_fwspec_add_ids(dev, &streamid, 1); 894 - 895 - return ret; 896 - } 897 - 898 903 static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node) 899 904 { 900 905 struct acpi_iort_root_complex *pci_rc; ··· 917 946 return iort_iommu_driver_enabled(node->type) ? 
918 947 -EPROBE_DEFER : -ENODEV; 919 948 920 - return arm_smmu_iort_xlate(dev, streamid, iort_fwnode, ops); 949 + return acpi_iommu_fwspec_init(dev, streamid, iort_fwnode, ops); 921 950 } 922 951 923 952 struct iort_pci_alias_info { ··· 939 968 static void iort_named_component_init(struct device *dev, 940 969 struct acpi_iort_node *node) 941 970 { 942 - struct property_entry props[2] = {}; 971 + struct property_entry props[3] = {}; 943 972 struct acpi_iort_named_component *nc; 944 973 945 974 nc = (struct acpi_iort_named_component *)node->node_data; 946 975 props[0] = PROPERTY_ENTRY_U32("pasid-num-bits", 947 976 FIELD_GET(ACPI_IORT_NC_PASID_BITS, 948 977 nc->node_flags)); 978 + if (nc->node_flags & ACPI_IORT_NC_STALL_SUPPORTED) 979 + props[1] = PROPERTY_ENTRY_BOOL("dma-can-stall"); 949 980 950 981 if (device_create_managed_software_node(dev, props, NULL)) 951 982 dev_warn(dev, "Could not add device properties\n"); ··· 993 1020 * @dev: device to configure 994 1021 * @id_in: optional input id const value pointer 995 1022 * 996 - * Returns: iommu_ops pointer on configuration success 997 - * NULL on configuration failure 1023 + * Returns: 0 on success, <0 on failure 998 1024 */ 999 - const struct iommu_ops *iort_iommu_configure_id(struct device *dev, 1000 - const u32 *id_in) 1025 + int iort_iommu_configure_id(struct device *dev, const u32 *id_in) 1001 1026 { 1002 1027 struct acpi_iort_node *node; 1003 - const struct iommu_ops *ops; 1004 1028 int err = -ENODEV; 1005 - 1006 - /* 1007 - * If we already translated the fwspec there 1008 - * is nothing left to do, return the iommu_ops. 
1009 - */ 1010 - ops = iort_fwspec_iommu_ops(dev); 1011 - if (ops) 1012 - return ops; 1013 1029 1014 1030 if (dev_is_pci(dev)) { 1015 1031 struct iommu_fwspec *fwspec; ··· 1008 1046 node = iort_scan_node(ACPI_IORT_NODE_PCI_ROOT_COMPLEX, 1009 1047 iort_match_node_callback, &bus->dev); 1010 1048 if (!node) 1011 - return NULL; 1049 + return -ENODEV; 1012 1050 1013 1051 info.node = node; 1014 1052 err = pci_for_each_dma_alias(to_pci_dev(dev), ··· 1021 1059 node = iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT, 1022 1060 iort_match_node_callback, dev); 1023 1061 if (!node) 1024 - return NULL; 1062 + return -ENODEV; 1025 1063 1026 1064 err = id_in ? iort_nc_iommu_map_id(dev, node, id_in) : 1027 1065 iort_nc_iommu_map(dev, node); ··· 1030 1068 iort_named_component_init(dev, node); 1031 1069 } 1032 1070 1033 - /* 1034 - * If we have reason to believe the IOMMU driver missed the initial 1035 - * add_device callback for dev, replay it to get things in order. 1036 - */ 1037 - if (!err) { 1038 - ops = iort_fwspec_iommu_ops(dev); 1039 - err = iort_add_device_replay(dev); 1040 - } 1041 - 1042 - /* Ignore all other errors apart from EPROBE_DEFER */ 1043 - if (err == -EPROBE_DEFER) { 1044 - ops = ERR_PTR(err); 1045 - } else if (err) { 1046 - dev_dbg(dev, "Adding to IOMMU failed: %d\n", err); 1047 - ops = NULL; 1048 - } 1049 - 1050 - return ops; 1071 + return err; 1051 1072 } 1052 1073 1053 1074 #else 1054 1075 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head) 1055 1076 { return 0; } 1056 - const struct iommu_ops *iort_iommu_configure_id(struct device *dev, 1057 - const u32 *input_id) 1058 - { return NULL; } 1077 + int iort_iommu_configure_id(struct device *dev, const u32 *input_id) 1078 + { return -ENODEV; } 1059 1079 #endif 1060 1080 1061 1081 static int nc_dma_get_range(struct device *dev, u64 *size) ··· 1088 1144 } 1089 1145 1090 1146 /** 1091 - * iort_dma_setup() - Set-up device DMA parameters. 
1147 + * iort_dma_get_ranges() - Look up DMA addressing limit for the device 1148 + * @dev: device to lookup 1149 + * @size: DMA range size result pointer 1092 1150 * 1093 - * @dev: device to configure 1094 - * @dma_addr: device DMA address result pointer 1095 - * @dma_size: DMA range size result pointer 1151 + * Return: 0 on success, an error otherwise. 1096 1152 */ 1097 - void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size) 1153 + int iort_dma_get_ranges(struct device *dev, u64 *size) 1098 1154 { 1099 - u64 end, mask, dmaaddr = 0, size = 0, offset = 0; 1100 - int ret; 1101 - 1102 - /* 1103 - * If @dev is expected to be DMA-capable then the bus code that created 1104 - * it should have initialised its dma_mask pointer by this point. For 1105 - * now, we'll continue the legacy behaviour of coercing it to the 1106 - * coherent mask if not, but we'll no longer do so quietly. 1107 - */ 1108 - if (!dev->dma_mask) { 1109 - dev_warn(dev, "DMA mask not set\n"); 1110 - dev->dma_mask = &dev->coherent_dma_mask; 1111 - } 1112 - 1113 - if (dev->coherent_dma_mask) 1114 - size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1); 1155 + if (dev_is_pci(dev)) 1156 + return rc_dma_get_range(dev, size); 1115 1157 else 1116 - size = 1ULL << 32; 1117 - 1118 - ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size); 1119 - if (ret == -ENODEV) 1120 - ret = dev_is_pci(dev) ? rc_dma_get_range(dev, &size) 1121 - : nc_dma_get_range(dev, &size); 1122 - 1123 - if (!ret) { 1124 - /* 1125 - * Limit coherent and dma mask based on size retrieved from 1126 - * firmware. 
1127 - */ 1128 - end = dmaaddr + size - 1; 1129 - mask = DMA_BIT_MASK(ilog2(end) + 1); 1130 - dev->bus_dma_limit = end; 1131 - dev->coherent_dma_mask = min(dev->coherent_dma_mask, mask); 1132 - *dev->dma_mask = min(*dev->dma_mask, mask); 1133 - } 1134 - 1135 - *dma_addr = dmaaddr; 1136 - *dma_size = size; 1137 - 1138 - ret = dma_direct_set_offset(dev, dmaaddr + offset, dmaaddr, size); 1139 - 1140 - dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : ""); 1158 + return nc_dma_get_range(dev, size); 1141 1159 } 1142 1160 1143 1161 static void __init acpi_iort_register_irq(int hwirq, const char *name,
+2
drivers/acpi/bus.c
··· 27 27 #include <linux/dmi.h> 28 28 #endif 29 29 #include <linux/acpi_iort.h> 30 + #include <linux/acpi_viot.h> 30 31 #include <linux/pci.h> 31 32 #include <acpi/apei.h> 32 33 #include <linux/suspend.h> ··· 1336 1335 acpi_wakeup_device_init(); 1337 1336 acpi_debugger_init(); 1338 1337 acpi_setup_sb_notify_handler(); 1338 + acpi_viot_init(); 1339 1339 return 0; 1340 1340 } 1341 1341
+76 -2
drivers/acpi/scan.c
··· 11 11 #include <linux/kernel.h> 12 12 #include <linux/acpi.h> 13 13 #include <linux/acpi_iort.h> 14 + #include <linux/acpi_viot.h> 15 + #include <linux/iommu.h> 14 16 #include <linux/signal.h> 15 17 #include <linux/kthread.h> 16 18 #include <linux/dmi.h> ··· 1528 1526 return ret >= 0 ? 0 : ret; 1529 1527 } 1530 1528 1529 + #ifdef CONFIG_IOMMU_API 1530 + int acpi_iommu_fwspec_init(struct device *dev, u32 id, 1531 + struct fwnode_handle *fwnode, 1532 + const struct iommu_ops *ops) 1533 + { 1534 + int ret = iommu_fwspec_init(dev, fwnode, ops); 1535 + 1536 + if (!ret) 1537 + ret = iommu_fwspec_add_ids(dev, &id, 1); 1538 + 1539 + return ret; 1540 + } 1541 + 1542 + static inline const struct iommu_ops *acpi_iommu_fwspec_ops(struct device *dev) 1543 + { 1544 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 1545 + 1546 + return fwspec ? fwspec->ops : NULL; 1547 + } 1548 + 1549 + static const struct iommu_ops *acpi_iommu_configure_id(struct device *dev, 1550 + const u32 *id_in) 1551 + { 1552 + int err; 1553 + const struct iommu_ops *ops; 1554 + 1555 + /* 1556 + * If we already translated the fwspec there is nothing left to do, 1557 + * return the iommu_ops. 1558 + */ 1559 + ops = acpi_iommu_fwspec_ops(dev); 1560 + if (ops) 1561 + return ops; 1562 + 1563 + err = iort_iommu_configure_id(dev, id_in); 1564 + if (err && err != -EPROBE_DEFER) 1565 + err = viot_iommu_configure(dev); 1566 + 1567 + /* 1568 + * If we have reason to believe the IOMMU driver missed the initial 1569 + * iommu_probe_device() call for dev, replay it to get things in order. 
1570 + */ 1571 + if (!err && dev->bus && !device_iommu_mapped(dev)) 1572 + err = iommu_probe_device(dev); 1573 + 1574 + /* Ignore all other errors apart from EPROBE_DEFER */ 1575 + if (err == -EPROBE_DEFER) { 1576 + return ERR_PTR(err); 1577 + } else if (err) { 1578 + dev_dbg(dev, "Adding to IOMMU failed: %d\n", err); 1579 + return NULL; 1580 + } 1581 + return acpi_iommu_fwspec_ops(dev); 1582 + } 1583 + 1584 + #else /* !CONFIG_IOMMU_API */ 1585 + 1586 + int acpi_iommu_fwspec_init(struct device *dev, u32 id, 1587 + struct fwnode_handle *fwnode, 1588 + const struct iommu_ops *ops) 1589 + { 1590 + return -ENODEV; 1591 + } 1592 + 1593 + static const struct iommu_ops *acpi_iommu_configure_id(struct device *dev, 1594 + const u32 *id_in) 1595 + { 1596 + return NULL; 1597 + } 1598 + 1599 + #endif /* !CONFIG_IOMMU_API */ 1600 + 1531 1601 /** 1532 1602 * acpi_dma_configure_id - Set-up DMA configuration for the device. 1533 1603 * @dev: The pointer to the device ··· 1617 1543 return 0; 1618 1544 } 1619 1545 1620 - iort_dma_setup(dev, &dma_addr, &size); 1546 + acpi_arch_dma_setup(dev, &dma_addr, &size); 1621 1547 1622 - iommu = iort_iommu_configure_id(dev, input_id); 1548 + iommu = acpi_iommu_configure_id(dev, input_id); 1623 1549 if (PTR_ERR(iommu) == -EPROBE_DEFER) 1624 1550 return -EPROBE_DEFER; 1625 1551
+366
drivers/acpi/viot.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Virtual I/O topology 4 + * 5 + * The Virtual I/O Translation Table (VIOT) describes the topology of 6 + * para-virtual IOMMUs and the endpoints they manage. The OS uses it to 7 + * initialize devices in the right order, preventing endpoints from issuing DMA 8 + * before their IOMMU is ready. 9 + * 10 + * When binding a driver to a device, before calling the device driver's probe() 11 + * method, the driver infrastructure calls dma_configure(). At that point the 12 + * VIOT driver looks for an IOMMU associated to the device in the VIOT table. 13 + * If an IOMMU exists and has been initialized, the VIOT driver initializes the 14 + * device's IOMMU fwspec, allowing the DMA infrastructure to invoke the IOMMU 15 + * ops when the device driver configures DMA mappings. If an IOMMU exists and 16 + * hasn't yet been initialized, VIOT returns -EPROBE_DEFER to postpone probing 17 + * the device until the IOMMU is available. 18 + */ 19 + #define pr_fmt(fmt) "ACPI: VIOT: " fmt 20 + 21 + #include <linux/acpi_viot.h> 22 + #include <linux/dma-iommu.h> 23 + #include <linux/fwnode.h> 24 + #include <linux/iommu.h> 25 + #include <linux/list.h> 26 + #include <linux/pci.h> 27 + #include <linux/platform_device.h> 28 + 29 + struct viot_iommu { 30 + /* Node offset within the table */ 31 + unsigned int offset; 32 + struct fwnode_handle *fwnode; 33 + struct list_head list; 34 + }; 35 + 36 + struct viot_endpoint { 37 + union { 38 + /* PCI range */ 39 + struct { 40 + u16 segment_start; 41 + u16 segment_end; 42 + u16 bdf_start; 43 + u16 bdf_end; 44 + }; 45 + /* MMIO */ 46 + u64 address; 47 + }; 48 + u32 endpoint_id; 49 + struct viot_iommu *viommu; 50 + struct list_head list; 51 + }; 52 + 53 + static struct acpi_table_viot *viot; 54 + static LIST_HEAD(viot_iommus); 55 + static LIST_HEAD(viot_pci_ranges); 56 + static LIST_HEAD(viot_mmio_endpoints); 57 + 58 + static int __init viot_check_bounds(const struct acpi_viot_header *hdr) 59 + { 60 + 
struct acpi_viot_header *start, *end, *hdr_end; 61 + 62 + start = ACPI_ADD_PTR(struct acpi_viot_header, viot, 63 + max_t(size_t, sizeof(*viot), viot->node_offset)); 64 + end = ACPI_ADD_PTR(struct acpi_viot_header, viot, viot->header.length); 65 + hdr_end = ACPI_ADD_PTR(struct acpi_viot_header, hdr, sizeof(*hdr)); 66 + 67 + if (hdr < start || hdr_end > end) { 68 + pr_err(FW_BUG "Node pointer overflows\n"); 69 + return -EOVERFLOW; 70 + } 71 + if (hdr->length < sizeof(*hdr)) { 72 + pr_err(FW_BUG "Empty node\n"); 73 + return -EINVAL; 74 + } 75 + return 0; 76 + } 77 + 78 + static int __init viot_get_pci_iommu_fwnode(struct viot_iommu *viommu, 79 + u16 segment, u16 bdf) 80 + { 81 + struct pci_dev *pdev; 82 + struct fwnode_handle *fwnode; 83 + 84 + pdev = pci_get_domain_bus_and_slot(segment, PCI_BUS_NUM(bdf), 85 + bdf & 0xff); 86 + if (!pdev) { 87 + pr_err("Could not find PCI IOMMU\n"); 88 + return -ENODEV; 89 + } 90 + 91 + fwnode = pdev->dev.fwnode; 92 + if (!fwnode) { 93 + /* 94 + * PCI devices aren't necessarily described by ACPI. Create a 95 + * fwnode so the IOMMU subsystem can identify this device. 
96 + */ 97 + fwnode = acpi_alloc_fwnode_static(); 98 + if (!fwnode) { 99 + pci_dev_put(pdev); 100 + return -ENOMEM; 101 + } 102 + set_primary_fwnode(&pdev->dev, fwnode); 103 + } 104 + viommu->fwnode = pdev->dev.fwnode; 105 + pci_dev_put(pdev); 106 + return 0; 107 + } 108 + 109 + static int __init viot_get_mmio_iommu_fwnode(struct viot_iommu *viommu, 110 + u64 address) 111 + { 112 + struct acpi_device *adev; 113 + struct resource res = { 114 + .start = address, 115 + .end = address, 116 + .flags = IORESOURCE_MEM, 117 + }; 118 + 119 + adev = acpi_resource_consumer(&res); 120 + if (!adev) { 121 + pr_err("Could not find MMIO IOMMU\n"); 122 + return -EINVAL; 123 + } 124 + viommu->fwnode = &adev->fwnode; 125 + return 0; 126 + } 127 + 128 + static struct viot_iommu * __init viot_get_iommu(unsigned int offset) 129 + { 130 + int ret; 131 + struct viot_iommu *viommu; 132 + struct acpi_viot_header *hdr = ACPI_ADD_PTR(struct acpi_viot_header, 133 + viot, offset); 134 + union { 135 + struct acpi_viot_virtio_iommu_pci pci; 136 + struct acpi_viot_virtio_iommu_mmio mmio; 137 + } *node = (void *)hdr; 138 + 139 + list_for_each_entry(viommu, &viot_iommus, list) 140 + if (viommu->offset == offset) 141 + return viommu; 142 + 143 + if (viot_check_bounds(hdr)) 144 + return NULL; 145 + 146 + viommu = kzalloc(sizeof(*viommu), GFP_KERNEL); 147 + if (!viommu) 148 + return NULL; 149 + 150 + viommu->offset = offset; 151 + switch (hdr->type) { 152 + case ACPI_VIOT_NODE_VIRTIO_IOMMU_PCI: 153 + if (hdr->length < sizeof(node->pci)) 154 + goto err_free; 155 + 156 + ret = viot_get_pci_iommu_fwnode(viommu, node->pci.segment, 157 + node->pci.bdf); 158 + break; 159 + case ACPI_VIOT_NODE_VIRTIO_IOMMU_MMIO: 160 + if (hdr->length < sizeof(node->mmio)) 161 + goto err_free; 162 + 163 + ret = viot_get_mmio_iommu_fwnode(viommu, 164 + node->mmio.base_address); 165 + break; 166 + default: 167 + ret = -EINVAL; 168 + } 169 + if (ret) 170 + goto err_free; 171 + 172 + list_add(&viommu->list, &viot_iommus); 173 + 
return viommu; 174 + 175 + err_free: 176 + kfree(viommu); 177 + return NULL; 178 + } 179 + 180 + static int __init viot_parse_node(const struct acpi_viot_header *hdr) 181 + { 182 + int ret = -EINVAL; 183 + struct list_head *list; 184 + struct viot_endpoint *ep; 185 + union { 186 + struct acpi_viot_mmio mmio; 187 + struct acpi_viot_pci_range pci; 188 + } *node = (void *)hdr; 189 + 190 + if (viot_check_bounds(hdr)) 191 + return -EINVAL; 192 + 193 + if (hdr->type == ACPI_VIOT_NODE_VIRTIO_IOMMU_PCI || 194 + hdr->type == ACPI_VIOT_NODE_VIRTIO_IOMMU_MMIO) 195 + return 0; 196 + 197 + ep = kzalloc(sizeof(*ep), GFP_KERNEL); 198 + if (!ep) 199 + return -ENOMEM; 200 + 201 + switch (hdr->type) { 202 + case ACPI_VIOT_NODE_PCI_RANGE: 203 + if (hdr->length < sizeof(node->pci)) { 204 + pr_err(FW_BUG "Invalid PCI node size\n"); 205 + goto err_free; 206 + } 207 + 208 + ep->segment_start = node->pci.segment_start; 209 + ep->segment_end = node->pci.segment_end; 210 + ep->bdf_start = node->pci.bdf_start; 211 + ep->bdf_end = node->pci.bdf_end; 212 + ep->endpoint_id = node->pci.endpoint_start; 213 + ep->viommu = viot_get_iommu(node->pci.output_node); 214 + list = &viot_pci_ranges; 215 + break; 216 + case ACPI_VIOT_NODE_MMIO: 217 + if (hdr->length < sizeof(node->mmio)) { 218 + pr_err(FW_BUG "Invalid MMIO node size\n"); 219 + goto err_free; 220 + } 221 + 222 + ep->address = node->mmio.base_address; 223 + ep->endpoint_id = node->mmio.endpoint; 224 + ep->viommu = viot_get_iommu(node->mmio.output_node); 225 + list = &viot_mmio_endpoints; 226 + break; 227 + default: 228 + pr_warn("Unsupported node %x\n", hdr->type); 229 + ret = 0; 230 + goto err_free; 231 + } 232 + 233 + if (!ep->viommu) { 234 + pr_warn("No IOMMU node found\n"); 235 + /* 236 + * A future version of the table may use the node for other 237 + * purposes. Keep parsing. 
238 + */ 239 + ret = 0; 240 + goto err_free; 241 + } 242 + 243 + list_add(&ep->list, list); 244 + return 0; 245 + 246 + err_free: 247 + kfree(ep); 248 + return ret; 249 + } 250 + 251 + /** 252 + * acpi_viot_init - Parse the VIOT table 253 + * 254 + * Parse the VIOT table, prepare the list of endpoints to be used during DMA 255 + * setup of devices. 256 + */ 257 + void __init acpi_viot_init(void) 258 + { 259 + int i; 260 + acpi_status status; 261 + struct acpi_table_header *hdr; 262 + struct acpi_viot_header *node; 263 + 264 + status = acpi_get_table(ACPI_SIG_VIOT, 0, &hdr); 265 + if (ACPI_FAILURE(status)) { 266 + if (status != AE_NOT_FOUND) { 267 + const char *msg = acpi_format_exception(status); 268 + 269 + pr_err("Failed to get table, %s\n", msg); 270 + } 271 + return; 272 + } 273 + 274 + viot = (void *)hdr; 275 + 276 + node = ACPI_ADD_PTR(struct acpi_viot_header, viot, viot->node_offset); 277 + for (i = 0; i < viot->node_count; i++) { 278 + if (viot_parse_node(node)) 279 + return; 280 + 281 + node = ACPI_ADD_PTR(struct acpi_viot_header, node, 282 + node->length); 283 + } 284 + 285 + acpi_put_table(hdr); 286 + } 287 + 288 + static int viot_dev_iommu_init(struct device *dev, struct viot_iommu *viommu, 289 + u32 epid) 290 + { 291 + const struct iommu_ops *ops; 292 + 293 + if (!viommu) 294 + return -ENODEV; 295 + 296 + /* We're not translating ourself */ 297 + if (viommu->fwnode == dev->fwnode) 298 + return -EINVAL; 299 + 300 + ops = iommu_ops_from_fwnode(viommu->fwnode); 301 + if (!ops) 302 + return IS_ENABLED(CONFIG_VIRTIO_IOMMU) ? 
303 + -EPROBE_DEFER : -ENODEV; 304 + 305 + return acpi_iommu_fwspec_init(dev, epid, viommu->fwnode, ops); 306 + } 307 + 308 + static int viot_pci_dev_iommu_init(struct pci_dev *pdev, u16 dev_id, void *data) 309 + { 310 + u32 epid; 311 + struct viot_endpoint *ep; 312 + u32 domain_nr = pci_domain_nr(pdev->bus); 313 + 314 + list_for_each_entry(ep, &viot_pci_ranges, list) { 315 + if (domain_nr >= ep->segment_start && 316 + domain_nr <= ep->segment_end && 317 + dev_id >= ep->bdf_start && 318 + dev_id <= ep->bdf_end) { 319 + epid = ((domain_nr - ep->segment_start) << 16) + 320 + dev_id - ep->bdf_start + ep->endpoint_id; 321 + 322 + /* 323 + * If we found a PCI range managed by the viommu, we're 324 + * the one that has to request ACS. 325 + */ 326 + pci_request_acs(); 327 + 328 + return viot_dev_iommu_init(&pdev->dev, ep->viommu, 329 + epid); 330 + } 331 + } 332 + return -ENODEV; 333 + } 334 + 335 + static int viot_mmio_dev_iommu_init(struct platform_device *pdev) 336 + { 337 + struct resource *mem; 338 + struct viot_endpoint *ep; 339 + 340 + mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); 341 + if (!mem) 342 + return -ENODEV; 343 + 344 + list_for_each_entry(ep, &viot_mmio_endpoints, list) { 345 + if (ep->address == mem->start) 346 + return viot_dev_iommu_init(&pdev->dev, ep->viommu, 347 + ep->endpoint_id); 348 + } 349 + return -ENODEV; 350 + } 351 + 352 + /** 353 + * viot_iommu_configure - Setup IOMMU ops for an endpoint described by VIOT 354 + * @dev: the endpoint 355 + * 356 + * Return: 0 on success, <0 on failure 357 + */ 358 + int viot_iommu_configure(struct device *dev) 359 + { 360 + if (dev_is_pci(dev)) 361 + return pci_for_each_dma_alias(to_pci_dev(dev), 362 + viot_pci_dev_iommu_init, NULL); 363 + else if (dev_is_platform(dev)) 364 + return viot_mmio_dev_iommu_init(to_platform_device(dev)); 365 + return -ENODEV; 366 + }
+3 -1
drivers/iommu/Kconfig
···
 config VIRTIO_IOMMU
 	tristate "Virtio IOMMU driver"
 	depends on VIRTIO
-	depends on ARM64
+	depends on (ARM64 || X86)
 	select IOMMU_API
+	select IOMMU_DMA
 	select INTERVAL_TREE
+	select ACPI_VIOT if ACPI
 	help
 	  Para-virtualised IOMMU driver with virtio.
-2
drivers/iommu/amd/amd_iommu.h
···
 
 #include "amd_iommu_types.h"
 
-extern int amd_iommu_init_dma_ops(void);
-extern int amd_iommu_init_passthrough(void);
 extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
 extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
 extern void amd_iommu_apply_erratum_63(u16 devid);
+11 -9
drivers/iommu/amd/init.c
··· 153 153 static int amd_iommu_xt_mode = IRQ_REMAP_XAPIC_MODE; 154 154 155 155 static bool amd_iommu_detected; 156 - static bool __initdata amd_iommu_disabled; 156 + static bool amd_iommu_disabled __initdata; 157 + static bool amd_iommu_force_enable __initdata; 157 158 static int amd_iommu_target_ivhd_type; 158 159 159 160 u16 amd_iommu_last_bdf; /* largest PCI device id we have ··· 232 231 IOMMU_ENABLED, 233 232 IOMMU_PCI_INIT, 234 233 IOMMU_INTERRUPTS_EN, 235 - IOMMU_DMA_OPS, 236 234 IOMMU_INITIALIZED, 237 235 IOMMU_NOT_FOUND, 238 236 IOMMU_INIT_ERROR, ··· 1908 1908 pci_info(pdev, "Found IOMMU cap 0x%x\n", iommu->cap_ptr); 1909 1909 1910 1910 if (iommu->cap & (1 << IOMMU_CAP_EFR)) { 1911 - pci_info(pdev, "Extended features (%#llx):", 1912 - iommu->features); 1911 + pr_info("Extended features (%#llx):", iommu->features); 1912 + 1913 1913 for (i = 0; i < ARRAY_SIZE(feat_str); ++i) { 1914 1914 if (iommu_feature(iommu, (1ULL << i))) 1915 1915 pr_cont(" %s", feat_str[i]); ··· 2817 2817 return ret; 2818 2818 } 2819 2819 2820 - static bool detect_ivrs(void) 2820 + static bool __init detect_ivrs(void) 2821 2821 { 2822 2822 struct acpi_table_header *ivrs_base; 2823 2823 acpi_status status; ··· 2834 2834 2835 2835 acpi_put_table(ivrs_base); 2836 2836 2837 + if (amd_iommu_force_enable) 2838 + goto out; 2839 + 2837 2840 /* Don't use IOMMU if there is Stoney Ridge graphics */ 2838 2841 for (i = 0; i < 32; i++) { 2839 2842 u32 pci_id; ··· 2848 2845 } 2849 2846 } 2850 2847 2848 + out: 2851 2849 /* Make sure ACS will be enabled during PCI probe */ 2852 2850 pci_request_acs(); 2853 2851 ··· 2899 2895 init_state = ret ? IOMMU_INIT_ERROR : IOMMU_INTERRUPTS_EN; 2900 2896 break; 2901 2897 case IOMMU_INTERRUPTS_EN: 2902 - ret = amd_iommu_init_dma_ops(); 2903 - init_state = ret ? 
IOMMU_INIT_ERROR : IOMMU_DMA_OPS; 2904 - break; 2905 - case IOMMU_DMA_OPS: 2906 2898 init_state = IOMMU_INITIALIZED; 2907 2899 break; 2908 2900 case IOMMU_INITIALIZED: ··· 3100 3100 for (; *str; ++str) { 3101 3101 if (strncmp(str, "fullflush", 9) == 0) 3102 3102 amd_iommu_unmap_flush = true; 3103 + if (strncmp(str, "force_enable", 12) == 0) 3104 + amd_iommu_force_enable = true; 3103 3105 if (strncmp(str, "off", 3) == 0) 3104 3106 amd_iommu_disabled = true; 3105 3107 if (strncmp(str, "force_isolation", 15) == 0)
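The new `amd_iommu=force_enable` option is recognised by the same loop as the existing keywords: the parser walks the option string and tests every known keyword with `strncmp()` at each offset. A standalone sketch of that matching scheme (the struct and function names here are hypothetical, not the kernel's):

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative flags corresponding to the amd_iommu= keywords above. */
struct amd_opts {
	bool unmap_flush;
	bool force_enable;
	bool disabled;
	bool force_isolation;
};

/*
 * Walk the option string the way the kernel parser above does: test each
 * known keyword with strncmp() at every position, so options may appear
 * in any order and with any separator.
 */
static void parse_amd_iommu_opts(const char *str, struct amd_opts *o)
{
	for (; *str; ++str) {
		if (strncmp(str, "fullflush", 9) == 0)
			o->unmap_flush = true;
		if (strncmp(str, "force_enable", 12) == 0)
			o->force_enable = true;
		if (strncmp(str, "off", 3) == 0)
			o->disabled = true;
		if (strncmp(str, "force_isolation", 15) == 0)
			o->force_isolation = true;
	}
}
```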
+14 -19
drivers/iommu/amd/iommu.c
··· 30 30 #include <linux/msi.h> 31 31 #include <linux/irqdomain.h> 32 32 #include <linux/percpu.h> 33 - #include <linux/iova.h> 34 33 #include <linux/io-pgtable.h> 35 34 #include <asm/irq_remapping.h> 36 35 #include <asm/io_apic.h> ··· 1712 1713 /* Domains are initialized for this device - have a look what we ended up with */ 1713 1714 domain = iommu_get_domain_for_dev(dev); 1714 1715 if (domain->type == IOMMU_DOMAIN_DMA) 1715 - iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0); 1716 + iommu_setup_dma_ops(dev, 0, U64_MAX); 1716 1717 else 1717 1718 set_dma_ops(dev, NULL); 1718 1719 } ··· 1772 1773 amd_iommu_domain_flush_complete(domain); 1773 1774 } 1774 1775 1776 + static void __init amd_iommu_init_dma_ops(void) 1777 + { 1778 + swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0; 1779 + 1780 + if (amd_iommu_unmap_flush) 1781 + pr_info("IO/TLB flush on unmap enabled\n"); 1782 + else 1783 + pr_info("Lazy IO/TLB flushing enabled\n"); 1784 + iommu_set_dma_strict(amd_iommu_unmap_flush); 1785 + } 1786 + 1775 1787 int __init amd_iommu_init_api(void) 1776 1788 { 1777 - int ret, err = 0; 1789 + int err; 1778 1790 1779 - ret = iova_cache_get(); 1780 - if (ret) 1781 - return ret; 1791 + amd_iommu_init_dma_ops(); 1782 1792 1783 1793 err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops); 1784 1794 if (err) ··· 1802 1794 return err; 1803 1795 1804 1796 return 0; 1805 - } 1806 - 1807 - int __init amd_iommu_init_dma_ops(void) 1808 - { 1809 - swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0; 1810 - 1811 - if (amd_iommu_unmap_flush) 1812 - pr_info("IO/TLB flush on unmap enabled\n"); 1813 - else 1814 - pr_info("Lazy IO/TLB flushing enabled\n"); 1815 - iommu_set_dma_strict(amd_iommu_unmap_flush); 1816 - return 0; 1817 - 1818 1797 } 1819 1798 1820 1799 /*****************************************************************************
+53 -6
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
··· 435 435 return true; 436 436 } 437 437 438 - static bool arm_smmu_iopf_supported(struct arm_smmu_master *master) 438 + bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master) 439 439 { 440 - return false; 440 + /* We're not keeping track of SIDs in fault events */ 441 + if (master->num_streams != 1) 442 + return false; 443 + 444 + return master->stall_enabled; 441 445 } 442 446 443 447 bool arm_smmu_master_sva_supported(struct arm_smmu_master *master) ··· 449 445 if (!(master->smmu->features & ARM_SMMU_FEAT_SVA)) 450 446 return false; 451 447 452 - /* SSID and IOPF support are mandatory for the moment */ 453 - return master->ssid_bits && arm_smmu_iopf_supported(master); 448 + /* SSID support is mandatory for the moment */ 449 + return master->ssid_bits; 454 450 } 455 451 456 452 bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master) ··· 463 459 return enabled; 464 460 } 465 461 462 + static int arm_smmu_master_sva_enable_iopf(struct arm_smmu_master *master) 463 + { 464 + int ret; 465 + struct device *dev = master->dev; 466 + 467 + /* 468 + * Drivers for devices supporting PRI or stall should enable IOPF first. 469 + * Others have device-specific fault handlers and don't need IOPF. 
470 + */ 471 + if (!arm_smmu_master_iopf_supported(master)) 472 + return 0; 473 + 474 + if (!master->iopf_enabled) 475 + return -EINVAL; 476 + 477 + ret = iopf_queue_add_device(master->smmu->evtq.iopf, dev); 478 + if (ret) 479 + return ret; 480 + 481 + ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev); 482 + if (ret) { 483 + iopf_queue_remove_device(master->smmu->evtq.iopf, dev); 484 + return ret; 485 + } 486 + return 0; 487 + } 488 + 489 + static void arm_smmu_master_sva_disable_iopf(struct arm_smmu_master *master) 490 + { 491 + struct device *dev = master->dev; 492 + 493 + if (!master->iopf_enabled) 494 + return; 495 + 496 + iommu_unregister_device_fault_handler(dev); 497 + iopf_queue_remove_device(master->smmu->evtq.iopf, dev); 498 + } 499 + 466 500 int arm_smmu_master_enable_sva(struct arm_smmu_master *master) 467 501 { 502 + int ret; 503 + 468 504 mutex_lock(&sva_lock); 469 - master->sva_enabled = true; 505 + ret = arm_smmu_master_sva_enable_iopf(master); 506 + if (!ret) 507 + master->sva_enabled = true; 470 508 mutex_unlock(&sva_lock); 471 509 472 - return 0; 510 + return ret; 473 511 } 474 512 475 513 int arm_smmu_master_disable_sva(struct arm_smmu_master *master) ··· 522 476 mutex_unlock(&sva_lock); 523 477 return -EBUSY; 524 478 } 479 + arm_smmu_master_sva_disable_iopf(master); 525 480 master->sva_enabled = false; 526 481 mutex_unlock(&sva_lock); 527 482
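The SVA/IOPF split above enforces an ordering contract: a driver must enable `IOMMU_DEV_FEAT_IOPF` before `IOMMU_DEV_FEAT_SVA` when the device uses stall or PRI, and cannot disable IOPF while SVA still depends on it. A toy model of that state machine (all names are illustrative; the real code talks to the iopf queue and fault-handler APIs):

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-in for struct arm_smmu_master's feature state. */
struct fake_master {
	bool iopf_supported;
	bool iopf_enabled;
	bool sva_enabled;
	bool in_queue;     /* stands in for iopf_queue_add_device() */
	bool handler_set;  /* stands in for the registered fault handler */
};

/* SVA requires IOPF to have been enabled first when the device needs it. */
static int enable_sva(struct fake_master *m)
{
	if (m->iopf_supported) {
		if (!m->iopf_enabled)
			return -EINVAL;	/* IOPF must come first */
		m->in_queue = true;
		m->handler_set = true;
	}
	m->sva_enabled = true;
	return 0;
}

/* IOPF cannot be torn down underneath an active SVA user. */
static int disable_iopf(struct fake_master *m)
{
	if (m->sva_enabled)
		return -EBUSY;
	m->iopf_enabled = false;
	return 0;
}
```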
+204 -19
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
··· 23 23 #include <linux/msi.h> 24 24 #include <linux/of.h> 25 25 #include <linux/of_address.h> 26 - #include <linux/of_iommu.h> 27 26 #include <linux/of_platform.h> 28 27 #include <linux/pci.h> 29 28 #include <linux/pci-ats.h> ··· 31 32 #include <linux/amba/bus.h> 32 33 33 34 #include "arm-smmu-v3.h" 35 + #include "../../iommu-sva-lib.h" 34 36 35 37 static bool disable_bypass = true; 36 38 module_param(disable_bypass, bool, 0444); ··· 313 313 } 314 314 cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp); 315 315 break; 316 + case CMDQ_OP_RESUME: 317 + cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_SID, ent->resume.sid); 318 + cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_RESP, ent->resume.resp); 319 + cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag); 320 + break; 316 321 case CMDQ_OP_CMD_SYNC: 317 322 if (ent->sync.msiaddr) { 318 323 cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ); ··· 357 352 358 353 static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu) 359 354 { 360 - static const char *cerror_str[] = { 355 + static const char * const cerror_str[] = { 361 356 [CMDQ_ERR_CERROR_NONE_IDX] = "No error", 362 357 [CMDQ_ERR_CERROR_ILL_IDX] = "Illegal command", 363 358 [CMDQ_ERR_CERROR_ABT_IDX] = "Abort on command fetch", ··· 881 876 return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true); 882 877 } 883 878 879 + static int arm_smmu_page_response(struct device *dev, 880 + struct iommu_fault_event *unused, 881 + struct iommu_page_response *resp) 882 + { 883 + struct arm_smmu_cmdq_ent cmd = {0}; 884 + struct arm_smmu_master *master = dev_iommu_priv_get(dev); 885 + int sid = master->streams[0].id; 886 + 887 + if (master->stall_enabled) { 888 + cmd.opcode = CMDQ_OP_RESUME; 889 + cmd.resume.sid = sid; 890 + cmd.resume.stag = resp->grpid; 891 + switch (resp->code) { 892 + case IOMMU_PAGE_RESP_INVALID: 893 + case IOMMU_PAGE_RESP_FAILURE: 894 + cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT; 895 + break; 896 + case IOMMU_PAGE_RESP_SUCCESS: 897 + 
cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY; 898 + break; 899 + default: 900 + return -EINVAL; 901 + } 902 + } else { 903 + return -ENODEV; 904 + } 905 + 906 + arm_smmu_cmdq_issue_cmd(master->smmu, &cmd); 907 + /* 908 + * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP. 909 + * RESUME consumption guarantees that the stalled transaction will be 910 + * terminated... at some point in the future. PRI_RESP is fire and 911 + * forget. 912 + */ 913 + 914 + return 0; 915 + } 916 + 884 917 /* Context descriptor manipulation functions */ 885 918 void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) 886 919 { ··· 1029 986 u64 val; 1030 987 bool cd_live; 1031 988 __le64 *cdptr; 1032 - struct arm_smmu_device *smmu = smmu_domain->smmu; 1033 989 1034 990 if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax))) 1035 991 return -E2BIG; ··· 1073 1031 FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | 1074 1032 CTXDESC_CD_0_V; 1075 1033 1076 - /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ 1077 - if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE) 1034 + if (smmu_domain->stall_enabled) 1078 1035 val |= CTXDESC_CD_0_S; 1079 1036 } 1080 1037 ··· 1317 1276 FIELD_PREP(STRTAB_STE_1_STRW, strw)); 1318 1277 1319 1278 if (smmu->features & ARM_SMMU_FEAT_STALLS && 1320 - !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) 1279 + !master->stall_enabled) 1321 1280 dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); 1322 1281 1323 1282 val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | ··· 1394 1353 return 0; 1395 1354 } 1396 1355 1397 - __maybe_unused 1398 1356 static struct arm_smmu_master * 1399 1357 arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) 1400 1358 { ··· 1417 1377 } 1418 1378 1419 1379 /* IRQ and event handlers */ 1380 + static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) 1381 + { 1382 + int ret; 1383 + u32 reason; 1384 + u32 perm = 0; 1385 + struct arm_smmu_master *master; 1386 + bool ssid_valid = evt[0] & EVTQ_0_SSV; 1387 + u32 sid = 
FIELD_GET(EVTQ_0_SID, evt[0]); 1388 + struct iommu_fault_event fault_evt = { }; 1389 + struct iommu_fault *flt = &fault_evt.fault; 1390 + 1391 + switch (FIELD_GET(EVTQ_0_ID, evt[0])) { 1392 + case EVT_ID_TRANSLATION_FAULT: 1393 + reason = IOMMU_FAULT_REASON_PTE_FETCH; 1394 + break; 1395 + case EVT_ID_ADDR_SIZE_FAULT: 1396 + reason = IOMMU_FAULT_REASON_OOR_ADDRESS; 1397 + break; 1398 + case EVT_ID_ACCESS_FAULT: 1399 + reason = IOMMU_FAULT_REASON_ACCESS; 1400 + break; 1401 + case EVT_ID_PERMISSION_FAULT: 1402 + reason = IOMMU_FAULT_REASON_PERMISSION; 1403 + break; 1404 + default: 1405 + return -EOPNOTSUPP; 1406 + } 1407 + 1408 + /* Stage-2 is always pinned at the moment */ 1409 + if (evt[1] & EVTQ_1_S2) 1410 + return -EFAULT; 1411 + 1412 + if (evt[1] & EVTQ_1_RnW) 1413 + perm |= IOMMU_FAULT_PERM_READ; 1414 + else 1415 + perm |= IOMMU_FAULT_PERM_WRITE; 1416 + 1417 + if (evt[1] & EVTQ_1_InD) 1418 + perm |= IOMMU_FAULT_PERM_EXEC; 1419 + 1420 + if (evt[1] & EVTQ_1_PnU) 1421 + perm |= IOMMU_FAULT_PERM_PRIV; 1422 + 1423 + if (evt[1] & EVTQ_1_STALL) { 1424 + flt->type = IOMMU_FAULT_PAGE_REQ; 1425 + flt->prm = (struct iommu_fault_page_request) { 1426 + .flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE, 1427 + .grpid = FIELD_GET(EVTQ_1_STAG, evt[1]), 1428 + .perm = perm, 1429 + .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), 1430 + }; 1431 + 1432 + if (ssid_valid) { 1433 + flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; 1434 + flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); 1435 + } 1436 + } else { 1437 + flt->type = IOMMU_FAULT_DMA_UNRECOV; 1438 + flt->event = (struct iommu_fault_unrecoverable) { 1439 + .reason = reason, 1440 + .flags = IOMMU_FAULT_UNRECOV_ADDR_VALID, 1441 + .perm = perm, 1442 + .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), 1443 + }; 1444 + 1445 + if (ssid_valid) { 1446 + flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID; 1447 + flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); 1448 + } 1449 + } 1450 + 1451 + mutex_lock(&smmu->streams_mutex); 1452 + master = 
arm_smmu_find_master(smmu, sid); 1453 + if (!master) { 1454 + ret = -EINVAL; 1455 + goto out_unlock; 1456 + } 1457 + 1458 + ret = iommu_report_device_fault(master->dev, &fault_evt); 1459 + if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) { 1460 + /* Nobody cared, abort the access */ 1461 + struct iommu_page_response resp = { 1462 + .pasid = flt->prm.pasid, 1463 + .grpid = flt->prm.grpid, 1464 + .code = IOMMU_PAGE_RESP_FAILURE, 1465 + }; 1466 + arm_smmu_page_response(master->dev, &fault_evt, &resp); 1467 + } 1468 + 1469 + out_unlock: 1470 + mutex_unlock(&smmu->streams_mutex); 1471 + return ret; 1472 + } 1473 + 1420 1474 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) 1421 1475 { 1422 - int i; 1476 + int i, ret; 1423 1477 struct arm_smmu_device *smmu = dev; 1424 1478 struct arm_smmu_queue *q = &smmu->evtq.q; 1425 1479 struct arm_smmu_ll_queue *llq = &q->llq; 1480 + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, 1481 + DEFAULT_RATELIMIT_BURST); 1426 1482 u64 evt[EVTQ_ENT_DWORDS]; 1427 1483 1428 1484 do { 1429 1485 while (!queue_remove_raw(q, evt)) { 1430 1486 u8 id = FIELD_GET(EVTQ_0_ID, evt[0]); 1487 + 1488 + ret = arm_smmu_handle_evt(smmu, evt); 1489 + if (!ret || !__ratelimit(&rs)) 1490 + continue; 1431 1491 1432 1492 dev_info(smmu->dev, "event 0x%02x received:\n", id); 1433 1493 for (i = 0; i < ARRAY_SIZE(evt); ++i) ··· 2063 1923 2064 1924 cfg->s1cdmax = master->ssid_bits; 2065 1925 1926 + smmu_domain->stall_enabled = master->stall_enabled; 1927 + 2066 1928 ret = arm_smmu_alloc_cd_tables(smmu_domain); 2067 1929 if (ret) 2068 1930 goto out_free_asid; ··· 2412 2270 smmu_domain->s1_cfg.s1cdmax, master->ssid_bits); 2413 2271 ret = -EINVAL; 2414 2272 goto out_unlock; 2273 + } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && 2274 + smmu_domain->stall_enabled != master->stall_enabled) { 2275 + dev_err(dev, "cannot attach to stall-%s domain\n", 2276 + smmu_domain->stall_enabled ? 
"enabled" : "disabled"); 2277 + ret = -EINVAL; 2278 + goto out_unlock; 2415 2279 } 2416 2280 2417 2281 master->domain = smmu_domain; ··· 2656 2508 master->ssid_bits = min_t(u8, master->ssid_bits, 2657 2509 CTXDESC_LINEAR_CDMAX); 2658 2510 2511 + if ((smmu->features & ARM_SMMU_FEAT_STALLS && 2512 + device_property_read_bool(dev, "dma-can-stall")) || 2513 + smmu->features & ARM_SMMU_FEAT_STALL_FORCE) 2514 + master->stall_enabled = true; 2515 + 2659 2516 return &smmu->iommu; 2660 2517 2661 2518 err_free_master: ··· 2678 2525 return; 2679 2526 2680 2527 master = dev_iommu_priv_get(dev); 2681 - WARN_ON(arm_smmu_master_sva_enabled(master)); 2528 + if (WARN_ON(arm_smmu_master_sva_enabled(master))) 2529 + iopf_queue_remove_device(master->smmu->evtq.iopf, dev); 2682 2530 arm_smmu_detach_dev(master); 2683 2531 arm_smmu_disable_pasid(master); 2684 2532 arm_smmu_remove_master(master); ··· 2749 2595 return false; 2750 2596 2751 2597 switch (feat) { 2598 + case IOMMU_DEV_FEAT_IOPF: 2599 + return arm_smmu_master_iopf_supported(master); 2752 2600 case IOMMU_DEV_FEAT_SVA: 2753 2601 return arm_smmu_master_sva_supported(master); 2754 2602 default: ··· 2767 2611 return false; 2768 2612 2769 2613 switch (feat) { 2614 + case IOMMU_DEV_FEAT_IOPF: 2615 + return master->iopf_enabled; 2770 2616 case IOMMU_DEV_FEAT_SVA: 2771 2617 return arm_smmu_master_sva_enabled(master); 2772 2618 default: ··· 2779 2621 static int arm_smmu_dev_enable_feature(struct device *dev, 2780 2622 enum iommu_dev_features feat) 2781 2623 { 2624 + struct arm_smmu_master *master = dev_iommu_priv_get(dev); 2625 + 2782 2626 if (!arm_smmu_dev_has_feature(dev, feat)) 2783 2627 return -ENODEV; 2784 2628 ··· 2788 2628 return -EBUSY; 2789 2629 2790 2630 switch (feat) { 2631 + case IOMMU_DEV_FEAT_IOPF: 2632 + master->iopf_enabled = true; 2633 + return 0; 2791 2634 case IOMMU_DEV_FEAT_SVA: 2792 - return arm_smmu_master_enable_sva(dev_iommu_priv_get(dev)); 2635 + return arm_smmu_master_enable_sva(master); 2793 2636 default: 2794 
2637 return -EINVAL; 2795 2638 } ··· 2801 2638 static int arm_smmu_dev_disable_feature(struct device *dev, 2802 2639 enum iommu_dev_features feat) 2803 2640 { 2641 + struct arm_smmu_master *master = dev_iommu_priv_get(dev); 2642 + 2804 2643 if (!arm_smmu_dev_feature_enabled(dev, feat)) 2805 2644 return -EINVAL; 2806 2645 2807 2646 switch (feat) { 2647 + case IOMMU_DEV_FEAT_IOPF: 2648 + if (master->sva_enabled) 2649 + return -EBUSY; 2650 + master->iopf_enabled = false; 2651 + return 0; 2808 2652 case IOMMU_DEV_FEAT_SVA: 2809 - return arm_smmu_master_disable_sva(dev_iommu_priv_get(dev)); 2653 + return arm_smmu_master_disable_sva(master); 2810 2654 default: 2811 2655 return -EINVAL; 2812 2656 } ··· 2843 2673 .sva_bind = arm_smmu_sva_bind, 2844 2674 .sva_unbind = arm_smmu_sva_unbind, 2845 2675 .sva_get_pasid = arm_smmu_sva_get_pasid, 2676 + .page_response = arm_smmu_page_response, 2846 2677 .pgsize_bitmap = -1UL, /* Restricted during device attach */ 2847 2678 .owner = THIS_MODULE, 2848 2679 }; ··· 2942 2771 if (ret) 2943 2772 return ret; 2944 2773 2774 + if ((smmu->features & ARM_SMMU_FEAT_SVA) && 2775 + (smmu->features & ARM_SMMU_FEAT_STALLS)) { 2776 + smmu->evtq.iopf = iopf_queue_alloc(dev_name(smmu->dev)); 2777 + if (!smmu->evtq.iopf) 2778 + return -ENOMEM; 2779 + } 2780 + 2945 2781 /* priq */ 2946 2782 if (!(smmu->features & ARM_SMMU_FEAT_PRI)) 2947 2783 return 0; ··· 2966 2788 void *strtab = smmu->strtab_cfg.strtab; 2967 2789 2968 2790 cfg->l1_desc = devm_kzalloc(smmu->dev, size, GFP_KERNEL); 2969 - if (!cfg->l1_desc) { 2970 - dev_err(smmu->dev, "failed to allocate l1 stream table desc\n"); 2791 + if (!cfg->l1_desc) 2971 2792 return -ENOMEM; 2972 - } 2973 2793 2974 2794 for (i = 0; i < cfg->num_l1_ents; ++i) { 2975 2795 arm_smmu_write_strtab_l1_desc(strtab, &cfg->l1_desc[i]); ··· 3758 3582 bool bypass; 3759 3583 3760 3584 smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL); 3761 - if (!smmu) { 3762 - dev_err(dev, "failed to allocate arm_smmu_device\n"); 3585 + if 
(!smmu) 3763 3586 return -ENOMEM; 3764 - } 3765 3587 smmu->dev = dev; 3766 3588 3767 3589 if (dev->of_node) { ··· 3843 3669 ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev); 3844 3670 if (ret) { 3845 3671 dev_err(dev, "Failed to register iommu\n"); 3846 - return ret; 3672 + goto err_sysfs_remove; 3847 3673 } 3848 3674 3849 - return arm_smmu_set_bus_ops(&arm_smmu_ops); 3675 + ret = arm_smmu_set_bus_ops(&arm_smmu_ops); 3676 + if (ret) 3677 + goto err_unregister_device; 3678 + 3679 + return 0; 3680 + 3681 + err_unregister_device: 3682 + iommu_device_unregister(&smmu->iommu); 3683 + err_sysfs_remove: 3684 + iommu_device_sysfs_remove(&smmu->iommu); 3685 + return ret; 3850 3686 } 3851 3687 3852 3688 static int arm_smmu_device_remove(struct platform_device *pdev) ··· 3867 3683 iommu_device_unregister(&smmu->iommu); 3868 3684 iommu_device_sysfs_remove(&smmu->iommu); 3869 3685 arm_smmu_device_disable(smmu); 3686 + iopf_queue_free(smmu->evtq.iopf); 3870 3687 3871 3688 return 0; 3872 3689 }
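`arm_smmu_handle_evt()` above decodes event queue entries entirely with `GENMASK_ULL()` masks and `FIELD_GET()`. A plain-C approximation of those two kernel macros, applied to part of the EVTQ dword-0 layout (a simplified sketch; the kernel versions live in linux/bits.h and linux/bitfield.h):

```c
#include <stdint.h>

/* Plain-C stand-ins for the kernel's GENMASK_ULL() and FIELD_GET():
 * a contiguous bit mask, and extraction by masking then dividing by
 * the mask's lowest set bit (equivalent to a right shift). */
#define GENMASK_ULL(h, l)  ((~0ULL << (l)) & (~0ULL >> (63 - (h))))
#define FIELD_GET(mask, v) (((v) & (mask)) / ((mask) & -(mask)))

/* A subset of the EVTQ dword-0 layout from arm-smmu-v3.h. */
#define EVTQ_0_ID   GENMASK_ULL(7, 0)
#define EVTQ_0_SSV  (1ULL << 11)
#define EVTQ_0_SSID GENMASK_ULL(31, 12)
#define EVTQ_0_SID  GENMASK_ULL(63, 32)

static uint8_t evt_id(const uint64_t *evt)
{
	return (uint8_t)FIELD_GET(EVTQ_0_ID, evt[0]);
}

static uint32_t evt_sid(const uint64_t *evt)
{
	return (uint32_t)FIELD_GET(EVTQ_0_SID, evt[0]);
}

/* SSID is only meaningful when the SSV bit says it is valid. */
static uint32_t evt_ssid(const uint64_t *evt)
{
	return (evt[0] & EVTQ_0_SSV) ?
	       (uint32_t)FIELD_GET(EVTQ_0_SSID, evt[0]) : 0;
}
```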
+46 -2
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
··· 184 184 #else 185 185 #define Q_MAX_SZ_SHIFT (PAGE_SHIFT + MAX_ORDER - 1) 186 186 #endif 187 + #define Q_MIN_SZ_SHIFT (PAGE_SHIFT) 187 188 188 189 /* 189 190 * Stream table. ··· 355 354 #define CMDQ_PRI_1_GRPID GENMASK_ULL(8, 0) 356 355 #define CMDQ_PRI_1_RESP GENMASK_ULL(13, 12) 357 356 357 + #define CMDQ_RESUME_0_RESP_TERM 0UL 358 + #define CMDQ_RESUME_0_RESP_RETRY 1UL 359 + #define CMDQ_RESUME_0_RESP_ABORT 2UL 360 + #define CMDQ_RESUME_0_RESP GENMASK_ULL(13, 12) 361 + #define CMDQ_RESUME_0_SID GENMASK_ULL(63, 32) 362 + #define CMDQ_RESUME_1_STAG GENMASK_ULL(15, 0) 363 + 358 364 #define CMDQ_SYNC_0_CS GENMASK_ULL(13, 12) 359 365 #define CMDQ_SYNC_0_CS_NONE 0 360 366 #define CMDQ_SYNC_0_CS_IRQ 1 ··· 374 366 /* Event queue */ 375 367 #define EVTQ_ENT_SZ_SHIFT 5 376 368 #define EVTQ_ENT_DWORDS ((1 << EVTQ_ENT_SZ_SHIFT) >> 3) 377 - #define EVTQ_MAX_SZ_SHIFT (Q_MAX_SZ_SHIFT - EVTQ_ENT_SZ_SHIFT) 369 + #define EVTQ_MAX_SZ_SHIFT (Q_MIN_SZ_SHIFT - EVTQ_ENT_SZ_SHIFT) 378 370 379 371 #define EVTQ_0_ID GENMASK_ULL(7, 0) 372 + 373 + #define EVT_ID_TRANSLATION_FAULT 0x10 374 + #define EVT_ID_ADDR_SIZE_FAULT 0x11 375 + #define EVT_ID_ACCESS_FAULT 0x12 376 + #define EVT_ID_PERMISSION_FAULT 0x13 377 + 378 + #define EVTQ_0_SSV (1UL << 11) 379 + #define EVTQ_0_SSID GENMASK_ULL(31, 12) 380 + #define EVTQ_0_SID GENMASK_ULL(63, 32) 381 + #define EVTQ_1_STAG GENMASK_ULL(15, 0) 382 + #define EVTQ_1_STALL (1UL << 31) 383 + #define EVTQ_1_PnU (1UL << 33) 384 + #define EVTQ_1_InD (1UL << 34) 385 + #define EVTQ_1_RnW (1UL << 35) 386 + #define EVTQ_1_S2 (1UL << 39) 387 + #define EVTQ_1_CLASS GENMASK_ULL(41, 40) 388 + #define EVTQ_1_TT_READ (1UL << 44) 389 + #define EVTQ_2_ADDR GENMASK_ULL(63, 0) 390 + #define EVTQ_3_IPA GENMASK_ULL(51, 12) 380 391 381 392 /* PRI queue */ 382 393 #define PRIQ_ENT_SZ_SHIFT 4 383 394 #define PRIQ_ENT_DWORDS ((1 << PRIQ_ENT_SZ_SHIFT) >> 3) 384 - #define PRIQ_MAX_SZ_SHIFT (Q_MAX_SZ_SHIFT - PRIQ_ENT_SZ_SHIFT) 395 + #define PRIQ_MAX_SZ_SHIFT (Q_MIN_SZ_SHIFT - 
PRIQ_ENT_SZ_SHIFT) 385 396 386 397 #define PRIQ_0_SID GENMASK_ULL(31, 0) 387 398 #define PRIQ_0_SSID GENMASK_ULL(51, 32) ··· 489 462 enum pri_resp resp; 490 463 } pri; 491 464 465 + #define CMDQ_OP_RESUME 0x44 466 + struct { 467 + u32 sid; 468 + u16 stag; 469 + u8 resp; 470 + } resume; 471 + 492 472 #define CMDQ_OP_CMD_SYNC 0x46 493 473 struct { 494 474 u64 msiaddr; ··· 554 520 555 521 struct arm_smmu_evtq { 556 522 struct arm_smmu_queue q; 523 + struct iopf_queue *iopf; 557 524 u32 max_stalls; 558 525 }; 559 526 ··· 692 657 struct arm_smmu_stream *streams; 693 658 unsigned int num_streams; 694 659 bool ats_enabled; 660 + bool stall_enabled; 695 661 bool sva_enabled; 662 + bool iopf_enabled; 696 663 struct list_head bonds; 697 664 unsigned int ssid_bits; 698 665 }; ··· 712 675 struct mutex init_mutex; /* Protects smmu pointer */ 713 676 714 677 struct io_pgtable_ops *pgtbl_ops; 678 + bool stall_enabled; 715 679 atomic_t nr_ats_masters; 716 680 717 681 enum arm_smmu_domain_stage stage; ··· 754 716 bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master); 755 717 int arm_smmu_master_enable_sva(struct arm_smmu_master *master); 756 718 int arm_smmu_master_disable_sva(struct arm_smmu_master *master); 719 + bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master); 757 720 struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, 758 721 void *drvdata); 759 722 void arm_smmu_sva_unbind(struct iommu_sva *handle); ··· 784 745 static inline int arm_smmu_master_disable_sva(struct arm_smmu_master *master) 785 746 { 786 747 return -ENODEV; 748 + } 749 + 750 + static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master) 751 + { 752 + return false; 787 753 } 788 754 789 755 static inline struct iommu_sva *
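The `Q_MAX_SZ_SHIFT` to `Q_MIN_SZ_SHIFT` switch above is what shrinks the default event and PRI queues to a single page each. With 4 KiB pages (an assumption here; `PAGE_SHIFT` varies by configuration), the arithmetic gives 128 event entries and 256 PRI entries:

```c
/* Assumed 4 KiB pages; entry-size shifts copied from arm-smmu-v3.h. */
enum {
	PAGE_SHIFT        = 12,
	Q_MIN_SZ_SHIFT    = PAGE_SHIFT,
	EVTQ_ENT_SZ_SHIFT = 5,  /* 32-byte event entries */
	PRIQ_ENT_SZ_SHIFT = 4,  /* 16-byte PRI entries */
};

/* Default entry count: one page divided by the entry size. */
static int evtq_entries(void)
{
	return 1 << (Q_MIN_SZ_SHIFT - EVTQ_ENT_SZ_SHIFT);
}

static int priq_entries(void)
{
	return 1 << (Q_MIN_SZ_SHIFT - PRIQ_ENT_SZ_SHIFT);
}
```

Either way the queue occupies exactly one page, which is the point of the change: smaller defaults waste less contiguous memory on systems that rarely see faults.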
+40 -3
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
··· 3 3 * Copyright (c) 2019, The Linux Foundation. All rights reserved. 4 4 */ 5 5 6 + #include <linux/acpi.h> 6 7 #include <linux/adreno-smmu-priv.h> 7 8 #include <linux/of_device.h> 8 9 #include <linux/qcom_scm.h> ··· 178 177 return __arm_smmu_alloc_bitmap(smmu->context_map, start, count); 179 178 } 180 179 180 + static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) 181 + { 182 + const struct device_node *np = smmu->dev->of_node; 183 + 184 + if (of_device_is_compatible(np, "qcom,msm8996-smmu-v2")) 185 + return false; 186 + 187 + return true; 188 + } 189 + 181 190 static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, 182 191 struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) 183 192 { ··· 202 191 * be AARCH64 stage 1 but double check because the arm-smmu code assumes 203 192 * that is the case when the TTBR1 quirk is enabled 204 193 */ 205 - if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1) && 194 + if (qcom_adreno_can_do_ttbr1(smmu_domain->smmu) && 195 + (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) && 206 196 (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64)) 207 197 pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1; 208 198 ··· 228 216 { .compatible = "qcom,mdss" }, 229 217 { .compatible = "qcom,sc7180-mdss" }, 230 218 { .compatible = "qcom,sc7180-mss-pil" }, 219 + { .compatible = "qcom,sc7280-mdss" }, 231 220 { .compatible = "qcom,sc8180x-mdss" }, 232 221 { .compatible = "qcom,sdm845-mdss" }, 233 222 { .compatible = "qcom,sdm845-mss-pil" }, ··· 393 380 static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = { 394 381 { .compatible = "qcom,msm8998-smmu-v2" }, 395 382 { .compatible = "qcom,sc7180-smmu-500" }, 383 + { .compatible = "qcom,sc7280-smmu-500" }, 396 384 { .compatible = "qcom,sc8180x-smmu-500" }, 397 385 { .compatible = "qcom,sdm630-smmu-v2" }, 398 386 { .compatible = "qcom,sdm845-smmu-500" }, 387 + { .compatible = "qcom,sm6125-smmu-500" }, 399 388 { .compatible = "qcom,sm8150-smmu-500" }, 400 389 { 
.compatible = "qcom,sm8250-smmu-500" }, 401 390 { .compatible = "qcom,sm8350-smmu-500" }, 402 391 { } 403 392 }; 404 393 394 + #ifdef CONFIG_ACPI 395 + static struct acpi_platform_list qcom_acpi_platlist[] = { 396 + { "LENOVO", "CB-01 ", 0x8180, ACPI_SIG_IORT, equal, "QCOM SMMU" }, 397 + { "QCOM ", "QCOMEDK2", 0x8180, ACPI_SIG_IORT, equal, "QCOM SMMU" }, 398 + { } 399 + }; 400 + #endif 401 + 405 402 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu) 406 403 { 407 404 const struct device_node *np = smmu->dev->of_node; 408 405 409 - if (of_match_node(qcom_smmu_impl_of_match, np)) 410 - return qcom_smmu_create(smmu, &qcom_smmu_impl); 406 + #ifdef CONFIG_ACPI 407 + if (np == NULL) { 408 + /* Match platform for ACPI boot */ 409 + if (acpi_match_platform_list(qcom_acpi_platlist) >= 0) 410 + return qcom_smmu_create(smmu, &qcom_smmu_impl); 411 + } 412 + #endif 411 413 414 + /* 415 + * Do not change this order of implementation, i.e., first adreno 416 + * smmu impl and then apss smmu since we can have both implementing 417 + * arm,mmu-500 in which case we will miss setting adreno smmu specific 418 + * features if the order is changed. 419 + */ 412 420 if (of_device_is_compatible(np, "qcom,adreno-smmu")) 413 421 return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl); 422 + 423 + if (of_match_node(qcom_smmu_impl_of_match, np)) 424 + return qcom_smmu_create(smmu, &qcom_smmu_impl); 414 425 415 426 return smmu; 416 427 }
+32 -7
drivers/iommu/arm/arm-smmu/arm-smmu.c
··· 31 31 #include <linux/of.h> 32 32 #include <linux/of_address.h> 33 33 #include <linux/of_device.h> 34 - #include <linux/of_iommu.h> 35 34 #include <linux/pci.h> 36 35 #include <linux/platform_device.h> 37 36 #include <linux/pm_runtime.h> ··· 73 74 static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu) 74 75 { 75 76 if (pm_runtime_enabled(smmu->dev)) 76 - return pm_runtime_get_sync(smmu->dev); 77 + return pm_runtime_resume_and_get(smmu->dev); 77 78 78 79 return 0; 79 80 } ··· 1275 1276 u64 phys; 1276 1277 unsigned long va, flags; 1277 1278 int ret, idx = cfg->cbndx; 1279 + phys_addr_t addr = 0; 1278 1280 1279 1281 ret = arm_smmu_rpm_get(smmu); 1280 1282 if (ret < 0) ··· 1295 1295 dev_err(dev, 1296 1296 "iova to phys timed out on %pad. Falling back to software table walk.\n", 1297 1297 &iova); 1298 + arm_smmu_rpm_put(smmu); 1298 1299 return ops->iova_to_phys(ops, iova); 1299 1300 } 1300 1301 ··· 1304 1303 if (phys & ARM_SMMU_CB_PAR_F) { 1305 1304 dev_err(dev, "translation fault!\n"); 1306 1305 dev_err(dev, "PAR = 0x%llx\n", phys); 1307 - return 0; 1306 + goto out; 1308 1307 } 1309 1308 1309 + addr = (phys & GENMASK_ULL(39, 12)) | (iova & 0xfff); 1310 + out: 1310 1311 arm_smmu_rpm_put(smmu); 1311 1312 1312 - return (phys & GENMASK_ULL(39, 12)) | (iova & 0xfff); 1313 + return addr; 1313 1314 } 1314 1315 1315 1316 static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, ··· 1458 1455 iommu_fwspec_free(dev); 1459 1456 } 1460 1457 1458 + static void arm_smmu_probe_finalize(struct device *dev) 1459 + { 1460 + struct arm_smmu_master_cfg *cfg; 1461 + struct arm_smmu_device *smmu; 1462 + 1463 + cfg = dev_iommu_priv_get(dev); 1464 + smmu = cfg->smmu; 1465 + 1466 + if (smmu->impl && smmu->impl->probe_finalize) 1467 + smmu->impl->probe_finalize(smmu, dev); 1468 + } 1469 + 1461 1470 static struct iommu_group *arm_smmu_device_group(struct device *dev) 1462 1471 { 1463 1472 struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev); ··· 1589 1574 
.iova_to_phys = arm_smmu_iova_to_phys, 1590 1575 .probe_device = arm_smmu_probe_device, 1591 1576 .release_device = arm_smmu_release_device, 1577 + .probe_finalize = arm_smmu_probe_finalize, 1592 1578 .device_group = arm_smmu_device_group, 1593 1579 .enable_nesting = arm_smmu_enable_nesting, 1594 1580 .set_pgtable_quirks = arm_smmu_set_pgtable_quirks, ··· 2185 2169 err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev); 2186 2170 if (err) { 2187 2171 dev_err(dev, "Failed to register iommu\n"); 2188 - return err; 2172 + goto err_sysfs_remove; 2189 2173 } 2190 2174 2191 2175 platform_set_drvdata(pdev, smmu); ··· 2208 2192 * any device which might need it, so we want the bus ops in place 2209 2193 * ready to handle default domain setup as soon as any SMMU exists. 2210 2194 */ 2211 - if (!using_legacy_binding) 2212 - return arm_smmu_bus_init(&arm_smmu_ops); 2195 + if (!using_legacy_binding) { 2196 + err = arm_smmu_bus_init(&arm_smmu_ops); 2197 + if (err) 2198 + goto err_unregister_device; 2199 + } 2213 2200 2214 2201 return 0; 2202 + 2203 + err_unregister_device: 2204 + iommu_device_unregister(&smmu->iommu); 2205 + err_sysfs_remove: 2206 + iommu_device_sysfs_remove(&smmu->iommu); 2207 + return err; 2215 2208 } 2216 2209 2217 2210 static int arm_smmu_device_remove(struct platform_device *pdev)
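The two refcount fixes above address the same bug class: an early `return` that skips `arm_smmu_rpm_put()` after a successful `arm_smmu_rpm_get()`. A minimal sketch of the balanced pattern the fix adopts (a hypothetical counter stands in for the runtime-PM usage count):

```c
#include <stdint.h>

static int refcount;	/* stands in for the device's runtime-PM count */

static int rpm_get(void)  { refcount++; return 0; }
static void rpm_put(void) { refcount--; }

/*
 * Every exit path taken after a successful get must reach the put:
 * route failure cases through a single label instead of returning
 * early, as the arm_smmu_iova_to_phys_hard() fix above does.
 */
static long iova_to_phys(uint64_t iova, int fault)
{
	long addr = 0;

	if (rpm_get() < 0)
		return 0;

	if (fault)
		goto out;	/* an early return here would leak the count */

	addr = (long)(iova & 0xfff);
out:
	rpm_put();
	return addr;
}
```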
+1
drivers/iommu/arm/arm-smmu/arm-smmu.h
···
 			      struct device *dev, int start);
 	void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
 	void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 reg);
+	void (*probe_finalize)(struct arm_smmu_device *smmu, struct device *dev);
 };
 
 #define INVALID_SMENDX -1
+11 -3
drivers/iommu/arm/arm-smmu/qcom_iommu.c
··· 25 25 #include <linux/of.h> 26 26 #include <linux/of_address.h> 27 27 #include <linux/of_device.h> 28 - #include <linux/of_iommu.h> 29 28 #include <linux/platform_device.h> 30 29 #include <linux/pm.h> 31 30 #include <linux/pm_runtime.h> ··· 849 850 ret = iommu_device_register(&qcom_iommu->iommu, &qcom_iommu_ops, dev); 850 851 if (ret) { 851 852 dev_err(dev, "Failed to register iommu\n"); 852 - return ret; 853 + goto err_sysfs_remove; 853 854 } 854 855 855 - bus_set_iommu(&platform_bus_type, &qcom_iommu_ops); 856 + ret = bus_set_iommu(&platform_bus_type, &qcom_iommu_ops); 857 + if (ret) 858 + goto err_unregister_device; 856 859 857 860 if (qcom_iommu->local_base) { 858 861 pm_runtime_get_sync(dev); ··· 863 862 } 864 863 865 864 return 0; 865 + 866 + err_unregister_device: 867 + iommu_device_unregister(&qcom_iommu->iommu); 868 + 869 + err_sysfs_remove: 870 + iommu_device_sysfs_remove(&qcom_iommu->iommu); 871 + return ret; 866 872 } 867 873 868 874 static int qcom_iommu_device_remove(struct platform_device *pdev)
+11 -8
drivers/iommu/dma-iommu.c
··· 243 243 lo = iova_pfn(iovad, start); 244 244 hi = iova_pfn(iovad, end); 245 245 reserve_iova(iovad, lo, hi); 246 - } else { 246 + } else if (end < start) { 247 247 /* dma_ranges list should be sorted */ 248 - dev_err(&dev->dev, "Failed to reserve IOVA\n"); 248 + dev_err(&dev->dev, 249 + "Failed to reserve IOVA [%pa-%pa]\n", 250 + &start, &end); 249 251 return -EINVAL; 250 252 } 251 253 ··· 321 319 * iommu_dma_init_domain - Initialise a DMA mapping domain 322 320 * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie() 323 321 * @base: IOVA at which the mappable address space starts 324 - * @size: Size of IOVA space 322 + * @limit: Last address of the IOVA space 325 323 * @dev: Device the domain is being initialised for 326 324 * 327 - * @base and @size should be exact multiples of IOMMU page granularity to 325 + * @base and @limit + 1 should be exact multiples of IOMMU page granularity to 328 326 * avoid rounding surprises. If necessary, we reserve the page at address 0 329 327 * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but 330 328 * any change which could make prior IOVAs invalid will fail. 331 329 */ 332 330 static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, 333 - u64 size, struct device *dev) 331 + dma_addr_t limit, struct device *dev) 334 332 { 335 333 struct iommu_dma_cookie *cookie = domain->iova_cookie; 336 334 unsigned long order, base_pfn; ··· 348 346 /* Check the domain allows at least some access to the device... */ 349 347 if (domain->geometry.force_aperture) { 350 348 if (base > domain->geometry.aperture_end || 351 - base + size <= domain->geometry.aperture_start) { 349 + limit < domain->geometry.aperture_start) { 352 350 pr_warn("specified DMA range outside IOMMU capability\n"); 353 351 return -EFAULT; 354 352 } ··· 1310 1308 * The IOMMU core code allocates the default DMA domain, which the underlying 1311 1309 * IOMMU driver needs to support via the dma-iommu layer. 
1312 1310 */ 1313 - void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size) 1311 + void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit) 1314 1312 { 1315 1313 struct iommu_domain *domain = iommu_get_domain_for_dev(dev); 1316 1314 ··· 1322 1320 * underlying IOMMU driver needs to support via the dma-iommu layer. 1323 1321 */ 1324 1322 if (domain->type == IOMMU_DOMAIN_DMA) { 1325 - if (iommu_dma_init_domain(domain, dma_base, size, dev)) 1323 + if (iommu_dma_init_domain(domain, dma_base, dma_limit, dev)) 1326 1324 goto out_err; 1327 1325 dev->dma_ops = &iommu_dma_ops; 1328 1326 } ··· 1332 1330 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n", 1333 1331 dev_name(dev)); 1334 1332 } 1333 + EXPORT_SYMBOL_GPL(iommu_setup_dma_ops); 1335 1334 1336 1335 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, 1337 1336 phys_addr_t msi_addr, struct iommu_domain *domain)
-1
drivers/iommu/exynos-iommu.c
··· 17 17 #include <linux/kmemleak.h> 18 18 #include <linux/list.h> 19 19 #include <linux/of.h> 20 - #include <linux/of_iommu.h> 21 20 #include <linux/of_platform.h> 22 21 #include <linux/platform_device.h> 23 22 #include <linux/pm_runtime.h>
+6
drivers/iommu/intel/Kconfig
··· 3 3 config DMAR_TABLE 4 4 bool 5 5 6 + config DMAR_PERF 7 + bool 8 + 6 9 config INTEL_IOMMU 7 10 bool "Support for Intel IOMMU using DMA Remapping Devices" 8 11 depends on PCI_MSI && ACPI && (X86 || IA64) ··· 17 14 select SWIOTLB 18 15 select IOASID 19 16 select IOMMU_DMA 17 + select PCI_ATS 20 18 help 21 19 DMA remapping (DMAR) devices support enables independent address 22 20 translations for Direct Memory Access (DMA) from devices. ··· 28 24 config INTEL_IOMMU_DEBUGFS 29 25 bool "Export Intel IOMMU internals in Debugfs" 30 26 depends on INTEL_IOMMU && IOMMU_DEBUGFS 27 + select DMAR_PERF 31 28 help 32 29 !!!WARNING!!! 33 30 ··· 46 41 select PCI_PRI 47 42 select MMU_NOTIFIER 48 43 select IOASID 44 + select IOMMU_SVA_LIB 49 45 help 50 46 Shared Virtual Memory (SVM) provides a facility for devices 51 47 to access DMA resources through process address space by
+1
drivers/iommu/intel/Makefile
··· 2 2 obj-$(CONFIG_DMAR_TABLE) += dmar.o 3 3 obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o 4 4 obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o 5 + obj-$(CONFIG_DMAR_PERF) += perf.o 5 6 obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o 6 7 obj-$(CONFIG_INTEL_IOMMU_SVM) += svm.o 7 8 obj-$(CONFIG_IRQ_REMAP) += irq_remapping.o
+111
drivers/iommu/intel/debugfs.c
··· 16 16 #include <asm/irq_remapping.h> 17 17 18 18 #include "pasid.h" 19 + #include "perf.h" 19 20 20 21 struct tbl_walk { 21 22 u16 bus; ··· 31 30 int offset; 32 31 const char *regs; 33 32 }; 33 + 34 + #define DEBUG_BUFFER_SIZE 1024 35 + static char debug_buf[DEBUG_BUFFER_SIZE]; 34 36 35 37 #define IOMMU_REGSET_ENTRY(_reg_) \ 36 38 { DMAR_##_reg_##_REG, __stringify(_reg_) } ··· 542 538 DEFINE_SHOW_ATTRIBUTE(ir_translation_struct); 543 539 #endif 544 540 541 + static void latency_show_one(struct seq_file *m, struct intel_iommu *iommu, 542 + struct dmar_drhd_unit *drhd) 543 + { 544 + int ret; 545 + 546 + seq_printf(m, "IOMMU: %s Register Base Address: %llx\n", 547 + iommu->name, drhd->reg_base_addr); 548 + 549 + ret = dmar_latency_snapshot(iommu, debug_buf, DEBUG_BUFFER_SIZE); 550 + if (ret < 0) 551 + seq_puts(m, "Failed to get latency snapshot"); 552 + else 553 + seq_puts(m, debug_buf); 554 + seq_puts(m, "\n"); 555 + } 556 + 557 + static int latency_show(struct seq_file *m, void *v) 558 + { 559 + struct dmar_drhd_unit *drhd; 560 + struct intel_iommu *iommu; 561 + 562 + rcu_read_lock(); 563 + for_each_active_iommu(iommu, drhd) 564 + latency_show_one(m, iommu, drhd); 565 + rcu_read_unlock(); 566 + 567 + return 0; 568 + } 569 + 570 + static int dmar_perf_latency_open(struct inode *inode, struct file *filp) 571 + { 572 + return single_open(filp, latency_show, NULL); 573 + } 574 + 575 + static ssize_t dmar_perf_latency_write(struct file *filp, 576 + const char __user *ubuf, 577 + size_t cnt, loff_t *ppos) 578 + { 579 + struct dmar_drhd_unit *drhd; 580 + struct intel_iommu *iommu; 581 + int counting; 582 + char buf[64]; 583 + 584 + if (cnt > 63) 585 + cnt = 63; 586 + 587 + if (copy_from_user(&buf, ubuf, cnt)) 588 + return -EFAULT; 589 + 590 + buf[cnt] = 0; 591 + 592 + if (kstrtoint(buf, 0, &counting)) 593 + return -EINVAL; 594 + 595 + switch (counting) { 596 + case 0: 597 + rcu_read_lock(); 598 + for_each_active_iommu(iommu, drhd) { 599 + dmar_latency_disable(iommu, DMAR_LATENCY_INV_IOTLB); 600 + dmar_latency_disable(iommu, DMAR_LATENCY_INV_DEVTLB); 601 + dmar_latency_disable(iommu, DMAR_LATENCY_INV_IEC); 602 + dmar_latency_disable(iommu, DMAR_LATENCY_PRQ); 603 + } 604 + rcu_read_unlock(); 605 + break; 606 + case 1: 607 + rcu_read_lock(); 608 + for_each_active_iommu(iommu, drhd) 609 + dmar_latency_enable(iommu, DMAR_LATENCY_INV_IOTLB); 610 + rcu_read_unlock(); 611 + break; 612 + case 2: 613 + rcu_read_lock(); 614 + for_each_active_iommu(iommu, drhd) 615 + dmar_latency_enable(iommu, DMAR_LATENCY_INV_DEVTLB); 616 + rcu_read_unlock(); 617 + break; 618 + case 3: 619 + rcu_read_lock(); 620 + for_each_active_iommu(iommu, drhd) 621 + dmar_latency_enable(iommu, DMAR_LATENCY_INV_IEC); 622 + rcu_read_unlock(); 623 + break; 624 + case 4: 625 + rcu_read_lock(); 626 + for_each_active_iommu(iommu, drhd) 627 + dmar_latency_enable(iommu, DMAR_LATENCY_PRQ); 628 + rcu_read_unlock(); 629 + break; 630 + default: 631 + return -EINVAL; 632 + } 633 + 634 + *ppos += cnt; 635 + return cnt; 636 + } 637 + 638 + static const struct file_operations dmar_perf_latency_fops = { 639 + .open = dmar_perf_latency_open, 640 + .write = dmar_perf_latency_write, 641 + .read = seq_read, 642 + .llseek = seq_lseek, 643 + .release = single_release, 644 + }; 645 + 545 646 void __init intel_iommu_debugfs_init(void) 546 647 { 547 648 struct dentry *intel_iommu_debug = debugfs_create_dir("intel", ··· 665 556 debugfs_create_file("ir_translation_struct", 0444, intel_iommu_debug, 666 557 NULL, &ir_translation_struct_fops); 667 558 #endif 559 + debugfs_create_file("dmar_perf_latency", 0644, intel_iommu_debug, 560 + NULL, &dmar_perf_latency_fops); 668 561 }
+46 -8
drivers/iommu/intel/dmar.c
··· 34 34 #include <trace/events/intel_iommu.h> 35 35 36 36 #include "../irq_remapping.h" 37 + #include "perf.h" 37 38 38 39 typedef int (*dmar_res_handler_t)(struct acpi_dmar_header *, void *); 39 40 struct dmar_res_callback { ··· 1343 1342 unsigned int count, unsigned long options) 1344 1343 { 1345 1344 struct q_inval *qi = iommu->qi; 1345 + s64 devtlb_start_ktime = 0; 1346 + s64 iotlb_start_ktime = 0; 1347 + s64 iec_start_ktime = 0; 1346 1348 struct qi_desc wait_desc; 1347 1349 int wait_index, index; 1348 1350 unsigned long flags; 1349 1351 int offset, shift; 1350 1352 int rc, i; 1353 + u64 type; 1351 1354 1352 1355 if (!qi) 1353 1356 return 0; 1357 + 1358 + type = desc->qw0 & GENMASK_ULL(3, 0); 1359 + 1360 + if ((type == QI_IOTLB_TYPE || type == QI_EIOTLB_TYPE) && 1361 + dmar_latency_enabled(iommu, DMAR_LATENCY_INV_IOTLB)) 1362 + iotlb_start_ktime = ktime_to_ns(ktime_get()); 1363 + 1364 + if ((type == QI_DIOTLB_TYPE || type == QI_DEIOTLB_TYPE) && 1365 + dmar_latency_enabled(iommu, DMAR_LATENCY_INV_DEVTLB)) 1366 + devtlb_start_ktime = ktime_to_ns(ktime_get()); 1367 + 1368 + if (type == QI_IEC_TYPE && 1369 + dmar_latency_enabled(iommu, DMAR_LATENCY_INV_IEC)) 1370 + iec_start_ktime = ktime_to_ns(ktime_get()); 1354 1371 1355 1372 restart: 1356 1373 rc = 0; ··· 1443 1424 1444 1425 if (rc == -EAGAIN) 1445 1426 goto restart; 1427 + 1428 + if (iotlb_start_ktime) 1429 + dmar_latency_update(iommu, DMAR_LATENCY_INV_IOTLB, 1430 + ktime_to_ns(ktime_get()) - iotlb_start_ktime); 1431 + 1432 + if (devtlb_start_ktime) 1433 + dmar_latency_update(iommu, DMAR_LATENCY_INV_DEVTLB, 1434 + ktime_to_ns(ktime_get()) - devtlb_start_ktime); 1435 + 1436 + if (iec_start_ktime) 1437 + dmar_latency_update(iommu, DMAR_LATENCY_INV_IEC, 1438 + ktime_to_ns(ktime_get()) - iec_start_ktime); 1446 1439 1447 1440 return rc; 1448 1441 } ··· 1944 1913 reason = dmar_get_fault_reason(fault_reason, &fault_type); 1945 1914 1946 1915 if (fault_type == INTR_REMAP) 1947 - pr_err("[INTR-REMAP] Request device [%02x:%02x.%d] fault index %llx [fault reason %02d] %s\n", 1948 - source_id >> 8, PCI_SLOT(source_id & 0xFF), 1949 - PCI_FUNC(source_id & 0xFF), addr >> 48, 1950 - fault_reason, reason); 1951 - else 1952 - pr_err("[%s] Request device [%02x:%02x.%d] PASID %x fault addr %llx [fault reason %02d] %s\n", 1916 + pr_err("[INTR-REMAP] Request device [0x%02x:0x%02x.%d] fault index 0x%llx [fault reason 0x%02x] %s\n", 1917 + source_id >> 8, PCI_SLOT(source_id & 0xFF), 1918 + PCI_FUNC(source_id & 0xFF), addr >> 48, 1919 + fault_reason, reason); 1920 + else if (pasid == INVALID_IOASID) 1921 + pr_err("[%s NO_PASID] Request device [0x%02x:0x%02x.%d] fault addr 0x%llx [fault reason 0x%02x] %s\n", 1953 1922 type ? "DMA Read" : "DMA Write", 1954 1923 source_id >> 8, PCI_SLOT(source_id & 0xFF), 1955 - PCI_FUNC(source_id & 0xFF), pasid, addr, 1924 + PCI_FUNC(source_id & 0xFF), addr, 1956 1925 fault_reason, reason); 1926 + else 1927 + pr_err("[%s PASID 0x%x] Request device [0x%02x:0x%02x.%d] fault addr 0x%llx [fault reason 0x%02x] %s\n", 1928 + type ? "DMA Read" : "DMA Write", pasid, 1929 + source_id >> 8, PCI_SLOT(source_id & 0xFF), 1930 + PCI_FUNC(source_id & 0xFF), addr, 1931 + fault_reason, reason); 1932 + 1957 1933 return 0; 1958 1934 } 1959 1935 ··· 2027 1989 if (!ratelimited) 2028 1990 /* Using pasid -1 if pasid is not present */ 2029 1991 dmar_fault_do_one(iommu, type, fault_reason, 2030 - pasid_present ? pasid : -1, 1992 + pasid_present ? pasid : INVALID_IOASID, 2031 1993 source_id, guest_addr); 2032 1994 2033 1995 fault_index++;
+104 -68
drivers/iommu/intel/iommu.c
··· 46 46 #include <asm/iommu.h> 47 47 48 48 #include "../irq_remapping.h" 49 + #include "../iommu-sva-lib.h" 49 50 #include "pasid.h" 50 51 #include "cap_audit.h" 51 52 ··· 565 564 static int __iommu_calculate_agaw(struct intel_iommu *iommu, int max_gaw) 566 565 { 567 566 unsigned long sagaw; 568 - int agaw = -1; 567 + int agaw; 569 568 570 569 sagaw = cap_sagaw(iommu->cap); 571 570 for (agaw = width_to_agaw(max_gaw); ··· 626 625 bool found = false; 627 626 int i; 628 627 629 - domain->iommu_coherency = 1; 628 + domain->iommu_coherency = true; 630 629 631 630 for_each_domain_iommu(i, domain) { 632 631 found = true; 633 632 if (!iommu_paging_structure_coherency(g_iommus[i])) { 634 - domain->iommu_coherency = 0; 633 + domain->iommu_coherency = false; 635 634 break; 636 635 } 637 636 } ··· 642 641 rcu_read_lock(); 643 642 for_each_active_iommu(iommu, drhd) { 644 643 if (!iommu_paging_structure_coherency(iommu)) { 645 - domain->iommu_coherency = 0; 644 + domain->iommu_coherency = false; 646 645 break; 647 646 } 648 647 } 649 648 rcu_read_unlock(); 650 649 } 651 650 652 - static int domain_update_iommu_snooping(struct intel_iommu *skip) 651 + static bool domain_update_iommu_snooping(struct intel_iommu *skip) 653 652 { 654 653 struct dmar_drhd_unit *drhd; 655 654 struct intel_iommu *iommu; 656 - int ret = 1; 655 + bool ret = true; 657 656 658 657 rcu_read_lock(); 659 658 for_each_active_iommu(iommu, drhd) { ··· 666 665 */ 667 666 if (!sm_supported(iommu) && 668 667 !ecap_sc_support(iommu->ecap)) { 669 - ret = 0; 668 + ret = false; 670 669 break; 671 670 } 672 671 } ··· 683 682 struct intel_iommu *iommu; 684 683 int mask = 0x3; 685 684 686 - if (!intel_iommu_superpage) { 685 + if (!intel_iommu_superpage) 687 686 return 0; 688 - } 689 687 690 688 /* set iommu_superpage to the smallest common denominator */ 691 689 rcu_read_lock(); ··· 1919 1919 assert_spin_locked(&iommu->lock); 1920 1920 1921 1921 domain->iommu_refcnt[iommu->seq_id] += 1; 1922 - domain->iommu_count += 1; 
1923 1922 if (domain->iommu_refcnt[iommu->seq_id] == 1) { 1924 1923 ndomains = cap_ndoms(iommu->cap); 1925 1924 num = find_first_zero_bit(iommu->domain_ids, ··· 1926 1927 if (num >= ndomains) { 1927 1928 pr_err("%s: No free domain ids\n", iommu->name); 1928 1929 domain->iommu_refcnt[iommu->seq_id] -= 1; 1929 - domain->iommu_count -= 1; 1930 1930 return -ENOSPC; 1931 1931 } 1932 1932 ··· 1941 1943 return 0; 1942 1944 } 1943 1945 1944 - static int domain_detach_iommu(struct dmar_domain *domain, 1945 - struct intel_iommu *iommu) 1946 + static void domain_detach_iommu(struct dmar_domain *domain, 1947 + struct intel_iommu *iommu) 1946 1948 { 1947 - int num, count; 1949 + int num; 1948 1950 1949 1951 assert_spin_locked(&device_domain_lock); 1950 1952 assert_spin_locked(&iommu->lock); 1951 1953 1952 1954 domain->iommu_refcnt[iommu->seq_id] -= 1; 1953 - count = --domain->iommu_count; 1954 1955 if (domain->iommu_refcnt[iommu->seq_id] == 0) { 1955 1956 num = domain->iommu_did[iommu->seq_id]; 1956 1957 clear_bit(num, iommu->domain_ids); ··· 1958 1961 domain_update_iommu_cap(domain); 1959 1962 domain->iommu_did[iommu->seq_id] = 0; 1960 1963 } 1961 - 1962 - return count; 1963 1964 } 1964 1965 1965 1966 static inline int guestwidth_to_adjustwidth(int gaw) ··· 4133 4138 return container_of(iommu_dev, struct intel_iommu, iommu); 4134 4139 } 4135 4140 4136 - static ssize_t intel_iommu_show_version(struct device *dev, 4137 - struct device_attribute *attr, 4138 - char *buf) 4141 + static ssize_t version_show(struct device *dev, 4142 + struct device_attribute *attr, char *buf) 4139 4143 { 4140 4144 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4141 4145 u32 ver = readl(iommu->reg + DMAR_VER_REG); 4142 4146 return sprintf(buf, "%d:%d\n", 4143 4147 DMAR_VER_MAJOR(ver), DMAR_VER_MINOR(ver)); 4144 4148 } 4145 - static DEVICE_ATTR(version, S_IRUGO, intel_iommu_show_version, NULL); 4149 + static DEVICE_ATTR_RO(version); 4146 4150 4147 - static ssize_t intel_iommu_show_address(struct device *dev, 4148 - struct device_attribute *attr, 4149 - char *buf) 4151 + static ssize_t address_show(struct device *dev, 4152 + struct device_attribute *attr, char *buf) 4150 4153 { 4151 4154 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4152 4155 return sprintf(buf, "%llx\n", iommu->reg_phys); 4153 4156 } 4154 - static DEVICE_ATTR(address, S_IRUGO, intel_iommu_show_address, NULL); 4157 + static DEVICE_ATTR_RO(address); 4155 4158 4156 - static ssize_t intel_iommu_show_cap(struct device *dev, 4157 - struct device_attribute *attr, 4158 - char *buf) 4159 + static ssize_t cap_show(struct device *dev, 4160 + struct device_attribute *attr, char *buf) 4159 4161 { 4160 4162 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4161 4163 return sprintf(buf, "%llx\n", iommu->cap); 4162 4164 } 4163 - static DEVICE_ATTR(cap, S_IRUGO, intel_iommu_show_cap, NULL); 4165 + static DEVICE_ATTR_RO(cap); 4164 4166 4165 - static ssize_t intel_iommu_show_ecap(struct device *dev, 4166 - struct device_attribute *attr, 4167 - char *buf) 4167 + static ssize_t ecap_show(struct device *dev, 4168 + struct device_attribute *attr, char *buf) 4168 4169 { 4169 4170 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4170 4171 return sprintf(buf, "%llx\n", iommu->ecap); 4171 4172 } 4172 - static DEVICE_ATTR(ecap, S_IRUGO, intel_iommu_show_ecap, NULL); 4173 + static DEVICE_ATTR_RO(ecap); 4173 4174 4174 - static ssize_t intel_iommu_show_ndoms(struct device *dev, 4175 - struct device_attribute *attr, 4176 - char *buf) 4175 + static ssize_t domains_supported_show(struct device *dev, 4176 + struct device_attribute *attr, char *buf) 4177 4177 { 4178 4178 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4179 4179 return sprintf(buf, "%ld\n", cap_ndoms(iommu->cap)); 4180 4180 } 4181 - static DEVICE_ATTR(domains_supported, S_IRUGO, intel_iommu_show_ndoms, NULL); 4181 + static DEVICE_ATTR_RO(domains_supported); 4182 4182 4183 - static ssize_t intel_iommu_show_ndoms_used(struct device *dev, 4184 - struct device_attribute *attr, 4185 - char *buf) 4183 + static ssize_t domains_used_show(struct device *dev, 4184 + struct device_attribute *attr, char *buf) 4186 4185 { 4187 4186 struct intel_iommu *iommu = dev_to_intel_iommu(dev); 4188 4187 return sprintf(buf, "%d\n", bitmap_weight(iommu->domain_ids, 4189 4188 cap_ndoms(iommu->cap))); 4190 4189 } 4191 - static DEVICE_ATTR(domains_used, S_IRUGO, intel_iommu_show_ndoms_used, NULL); 4190 + static DEVICE_ATTR_RO(domains_used); 4192 4191 4193 4192 static struct attribute *intel_iommu_attrs[] = { 4194 4193 &dev_attr_version.attr, ··· 4500 4511 adjust_width = guestwidth_to_adjustwidth(guest_width); 4501 4512 domain->agaw = width_to_agaw(adjust_width); 4502 4513 4503 - domain->iommu_coherency = 0; 4504 - domain->iommu_snooping = 0; 4514 + domain->iommu_coherency = false; 4515 + domain->iommu_snooping = false; 4505 4516 domain->iommu_superpage = 0; 4506 4517 domain->max_addr = 0; 4507 4518 4508 4519 /* always allocate the top pgd */ 4509 - domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); 4520 + domain->pgd = alloc_pgtable_page(domain->nid); 4510 4521 if (!domain->pgd) 4511 4522 return -ENOMEM; 4512 4523 domain_flush_cache(domain, domain->pgd, PAGE_SIZE); ··· 4746 4757 if (!iommu) 4747 4758 return -ENODEV; 4748 4759 4760 + if ((dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE) && 4761 + !ecap_nest(iommu->ecap)) { 4762 + dev_err(dev, "%s: iommu not support nested translation\n", 4763 + iommu->name); 4764 + return -EINVAL; 4765 + } 4766 + 4749 4767 /* check if this iommu agaw is sufficient for max mapped address */ 4750 4768 addr_width = agaw_to_width(iommu->agaw); 4751 4769 if (addr_width > cap_mgaw(iommu->cap)) ··· 4774 4778 4775 4779 pte = dmar_domain->pgd; 4776 4780 if (dma_pte_present(pte)) { 4777 - dmar_domain->pgd = (struct dma_pte *) 4778 - phys_to_virt(dma_pte_addr(pte)); 4781 + dmar_domain->pgd = phys_to_virt(dma_pte_addr(pte)); 4779 4782 free_pgtable_page(pte); 4780 4783 } 4781 4784 dmar_domain->agaw--; ··· 5124 5129 static bool intel_iommu_capable(enum iommu_cap cap) 5125 5130 { 5126 5131 if (cap == IOMMU_CAP_CACHE_COHERENCY) 5127 - return domain_update_iommu_snooping(NULL) == 1; 5132 + return domain_update_iommu_snooping(NULL); 5128 5133 if (cap == IOMMU_CAP_INTR_REMAP) 5129 5134 return irq_remapping_enabled == 1; 5130 5135 ··· 5160 5165 5161 5166 static void intel_iommu_probe_finalize(struct device *dev) 5162 5167 { 5163 - dma_addr_t base = IOVA_START_PFN << VTD_PAGE_SHIFT; 5164 5168 struct iommu_domain *domain = iommu_get_domain_for_dev(dev); 5165 - struct dmar_domain *dmar_domain = to_dmar_domain(domain); 5166 5169 5167 5170 if (domain && domain->type == IOMMU_DOMAIN_DMA) 5168 - iommu_setup_dma_ops(dev, base, 5169 - __DOMAIN_MAX_ADDR(dmar_domain->gaw) - base); 5171 + iommu_setup_dma_ops(dev, 0, U64_MAX); 5170 5172 else 5171 5173 set_dma_ops(dev, NULL); 5172 5174 } ··· 5323 5331 return 0; 5324 5332 } 5325 5333 5334 + static int intel_iommu_enable_sva(struct device *dev) 5335 + { 5336 + struct device_domain_info *info = get_domain_info(dev); 5337 + struct intel_iommu *iommu; 5338 + int ret; 5339 + 5340 + if (!info || dmar_disabled) 5341 + return -EINVAL; 5342 + 5343 + iommu = info->iommu; 5344 + if (!iommu) 5345 + return -EINVAL; 5346 + 5347 + if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE)) 5348 + return -ENODEV; 5349 + 5350 + if (intel_iommu_enable_pasid(iommu, dev)) 5351 + return -ENODEV; 5352 + 5353 + if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled) 5354 + return -EINVAL; 5355 + 5356 + ret = iopf_queue_add_device(iommu->iopf_queue, dev); 5357 + if (!ret) 5358 + ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev); 5359 + 5360 + return ret; 5361 + } 5362 + 5363 + static int intel_iommu_disable_sva(struct device *dev) 5364 + { 5365 + struct device_domain_info *info = get_domain_info(dev); 5366 + struct intel_iommu *iommu = info->iommu; 5367 + int ret; 5368 +
5369 + ret = iommu_unregister_device_fault_handler(dev); 5370 + if (!ret) 5371 + ret = iopf_queue_remove_device(iommu->iopf_queue, dev); 5372 + 5373 + return ret; 5374 + } 5375 + 5326 5376 /* 5327 5377 * A PCI express designated vendor specific extended capability is defined 5328 5378 * in the section 3.7 of Intel scalable I/O virtualization technical spec ··· 5426 5392 static int 5427 5393 intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat) 5428 5394 { 5429 - if (feat == IOMMU_DEV_FEAT_AUX) 5395 + switch (feat) { 5396 + case IOMMU_DEV_FEAT_AUX: 5430 5397 return intel_iommu_enable_auxd(dev); 5431 5398 5432 - if (feat == IOMMU_DEV_FEAT_IOPF) 5399 + case IOMMU_DEV_FEAT_IOPF: 5433 5400 return intel_iommu_dev_has_feat(dev, feat) ? 0 : -ENODEV; 5434 5401 5435 - if (feat == IOMMU_DEV_FEAT_SVA) { 5436 - struct device_domain_info *info = get_domain_info(dev); 5402 + case IOMMU_DEV_FEAT_SVA: 5403 + return intel_iommu_enable_sva(dev); 5437 5404 5438 - if (!info) 5439 - return -EINVAL; 5440 - 5441 - if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled) 5442 - return -EINVAL; 5443 - 5444 - if (info->iommu->flags & VTD_FLAG_SVM_CAPABLE) 5445 - return 0; 5405 + default: 5406 + return -ENODEV; 5446 5407 } 5447 - 5448 - return -ENODEV; 5449 5408 } 5450 5409 5451 5410 static int 5452 5411 intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat) 5453 5412 { 5454 - if (feat == IOMMU_DEV_FEAT_AUX) 5413 + switch (feat) { 5414 + case IOMMU_DEV_FEAT_AUX: 5455 5415 return intel_iommu_disable_auxd(dev); 5456 5416 5457 - return -ENODEV; 5417 + case IOMMU_DEV_FEAT_IOPF: 5418 + return 0; 5419 + 5420 + case IOMMU_DEV_FEAT_SVA: 5421 + return intel_iommu_disable_sva(dev); 5422 + 5423 + default: 5424 + return -ENODEV; 5425 + } 5458 5426 } 5459 5427 5460 5428 static bool ··· 5493 5457 int ret = -ENODEV; 5494 5458 5495 5459 spin_lock_irqsave(&device_domain_lock, flags); 5496 - if (nested_mode_support() && list_empty(&dmar_domain->devices)) { 5460 + if (list_empty(&dmar_domain->devices)) { 5497 5461 dmar_domain->flags |= DOMAIN_FLAG_NESTING_MODE; 5498 5462 dmar_domain->flags &= ~DOMAIN_FLAG_USE_FIRST_LEVEL; 5499 5463 ret = 0;
+1 -1
drivers/iommu/intel/pasid.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 - /** 2 + /* 3 3 * intel-pasid.c - PASID idr, table and entry manipulation 4 4 * 5 5 * Copyright (C) 2018 Intel Corporation
+166
drivers/iommu/intel/perf.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /** 3 + * perf.c - performance monitor 4 + * 5 + * Copyright (C) 2021 Intel Corporation 6 + * 7 + * Author: Lu Baolu <baolu.lu@linux.intel.com> 8 + * Fenghua Yu <fenghua.yu@intel.com> 9 + */ 10 + 11 + #include <linux/spinlock.h> 12 + #include <linux/intel-iommu.h> 13 + 14 + #include "perf.h" 15 + 16 + static DEFINE_SPINLOCK(latency_lock); 17 + 18 + bool dmar_latency_enabled(struct intel_iommu *iommu, enum latency_type type) 19 + { 20 + struct latency_statistic *lstat = iommu->perf_statistic; 21 + 22 + return lstat && lstat[type].enabled; 23 + } 24 + 25 + int dmar_latency_enable(struct intel_iommu *iommu, enum latency_type type) 26 + { 27 + struct latency_statistic *lstat; 28 + unsigned long flags; 29 + int ret = -EBUSY; 30 + 31 + if (dmar_latency_enabled(iommu, type)) 32 + return 0; 33 + 34 + spin_lock_irqsave(&latency_lock, flags); 35 + if (!iommu->perf_statistic) { 36 + iommu->perf_statistic = kzalloc(sizeof(*lstat) * DMAR_LATENCY_NUM, 37 + GFP_ATOMIC); 38 + if (!iommu->perf_statistic) { 39 + ret = -ENOMEM; 40 + goto unlock_out; 41 + } 42 + } 43 + 44 + lstat = iommu->perf_statistic; 45 + 46 + if (!lstat[type].enabled) { 47 + lstat[type].enabled = true; 48 + lstat[type].counter[COUNTS_MIN] = UINT_MAX; 49 + ret = 0; 50 + } 51 + unlock_out: 52 + spin_unlock_irqrestore(&latency_lock, flags); 53 + 54 + return ret; 55 + } 56 + 57 + void dmar_latency_disable(struct intel_iommu *iommu, enum latency_type type) 58 + { 59 + struct latency_statistic *lstat = iommu->perf_statistic; 60 + unsigned long flags; 61 + 62 + if (!dmar_latency_enabled(iommu, type)) 63 + return; 64 + 65 + spin_lock_irqsave(&latency_lock, flags); 66 + memset(&lstat[type], 0, sizeof(*lstat) * DMAR_LATENCY_NUM); 67 + spin_unlock_irqrestore(&latency_lock, flags); 68 + } 69 + 70 + void dmar_latency_update(struct intel_iommu *iommu, enum latency_type type, u64 latency) 71 + { 72 + struct latency_statistic *lstat = iommu->perf_statistic; 73 + unsigned long flags; 74 + u64 min, max; 75 + 76 + if (!dmar_latency_enabled(iommu, type)) 77 + return; 78 + 79 + spin_lock_irqsave(&latency_lock, flags); 80 + if (latency < 100) 81 + lstat[type].counter[COUNTS_10e2]++; 82 + else if (latency < 1000) 83 + lstat[type].counter[COUNTS_10e3]++; 84 + else if (latency < 10000) 85 + lstat[type].counter[COUNTS_10e4]++; 86 + else if (latency < 100000) 87 + lstat[type].counter[COUNTS_10e5]++; 88 + else if (latency < 1000000) 89 + lstat[type].counter[COUNTS_10e6]++; 90 + else if (latency < 10000000) 91 + lstat[type].counter[COUNTS_10e7]++; 92 + else 93 + lstat[type].counter[COUNTS_10e8_plus]++; 94 + 95 + min = lstat[type].counter[COUNTS_MIN]; 96 + max = lstat[type].counter[COUNTS_MAX]; 97 + lstat[type].counter[COUNTS_MIN] = min_t(u64, min, latency); 98 + lstat[type].counter[COUNTS_MAX] = max_t(u64, max, latency); 99 + lstat[type].counter[COUNTS_SUM] += latency; 100 + lstat[type].samples++; 101 + spin_unlock_irqrestore(&latency_lock, flags); 102 + } 103 + 104 + static char *latency_counter_names[] = { 105 + " <0.1us", 106 + " 0.1us-1us", " 1us-10us", " 10us-100us", 107 + " 100us-1ms", " 1ms-10ms", " >=10ms", 108 + " min(us)", " max(us)", " average(us)" 109 + }; 110 + 111 + static char *latency_type_names[] = { 112 + " inv_iotlb", " inv_devtlb", " inv_iec", 113 + " svm_prq" 114 + }; 115 + 116 + int dmar_latency_snapshot(struct intel_iommu *iommu, char *str, size_t size) 117 + { 118 + struct latency_statistic *lstat = iommu->perf_statistic; 119 + unsigned long flags; 120 + int bytes = 0, i, j; 121 + 122 + memset(str, 0, size); 123 + 124 + for (i = 0; i < COUNTS_NUM; i++) 125 + bytes += snprintf(str + bytes, size - bytes, 126 + "%s", latency_counter_names[i]); 127 + 128 + spin_lock_irqsave(&latency_lock, flags); 129 + for (i = 0; i < DMAR_LATENCY_NUM; i++) { 130 + if (!dmar_latency_enabled(iommu, i)) 131 + continue; 132 + 133 + bytes += snprintf(str + bytes, size - bytes, 134 + "\n%s", latency_type_names[i]); 135 + 136 + for (j = 0; j < COUNTS_NUM; j++) { 137 + u64 val = lstat[i].counter[j]; 138 + 139 + switch (j) { 140 + case COUNTS_MIN: 141 + if (val == UINT_MAX) 142 + val = 0; 143 + else 144 + val = div_u64(val, 1000); 145 + break; 146 + case COUNTS_MAX: 147 + val = div_u64(val, 1000); 148 + break; 149 + case COUNTS_SUM: 150 + if (lstat[i].samples) 151 + val = div_u64(val, (lstat[i].samples * 1000)); 152 + else 153 + val = 0; 154 + break; 155 + default: 156 + break; 157 + } 158 + 159 + bytes += snprintf(str + bytes, size - bytes, 160 + "%12lld", val); 161 + } 162 + } 163 + spin_unlock_irqrestore(&latency_lock, flags); 164 + 165 + return bytes; 166 + }
+73
drivers/iommu/intel/perf.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* 3 + * perf.h - performance monitor header 4 + * 5 + * Copyright (C) 2021 Intel Corporation 6 + * 7 + * Author: Lu Baolu <baolu.lu@linux.intel.com> 8 + */ 9 + 10 + enum latency_type { 11 + DMAR_LATENCY_INV_IOTLB = 0, 12 + DMAR_LATENCY_INV_DEVTLB, 13 + DMAR_LATENCY_INV_IEC, 14 + DMAR_LATENCY_PRQ, 15 + DMAR_LATENCY_NUM 16 + }; 17 + 18 + enum latency_count { 19 + COUNTS_10e2 = 0, /* < 0.1us */ 20 + COUNTS_10e3, /* 0.1us ~ 1us */ 21 + COUNTS_10e4, /* 1us ~ 10us */ 22 + COUNTS_10e5, /* 10us ~ 100us */ 23 + COUNTS_10e6, /* 100us ~ 1ms */ 24 + COUNTS_10e7, /* 1ms ~ 10ms */ 25 + COUNTS_10e8_plus, /* 10ms and plus*/ 26 + COUNTS_MIN, 27 + COUNTS_MAX, 28 + COUNTS_SUM, 29 + COUNTS_NUM 30 + }; 31 + 32 + struct latency_statistic { 33 + bool enabled; 34 + u64 counter[COUNTS_NUM]; 35 + u64 samples; 36 + }; 37 + 38 + #ifdef CONFIG_DMAR_PERF 39 + int dmar_latency_enable(struct intel_iommu *iommu, enum latency_type type); 40 + void dmar_latency_disable(struct intel_iommu *iommu, enum latency_type type); 41 + bool dmar_latency_enabled(struct intel_iommu *iommu, enum latency_type type); 42 + void dmar_latency_update(struct intel_iommu *iommu, enum latency_type type, 43 + u64 latency); 44 + int dmar_latency_snapshot(struct intel_iommu *iommu, char *str, size_t size); 45 + #else 46 + static inline int 47 + dmar_latency_enable(struct intel_iommu *iommu, enum latency_type type) 48 + { 49 + return -EINVAL; 50 + } 51 + 52 + static inline void 53 + dmar_latency_disable(struct intel_iommu *iommu, enum latency_type type) 54 + { 55 + } 56 + 57 + static inline bool 58 + dmar_latency_enabled(struct intel_iommu *iommu, enum latency_type type) 59 + { 60 + return false; 61 + } 62 + 63 + static inline void 64 + dmar_latency_update(struct intel_iommu *iommu, enum latency_type type, u64 latency) 65 + { 66 + } 67 + 68 + static inline int 69 + dmar_latency_snapshot(struct intel_iommu *iommu, char *str, size_t size) 70 + { 71 + return 0; 72 + } 73 + #endif /* CONFIG_DMAR_PERF */
+320 -329
drivers/iommu/intel/svm.c
··· 17 17 #include <linux/dmar.h> 18 18 #include <linux/interrupt.h> 19 19 #include <linux/mm_types.h> 20 + #include <linux/xarray.h> 20 21 #include <linux/ioasid.h> 21 22 #include <asm/page.h> 22 23 #include <asm/fpu/api.h> 24 + #include <trace/events/intel_iommu.h> 23 25 24 26 #include "pasid.h" 27 + #include "perf.h" 28 + #include "../iommu-sva-lib.h" 25 29 26 30 static irqreturn_t prq_event_thread(int irq, void *d); 27 31 static void intel_svm_drain_prq(struct device *dev, u32 pasid); 32 + #define to_intel_svm_dev(handle) container_of(handle, struct intel_svm_dev, sva) 28 33 29 34 #define PRQ_ORDER 0 30 35 36 + static DEFINE_XARRAY_ALLOC(pasid_private_array); 37 + static int pasid_private_add(ioasid_t pasid, void *priv) 38 + { 39 + return xa_alloc(&pasid_private_array, &pasid, priv, 40 + XA_LIMIT(pasid, pasid), GFP_ATOMIC); 41 + } 42 + 43 + static void pasid_private_remove(ioasid_t pasid) 44 + { 45 + xa_erase(&pasid_private_array, pasid); 46 + } 47 + 48 + static void *pasid_private_find(ioasid_t pasid) 49 + { 50 + return xa_load(&pasid_private_array, pasid); 51 + } 52 + 53 + static struct intel_svm_dev * 54 + svm_lookup_device_by_sid(struct intel_svm *svm, u16 sid) 55 + { 56 + struct intel_svm_dev *sdev = NULL, *t; 57 + 58 + rcu_read_lock(); 59 + list_for_each_entry_rcu(t, &svm->devs, list) { 60 + if (t->sid == sid) { 61 + sdev = t; 62 + break; 63 + } 64 + } 65 + rcu_read_unlock(); 66 + 67 + return sdev; 68 + } 69 + 70 + static struct intel_svm_dev * 71 + svm_lookup_device_by_dev(struct intel_svm *svm, struct device *dev) 72 + { 73 + struct intel_svm_dev *sdev = NULL, *t; 74 + 75 + rcu_read_lock(); 76 + list_for_each_entry_rcu(t, &svm->devs, list) { 77 + if (t->dev == dev) { 78 + sdev = t; 79 + break; 80 + } 81 + } 82 + rcu_read_unlock(); 83 + 84 + return sdev; 85 + } 86 + 31 87 int intel_svm_enable_prq(struct intel_iommu *iommu) 32 88 { 89 + struct iopf_queue *iopfq; 33 90 struct page *pages; 34 91 int irq, ret; 35 92 ··· 103 46 pr_err("IOMMU: %s: Failed to create IRQ vector for page request queue\n", 104 47 iommu->name); 105 48 ret = -EINVAL; 106 - err: 107 - free_pages((unsigned long)iommu->prq, PRQ_ORDER); 108 - iommu->prq = NULL; 109 - return ret; 49 + goto free_prq; 110 50 } 111 51 iommu->pr_irq = irq; 52 + 53 + snprintf(iommu->iopfq_name, sizeof(iommu->iopfq_name), 54 + "dmar%d-iopfq", iommu->seq_id); 55 + iopfq = iopf_queue_alloc(iommu->iopfq_name); 56 + if (!iopfq) { 57 + pr_err("IOMMU: %s: Failed to allocate iopf queue\n", iommu->name); 58 + ret = -ENOMEM; 59 + goto free_hwirq; 60 + } 61 + iommu->iopf_queue = iopfq; 112 62 113 63 snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-prq", iommu->seq_id); 114 64 ··· 124 60 if (ret) { 125 61 pr_err("IOMMU: %s: Failed to request IRQ for page request queue\n", 126 62 iommu->name); 127 - dmar_free_hwirq(irq); 128 - iommu->pr_irq = 0; 129 - goto err; 63 + goto free_iopfq; 130 64 } 131 65 dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL); 132 66 dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL); ··· 133 71 init_completion(&iommu->prq_complete); 134 72 135 73 return 0; 74 + 75 + free_iopfq: 76 + iopf_queue_free(iommu->iopf_queue); 77 + iommu->iopf_queue = NULL; 78 + free_hwirq: 79 + dmar_free_hwirq(irq); 80 + iommu->pr_irq = 0; 81 + free_prq: 82 + free_pages((unsigned long)iommu->prq, PRQ_ORDER); 83 + iommu->prq = NULL; 84 + 85 + return ret; 136 86 } 137 87 138 88 int intel_svm_finish_prq(struct intel_iommu *iommu) ··· 157 83 free_irq(iommu->pr_irq, iommu); 158 84 dmar_free_hwirq(iommu->pr_irq); 159 85 iommu->pr_irq = 0; 86 + } 87 + 88 + if (iommu->iopf_queue) { 89 + iopf_queue_free(iommu->iopf_queue); 90 + iommu->iopf_queue = NULL; 160 91 } 161 92 162 93 free_pages((unsigned long)iommu->prq, PRQ_ORDER); ··· 283 204 }; 284 205 285 206 static DEFINE_MUTEX(pasid_mutex); 286 - static LIST_HEAD(global_svm_list); 287 - 288 - #define for_each_svm_dev(sdev, svm, d) \ 289 - list_for_each_entry((sdev), &(svm)->devs, list) \ 290 - if ((d) != (sdev)->dev) {} else 291 207 292 208 static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid, 293 209 struct intel_svm **rsvm, 294 210 struct intel_svm_dev **rsdev) 295 211 { 296 - struct intel_svm_dev *d, *sdev = NULL; 212 + struct intel_svm_dev *sdev = NULL; 297 213 struct intel_svm *svm; 298 214 299 215 /* The caller should hold the pasid_mutex lock */ ··· 298 224 if (pasid == INVALID_IOASID || pasid >= PASID_MAX) 299 225 return -EINVAL; 300 226 301 - svm = ioasid_find(NULL, pasid, NULL); 227 + svm = pasid_private_find(pasid); 302 228 if (IS_ERR(svm)) 303 229 return PTR_ERR(svm); 304 230 ··· 311 237 */ 312 238 if (WARN_ON(list_empty(&svm->devs))) 313 239 return -EINVAL; 314 - 315 - rcu_read_lock(); 316 - list_for_each_entry_rcu(d, &svm->devs, list) { 317 - if (d->dev == dev) { 318 - sdev = d; 319 - break; 320 - } 321 - } 322 - rcu_read_unlock(); 240 + sdev = svm_lookup_device_by_dev(svm, dev); 323 241 324 242 out: 325 243 *rsvm = svm; ··· 400 334 svm->gpasid = data->gpasid; 401 335 svm->flags |= SVM_FLAG_GUEST_PASID; 402 336 } 403 - ioasid_set_data(data->hpasid, svm); 337 + pasid_private_add(data->hpasid, svm); 404 338 INIT_LIST_HEAD_RCU(&svm->devs); 405 339 mmput(svm->mm); 406 340 } ··· 454 388 list_add_rcu(&sdev->list, &svm->devs); 455 389 out: 456 390 if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) { 457 - ioasid_set_data(data->hpasid, NULL); 391 + pasid_private_remove(data->hpasid); 458 392 kfree(svm); 459 393 } 460 394 ··· 497 431 * the unbind, IOMMU driver will get notified 498 432 * and perform cleanup.
499 433 */ 500 - ioasid_set_data(pasid, NULL); 434 + pasid_private_remove(pasid); 501 435 kfree(svm); 502 436 } 503 437 } ··· 525 459 mutex_unlock(&mm->context.lock); 526 460 } 527 461 528 - /* Caller must hold pasid_mutex, mm reference */ 529 - static int 530 - intel_svm_bind_mm(struct device *dev, unsigned int flags, 531 - struct mm_struct *mm, struct intel_svm_dev **sd) 462 + static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm, 463 + unsigned int flags) 532 464 { 533 - struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); 534 - struct intel_svm *svm = NULL, *t; 535 - struct device_domain_info *info; 465 + ioasid_t max_pasid = dev_is_pci(dev) ? 466 + pci_max_pasids(to_pci_dev(dev)) : intel_pasid_max_id; 467 + 468 + return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1); 469 + } 470 + 471 + static void intel_svm_free_pasid(struct mm_struct *mm) 472 + { 473 + iommu_sva_free_pasid(mm); 474 + } 475 + 476 + static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu, 477 + struct device *dev, 478 + struct mm_struct *mm, 479 + unsigned int flags) 480 + { 481 + struct device_domain_info *info = get_domain_info(dev); 482 + unsigned long iflags, sflags; 536 483 struct intel_svm_dev *sdev; 537 - unsigned long iflags; 538 - int pasid_max; 539 - int ret; 484 + struct intel_svm *svm; 485 + int ret = 0; 540 486 541 - if (!iommu || dmar_disabled) 542 - return -EINVAL; 487 + svm = pasid_private_find(mm->pasid); 488 + if (!svm) { 489 + svm = kzalloc(sizeof(*svm), GFP_KERNEL); 490 + if (!svm) 491 + return ERR_PTR(-ENOMEM); 543 492 544 - if (!intel_svm_capable(iommu)) 545 - return -ENOTSUPP; 493 + svm->pasid = mm->pasid; 494 + svm->mm = mm; 495 + svm->flags = flags; 496 + INIT_LIST_HEAD_RCU(&svm->devs); 546 497 547 - if (dev_is_pci(dev)) { 548 - pasid_max = pci_max_pasids(to_pci_dev(dev)); 549 - if (pasid_max < 0) 550 - return -EINVAL; 551 - } else 552 - pasid_max = 1 << 20; 498 + if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) { 499 + 
svm->notifier.ops = &intel_mmuops; 500 + ret = mmu_notifier_register(&svm->notifier, mm); 501 + if (ret) { 502 + kfree(svm); 503 + return ERR_PTR(ret); 504 + } 505 + } 553 506 554 - /* Bind supervisor PASID shuld have mm = NULL */ 555 - if (flags & SVM_FLAG_SUPERVISOR_MODE) { 556 - if (!ecap_srs(iommu->ecap) || mm) { 557 - pr_err("Supervisor PASID with user provided mm.\n"); 558 - return -EINVAL; 507 + ret = pasid_private_add(svm->pasid, svm); 508 + if (ret) { 509 + if (svm->notifier.ops) 510 + mmu_notifier_unregister(&svm->notifier, mm); 511 + kfree(svm); 512 + return ERR_PTR(ret); 559 513 } 560 514 } 561 515 562 - list_for_each_entry(t, &global_svm_list, list) { 563 - if (t->mm != mm) 564 - continue; 565 - 566 - svm = t; 567 - if (svm->pasid >= pasid_max) { 568 - dev_warn(dev, 569 - "Limited PASID width. Cannot use existing PASID %d\n", 570 - svm->pasid); 571 - ret = -ENOSPC; 572 - goto out; 573 - } 574 - 575 - /* Find the matching device in svm list */ 576 - for_each_svm_dev(sdev, svm, dev) { 577 - sdev->users++; 578 - goto success; 579 - } 580 - 581 - break; 516 + /* Find the matching device in svm list */ 517 + sdev = svm_lookup_device_by_dev(svm, dev); 518 + if (sdev) { 519 + sdev->users++; 520 + goto success; 582 521 } 583 522 584 523 sdev = kzalloc(sizeof(*sdev), GFP_KERNEL); 585 524 if (!sdev) { 586 525 ret = -ENOMEM; 587 - goto out; 526 + goto free_svm; 588 527 } 528 + 589 529 sdev->dev = dev; 590 530 sdev->iommu = iommu; 591 - 592 - ret = intel_iommu_enable_pasid(iommu, dev); 593 - if (ret) { 594 - kfree(sdev); 595 - goto out; 596 - } 597 - 598 - info = get_domain_info(dev); 599 531 sdev->did = FLPT_DEFAULT_DID; 600 532 sdev->sid = PCI_DEVID(info->bus, info->devfn); 533 + sdev->users = 1; 534 + sdev->pasid = svm->pasid; 535 + sdev->sva.dev = dev; 536 + init_rcu_head(&sdev->rcu); 601 537 if (info->ats_enabled) { 602 538 sdev->dev_iotlb = 1; 603 539 sdev->qdep = info->ats_qdep; ··· 607 539 sdev->qdep = 0; 608 540 } 609 541 610 - /* Finish the setup now we 
know we're keeping it */ 611 - sdev->users = 1; 612 - init_rcu_head(&sdev->rcu); 542 + /* Setup the pasid table: */ 543 + sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ? 544 + PASID_FLAG_SUPERVISOR_MODE : 0; 545 + sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0; 546 + spin_lock_irqsave(&iommu->lock, iflags); 547 + ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid, 548 + FLPT_DEFAULT_DID, sflags); 549 + spin_unlock_irqrestore(&iommu->lock, iflags); 613 550 614 - if (!svm) { 615 - svm = kzalloc(sizeof(*svm), GFP_KERNEL); 616 - if (!svm) { 617 - ret = -ENOMEM; 618 - kfree(sdev); 619 - goto out; 620 - } 551 + if (ret) 552 + goto free_sdev; 621 553 622 - if (pasid_max > intel_pasid_max_id) 623 - pasid_max = intel_pasid_max_id; 554 + /* The newly allocated pasid is loaded to the mm. */ 555 + if (!(flags & SVM_FLAG_SUPERVISOR_MODE) && list_empty(&svm->devs)) 556 + load_pasid(mm, svm->pasid); 624 557 625 - /* Do not use PASID 0, reserved for RID to PASID */ 626 - svm->pasid = ioasid_alloc(NULL, PASID_MIN, 627 - pasid_max - 1, svm); 628 - if (svm->pasid == INVALID_IOASID) { 629 - kfree(svm); 630 - kfree(sdev); 631 - ret = -ENOSPC; 632 - goto out; 633 - } 634 - svm->notifier.ops = &intel_mmuops; 635 - svm->mm = mm; 636 - svm->flags = flags; 637 - INIT_LIST_HEAD_RCU(&svm->devs); 638 - INIT_LIST_HEAD(&svm->list); 639 - ret = -ENOMEM; 640 - if (mm) { 641 - ret = mmu_notifier_register(&svm->notifier, mm); 642 - if (ret) { 643 - ioasid_put(svm->pasid); 644 - kfree(svm); 645 - kfree(sdev); 646 - goto out; 647 - } 648 - } 649 - 650 - spin_lock_irqsave(&iommu->lock, iflags); 651 - ret = intel_pasid_setup_first_level(iommu, dev, 652 - mm ? mm->pgd : init_mm.pgd, 653 - svm->pasid, FLPT_DEFAULT_DID, 654 - (mm ? 0 : PASID_FLAG_SUPERVISOR_MODE) | 655 - (cpu_feature_enabled(X86_FEATURE_LA57) ? 
656 - PASID_FLAG_FL5LP : 0)); 657 - spin_unlock_irqrestore(&iommu->lock, iflags); 658 - if (ret) { 659 - if (mm) 660 - mmu_notifier_unregister(&svm->notifier, mm); 661 - ioasid_put(svm->pasid); 662 - kfree(svm); 663 - kfree(sdev); 664 - goto out; 665 - } 666 - 667 - list_add_tail(&svm->list, &global_svm_list); 668 - if (mm) { 669 - /* The newly allocated pasid is loaded to the mm. */ 670 - load_pasid(mm, svm->pasid); 671 - } 672 - } else { 673 - /* 674 - * Binding a new device with existing PASID, need to setup 675 - * the PASID entry. 676 - */ 677 - spin_lock_irqsave(&iommu->lock, iflags); 678 - ret = intel_pasid_setup_first_level(iommu, dev, 679 - mm ? mm->pgd : init_mm.pgd, 680 - svm->pasid, FLPT_DEFAULT_DID, 681 - (mm ? 0 : PASID_FLAG_SUPERVISOR_MODE) | 682 - (cpu_feature_enabled(X86_FEATURE_LA57) ? 683 - PASID_FLAG_FL5LP : 0)); 684 - spin_unlock_irqrestore(&iommu->lock, iflags); 685 - if (ret) { 686 - kfree(sdev); 687 - goto out; 688 - } 689 - } 690 558 list_add_rcu(&sdev->list, &svm->devs); 691 559 success: 692 - sdev->pasid = svm->pasid; 693 - sdev->sva.dev = dev; 694 - if (sd) 695 - *sd = sdev; 696 - ret = 0; 697 - out: 698 - return ret; 560 + return &sdev->sva; 561 + 562 + free_sdev: 563 + kfree(sdev); 564 + free_svm: 565 + if (list_empty(&svm->devs)) { 566 + if (svm->notifier.ops) 567 + mmu_notifier_unregister(&svm->notifier, mm); 568 + pasid_private_remove(mm->pasid); 569 + kfree(svm); 570 + } 571 + 572 + return ERR_PTR(ret); 699 573 } 700 574 701 575 /* Caller must hold pasid_mutex */ ··· 646 636 struct intel_svm_dev *sdev; 647 637 struct intel_iommu *iommu; 648 638 struct intel_svm *svm; 639 + struct mm_struct *mm; 649 640 int ret = -EINVAL; 650 641 651 642 iommu = device_to_iommu(dev, NULL, NULL); ··· 656 645 ret = pasid_to_svm_sdev(dev, pasid, &svm, &sdev); 657 646 if (ret) 658 647 goto out; 648 + mm = svm->mm; 659 649 660 650 if (sdev) { 661 651 sdev->users--; ··· 675 663 kfree_rcu(sdev, rcu); 676 664 677 665 if (list_empty(&svm->devs)) { 678 - 
ioasid_put(svm->pasid); 679 - if (svm->mm) { 680 - mmu_notifier_unregister(&svm->notifier, svm->mm); 666 + intel_svm_free_pasid(mm); 667 + if (svm->notifier.ops) { 668 + mmu_notifier_unregister(&svm->notifier, mm); 681 669 /* Clear mm's pasid. */ 682 - load_pasid(svm->mm, PASID_DISABLED); 670 + load_pasid(mm, PASID_DISABLED); 683 671 } 684 - list_del(&svm->list); 672 + pasid_private_remove(svm->pasid); 685 673 /* We mandate that no page faults may be outstanding 686 674 * for the PASID when intel_svm_unbind_mm() is called. 687 675 * If that is not obeyed, subtle errors will happen. ··· 725 713 }; 726 714 727 715 #define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x20) 728 - 729 - static bool access_error(struct vm_area_struct *vma, struct page_req_dsc *req) 730 - { 731 - unsigned long requested = 0; 732 - 733 - if (req->exe_req) 734 - requested |= VM_EXEC; 735 - 736 - if (req->rd_req) 737 - requested |= VM_READ; 738 - 739 - if (req->wr_req) 740 - requested |= VM_WRITE; 741 - 742 - return (requested & ~vma->vm_flags) != 0; 743 - } 744 716 745 717 static bool is_canonical_address(u64 addr) 746 718 { ··· 795 799 goto prq_retry; 796 800 } 797 801 802 + iopf_queue_flush_dev(dev); 803 + 798 804 /* 799 805 * Perform steps described in VT-d spec CH7.10 to drain page 800 806 * requests and responses in hardware. 
··· 839 841 return prot; 840 842 } 841 843 842 - static int 843 - intel_svm_prq_report(struct device *dev, struct page_req_dsc *desc) 844 + static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev, 845 + struct page_req_dsc *desc) 844 846 { 845 847 struct iommu_fault_event event; 846 848 ··· 870 872 */ 871 873 event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE; 872 874 event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA; 873 - memcpy(event.fault.prm.private_data, desc->priv_data, 874 - sizeof(desc->priv_data)); 875 + event.fault.prm.private_data[0] = desc->priv_data[0]; 876 + event.fault.prm.private_data[1] = desc->priv_data[1]; 877 + } else if (dmar_latency_enabled(iommu, DMAR_LATENCY_PRQ)) { 878 + /* 879 + * If the private data fields are not used by hardware, use it 880 + * to monitor the prq handle latency. 881 + */ 882 + event.fault.prm.private_data[0] = ktime_to_ns(ktime_get()); 875 883 } 876 884 877 885 return iommu_report_device_fault(dev, &event); 886 + } 887 + 888 + static void handle_bad_prq_event(struct intel_iommu *iommu, 889 + struct page_req_dsc *req, int result) 890 + { 891 + struct qi_desc desc; 892 + 893 + pr_err("%s: Invalid page request: %08llx %08llx\n", 894 + iommu->name, ((unsigned long long *)req)[0], 895 + ((unsigned long long *)req)[1]); 896 + 897 + /* 898 + * Per VT-d spec. v3.0 ch7.7, system software must 899 + * respond with page group response if private data 900 + * is present (PDP) or last page in group (LPIG) bit 901 + * is set. This is an additional VT-d feature beyond 902 + * PCI ATS spec. 
903 + */ 904 + if (!req->lpig && !req->priv_data_present) 905 + return; 906 + 907 + desc.qw0 = QI_PGRP_PASID(req->pasid) | 908 + QI_PGRP_DID(req->rid) | 909 + QI_PGRP_PASID_P(req->pasid_present) | 910 + QI_PGRP_PDP(req->priv_data_present) | 911 + QI_PGRP_RESP_CODE(result) | 912 + QI_PGRP_RESP_TYPE; 913 + desc.qw1 = QI_PGRP_IDX(req->prg_index) | 914 + QI_PGRP_LPIG(req->lpig); 915 + 916 + if (req->priv_data_present) { 917 + desc.qw2 = req->priv_data[0]; 918 + desc.qw3 = req->priv_data[1]; 919 + } else { 920 + desc.qw2 = 0; 921 + desc.qw3 = 0; 922 + } 923 + 924 + qi_submit_sync(iommu, &desc, 1, 0); 878 925 } 879 926 880 927 static irqreturn_t prq_event_thread(int irq, void *d) ··· 927 884 struct intel_svm_dev *sdev = NULL; 928 885 struct intel_iommu *iommu = d; 929 886 struct intel_svm *svm = NULL; 930 - int head, tail, handled = 0; 931 - unsigned int flags = 0; 887 + struct page_req_dsc *req; 888 + int head, tail, handled; 889 + u64 address; 932 890 933 - /* Clear PPR bit before reading head/tail registers, to 934 - * ensure that we get a new interrupt if needed. */ 891 + /* 892 + * Clear PPR bit before reading head/tail registers, to ensure that 893 + * we get a new interrupt if needed. 
894 + */ 935 895 writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG); 936 896 937 897 tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK; 938 898 head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK; 899 + handled = (head != tail); 939 900 while (head != tail) { 940 - struct vm_area_struct *vma; 941 - struct page_req_dsc *req; 942 - struct qi_desc resp; 943 - int result; 944 - vm_fault_t ret; 945 - u64 address; 946 - 947 - handled = 1; 948 901 req = &iommu->prq[head / sizeof(*req)]; 949 - result = QI_RESP_INVALID; 950 902 address = (u64)req->addr << VTD_PAGE_SHIFT; 951 - if (!req->pasid_present) { 952 - pr_err("%s: Page request without PASID: %08llx %08llx\n", 953 - iommu->name, ((unsigned long long *)req)[0], 954 - ((unsigned long long *)req)[1]); 955 - goto no_pasid; 903 + 904 + if (unlikely(!req->pasid_present)) { 905 + pr_err("IOMMU: %s: Page request without PASID\n", 906 + iommu->name); 907 + bad_req: 908 + svm = NULL; 909 + sdev = NULL; 910 + handle_bad_prq_event(iommu, req, QI_RESP_INVALID); 911 + goto prq_advance; 956 912 } 957 - /* We shall not receive page request for supervisor SVM */ 958 - if (req->pm_req && (req->rd_req | req->wr_req)) { 959 - pr_err("Unexpected page request in Privilege Mode"); 960 - /* No need to find the matching sdev as for bad_req */ 961 - goto no_pasid; 913 + 914 + if (unlikely(!is_canonical_address(address))) { 915 + pr_err("IOMMU: %s: Address is not canonical\n", 916 + iommu->name); 917 + goto bad_req; 962 918 } 963 - /* DMA read with exec requeset is not supported. 
*/ 964 - if (req->exe_req && req->rd_req) { 965 - pr_err("Execution request not supported\n"); 966 - goto no_pasid; 919 + 920 + if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) { 921 + pr_err("IOMMU: %s: Page request in Privilege Mode\n", 922 + iommu->name); 923 + goto bad_req; 967 924 } 925 + 926 + if (unlikely(req->exe_req && req->rd_req)) { 927 + pr_err("IOMMU: %s: Execution request not supported\n", 928 + iommu->name); 929 + goto bad_req; 930 + } 931 + 968 932 if (!svm || svm->pasid != req->pasid) { 969 - rcu_read_lock(); 970 - svm = ioasid_find(NULL, req->pasid, NULL); 971 - /* It *can't* go away, because the driver is not permitted 933 + /* 934 + * It can't go away, because the driver is not permitted 972 935 * to unbind the mm while any page faults are outstanding. 973 - * So we only need RCU to protect the internal idr code. */ 974 - rcu_read_unlock(); 975 - if (IS_ERR_OR_NULL(svm)) { 976 - pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n", 977 - iommu->name, req->pasid, ((unsigned long long *)req)[0], 978 - ((unsigned long long *)req)[1]); 979 - goto no_pasid; 980 - } 936 + */ 937 + svm = pasid_private_find(req->pasid); 938 + if (IS_ERR_OR_NULL(svm) || (svm->flags & SVM_FLAG_SUPERVISOR_MODE)) 939 + goto bad_req; 981 940 } 982 941 983 942 if (!sdev || sdev->sid != req->rid) { 984 - struct intel_svm_dev *t; 985 - 986 - sdev = NULL; 987 - rcu_read_lock(); 988 - list_for_each_entry_rcu(t, &svm->devs, list) { 989 - if (t->sid == req->rid) { 990 - sdev = t; 991 - break; 992 - } 993 - } 994 - rcu_read_unlock(); 943 + sdev = svm_lookup_device_by_sid(svm, req->rid); 944 + if (!sdev) 945 + goto bad_req; 995 946 } 996 947 997 - /* Since we're using init_mm.pgd directly, we should never take 998 - * any faults on kernel addresses. 
*/ 999 - if (!svm->mm) 1000 - goto bad_req; 1001 - 1002 - /* If address is not canonical, return invalid response */ 1003 - if (!is_canonical_address(address)) 1004 - goto bad_req; 948 + sdev->prq_seq_number++; 1005 949 1006 950 /* 1007 951 * If prq is to be handled outside iommu driver via receiver of 1008 952 * the fault notifiers, we skip the page response here. 1009 953 */ 1010 - if (svm->flags & SVM_FLAG_GUEST_MODE) { 1011 - if (sdev && !intel_svm_prq_report(sdev->dev, req)) 1012 - goto prq_advance; 1013 - else 1014 - goto bad_req; 1015 - } 954 + if (intel_svm_prq_report(iommu, sdev->dev, req)) 955 + handle_bad_prq_event(iommu, req, QI_RESP_INVALID); 1016 956 1017 - /* If the mm is already defunct, don't handle faults. */ 1018 - if (!mmget_not_zero(svm->mm)) 1019 - goto bad_req; 1020 - 1021 - mmap_read_lock(svm->mm); 1022 - vma = find_extend_vma(svm->mm, address); 1023 - if (!vma || address < vma->vm_start) 1024 - goto invalid; 1025 - 1026 - if (access_error(vma, req)) 1027 - goto invalid; 1028 - 1029 - flags = FAULT_FLAG_USER | FAULT_FLAG_REMOTE; 1030 - if (req->wr_req) 1031 - flags |= FAULT_FLAG_WRITE; 1032 - 1033 - ret = handle_mm_fault(vma, address, flags, NULL); 1034 - if (ret & VM_FAULT_ERROR) 1035 - goto invalid; 1036 - 1037 - result = QI_RESP_SUCCESS; 1038 - invalid: 1039 - mmap_read_unlock(svm->mm); 1040 - mmput(svm->mm); 1041 - bad_req: 1042 - /* We get here in the error case where the PASID lookup failed, 1043 - and these can be NULL. Do not use them below this point! */ 1044 - sdev = NULL; 1045 - svm = NULL; 1046 - no_pasid: 1047 - if (req->lpig || req->priv_data_present) { 1048 - /* 1049 - * Per VT-d spec. v3.0 ch7.7, system software must 1050 - * respond with page group response if private data 1051 - * is present (PDP) or last page in group (LPIG) bit 1052 - * is set. This is an additional VT-d feature beyond 1053 - * PCI ATS spec. 
1054 - */ 1055 - resp.qw0 = QI_PGRP_PASID(req->pasid) | 1056 - QI_PGRP_DID(req->rid) | 1057 - QI_PGRP_PASID_P(req->pasid_present) | 1058 - QI_PGRP_PDP(req->priv_data_present) | 1059 - QI_PGRP_RESP_CODE(result) | 1060 - QI_PGRP_RESP_TYPE; 1061 - resp.qw1 = QI_PGRP_IDX(req->prg_index) | 1062 - QI_PGRP_LPIG(req->lpig); 1063 - resp.qw2 = 0; 1064 - resp.qw3 = 0; 1065 - 1066 - if (req->priv_data_present) 1067 - memcpy(&resp.qw2, req->priv_data, 1068 - sizeof(req->priv_data)); 1069 - qi_submit_sync(iommu, &resp, 1, 0); 1070 - } 957 + trace_prq_report(iommu, sdev->dev, req->qw_0, req->qw_1, 958 + req->priv_data[0], req->priv_data[1], 959 + sdev->prq_seq_number); 1071 960 prq_advance: 1072 961 head = (head + sizeof(*req)) & PRQ_RING_MASK; 1073 962 } ··· 1016 1041 head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK; 1017 1042 tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK; 1018 1043 if (head == tail) { 1044 + iopf_queue_discard_partial(iommu->iopf_queue); 1019 1045 writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG); 1020 1046 pr_info_ratelimited("IOMMU: %s: PRQ overflow cleared", 1021 1047 iommu->name); ··· 1029 1053 return IRQ_RETVAL(handled); 1030 1054 } 1031 1055 1032 - #define to_intel_svm_dev(handle) container_of(handle, struct intel_svm_dev, sva) 1033 - struct iommu_sva * 1034 - intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata) 1056 + struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata) 1035 1057 { 1036 - struct iommu_sva *sva = ERR_PTR(-EINVAL); 1037 - struct intel_svm_dev *sdev = NULL; 1058 + struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); 1038 1059 unsigned int flags = 0; 1060 + struct iommu_sva *sva; 1039 1061 int ret; 1040 1062 1041 - /* 1042 - * TODO: Consolidate with generic iommu-sva bind after it is merged. 1043 - * It will require shared SVM data structures, i.e. combine io_mm 1044 - * and intel_svm etc. 
1045 - */ 1046 1063 if (drvdata) 1047 1064 flags = *(unsigned int *)drvdata; 1048 - mutex_lock(&pasid_mutex); 1049 - ret = intel_svm_bind_mm(dev, flags, mm, &sdev); 1050 - if (ret) 1051 - sva = ERR_PTR(ret); 1052 - else if (sdev) 1053 - sva = &sdev->sva; 1054 - else 1055 - WARN(!sdev, "SVM bind succeeded with no sdev!\n"); 1056 1065 1066 + if (flags & SVM_FLAG_SUPERVISOR_MODE) { 1067 + if (!ecap_srs(iommu->ecap)) { 1068 + dev_err(dev, "%s: Supervisor PASID not supported\n", 1069 + iommu->name); 1070 + return ERR_PTR(-EOPNOTSUPP); 1071 + } 1072 + 1073 + if (mm) { 1074 + dev_err(dev, "%s: Supervisor PASID with user provided mm\n", 1075 + iommu->name); 1076 + return ERR_PTR(-EINVAL); 1077 + } 1078 + 1079 + mm = &init_mm; 1080 + } 1081 + 1082 + mutex_lock(&pasid_mutex); 1083 + ret = intel_svm_alloc_pasid(dev, mm, flags); 1084 + if (ret) { 1085 + mutex_unlock(&pasid_mutex); 1086 + return ERR_PTR(ret); 1087 + } 1088 + 1089 + sva = intel_svm_bind_mm(iommu, dev, mm, flags); 1090 + if (IS_ERR_OR_NULL(sva)) 1091 + intel_svm_free_pasid(mm); 1057 1092 mutex_unlock(&pasid_mutex); 1058 1093 1059 1094 return sva; ··· 1072 1085 1073 1086 void intel_svm_unbind(struct iommu_sva *sva) 1074 1087 { 1075 - struct intel_svm_dev *sdev; 1088 + struct intel_svm_dev *sdev = to_intel_svm_dev(sva); 1076 1089 1077 1090 mutex_lock(&pasid_mutex); 1078 - sdev = to_intel_svm_dev(sva); 1079 1091 intel_svm_unbind_mm(sdev->dev, sdev->pasid); 1080 1092 mutex_unlock(&pasid_mutex); 1081 1093 } ··· 1180 1194 desc.qw1 = QI_PGRP_IDX(prm->grpid) | QI_PGRP_LPIG(last_page); 1181 1195 desc.qw2 = 0; 1182 1196 desc.qw3 = 0; 1183 - if (private_present) 1184 - memcpy(&desc.qw2, prm->private_data, 1185 - sizeof(prm->private_data)); 1197 + 1198 + if (private_present) { 1199 + desc.qw2 = prm->private_data[0]; 1200 + desc.qw3 = prm->private_data[1]; 1201 + } else if (prm->private_data[0]) { 1202 + dmar_latency_update(iommu, DMAR_LATENCY_PRQ, 1203 + ktime_to_ns(ktime_get()) - prm->private_data[0]); 1204 + } 1186 1205 
1187 1206 qi_submit_sync(iommu, &desc, 1, 0); 1188 1207 }
-3
drivers/iommu/iommu.c
··· 3059 3059 int ret, dev_def_dom; 3060 3060 struct device *dev; 3061 3061 3062 - if (!group) 3063 - return -EINVAL; 3064 - 3065 3062 mutex_lock(&group->mutex); 3066 3063 3067 3064 if (group->default_domain != group->domain) {
+11 -7
drivers/iommu/iova.c
··· 412 412 return NULL; 413 413 } 414 414 415 - static void private_free_iova(struct iova_domain *iovad, struct iova *iova) 415 + static void remove_iova(struct iova_domain *iovad, struct iova *iova) 416 416 { 417 417 assert_spin_locked(&iovad->iova_rbtree_lock); 418 418 __cached_rbnode_delete_update(iovad, iova); 419 419 rb_erase(&iova->node, &iovad->rbroot); 420 - free_iova_mem(iova); 421 420 } 422 421 423 422 /** ··· 451 452 unsigned long flags; 452 453 453 454 spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); 454 - private_free_iova(iovad, iova); 455 + remove_iova(iovad, iova); 455 456 spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags); 457 + free_iova_mem(iova); 456 458 } 457 459 EXPORT_SYMBOL_GPL(__free_iova); 458 460 ··· 472 472 473 473 spin_lock_irqsave(&iovad->iova_rbtree_lock, flags); 474 474 iova = private_find_iova(iovad, pfn); 475 - if (iova) 476 - private_free_iova(iovad, iova); 475 + if (!iova) { 476 + spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags); 477 + return; 478 + } 479 + remove_iova(iovad, iova); 477 480 spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags); 478 - 481 + free_iova_mem(iova); 479 482 } 480 483 EXPORT_SYMBOL_GPL(free_iova); 481 484 ··· 828 825 if (WARN_ON(!iova)) 829 826 continue; 830 827 831 - private_free_iova(iovad, iova); 828 + remove_iova(iovad, iova); 829 + free_iova_mem(iova); 832 830 } 833 831 834 832 spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-1
drivers/iommu/ipmmu-vmsa.c
··· 19 19 #include <linux/iommu.h> 20 20 #include <linux/of.h> 21 21 #include <linux/of_device.h> 22 - #include <linux/of_iommu.h> 23 22 #include <linux/of_platform.h> 24 23 #include <linux/platform_device.h> 25 24 #include <linux/sizes.h>
-1
drivers/iommu/msm_iommu.c
··· 18 18 #include <linux/iommu.h> 19 19 #include <linux/clk.h> 20 20 #include <linux/err.h> 21 - #include <linux/of_iommu.h> 22 21 23 22 #include <asm/cacheflush.h> 24 23 #include <linux/sizes.h>
-1
drivers/iommu/mtk_iommu.c
··· 19 19 #include <linux/mfd/syscon.h> 20 20 #include <linux/module.h> 21 21 #include <linux/of_address.h> 22 - #include <linux/of_iommu.h> 23 22 #include <linux/of_irq.h> 24 23 #include <linux/of_platform.h> 25 24 #include <linux/platform_device.h>
-1
drivers/iommu/mtk_iommu_v1.c
··· 22 22 #include <linux/list.h> 23 23 #include <linux/module.h> 24 24 #include <linux/of_address.h> 25 - #include <linux/of_iommu.h> 26 25 #include <linux/of_irq.h> 27 26 #include <linux/of_platform.h> 28 27 #include <linux/platform_device.h>
-68
drivers/iommu/of_iommu.c
··· 19 19 20 20 #define NO_IOMMU 1 21 21 22 - /** 23 - * of_get_dma_window - Parse *dma-window property and returns 0 if found. 24 - * 25 - * @dn: device node 26 - * @prefix: prefix for property name if any 27 - * @index: index to start to parse 28 - * @busno: Returns busno if supported. Otherwise pass NULL 29 - * @addr: Returns address that DMA starts 30 - * @size: Returns the range that DMA can handle 31 - * 32 - * This supports different formats flexibly. "prefix" can be 33 - * configured if any. "busno" and "index" are optionally 34 - * specified. Set 0(or NULL) if not used. 35 - */ 36 - int of_get_dma_window(struct device_node *dn, const char *prefix, int index, 37 - unsigned long *busno, dma_addr_t *addr, size_t *size) 38 - { 39 - const __be32 *dma_window, *end; 40 - int bytes, cur_index = 0; 41 - char propname[NAME_MAX], addrname[NAME_MAX], sizename[NAME_MAX]; 42 - 43 - if (!dn || !addr || !size) 44 - return -EINVAL; 45 - 46 - if (!prefix) 47 - prefix = ""; 48 - 49 - snprintf(propname, sizeof(propname), "%sdma-window", prefix); 50 - snprintf(addrname, sizeof(addrname), "%s#dma-address-cells", prefix); 51 - snprintf(sizename, sizeof(sizename), "%s#dma-size-cells", prefix); 52 - 53 - dma_window = of_get_property(dn, propname, &bytes); 54 - if (!dma_window) 55 - return -ENODEV; 56 - end = dma_window + bytes / sizeof(*dma_window); 57 - 58 - while (dma_window < end) { 59 - u32 cells; 60 - const void *prop; 61 - 62 - /* busno is one cell if supported */ 63 - if (busno) 64 - *busno = be32_to_cpup(dma_window++); 65 - 66 - prop = of_get_property(dn, addrname, NULL); 67 - if (!prop) 68 - prop = of_get_property(dn, "#address-cells", NULL); 69 - 70 - cells = prop ? be32_to_cpup(prop) : of_n_addr_cells(dn); 71 - if (!cells) 72 - return -EINVAL; 73 - *addr = of_read_number(dma_window, cells); 74 - dma_window += cells; 75 - 76 - prop = of_get_property(dn, sizename, NULL); 77 - cells = prop ? 
be32_to_cpup(prop) : of_n_size_cells(dn); 78 - if (!cells) 79 - return -EINVAL; 80 - *size = of_read_number(dma_window, cells); 81 - dma_window += cells; 82 - 83 - if (cur_index++ == index) 84 - break; 85 - } 86 - return 0; 87 - } 88 - EXPORT_SYMBOL_GPL(of_get_dma_window); 89 - 90 22 static int of_iommu_xlate(struct device *dev, 91 23 struct of_phandle_args *iommu_spec) 92 24 {
-1
drivers/iommu/omap-iommu.c
··· 22 22 #include <linux/io.h> 23 23 #include <linux/pm_runtime.h> 24 24 #include <linux/of.h> 25 - #include <linux/of_iommu.h> 26 25 #include <linux/of_irq.h> 27 26 #include <linux/of_platform.h> 28 27 #include <linux/regmap.h>
+149 -26
drivers/iommu/rockchip-iommu.c
··· 21 21 #include <linux/mm.h> 22 22 #include <linux/init.h> 23 23 #include <linux/of.h> 24 - #include <linux/of_iommu.h> 25 24 #include <linux/of_platform.h> 26 25 #include <linux/platform_device.h> 27 26 #include <linux/pm_runtime.h> ··· 95 96 "aclk", "iface", 96 97 }; 97 98 99 + struct rk_iommu_ops { 100 + phys_addr_t (*pt_address)(u32 dte); 101 + u32 (*mk_dtentries)(dma_addr_t pt_dma); 102 + u32 (*mk_ptentries)(phys_addr_t page, int prot); 103 + phys_addr_t (*dte_addr_phys)(u32 addr); 104 + u32 (*dma_addr_dte)(dma_addr_t dt_dma); 105 + u64 dma_bit_mask; 106 + }; 107 + 98 108 struct rk_iommu { 99 109 struct device *dev; 100 110 void __iomem **bases; ··· 124 116 }; 125 117 126 118 static struct device *dma_dev; 119 + static const struct rk_iommu_ops *rk_ops; 127 120 128 121 static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, 129 122 unsigned int count) ··· 188 179 return (phys_addr_t)dte & RK_DTE_PT_ADDRESS_MASK; 189 180 } 190 181 182 + /* 183 + * In v2: 184 + * 31:12 - PT address bit 31:0 185 + * 11: 8 - PT address bit 35:32 186 + * 7: 4 - PT address bit 39:36 187 + * 3: 1 - Reserved 188 + * 0 - 1 if PT @ PT address is valid 189 + */ 190 + #define RK_DTE_PT_ADDRESS_MASK_V2 GENMASK_ULL(31, 4) 191 + #define DTE_HI_MASK1 GENMASK(11, 8) 192 + #define DTE_HI_MASK2 GENMASK(7, 4) 193 + #define DTE_HI_SHIFT1 24 /* shift bit 8 to bit 32 */ 194 + #define DTE_HI_SHIFT2 32 /* shift bit 4 to bit 36 */ 195 + #define PAGE_DESC_HI_MASK1 GENMASK_ULL(39, 36) 196 + #define PAGE_DESC_HI_MASK2 GENMASK_ULL(35, 32) 197 + 198 + static inline phys_addr_t rk_dte_pt_address_v2(u32 dte) 199 + { 200 + u64 dte_v2 = dte; 201 + 202 + dte_v2 = ((dte_v2 & DTE_HI_MASK2) << DTE_HI_SHIFT2) | 203 + ((dte_v2 & DTE_HI_MASK1) << DTE_HI_SHIFT1) | 204 + (dte_v2 & RK_DTE_PT_ADDRESS_MASK); 205 + 206 + return (phys_addr_t)dte_v2; 207 + } 208 + 191 209 static inline bool rk_dte_is_pt_valid(u32 dte) 192 210 { 193 211 return dte & RK_DTE_PT_VALID; ··· 223 187 static inline u32 
rk_mk_dte(dma_addr_t pt_dma) 224 188 { 225 189 return (pt_dma & RK_DTE_PT_ADDRESS_MASK) | RK_DTE_PT_VALID; 190 + } 191 + 192 + static inline u32 rk_mk_dte_v2(dma_addr_t pt_dma) 193 + { 194 + pt_dma = (pt_dma & RK_DTE_PT_ADDRESS_MASK) | 195 + ((pt_dma & PAGE_DESC_HI_MASK1) >> DTE_HI_SHIFT1) | 196 + (pt_dma & PAGE_DESC_HI_MASK2) >> DTE_HI_SHIFT2; 197 + 198 + return (pt_dma & RK_DTE_PT_ADDRESS_MASK_V2) | RK_DTE_PT_VALID; 226 199 } 227 200 228 201 /* ··· 260 215 #define RK_PTE_PAGE_READABLE BIT(1) 261 216 #define RK_PTE_PAGE_VALID BIT(0) 262 217 263 - static inline phys_addr_t rk_pte_page_address(u32 pte) 264 - { 265 - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; 266 - } 267 - 268 218 static inline bool rk_pte_is_page_valid(u32 pte) 269 219 { 270 220 return pte & RK_PTE_PAGE_VALID; ··· 273 233 flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE : 0; 274 234 page &= RK_PTE_PAGE_ADDRESS_MASK; 275 235 return page | flags | RK_PTE_PAGE_VALID; 236 + } 237 + 238 + /* 239 + * In v2: 240 + * 31:12 - Page address bit 31:0 241 + * 11:9 - Page address bit 34:32 242 + * 8:4 - Page address bit 39:35 243 + * 3 - Security 244 + * 2 - Readable 245 + * 1 - Writable 246 + * 0 - 1 if Page @ Page address is valid 247 + */ 248 + #define RK_PTE_PAGE_READABLE_V2 BIT(2) 249 + #define RK_PTE_PAGE_WRITABLE_V2 BIT(1) 250 + 251 + static u32 rk_mk_pte_v2(phys_addr_t page, int prot) 252 + { 253 + u32 flags = 0; 254 + 255 + flags |= (prot & IOMMU_READ) ? RK_PTE_PAGE_READABLE_V2 : 0; 256 + flags |= (prot & IOMMU_WRITE) ? RK_PTE_PAGE_WRITABLE_V2 : 0; 257 + 258 + return rk_mk_dte_v2(page) | flags; 276 259 } 277 260 278 261 static u32 rk_mk_pte_invalid(u32 pte) ··· 511 448 * and verifying that upper 5 nybbles are read back. 
512 449 */ 513 450 for (i = 0; i < iommu->num_mmu; i++) { 514 - rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); 451 + dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY); 452 + rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr); 515 453 516 - dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); 517 - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { 454 + if (dte_addr != rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR)) { 518 455 dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); 519 456 return -EFAULT; 520 457 } ··· 531 468 } 532 469 533 470 return 0; 471 + } 472 + 473 + static inline phys_addr_t rk_dte_addr_phys(u32 addr) 474 + { 475 + return (phys_addr_t)addr; 476 + } 477 + 478 + static inline u32 rk_dma_addr_dte(dma_addr_t dt_dma) 479 + { 480 + return dt_dma; 481 + } 482 + 483 + #define DT_HI_MASK GENMASK_ULL(39, 32) 484 + #define DT_SHIFT 28 485 + 486 + static inline phys_addr_t rk_dte_addr_phys_v2(u32 addr) 487 + { 488 + return (phys_addr_t)(addr & RK_DTE_PT_ADDRESS_MASK) | 489 + ((addr & DT_HI_MASK) << DT_SHIFT); 490 + } 491 + 492 + static inline u32 rk_dma_addr_dte_v2(dma_addr_t dt_dma) 493 + { 494 + return (dt_dma & RK_DTE_PT_ADDRESS_MASK) | 495 + ((dt_dma & DT_HI_MASK) >> DT_SHIFT); 534 496 } 535 497 536 498 static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) ··· 577 489 page_offset = rk_iova_page_offset(iova); 578 490 579 491 mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); 580 - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; 492 + mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr); 581 493 582 494 dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); 583 495 dte_addr = phys_to_virt(dte_addr_phys); ··· 586 498 if (!rk_dte_is_pt_valid(dte)) 587 499 goto print_it; 588 500 589 - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); 501 + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); 590 502 pte_addr = phys_to_virt(pte_addr_phys); 591 503 pte = 
*pte_addr; 592 504 593 505 if (!rk_pte_is_page_valid(pte)) 594 506 goto print_it; 595 507 596 - page_addr_phys = rk_pte_page_address(pte) + page_offset; 508 + page_addr_phys = rk_ops->pt_address(pte) + page_offset; 597 509 page_flags = pte & RK_PTE_PAGE_FLAGS_MASK; 598 510 599 511 print_it: ··· 689 601 if (!rk_dte_is_pt_valid(dte)) 690 602 goto out; 691 603 692 - pt_phys = rk_dte_pt_address(dte); 604 + pt_phys = rk_ops->pt_address(dte); 693 605 page_table = (u32 *)phys_to_virt(pt_phys); 694 606 pte = page_table[rk_iova_pte_index(iova)]; 695 607 if (!rk_pte_is_page_valid(pte)) 696 608 goto out; 697 609 698 - phys = rk_pte_page_address(pte) + rk_iova_page_offset(iova); 610 + phys = rk_ops->pt_address(pte) + rk_iova_page_offset(iova); 699 611 out: 700 612 spin_unlock_irqrestore(&rk_domain->dt_lock, flags); 701 613 ··· 767 679 return ERR_PTR(-ENOMEM); 768 680 } 769 681 770 - dte = rk_mk_dte(pt_dma); 682 + dte = rk_ops->mk_dtentries(pt_dma); 771 683 *dte_addr = dte; 772 684 773 - rk_table_flush(rk_domain, pt_dma, NUM_PT_ENTRIES); 774 685 rk_table_flush(rk_domain, 775 686 rk_domain->dt_dma + dte_index * sizeof(u32), 1); 776 687 done: 777 - pt_phys = rk_dte_pt_address(dte); 688 + pt_phys = rk_ops->pt_address(dte); 778 689 return (u32 *)phys_to_virt(pt_phys); 779 690 } 780 691 ··· 815 728 if (rk_pte_is_page_valid(pte)) 816 729 goto unwind; 817 730 818 - pte_addr[pte_count] = rk_mk_pte(paddr, prot); 731 + pte_addr[pte_count] = rk_ops->mk_ptentries(paddr, prot); 819 732 820 733 paddr += SPAGE_SIZE; 821 734 } ··· 837 750 pte_count * SPAGE_SIZE); 838 751 839 752 iova += pte_count * SPAGE_SIZE; 840 - page_phys = rk_pte_page_address(pte_addr[pte_count]); 753 + page_phys = rk_ops->pt_address(pte_addr[pte_count]); 841 754 pr_err("iova: %pad already mapped to %pa cannot remap to phys: %pa prot: %#x\n", 842 755 &iova, &page_phys, &paddr, prot); 843 756 ··· 872 785 dte_index = rk_domain->dt[rk_iova_dte_index(iova)]; 873 786 pte_index = rk_iova_pte_index(iova); 874 787 pte_addr = 
&page_table[pte_index]; 875 - pte_dma = rk_dte_pt_address(dte_index) + pte_index * sizeof(u32); 788 + 789 + pte_dma = rk_ops->pt_address(dte_index) + pte_index * sizeof(u32); 876 790 ret = rk_iommu_map_iova(rk_domain, pte_addr, pte_dma, iova, 877 791 paddr, size, prot); 878 792 ··· 909 821 return 0; 910 822 } 911 823 912 - pt_phys = rk_dte_pt_address(dte); 824 + pt_phys = rk_ops->pt_address(dte); 913 825 pte_addr = (u32 *)phys_to_virt(pt_phys) + rk_iova_pte_index(iova); 914 826 pte_dma = pt_phys + rk_iova_pte_index(iova) * sizeof(u32); 915 827 unmap_size = rk_iommu_unmap_iova(rk_domain, pte_addr, pte_dma, size); ··· 967 879 968 880 for (i = 0; i < iommu->num_mmu; i++) { 969 881 rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, 970 - rk_domain->dt_dma); 882 + rk_ops->dma_addr_dte(rk_domain->dt_dma)); 971 883 rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE); 972 884 rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK); 973 885 } ··· 1092 1004 goto err_free_dt; 1093 1005 } 1094 1006 1095 - rk_table_flush(rk_domain, rk_domain->dt_dma, NUM_DT_ENTRIES); 1096 - 1097 1007 spin_lock_init(&rk_domain->iommus_lock); 1098 1008 spin_lock_init(&rk_domain->dt_lock); 1099 1009 INIT_LIST_HEAD(&rk_domain->iommus); ··· 1123 1037 for (i = 0; i < NUM_DT_ENTRIES; i++) { 1124 1038 u32 dte = rk_domain->dt[i]; 1125 1039 if (rk_dte_is_pt_valid(dte)) { 1126 - phys_addr_t pt_phys = rk_dte_pt_address(dte); 1040 + phys_addr_t pt_phys = rk_ops->pt_address(dte); 1127 1041 u32 *page_table = phys_to_virt(pt_phys); 1128 1042 dma_unmap_single(dma_dev, pt_phys, 1129 1043 SPAGE_SIZE, DMA_TO_DEVICE); ··· 1213 1127 struct device *dev = &pdev->dev; 1214 1128 struct rk_iommu *iommu; 1215 1129 struct resource *res; 1130 + const struct rk_iommu_ops *ops; 1216 1131 int num_res = pdev->num_resources; 1217 1132 int err, i; 1218 1133 ··· 1224 1137 platform_set_drvdata(pdev, iommu); 1225 1138 iommu->dev = dev; 1226 1139 iommu->num_mmu = 0; 1140 + 1141 + ops = of_device_get_match_data(dev); 
1142 + if (!rk_ops) 1143 + rk_ops = ops; 1144 + 1145 + /* 1146 + * That should not happen unless different versions of the 1147 + * hardware block are embedded in the same SoC 1148 + */ 1149 + if (WARN_ON(rk_ops != ops)) 1150 + return -EINVAL; 1227 1151 1228 1152 iommu->bases = devm_kcalloc(dev, num_res, sizeof(*iommu->bases), 1229 1153 GFP_KERNEL); ··· 1324 1226 } 1325 1227 } 1326 1228 1229 + dma_set_mask_and_coherent(dev, rk_ops->dma_bit_mask); 1230 + 1327 1231 return 0; 1328 1232 err_remove_sysfs: 1329 1233 iommu_device_sysfs_remove(&iommu->iommu); ··· 1377 1277 pm_runtime_force_resume) 1378 1278 }; 1379 1279 1280 + static struct rk_iommu_ops iommu_data_ops_v1 = { 1281 + .pt_address = &rk_dte_pt_address, 1282 + .mk_dtentries = &rk_mk_dte, 1283 + .mk_ptentries = &rk_mk_pte, 1284 + .dte_addr_phys = &rk_dte_addr_phys, 1285 + .dma_addr_dte = &rk_dma_addr_dte, 1286 + .dma_bit_mask = DMA_BIT_MASK(32), 1287 + }; 1288 + 1289 + static struct rk_iommu_ops iommu_data_ops_v2 = { 1290 + .pt_address = &rk_dte_pt_address_v2, 1291 + .mk_dtentries = &rk_mk_dte_v2, 1292 + .mk_ptentries = &rk_mk_pte_v2, 1293 + .dte_addr_phys = &rk_dte_addr_phys_v2, 1294 + .dma_addr_dte = &rk_dma_addr_dte_v2, 1295 + .dma_bit_mask = DMA_BIT_MASK(40), 1296 + }; 1297 + 1380 1298 static const struct of_device_id rk_iommu_dt_ids[] = { 1381 - { .compatible = "rockchip,iommu" }, 1299 + { .compatible = "rockchip,iommu", 1300 + .data = &iommu_data_ops_v1, 1301 + }, 1302 + { .compatible = "rockchip,rk3568-iommu", 1303 + .data = &iommu_data_ops_v2, 1304 + }, 1382 1305 { /* sentinel */ } 1383 1306 };
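The v2 descriptor format in this hunk squeezes a 40-bit page-table address into a 32-bit DTE: bits 31:12 carry address bits 31:12 (tables are 4 KiB aligned), bits 11:8 carry address bits 35:32, and bits 7:4 carry address bits 39:36. A standalone sketch of that bit layout, with the kernel's GENMASK_ULL()-based masks expanded into plain constants (the function names here are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 4 KiB-aligned 40-bit physical address into a v2 DTE:
 * bits 31:12 <- addr 31:12, 11:8 <- addr 35:32, 7:4 <- addr 39:36 */
static uint32_t pack_dte_v2(uint64_t pt_addr)
{
	return (uint32_t)((pt_addr & 0xfffff000ULL) |
			  ((pt_addr >> 24) & 0xf00ULL) |  /* 35:32 -> 11:8 */
			  ((pt_addr >> 32) & 0x0f0ULL) |  /* 39:36 -> 7:4  */
			  0x1ULL);                        /* valid bit     */
}

/* Inverse transform, mirroring rk_dte_pt_address_v2() in the diff */
static uint64_t unpack_dte_v2(uint32_t dte)
{
	return (uint64_t)(dte & 0xfffff000u) |
	       (((uint64_t)dte & 0xf00u) << 24) |  /* 11:8 -> 35:32 */
	       (((uint64_t)dte & 0x0f0u) << 32);   /* 7:4  -> 39:36 */
}
```

Note that the merged rk_mk_dte_v2() pairs PAGE_DESC_HI_MASK1 (bits 39:36) with the 24-bit shift and PAGE_DESC_HI_MASK2 (bits 35:32) with the 32-bit shift, the opposite pairing from what the layout comment implies; the sketch above follows the commented layout.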
+11 -1
drivers/iommu/virtio-iommu.c
··· 10 10 #include <linux/amba/bus.h> 11 11 #include <linux/delay.h> 12 12 #include <linux/dma-iommu.h> 13 + #include <linux/dma-map-ops.h> 13 14 #include <linux/freezer.h> 14 15 #include <linux/interval_tree.h> 15 16 #include <linux/iommu.h> 16 17 #include <linux/module.h> 17 - #include <linux/of_iommu.h> 18 18 #include <linux/of_platform.h> 19 19 #include <linux/pci.h> 20 20 #include <linux/platform_device.h> ··· 904 904 return ERR_PTR(ret); 905 905 } 906 906 907 + static void viommu_probe_finalize(struct device *dev) 908 + { 909 + #ifndef CONFIG_ARCH_HAS_SETUP_DMA_OPS 910 + /* First clear the DMA ops in case we're switching from a DMA domain */ 911 + set_dma_ops(dev, NULL); 912 + iommu_setup_dma_ops(dev, 0, U64_MAX); 913 + #endif 914 + } 915 + 907 916 static void viommu_release_device(struct device *dev) 908 917 { 909 918 struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); ··· 949 940 .iova_to_phys = viommu_iova_to_phys, 950 941 .iotlb_sync = viommu_iotlb_sync, 951 942 .probe_device = viommu_probe_device, 943 + .probe_finalize = viommu_probe_finalize, 952 944 .release_device = viommu_release_device, 953 945 .device_group = viommu_device_group, 954 946 .get_resv_regions = viommu_get_resv_regions,
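viommu's new probe_finalize implementation depends on the core calling it only after ->probe_device and default-domain attachment, at which point it is safe to swap the device's DMA ops. A toy model of that ordering contract (every name here is invented for illustration; nothing in this sketch is kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the iommu core's probe sequence: the driver's
 * probe_finalize hook must run only after the default domain exists. */
struct toy_iommu_ops {
	void (*probe_device)(int *log, int *n);
	void (*probe_finalize)(int *log, int *n);   /* optional hook */
};

static void toy_probe(int *log, int *n)    { log[(*n)++] = 1; }
static void toy_finalize(int *log, int *n) { log[(*n)++] = 3; }

static void toy_core_probe(const struct toy_iommu_ops *ops, int *log, int *n)
{
	ops->probe_device(log, n);
	log[(*n)++] = 2;                 /* core attaches default domain */
	if (ops->probe_finalize)         /* hook may be left NULL */
		ops->probe_finalize(log, n);
}
```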
-1
drivers/of/platform.c
··· 17 17 #include <linux/slab.h> 18 18 #include <linux/of_address.h> 19 19 #include <linux/of_device.h> 20 - #include <linux/of_iommu.h> 21 20 #include <linux/of_irq.h> 22 21 #include <linux/of_platform.h> 23 22 #include <linux/platform_device.h>
+3
include/acpi/acpi_bus.h
··· 592 592 593 593 bool acpi_dma_supported(const struct acpi_device *adev); 594 594 enum dev_dma_attr acpi_get_dma_attr(struct acpi_device *adev); 595 + int acpi_iommu_fwspec_init(struct device *dev, u32 id, 596 + struct fwnode_handle *fwnode, 597 + const struct iommu_ops *ops); 595 598 int acpi_dma_get_range(struct device *dev, u64 *dma_addr, u64 *offset, 596 599 u64 *size); 597 600 int acpi_dma_configure_id(struct device *dev, enum dev_dma_attr attr,
+3
include/linux/acpi.h
··· 260 260 261 261 #ifdef CONFIG_ARM64 262 262 void acpi_numa_gicc_affinity_init(struct acpi_srat_gicc_affinity *pa); 263 + void acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size); 263 264 #else 264 265 static inline void 265 266 acpi_numa_gicc_affinity_init(struct acpi_srat_gicc_affinity *pa) { } 267 + static inline void 268 + acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size) { } 266 269 #endif 267 270 268 271 int acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
+6 -8
include/linux/acpi_iort.h
··· 34 34 void acpi_configure_pmsi_domain(struct device *dev); 35 35 int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id); 36 36 /* IOMMU interface */ 37 - void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *size); 38 - const struct iommu_ops *iort_iommu_configure_id(struct device *dev, 39 - const u32 *id_in); 37 + int iort_dma_get_ranges(struct device *dev, u64 *size); 38 + int iort_iommu_configure_id(struct device *dev, const u32 *id_in); 40 39 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head); 41 40 phys_addr_t acpi_iort_dma_get_max_cpu_address(void); 42 41 #else ··· 47 48 { return NULL; } 48 49 static inline void acpi_configure_pmsi_domain(struct device *dev) { } 49 50 /* IOMMU interface */ 50 - static inline void iort_dma_setup(struct device *dev, u64 *dma_addr, 51 - u64 *size) { } 52 - static inline const struct iommu_ops *iort_iommu_configure_id( 53 - struct device *dev, const u32 *id_in) 54 - { return NULL; } 51 + static inline int iort_dma_get_ranges(struct device *dev, u64 *size) 52 + { return -ENODEV; } 53 + static inline int iort_iommu_configure_id(struct device *dev, const u32 *id_in) 54 + { return -ENODEV; } 55 55 static inline 56 56 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head) 57 57 { return 0; }
+19
include/linux/acpi_viot.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only */ 2 + 3 + #ifndef __ACPI_VIOT_H__ 4 + #define __ACPI_VIOT_H__ 5 + 6 + #include <linux/acpi.h> 7 + 8 + #ifdef CONFIG_ACPI_VIOT 9 + void __init acpi_viot_init(void); 10 + int viot_iommu_configure(struct device *dev); 11 + #else 12 + static inline void acpi_viot_init(void) {} 13 + static inline int viot_iommu_configure(struct device *dev) 14 + { 15 + return -ENODEV; 16 + } 17 + #endif 18 + 19 + #endif /* __ACPI_VIOT_H__ */
+2 -2
include/linux/dma-iommu.h
··· 19 19 void iommu_put_dma_cookie(struct iommu_domain *domain); 20 20 21 21 /* Setup call for arch DMA mapping code */ 22 - void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size); 22 + void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit); 23 23 24 24 /* The DMA API isn't _quite_ the whole story, though... */ 25 25 /* ··· 50 50 struct device; 51 51 52 52 static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base, 53 - u64 size) 53 + u64 dma_limit) 54 54 { 55 55 } 56 56
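This signature change is the "Pass address limit rather than size" commit from the merge: an inclusive limit can express a window reaching the top of the 64-bit space (viommu_probe_finalize() passes U64_MAX directly), which a size argument cannot without overflowing. A sketch of the conversion a caller would perform (the helper name is invented for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a (base, size) DMA window into the inclusive limit now taken
 * by iommu_setup_dma_ops(); a whole-address-space window is simply
 * UINT64_MAX, which base + size cannot represent without overflow. */
static uint64_t dma_limit_from_size(uint64_t base, uint64_t size)
{
	return base + size - 1;   /* inclusive upper bound */
}
```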
+37 -7
include/linux/intel-iommu.h
··· 537 537 struct dmar_domain { 538 538 int nid; /* node id */ 539 539 540 - unsigned iommu_refcnt[DMAR_UNITS_SUPPORTED]; 540 + unsigned int iommu_refcnt[DMAR_UNITS_SUPPORTED]; 541 541 /* Refcount of devices per iommu */ 542 542 543 543 ··· 546 546 * domain ids are 16 bit wide according 547 547 * to VT-d spec, section 9.3 */ 548 548 549 - bool has_iotlb_device; 549 + u8 has_iotlb_device: 1; 550 + u8 iommu_coherency: 1; /* indicate coherency of iommu access */ 551 + u8 iommu_snooping: 1; /* indicate snooping control feature */ 552 + 550 553 struct list_head devices; /* all devices' list */ 551 554 struct list_head subdevices; /* all subdevices' list */ 552 555 struct iova_domain iovad; /* iova's that belong to this domain */ ··· 561 558 int agaw; 562 559 563 560 int flags; /* flags to find out type of domain */ 564 - 565 - int iommu_coherency;/* indicate coherency of iommu access */ 566 - int iommu_snooping; /* indicate snooping control feature*/ 567 - int iommu_count; /* reference count of iommu */ 568 561 int iommu_superpage;/* Level of superpages supported: 569 562 0 == 4KiB (no superpages), 1 == 2MiB, 570 563 2 == 1GiB, 3 == 512GiB, 4 == 1TiB */ ··· 605 606 struct completion prq_complete; 606 607 struct ioasid_allocator_ops pasid_allocator; /* Custom allocator for PASIDs */ 607 608 #endif 609 + struct iopf_queue *iopf_queue; 610 + unsigned char iopfq_name[16]; 608 611 struct q_inval *qi; /* Queued invalidation info */ 609 612 u32 *iommu_state; /* Store iommu states between suspend and resume.*/ 610 613 ··· 620 619 u32 flags; /* Software defined flags */ 621 620 622 621 struct dmar_drhd_unit *drhd; 622 + void *perf_statistic; 623 623 }; 624 624 625 625 /* Per subdevice private data */ ··· 778 776 struct device *dev; 779 777 struct intel_iommu *iommu; 780 778 struct iommu_sva sva; 779 + unsigned long prq_seq_number; 781 780 u32 pasid; 782 781 int users; 783 782 u16 did; ··· 794 791 u32 pasid; 795 792 int gpasid; /* In case that guest PASID is different from host 
PASID */ 796 793 struct list_head devs; 797 - struct list_head list; 798 794 }; 799 795 #else 800 796 static inline void intel_svm_check(struct intel_iommu *iommu) {} ··· 828 826 #define dmar_disabled (1) 829 827 #define intel_iommu_enabled (0) 830 828 #endif 829 + 830 + static inline const char *decode_prq_descriptor(char *str, size_t size, 831 + u64 dw0, u64 dw1, u64 dw2, u64 dw3) 832 + { 833 + char *buf = str; 834 + int bytes; 835 + 836 + bytes = snprintf(buf, size, 837 + "rid=0x%llx addr=0x%llx %c%c%c%c%c pasid=0x%llx index=0x%llx", 838 + FIELD_GET(GENMASK_ULL(31, 16), dw0), 839 + FIELD_GET(GENMASK_ULL(63, 12), dw1), 840 + dw1 & BIT_ULL(0) ? 'r' : '-', 841 + dw1 & BIT_ULL(1) ? 'w' : '-', 842 + dw0 & BIT_ULL(52) ? 'x' : '-', 843 + dw0 & BIT_ULL(53) ? 'p' : '-', 844 + dw1 & BIT_ULL(2) ? 'l' : '-', 845 + FIELD_GET(GENMASK_ULL(51, 32), dw0), 846 + FIELD_GET(GENMASK_ULL(11, 3), dw1)); 847 + 848 + /* Private Data */ 849 + if (dw0 & BIT_ULL(9)) { 850 + size -= bytes; 851 + buf += bytes; 852 + snprintf(buf, size, " private=0x%llx/0x%llx\n", dw2, dw3); 853 + } 854 + 855 + return str; 856 + } 831 857 832 858 #endif
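The dmar_domain change above replaces three word-sized members (has_iotlb_device, iommu_coherency, iommu_snooping) with single-bit u8 bitfields that share one byte. A standalone before/after sketch of the packing (the struct names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

struct caps_wide {                   /* old layout: a word per flag */
	int has_iotlb_device;
	int iommu_coherency;
	int iommu_snooping;
};

struct caps_packed {                 /* new layout: 1-bit fields share a byte */
	uint8_t has_iotlb_device : 1;
	uint8_t iommu_coherency  : 1;
	uint8_t iommu_snooping   : 1;
};
```

Each bitfield still reads back as an independent 0/1 value; only the storage shrinks.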
+3 -14
include/linux/of_iommu.h
··· 2 2 #ifndef __OF_IOMMU_H 3 3 #define __OF_IOMMU_H 4 4 5 - #include <linux/device.h> 6 - #include <linux/iommu.h> 7 - #include <linux/of.h> 5 + struct device; 6 + struct device_node; 7 + struct iommu_ops; 8 8 9 9 #ifdef CONFIG_OF_IOMMU 10 - 11 - extern int of_get_dma_window(struct device_node *dn, const char *prefix, 12 - int index, unsigned long *busno, dma_addr_t *addr, 13 - size_t *size); 14 10 15 11 extern const struct iommu_ops *of_iommu_configure(struct device *dev, 16 12 struct device_node *master_np, 17 13 const u32 *id); 18 14 19 15 #else 20 - 21 - static inline int of_get_dma_window(struct device_node *dn, const char *prefix, 22 - int index, unsigned long *busno, dma_addr_t *addr, 23 - size_t *size) 24 - { 25 - return -EINVAL; 26 - } 27 16 28 17 static inline const struct iommu_ops *of_iommu_configure(struct device *dev, 29 18 struct device_node *master_np,
+37
include/trace/events/intel_iommu.h
··· 15 15 #include <linux/tracepoint.h> 16 16 #include <linux/intel-iommu.h> 17 17 18 + #define MSG_MAX 256 19 + 18 20 TRACE_EVENT(qi_submit, 19 21 TP_PROTO(struct intel_iommu *iommu, u64 qw0, u64 qw1, u64 qw2, u64 qw3), 20 22 ··· 51 49 { QI_PGRP_RESP_TYPE, "page_grp_resp" }), 52 50 __get_str(iommu), 53 51 __entry->qw0, __entry->qw1, __entry->qw2, __entry->qw3 52 + ) 53 + ); 54 + 55 + TRACE_EVENT(prq_report, 56 + TP_PROTO(struct intel_iommu *iommu, struct device *dev, 57 + u64 dw0, u64 dw1, u64 dw2, u64 dw3, 58 + unsigned long seq), 59 + 60 + TP_ARGS(iommu, dev, dw0, dw1, dw2, dw3, seq), 61 + 62 + TP_STRUCT__entry( 63 + __field(u64, dw0) 64 + __field(u64, dw1) 65 + __field(u64, dw2) 66 + __field(u64, dw3) 67 + __field(unsigned long, seq) 68 + __string(iommu, iommu->name) 69 + __string(dev, dev_name(dev)) 70 + __dynamic_array(char, buff, MSG_MAX) 71 + ), 72 + 73 + TP_fast_assign( 74 + __entry->dw0 = dw0; 75 + __entry->dw1 = dw1; 76 + __entry->dw2 = dw2; 77 + __entry->dw3 = dw3; 78 + __entry->seq = seq; 79 + __assign_str(iommu, iommu->name); 80 + __assign_str(dev, dev_name(dev)); 81 + ), 82 + 83 + TP_printk("%s/%s seq# %ld: %s", 84 + __get_str(iommu), __get_str(dev), __entry->seq, 85 + decode_prq_descriptor(__get_str(buff), MSG_MAX, __entry->dw0, 86 + __entry->dw1, __entry->dw2, __entry->dw3) 54 87 ) 55 88 ); 56 89 #endif /* _TRACE_INTEL_IOMMU_H */
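The prq_report tracepoint formats the four page-request qwords through decode_prq_descriptor(), which is built from FIELD_GET()/GENMASK_ULL() extractions. A userspace re-implementation of the core extraction, with those helpers expanded by hand (field positions copied from the diff; only a subset of the fields is decoded in this sketch):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hand-rolled equivalents of the kernel's GENMASK_ULL()/FIELD_GET() */
#define GENMASK_U64(h, l)   ((~0ULL >> (63 - (h))) & (~0ULL << (l)))
#define FIELD_GET_U64(m, v) (((v) & (m)) >> __builtin_ctzll(m))

/* Decode rid/addr/r/w/pasid from a page request descriptor, mirroring
 * the field layout used by decode_prq_descriptor() above */
static void decode_prq(char *buf, size_t size, uint64_t dw0, uint64_t dw1)
{
	snprintf(buf, size, "rid=0x%llx addr=0x%llx %c%c pasid=0x%llx",
		 (unsigned long long)FIELD_GET_U64(GENMASK_U64(31, 16), dw0),
		 (unsigned long long)FIELD_GET_U64(GENMASK_U64(63, 12), dw1),
		 dw1 & 1ULL ? 'r' : '-',           /* read requested  */
		 dw1 & 2ULL ? 'w' : '-',           /* write requested */
		 (unsigned long long)FIELD_GET_U64(GENMASK_U64(51, 32), dw0));
}
```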