Merge tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu updates from Joerg Roedel:
"Core changes:
- PASID support for the blocked_domain

ARM-SMMU Updates:
- SMMUv2:
  - Implement per-client prefetcher configuration on Qualcomm SoCs
  - Support for the Adreno SMMU on Qualcomm's SDM670 SOC
- SMMUv3:
  - Pretty-printing of event records
  - Drop the ->domain_alloc_paging implementation in favour of
    domain_alloc_paging_flags(flags==0)
- IO-PGTable:
  - Generalisation of the page-table walker to enable external
    walkers (e.g. for debugging unexpected page-faults from the GPU)
  - Minor fix for handling concatenated PGDs at stage-2 with 16KiB
    pages
- Misc:
  - Clean-up device probing and replace the crufty probe-deferral
    hack with a more robust implementation of
    arm_smmu_get_by_fwnode()
  - Device-tree binding updates for a bunch of Qualcomm platforms

Intel VT-d Updates:
- Remove domain_alloc_paging()
- Remove capability audit code
- Draining PRQ in sva unbind path when FPD bit set
- Link cache tags of same iommu unit together

AMD-Vi Updates:
- Use CMPXCHG128 to update DTE
- Cleanups of the domain_alloc_paging() path

RISC-V IOMMU:
- Platform MSI support
- Shutdown support

Rockchip IOMMU:
- Add DT bindings for Rockchip RK3576

More smaller fixes and cleanups"

* tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits)
iommu: Use str_enable_disable-like helpers
iommu/amd: Fully decode all combinations of alloc_paging_flags
iommu/amd: Move the nid to pdom_setup_pgtable()
iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
iommu/amd: Remove dev == NULL checks
iommu/amd: Remove domain_alloc()
iommu/amd: Remove unused amd_iommu_domain_update()
iommu/riscv: Fixup compile warning
iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h
iommu/arm-smmu-v3: Use str_read_write helper w/ logs
iommu/io-pgtable-arm: Add way to debug pgtable walk
iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys
iommu/io-pgtable-arm: Make pgtable walker more generic
iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500
iommu/arm-smmu: Introduce ACTLR custom prefetcher settings
iommu/arm-smmu: Add support for PRR bit setup
iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer
iommu/arm-smmu: Re-enable context caching in smmu reset operation
iommu/vt-d: Link cache tags of same iommu unit together
...

+1311 -1043
+2 -1
Documentation/arch/arm64/silicon-errata.rst
··· 198 198 +----------------+-----------------+-----------------+-----------------------------+ 199 199 | ARM | Neoverse-V3 | #3312417 | ARM64_ERRATUM_3194386 | 200 200 +----------------+-----------------+-----------------+-----------------------------+ 201 - | ARM | MMU-500 | #841119,826419 | N/A | 201 + | ARM | MMU-500 | #841119,826419 | ARM_SMMU_MMU_500_CPRE_ERRATA| 202 + | | | #562869,1047329 | | 202 203 +----------------+-----------------+-----------------+-----------------------------+ 203 204 | ARM | MMU-600 | #1076982,1209401| N/A | 204 205 +----------------+-----------------+-----------------+-----------------------------+
+22 -1
Documentation/devicetree/bindings/iommu/arm,smmu.yaml
··· 61 61 - qcom,sm8450-smmu-500 62 62 - qcom,sm8550-smmu-500 63 63 - qcom,sm8650-smmu-500 64 + - qcom,sm8750-smmu-500 64 65 - qcom,x1e80100-smmu-500 65 66 - const: qcom,smmu-500 66 67 - const: arm,mmu-500 ··· 89 88 items: 90 89 - enum: 91 90 - qcom,qcm2290-smmu-500 91 + - qcom,qcs615-smmu-500 92 92 - qcom,sa8255p-smmu-500 93 93 - qcom,sa8775p-smmu-500 94 94 - qcom,sar2130p-smmu-500 ··· 104 102 - qcom,sm8450-smmu-500 105 103 - qcom,sm8550-smmu-500 106 104 - qcom,sm8650-smmu-500 105 + - qcom,sm8750-smmu-500 107 106 - qcom,x1e80100-smmu-500 108 107 - const: qcom,adreno-smmu 109 108 - const: qcom,smmu-500 ··· 125 122 - qcom,msm8996-smmu-v2 126 123 - qcom,sc7180-smmu-v2 127 124 - qcom,sdm630-smmu-v2 125 + - qcom,sdm670-smmu-v2 128 126 - qcom,sdm845-smmu-v2 129 127 - qcom,sm6350-smmu-v2 130 128 - qcom,sm7150-smmu-v2 ··· 478 474 items: 479 475 - enum: 480 476 - qcom,qcm2290-smmu-500 477 + - qcom,qcs615-smmu-500 481 478 - qcom,sm6115-smmu-500 482 479 - qcom,sm6125-smmu-500 483 480 - const: qcom,adreno-smmu ··· 555 550 - description: GPU SNoC bus clock 556 551 - description: GPU AHB clock 557 552 553 + - if: 554 + properties: 555 + compatible: 556 + items: 557 + - const: qcom,sm8750-smmu-500 558 + - const: qcom,adreno-smmu 559 + - const: qcom,smmu-500 560 + - const: arm,mmu-500 561 + then: 562 + properties: 563 + clock-names: 564 + items: 565 + - const: hlos 566 + clocks: 567 + items: 568 + - description: HLOS vote clock 569 + 558 570 # Disallow clocks for all other platforms with specific compatibles 559 571 - if: 560 572 properties: ··· 581 559 - cavium,smmu-v2 582 560 - marvell,ap806-smmu-500 583 561 - nvidia,smmu-500 584 - - qcom,qcs615-smmu-500 585 562 - qcom,qcs8300-smmu-500 586 563 - qcom,qdu1000-smmu-500 587 564 - qcom,sa8255p-smmu-500
+1
Documentation/devicetree/bindings/iommu/qcom,iommu.yaml
··· 21 21 - items: 22 22 - enum: 23 23 - qcom,msm8916-iommu 24 + - qcom,msm8917-iommu 24 25 - qcom,msm8953-iommu 25 26 - const: qcom,msm-iommu-v1 26 27 - items:
+1
Documentation/devicetree/bindings/iommu/rockchip,iommu.yaml
··· 25 25 - rockchip,rk3568-iommu 26 26 - items: 27 27 - enum: 28 + - rockchip,rk3576-iommu 28 29 - rockchip,rk3588-iommu 29 30 - const: rockchip,rk3568-iommu 30 31
+12
drivers/iommu/Kconfig
··· 367 367 'arm-smmu.disable_bypass' will continue to override this 368 368 config. 369 369 370 + config ARM_SMMU_MMU_500_CPRE_ERRATA 371 + bool "Enable errata workaround for CPRE in SMMU reset path" 372 + depends on ARM_SMMU 373 + default y 374 + help 375 + Say Y here (by default) to apply workaround to disable 376 + MMU-500's next-page prefetcher for sake of 4 known errata. 377 + 378 + Say N here only when it is sure that any errata related to 379 + prefetch enablement are not applicable on the platform. 380 + Refer silicon-errata.rst for info on errata IDs. 381 + 370 382 config ARM_SMMU_QCOM 371 383 def_tristate y 372 384 depends on ARM_SMMU && ARCH_QCOM
+5 -4
drivers/iommu/amd/amd_iommu.h
··· 16 16 irqreturn_t amd_iommu_int_thread_pprlog(int irq, void *data); 17 17 irqreturn_t amd_iommu_int_thread_galog(int irq, void *data); 18 18 irqreturn_t amd_iommu_int_handler(int irq, void *data); 19 - void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid); 20 19 void amd_iommu_restart_log(struct amd_iommu *iommu, const char *evt_type, 21 20 u8 cntrl_intr, u8 cntrl_log, 22 21 u32 status_run_mask, u32 status_overflow_mask); ··· 40 41 int amd_iommu_reenable(int mode); 41 42 int amd_iommu_enable_faulting(unsigned int cpu); 42 43 extern int amd_iommu_guest_ir; 43 - extern enum io_pgtable_fmt amd_iommu_pgtable; 44 + extern enum protection_domain_mode amd_iommu_pgtable; 44 45 extern int amd_iommu_gpt_level; 45 46 extern unsigned long amd_iommu_pgsize_bitmap; 46 47 47 48 /* Protection domain ops */ 48 49 void amd_iommu_init_identity_domain(void); 49 - struct protection_domain *protection_domain_alloc(unsigned int type, int nid); 50 + struct protection_domain *protection_domain_alloc(void); 50 51 void protection_domain_free(struct protection_domain *domain); 51 52 struct iommu_domain *amd_iommu_domain_alloc_sva(struct device *dev, 52 53 struct mm_struct *mm); ··· 88 89 */ 89 90 void amd_iommu_flush_all_caches(struct amd_iommu *iommu); 90 91 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain); 91 - void amd_iommu_domain_update(struct protection_domain *domain); 92 92 void amd_iommu_domain_flush_pages(struct protection_domain *domain, 93 93 u64 address, size_t size); 94 94 void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data, ··· 182 184 struct dev_table_entry *get_dev_table(struct amd_iommu *iommu); 183 185 184 186 #endif 187 + 188 + struct dev_table_entry *amd_iommu_get_ivhd_dte_flags(u16 segid, u16 devid); 189 + struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid);
+30 -11
drivers/iommu/amd/amd_iommu_types.h
··· 220 220 #define DEV_ENTRY_EX 0x67 221 221 #define DEV_ENTRY_SYSMGT1 0x68 222 222 #define DEV_ENTRY_SYSMGT2 0x69 223 + #define DTE_DATA1_SYSMGT_MASK GENMASK_ULL(41, 40) 224 + 223 225 #define DEV_ENTRY_IRQ_TBL_EN 0x80 224 226 #define DEV_ENTRY_INIT_PASS 0xb8 225 227 #define DEV_ENTRY_EINT_PASS 0xb9 ··· 409 407 #define DTE_FLAG_HAD (3ULL << 7) 410 408 #define DTE_FLAG_GIOV BIT_ULL(54) 411 409 #define DTE_FLAG_GV BIT_ULL(55) 412 - #define DTE_GLX_SHIFT (56) 413 - #define DTE_GLX_MASK (3) 410 + #define DTE_GLX GENMASK_ULL(57, 56) 414 411 #define DTE_FLAG_IR BIT_ULL(61) 415 412 #define DTE_FLAG_IW BIT_ULL(62) 416 413 ··· 417 416 #define DTE_FLAG_MASK (0x3ffULL << 32) 418 417 #define DEV_DOMID_MASK 0xffffULL 419 418 420 - #define DTE_GCR3_VAL_A(x) (((x) >> 12) & 0x00007ULL) 421 - #define DTE_GCR3_VAL_B(x) (((x) >> 15) & 0x0ffffULL) 422 - #define DTE_GCR3_VAL_C(x) (((x) >> 31) & 0x1fffffULL) 423 - 424 - #define DTE_GCR3_SHIFT_A 58 425 - #define DTE_GCR3_SHIFT_B 16 426 - #define DTE_GCR3_SHIFT_C 43 419 + #define DTE_GCR3_14_12 GENMASK_ULL(60, 58) 420 + #define DTE_GCR3_30_15 GENMASK_ULL(31, 16) 421 + #define DTE_GCR3_51_31 GENMASK_ULL(63, 43) 427 422 428 423 #define DTE_GPT_LEVEL_SHIFT 54 424 + #define DTE_GPT_LEVEL_MASK GENMASK_ULL(55, 54) 429 425 430 426 #define GCR3_VALID 0x01ULL 427 + 428 + /* DTE[128:179] | DTE[184:191] */ 429 + #define DTE_DATA2_INTR_MASK ~GENMASK_ULL(55, 52) 431 430 432 431 #define IOMMU_PAGE_MASK (((1ULL << 52) - 1) & ~0xfffULL) 433 432 #define IOMMU_PTE_PRESENT(pte) ((pte) & IOMMU_PTE_PR) ··· 469 468 #define DUMP_printk(format, arg...) \ 470 469 do { \ 471 470 if (amd_iommu_dump) \ 472 - pr_info("AMD-Vi: " format, ## arg); \ 471 + pr_info(format, ## arg); \ 473 472 } while(0); 474 473 475 474 /* global flag if IOMMUs cache non-present entries */ ··· 516 515 list_for_each_entry(pdom_dev_data, &pdom->dev_data_list, list) 517 516 #define for_each_pdom_dev_data_safe(pdom_dev_data, next, pdom) \ 518 517 list_for_each_entry_safe((pdom_dev_data), (next), &pdom->dev_data_list, list) 518 + 519 + #define for_each_ivhd_dte_flags(entry) \ 520 + list_for_each_entry((entry), &amd_ivhd_dev_flags_list, list) 519 521 520 522 struct amd_iommu; 521 523 struct iommu_domain; ··· 841 837 struct iommu_dev_data { 842 838 /*Protect against attach/detach races */ 843 839 struct mutex mutex; 840 + spinlock_t dte_lock; /* DTE lock for 256-bit access */ 844 841 845 842 struct list_head list; /* For domain->dev_list */ 846 843 struct llist_node dev_data_list; /* For global dev_data_list */ ··· 886 881 * Structure defining one entry in the device table 887 882 */ 888 883 struct dev_table_entry { 889 - u64 data[4]; 884 + union { 885 + u64 data[4]; 886 + u128 data128[2]; 887 + }; 888 + }; 889 + 890 + /* 891 + * Structure to sture persistent DTE flags from IVHD 892 + */ 893 + struct ivhd_dte_flags { 894 + struct list_head list; 895 + u16 segid; 896 + u16 devid_first; 897 + u16 devid_last; 898 + struct dev_table_entry dte; 890 899 }; 891 900 892 901 /*
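The amd_iommu_types.h hunk above replaces the old shift/value macro pairs (DTE_GCR3_SHIFT_A/B/C, DTE_GCR3_VAL_A/B/C) with GENMASK_ULL()-based field masks, so a GCR3 table address can be packed into the DTE with FIELD_PREP(), as set_dte_gcr3_table() does in the iommu.c hunk further down. Below is a minimal userspace sketch of that encoding, not kernel code: GENMASK_ULL()/FIELD_PREP() are simplified stand-ins for the kernel's <linux/bits.h>/<linux/bitfield.h> macros, and the example address is made up.

/* Standalone illustration of the GENMASK-based GCR3 packing; not kernel code. */
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel's GENMASK_ULL()/FIELD_PREP(). */
#define GENMASK_ULL(h, l)     (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define FIELD_PREP(mask, val) (((uint64_t)(val) << __builtin_ctzll(mask)) & (mask))

/* Field masks as defined in the hunk above. */
#define DTE_GCR3_14_12 GENMASK_ULL(60, 58)  /* data[0] bits 60:58 <- GCR3[14:12] */
#define DTE_GCR3_30_15 GENMASK_ULL(31, 16)  /* data[1] bits 31:16 <- GCR3[30:15] */
#define DTE_GCR3_51_31 GENMASK_ULL(63, 43)  /* data[1] bits 63:43 <- GCR3[51:31] */

int main(void)
{
	/* Hypothetical 4KiB-aligned GCR3 table physical address. */
	uint64_t gcr3 = 0x0000123456789000ULL;
	uint64_t data0 = 0, data1 = 0;

	data0 |= FIELD_PREP(DTE_GCR3_14_12, gcr3 >> 12);
	data1 |= FIELD_PREP(DTE_GCR3_30_15, gcr3 >> 15) |
		 FIELD_PREP(DTE_GCR3_51_31, gcr3 >> 31);

	printf("data[0]=%#018llx data[1]=%#018llx\n",
	       (unsigned long long)data0, (unsigned long long)data1);
	return 0;
}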
+137 -116
drivers/iommu/amd/init.c
··· 152 152 bool amd_iommu_dump; 153 153 bool amd_iommu_irq_remap __read_mostly; 154 154 155 - enum io_pgtable_fmt amd_iommu_pgtable = AMD_IOMMU_V1; 155 + enum protection_domain_mode amd_iommu_pgtable = PD_MODE_V1; 156 156 /* Guest page table level */ 157 157 int amd_iommu_gpt_level = PAGE_MODE_4_LEVEL; 158 158 ··· 174 174 EXPORT_SYMBOL(amd_iommu_snp_en); 175 175 176 176 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ 177 - LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the 178 - system */ 177 + LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ 178 + LIST_HEAD(amd_ivhd_dev_flags_list); /* list of all IVHD device entry settings */ 179 179 180 180 /* Number of IOMMUs present in the system */ 181 181 static int amd_iommus_present; ··· 984 984 } 985 985 986 986 /* sets a specific bit in the device table entry. */ 987 - static void __set_dev_entry_bit(struct dev_table_entry *dev_table, 988 - u16 devid, u8 bit) 987 + static void set_dte_bit(struct dev_table_entry *dte, u8 bit) 989 988 { 990 989 int i = (bit >> 6) & 0x03; 991 990 int _bit = bit & 0x3f; 992 991 993 - dev_table[devid].data[i] |= (1UL << _bit); 994 - } 995 - 996 - static void set_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit) 997 - { 998 - struct dev_table_entry *dev_table = get_dev_table(iommu); 999 - 1000 - return __set_dev_entry_bit(dev_table, devid, bit); 1001 - } 1002 - 1003 - static int __get_dev_entry_bit(struct dev_table_entry *dev_table, 1004 - u16 devid, u8 bit) 1005 - { 1006 - int i = (bit >> 6) & 0x03; 1007 - int _bit = bit & 0x3f; 1008 - 1009 - return (dev_table[devid].data[i] & (1UL << _bit)) >> _bit; 1010 - } 1011 - 1012 - static int get_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit) 1013 - { 1014 - struct dev_table_entry *dev_table = get_dev_table(iommu); 1015 - 1016 - return __get_dev_entry_bit(dev_table, devid, bit); 992 + dte->data[i] |= (1UL << _bit); 1017 993 } 1018 994 1019 995 static bool __copy_device_table(struct amd_iommu *iommu) ··· 1057 1081 } 1058 1082 /* If gcr3 table existed, mask it out */ 1059 1083 if (old_devtb[devid].data[0] & DTE_FLAG_GV) { 1060 - tmp = DTE_GCR3_VAL_B(~0ULL) << DTE_GCR3_SHIFT_B; 1061 - tmp |= DTE_GCR3_VAL_C(~0ULL) << DTE_GCR3_SHIFT_C; 1084 + tmp = (DTE_GCR3_30_15 | DTE_GCR3_51_31); 1062 1085 pci_seg->old_dev_tbl_cpy[devid].data[1] &= ~tmp; 1063 - tmp = DTE_GCR3_VAL_A(~0ULL) << DTE_GCR3_SHIFT_A; 1064 - tmp |= DTE_FLAG_GV; 1086 + tmp = (DTE_GCR3_14_12 | DTE_FLAG_GV); 1065 1087 pci_seg->old_dev_tbl_cpy[devid].data[0] &= ~tmp; 1066 1088 } 1067 1089 } ··· 1110 1136 return true; 1111 1137 } 1112 1138 1113 - void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid) 1139 + struct dev_table_entry *amd_iommu_get_ivhd_dte_flags(u16 segid, u16 devid) 1114 1140 { 1115 - int sysmgt; 1141 + struct ivhd_dte_flags *e; 1142 + unsigned int best_len = UINT_MAX; 1143 + struct dev_table_entry *dte = NULL; 1116 1144 1117 - sysmgt = get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1) | 1118 - (get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2) << 1); 1145 + for_each_ivhd_dte_flags(e) { 1146 + /* 1147 + * Need to go through the whole list to find the smallest range, 1148 + * which contains the devid. 
1149 + */ 1150 + if ((e->segid == segid) && 1151 + (e->devid_first <= devid) && (devid <= e->devid_last)) { 1152 + unsigned int len = e->devid_last - e->devid_first; 1119 1153 1120 - if (sysmgt == 0x01) 1121 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_IW); 1154 + if (len < best_len) { 1155 + dte = &(e->dte); 1156 + best_len = len; 1157 + } 1158 + } 1159 + } 1160 + return dte; 1161 + } 1162 + 1163 + static bool search_ivhd_dte_flags(u16 segid, u16 first, u16 last) 1164 + { 1165 + struct ivhd_dte_flags *e; 1166 + 1167 + for_each_ivhd_dte_flags(e) { 1168 + if ((e->segid == segid) && 1169 + (e->devid_first == first) && 1170 + (e->devid_last == last)) 1171 + return true; 1172 + } 1173 + return false; 1122 1174 } 1123 1175 1124 1176 /* 1125 1177 * This function takes the device specific flags read from the ACPI 1126 1178 * table and sets up the device table entry with that information 1127 1179 */ 1180 + static void __init 1181 + set_dev_entry_from_acpi_range(struct amd_iommu *iommu, u16 first, u16 last, 1182 + u32 flags, u32 ext_flags) 1183 + { 1184 + int i; 1185 + struct dev_table_entry dte = {}; 1186 + 1187 + /* Parse IVHD DTE setting flags and store information */ 1188 + if (flags) { 1189 + struct ivhd_dte_flags *d; 1190 + 1191 + if (search_ivhd_dte_flags(iommu->pci_seg->id, first, last)) 1192 + return; 1193 + 1194 + d = kzalloc(sizeof(struct ivhd_dte_flags), GFP_KERNEL); 1195 + if (!d) 1196 + return; 1197 + 1198 + pr_debug("%s: devid range %#x:%#x\n", __func__, first, last); 1199 + 1200 + if (flags & ACPI_DEVFLAG_INITPASS) 1201 + set_dte_bit(&dte, DEV_ENTRY_INIT_PASS); 1202 + if (flags & ACPI_DEVFLAG_EXTINT) 1203 + set_dte_bit(&dte, DEV_ENTRY_EINT_PASS); 1204 + if (flags & ACPI_DEVFLAG_NMI) 1205 + set_dte_bit(&dte, DEV_ENTRY_NMI_PASS); 1206 + if (flags & ACPI_DEVFLAG_SYSMGT1) 1207 + set_dte_bit(&dte, DEV_ENTRY_SYSMGT1); 1208 + if (flags & ACPI_DEVFLAG_SYSMGT2) 1209 + set_dte_bit(&dte, DEV_ENTRY_SYSMGT2); 1210 + if (flags & ACPI_DEVFLAG_LINT0) 1211 + set_dte_bit(&dte, DEV_ENTRY_LINT0_PASS); 1212 + if (flags & ACPI_DEVFLAG_LINT1) 1213 + set_dte_bit(&dte, DEV_ENTRY_LINT1_PASS); 1214 + 1215 + /* Apply erratum 63, which needs info in initial_dte */ 1216 + if (FIELD_GET(DTE_DATA1_SYSMGT_MASK, dte.data[1]) == 0x1) 1217 + dte.data[0] |= DTE_FLAG_IW; 1218 + 1219 + memcpy(&d->dte, &dte, sizeof(dte)); 1220 + d->segid = iommu->pci_seg->id; 1221 + d->devid_first = first; 1222 + d->devid_last = last; 1223 + list_add_tail(&d->list, &amd_ivhd_dev_flags_list); 1224 + } 1225 + 1226 + for (i = first; i <= last; i++) { 1227 + if (flags) { 1228 + struct dev_table_entry *dev_table = get_dev_table(iommu); 1229 + 1230 + memcpy(&dev_table[i], &dte, sizeof(dte)); 1231 + } 1232 + amd_iommu_set_rlookup_table(iommu, i); 1233 + } 1234 + } 1235 + 1128 1236 static void __init set_dev_entry_from_acpi(struct amd_iommu *iommu, 1129 1237 u16 devid, u32 flags, u32 ext_flags) 1130 1238 { 1131 - if (flags & ACPI_DEVFLAG_INITPASS) 1132 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_INIT_PASS); 1133 - if (flags & ACPI_DEVFLAG_EXTINT) 1134 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_EINT_PASS); 1135 - if (flags & ACPI_DEVFLAG_NMI) 1136 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_NMI_PASS); 1137 - if (flags & ACPI_DEVFLAG_SYSMGT1) 1138 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1); 1139 - if (flags & ACPI_DEVFLAG_SYSMGT2) 1140 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2); 1141 - if (flags & ACPI_DEVFLAG_LINT0) 1142 - set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT0_PASS); 1143 - if (flags & ACPI_DEVFLAG_LINT1) 1144 - 
set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT1_PASS); 1145 - 1146 - amd_iommu_apply_erratum_63(iommu, devid); 1147 - 1148 - amd_iommu_set_rlookup_table(iommu, devid); 1239 + set_dev_entry_from_acpi_range(iommu, devid, devid, flags, ext_flags); 1149 1240 } 1150 1241 1151 1242 int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line) ··· 1278 1239 entry->cmd_line = cmd_line; 1279 1240 entry->root_devid = (entry->devid & (~0x7)); 1280 1241 1281 - pr_info("%s, add hid:%s, uid:%s, rdevid:%d\n", 1242 + pr_info("%s, add hid:%s, uid:%s, rdevid:%#x\n", 1282 1243 entry->cmd_line ? "cmd" : "ivrs", 1283 1244 entry->hid, entry->uid, entry->root_devid); 1284 1245 ··· 1370 1331 switch (e->type) { 1371 1332 case IVHD_DEV_ALL: 1372 1333 1373 - DUMP_printk(" DEV_ALL\t\t\tflags: %02x\n", e->flags); 1374 - 1375 - for (dev_i = 0; dev_i <= pci_seg->last_bdf; ++dev_i) 1376 - set_dev_entry_from_acpi(iommu, dev_i, e->flags, 0); 1334 + DUMP_printk(" DEV_ALL\t\t\tsetting: %#02x\n", e->flags); 1335 + set_dev_entry_from_acpi_range(iommu, 0, pci_seg->last_bdf, e->flags, 0); 1377 1336 break; 1378 1337 case IVHD_DEV_SELECT: 1379 1338 1380 - DUMP_printk(" DEV_SELECT\t\t\t devid: %04x:%02x:%02x.%x " 1381 - "flags: %02x\n", 1339 + DUMP_printk(" DEV_SELECT\t\t\tdevid: %04x:%02x:%02x.%x flags: %#02x\n", 1382 1340 seg_id, PCI_BUS_NUM(e->devid), 1383 1341 PCI_SLOT(e->devid), 1384 1342 PCI_FUNC(e->devid), ··· 1386 1350 break; 1387 1351 case IVHD_DEV_SELECT_RANGE_START: 1388 1352 1389 - DUMP_printk(" DEV_SELECT_RANGE_START\t " 1390 - "devid: %04x:%02x:%02x.%x flags: %02x\n", 1353 + DUMP_printk(" DEV_SELECT_RANGE_START\tdevid: %04x:%02x:%02x.%x flags: %#02x\n", 1391 1354 seg_id, PCI_BUS_NUM(e->devid), 1392 1355 PCI_SLOT(e->devid), 1393 1356 PCI_FUNC(e->devid), ··· 1399 1364 break; 1400 1365 case IVHD_DEV_ALIAS: 1401 1366 1402 - DUMP_printk(" DEV_ALIAS\t\t\t devid: %04x:%02x:%02x.%x " 1403 - "flags: %02x devid_to: %02x:%02x.%x\n", 1367 + DUMP_printk(" DEV_ALIAS\t\t\tdevid: %04x:%02x:%02x.%x flags: %#02x devid_to: %02x:%02x.%x\n", 1404 1368 seg_id, PCI_BUS_NUM(e->devid), 1405 1369 PCI_SLOT(e->devid), 1406 1370 PCI_FUNC(e->devid), ··· 1416 1382 break; 1417 1383 case IVHD_DEV_ALIAS_RANGE: 1418 1384 1419 - DUMP_printk(" DEV_ALIAS_RANGE\t\t " 1420 - "devid: %04x:%02x:%02x.%x flags: %02x " 1421 - "devid_to: %04x:%02x:%02x.%x\n", 1385 + DUMP_printk(" DEV_ALIAS_RANGE\t\tdevid: %04x:%02x:%02x.%x flags: %#02x devid_to: %04x:%02x:%02x.%x\n", 1422 1386 seg_id, PCI_BUS_NUM(e->devid), 1423 1387 PCI_SLOT(e->devid), 1424 1388 PCI_FUNC(e->devid), ··· 1433 1401 break; 1434 1402 case IVHD_DEV_EXT_SELECT: 1435 1403 1436 - DUMP_printk(" DEV_EXT_SELECT\t\t devid: %04x:%02x:%02x.%x " 1437 - "flags: %02x ext: %08x\n", 1404 + DUMP_printk(" DEV_EXT_SELECT\t\tdevid: %04x:%02x:%02x.%x flags: %#02x ext: %08x\n", 1438 1405 seg_id, PCI_BUS_NUM(e->devid), 1439 1406 PCI_SLOT(e->devid), 1440 1407 PCI_FUNC(e->devid), ··· 1445 1414 break; 1446 1415 case IVHD_DEV_EXT_SELECT_RANGE: 1447 1416 1448 - DUMP_printk(" DEV_EXT_SELECT_RANGE\t devid: " 1449 - "%04x:%02x:%02x.%x flags: %02x ext: %08x\n", 1417 + DUMP_printk(" DEV_EXT_SELECT_RANGE\tdevid: %04x:%02x:%02x.%x flags: %#02x ext: %08x\n", 1450 1418 seg_id, PCI_BUS_NUM(e->devid), 1451 1419 PCI_SLOT(e->devid), 1452 1420 PCI_FUNC(e->devid), ··· 1458 1428 break; 1459 1429 case IVHD_DEV_RANGE_END: 1460 1430 1461 - DUMP_printk(" DEV_RANGE_END\t\t devid: %04x:%02x:%02x.%x\n", 1431 + DUMP_printk(" DEV_RANGE_END\t\tdevid: %04x:%02x:%02x.%x\n", 1462 1432 seg_id, PCI_BUS_NUM(e->devid), 1463 1433 PCI_SLOT(e->devid), 
1464 1434 PCI_FUNC(e->devid)); 1465 1435 1466 1436 devid = e->devid; 1467 1437 for (dev_i = devid_start; dev_i <= devid; ++dev_i) { 1468 - if (alias) { 1438 + if (alias) 1469 1439 pci_seg->alias_table[dev_i] = devid_to; 1470 - set_dev_entry_from_acpi(iommu, 1471 - devid_to, flags, ext_flags); 1472 - } 1473 - set_dev_entry_from_acpi(iommu, dev_i, 1474 - flags, ext_flags); 1475 1440 } 1441 + set_dev_entry_from_acpi_range(iommu, devid_start, devid, flags, ext_flags); 1442 + set_dev_entry_from_acpi(iommu, devid_to, flags, ext_flags); 1476 1443 break; 1477 1444 case IVHD_DEV_SPECIAL: { 1478 1445 u8 handle, type; ··· 1488 1461 else 1489 1462 var = "UNKNOWN"; 1490 1463 1491 - DUMP_printk(" DEV_SPECIAL(%s[%d])\t\tdevid: %04x:%02x:%02x.%x\n", 1464 + DUMP_printk(" DEV_SPECIAL(%s[%d])\t\tdevid: %04x:%02x:%02x.%x, flags: %#02x\n", 1492 1465 var, (int)handle, 1493 1466 seg_id, PCI_BUS_NUM(devid), 1494 1467 PCI_SLOT(devid), 1495 - PCI_FUNC(devid)); 1468 + PCI_FUNC(devid), 1469 + e->flags); 1496 1470 1497 1471 ret = add_special_device(type, handle, &devid, false); 1498 1472 if (ret) ··· 1553 1525 } 1554 1526 1555 1527 devid = PCI_SEG_DEVID_TO_SBDF(seg_id, e->devid); 1556 - DUMP_printk(" DEV_ACPI_HID(%s[%s])\t\tdevid: %04x:%02x:%02x.%x\n", 1528 + DUMP_printk(" DEV_ACPI_HID(%s[%s])\t\tdevid: %04x:%02x:%02x.%x, flags: %#02x\n", 1557 1529 hid, uid, seg_id, 1558 1530 PCI_BUS_NUM(devid), 1559 1531 PCI_SLOT(devid), 1560 - PCI_FUNC(devid)); 1532 + PCI_FUNC(devid), 1533 + e->flags); 1561 1534 1562 1535 flags = e->flags; 1563 1536 ··· 1786 1757 else 1787 1758 iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; 1788 1759 1789 - /* 1790 - * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. 1791 - * GAM also requires GA mode. Therefore, we need to 1792 - * check cmpxchg16b support before enabling it. 1793 - */ 1794 - if (!boot_cpu_has(X86_FEATURE_CX16) || 1795 - ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) 1760 + /* GAM requires GA mode. */ 1761 + if ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0) 1796 1762 amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; 1797 1763 break; 1798 1764 case 0x11: ··· 1797 1773 else 1798 1774 iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; 1799 1775 1800 - /* 1801 - * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. 1802 - * XT, GAM also requires GA mode. Therefore, we need to 1803 - * check cmpxchg16b support before enabling them. 1804 - */ 1805 - if (!boot_cpu_has(X86_FEATURE_CX16) || 1806 - ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { 1776 + /* XT and GAM require GA mode. 
*/ 1777 + if ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0) { 1807 1778 amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; 1808 1779 break; 1809 1780 } ··· 2164 2145 if (amd_iommu_xt_mode == IRQ_REMAP_X2APIC_MODE) 2165 2146 pr_info("X2APIC enabled\n"); 2166 2147 } 2167 - if (amd_iommu_pgtable == AMD_IOMMU_V2) { 2148 + if (amd_iommu_pgtable == PD_MODE_V2) { 2168 2149 pr_info("V2 page table enabled (Paging mode : %d level)\n", 2169 2150 amd_iommu_gpt_level); 2170 2151 } ··· 2594 2575 return; 2595 2576 2596 2577 for (devid = 0; devid <= pci_seg->last_bdf; ++devid) { 2597 - __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID); 2578 + set_dte_bit(&dev_table[devid], DEV_ENTRY_VALID); 2598 2579 if (!amd_iommu_snp_en) 2599 - __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION); 2580 + set_dte_bit(&dev_table[devid], DEV_ENTRY_TRANSLATION); 2600 2581 } 2601 2582 } 2602 2583 ··· 2624 2605 2625 2606 for_each_pci_segment(pci_seg) { 2626 2607 for (devid = 0; devid <= pci_seg->last_bdf; ++devid) 2627 - __set_dev_entry_bit(pci_seg->dev_table, 2628 - devid, DEV_ENTRY_IRQ_TBL_EN); 2608 + set_dte_bit(&pci_seg->dev_table[devid], DEV_ENTRY_IRQ_TBL_EN); 2629 2609 } 2630 2610 } 2631 2611 ··· 3051 3033 return -EINVAL; 3052 3034 } 3053 3035 3036 + if (!boot_cpu_has(X86_FEATURE_CX16)) { 3037 + pr_err("Failed to initialize. The CMPXCHG16B feature is required.\n"); 3038 + return -EINVAL; 3039 + } 3040 + 3054 3041 /* 3055 3042 * Validate checksum here so we don't need to do it when 3056 3043 * we actually parse the table ··· 3082 3059 FIELD_GET(FEATURE_GATS, amd_iommu_efr) == GUEST_PGTABLE_5_LEVEL) 3083 3060 amd_iommu_gpt_level = PAGE_MODE_5_LEVEL; 3084 3061 3085 - if (amd_iommu_pgtable == AMD_IOMMU_V2) { 3062 + if (amd_iommu_pgtable == PD_MODE_V2) { 3086 3063 if (!amd_iommu_v2_pgtbl_supported()) { 3087 3064 pr_warn("Cannot enable v2 page table for DMA-API. 
Fallback to v1.\n"); 3088 - amd_iommu_pgtable = AMD_IOMMU_V1; 3065 + amd_iommu_pgtable = PD_MODE_V1; 3089 3066 } 3090 3067 } 3091 3068 ··· 3208 3185 goto disable_snp; 3209 3186 } 3210 3187 3211 - if (amd_iommu_pgtable != AMD_IOMMU_V1) { 3188 + if (amd_iommu_pgtable != PD_MODE_V1) { 3212 3189 pr_warn("SNP: IOMMU is configured with V2 page table mode, SNP cannot be supported.\n"); 3213 3190 goto disable_snp; 3214 3191 } ··· 3421 3398 * IOMMUs 3422 3399 * 3423 3400 ****************************************************************************/ 3424 - int __init amd_iommu_detect(void) 3401 + void __init amd_iommu_detect(void) 3425 3402 { 3426 3403 int ret; 3427 3404 3428 3405 if (no_iommu || (iommu_detected && !gart_iommu_aperture)) 3429 - return -ENODEV; 3406 + return; 3430 3407 3431 3408 if (!amd_iommu_sme_check()) 3432 - return -ENODEV; 3409 + return; 3433 3410 3434 3411 ret = iommu_go_to_state(IOMMU_IVRS_DETECTED); 3435 3412 if (ret) 3436 - return ret; 3413 + return; 3437 3414 3438 3415 amd_iommu_detected = true; 3439 3416 iommu_detected = 1; 3440 3417 x86_init.iommu.iommu_init = amd_iommu_init; 3441 - 3442 - return 1; 3443 3418 } 3444 3419 3445 3420 /**************************************************************************** ··· 3485 3464 } else if (strncmp(str, "force_isolation", 15) == 0) { 3486 3465 amd_iommu_force_isolation = true; 3487 3466 } else if (strncmp(str, "pgtbl_v1", 8) == 0) { 3488 - amd_iommu_pgtable = AMD_IOMMU_V1; 3467 + amd_iommu_pgtable = PD_MODE_V1; 3489 3468 } else if (strncmp(str, "pgtbl_v2", 8) == 0) { 3490 - amd_iommu_pgtable = AMD_IOMMU_V2; 3469 + amd_iommu_pgtable = PD_MODE_V2; 3491 3470 } else if (strncmp(str, "irtcachedis", 11) == 0) { 3492 3471 amd_iommu_irtcachedis = true; 3493 3472 } else if (strncmp(str, "nohugepages", 11) == 0) {
+334 -200
drivers/iommu/amd/iommu.c
··· 83 83 static void set_dte_entry(struct amd_iommu *iommu, 84 84 struct iommu_dev_data *dev_data); 85 85 86 + static void iommu_flush_dte_sync(struct amd_iommu *iommu, u16 devid); 87 + 88 + static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid); 89 + 86 90 /**************************************************************************** 87 91 * 88 92 * Helper functions 89 93 * 90 94 ****************************************************************************/ 95 + 96 + static __always_inline void amd_iommu_atomic128_set(__int128 *ptr, __int128 val) 97 + { 98 + /* 99 + * Note: 100 + * We use arch_cmpxchg128_local() because: 101 + * - Need cmpxchg16b instruction mainly for 128-bit store to DTE 102 + * (not necessary for cmpxchg since this function is already 103 + * protected by a spin_lock for this DTE). 104 + * - Neither need LOCK_PREFIX nor try loop because of the spin_lock. 105 + */ 106 + arch_cmpxchg128_local(ptr, *ptr, val); 107 + } 108 + 109 + static void write_dte_upper128(struct dev_table_entry *ptr, struct dev_table_entry *new) 110 + { 111 + struct dev_table_entry old; 112 + 113 + old.data128[1] = ptr->data128[1]; 114 + /* 115 + * Preserve DTE_DATA2_INTR_MASK. This needs to be 116 + * done here since it requires to be inside 117 + * spin_lock(&dev_data->dte_lock) context. 118 + */ 119 + new->data[2] &= ~DTE_DATA2_INTR_MASK; 120 + new->data[2] |= old.data[2] & DTE_DATA2_INTR_MASK; 121 + 122 + amd_iommu_atomic128_set(&ptr->data128[1], new->data128[1]); 123 + } 124 + 125 + static void write_dte_lower128(struct dev_table_entry *ptr, struct dev_table_entry *new) 126 + { 127 + amd_iommu_atomic128_set(&ptr->data128[0], new->data128[0]); 128 + } 129 + 130 + /* 131 + * Note: 132 + * IOMMU reads the entire Device Table entry in a single 256-bit transaction 133 + * but the driver is programming DTE using 2 128-bit cmpxchg. So, the driver 134 + * need to ensure the following: 135 + * - DTE[V|GV] bit is being written last when setting. 136 + * - DTE[V|GV] bit is being written first when clearing. 137 + * 138 + * This function is used only by code, which updates DMA translation part of the DTE. 139 + * So, only consider control bits related to DMA when updating the entry. 140 + */ 141 + static void update_dte256(struct amd_iommu *iommu, struct iommu_dev_data *dev_data, 142 + struct dev_table_entry *new) 143 + { 144 + unsigned long flags; 145 + struct dev_table_entry *dev_table = get_dev_table(iommu); 146 + struct dev_table_entry *ptr = &dev_table[dev_data->devid]; 147 + 148 + spin_lock_irqsave(&dev_data->dte_lock, flags); 149 + 150 + if (!(ptr->data[0] & DTE_FLAG_V)) { 151 + /* Existing DTE is not valid. */ 152 + write_dte_upper128(ptr, new); 153 + write_dte_lower128(ptr, new); 154 + iommu_flush_dte_sync(iommu, dev_data->devid); 155 + } else if (!(new->data[0] & DTE_FLAG_V)) { 156 + /* Existing DTE is valid. New DTE is not valid. */ 157 + write_dte_lower128(ptr, new); 158 + write_dte_upper128(ptr, new); 159 + iommu_flush_dte_sync(iommu, dev_data->devid); 160 + } else if (!FIELD_GET(DTE_FLAG_GV, ptr->data[0])) { 161 + /* 162 + * Both DTEs are valid. 163 + * Existing DTE has no guest page table. 164 + */ 165 + write_dte_upper128(ptr, new); 166 + write_dte_lower128(ptr, new); 167 + iommu_flush_dte_sync(iommu, dev_data->devid); 168 + } else if (!FIELD_GET(DTE_FLAG_GV, new->data[0])) { 169 + /* 170 + * Both DTEs are valid. 
171 + * Existing DTE has guest page table, 172 + * new DTE has no guest page table, 173 + */ 174 + write_dte_lower128(ptr, new); 175 + write_dte_upper128(ptr, new); 176 + iommu_flush_dte_sync(iommu, dev_data->devid); 177 + } else if (FIELD_GET(DTE_GPT_LEVEL_MASK, ptr->data[2]) != 178 + FIELD_GET(DTE_GPT_LEVEL_MASK, new->data[2])) { 179 + /* 180 + * Both DTEs are valid and have guest page table, 181 + * but have different number of levels. So, we need 182 + * to upadte both upper and lower 128-bit value, which 183 + * require disabling and flushing. 184 + */ 185 + struct dev_table_entry clear = {}; 186 + 187 + /* First disable DTE */ 188 + write_dte_lower128(ptr, &clear); 189 + iommu_flush_dte_sync(iommu, dev_data->devid); 190 + 191 + /* Then update DTE */ 192 + write_dte_upper128(ptr, new); 193 + write_dte_lower128(ptr, new); 194 + iommu_flush_dte_sync(iommu, dev_data->devid); 195 + } else { 196 + /* 197 + * Both DTEs are valid and have guest page table, 198 + * and same number of levels. We just need to only 199 + * update the lower 128-bit. So no need to disable DTE. 200 + */ 201 + write_dte_lower128(ptr, new); 202 + } 203 + 204 + spin_unlock_irqrestore(&dev_data->dte_lock, flags); 205 + } 206 + 207 + static void get_dte256(struct amd_iommu *iommu, struct iommu_dev_data *dev_data, 208 + struct dev_table_entry *dte) 209 + { 210 + unsigned long flags; 211 + struct dev_table_entry *ptr; 212 + struct dev_table_entry *dev_table = get_dev_table(iommu); 213 + 214 + ptr = &dev_table[dev_data->devid]; 215 + 216 + spin_lock_irqsave(&dev_data->dte_lock, flags); 217 + dte->data128[0] = ptr->data128[0]; 218 + dte->data128[1] = ptr->data128[1]; 219 + spin_unlock_irqrestore(&dev_data->dte_lock, flags); 220 + } 91 221 92 222 static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom) 93 223 { ··· 339 209 return NULL; 340 210 341 211 mutex_init(&dev_data->mutex); 212 + spin_lock_init(&dev_data->dte_lock); 342 213 dev_data->devid = devid; 343 214 ratelimit_default_init(&dev_data->rs); 344 215 ··· 347 216 return dev_data; 348 217 } 349 218 350 - static struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid) 219 + struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid) 351 220 { 352 221 struct iommu_dev_data *dev_data; 353 222 struct llist_node *node; ··· 367 236 368 237 static int clone_alias(struct pci_dev *pdev, u16 alias, void *data) 369 238 { 239 + struct dev_table_entry new; 370 240 struct amd_iommu *iommu; 371 - struct dev_table_entry *dev_table; 241 + struct iommu_dev_data *dev_data, *alias_data; 372 242 u16 devid = pci_dev_id(pdev); 243 + int ret = 0; 373 244 374 245 if (devid == alias) 375 246 return 0; ··· 380 247 if (!iommu) 381 248 return 0; 382 249 383 - amd_iommu_set_rlookup_table(iommu, alias); 384 - dev_table = get_dev_table(iommu); 385 - memcpy(dev_table[alias].data, 386 - dev_table[devid].data, 387 - sizeof(dev_table[alias].data)); 250 + /* Copy the data from pdev */ 251 + dev_data = dev_iommu_priv_get(&pdev->dev); 252 + if (!dev_data) { 253 + pr_err("%s : Failed to get dev_data for 0x%x\n", __func__, devid); 254 + ret = -EINVAL; 255 + goto out; 256 + } 257 + get_dte256(iommu, dev_data, &new); 388 258 389 - return 0; 259 + /* Setup alias */ 260 + alias_data = find_dev_data(iommu, alias); 261 + if (!alias_data) { 262 + pr_err("%s : Failed to get alias dev_data for 0x%x\n", __func__, alias); 263 + ret = -EINVAL; 264 + goto out; 265 + } 266 + update_dte256(iommu, alias_data, &new); 267 + 268 + amd_iommu_set_rlookup_table(iommu, alias); 269 + 
out: 270 + return ret; 390 271 } 391 272 392 273 static void clone_aliases(struct amd_iommu *iommu, struct device *dev) ··· 673 526 return -ENOMEM; 674 527 675 528 dev_data->dev = dev; 529 + 530 + /* 531 + * The dev_iommu_priv_set() needes to be called before setup_aliases. 532 + * Otherwise, subsequent call to dev_iommu_priv_get() will fail. 533 + */ 534 + dev_iommu_priv_set(dev, dev_data); 676 535 setup_aliases(iommu, dev); 677 536 678 537 /* ··· 691 538 dev_is_pci(dev) && amd_iommu_gt_ppr_supported()) { 692 539 dev_data->flags = pdev_get_caps(to_pci_dev(dev)); 693 540 } 694 - 695 - dev_iommu_priv_set(dev, dev_data); 696 541 697 542 return 0; 698 543 } ··· 722 571 static void dump_dte_entry(struct amd_iommu *iommu, u16 devid) 723 572 { 724 573 int i; 725 - struct dev_table_entry *dev_table = get_dev_table(iommu); 574 + struct dev_table_entry dte; 575 + struct iommu_dev_data *dev_data = find_dev_data(iommu, devid); 576 + 577 + get_dte256(iommu, dev_data, &dte); 726 578 727 579 for (i = 0; i < 4; ++i) 728 - pr_err("DTE[%d]: %016llx\n", i, dev_table[devid].data[i]); 580 + pr_err("DTE[%d]: %016llx\n", i, dte.data[i]); 729 581 } 730 582 731 583 static void dump_command(unsigned long phys_addr) ··· 1415 1261 return iommu_queue_command(iommu, &cmd); 1416 1262 } 1417 1263 1264 + static void iommu_flush_dte_sync(struct amd_iommu *iommu, u16 devid) 1265 + { 1266 + int ret; 1267 + 1268 + ret = iommu_flush_dte(iommu, devid); 1269 + if (!ret) 1270 + iommu_completion_wait(iommu); 1271 + } 1272 + 1418 1273 static void amd_iommu_flush_dte_all(struct amd_iommu *iommu) 1419 1274 { 1420 1275 u32 devid; ··· 1766 1603 domain_flush_complete(domain); 1767 1604 } 1768 1605 1769 - void amd_iommu_domain_update(struct protection_domain *domain) 1770 - { 1771 - /* Update device table */ 1772 - amd_iommu_update_and_flush_device_table(domain); 1773 - 1774 - /* Flush domain TLB(s) and wait for completion */ 1775 - amd_iommu_domain_flush_all(domain); 1776 - } 1777 - 1778 1606 int amd_iommu_complete_ppr(struct device *dev, u32 pasid, int status, int tag) 1779 1607 { 1780 1608 struct iommu_dev_data *dev_data; ··· 1980 1826 return ret; 1981 1827 } 1982 1828 1829 + static void make_clear_dte(struct iommu_dev_data *dev_data, struct dev_table_entry *ptr, 1830 + struct dev_table_entry *new) 1831 + { 1832 + /* All existing DTE must have V bit set */ 1833 + new->data128[0] = DTE_FLAG_V; 1834 + new->data128[1] = 0; 1835 + } 1836 + 1837 + /* 1838 + * Note: 1839 + * The old value for GCR3 table and GPT have been cleared from caller. 
1840 + */ 1841 + static void set_dte_gcr3_table(struct amd_iommu *iommu, 1842 + struct iommu_dev_data *dev_data, 1843 + struct dev_table_entry *target) 1844 + { 1845 + struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info; 1846 + u64 gcr3; 1847 + 1848 + if (!gcr3_info->gcr3_tbl) 1849 + return; 1850 + 1851 + pr_debug("%s: devid=%#x, glx=%#x, gcr3_tbl=%#llx\n", 1852 + __func__, dev_data->devid, gcr3_info->glx, 1853 + (unsigned long long)gcr3_info->gcr3_tbl); 1854 + 1855 + gcr3 = iommu_virt_to_phys(gcr3_info->gcr3_tbl); 1856 + 1857 + target->data[0] |= DTE_FLAG_GV | 1858 + FIELD_PREP(DTE_GLX, gcr3_info->glx) | 1859 + FIELD_PREP(DTE_GCR3_14_12, gcr3 >> 12); 1860 + if (pdom_is_v2_pgtbl_mode(dev_data->domain)) 1861 + target->data[0] |= DTE_FLAG_GIOV; 1862 + 1863 + target->data[1] |= FIELD_PREP(DTE_GCR3_30_15, gcr3 >> 15) | 1864 + FIELD_PREP(DTE_GCR3_51_31, gcr3 >> 31); 1865 + 1866 + /* Guest page table can only support 4 and 5 levels */ 1867 + if (amd_iommu_gpt_level == PAGE_MODE_5_LEVEL) 1868 + target->data[2] |= FIELD_PREP(DTE_GPT_LEVEL_MASK, GUEST_PGTABLE_5_LEVEL); 1869 + else 1870 + target->data[2] |= FIELD_PREP(DTE_GPT_LEVEL_MASK, GUEST_PGTABLE_4_LEVEL); 1871 + } 1872 + 1983 1873 static void set_dte_entry(struct amd_iommu *iommu, 1984 1874 struct iommu_dev_data *dev_data) 1985 1875 { 1986 - u64 pte_root = 0; 1987 - u64 flags = 0; 1988 - u32 old_domid; 1989 - u16 devid = dev_data->devid; 1990 1876 u16 domid; 1877 + u32 old_domid; 1878 + struct dev_table_entry *initial_dte; 1879 + struct dev_table_entry new = {}; 1991 1880 struct protection_domain *domain = dev_data->domain; 1992 - struct dev_table_entry *dev_table = get_dev_table(iommu); 1993 1881 struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info; 1882 + struct dev_table_entry *dte = &get_dev_table(iommu)[dev_data->devid]; 1994 1883 1995 1884 if (gcr3_info && gcr3_info->gcr3_tbl) 1996 1885 domid = dev_data->gcr3_info.domid; 1997 1886 else 1998 1887 domid = domain->id; 1999 1888 2000 - if (domain->iop.mode != PAGE_MODE_NONE) 2001 - pte_root = iommu_virt_to_phys(domain->iop.root); 1889 + make_clear_dte(dev_data, dte, &new); 2002 1890 2003 - pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK) 1891 + if (domain->iop.mode != PAGE_MODE_NONE) 1892 + new.data[0] = iommu_virt_to_phys(domain->iop.root); 1893 + 1894 + new.data[0] |= (domain->iop.mode & DEV_ENTRY_MODE_MASK) 2004 1895 << DEV_ENTRY_MODE_SHIFT; 2005 1896 2006 - pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V; 1897 + new.data[0] |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V; 2007 1898 2008 1899 /* 2009 - * When SNP is enabled, Only set TV bit when IOMMU 2010 - * page translation is in use. 1900 + * When SNP is enabled, we can only support TV=1 with non-zero domain ID. 1901 + * This is prevented by the SNP-enable and IOMMU_DOMAIN_IDENTITY check in 1902 + * do_iommu_domain_alloc(). 
2011 1903 */ 2012 - if (!amd_iommu_snp_en || (domid != 0)) 2013 - pte_root |= DTE_FLAG_TV; 2014 - 2015 - flags = dev_table[devid].data[1]; 2016 - 2017 - if (dev_data->ats_enabled) 2018 - flags |= DTE_FLAG_IOTLB; 1904 + WARN_ON(amd_iommu_snp_en && (domid == 0)); 1905 + new.data[0] |= DTE_FLAG_TV; 2019 1906 2020 1907 if (dev_data->ppr) 2021 - pte_root |= 1ULL << DEV_ENTRY_PPR; 1908 + new.data[0] |= 1ULL << DEV_ENTRY_PPR; 2022 1909 2023 1910 if (domain->dirty_tracking) 2024 - pte_root |= DTE_FLAG_HAD; 1911 + new.data[0] |= DTE_FLAG_HAD; 2025 1912 2026 - if (gcr3_info && gcr3_info->gcr3_tbl) { 2027 - u64 gcr3 = iommu_virt_to_phys(gcr3_info->gcr3_tbl); 2028 - u64 glx = gcr3_info->glx; 2029 - u64 tmp; 1913 + if (dev_data->ats_enabled) 1914 + new.data[1] |= DTE_FLAG_IOTLB; 2030 1915 2031 - pte_root |= DTE_FLAG_GV; 2032 - pte_root |= (glx & DTE_GLX_MASK) << DTE_GLX_SHIFT; 1916 + old_domid = READ_ONCE(dte->data[1]) & DEV_DOMID_MASK; 1917 + new.data[1] |= domid; 2033 1918 2034 - /* First mask out possible old values for GCR3 table */ 2035 - tmp = DTE_GCR3_VAL_B(~0ULL) << DTE_GCR3_SHIFT_B; 2036 - flags &= ~tmp; 2037 - 2038 - tmp = DTE_GCR3_VAL_C(~0ULL) << DTE_GCR3_SHIFT_C; 2039 - flags &= ~tmp; 2040 - 2041 - /* Encode GCR3 table into DTE */ 2042 - tmp = DTE_GCR3_VAL_A(gcr3) << DTE_GCR3_SHIFT_A; 2043 - pte_root |= tmp; 2044 - 2045 - tmp = DTE_GCR3_VAL_B(gcr3) << DTE_GCR3_SHIFT_B; 2046 - flags |= tmp; 2047 - 2048 - tmp = DTE_GCR3_VAL_C(gcr3) << DTE_GCR3_SHIFT_C; 2049 - flags |= tmp; 2050 - 2051 - if (amd_iommu_gpt_level == PAGE_MODE_5_LEVEL) { 2052 - dev_table[devid].data[2] |= 2053 - ((u64)GUEST_PGTABLE_5_LEVEL << DTE_GPT_LEVEL_SHIFT); 2054 - } 2055 - 2056 - /* GIOV is supported with V2 page table mode only */ 2057 - if (pdom_is_v2_pgtbl_mode(domain)) 2058 - pte_root |= DTE_FLAG_GIOV; 1919 + /* 1920 + * Restore cached persistent DTE bits, which can be set by information 1921 + * in IVRS table. See set_dev_entry_from_acpi(). 
1922 + */ 1923 + initial_dte = amd_iommu_get_ivhd_dte_flags(iommu->pci_seg->id, dev_data->devid); 1924 + if (initial_dte) { 1925 + new.data128[0] |= initial_dte->data128[0]; 1926 + new.data128[1] |= initial_dte->data128[1]; 2059 1927 } 2060 1928 2061 - flags &= ~DEV_DOMID_MASK; 2062 - flags |= domid; 1929 + set_dte_gcr3_table(iommu, dev_data, &new); 2063 1930 2064 - old_domid = dev_table[devid].data[1] & DEV_DOMID_MASK; 2065 - dev_table[devid].data[1] = flags; 2066 - dev_table[devid].data[0] = pte_root; 1931 + update_dte256(iommu, dev_data, &new); 2067 1932 2068 1933 /* 2069 1934 * A kdump kernel might be replacing a domain ID that was copied from ··· 2094 1921 } 2095 1922 } 2096 1923 2097 - static void clear_dte_entry(struct amd_iommu *iommu, u16 devid) 1924 + /* 1925 + * Clear DMA-remap related flags to block all DMA (blockeded domain) 1926 + */ 1927 + static void clear_dte_entry(struct amd_iommu *iommu, struct iommu_dev_data *dev_data) 2098 1928 { 2099 - struct dev_table_entry *dev_table = get_dev_table(iommu); 1929 + struct dev_table_entry new = {}; 1930 + struct dev_table_entry *dte = &get_dev_table(iommu)[dev_data->devid]; 2100 1931 2101 - /* remove entry from the device table seen by the hardware */ 2102 - dev_table[devid].data[0] = DTE_FLAG_V; 2103 - 2104 - if (!amd_iommu_snp_en) 2105 - dev_table[devid].data[0] |= DTE_FLAG_TV; 2106 - 2107 - dev_table[devid].data[1] &= DTE_FLAG_MASK; 2108 - 2109 - amd_iommu_apply_erratum_63(iommu, devid); 1932 + make_clear_dte(dev_data, dte, &new); 1933 + update_dte256(iommu, dev_data, &new); 2110 1934 } 2111 1935 2112 1936 /* Update and flush DTE for the given device */ ··· 2114 1944 if (set) 2115 1945 set_dte_entry(iommu, dev_data); 2116 1946 else 2117 - clear_dte_entry(iommu, dev_data->devid); 1947 + clear_dte_entry(iommu, dev_data); 2118 1948 2119 1949 clone_aliases(iommu, dev_data->dev); 2120 1950 device_flush_dte(dev_data); ··· 2177 2007 struct protection_domain *pdom) 2178 2008 { 2179 2009 struct pdom_iommu_info *pdom_iommu_info, *curr; 2180 - struct io_pgtable_cfg *cfg = &pdom->iop.pgtbl.cfg; 2181 2010 unsigned long flags; 2182 2011 int ret = 0; 2183 2012 ··· 2204 2035 ret = -ENOSPC; 2205 2036 goto out_unlock; 2206 2037 } 2207 - 2208 - /* Update NUMA Node ID */ 2209 - if (cfg->amd.nid == NUMA_NO_NODE) 2210 - cfg->amd.nid = dev_to_node(&iommu->dev->dev); 2211 2038 2212 2039 out_unlock: 2213 2040 spin_unlock_irqrestore(&pdom->lock, flags); ··· 2441 2276 kfree(domain); 2442 2277 } 2443 2278 2444 - static void protection_domain_init(struct protection_domain *domain, int nid) 2279 + static void protection_domain_init(struct protection_domain *domain) 2445 2280 { 2446 2281 spin_lock_init(&domain->lock); 2447 2282 INIT_LIST_HEAD(&domain->dev_list); 2448 2283 INIT_LIST_HEAD(&domain->dev_data_list); 2449 2284 xa_init(&domain->iommu_array); 2450 - domain->iop.pgtbl.cfg.amd.nid = nid; 2451 2285 } 2452 2286 2453 - struct protection_domain *protection_domain_alloc(unsigned int type, int nid) 2287 + struct protection_domain *protection_domain_alloc(void) 2454 2288 { 2455 2289 struct protection_domain *domain; 2456 2290 int domid; ··· 2465 2301 } 2466 2302 domain->id = domid; 2467 2303 2468 - protection_domain_init(domain, nid); 2304 + protection_domain_init(domain); 2469 2305 2470 2306 return domain; 2471 2307 } 2472 2308 2473 2309 static int pdom_setup_pgtable(struct protection_domain *domain, 2474 - unsigned int type, int pgtable) 2310 + struct device *dev) 2475 2311 { 2476 2312 struct io_pgtable_ops *pgtbl_ops; 2313 + enum io_pgtable_fmt fmt; 2477 2314 
2478 - /* No need to allocate io pgtable ops in passthrough mode */ 2479 - if (!(type & __IOMMU_DOMAIN_PAGING)) 2480 - return 0; 2481 - 2482 - switch (pgtable) { 2483 - case AMD_IOMMU_V1: 2484 - domain->pd_mode = PD_MODE_V1; 2315 + switch (domain->pd_mode) { 2316 + case PD_MODE_V1: 2317 + fmt = AMD_IOMMU_V1; 2485 2318 break; 2486 - case AMD_IOMMU_V2: 2487 - domain->pd_mode = PD_MODE_V2; 2319 + case PD_MODE_V2: 2320 + fmt = AMD_IOMMU_V2; 2488 2321 break; 2489 - default: 2490 - return -EINVAL; 2491 2322 } 2492 2323 2493 - pgtbl_ops = 2494 - alloc_io_pgtable_ops(pgtable, &domain->iop.pgtbl.cfg, domain); 2324 + domain->iop.pgtbl.cfg.amd.nid = dev_to_node(dev); 2325 + pgtbl_ops = alloc_io_pgtable_ops(fmt, &domain->iop.pgtbl.cfg, domain); 2495 2326 if (!pgtbl_ops) 2496 2327 return -ENOMEM; 2497 2328 2498 2329 return 0; 2499 2330 } 2500 2331 2501 - static inline u64 dma_max_address(int pgtable) 2332 + static inline u64 dma_max_address(enum protection_domain_mode pgtable) 2502 2333 { 2503 - if (pgtable == AMD_IOMMU_V1) 2334 + if (pgtable == PD_MODE_V1) 2504 2335 return ~0ULL; 2505 2336 2506 2337 /* V2 with 4/5 level page table */ ··· 2507 2348 return iommu && (iommu->features & FEATURE_HDSUP); 2508 2349 } 2509 2350 2510 - static struct iommu_domain *do_iommu_domain_alloc(unsigned int type, 2511 - struct device *dev, 2512 - u32 flags, int pgtable) 2351 + static struct iommu_domain * 2352 + do_iommu_domain_alloc(struct device *dev, u32 flags, 2353 + enum protection_domain_mode pgtable) 2513 2354 { 2514 2355 bool dirty_tracking = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING; 2356 + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev); 2515 2357 struct protection_domain *domain; 2516 - struct amd_iommu *iommu = NULL; 2517 2358 int ret; 2518 2359 2519 - if (dev) 2520 - iommu = get_amd_iommu_from_dev(dev); 2521 - 2522 - /* 2523 - * Since DTE[Mode]=0 is prohibited on SNP-enabled system, 2524 - * default to use IOMMU_DOMAIN_DMA[_FQ]. 2525 - */ 2526 - if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY)) 2527 - return ERR_PTR(-EINVAL); 2528 - 2529 - domain = protection_domain_alloc(type, 2530 - dev ? dev_to_node(dev) : NUMA_NO_NODE); 2360 + domain = protection_domain_alloc(); 2531 2361 if (!domain) 2532 2362 return ERR_PTR(-ENOMEM); 2533 2363 2534 - ret = pdom_setup_pgtable(domain, type, pgtable); 2364 + domain->pd_mode = pgtable; 2365 + ret = pdom_setup_pgtable(domain, dev); 2535 2366 if (ret) { 2536 2367 pdom_id_free(domain->id); 2537 2368 kfree(domain); ··· 2533 2384 domain->domain.geometry.force_aperture = true; 2534 2385 domain->domain.pgsize_bitmap = domain->iop.pgtbl.cfg.pgsize_bitmap; 2535 2386 2536 - if (iommu) { 2537 - domain->domain.type = type; 2538 - domain->domain.ops = iommu->iommu.ops->default_domain_ops; 2387 + domain->domain.type = IOMMU_DOMAIN_UNMANAGED; 2388 + domain->domain.ops = iommu->iommu.ops->default_domain_ops; 2539 2389 2540 - if (dirty_tracking) 2541 - domain->domain.dirty_ops = &amd_dirty_ops; 2542 - } 2390 + if (dirty_tracking) 2391 + domain->domain.dirty_ops = &amd_dirty_ops; 2543 2392 2544 2393 return &domain->domain; 2545 - } 2546 - 2547 - static struct iommu_domain *amd_iommu_domain_alloc(unsigned int type) 2548 - { 2549 - struct iommu_domain *domain; 2550 - int pgtable = amd_iommu_pgtable; 2551 - 2552 - /* 2553 - * Force IOMMU v1 page table when allocating 2554 - * domain for pass-through devices. 
2555 - */ 2556 - if (type == IOMMU_DOMAIN_UNMANAGED) 2557 - pgtable = AMD_IOMMU_V1; 2558 - 2559 - domain = do_iommu_domain_alloc(type, NULL, 0, pgtable); 2560 - if (IS_ERR(domain)) 2561 - return NULL; 2562 - 2563 - return domain; 2564 2394 } 2565 2395 2566 2396 static struct iommu_domain * ··· 2547 2419 const struct iommu_user_data *user_data) 2548 2420 2549 2421 { 2550 - unsigned int type = IOMMU_DOMAIN_UNMANAGED; 2551 - struct amd_iommu *iommu = NULL; 2422 + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev); 2552 2423 const u32 supported_flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING | 2553 2424 IOMMU_HWPT_ALLOC_PASID; 2554 - 2555 - if (dev) 2556 - iommu = get_amd_iommu_from_dev(dev); 2557 2425 2558 2426 if ((flags & ~supported_flags) || user_data) 2559 2427 return ERR_PTR(-EOPNOTSUPP); 2560 2428 2561 - /* Allocate domain with v2 page table if IOMMU supports PASID. */ 2562 - if (flags & IOMMU_HWPT_ALLOC_PASID) { 2429 + switch (flags & supported_flags) { 2430 + case IOMMU_HWPT_ALLOC_DIRTY_TRACKING: 2431 + /* Allocate domain with v1 page table for dirty tracking */ 2432 + if (!amd_iommu_hd_support(iommu)) 2433 + break; 2434 + return do_iommu_domain_alloc(dev, flags, PD_MODE_V1); 2435 + case IOMMU_HWPT_ALLOC_PASID: 2436 + /* Allocate domain with v2 page table if IOMMU supports PASID. */ 2563 2437 if (!amd_iommu_pasid_supported()) 2564 - return ERR_PTR(-EOPNOTSUPP); 2565 - 2566 - return do_iommu_domain_alloc(type, dev, flags, AMD_IOMMU_V2); 2438 + break; 2439 + return do_iommu_domain_alloc(dev, flags, PD_MODE_V2); 2440 + case 0: 2441 + /* If nothing specific is required use the kernel commandline default */ 2442 + return do_iommu_domain_alloc(dev, 0, amd_iommu_pgtable); 2443 + default: 2444 + break; 2567 2445 } 2568 - 2569 - /* Allocate domain with v1 page table for dirty tracking */ 2570 - if (flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) { 2571 - if (iommu && amd_iommu_hd_support(iommu)) { 2572 - return do_iommu_domain_alloc(type, dev, 2573 - flags, AMD_IOMMU_V1); 2574 - } 2575 - 2576 - return ERR_PTR(-EOPNOTSUPP); 2577 - } 2578 - 2579 - /* If nothing specific is required use the kernel commandline default */ 2580 - return do_iommu_domain_alloc(type, dev, 0, amd_iommu_pgtable); 2446 + return ERR_PTR(-EOPNOTSUPP); 2581 2447 } 2582 2448 2583 2449 void amd_iommu_domain_free(struct iommu_domain *dom) ··· 2597 2475 return 0; 2598 2476 } 2599 2477 2478 + static int blocked_domain_set_dev_pasid(struct iommu_domain *domain, 2479 + struct device *dev, ioasid_t pasid, 2480 + struct iommu_domain *old) 2481 + { 2482 + amd_iommu_remove_dev_pasid(dev, pasid, old); 2483 + return 0; 2484 + } 2485 + 2600 2486 static struct iommu_domain blocked_domain = { 2601 2487 .type = IOMMU_DOMAIN_BLOCKED, 2602 2488 .ops = &(const struct iommu_domain_ops) { 2603 2489 .attach_dev = blocked_domain_attach_device, 2490 + .set_dev_pasid = blocked_domain_set_dev_pasid, 2604 2491 } 2605 2492 }; 2606 2493 ··· 2629 2498 2630 2499 identity_domain.id = pdom_id_alloc(); 2631 2500 2632 - protection_domain_init(&identity_domain, NUMA_NO_NODE); 2501 + protection_domain_init(&identity_domain); 2633 2502 } 2634 2503 2635 2504 /* Same as blocked domain except it supports only ops->attach_dev() */ ··· 2797 2666 bool enable) 2798 2667 { 2799 2668 struct protection_domain *pdomain = to_pdomain(domain); 2800 - struct dev_table_entry *dev_table; 2669 + struct dev_table_entry *dte; 2801 2670 struct iommu_dev_data *dev_data; 2802 2671 bool domain_flush = false; 2803 2672 struct amd_iommu *iommu; 2804 2673 unsigned long flags; 2805 - u64 pte_root; 2674 
+ u64 new; 2806 2675 2807 2676 spin_lock_irqsave(&pdomain->lock, flags); 2808 2677 if (!(pdomain->dirty_tracking ^ enable)) { ··· 2811 2680 } 2812 2681 2813 2682 list_for_each_entry(dev_data, &pdomain->dev_list, list) { 2683 + spin_lock(&dev_data->dte_lock); 2814 2684 iommu = get_amd_iommu_from_dev_data(dev_data); 2815 - 2816 - dev_table = get_dev_table(iommu); 2817 - pte_root = dev_table[dev_data->devid].data[0]; 2818 - 2819 - pte_root = (enable ? pte_root | DTE_FLAG_HAD : 2820 - pte_root & ~DTE_FLAG_HAD); 2685 + dte = &get_dev_table(iommu)[dev_data->devid]; 2686 + new = dte->data[0]; 2687 + new = (enable ? new | DTE_FLAG_HAD : new & ~DTE_FLAG_HAD); 2688 + dte->data[0] = new; 2689 + spin_unlock(&dev_data->dte_lock); 2821 2690 2822 2691 /* Flush device DTE */ 2823 - dev_table[dev_data->devid].data[0] = pte_root; 2824 2692 device_flush_dte(dev_data); 2825 2693 domain_flush = true; 2826 2694 } ··· 3020 2890 .blocked_domain = &blocked_domain, 3021 2891 .release_domain = &release_domain, 3022 2892 .identity_domain = &identity_domain.domain, 3023 - .domain_alloc = amd_iommu_domain_alloc, 3024 2893 .domain_alloc_paging_flags = amd_iommu_domain_alloc_paging_flags, 3025 2894 .domain_alloc_sva = amd_iommu_domain_alloc_sva, 3026 2895 .probe_device = amd_iommu_probe_device, ··· 3030 2901 .def_domain_type = amd_iommu_def_domain_type, 3031 2902 .dev_enable_feat = amd_iommu_dev_enable_feature, 3032 2903 .dev_disable_feat = amd_iommu_dev_disable_feature, 3033 - .remove_dev_pasid = amd_iommu_remove_dev_pasid, 3034 2904 .page_response = amd_iommu_page_response, 3035 2905 .default_domain_ops = &(const struct iommu_domain_ops) { 3036 2906 .attach_dev = amd_iommu_attach_device, ··· 3084 2956 static void set_dte_irq_entry(struct amd_iommu *iommu, u16 devid, 3085 2957 struct irq_remap_table *table) 3086 2958 { 3087 - u64 dte; 3088 - struct dev_table_entry *dev_table = get_dev_table(iommu); 2959 + u64 new; 2960 + struct dev_table_entry *dte = &get_dev_table(iommu)[devid]; 2961 + struct iommu_dev_data *dev_data = search_dev_data(iommu, devid); 3089 2962 3090 - dte = dev_table[devid].data[2]; 3091 - dte &= ~DTE_IRQ_PHYS_ADDR_MASK; 3092 - dte |= iommu_virt_to_phys(table->table); 3093 - dte |= DTE_IRQ_REMAP_INTCTL; 3094 - dte |= DTE_INTTABLEN; 3095 - dte |= DTE_IRQ_REMAP_ENABLE; 2963 + if (dev_data) 2964 + spin_lock(&dev_data->dte_lock); 3096 2965 3097 - dev_table[devid].data[2] = dte; 2966 + new = READ_ONCE(dte->data[2]); 2967 + new &= ~DTE_IRQ_PHYS_ADDR_MASK; 2968 + new |= iommu_virt_to_phys(table->table); 2969 + new |= DTE_IRQ_REMAP_INTCTL; 2970 + new |= DTE_INTTABLEN; 2971 + new |= DTE_IRQ_REMAP_ENABLE; 2972 + WRITE_ONCE(dte->data[2], new); 2973 + 2974 + if (dev_data) 2975 + spin_unlock(&dev_data->dte_lock); 3098 2976 } 3099 2977 3100 2978 static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 devid)
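For readers skimming the iommu.c hunk above, here is a compilable userspace reduction of the CMPXCHG128 idea behind update_dte256(): a 128-bit compare-and-exchange doubles as an atomic 128-bit store, so each half of the 256-bit DTE is written in one shot, with the half holding the valid bit written last when enabling and first when disabling. This is only a sketch of the primitive, not the kernel implementation (which uses arch_cmpxchg128_local() under the per-device dte_lock); it assumes a GCC/Clang build with 16-byte cmpxchg support (e.g. -mcx16 on x86-64, possibly -latomic), and the values stored are placeholders.

/* Userspace sketch of a 128-bit store built on compare-and-exchange. */
#include <stdio.h>

typedef unsigned __int128 u128;

static void atomic128_set(u128 *ptr, u128 val)
{
	u128 expected = *ptr;

	/*
	 * One iteration suffices when writers are already serialised (the
	 * kernel holds dev_data->dte_lock); the loop only refreshes a stale
	 * 'expected' snapshot.
	 */
	while (!__atomic_compare_exchange_n(ptr, &expected, val, 0,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		;
}

int main(void)
{
	/* Two 128-bit halves standing in for one 256-bit device table entry. */
	u128 dte[2] = { 0, 0 };
	u128 lower = 0x1;               /* placeholder; the real V bit lives in the lower half */
	u128 upper = (u128)0xabcd << 64; /* placeholder upper-half contents */

	atomic128_set(&dte[1], upper);  /* upper half first ...               */
	atomic128_set(&dte[0], lower);  /* ... half carrying the valid bit last */

	printf("dte[0] low64=%#llx\n", (unsigned long long)dte[0]);
	return 0;
}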
+2 -1
drivers/iommu/amd/pasid.c
··· 185 185 struct protection_domain *pdom; 186 186 int ret; 187 187 188 - pdom = protection_domain_alloc(IOMMU_DOMAIN_SVA, dev_to_node(dev)); 188 + pdom = protection_domain_alloc(); 189 189 if (!pdom) 190 190 return ERR_PTR(-ENOMEM); 191 191 192 192 pdom->domain.ops = &amd_sva_domain_ops; 193 193 pdom->mn.ops = &sva_mn; 194 + pdom->domain.type = IOMMU_DOMAIN_SVA; 194 195 195 196 ret = mmu_notifier_register(&pdom->mn, mm); 196 197 if (ret) {
+14 -1
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
··· 112 112 * from the current CPU register 113 113 */ 114 114 target->data[3] = cpu_to_le64(read_sysreg(mair_el1)); 115 + 116 + /* 117 + * Note that we don't bother with S1PIE on the SMMU, we just rely on 118 + * our default encoding scheme matching direct permissions anyway. 119 + * SMMU has no notion of S1POE nor GCS, so make sure that is clear if 120 + * either is enabled for CPUs, just in case anyone imagines otherwise. 121 + */ 122 + if (system_supports_poe() || system_supports_gcs()) 123 + dev_warn_once(master->smmu->dev, "SVA devices ignore permission overlays and GCS\n"); 115 124 } 116 125 EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd); 117 126 ··· 215 206 unsigned long asid_bits; 216 207 u32 feat_mask = ARM_SMMU_FEAT_COHERENCY; 217 208 218 - if (vabits_actual == 52) 209 + if (vabits_actual == 52) { 210 + /* We don't support LPA2 */ 211 + if (PAGE_SIZE != SZ_64K) 212 + return false; 219 213 feat_mask |= ARM_SMMU_FEAT_VAX; 214 + } 220 215 221 216 if ((smmu->features & feat_mask) != feat_mask) 222 217 return false;
+194 -104
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
··· 26 26 #include <linux/pci.h> 27 27 #include <linux/pci-ats.h> 28 28 #include <linux/platform_device.h> 29 + #include <linux/string_choices.h> 29 30 #include <kunit/visibility.h> 30 31 #include <uapi/linux/iommufd.h> 31 32 ··· 84 83 { 0, NULL}, 85 84 }; 86 85 87 - static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, 88 - struct arm_smmu_device *smmu, u32 flags); 86 + static const char * const event_str[] = { 87 + [EVT_ID_BAD_STREAMID_CONFIG] = "C_BAD_STREAMID", 88 + [EVT_ID_STE_FETCH_FAULT] = "F_STE_FETCH", 89 + [EVT_ID_BAD_STE_CONFIG] = "C_BAD_STE", 90 + [EVT_ID_STREAM_DISABLED_FAULT] = "F_STREAM_DISABLED", 91 + [EVT_ID_BAD_SUBSTREAMID_CONFIG] = "C_BAD_SUBSTREAMID", 92 + [EVT_ID_CD_FETCH_FAULT] = "F_CD_FETCH", 93 + [EVT_ID_BAD_CD_CONFIG] = "C_BAD_CD", 94 + [EVT_ID_TRANSLATION_FAULT] = "F_TRANSLATION", 95 + [EVT_ID_ADDR_SIZE_FAULT] = "F_ADDR_SIZE", 96 + [EVT_ID_ACCESS_FAULT] = "F_ACCESS", 97 + [EVT_ID_PERMISSION_FAULT] = "F_PERMISSION", 98 + [EVT_ID_VMS_FETCH_FAULT] = "F_VMS_FETCH", 99 + }; 100 + 101 + static const char * const event_class_str[] = { 102 + [0] = "CD fetch", 103 + [1] = "Stage 1 translation table fetch", 104 + [2] = "Input address caused fault", 105 + [3] = "Reserved", 106 + }; 107 + 89 108 static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master); 90 109 91 110 static void parse_driver_options(struct arm_smmu_device *smmu) ··· 1780 1759 } 1781 1760 1782 1761 /* IRQ and event handlers */ 1783 - static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) 1762 + static void arm_smmu_decode_event(struct arm_smmu_device *smmu, u64 *raw, 1763 + struct arm_smmu_event *event) 1764 + { 1765 + struct arm_smmu_master *master; 1766 + 1767 + event->id = FIELD_GET(EVTQ_0_ID, raw[0]); 1768 + event->sid = FIELD_GET(EVTQ_0_SID, raw[0]); 1769 + event->ssv = FIELD_GET(EVTQ_0_SSV, raw[0]); 1770 + event->ssid = event->ssv ? 
FIELD_GET(EVTQ_0_SSID, raw[0]) : IOMMU_NO_PASID; 1771 + event->privileged = FIELD_GET(EVTQ_1_PnU, raw[1]); 1772 + event->instruction = FIELD_GET(EVTQ_1_InD, raw[1]); 1773 + event->s2 = FIELD_GET(EVTQ_1_S2, raw[1]); 1774 + event->read = FIELD_GET(EVTQ_1_RnW, raw[1]); 1775 + event->stag = FIELD_GET(EVTQ_1_STAG, raw[1]); 1776 + event->stall = FIELD_GET(EVTQ_1_STALL, raw[1]); 1777 + event->class = FIELD_GET(EVTQ_1_CLASS, raw[1]); 1778 + event->iova = FIELD_GET(EVTQ_2_ADDR, raw[2]); 1779 + event->ipa = raw[3] & EVTQ_3_IPA; 1780 + event->fetch_addr = raw[3] & EVTQ_3_FETCH_ADDR; 1781 + event->ttrnw = FIELD_GET(EVTQ_1_TT_READ, raw[1]); 1782 + event->class_tt = false; 1783 + event->dev = NULL; 1784 + 1785 + if (event->id == EVT_ID_PERMISSION_FAULT) 1786 + event->class_tt = (event->class == EVTQ_1_CLASS_TT); 1787 + 1788 + mutex_lock(&smmu->streams_mutex); 1789 + master = arm_smmu_find_master(smmu, event->sid); 1790 + if (master) 1791 + event->dev = get_device(master->dev); 1792 + mutex_unlock(&smmu->streams_mutex); 1793 + } 1794 + 1795 + static int arm_smmu_handle_event(struct arm_smmu_device *smmu, 1796 + struct arm_smmu_event *event) 1784 1797 { 1785 1798 int ret = 0; 1786 1799 u32 perm = 0; 1787 1800 struct arm_smmu_master *master; 1788 - bool ssid_valid = evt[0] & EVTQ_0_SSV; 1789 - u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]); 1790 1801 struct iopf_fault fault_evt = { }; 1791 1802 struct iommu_fault *flt = &fault_evt.fault; 1792 1803 1793 - switch (FIELD_GET(EVTQ_0_ID, evt[0])) { 1804 + switch (event->id) { 1794 1805 case EVT_ID_TRANSLATION_FAULT: 1795 1806 case EVT_ID_ADDR_SIZE_FAULT: 1796 1807 case EVT_ID_ACCESS_FAULT: ··· 1832 1779 return -EOPNOTSUPP; 1833 1780 } 1834 1781 1835 - if (!(evt[1] & EVTQ_1_STALL)) 1782 + if (!event->stall) 1836 1783 return -EOPNOTSUPP; 1837 1784 1838 - if (evt[1] & EVTQ_1_RnW) 1785 + if (event->read) 1839 1786 perm |= IOMMU_FAULT_PERM_READ; 1840 1787 else 1841 1788 perm |= IOMMU_FAULT_PERM_WRITE; 1842 1789 1843 - if (evt[1] & EVTQ_1_InD) 1790 + if (event->instruction) 1844 1791 perm |= IOMMU_FAULT_PERM_EXEC; 1845 1792 1846 - if (evt[1] & EVTQ_1_PnU) 1793 + if (event->privileged) 1847 1794 perm |= IOMMU_FAULT_PERM_PRIV; 1848 1795 1849 1796 flt->type = IOMMU_FAULT_PAGE_REQ; 1850 1797 flt->prm = (struct iommu_fault_page_request) { 1851 1798 .flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE, 1852 - .grpid = FIELD_GET(EVTQ_1_STAG, evt[1]), 1799 + .grpid = event->stag, 1853 1800 .perm = perm, 1854 - .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), 1801 + .addr = event->iova, 1855 1802 }; 1856 1803 1857 - if (ssid_valid) { 1804 + if (event->ssv) { 1858 1805 flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; 1859 - flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); 1806 + flt->prm.pasid = event->ssid; 1860 1807 } 1861 1808 1862 1809 mutex_lock(&smmu->streams_mutex); 1863 - master = arm_smmu_find_master(smmu, sid); 1810 + master = arm_smmu_find_master(smmu, event->sid); 1864 1811 if (!master) { 1865 1812 ret = -EINVAL; 1866 1813 goto out_unlock; ··· 1872 1819 return ret; 1873 1820 } 1874 1821 1822 + static void arm_smmu_dump_raw_event(struct arm_smmu_device *smmu, u64 *raw, 1823 + struct arm_smmu_event *event) 1824 + { 1825 + int i; 1826 + 1827 + dev_err(smmu->dev, "event 0x%02x received:\n", event->id); 1828 + 1829 + for (i = 0; i < EVTQ_ENT_DWORDS; ++i) 1830 + dev_err(smmu->dev, "\t0x%016llx\n", raw[i]); 1831 + } 1832 + 1833 + #define ARM_SMMU_EVT_KNOWN(e) ((e)->id < ARRAY_SIZE(event_str) && event_str[(e)->id]) 1834 + #define ARM_SMMU_LOG_EVT_STR(e) ARM_SMMU_EVT_KNOWN(e) ? 
event_str[(e)->id] : "UNKNOWN" 1835 + #define ARM_SMMU_LOG_CLIENT(e) (e)->dev ? dev_name((e)->dev) : "(unassigned sid)" 1836 + 1837 + static void arm_smmu_dump_event(struct arm_smmu_device *smmu, u64 *raw, 1838 + struct arm_smmu_event *evt, 1839 + struct ratelimit_state *rs) 1840 + { 1841 + if (!__ratelimit(rs)) 1842 + return; 1843 + 1844 + arm_smmu_dump_raw_event(smmu, raw, evt); 1845 + 1846 + switch (evt->id) { 1847 + case EVT_ID_TRANSLATION_FAULT: 1848 + case EVT_ID_ADDR_SIZE_FAULT: 1849 + case EVT_ID_ACCESS_FAULT: 1850 + case EVT_ID_PERMISSION_FAULT: 1851 + dev_err(smmu->dev, "event: %s client: %s sid: %#x ssid: %#x iova: %#llx ipa: %#llx", 1852 + ARM_SMMU_LOG_EVT_STR(evt), ARM_SMMU_LOG_CLIENT(evt), 1853 + evt->sid, evt->ssid, evt->iova, evt->ipa); 1854 + 1855 + dev_err(smmu->dev, "%s %s %s %s \"%s\"%s%s stag: %#x", 1856 + evt->privileged ? "priv" : "unpriv", 1857 + evt->instruction ? "inst" : "data", 1858 + str_read_write(evt->read), 1859 + evt->s2 ? "s2" : "s1", event_class_str[evt->class], 1860 + evt->class_tt ? (evt->ttrnw ? " ttd_read" : " ttd_write") : "", 1861 + evt->stall ? " stall" : "", evt->stag); 1862 + 1863 + break; 1864 + 1865 + case EVT_ID_STE_FETCH_FAULT: 1866 + case EVT_ID_CD_FETCH_FAULT: 1867 + case EVT_ID_VMS_FETCH_FAULT: 1868 + dev_err(smmu->dev, "event: %s client: %s sid: %#x ssid: %#x fetch_addr: %#llx", 1869 + ARM_SMMU_LOG_EVT_STR(evt), ARM_SMMU_LOG_CLIENT(evt), 1870 + evt->sid, evt->ssid, evt->fetch_addr); 1871 + 1872 + break; 1873 + 1874 + default: 1875 + dev_err(smmu->dev, "event: %s client: %s sid: %#x ssid: %#x", 1876 + ARM_SMMU_LOG_EVT_STR(evt), ARM_SMMU_LOG_CLIENT(evt), 1877 + evt->sid, evt->ssid); 1878 + } 1879 + } 1880 + 1875 1881 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) 1876 1882 { 1877 - int i, ret; 1883 + u64 evt[EVTQ_ENT_DWORDS]; 1884 + struct arm_smmu_event event = {0}; 1878 1885 struct arm_smmu_device *smmu = dev; 1879 1886 struct arm_smmu_queue *q = &smmu->evtq.q; 1880 1887 struct arm_smmu_ll_queue *llq = &q->llq; 1881 1888 static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, 1882 1889 DEFAULT_RATELIMIT_BURST); 1883 - u64 evt[EVTQ_ENT_DWORDS]; 1884 1890 1885 1891 do { 1886 1892 while (!queue_remove_raw(q, evt)) { 1887 - u8 id = FIELD_GET(EVTQ_0_ID, evt[0]); 1893 + arm_smmu_decode_event(smmu, evt, &event); 1894 + if (arm_smmu_handle_event(smmu, &event)) 1895 + arm_smmu_dump_event(smmu, evt, &event, &rs); 1888 1896 1889 - ret = arm_smmu_handle_evt(smmu, evt); 1890 - if (!ret || !__ratelimit(&rs)) 1891 - continue; 1892 - 1893 - dev_info(smmu->dev, "event 0x%02x received:\n", id); 1894 - for (i = 0; i < ARRAY_SIZE(evt); ++i) 1895 - dev_info(smmu->dev, "\t0x%016llx\n", 1896 - (unsigned long long)evt[i]); 1897 - 1897 + put_device(event.dev); 1898 1898 cond_resched(); 1899 1899 } 1900 1900 ··· 2459 2353 if (!smmu_domain) 2460 2354 return ERR_PTR(-ENOMEM); 2461 2355 2462 - mutex_init(&smmu_domain->init_mutex); 2463 2356 INIT_LIST_HEAD(&smmu_domain->devices); 2464 2357 spin_lock_init(&smmu_domain->devices_lock); 2465 2358 2466 2359 return smmu_domain; 2467 - } 2468 - 2469 - static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) 2470 - { 2471 - struct arm_smmu_domain *smmu_domain; 2472 - 2473 - /* 2474 - * Allocate the domain and initialise some of its data structures. 2475 - * We can't really do anything meaningful until we've added a 2476 - * master. 
2477 - */ 2478 - smmu_domain = arm_smmu_domain_alloc(); 2479 - if (IS_ERR(smmu_domain)) 2480 - return ERR_CAST(smmu_domain); 2481 - 2482 - if (dev) { 2483 - struct arm_smmu_master *master = dev_iommu_priv_get(dev); 2484 - int ret; 2485 - 2486 - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, 0); 2487 - if (ret) { 2488 - kfree(smmu_domain); 2489 - return ERR_PTR(ret); 2490 - } 2491 - } 2492 - return &smmu_domain->domain; 2493 2360 } 2494 2361 2495 2362 static void arm_smmu_domain_free_paging(struct iommu_domain *domain) ··· 2529 2450 int (*finalise_stage_fn)(struct arm_smmu_device *smmu, 2530 2451 struct arm_smmu_domain *smmu_domain); 2531 2452 bool enable_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING; 2532 - 2533 - /* Restrict the stage to what we can actually support */ 2534 - if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) 2535 - smmu_domain->stage = ARM_SMMU_DOMAIN_S2; 2536 - if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2)) 2537 - smmu_domain->stage = ARM_SMMU_DOMAIN_S1; 2538 2453 2539 2454 pgtbl_cfg = (struct io_pgtable_cfg) { 2540 2455 .pgsize_bitmap = smmu->pgsize_bitmap, ··· 2818 2745 * Translation Requests and Translated transactions are denied 2819 2746 * as though ATS is disabled for the stream (STE.EATS == 0b00), 2820 2747 * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events 2821 - * (IHI0070Ea 5.2 Stream Table Entry). Thus ATS can only be 2822 - * enabled if we have arm_smmu_domain, those always have page 2823 - * tables. 2748 + * (IHI0070Ea 5.2 Stream Table Entry). 2749 + * 2750 + * However, if we have installed a CD table and are using S1DSS 2751 + * then ATS will work in S1DSS bypass. See "13.6.4 Full ATS 2752 + * skipping stage 1". 2753 + * 2754 + * Disable ATS if we are going to create a normal 0b100 bypass 2755 + * STE. 
2824 2756 */ 2825 2757 state->ats_enabled = !state->disable_ats && 2826 2758 arm_smmu_ats_supported(master); ··· 2931 2853 state.master = master = dev_iommu_priv_get(dev); 2932 2854 smmu = master->smmu; 2933 2855 2934 - mutex_lock(&smmu_domain->init_mutex); 2935 - 2936 - if (!smmu_domain->smmu) { 2937 - ret = arm_smmu_domain_finalise(smmu_domain, smmu, 0); 2938 - } else if (smmu_domain->smmu != smmu) 2939 - ret = -EINVAL; 2940 - 2941 - mutex_unlock(&smmu_domain->init_mutex); 2942 - if (ret) 2856 + if (smmu_domain->smmu != smmu) 2943 2857 return ret; 2944 2858 2945 2859 if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { ··· 2988 2918 struct arm_smmu_master *master = dev_iommu_priv_get(dev); 2989 2919 struct arm_smmu_device *smmu = master->smmu; 2990 2920 struct arm_smmu_cd target_cd; 2991 - int ret = 0; 2992 2921 2993 - mutex_lock(&smmu_domain->init_mutex); 2994 - if (!smmu_domain->smmu) 2995 - ret = arm_smmu_domain_finalise(smmu_domain, smmu, 0); 2996 - else if (smmu_domain->smmu != smmu) 2997 - ret = -EINVAL; 2998 - mutex_unlock(&smmu_domain->init_mutex); 2999 - if (ret) 3000 - return ret; 2922 + if (smmu_domain->smmu != smmu) 2923 + return -EINVAL; 3001 2924 3002 2925 if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) 3003 2926 return -EINVAL; ··· 3079 3016 return ret; 3080 3017 } 3081 3018 3082 - static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, 3083 - struct iommu_domain *domain) 3019 + static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain, 3020 + struct device *dev, ioasid_t pasid, 3021 + struct iommu_domain *old_domain) 3084 3022 { 3023 + struct arm_smmu_domain *smmu_domain = to_smmu_domain(old_domain); 3085 3024 struct arm_smmu_master *master = dev_iommu_priv_get(dev); 3086 - struct arm_smmu_domain *smmu_domain; 3087 - 3088 - smmu_domain = to_smmu_domain(domain); 3089 3025 3090 3026 mutex_lock(&arm_smmu_asid_lock); 3091 3027 arm_smmu_clear_cd(master, pasid); ··· 3105 3043 sid_domain->type == IOMMU_DOMAIN_BLOCKED) 3106 3044 sid_domain->ops->attach_dev(sid_domain, dev); 3107 3045 } 3046 + return 0; 3108 3047 } 3109 3048 3110 3049 static void arm_smmu_attach_dev_ste(struct iommu_domain *domain, ··· 3133 3070 if (arm_smmu_ssids_in_use(&master->cd_table)) { 3134 3071 /* 3135 3072 * If a CD table has to be present then we need to run with ATS 3136 - * on even though the RID will fail ATS queries with UR. This is 3137 - * because we have no idea what the PASID's need. 3073 + * on because we have to assume a PASID is using ATS. For 3074 + * IDENTITY this will setup things so that S1DSS=bypass which 3075 + * follows the explanation in "13.6.4 Full ATS skipping stage 1" 3076 + * and allows for ATS on the RID to work. 
3138 3077 */ 3139 3078 state.cd_needs_ats = true; 3140 3079 arm_smmu_attach_prepare(&state, domain); ··· 3189 3124 3190 3125 static const struct iommu_domain_ops arm_smmu_blocked_ops = { 3191 3126 .attach_dev = arm_smmu_attach_dev_blocked, 3127 + .set_dev_pasid = arm_smmu_blocking_set_dev_pasid, 3192 3128 }; 3193 3129 3194 3130 static struct iommu_domain arm_smmu_blocked_domain = { ··· 3202 3136 const struct iommu_user_data *user_data) 3203 3137 { 3204 3138 struct arm_smmu_master *master = dev_iommu_priv_get(dev); 3139 + struct arm_smmu_device *smmu = master->smmu; 3205 3140 const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING | 3206 3141 IOMMU_HWPT_ALLOC_PASID | 3207 3142 IOMMU_HWPT_ALLOC_NEST_PARENT; ··· 3214 3147 if (user_data) 3215 3148 return ERR_PTR(-EOPNOTSUPP); 3216 3149 3217 - if (flags & IOMMU_HWPT_ALLOC_PASID) 3218 - return arm_smmu_domain_alloc_paging(dev); 3219 - 3220 3150 smmu_domain = arm_smmu_domain_alloc(); 3221 3151 if (IS_ERR(smmu_domain)) 3222 3152 return ERR_CAST(smmu_domain); 3223 3153 3224 - if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) { 3225 - if (!(master->smmu->features & ARM_SMMU_FEAT_NESTING)) { 3154 + switch (flags) { 3155 + case 0: 3156 + /* Prefer S1 if available */ 3157 + if (smmu->features & ARM_SMMU_FEAT_TRANS_S1) 3158 + smmu_domain->stage = ARM_SMMU_DOMAIN_S1; 3159 + else 3160 + smmu_domain->stage = ARM_SMMU_DOMAIN_S2; 3161 + break; 3162 + case IOMMU_HWPT_ALLOC_NEST_PARENT: 3163 + if (!(smmu->features & ARM_SMMU_FEAT_NESTING)) { 3226 3164 ret = -EOPNOTSUPP; 3227 3165 goto err_free; 3228 3166 } 3229 3167 smmu_domain->stage = ARM_SMMU_DOMAIN_S2; 3230 3168 smmu_domain->nest_parent = true; 3169 + break; 3170 + case IOMMU_HWPT_ALLOC_DIRTY_TRACKING: 3171 + case IOMMU_HWPT_ALLOC_DIRTY_TRACKING | IOMMU_HWPT_ALLOC_PASID: 3172 + case IOMMU_HWPT_ALLOC_PASID: 3173 + if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) { 3174 + ret = -EOPNOTSUPP; 3175 + goto err_free; 3176 + } 3177 + smmu_domain->stage = ARM_SMMU_DOMAIN_S1; 3178 + break; 3179 + default: 3180 + ret = -EOPNOTSUPP; 3181 + goto err_free; 3231 3182 } 3232 3183 3233 3184 smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED; 3234 3185 smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops; 3235 - ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, flags); 3186 + ret = arm_smmu_domain_finalise(smmu_domain, smmu, flags); 3236 3187 if (ret) 3237 3188 goto err_free; 3238 3189 return &smmu_domain->domain; ··· 3322 3237 static 3323 3238 struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode) 3324 3239 { 3325 - struct device *dev = driver_find_device_by_fwnode(&arm_smmu_driver.driver, 3326 - fwnode); 3240 + struct device *dev = bus_find_device_by_fwnode(&platform_bus_type, fwnode); 3241 + 3327 3242 put_device(dev); 3328 3243 return dev ? 
dev_get_drvdata(dev) : NULL; 3329 3244 } ··· 3628 3543 .blocked_domain = &arm_smmu_blocked_domain, 3629 3544 .capable = arm_smmu_capable, 3630 3545 .hw_info = arm_smmu_hw_info, 3631 - .domain_alloc_paging = arm_smmu_domain_alloc_paging, 3632 3546 .domain_alloc_sva = arm_smmu_sva_domain_alloc, 3633 3547 .domain_alloc_paging_flags = arm_smmu_domain_alloc_paging_flags, 3634 3548 .probe_device = arm_smmu_probe_device, ··· 3635 3551 .device_group = arm_smmu_device_group, 3636 3552 .of_xlate = arm_smmu_of_xlate, 3637 3553 .get_resv_regions = arm_smmu_get_resv_regions, 3638 - .remove_dev_pasid = arm_smmu_remove_dev_pasid, 3639 3554 .dev_enable_feat = arm_smmu_dev_enable_feature, 3640 3555 .dev_disable_feat = arm_smmu_dev_disable_feature, 3641 3556 .page_response = arm_smmu_page_response, ··· 4322 4239 */ 4323 4240 if (!!(reg & IDR0_COHACC) != coherent) 4324 4241 dev_warn(smmu->dev, "IDR0.COHACC overridden by FW configuration (%s)\n", 4325 - coherent ? "true" : "false"); 4242 + str_true_false(coherent)); 4326 4243 4327 4244 switch (FIELD_GET(IDR0_STALL_MODEL, reg)) { 4328 4245 case IDR0_STALL_MODEL_FORCE: ··· 4746 4663 /* Initialise in-memory data structures */ 4747 4664 ret = arm_smmu_init_structures(smmu); 4748 4665 if (ret) 4749 - return ret; 4666 + goto err_free_iopf; 4750 4667 4751 4668 /* Record our private device structure */ 4752 4669 platform_set_drvdata(pdev, smmu); ··· 4757 4674 /* Reset the device */ 4758 4675 ret = arm_smmu_device_reset(smmu); 4759 4676 if (ret) 4760 - return ret; 4677 + goto err_disable; 4761 4678 4762 4679 /* And we're up. Go go go! */ 4763 4680 ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, 4764 4681 "smmu3.%pa", &ioaddr); 4765 4682 if (ret) 4766 - return ret; 4683 + goto err_disable; 4767 4684 4768 4685 ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev); 4769 4686 if (ret) { 4770 4687 dev_err(dev, "Failed to register iommu\n"); 4771 - iommu_device_sysfs_remove(&smmu->iommu); 4772 - return ret; 4688 + goto err_free_sysfs; 4773 4689 } 4774 4690 4775 4691 return 0; 4692 + 4693 + err_free_sysfs: 4694 + iommu_device_sysfs_remove(&smmu->iommu); 4695 + err_disable: 4696 + arm_smmu_device_disable(smmu); 4697 + err_free_iopf: 4698 + iopf_queue_free(smmu->evtq.iopf); 4699 + return ret; 4776 4700 } 4777 4701 4778 4702 static void arm_smmu_device_remove(struct platform_device *pdev)
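With the decode/dump split above, an event that arm_smmu_handle_event() cannot service is no longer only a raw hex dump; after the raw dwords it is also summarised in decoded form. For a stalled translation fault the summary is shaped like the lines below (device prefix omitted; the values and client name are illustrative, reconstructed from the dev_err() format strings in this hunk rather than taken from a real log):

	event: F_TRANSLATION client: 0000:01:00.0 sid: 0x10 ssid: 0x2 iova: 0xffff80001000 ipa: 0
	unpriv data read s1 "Input address caused fault" stall stag: 0x21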
+30 -1
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
··· 452 452 453 453 #define EVTQ_0_ID GENMASK_ULL(7, 0) 454 454 455 + #define EVT_ID_BAD_STREAMID_CONFIG 0x02 456 + #define EVT_ID_STE_FETCH_FAULT 0x03 457 + #define EVT_ID_BAD_STE_CONFIG 0x04 458 + #define EVT_ID_STREAM_DISABLED_FAULT 0x06 459 + #define EVT_ID_BAD_SUBSTREAMID_CONFIG 0x08 460 + #define EVT_ID_CD_FETCH_FAULT 0x09 461 + #define EVT_ID_BAD_CD_CONFIG 0x0a 455 462 #define EVT_ID_TRANSLATION_FAULT 0x10 456 463 #define EVT_ID_ADDR_SIZE_FAULT 0x11 457 464 #define EVT_ID_ACCESS_FAULT 0x12 458 465 #define EVT_ID_PERMISSION_FAULT 0x13 466 + #define EVT_ID_VMS_FETCH_FAULT 0x25 459 467 460 468 #define EVTQ_0_SSV (1UL << 11) 461 469 #define EVTQ_0_SSID GENMASK_ULL(31, 12) ··· 475 467 #define EVTQ_1_RnW (1UL << 35) 476 468 #define EVTQ_1_S2 (1UL << 39) 477 469 #define EVTQ_1_CLASS GENMASK_ULL(41, 40) 470 + #define EVTQ_1_CLASS_TT 0x01 478 471 #define EVTQ_1_TT_READ (1UL << 44) 479 472 #define EVTQ_2_ADDR GENMASK_ULL(63, 0) 480 473 #define EVTQ_3_IPA GENMASK_ULL(51, 12) 474 + #define EVTQ_3_FETCH_ADDR GENMASK_ULL(51, 3) 481 475 482 476 /* PRI queue */ 483 477 #define PRIQ_ENT_SZ_SHIFT 4 ··· 799 789 struct rb_node node; 800 790 }; 801 791 792 + struct arm_smmu_event { 793 + u8 stall : 1, 794 + ssv : 1, 795 + privileged : 1, 796 + instruction : 1, 797 + s2 : 1, 798 + read : 1, 799 + ttrnw : 1, 800 + class_tt : 1; 801 + u8 id; 802 + u8 class; 803 + u16 stag; 804 + u32 sid; 805 + u32 ssid; 806 + u64 iova; 807 + u64 ipa; 808 + u64 fetch_addr; 809 + struct device *dev; 810 + }; 811 + 802 812 /* SMMU private data for each master */ 803 813 struct arm_smmu_master { 804 814 struct arm_smmu_device *smmu; ··· 843 813 844 814 struct arm_smmu_domain { 845 815 struct arm_smmu_device *smmu; 846 - struct mutex init_mutex; /* Protects smmu pointer */ 847 816 848 817 struct io_pgtable_ops *pgtbl_ops; 849 818 atomic_t nr_ats_masters;
+5 -3
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
··· 79 79 #define TEGRA241_VCMDQ_PAGE1(q) (TEGRA241_VCMDQ_PAGE1_BASE + 0x80*(q)) 80 80 #define VCMDQ_ADDR GENMASK(47, 5) 81 81 #define VCMDQ_LOG2SIZE GENMASK(4, 0) 82 - #define VCMDQ_LOG2SIZE_MAX 19 83 82 84 83 #define TEGRA241_VCMDQ_BASE 0x00000 85 84 #define TEGRA241_VCMDQ_CONS_INDX_BASE 0x00008 ··· 504 505 struct arm_smmu_cmdq *cmdq = &vcmdq->cmdq; 505 506 struct arm_smmu_queue *q = &cmdq->q; 506 507 char name[16]; 508 + u32 regval; 507 509 int ret; 508 510 509 511 snprintf(name, 16, "vcmdq%u", vcmdq->idx); 510 512 511 - /* Queue size, capped to ensure natural alignment */ 512 - q->llq.max_n_shift = min_t(u32, CMDQ_MAX_SZ_SHIFT, VCMDQ_LOG2SIZE_MAX); 513 + /* Cap queue size to SMMU's IDR1.CMDQS and ensure natural alignment */ 514 + regval = readl_relaxed(smmu->base + ARM_SMMU_IDR1); 515 + q->llq.max_n_shift = 516 + min_t(u32, CMDQ_MAX_SZ_SHIFT, FIELD_GET(IDR1_CMDQS, regval)); 513 517 514 518 /* Use the common helper to init the VCMDQ, and then... */ 515 519 ret = arm_smmu_init_one_queue(smmu, q, vcmdq->page0,
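Instead of the hard-coded VCMDQ_LOG2SIZE_MAX, the VCMDQ size is now capped by the owning SMMU's IDR1.CMDQS field, so a virtual queue can never be configured larger than the hardware command queue. With an illustrative value, and assuming the driver-side CMDQ_MAX_SZ_SHIFT is the larger of the two:

	/* Illustrative only: if IDR1.CMDQS reads back as 8, then
	 *   q->llq.max_n_shift = min(CMDQ_MAX_SZ_SHIFT, 8) = 8
	 * i.e. each VCMDQ holds at most 2^8 entries, matching the SMMU. */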
+3 -2
drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
··· 110 110 int arm_mmu500_reset(struct arm_smmu_device *smmu) 111 111 { 112 112 u32 reg, major; 113 - int i; 114 113 /* 115 114 * On MMU-500 r2p0 onwards we need to clear ACR.CACHE_LOCK before 116 115 * writes to the context bank ACTLRs will stick. And we just hope that ··· 127 128 reg |= ARM_MMU500_ACR_SMTNMB_TLBEN | ARM_MMU500_ACR_S2CRB_TLBEN; 128 129 arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sACR, reg); 129 130 131 + #ifdef CONFIG_ARM_SMMU_MMU_500_CPRE_ERRATA 130 132 /* 131 133 * Disable MMU-500's not-particularly-beneficial next-page 132 134 * prefetcher for the sake of at least 5 known errata. 133 135 */ 134 - for (i = 0; i < smmu->num_context_banks; ++i) { 136 + for (int i = 0; i < smmu->num_context_banks; ++i) { 135 137 reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR); 136 138 reg &= ~ARM_MMU500_ACTLR_CPRE; 137 139 arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg); ··· 140 140 if (reg & ARM_MMU500_ACTLR_CPRE) 141 141 dev_warn_once(smmu->dev, "Failed to disable prefetcher for errata workarounds, check SACR.CACHE_LOCK\n"); 142 142 } 143 + #endif 143 144 144 145 return 0; 145 146 }
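The prefetcher disable is now guarded by CONFIG_ARM_SMMU_MMU_500_CPRE_ERRATA (the option named in the #ifdef above), so kernels that opt out keep ACTLR.CPRE set and skip the per-context-bank writes. In .config terms the choice looks roughly like this (a sketch; the actual Kconfig entry and help text are not part of this hunk):

	# Keep the workaround: clear ACTLR.CPRE in every context bank at reset
	CONFIG_ARM_SMMU_MMU_500_CPRE_ERRATA=y

	# Opt out: leave the MMU-500 next-page prefetcher enabled
	# CONFIG_ARM_SMMU_MMU_500_CPRE_ERRATA is not set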
+1 -1
drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
··· 73 73 if (__ratelimit(&rs)) { 74 74 dev_err(smmu->dev, "TLB sync timed out -- SMMU may be deadlocked\n"); 75 75 76 - cfg = qsmmu->cfg; 76 + cfg = qsmmu->data->cfg; 77 77 if (!cfg) 78 78 return; 79 79
+120 -1
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
··· 16 16 17 17 #define QCOM_DUMMY_VAL -1 18 18 19 + /* 20 + * SMMU-500 TRM defines BIT(0) as CMTLB (Enable context caching in the 21 + * macro TLB) and BIT(1) as CPRE (Enable context caching in the prefetch 22 + * buffer). The remaining bits are implementation defined and vary across 23 + * SoCs. 24 + */ 25 + 26 + #define CPRE (1 << 1) 27 + #define CMTLB (1 << 0) 28 + #define PREFETCH_SHIFT 8 29 + #define PREFETCH_DEFAULT 0 30 + #define PREFETCH_SHALLOW (1 << PREFETCH_SHIFT) 31 + #define PREFETCH_MODERATE (2 << PREFETCH_SHIFT) 32 + #define PREFETCH_DEEP (3 << PREFETCH_SHIFT) 33 + #define GFX_ACTLR_PRR (1 << 5) 34 + 35 + static const struct of_device_id qcom_smmu_actlr_client_of_match[] = { 36 + { .compatible = "qcom,adreno", 37 + .data = (const void *) (PREFETCH_DEEP | CPRE | CMTLB) }, 38 + { .compatible = "qcom,adreno-gmu", 39 + .data = (const void *) (PREFETCH_DEEP | CPRE | CMTLB) }, 40 + { .compatible = "qcom,adreno-smmu", 41 + .data = (const void *) (PREFETCH_DEEP | CPRE | CMTLB) }, 42 + { .compatible = "qcom,fastrpc", 43 + .data = (const void *) (PREFETCH_DEEP | CPRE | CMTLB) }, 44 + { .compatible = "qcom,sc7280-mdss", 45 + .data = (const void *) (PREFETCH_SHALLOW | CPRE | CMTLB) }, 46 + { .compatible = "qcom,sc7280-venus", 47 + .data = (const void *) (PREFETCH_SHALLOW | CPRE | CMTLB) }, 48 + { .compatible = "qcom,sm8550-mdss", 49 + .data = (const void *) (PREFETCH_DEFAULT | CMTLB) }, 50 + { } 51 + }; 52 + 19 53 static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) 20 54 { 21 55 return container_of(smmu, struct qcom_smmu, smmu); ··· 131 97 reg |= ARM_SMMU_RESUME_TERMINATE; 132 98 133 99 arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); 100 + } 101 + 102 + static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) 103 + { 104 + struct arm_smmu_domain *smmu_domain = (void *)cookie; 105 + struct arm_smmu_device *smmu = smmu_domain->smmu; 106 + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; 107 + u32 reg = 0; 108 + int ret; 109 + 110 + ret = pm_runtime_resume_and_get(smmu->dev); 111 + if (ret < 0) { 112 + dev_err(smmu->dev, "failed to get runtime PM: %d\n", ret); 113 + return; 114 + } 115 + 116 + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); 117 + reg &= ~GFX_ACTLR_PRR; 118 + if (set) 119 + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); 120 + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); 121 + pm_runtime_put_autosuspend(smmu->dev); 122 + } 123 + 124 + static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) 125 + { 126 + struct arm_smmu_domain *smmu_domain = (void *)cookie; 127 + struct arm_smmu_device *smmu = smmu_domain->smmu; 128 + int ret; 129 + 130 + ret = pm_runtime_resume_and_get(smmu->dev); 131 + if (ret < 0) { 132 + dev_err(smmu->dev, "failed to get runtime PM: %d\n", ret); 133 + return; 134 + } 135 + 136 + writel_relaxed(lower_32_bits(page_addr), 137 + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); 138 + writel_relaxed(upper_32_bits(page_addr), 139 + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); 140 + pm_runtime_put_autosuspend(smmu->dev); 134 141 } 135 142 136 143 #define QCOM_ADRENO_SMMU_GPU_SID 0 ··· 282 207 return true; 283 208 } 284 209 210 + static void qcom_smmu_set_actlr_dev(struct device *dev, struct arm_smmu_device *smmu, int cbndx, 211 + const struct of_device_id *client_match) 212 + { 213 + const struct of_device_id *match = 214 + of_match_device(client_match, dev); 215 + 216 + if (!match) { 217 + dev_dbg(dev, "no ACTLR settings present\n"); 218 + return; 219 + } 220 + 221 + 
arm_smmu_cb_write(smmu, cbndx, ARM_SMMU_CB_ACTLR, (unsigned long)match->data); 222 + } 223 + 285 224 static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, 286 225 struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) 287 226 { 227 + const struct device_node *np = smmu_domain->smmu->dev->of_node; 228 + struct arm_smmu_device *smmu = smmu_domain->smmu; 229 + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); 230 + const struct of_device_id *client_match; 231 + int cbndx = smmu_domain->cfg.cbndx; 288 232 struct adreno_smmu_priv *priv; 289 233 290 234 smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; 235 + 236 + client_match = qsmmu->data->client_match; 237 + 238 + if (client_match) 239 + qcom_smmu_set_actlr_dev(dev, smmu, cbndx, client_match); 291 240 292 241 /* Only enable split pagetables for the GPU device (SID 0) */ 293 242 if (!qcom_adreno_smmu_is_gpu_device(dev)) ··· 338 239 priv->get_fault_info = qcom_adreno_smmu_get_fault_info; 339 240 priv->set_stall = qcom_adreno_smmu_set_stall; 340 241 priv->resume_translation = qcom_adreno_smmu_resume_translation; 242 + priv->set_prr_bit = NULL; 243 + priv->set_prr_addr = NULL; 244 + 245 + if (of_device_is_compatible(np, "qcom,smmu-500") && 246 + of_device_is_compatible(np, "qcom,adreno-smmu")) { 247 + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; 248 + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; 249 + } 341 250 342 251 return 0; 343 252 } ··· 376 269 static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain, 377 270 struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) 378 271 { 272 + struct arm_smmu_device *smmu = smmu_domain->smmu; 273 + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); 274 + const struct of_device_id *client_match; 275 + int cbndx = smmu_domain->cfg.cbndx; 276 + 379 277 smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; 278 + 279 + client_match = qsmmu->data->client_match; 280 + 281 + if (client_match) 282 + qcom_smmu_set_actlr_dev(dev, smmu, cbndx, client_match); 380 283 381 284 return 0; 382 285 } ··· 624 507 return ERR_PTR(-ENOMEM); 625 508 626 509 qsmmu->smmu.impl = impl; 627 - qsmmu->cfg = data->cfg; 510 + qsmmu->data = data; 628 511 629 512 return &qsmmu->smmu; 630 513 } ··· 667 550 .impl = &qcom_smmu_500_impl, 668 551 .adreno_impl = &qcom_adreno_smmu_500_impl, 669 552 .cfg = &qcom_smmu_impl0_cfg, 553 + .client_match = qcom_smmu_actlr_client_of_match, 670 554 }; 671 555 672 556 /* ··· 685 567 { .compatible = "qcom,sc8180x-smmu-500", .data = &qcom_smmu_500_impl0_data }, 686 568 { .compatible = "qcom,sc8280xp-smmu-500", .data = &qcom_smmu_500_impl0_data }, 687 569 { .compatible = "qcom,sdm630-smmu-v2", .data = &qcom_smmu_v2_data }, 570 + { .compatible = "qcom,sdm670-smmu-v2", .data = &qcom_smmu_v2_data }, 688 571 { .compatible = "qcom,sdm845-smmu-v2", .data = &qcom_smmu_v2_data }, 689 572 { .compatible = "qcom,sdm845-smmu-500", .data = &sdm845_smmu_500_data }, 690 573 { .compatible = "qcom,sm6115-smmu-500", .data = &qcom_smmu_500_impl0_data},
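The new set_prr_bit()/set_prr_addr() callbacks are only filled in for instances matching both qcom,smmu-500 and qcom,adreno-smmu, and are meant to be driven from the GPU side through its adreno_smmu_priv handle. A minimal consumer sketch, assuming the caller already holds the priv pointer and has reserved a physical page for PRR (neither is part of this hunk):

	/* Sketch only, not taken from the GPU driver. */
	static void enable_prr(struct adreno_smmu_priv *priv, phys_addr_t prr_page)
	{
		if (!priv->set_prr_addr || !priv->set_prr_bit)
			return;		/* SMMU integration without PRR support */

		priv->set_prr_addr(priv->cookie, prr_page);
		priv->set_prr_bit(priv->cookie, true);
	}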
+2 -1
drivers/iommu/arm/arm-smmu/arm-smmu-qcom.h
··· 8 8 9 9 struct qcom_smmu { 10 10 struct arm_smmu_device smmu; 11 - const struct qcom_smmu_config *cfg; 11 + const struct qcom_smmu_match_data *data; 12 12 bool bypass_quirk; 13 13 u8 bypass_cbndx; 14 14 u32 stall_enabled; ··· 28 28 const struct qcom_smmu_config *cfg; 29 29 const struct arm_smmu_impl *impl; 30 30 const struct arm_smmu_impl *adreno_impl; 31 + const struct of_device_id * const client_match; 31 32 }; 32 33 33 34 irqreturn_t qcom_smmu_context_fault(int irq, void *dev);
+16 -29
drivers/iommu/arm/arm-smmu/arm-smmu.c
··· 34 34 #include <linux/pm_runtime.h> 35 35 #include <linux/ratelimit.h> 36 36 #include <linux/slab.h> 37 + #include <linux/string_choices.h> 37 38 38 39 #include <linux/fsl/mc.h> 39 40 ··· 1412 1411 static 1413 1412 struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode) 1414 1413 { 1415 - struct device *dev = driver_find_device_by_fwnode(&arm_smmu_driver.driver, 1416 - fwnode); 1414 + struct device *dev = bus_find_device_by_fwnode(&platform_bus_type, fwnode); 1415 + 1417 1416 put_device(dev); 1418 1417 return dev ? dev_get_drvdata(dev) : NULL; 1419 1418 } ··· 1438 1437 goto out_free; 1439 1438 } else { 1440 1439 smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); 1441 - 1442 - /* 1443 - * Defer probe if the relevant SMMU instance hasn't finished 1444 - * probing yet. This is a fragile hack and we'd ideally 1445 - * avoid this race in the core code. Until that's ironed 1446 - * out, however, this is the most pragmatic option on the 1447 - * table. 1448 - */ 1449 - if (!smmu) 1450 - return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, 1451 - "smmu dev has not bound yet\n")); 1452 1440 } 1453 1441 1454 1442 ret = -EINVAL; ··· 2107 2117 } 2108 2118 2109 2119 dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt, 2110 - cnt == 1 ? "" : "s"); 2120 + str_plural(cnt)); 2111 2121 iort_put_rmr_sids(dev_fwnode(smmu->dev), &rmr_list); 2112 2122 } 2113 2123 ··· 2217 2227 i, irq); 2218 2228 } 2219 2229 2220 - err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL, 2221 - "smmu.%pa", &smmu->ioaddr); 2222 - if (err) { 2223 - dev_err(dev, "Failed to register iommu in sysfs\n"); 2224 - return err; 2225 - } 2226 - 2227 - err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, 2228 - using_legacy_binding ? NULL : dev); 2229 - if (err) { 2230 - dev_err(dev, "Failed to register iommu\n"); 2231 - iommu_device_sysfs_remove(&smmu->iommu); 2232 - return err; 2233 - } 2234 - 2235 2230 platform_set_drvdata(pdev, smmu); 2236 2231 2237 2232 /* Check for RMRs and install bypass SMRs if any */ ··· 2224 2249 2225 2250 arm_smmu_device_reset(smmu); 2226 2251 arm_smmu_test_smr_masks(smmu); 2252 + 2253 + err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL, 2254 + "smmu.%pa", &smmu->ioaddr); 2255 + if (err) 2256 + return dev_err_probe(dev, err, "Failed to register iommu in sysfs\n"); 2257 + 2258 + err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, 2259 + using_legacy_binding ? NULL : dev); 2260 + if (err) { 2261 + iommu_device_sysfs_remove(&smmu->iommu); 2262 + return dev_err_probe(dev, err, "Failed to register iommu\n"); 2263 + } 2227 2264 2228 2265 /* 2229 2266 * We want to avoid touching dev->power.lock in fastpaths unless
+2
drivers/iommu/arm/arm-smmu/arm-smmu.h
··· 154 154 #define ARM_SMMU_SCTLR_M BIT(0) 155 155 156 156 #define ARM_SMMU_CB_ACTLR 0x4 157 + #define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 158 + #define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C 157 159 158 160 #define ARM_SMMU_CB_RESUME 0x8 159 161 #define ARM_SMMU_RESUME_TERMINATE BIT(0)
+1 -1
drivers/iommu/intel/Makefile
··· 1 1 # SPDX-License-Identifier: GPL-2.0 2 2 obj-$(CONFIG_DMAR_TABLE) += dmar.o 3 3 obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o cache.o prq.o 4 - obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o 4 + obj-$(CONFIG_DMAR_TABLE) += trace.o 5 5 obj-$(CONFIG_DMAR_PERF) += perf.o 6 6 obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o 7 7 obj-$(CONFIG_INTEL_IOMMU_SVM) += svm.o
+10 -1
drivers/iommu/intel/cache.c
··· 47 47 struct device_domain_info *info = dev_iommu_priv_get(dev); 48 48 struct intel_iommu *iommu = info->iommu; 49 49 struct cache_tag *tag, *temp; 50 + struct list_head *prev; 50 51 unsigned long flags; 51 52 52 53 tag = kzalloc(sizeof(*tag), GFP_KERNEL); ··· 66 65 tag->dev = iommu->iommu.dev; 67 66 68 67 spin_lock_irqsave(&domain->cache_lock, flags); 68 + prev = &domain->cache_tags; 69 69 list_for_each_entry(temp, &domain->cache_tags, node) { 70 70 if (cache_tage_match(temp, did, iommu, dev, pasid, type)) { 71 71 temp->users++; ··· 75 73 trace_cache_tag_assign(temp); 76 74 return 0; 77 75 } 76 + if (temp->iommu == iommu) 77 + prev = &temp->node; 78 78 } 79 - list_add_tail(&tag->node, &domain->cache_tags); 79 + /* 80 + * Link cache tags of same iommu unit together, so corresponding 81 + * flush ops can be batched for iommu unit. 82 + */ 83 + list_add(&tag->node, prev); 84 + 80 85 spin_unlock_irqrestore(&domain->cache_lock, flags); 81 86 trace_cache_tag_assign(tag); 82 87
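Because a new tag is now inserted after the last existing tag of the same IOMMU rather than always at the list tail, every unit's tags end up contiguous in domain->cache_tags. A flush path can then batch per unit in a single walk, roughly as below (sketch only; the batching code itself is not part of this hunk):

	struct cache_tag *tag;
	struct intel_iommu *prev = NULL;

	list_for_each_entry(tag, &domain->cache_tags, node) {
		if (tag->iommu != prev) {
			/* push/drain the invalidations queued for the previous unit */
			prev = tag->iommu;
		}
		/* queue an invalidation request for this tag */
	}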
-217
drivers/iommu/intel/cap_audit.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * cap_audit.c - audit iommu capabilities for boot time and hot plug 4 - * 5 - * Copyright (C) 2021 Intel Corporation 6 - * 7 - * Author: Kyung Min Park <kyung.min.park@intel.com> 8 - * Lu Baolu <baolu.lu@linux.intel.com> 9 - */ 10 - 11 - #define pr_fmt(fmt) "DMAR: " fmt 12 - 13 - #include "iommu.h" 14 - #include "cap_audit.h" 15 - 16 - static u64 intel_iommu_cap_sanity; 17 - static u64 intel_iommu_ecap_sanity; 18 - 19 - static inline void check_irq_capabilities(struct intel_iommu *a, 20 - struct intel_iommu *b) 21 - { 22 - CHECK_FEATURE_MISMATCH(a, b, cap, pi_support, CAP_PI_MASK); 23 - CHECK_FEATURE_MISMATCH(a, b, ecap, eim_support, ECAP_EIM_MASK); 24 - } 25 - 26 - static inline void check_dmar_capabilities(struct intel_iommu *a, 27 - struct intel_iommu *b) 28 - { 29 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_MAMV_MASK); 30 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_NFR_MASK); 31 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_SLLPS_MASK); 32 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_FRO_MASK); 33 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_MGAW_MASK); 34 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_SAGAW_MASK); 35 - MINIMAL_FEATURE_IOMMU(b, cap, CAP_NDOMS_MASK); 36 - MINIMAL_FEATURE_IOMMU(b, ecap, ECAP_PSS_MASK); 37 - MINIMAL_FEATURE_IOMMU(b, ecap, ECAP_MHMV_MASK); 38 - MINIMAL_FEATURE_IOMMU(b, ecap, ECAP_IRO_MASK); 39 - 40 - CHECK_FEATURE_MISMATCH(a, b, cap, fl5lp_support, CAP_FL5LP_MASK); 41 - CHECK_FEATURE_MISMATCH(a, b, cap, fl1gp_support, CAP_FL1GP_MASK); 42 - CHECK_FEATURE_MISMATCH(a, b, cap, read_drain, CAP_RD_MASK); 43 - CHECK_FEATURE_MISMATCH(a, b, cap, write_drain, CAP_WD_MASK); 44 - CHECK_FEATURE_MISMATCH(a, b, cap, pgsel_inv, CAP_PSI_MASK); 45 - CHECK_FEATURE_MISMATCH(a, b, cap, zlr, CAP_ZLR_MASK); 46 - CHECK_FEATURE_MISMATCH(a, b, cap, caching_mode, CAP_CM_MASK); 47 - CHECK_FEATURE_MISMATCH(a, b, cap, phmr, CAP_PHMR_MASK); 48 - CHECK_FEATURE_MISMATCH(a, b, cap, plmr, CAP_PLMR_MASK); 49 - CHECK_FEATURE_MISMATCH(a, b, cap, rwbf, CAP_RWBF_MASK); 50 - CHECK_FEATURE_MISMATCH(a, b, cap, afl, CAP_AFL_MASK); 51 - CHECK_FEATURE_MISMATCH(a, b, ecap, rps, ECAP_RPS_MASK); 52 - CHECK_FEATURE_MISMATCH(a, b, ecap, smpwc, ECAP_SMPWC_MASK); 53 - CHECK_FEATURE_MISMATCH(a, b, ecap, flts, ECAP_FLTS_MASK); 54 - CHECK_FEATURE_MISMATCH(a, b, ecap, slts, ECAP_SLTS_MASK); 55 - CHECK_FEATURE_MISMATCH(a, b, ecap, nwfs, ECAP_NWFS_MASK); 56 - CHECK_FEATURE_MISMATCH(a, b, ecap, slads, ECAP_SLADS_MASK); 57 - CHECK_FEATURE_MISMATCH(a, b, ecap, smts, ECAP_SMTS_MASK); 58 - CHECK_FEATURE_MISMATCH(a, b, ecap, pds, ECAP_PDS_MASK); 59 - CHECK_FEATURE_MISMATCH(a, b, ecap, dit, ECAP_DIT_MASK); 60 - CHECK_FEATURE_MISMATCH(a, b, ecap, pasid, ECAP_PASID_MASK); 61 - CHECK_FEATURE_MISMATCH(a, b, ecap, eafs, ECAP_EAFS_MASK); 62 - CHECK_FEATURE_MISMATCH(a, b, ecap, srs, ECAP_SRS_MASK); 63 - CHECK_FEATURE_MISMATCH(a, b, ecap, ers, ECAP_ERS_MASK); 64 - CHECK_FEATURE_MISMATCH(a, b, ecap, prs, ECAP_PRS_MASK); 65 - CHECK_FEATURE_MISMATCH(a, b, ecap, nest, ECAP_NEST_MASK); 66 - CHECK_FEATURE_MISMATCH(a, b, ecap, mts, ECAP_MTS_MASK); 67 - CHECK_FEATURE_MISMATCH(a, b, ecap, sc_support, ECAP_SC_MASK); 68 - CHECK_FEATURE_MISMATCH(a, b, ecap, pass_through, ECAP_PT_MASK); 69 - CHECK_FEATURE_MISMATCH(a, b, ecap, dev_iotlb_support, ECAP_DT_MASK); 70 - CHECK_FEATURE_MISMATCH(a, b, ecap, qis, ECAP_QI_MASK); 71 - CHECK_FEATURE_MISMATCH(a, b, ecap, coherent, ECAP_C_MASK); 72 - } 73 - 74 - static int cap_audit_hotplug(struct intel_iommu *iommu, enum cap_audit_type type) 75 - { 76 - bool mismatch = false; 77 - u64 old_cap = 
intel_iommu_cap_sanity; 78 - u64 old_ecap = intel_iommu_ecap_sanity; 79 - 80 - if (type == CAP_AUDIT_HOTPLUG_IRQR) { 81 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, pi_support, CAP_PI_MASK); 82 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, eim_support, ECAP_EIM_MASK); 83 - goto out; 84 - } 85 - 86 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, fl5lp_support, CAP_FL5LP_MASK); 87 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, fl1gp_support, CAP_FL1GP_MASK); 88 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, read_drain, CAP_RD_MASK); 89 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, write_drain, CAP_WD_MASK); 90 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, pgsel_inv, CAP_PSI_MASK); 91 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, zlr, CAP_ZLR_MASK); 92 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, caching_mode, CAP_CM_MASK); 93 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, phmr, CAP_PHMR_MASK); 94 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, plmr, CAP_PLMR_MASK); 95 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, rwbf, CAP_RWBF_MASK); 96 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, cap, afl, CAP_AFL_MASK); 97 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, rps, ECAP_RPS_MASK); 98 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, smpwc, ECAP_SMPWC_MASK); 99 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, flts, ECAP_FLTS_MASK); 100 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, slts, ECAP_SLTS_MASK); 101 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, nwfs, ECAP_NWFS_MASK); 102 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, slads, ECAP_SLADS_MASK); 103 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, smts, ECAP_SMTS_MASK); 104 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, pds, ECAP_PDS_MASK); 105 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, dit, ECAP_DIT_MASK); 106 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, pasid, ECAP_PASID_MASK); 107 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, eafs, ECAP_EAFS_MASK); 108 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, srs, ECAP_SRS_MASK); 109 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, ers, ECAP_ERS_MASK); 110 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, prs, ECAP_PRS_MASK); 111 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, nest, ECAP_NEST_MASK); 112 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, mts, ECAP_MTS_MASK); 113 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, sc_support, ECAP_SC_MASK); 114 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, pass_through, ECAP_PT_MASK); 115 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, dev_iotlb_support, ECAP_DT_MASK); 116 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, qis, ECAP_QI_MASK); 117 - CHECK_FEATURE_MISMATCH_HOTPLUG(iommu, ecap, coherent, ECAP_C_MASK); 118 - 119 - /* Abort hot plug if the hot plug iommu feature is smaller than global */ 120 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, max_amask_val, CAP_MAMV_MASK, mismatch); 121 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, num_fault_regs, CAP_NFR_MASK, mismatch); 122 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, super_page_val, CAP_SLLPS_MASK, mismatch); 123 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, fault_reg_offset, CAP_FRO_MASK, mismatch); 124 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, mgaw, CAP_MGAW_MASK, mismatch); 125 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, sagaw, CAP_SAGAW_MASK, mismatch); 126 - MINIMAL_FEATURE_HOTPLUG(iommu, cap, ndoms, CAP_NDOMS_MASK, mismatch); 127 - MINIMAL_FEATURE_HOTPLUG(iommu, ecap, pss, ECAP_PSS_MASK, mismatch); 128 - MINIMAL_FEATURE_HOTPLUG(iommu, ecap, max_handle_mask, ECAP_MHMV_MASK, mismatch); 129 - MINIMAL_FEATURE_HOTPLUG(iommu, ecap, iotlb_offset, ECAP_IRO_MASK, mismatch); 130 - 131 - out: 132 
- if (mismatch) { 133 - intel_iommu_cap_sanity = old_cap; 134 - intel_iommu_ecap_sanity = old_ecap; 135 - return -EFAULT; 136 - } 137 - 138 - return 0; 139 - } 140 - 141 - static int cap_audit_static(struct intel_iommu *iommu, enum cap_audit_type type) 142 - { 143 - struct dmar_drhd_unit *d; 144 - struct intel_iommu *i; 145 - int rc = 0; 146 - 147 - rcu_read_lock(); 148 - if (list_empty(&dmar_drhd_units)) 149 - goto out; 150 - 151 - for_each_active_iommu(i, d) { 152 - if (!iommu) { 153 - intel_iommu_ecap_sanity = i->ecap; 154 - intel_iommu_cap_sanity = i->cap; 155 - iommu = i; 156 - continue; 157 - } 158 - 159 - if (type == CAP_AUDIT_STATIC_DMAR) 160 - check_dmar_capabilities(iommu, i); 161 - else 162 - check_irq_capabilities(iommu, i); 163 - } 164 - 165 - /* 166 - * If the system is sane to support scalable mode, either SL or FL 167 - * should be sane. 168 - */ 169 - if (intel_cap_smts_sanity() && 170 - !intel_cap_flts_sanity() && !intel_cap_slts_sanity()) 171 - rc = -EOPNOTSUPP; 172 - 173 - out: 174 - rcu_read_unlock(); 175 - return rc; 176 - } 177 - 178 - int intel_cap_audit(enum cap_audit_type type, struct intel_iommu *iommu) 179 - { 180 - switch (type) { 181 - case CAP_AUDIT_STATIC_DMAR: 182 - case CAP_AUDIT_STATIC_IRQR: 183 - return cap_audit_static(iommu, type); 184 - case CAP_AUDIT_HOTPLUG_DMAR: 185 - case CAP_AUDIT_HOTPLUG_IRQR: 186 - return cap_audit_hotplug(iommu, type); 187 - default: 188 - break; 189 - } 190 - 191 - return -EFAULT; 192 - } 193 - 194 - bool intel_cap_smts_sanity(void) 195 - { 196 - return ecap_smts(intel_iommu_ecap_sanity); 197 - } 198 - 199 - bool intel_cap_pasid_sanity(void) 200 - { 201 - return ecap_pasid(intel_iommu_ecap_sanity); 202 - } 203 - 204 - bool intel_cap_nest_sanity(void) 205 - { 206 - return ecap_nest(intel_iommu_ecap_sanity); 207 - } 208 - 209 - bool intel_cap_flts_sanity(void) 210 - { 211 - return ecap_flts(intel_iommu_ecap_sanity); 212 - } 213 - 214 - bool intel_cap_slts_sanity(void) 215 - { 216 - return ecap_slts(intel_iommu_ecap_sanity); 217 - }
-131
drivers/iommu/intel/cap_audit.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 2 - /* 3 - * cap_audit.h - audit iommu capabilities header 4 - * 5 - * Copyright (C) 2021 Intel Corporation 6 - * 7 - * Author: Kyung Min Park <kyung.min.park@intel.com> 8 - */ 9 - 10 - /* 11 - * Capability Register Mask 12 - */ 13 - #define CAP_FL5LP_MASK BIT_ULL(60) 14 - #define CAP_PI_MASK BIT_ULL(59) 15 - #define CAP_FL1GP_MASK BIT_ULL(56) 16 - #define CAP_RD_MASK BIT_ULL(55) 17 - #define CAP_WD_MASK BIT_ULL(54) 18 - #define CAP_MAMV_MASK GENMASK_ULL(53, 48) 19 - #define CAP_NFR_MASK GENMASK_ULL(47, 40) 20 - #define CAP_PSI_MASK BIT_ULL(39) 21 - #define CAP_SLLPS_MASK GENMASK_ULL(37, 34) 22 - #define CAP_FRO_MASK GENMASK_ULL(33, 24) 23 - #define CAP_ZLR_MASK BIT_ULL(22) 24 - #define CAP_MGAW_MASK GENMASK_ULL(21, 16) 25 - #define CAP_SAGAW_MASK GENMASK_ULL(12, 8) 26 - #define CAP_CM_MASK BIT_ULL(7) 27 - #define CAP_PHMR_MASK BIT_ULL(6) 28 - #define CAP_PLMR_MASK BIT_ULL(5) 29 - #define CAP_RWBF_MASK BIT_ULL(4) 30 - #define CAP_AFL_MASK BIT_ULL(3) 31 - #define CAP_NDOMS_MASK GENMASK_ULL(2, 0) 32 - 33 - /* 34 - * Extended Capability Register Mask 35 - */ 36 - #define ECAP_RPS_MASK BIT_ULL(49) 37 - #define ECAP_SMPWC_MASK BIT_ULL(48) 38 - #define ECAP_FLTS_MASK BIT_ULL(47) 39 - #define ECAP_SLTS_MASK BIT_ULL(46) 40 - #define ECAP_SLADS_MASK BIT_ULL(45) 41 - #define ECAP_VCS_MASK BIT_ULL(44) 42 - #define ECAP_SMTS_MASK BIT_ULL(43) 43 - #define ECAP_PDS_MASK BIT_ULL(42) 44 - #define ECAP_DIT_MASK BIT_ULL(41) 45 - #define ECAP_PASID_MASK BIT_ULL(40) 46 - #define ECAP_PSS_MASK GENMASK_ULL(39, 35) 47 - #define ECAP_EAFS_MASK BIT_ULL(34) 48 - #define ECAP_NWFS_MASK BIT_ULL(33) 49 - #define ECAP_SRS_MASK BIT_ULL(31) 50 - #define ECAP_ERS_MASK BIT_ULL(30) 51 - #define ECAP_PRS_MASK BIT_ULL(29) 52 - #define ECAP_NEST_MASK BIT_ULL(26) 53 - #define ECAP_MTS_MASK BIT_ULL(25) 54 - #define ECAP_MHMV_MASK GENMASK_ULL(23, 20) 55 - #define ECAP_IRO_MASK GENMASK_ULL(17, 8) 56 - #define ECAP_SC_MASK BIT_ULL(7) 57 - #define ECAP_PT_MASK BIT_ULL(6) 58 - #define ECAP_EIM_MASK BIT_ULL(4) 59 - #define ECAP_DT_MASK BIT_ULL(2) 60 - #define ECAP_QI_MASK BIT_ULL(1) 61 - #define ECAP_C_MASK BIT_ULL(0) 62 - 63 - /* 64 - * u64 intel_iommu_cap_sanity, intel_iommu_ecap_sanity will be adjusted as each 65 - * IOMMU gets audited. 
66 - */ 67 - #define DO_CHECK_FEATURE_MISMATCH(a, b, cap, feature, MASK) \ 68 - do { \ 69 - if (cap##_##feature(a) != cap##_##feature(b)) { \ 70 - intel_iommu_##cap##_sanity &= ~(MASK); \ 71 - pr_info("IOMMU feature %s inconsistent", #feature); \ 72 - } \ 73 - } while (0) 74 - 75 - #define CHECK_FEATURE_MISMATCH(a, b, cap, feature, MASK) \ 76 - DO_CHECK_FEATURE_MISMATCH((a)->cap, (b)->cap, cap, feature, MASK) 77 - 78 - #define CHECK_FEATURE_MISMATCH_HOTPLUG(b, cap, feature, MASK) \ 79 - do { \ 80 - if (cap##_##feature(intel_iommu_##cap##_sanity)) \ 81 - DO_CHECK_FEATURE_MISMATCH(intel_iommu_##cap##_sanity, \ 82 - (b)->cap, cap, feature, MASK); \ 83 - } while (0) 84 - 85 - #define MINIMAL_FEATURE_IOMMU(iommu, cap, MASK) \ 86 - do { \ 87 - u64 min_feature = intel_iommu_##cap##_sanity & (MASK); \ 88 - min_feature = min_t(u64, min_feature, (iommu)->cap & (MASK)); \ 89 - intel_iommu_##cap##_sanity = (intel_iommu_##cap##_sanity & ~(MASK)) | \ 90 - min_feature; \ 91 - } while (0) 92 - 93 - #define MINIMAL_FEATURE_HOTPLUG(iommu, cap, feature, MASK, mismatch) \ 94 - do { \ 95 - if ((intel_iommu_##cap##_sanity & (MASK)) > \ 96 - (cap##_##feature((iommu)->cap))) \ 97 - mismatch = true; \ 98 - else \ 99 - (iommu)->cap = ((iommu)->cap & ~(MASK)) | \ 100 - (intel_iommu_##cap##_sanity & (MASK)); \ 101 - } while (0) 102 - 103 - enum cap_audit_type { 104 - CAP_AUDIT_STATIC_DMAR, 105 - CAP_AUDIT_STATIC_IRQR, 106 - CAP_AUDIT_HOTPLUG_DMAR, 107 - CAP_AUDIT_HOTPLUG_IRQR, 108 - }; 109 - 110 - bool intel_cap_smts_sanity(void); 111 - bool intel_cap_pasid_sanity(void); 112 - bool intel_cap_nest_sanity(void); 113 - bool intel_cap_flts_sanity(void); 114 - bool intel_cap_slts_sanity(void); 115 - 116 - static inline bool scalable_mode_support(void) 117 - { 118 - return (intel_iommu_sm && intel_cap_smts_sanity()); 119 - } 120 - 121 - static inline bool pasid_mode_support(void) 122 - { 123 - return scalable_mode_support() && intel_cap_pasid_sanity(); 124 - } 125 - 126 - static inline bool nested_mode_support(void) 127 - { 128 - return scalable_mode_support() && intel_cap_nest_sanity(); 129 - } 130 - 131 - int intel_cap_audit(enum cap_audit_type type, struct intel_iommu *iommu);
+15 -32
drivers/iommu/intel/iommu.c
··· 29 29 #include "../irq_remapping.h" 30 30 #include "../iommu-pages.h" 31 31 #include "pasid.h" 32 - #include "cap_audit.h" 33 32 #include "perfmon.h" 34 33 35 34 #define ROOT_SIZE VTD_PAGE_SIZE ··· 2117 2118 struct intel_iommu *iommu; 2118 2119 int ret; 2119 2120 2120 - ret = intel_cap_audit(CAP_AUDIT_STATIC_DMAR, NULL); 2121 - if (ret) 2122 - goto free_iommu; 2123 - 2124 2121 for_each_iommu(iommu, drhd) { 2125 2122 if (drhd->ignored) { 2126 2123 iommu_disable_translation(iommu); ··· 2611 2616 { 2612 2617 struct intel_iommu *iommu = dmaru->iommu; 2613 2618 int ret; 2614 - 2615 - ret = intel_cap_audit(CAP_AUDIT_HOTPLUG_DMAR, iommu); 2616 - if (ret) 2617 - goto out; 2618 2619 2619 2620 /* 2620 2621 * Disable translation if already enabled prior to OS handover. ··· 3241 3250 return 0; 3242 3251 } 3243 3252 3253 + static int blocking_domain_set_dev_pasid(struct iommu_domain *domain, 3254 + struct device *dev, ioasid_t pasid, 3255 + struct iommu_domain *old); 3256 + 3244 3257 static struct iommu_domain blocking_domain = { 3245 3258 .type = IOMMU_DOMAIN_BLOCKED, 3246 3259 .ops = &(const struct iommu_domain_ops) { 3247 3260 .attach_dev = blocking_domain_attach_dev, 3261 + .set_dev_pasid = blocking_domain_set_dev_pasid, 3248 3262 } 3249 3263 }; 3250 3264 ··· 4086 4090 break; 4087 4091 } 4088 4092 } 4089 - WARN_ON_ONCE(!dev_pasid); 4090 4093 spin_unlock_irqrestore(&dmar_domain->lock, flags); 4091 4094 4092 4095 cache_tag_unassign_domain(dmar_domain, dev, pasid); 4093 4096 domain_detach_iommu(dmar_domain, iommu); 4094 - intel_iommu_debugfs_remove_dev_pasid(dev_pasid); 4095 - kfree(dev_pasid); 4097 + if (!WARN_ON_ONCE(!dev_pasid)) { 4098 + intel_iommu_debugfs_remove_dev_pasid(dev_pasid); 4099 + kfree(dev_pasid); 4100 + } 4096 4101 } 4097 4102 4098 - static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid, 4099 - struct iommu_domain *domain) 4103 + static int blocking_domain_set_dev_pasid(struct iommu_domain *domain, 4104 + struct device *dev, ioasid_t pasid, 4105 + struct iommu_domain *old) 4100 4106 { 4101 4107 struct device_domain_info *info = dev_iommu_priv_get(dev); 4102 4108 4103 4109 intel_pasid_tear_down_entry(info->iommu, dev, pasid, false); 4104 - domain_remove_dev_pasid(domain, dev, pasid); 4110 + domain_remove_dev_pasid(old, dev, pasid); 4111 + 4112 + return 0; 4105 4113 } 4106 4114 4107 4115 struct dev_pasid_info * ··· 4445 4445 }, 4446 4446 }; 4447 4447 4448 - static struct iommu_domain *intel_iommu_domain_alloc_paging(struct device *dev) 4449 - { 4450 - struct device_domain_info *info = dev_iommu_priv_get(dev); 4451 - struct intel_iommu *iommu = info->iommu; 4452 - struct dmar_domain *dmar_domain; 4453 - bool first_stage; 4454 - 4455 - first_stage = first_level_by_default(iommu); 4456 - dmar_domain = paging_domain_alloc(dev, first_stage); 4457 - if (IS_ERR(dmar_domain)) 4458 - return ERR_CAST(dmar_domain); 4459 - 4460 - return &dmar_domain->domain; 4461 - } 4462 - 4463 4448 const struct iommu_ops intel_iommu_ops = { 4464 4449 .blocked_domain = &blocking_domain, 4465 4450 .release_domain = &blocking_domain, ··· 4453 4468 .hw_info = intel_iommu_hw_info, 4454 4469 .domain_alloc_paging_flags = intel_iommu_domain_alloc_paging_flags, 4455 4470 .domain_alloc_sva = intel_svm_domain_alloc, 4456 - .domain_alloc_paging = intel_iommu_domain_alloc_paging, 4457 4471 .domain_alloc_nested = intel_iommu_domain_alloc_nested, 4458 4472 .probe_device = intel_iommu_probe_device, 4459 4473 .release_device = intel_iommu_release_device, ··· 4462 4478 .dev_disable_feat = 
intel_iommu_dev_disable_feat, 4463 4479 .is_attach_deferred = intel_iommu_is_attach_deferred, 4464 4480 .def_domain_type = device_def_domain_type, 4465 - .remove_dev_pasid = intel_iommu_remove_dev_pasid, 4466 4481 .pgsize_bitmap = SZ_4K, 4467 4482 .page_response = intel_iommu_page_response, 4468 4483 .default_domain_ops = &(const struct iommu_domain_ops) {
-8
drivers/iommu/intel/irq_remapping.c
··· 24 24 #include "iommu.h" 25 25 #include "../irq_remapping.h" 26 26 #include "../iommu-pages.h" 27 - #include "cap_audit.h" 28 27 29 28 enum irq_mode { 30 29 IRQ_REMAPPING, ··· 724 725 } 725 726 726 727 if (dmar_table_init() < 0) 727 - return -ENODEV; 728 - 729 - if (intel_cap_audit(CAP_AUDIT_STATIC_IRQR, NULL)) 730 728 return -ENODEV; 731 729 732 730 if (!dmar_ir_support()) ··· 1528 1532 { 1529 1533 int ret; 1530 1534 int eim = x2apic_enabled(); 1531 - 1532 - ret = intel_cap_audit(CAP_AUDIT_HOTPLUG_IRQR, iommu); 1533 - if (ret) 1534 - return ret; 1535 1535 1536 1536 if (eim && !ecap_eim_support(iommu->ecap)) { 1537 1537 pr_info("DRHD %Lx: EIM not supported by DRHD, ecap %Lx\n",
+21 -1
drivers/iommu/intel/pasid.c
··· 244 244 245 245 spin_lock(&iommu->lock); 246 246 pte = intel_pasid_get_entry(dev, pasid); 247 - if (WARN_ON(!pte) || !pasid_pte_is_present(pte)) { 247 + if (WARN_ON(!pte)) { 248 248 spin_unlock(&iommu->lock); 249 + return; 250 + } 251 + 252 + if (!pasid_pte_is_present(pte)) { 253 + if (!pasid_pte_is_fault_disabled(pte)) { 254 + WARN_ON(READ_ONCE(pte->val[0]) != 0); 255 + spin_unlock(&iommu->lock); 256 + return; 257 + } 258 + 259 + /* 260 + * When a PASID is used for SVA by a device, it's possible 261 + * that the pasid entry is non-present with the Fault 262 + * Processing Disabled bit set. Clear the pasid entry and 263 + * drain the PRQ for the PASID before return. 264 + */ 265 + pasid_clear_entry(pte); 266 + spin_unlock(&iommu->lock); 267 + intel_iommu_drain_pasid_prq(dev, pasid); 268 + 249 269 return; 250 270 } 251 271
+6
drivers/iommu/intel/pasid.h
··· 73 73 return READ_ONCE(pte->val[0]) & PASID_PTE_PRESENT; 74 74 } 75 75 76 + /* Get FPD(Fault Processing Disable) bit of a PASID table entry */ 77 + static inline bool pasid_pte_is_fault_disabled(struct pasid_entry *pte) 78 + { 79 + return READ_ONCE(pte->val[0]) & PASID_PTE_FPD; 80 + } 81 + 76 82 /* Get PGTT field of a PASID table entry */ 77 83 static inline u16 pasid_pte_get_pgtt(struct pasid_entry *pte) 78 84 {
+152 -85
drivers/iommu/io-pgtable-arm.c
··· 223 223 return ptes_per_table - (i & (ptes_per_table - 1)); 224 224 } 225 225 226 + /* 227 + * Check if concatenated PGDs are mandatory according to Arm DDI0487 (K.a) 228 + * 1) R_DXBSH: For 16KB, and 48-bit input size, use level 1 instead of 0. 229 + * 2) R_SRKBC: After de-ciphering the table for PA size and valid initial lookup 230 + * a) 40 bits PA size with 4K: use level 1 instead of level 0 (2 tables for ias = oas) 231 + * b) 40 bits PA size with 16K: use level 2 instead of level 1 (16 tables for ias = oas) 232 + * c) 42 bits PA size with 4K: use level 1 instead of level 0 (8 tables for ias = oas) 233 + * d) 48 bits PA size with 16K: use level 1 instead of level 0 (2 tables for ias = oas) 234 + */ 235 + static inline bool arm_lpae_concat_mandatory(struct io_pgtable_cfg *cfg, 236 + struct arm_lpae_io_pgtable *data) 237 + { 238 + unsigned int ias = cfg->ias; 239 + unsigned int oas = cfg->oas; 240 + 241 + /* Covers 1 and 2.d */ 242 + if ((ARM_LPAE_GRANULE(data) == SZ_16K) && (data->start_level == 0)) 243 + return (oas == 48) || (ias == 48); 244 + 245 + /* Covers 2.a and 2.c */ 246 + if ((ARM_LPAE_GRANULE(data) == SZ_4K) && (data->start_level == 0)) 247 + return (oas == 40) || (oas == 42); 248 + 249 + /* Case 2.b */ 250 + return (ARM_LPAE_GRANULE(data) == SZ_16K) && 251 + (data->start_level == 1) && (oas == 40); 252 + } 253 + 226 254 static bool selftest_running = false; 227 255 228 256 static dma_addr_t __arm_lpae_dma_addr(void *pages) ··· 704 676 data->start_level, ptep); 705 677 } 706 678 707 - static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, 708 - unsigned long iova) 709 - { 710 - struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); 711 - arm_lpae_iopte pte, *ptep = data->pgd; 712 - int lvl = data->start_level; 713 - 714 - do { 715 - /* Valid IOPTE pointer? */ 716 - if (!ptep) 717 - return 0; 718 - 719 - /* Grab the IOPTE we're interested in */ 720 - ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); 721 - pte = READ_ONCE(*ptep); 722 - 723 - /* Valid entry? */ 724 - if (!pte) 725 - return 0; 726 - 727 - /* Leaf entry? 
*/ 728 - if (iopte_leaf(pte, lvl, data->iop.fmt)) 729 - goto found_translation; 730 - 731 - /* Take it to the next level */ 732 - ptep = iopte_deref(pte, data); 733 - } while (++lvl < ARM_LPAE_MAX_LEVELS); 734 - 735 - /* Ran out of page tables to walk */ 736 - return 0; 737 - 738 - found_translation: 739 - iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1); 740 - return iopte_to_paddr(pte, data) | iova; 741 - } 742 - 743 679 struct io_pgtable_walk_data { 744 - struct iommu_dirty_bitmap *dirty; 680 + struct io_pgtable *iop; 681 + void *data; 682 + int (*visit)(struct io_pgtable_walk_data *walk_data, int lvl, 683 + arm_lpae_iopte *ptep, size_t size); 745 684 unsigned long flags; 746 685 u64 addr; 747 686 const u64 end; 748 687 }; 749 688 750 - static int __arm_lpae_iopte_walk_dirty(struct arm_lpae_io_pgtable *data, 751 - struct io_pgtable_walk_data *walk_data, 752 - arm_lpae_iopte *ptep, 753 - int lvl); 689 + static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data, 690 + struct io_pgtable_walk_data *walk_data, 691 + arm_lpae_iopte *ptep, 692 + int lvl); 754 693 755 - static int io_pgtable_visit_dirty(struct arm_lpae_io_pgtable *data, 756 - struct io_pgtable_walk_data *walk_data, 757 - arm_lpae_iopte *ptep, int lvl) 694 + struct iova_to_phys_data { 695 + arm_lpae_iopte pte; 696 + int lvl; 697 + }; 698 + 699 + static int visit_iova_to_phys(struct io_pgtable_walk_data *walk_data, int lvl, 700 + arm_lpae_iopte *ptep, size_t size) 701 + { 702 + struct iova_to_phys_data *data = walk_data->data; 703 + data->pte = *ptep; 704 + data->lvl = lvl; 705 + return 0; 706 + } 707 + 708 + static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, 709 + unsigned long iova) 710 + { 711 + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); 712 + struct iova_to_phys_data d; 713 + struct io_pgtable_walk_data walk_data = { 714 + .data = &d, 715 + .visit = visit_iova_to_phys, 716 + .addr = iova, 717 + .end = iova + 1, 718 + }; 719 + int ret; 720 + 721 + ret = __arm_lpae_iopte_walk(data, &walk_data, data->pgd, data->start_level); 722 + if (ret) 723 + return 0; 724 + 725 + iova &= (ARM_LPAE_BLOCK_SIZE(d.lvl, data) - 1); 726 + return iopte_to_paddr(d.pte, data) | iova; 727 + } 728 + 729 + static int visit_pgtable_walk(struct io_pgtable_walk_data *walk_data, int lvl, 730 + arm_lpae_iopte *ptep, size_t size) 731 + { 732 + struct arm_lpae_io_pgtable_walk_data *data = walk_data->data; 733 + data->ptes[lvl] = *ptep; 734 + return 0; 735 + } 736 + 737 + static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long iova, 738 + void *wd) 739 + { 740 + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); 741 + struct io_pgtable_walk_data walk_data = { 742 + .data = wd, 743 + .visit = visit_pgtable_walk, 744 + .addr = iova, 745 + .end = iova + 1, 746 + }; 747 + 748 + return __arm_lpae_iopte_walk(data, &walk_data, data->pgd, data->start_level); 749 + } 750 + 751 + static int io_pgtable_visit(struct arm_lpae_io_pgtable *data, 752 + struct io_pgtable_walk_data *walk_data, 753 + arm_lpae_iopte *ptep, int lvl) 758 754 { 759 755 struct io_pgtable *iop = &data->iop; 760 756 arm_lpae_iopte pte = READ_ONCE(*ptep); 761 757 762 - if (iopte_leaf(pte, lvl, iop->fmt)) { 763 - size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data); 758 + size_t size = ARM_LPAE_BLOCK_SIZE(lvl, data); 759 + int ret = walk_data->visit(walk_data, lvl, ptep, size); 760 + if (ret) 761 + return ret; 764 762 765 - if (iopte_writeable_dirty(pte)) { 766 - iommu_dirty_bitmap_record(walk_data->dirty, 767 - walk_data->addr, size); 768 
- if (!(walk_data->flags & IOMMU_DIRTY_NO_CLEAR)) 769 - iopte_set_writeable_clean(ptep); 770 - } 763 + if (iopte_leaf(pte, lvl, iop->fmt)) { 771 764 walk_data->addr += size; 772 765 return 0; 773 766 } 774 767 775 - if (WARN_ON(!iopte_table(pte, lvl))) 768 + if (!iopte_table(pte, lvl)) { 776 769 return -EINVAL; 770 + } 777 771 778 772 ptep = iopte_deref(pte, data); 779 - return __arm_lpae_iopte_walk_dirty(data, walk_data, ptep, lvl + 1); 773 + return __arm_lpae_iopte_walk(data, walk_data, ptep, lvl + 1); 780 774 } 781 775 782 - static int __arm_lpae_iopte_walk_dirty(struct arm_lpae_io_pgtable *data, 783 - struct io_pgtable_walk_data *walk_data, 784 - arm_lpae_iopte *ptep, 785 - int lvl) 776 + static int __arm_lpae_iopte_walk(struct arm_lpae_io_pgtable *data, 777 + struct io_pgtable_walk_data *walk_data, 778 + arm_lpae_iopte *ptep, 779 + int lvl) 786 780 { 787 781 u32 idx; 788 782 int max_entries, ret; ··· 819 769 820 770 for (idx = ARM_LPAE_LVL_IDX(walk_data->addr, lvl, data); 821 771 (idx < max_entries) && (walk_data->addr < walk_data->end); ++idx) { 822 - ret = io_pgtable_visit_dirty(data, walk_data, ptep + idx, lvl); 772 + ret = io_pgtable_visit(data, walk_data, ptep + idx, lvl); 823 773 if (ret) 824 774 return ret; 775 + } 776 + 777 + return 0; 778 + } 779 + 780 + static int visit_dirty(struct io_pgtable_walk_data *walk_data, int lvl, 781 + arm_lpae_iopte *ptep, size_t size) 782 + { 783 + struct iommu_dirty_bitmap *dirty = walk_data->data; 784 + 785 + if (!iopte_leaf(*ptep, lvl, walk_data->iop->fmt)) 786 + return 0; 787 + 788 + if (iopte_writeable_dirty(*ptep)) { 789 + iommu_dirty_bitmap_record(dirty, walk_data->addr, size); 790 + if (!(walk_data->flags & IOMMU_DIRTY_NO_CLEAR)) 791 + iopte_set_writeable_clean(ptep); 825 792 } 826 793 827 794 return 0; ··· 852 785 struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); 853 786 struct io_pgtable_cfg *cfg = &data->iop.cfg; 854 787 struct io_pgtable_walk_data walk_data = { 855 - .dirty = dirty, 788 + .iop = &data->iop, 789 + .data = dirty, 790 + .visit = visit_dirty, 856 791 .flags = flags, 857 792 .addr = iova, 858 793 .end = iova + size, ··· 869 800 if (data->iop.fmt != ARM_64_LPAE_S1) 870 801 return -EINVAL; 871 802 872 - return __arm_lpae_iopte_walk_dirty(data, &walk_data, ptep, lvl); 803 + return __arm_lpae_iopte_walk(data, &walk_data, ptep, lvl); 873 804 } 874 805 875 806 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) ··· 951 882 .unmap_pages = arm_lpae_unmap_pages, 952 883 .iova_to_phys = arm_lpae_iova_to_phys, 953 884 .read_and_clear_dirty = arm_lpae_read_and_clear_dirty, 885 + .pgtable_walk = arm_lpae_pgtable_walk, 954 886 }; 955 887 956 888 return data; ··· 1076 1006 if (!data) 1077 1007 return NULL; 1078 1008 1079 - /* 1080 - * Concatenate PGDs at level 1 if possible in order to reduce 1081 - * the depth of the stage-2 walk. 
1082 - */ 1083 - if (data->start_level == 0) { 1084 - unsigned long pgd_pages; 1085 - 1086 - pgd_pages = ARM_LPAE_PGD_SIZE(data) / sizeof(arm_lpae_iopte); 1087 - if (pgd_pages <= ARM_LPAE_S2_MAX_CONCAT_PAGES) { 1088 - data->pgd_bits += data->bits_per_level; 1089 - data->start_level++; 1090 - } 1009 + if (arm_lpae_concat_mandatory(cfg, data)) { 1010 + if (WARN_ON((ARM_LPAE_PGD_SIZE(data) / sizeof(arm_lpae_iopte)) > 1011 + ARM_LPAE_S2_MAX_CONCAT_PAGES)) 1012 + return NULL; 1013 + data->pgd_bits += data->bits_per_level; 1014 + data->start_level++; 1091 1015 } 1092 1016 1093 1017 /* VTCR */ ··· 1428 1364 SZ_64K | SZ_512M, 1429 1365 }; 1430 1366 1431 - static const unsigned int ias[] __initconst = { 1367 + static const unsigned int address_size[] __initconst = { 1432 1368 32, 36, 40, 42, 44, 48, 1433 1369 }; 1434 1370 1435 - int i, j, pass = 0, fail = 0; 1371 + int i, j, k, pass = 0, fail = 0; 1436 1372 struct device dev; 1437 1373 struct io_pgtable_cfg cfg = { 1438 1374 .tlb = &dummy_tlb_ops, 1439 - .oas = 48, 1440 1375 .coherent_walk = true, 1441 1376 .iommu_dev = &dev, 1442 1377 }; ··· 1444 1381 set_dev_node(&dev, NUMA_NO_NODE); 1445 1382 1446 1383 for (i = 0; i < ARRAY_SIZE(pgsize); ++i) { 1447 - for (j = 0; j < ARRAY_SIZE(ias); ++j) { 1448 - cfg.pgsize_bitmap = pgsize[i]; 1449 - cfg.ias = ias[j]; 1450 - pr_info("selftest: pgsize_bitmap 0x%08lx, IAS %u\n", 1451 - pgsize[i], ias[j]); 1452 - if (arm_lpae_run_tests(&cfg)) 1453 - fail++; 1454 - else 1455 - pass++; 1384 + for (j = 0; j < ARRAY_SIZE(address_size); ++j) { 1385 + /* Don't use ias > oas as it is not valid for stage-2. */ 1386 + for (k = 0; k <= j; ++k) { 1387 + cfg.pgsize_bitmap = pgsize[i]; 1388 + cfg.ias = address_size[k]; 1389 + cfg.oas = address_size[j]; 1390 + pr_info("selftest: pgsize_bitmap 0x%08lx, IAS %u OAS %u\n", 1391 + pgsize[i], cfg.ias, cfg.oas); 1392 + if (arm_lpae_run_tests(&cfg)) 1393 + fail++; 1394 + else 1395 + pass++; 1396 + } 1456 1397 } 1457 1398 } 1458 1399
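The io-pgtable-arm.c changes above fold the old dirty-tracking walker and iova_to_phys() into a single visitor-driven __arm_lpae_iopte_walk(). For illustration only, a minimal sketch of one more visitor written in the same style; the names leaf_count_data, visit_count_leaves and arm_lpae_count_leaves are invented for this sketch, and since io_pgtable_walk_data and the walker are private to io-pgtable-arm.c such a visitor would have to live in that file:

    /* Hypothetical visitor: count the leaf PTEs (and bytes they map) in a range. */
    struct leaf_count_data {
            size_t leaves;
            size_t bytes;
    };

    static int visit_count_leaves(struct io_pgtable_walk_data *walk_data, int lvl,
                                  arm_lpae_iopte *ptep, size_t size)
    {
            struct leaf_count_data *count = walk_data->data;

            /* Table entries are just traversed; only leaves are counted. */
            if (!iopte_leaf(*ptep, lvl, walk_data->iop->fmt))
                    return 0;

            count->leaves++;
            count->bytes += size;
            return 0;       /* a non-zero return would abort the walk */
    }

    static size_t arm_lpae_count_leaves(struct arm_lpae_io_pgtable *data,
                                        unsigned long iova, size_t size)
    {
            struct leaf_count_data count = {};
            struct io_pgtable_walk_data walk_data = {
                    .iop    = &data->iop,
                    .data   = &count,
                    .visit  = visit_count_leaves,
                    .addr   = iova,
                    .end    = iova + size,
            };

            /* Walk from the top-level table; an unmapped hole in the range makes
             * the walker return -EINVAL, in which case the partial count stands. */
            __arm_lpae_iopte_walk(data, &walk_data, data->pgd, data->start_level);

            return count.leaves;
    }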
+23 -14
drivers/iommu/iommu.c
··· 2819 2819 struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 2820 2820 2821 2821 if (!ops) 2822 - return -EPROBE_DEFER; 2822 + return driver_deferred_probe_check_state(dev); 2823 2823 2824 2824 if (fwspec) 2825 2825 return ops == iommu_fwspec_ops(fwspec) ? 0 : -EINVAL; ··· 3312 3312 } 3313 3313 EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed); 3314 3314 3315 + static void iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid, 3316 + struct iommu_domain *domain) 3317 + { 3318 + const struct iommu_ops *ops = dev_iommu_ops(dev); 3319 + struct iommu_domain *blocked_domain = ops->blocked_domain; 3320 + 3321 + WARN_ON(blocked_domain->ops->set_dev_pasid(blocked_domain, 3322 + dev, pasid, domain)); 3323 + } 3324 + 3315 3325 static int __iommu_set_group_pasid(struct iommu_domain *domain, 3316 3326 struct iommu_group *group, ioasid_t pasid) 3317 3327 { ··· 3340 3330 err_revert: 3341 3331 last_gdev = device; 3342 3332 for_each_group_device(group, device) { 3343 - const struct iommu_ops *ops = dev_iommu_ops(device->dev); 3344 - 3345 3333 if (device == last_gdev) 3346 3334 break; 3347 - ops->remove_dev_pasid(device->dev, pasid, domain); 3335 + iommu_remove_dev_pasid(device->dev, pasid, domain); 3348 3336 } 3349 3337 return ret; 3350 3338 } ··· 3352 3344 struct iommu_domain *domain) 3353 3345 { 3354 3346 struct group_device *device; 3355 - const struct iommu_ops *ops; 3356 3347 3357 - for_each_group_device(group, device) { 3358 - ops = dev_iommu_ops(device->dev); 3359 - ops->remove_dev_pasid(device->dev, pasid, domain); 3360 - } 3348 + for_each_group_device(group, device) 3349 + iommu_remove_dev_pasid(device->dev, pasid, domain); 3361 3350 } 3362 3351 3363 3352 /* ··· 3373 3368 /* Caller must be a probed driver on dev */ 3374 3369 struct iommu_group *group = dev->iommu_group; 3375 3370 struct group_device *device; 3371 + const struct iommu_ops *ops; 3376 3372 int ret; 3377 - 3378 - if (!domain->ops->set_dev_pasid) 3379 - return -EOPNOTSUPP; 3380 3373 3381 3374 if (!group) 3382 3375 return -ENODEV; 3383 3376 3384 - if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner || 3385 - pasid == IOMMU_NO_PASID) 3377 + ops = dev_iommu_ops(dev); 3378 + 3379 + if (!domain->ops->set_dev_pasid || 3380 + !ops->blocked_domain || 3381 + !ops->blocked_domain->ops->set_dev_pasid) 3382 + return -EOPNOTSUPP; 3383 + 3384 + if (ops != domain->owner || pasid == IOMMU_NO_PASID) 3386 3385 return -EINVAL; 3387 3386 3388 3387 mutex_lock(&group->mutex);
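With iommu_remove_dev_pasid() above, the core now detaches a PASID by asking the driver's blocked_domain to take it over, which is why iommu_attach_device_pasid() starts requiring ops->blocked_domain and a set_dev_pasid callback on it instead of the old remove_dev_pasid op. A minimal sketch of the driver-side contract this implies; the my_* names are placeholders, not a real driver:

    #include <linux/iommu.h>

    static int my_blocked_set_dev_pasid(struct iommu_domain *domain,
                                        struct device *dev, ioasid_t pasid,
                                        struct iommu_domain *old)
    {
            /* Tear down whatever 'old' had programmed for this PASID so that
             * further DMA tagged with it is blocked by the hardware. */
            return 0;
    }

    static const struct iommu_domain_ops my_blocked_domain_ops = {
            .set_dev_pasid  = my_blocked_set_dev_pasid,
    };

    static struct iommu_domain my_blocked_domain = {
            .type   = IOMMU_DOMAIN_BLOCKED,
            .ops    = &my_blocked_domain_ops,
    };

    static const struct iommu_ops my_iommu_ops = {
            .blocked_domain = &my_blocked_domain,
            /* ... other ops ... */
    };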
+11 -40
drivers/iommu/msm_iommu.c
··· 725 725 iommu->dev = &pdev->dev; 726 726 INIT_LIST_HEAD(&iommu->ctx_list); 727 727 728 - iommu->pclk = devm_clk_get(iommu->dev, "smmu_pclk"); 728 + iommu->pclk = devm_clk_get_prepared(iommu->dev, "smmu_pclk"); 729 729 if (IS_ERR(iommu->pclk)) 730 730 return dev_err_probe(iommu->dev, PTR_ERR(iommu->pclk), 731 731 "could not get smmu_pclk\n"); 732 732 733 - ret = clk_prepare(iommu->pclk); 734 - if (ret) 735 - return dev_err_probe(iommu->dev, ret, 736 - "could not prepare smmu_pclk\n"); 737 - 738 - iommu->clk = devm_clk_get(iommu->dev, "iommu_clk"); 739 - if (IS_ERR(iommu->clk)) { 740 - clk_unprepare(iommu->pclk); 733 + iommu->clk = devm_clk_get_prepared(iommu->dev, "iommu_clk"); 734 + if (IS_ERR(iommu->clk)) 741 735 return dev_err_probe(iommu->dev, PTR_ERR(iommu->clk), 742 736 "could not get iommu_clk\n"); 743 - } 744 - 745 - ret = clk_prepare(iommu->clk); 746 - if (ret) { 747 - clk_unprepare(iommu->pclk); 748 - return dev_err_probe(iommu->dev, ret, "could not prepare iommu_clk\n"); 749 - } 750 737 751 738 r = platform_get_resource(pdev, IORESOURCE_MEM, 0); 752 739 iommu->base = devm_ioremap_resource(iommu->dev, r); 753 740 if (IS_ERR(iommu->base)) { 754 741 ret = dev_err_probe(iommu->dev, PTR_ERR(iommu->base), "could not get iommu base\n"); 755 - goto fail; 742 + return ret; 756 743 } 757 744 ioaddr = r->start; 758 745 759 746 iommu->irq = platform_get_irq(pdev, 0); 760 - if (iommu->irq < 0) { 761 - ret = -ENODEV; 762 - goto fail; 763 - } 747 + if (iommu->irq < 0) 748 + return -ENODEV; 764 749 765 750 ret = of_property_read_u32(iommu->dev->of_node, "qcom,ncb", &val); 766 751 if (ret) { 767 752 dev_err(iommu->dev, "could not get ncb\n"); 768 - goto fail; 753 + return ret; 769 754 } 770 755 iommu->ncb = val; 771 756 ··· 765 780 766 781 if (!par) { 767 782 pr_err("Invalid PAR value detected\n"); 768 - ret = -ENODEV; 769 - goto fail; 783 + return -ENODEV; 770 784 } 771 785 772 786 ret = devm_request_threaded_irq(iommu->dev, iommu->irq, NULL, ··· 775 791 iommu); 776 792 if (ret) { 777 793 pr_err("Request IRQ %d failed with ret=%d\n", iommu->irq, ret); 778 - goto fail; 794 + return ret; 779 795 } 780 796 781 797 list_add(&iommu->dev_node, &qcom_iommu_devices); ··· 784 800 "msm-smmu.%pa", &ioaddr); 785 801 if (ret) { 786 802 pr_err("Could not add msm-smmu at %pa to sysfs\n", &ioaddr); 787 - goto fail; 803 + return ret; 788 804 } 789 805 790 806 ret = iommu_device_register(&iommu->iommu, &msm_iommu_ops, &pdev->dev); 791 807 if (ret) { 792 808 pr_err("Could not register msm-smmu at %pa\n", &ioaddr); 793 - goto fail; 809 + return ret; 794 810 } 795 811 796 812 pr_info("device mapped at %p, irq %d with %d ctx banks\n", 797 813 iommu->base, iommu->irq, iommu->ncb); 798 814 799 - return ret; 800 - fail: 801 - clk_unprepare(iommu->clk); 802 - clk_unprepare(iommu->pclk); 803 815 return ret; 804 816 } 805 817 ··· 804 824 {} 805 825 }; 806 826 807 - static void msm_iommu_remove(struct platform_device *pdev) 808 - { 809 - struct msm_iommu_dev *iommu = platform_get_drvdata(pdev); 810 - 811 - clk_unprepare(iommu->clk); 812 - clk_unprepare(iommu->pclk); 813 - } 814 - 815 827 static struct platform_driver msm_iommu_driver = { 816 828 .driver = { 817 829 .name = "msm_iommu", 818 830 .of_match_table = msm_iommu_dt_match, 819 831 }, 820 832 .probe = msm_iommu_probe, 821 - .remove = msm_iommu_remove, 822 833 }; 823 834 builtin_platform_driver(msm_iommu_driver);
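The msm_iommu probe rework above leans on devm_clk_get_prepared(), which combines devm_clk_get() with clk_prepare() and registers the matching unprepare/put as device-managed cleanup; that is what lets the fail: unwind label and the whole msm_iommu_remove() callback disappear. A minimal sketch of the pattern, with an invented example_probe() just for illustration:

    #include <linux/clk.h>
    #include <linux/device.h>

    static int example_probe(struct device *dev)
    {
            struct clk *pclk;

            /* Prepared clock; devres unprepares and releases it automatically
             * when the device is unbound, so error paths can simply return. */
            pclk = devm_clk_get_prepared(dev, "smmu_pclk");
            if (IS_ERR(pclk))
                    return dev_err_probe(dev, PTR_ERR(pclk),
                                         "could not get smmu_pclk\n");
            return 0;
    }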
+5 -4
drivers/iommu/mtk_iommu.c
··· 29 29 #include <linux/spinlock.h> 30 30 #include <linux/soc/mediatek/infracfg.h> 31 31 #include <linux/soc/mediatek/mtk_sip_svc.h> 32 + #include <linux/string_choices.h> 32 33 #include <asm/barrier.h> 33 34 #include <soc/mediatek/smi.h> 34 35 ··· 511 510 bank->parent_dev, 512 511 "fault type=0x%x iova=0x%llx pa=0x%llx master=0x%x(larb=%d port=%d) layer=%d %s\n", 513 512 int_state, fault_iova, fault_pa, regval, fault_larb, fault_port, 514 - layer, write ? "write" : "read"); 513 + layer, str_write_read(write)); 515 514 } 516 515 517 516 /* Interrupt clear */ ··· 603 602 larb_mmu->bank[portid] = upper_32_bits(region->iova_base); 604 603 605 604 dev_dbg(dev, "%s iommu for larb(%s) port 0x%lx region %d rgn-bank %d.\n", 606 - enable ? "enable" : "disable", dev_name(larb_mmu->dev), 605 + str_enable_disable(enable), dev_name(larb_mmu->dev), 607 606 portid_msk, regionid, upper_32_bits(region->iova_base)); 608 607 609 608 if (enable) ··· 631 630 } 632 631 if (ret) 633 632 dev_err(dev, "%s iommu(%s) inframaster 0x%lx fail(%d).\n", 634 - enable ? "enable" : "disable", 635 - dev_name(data->dev), portid_msk, ret); 633 + str_enable_disable(enable), dev_name(data->dev), 634 + portid_msk, ret); 636 635 } 637 636 return ret; 638 637 }
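The mtk_iommu logging changes above use the helpers from <linux/string_choices.h>, which are simple wrappers over the ternaries they replace: str_enable_disable(v) yields "enable" when v is true and "disable" otherwise, and str_write_read(v) yields "write"/"read" the same way. A tiny sketch; string_choices_demo() is invented for illustration:

    #include <linux/printk.h>
    #include <linux/string_choices.h>

    static void string_choices_demo(void)
    {
            pr_info("%s\n", str_enable_disable(true));      /* prints "enable" */
            pr_info("%s\n", str_write_read(false));         /* prints "read"   */
    }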
+2 -1
drivers/iommu/mtk_iommu_v1.c
··· 25 25 #include <linux/platform_device.h> 26 26 #include <linux/slab.h> 27 27 #include <linux/spinlock.h> 28 + #include <linux/string_choices.h> 28 29 #include <asm/barrier.h> 29 30 #include <asm/dma-iommu.h> 30 31 #include <dt-bindings/memory/mtk-memory-port.h> ··· 244 243 larb_mmu = &data->larb_imu[larbid]; 245 244 246 245 dev_dbg(dev, "%s iommu port: %d\n", 247 - enable ? "enable" : "disable", portid); 246 + str_enable_disable(enable), portid); 248 247 249 248 if (enable) 250 249 larb_mmu->mmu |= MTK_SMI_MMU_EN(portid);
-2
drivers/iommu/of_iommu.c
··· 29 29 return -ENODEV; 30 30 31 31 ret = iommu_fwspec_init(dev, of_fwnode_handle(iommu_spec->np)); 32 - if (ret == -EPROBE_DEFER) 33 - return driver_deferred_probe_check_state(dev); 34 32 if (ret) 35 33 return ret; 36 34
+8
drivers/iommu/riscv/iommu-pci.c
··· 101 101 riscv_iommu_remove(iommu); 102 102 } 103 103 104 + static void riscv_iommu_pci_shutdown(struct pci_dev *pdev) 105 + { 106 + struct riscv_iommu_device *iommu = dev_get_drvdata(&pdev->dev); 107 + 108 + riscv_iommu_disable(iommu); 109 + } 110 + 104 111 static const struct pci_device_id riscv_iommu_pci_tbl[] = { 105 112 {PCI_VDEVICE(REDHAT, PCI_DEVICE_ID_REDHAT_RISCV_IOMMU), 0}, 106 113 {PCI_VDEVICE(RIVOS, PCI_DEVICE_ID_RIVOS_RISCV_IOMMU_GA), 0}, ··· 119 112 .id_table = riscv_iommu_pci_tbl, 120 113 .probe = riscv_iommu_pci_probe, 121 114 .remove = riscv_iommu_pci_remove, 115 + .shutdown = riscv_iommu_pci_shutdown, 122 116 .driver = { 123 117 .suppress_bind_attrs = true, 124 118 },
+90 -18
drivers/iommu/riscv/iommu-platform.c
··· 11 11 */ 12 12 13 13 #include <linux/kernel.h> 14 + #include <linux/msi.h> 15 + #include <linux/of_irq.h> 14 16 #include <linux/of_platform.h> 15 17 #include <linux/platform_device.h> 16 18 17 19 #include "iommu-bits.h" 18 20 #include "iommu.h" 19 21 22 + static void riscv_iommu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg) 23 + { 24 + struct device *dev = msi_desc_to_dev(desc); 25 + struct riscv_iommu_device *iommu = dev_get_drvdata(dev); 26 + u16 idx = desc->msi_index; 27 + u64 addr; 28 + 29 + addr = ((u64)msg->address_hi << 32) | msg->address_lo; 30 + 31 + if (addr != (addr & RISCV_IOMMU_MSI_CFG_TBL_ADDR)) { 32 + dev_err_once(dev, 33 + "uh oh, the IOMMU can't send MSIs to 0x%llx, sending to 0x%llx instead\n", 34 + addr, addr & RISCV_IOMMU_MSI_CFG_TBL_ADDR); 35 + } 36 + 37 + addr &= RISCV_IOMMU_MSI_CFG_TBL_ADDR; 38 + 39 + riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_MSI_CFG_TBL_ADDR(idx), addr); 40 + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_MSI_CFG_TBL_DATA(idx), msg->data); 41 + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_MSI_CFG_TBL_CTRL(idx), 0); 42 + } 43 + 20 44 static int riscv_iommu_platform_probe(struct platform_device *pdev) 21 45 { 46 + enum riscv_iommu_igs_settings igs; 22 47 struct device *dev = &pdev->dev; 23 48 struct riscv_iommu_device *iommu = NULL; 24 49 struct resource *res = NULL; 25 - int vec; 50 + int vec, ret; 26 51 27 52 iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL); 28 53 if (!iommu) ··· 65 40 iommu->caps = riscv_iommu_readq(iommu, RISCV_IOMMU_REG_CAPABILITIES); 66 41 iommu->fctl = riscv_iommu_readl(iommu, RISCV_IOMMU_REG_FCTL); 67 42 68 - /* For now we only support WSI */ 69 - switch (FIELD_GET(RISCV_IOMMU_CAPABILITIES_IGS, iommu->caps)) { 70 - case RISCV_IOMMU_CAPABILITIES_IGS_WSI: 71 - case RISCV_IOMMU_CAPABILITIES_IGS_BOTH: 72 - break; 73 - default: 74 - return dev_err_probe(dev, -ENODEV, 75 - "unable to use wire-signaled interrupts\n"); 76 - } 77 - 78 43 iommu->irqs_count = platform_irq_count(pdev); 79 44 if (iommu->irqs_count <= 0) 80 45 return dev_err_probe(dev, -ENODEV, ··· 72 57 if (iommu->irqs_count > RISCV_IOMMU_INTR_COUNT) 73 58 iommu->irqs_count = RISCV_IOMMU_INTR_COUNT; 74 59 75 - for (vec = 0; vec < iommu->irqs_count; vec++) 76 - iommu->irqs[vec] = platform_get_irq(pdev, vec); 60 + igs = FIELD_GET(RISCV_IOMMU_CAPABILITIES_IGS, iommu->caps); 61 + switch (igs) { 62 + case RISCV_IOMMU_CAPABILITIES_IGS_BOTH: 63 + case RISCV_IOMMU_CAPABILITIES_IGS_MSI: 64 + if (is_of_node(dev->fwnode)) 65 + of_msi_configure(dev, to_of_node(dev->fwnode)); 77 66 78 - /* Enable wire-signaled interrupts, fctl.WSI */ 79 - if (!(iommu->fctl & RISCV_IOMMU_FCTL_WSI)) { 80 - iommu->fctl |= RISCV_IOMMU_FCTL_WSI; 81 - riscv_iommu_writel(iommu, RISCV_IOMMU_REG_FCTL, iommu->fctl); 67 + if (!dev_get_msi_domain(dev)) { 68 + dev_warn(dev, "failed to find an MSI domain\n"); 69 + goto msi_fail; 70 + } 71 + 72 + ret = platform_device_msi_init_and_alloc_irqs(dev, iommu->irqs_count, 73 + riscv_iommu_write_msi_msg); 74 + if (ret) { 75 + dev_warn(dev, "failed to allocate MSIs\n"); 76 + goto msi_fail; 77 + } 78 + 79 + for (vec = 0; vec < iommu->irqs_count; vec++) 80 + iommu->irqs[vec] = msi_get_virq(dev, vec); 81 + 82 + /* Enable message-signaled interrupts, fctl.WSI */ 83 + if (iommu->fctl & RISCV_IOMMU_FCTL_WSI) { 84 + iommu->fctl ^= RISCV_IOMMU_FCTL_WSI; 85 + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_FCTL, iommu->fctl); 86 + } 87 + 88 + dev_info(dev, "using MSIs\n"); 89 + break; 90 + 91 + msi_fail: 92 + if (igs != RISCV_IOMMU_CAPABILITIES_IGS_BOTH) { 93 + return 
dev_err_probe(dev, -ENODEV, 94 + "unable to use wire-signaled interrupts\n"); 95 + } 96 + 97 + fallthrough; 98 + 99 + case RISCV_IOMMU_CAPABILITIES_IGS_WSI: 100 + for (vec = 0; vec < iommu->irqs_count; vec++) 101 + iommu->irqs[vec] = platform_get_irq(pdev, vec); 102 + 103 + /* Enable wire-signaled interrupts, fctl.WSI */ 104 + if (!(iommu->fctl & RISCV_IOMMU_FCTL_WSI)) { 105 + iommu->fctl |= RISCV_IOMMU_FCTL_WSI; 106 + riscv_iommu_writel(iommu, RISCV_IOMMU_REG_FCTL, iommu->fctl); 107 + } 108 + dev_info(dev, "using wire-signaled interrupts\n"); 109 + break; 110 + default: 111 + return dev_err_probe(dev, -ENODEV, "invalid IGS\n"); 82 112 } 83 113 84 114 return riscv_iommu_init(iommu); ··· 131 71 132 72 static void riscv_iommu_platform_remove(struct platform_device *pdev) 133 73 { 134 - riscv_iommu_remove(dev_get_drvdata(&pdev->dev)); 74 + struct riscv_iommu_device *iommu = dev_get_drvdata(&pdev->dev); 75 + bool msi = !(iommu->fctl & RISCV_IOMMU_FCTL_WSI); 76 + 77 + riscv_iommu_remove(iommu); 78 + 79 + if (msi) 80 + platform_device_msi_free_irqs_all(&pdev->dev); 81 + }; 82 + 83 + static void riscv_iommu_platform_shutdown(struct platform_device *pdev) 84 + { 85 + riscv_iommu_disable(dev_get_drvdata(&pdev->dev)); 135 86 }; 136 87 137 88 static const struct of_device_id riscv_iommu_of_match[] = { ··· 153 82 static struct platform_driver riscv_iommu_platform_driver = { 154 83 .probe = riscv_iommu_platform_probe, 155 84 .remove = riscv_iommu_platform_remove, 85 + .shutdown = riscv_iommu_platform_shutdown, 156 86 .driver = { 157 87 .name = "riscv,iommu", 158 88 .of_match_table = riscv_iommu_of_match,
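In the MSI branch of the riscv iommu-platform probe above, the write to fctl is guarded by a check that RISCV_IOMMU_FCTL_WSI is currently set, so the XOR simply clears the bit and switches the IOMMU from wire-signaled to message-signaled interrupts. An equivalent, more explicit form of that step, as a sketch using the same register helpers as the patch:

    if (iommu->fctl & RISCV_IOMMU_FCTL_WSI) {
            /* fctl.WSI is known to be set here; clear it to leave
             * wire-signaled mode now that MSIs have been allocated. */
            iommu->fctl &= ~RISCV_IOMMU_FCTL_WSI;
            riscv_iommu_writel(iommu, RISCV_IOMMU_REG_FCTL, iommu->fctl);
    }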
+11 -3
drivers/iommu/riscv/iommu.c
··· 240 240 return rc; 241 241 } 242 242 243 + /* Empty queue before enabling it */ 244 + if (queue->qid == RISCV_IOMMU_INTR_CQ) 245 + riscv_iommu_writel(queue->iommu, Q_TAIL(queue), 0); 246 + else 247 + riscv_iommu_writel(queue->iommu, Q_HEAD(queue), 0); 248 + 243 249 /* 244 250 * Enable queue with interrupts, clear any memory fault if any. 245 251 * Wait for the hardware to acknowledge request and activate queue ··· 651 645 * This is best effort IOMMU translation shutdown flow. 652 646 * Disable IOMMU without waiting for hardware response. 653 647 */ 654 - static void riscv_iommu_disable(struct riscv_iommu_device *iommu) 648 + void riscv_iommu_disable(struct riscv_iommu_device *iommu) 655 649 { 656 - riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_DDTP, 0); 650 + riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_DDTP, 651 + FIELD_PREP(RISCV_IOMMU_DDTP_IOMMU_MODE, 652 + RISCV_IOMMU_DDTP_IOMMU_MODE_BARE)); 657 653 riscv_iommu_writel(iommu, RISCV_IOMMU_REG_CQCSR, 0); 658 654 riscv_iommu_writel(iommu, RISCV_IOMMU_REG_FQCSR, 0); 659 655 riscv_iommu_writel(iommu, RISCV_IOMMU_REG_PQCSR, 0); ··· 1278 1270 dma_addr_t iova) 1279 1271 { 1280 1272 struct riscv_iommu_domain *domain = iommu_domain_to_riscv(iommu_domain); 1281 - unsigned long pte_size; 1273 + size_t pte_size; 1282 1274 unsigned long *ptr; 1283 1275 1284 1276 ptr = riscv_iommu_pte_fetch(domain, iova, &pte_size);
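riscv_iommu_disable() above now programs the ddtp mode field explicitly to the Bare setting via FIELD_PREP() rather than writing a raw zero to the register. FIELD_PREP(mask, val) from <linux/bitfield.h> shifts val into the bit positions covered by mask; a minimal sketch with an invented example field:

    #include <linux/bitfield.h>
    #include <linux/bits.h>

    static u64 field_prep_demo(void)
    {
            /* Place the value 1 into an example field spanning bits [7:4];
             * the result is 0x10. */
            return FIELD_PREP(GENMASK(7, 4), 1);
    }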
+1
drivers/iommu/riscv/iommu.h
··· 64 64 65 65 int riscv_iommu_init(struct riscv_iommu_device *iommu); 66 66 void riscv_iommu_remove(struct riscv_iommu_device *iommu); 67 + void riscv_iommu_disable(struct riscv_iommu_device *iommu); 67 68 68 69 #define riscv_iommu_readl(iommu, addr) \ 69 70 readl_relaxed((iommu)->reg + (addr))
+2 -1
drivers/iommu/rockchip-iommu.c
··· 25 25 #include <linux/pm_runtime.h> 26 26 #include <linux/slab.h> 27 27 #include <linux/spinlock.h> 28 + #include <linux/string_choices.h> 28 29 29 30 #include "iommu-pages.h" 30 31 ··· 612 611 613 612 dev_err(iommu->dev, "Page fault at %pad of type %s\n", 614 613 &iova, 615 - (flags == IOMMU_FAULT_WRITE) ? "write" : "read"); 614 + str_write_read(flags == IOMMU_FAULT_WRITE)); 616 615 617 616 log_iova(iommu, i, iova); 618 617
+7
include/linux/adreno-smmu-priv.h
··· 50 50 * the GPU driver must call resume_translation() 51 51 * @resume_translation: Resume translation after a fault 52 52 * 53 + * @set_prr_bit: [optional] Configure the GPU's Partially Resident 54 + * Region (PRR) bit in the ACTLR register. 55 + * @set_prr_addr: [optional] Configure the PRR_CFG_*ADDR register with 56 + * the physical address of PRR page passed from GPU 57 + * driver. 53 58 * 54 59 * The GPU driver (drm/msm) and adreno-smmu work together for controlling 55 60 * the GPU's SMMU instance. This is by necessity, as the GPU is directly ··· 72 67 void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); 73 68 void (*set_stall)(const void *cookie, bool enabled); 74 69 void (*resume_translation)(const void *cookie, bool terminate); 70 + void (*set_prr_bit)(const void *cookie, bool set); 71 + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); 75 72 }; 76 73 77 74 #endif /* __ADRENO_SMMU_PRIV_H */
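Since both new PRR callbacks in adreno_smmu_priv are marked optional, a GPU-driver caller would be expected to check for them before use and to program the PRR page address before setting the bit. A minimal sketch; example_enable_prr() and prr_page_addr are invented for illustration:

    #include <linux/adreno-smmu-priv.h>

    static void example_enable_prr(const struct adreno_smmu_priv *priv,
                                   phys_addr_t prr_page_addr)
    {
            /* Both hooks are optional; bail out if the SMMU side lacks them. */
            if (!priv->set_prr_addr || !priv->set_prr_bit)
                    return;

            priv->set_prr_addr(priv->cookie, prr_page_addr);
            priv->set_prr_bit(priv->cookie, true);
    }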
+2 -2
include/linux/amd-iommu.h
··· 31 31 struct task_struct; 32 32 struct pci_dev; 33 33 34 - extern int amd_iommu_detect(void); 34 + extern void amd_iommu_detect(void); 35 35 36 36 #else /* CONFIG_AMD_IOMMU */ 37 37 38 - static inline int amd_iommu_detect(void) { return -ENODEV; } 38 + static inline void amd_iommu_detect(void) { } 39 39 40 40 #endif /* CONFIG_AMD_IOMMU */ 41 41
+11
include/linux/io-pgtable.h
··· 181 181 }; 182 182 183 183 /** 184 + * struct arm_lpae_io_pgtable_walk_data - information from a pgtable walk 185 + * 186 + * @ptes: The recorded PTE values from the walk 187 + */ 188 + struct arm_lpae_io_pgtable_walk_data { 189 + u64 ptes[4]; 190 + }; 191 + 192 + /** 184 193 * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers. 185 194 * 186 195 * @map_pages: Map a physically contiguous range of pages of the same size. 187 196 * @unmap_pages: Unmap a range of virtually contiguous pages of the same size. 188 197 * @iova_to_phys: Translate iova to physical address. 198 + * @pgtable_walk: (optional) Perform a page table walk for a given iova. 189 199 * 190 200 * These functions map directly onto the iommu_ops member functions with 191 201 * the same names. ··· 209 199 struct iommu_iotlb_gather *gather); 210 200 phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops, 211 201 unsigned long iova); 202 + int (*pgtable_walk)(struct io_pgtable_ops *ops, unsigned long iova, void *wd); 212 203 int (*read_and_clear_dirty)(struct io_pgtable_ops *ops, 213 204 unsigned long iova, size_t size, 214 205 unsigned long flags,
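The new, optional pgtable_walk op documented above is what the LPAE code earlier in this series wires up: it fills a caller-provided struct arm_lpae_io_pgtable_walk_data with the PTE observed at each level of the walk. A minimal caller sketch for debugging an unexpected fault; example_dump_walk() is invented for illustration:

    #include <linux/io-pgtable.h>
    #include <linux/kernel.h>
    #include <linux/printk.h>

    /* 'ops' is the io_pgtable_ops obtained from alloc_io_pgtable_ops(). */
    static void example_dump_walk(struct io_pgtable_ops *ops, unsigned long iova)
    {
            struct arm_lpae_io_pgtable_walk_data wd = {};
            int i;

            if (!ops->pgtable_walk)
                    return;                 /* the op is optional */

            if (ops->pgtable_walk(ops, iova, &wd))
                    return;                 /* walk failed, e.g. unmapped iova */

            /* Levels not reached by the walk are left zero-initialised. */
            for (i = 0; i < ARRAY_SIZE(wd.ptes); i++)
                    pr_info("lvl%d pte: %#llx\n", i,
                            (unsigned long long)wd.ptes[i]);
    }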
-5
include/linux/iommu.h
··· 587 587 * - IOMMU_DOMAIN_DMA: must use a dma domain 588 588 * - 0: use the default setting 589 589 * @default_domain_ops: the default ops for domains 590 - * @remove_dev_pasid: Remove any translation configurations of a specific 591 - * pasid, so that any DMA transactions with this pasid 592 - * will be blocked by the hardware. 593 590 * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind 594 591 * the @dev, as the set of virtualization resources shared/passed 595 592 * to user space IOMMU instance. And associate it with a nesting ··· 644 647 struct iommu_page_response *msg); 645 648 646 649 int (*def_domain_type)(struct device *dev); 647 - void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid, 648 - struct iommu_domain *domain); 649 650 650 651 struct iommufd_viommu *(*viommu_alloc)( 651 652 struct device *dev, struct iommu_domain *parent_domain,