
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
"Two significant new items:

- Allow reporting IOMMU HW events to userspace when the events are
clearly linked to a device.

This is linked to the VIOMMU object and is intended to be used by a
VMM to forward HW events to the virtual machine as part of
emulating a vIOMMU. ARM SMMUv3 is the first driver to use this
mechanism. Like the existing fault events, the data is delivered
through a simple FD returning event records on read().

- PASID support in VFIO.

The "Process Address Space ID" is a PCI feature that allows the
device to tag all PCI DMA operations with an ID. The IOMMU will
then use the ID to select a unique translation for those DMAs. This
is part of Intel's vIOMMU support as VT-D HW requires the
hypervisor to manage each PASID entry.

The support is generic so any VFIO user could attach any
translation to a PASID, and the support should work on ARM SMMUv3
as well. AMD requires additional driver work.

Some minor updates, along with fixes:

- Prevent using nested parents with faults; there is no driver support today

- Put a single "cookie_type" value in the iommu_domain to indicate
what owns the various opaque owner fields"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (49 commits)
iommufd: Test attach before detaching pasid
iommufd: Fix iommu_vevent_header tables markup
iommu: Convert unreachable() to BUG()
iommufd: Balance veventq->num_events inc/dec
iommufd: Initialize the flags of vevent in iommufd_viommu_report_event()
iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO
iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
ida: Add ida_find_first_range()
iommufd/selftest: Add coverage for iommufd pasid attach/detach
iommufd/selftest: Add test ops to test pasid attach/detach
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add set_dev_pasid in mock iommu
iommufd: Allow allocating PASID-compatible domain
iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
iommufd: Enforce PASID-compatible domain for RID
iommufd: Support pasid attach/replace
iommufd: Enforce PASID-compatible domain in PASID path
iommufd/device: Add pasid_attach array to track per-PASID attach
...

+3155 -837
+17
Documentation/userspace-api/iommufd.rst
···
 space usually has mappings from guest-level I/O virtual addresses to guest-
 level physical addresses.

+- IOMMUFD_FAULT, representing a software queue for an HWPT reporting IO page
+  faults using the IOMMU HW's PRI (Page Request Interface). This queue object
+  provides user space an FD to poll the page fault events and also to respond
+  to those events. A FAULT object must be created first to get a fault_id that
+  could be then used to allocate a fault-enabled HWPT via the IOMMU_HWPT_ALLOC
+  command by setting the IOMMU_HWPT_FAULT_ID_VALID bit in its flags field.
+
 - IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
   passed to or shared with a VM. It may be some HW-accelerated virtualization
   features and some SW resources used by the VM. For examples:
···
   to forward all the device information in a VM, when it connects a device to a
   vIOMMU, which is a separate ioctl call from attaching the same device to an
   HWPT_PAGING that the vIOMMU holds.
+
+- IOMMUFD_OBJ_VEVENTQ, representing a software queue for a vIOMMU to report its
+  events such as translation faults occurred to a nested stage-1 (excluding I/O
+  page faults that should go through IOMMUFD_OBJ_FAULT) and HW-specific events.
+  This queue object provides user space an FD to poll/read the vIOMMU events. A
+  vIOMMU object must be created first to get its viommu_id, which could be then
+  used to allocate a vEVENTQ. Each vIOMMU can support multiple types of vEVENTS,
+  but is confined to one vEVENTQ per vEVENTQ type.

 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.

···
 - iommufd_device for IOMMUFD_OBJ_DEVICE.
 - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
 - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_fault for IOMMUFD_OBJ_FAULT.
 - iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 - iommufd_vdevice for IOMMUFD_OBJ_VDEVICE.
+- iommufd_veventq for IOMMUFD_OBJ_VEVENTQ.

 Several terminologies when looking at these datastructures:

+60
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
···
 		target->data[0] |= nested_domain->ste[0] &
 				   ~cpu_to_le64(STRTAB_STE_0_CFG);
 		target->data[1] |= nested_domain->ste[1];
+		/* Merge events for DoS mitigations on eventq */
+		target->data[1] |= cpu_to_le64(STRTAB_STE_1_MEV);
 	}
 
 	/*
···
 		arm_smmu_make_abort_ste(target);
 		break;
 	}
+}
+
+int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				    struct arm_smmu_nested_domain *nested_domain)
+{
+	struct arm_smmu_vmaster *vmaster;
+	unsigned long vsid;
+	int ret;
+
+	iommu_group_mutex_assert(state->master->dev);
+
+	ret = iommufd_viommu_get_vdev_id(&nested_domain->vsmmu->core,
+					 state->master->dev, &vsid);
+	if (ret)
+		return ret;
+
+	vmaster = kzalloc(sizeof(*vmaster), GFP_KERNEL);
+	if (!vmaster)
+		return -ENOMEM;
+	vmaster->vsmmu = nested_domain->vsmmu;
+	vmaster->vsid = vsid;
+	state->vmaster = vmaster;
+
+	return 0;
+}
+
+void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state)
+{
+	struct arm_smmu_master *master = state->master;
+
+	mutex_lock(&master->smmu->streams_mutex);
+	kfree(master->vmaster);
+	master->vmaster = state->vmaster;
+	mutex_unlock(&master->smmu->streams_mutex);
+}
+
+void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
+{
+	struct arm_smmu_attach_state state = { .master = master };
+
+	arm_smmu_attach_commit_vmaster(&state);
 }
 
 static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
···
 	vsmmu->vmid = s2_parent->s2_cfg.vmid;
 
 	return &vsmmu->core;
+}
+
+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt)
+{
+	struct iommu_vevent_arm_smmuv3 vevt;
+	int i;
+
+	lockdep_assert_held(&vmaster->vsmmu->smmu->streams_mutex);
+
+	vevt.evt[0] = cpu_to_le64((evt[0] & ~EVTQ_0_SID) |
+				  FIELD_PREP(EVTQ_0_SID, vmaster->vsid));
+	for (i = 1; i < EVTQ_ENT_DWORDS; i++)
+		vevt.evt[i] = cpu_to_le64(evt[i]);
+
+	return iommufd_viommu_report_event(&vmaster->vsmmu->core,
+					   IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &vevt,
+					   sizeof(vevt));
 }
 
 MODULE_IMPORT_NS("IOMMUFD");
+52 -28
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
···
 			cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
 				    STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
 				    STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
-				    STRTAB_STE_1_EATS);
+				    STRTAB_STE_1_EATS | STRTAB_STE_1_MEV);
 		used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
 
 	/*
···
 	if (cfg & BIT(1)) {
 		used_bits[1] |=
 			cpu_to_le64(STRTAB_STE_1_S2FWB | STRTAB_STE_1_EATS |
-				    STRTAB_STE_1_SHCFG);
+				    STRTAB_STE_1_SHCFG | STRTAB_STE_1_MEV);
 		used_bits[2] |=
 			cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
 				    STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
···
 	mutex_unlock(&smmu->streams_mutex);
 }
 
-static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
-				 struct arm_smmu_event *event)
+static int arm_smmu_handle_event(struct arm_smmu_device *smmu, u64 *evt,
+				 struct arm_smmu_event *event)
 {
 	int ret = 0;
 	u32 perm = 0;
···
 	struct iommu_fault *flt = &fault_evt.fault;
 
 	switch (event->id) {
+	case EVT_ID_BAD_STE_CONFIG:
+	case EVT_ID_STREAM_DISABLED_FAULT:
+	case EVT_ID_BAD_SUBSTREAMID_CONFIG:
+	case EVT_ID_BAD_CD_CONFIG:
 	case EVT_ID_TRANSLATION_FAULT:
 	case EVT_ID_ADDR_SIZE_FAULT:
 	case EVT_ID_ACCESS_FAULT:
···
 		return -EOPNOTSUPP;
 	}
 
-	if (!event->stall)
-		return -EOPNOTSUPP;
-
-	if (event->read)
-		perm |= IOMMU_FAULT_PERM_READ;
-	else
-		perm |= IOMMU_FAULT_PERM_WRITE;
-
-	if (event->instruction)
-		perm |= IOMMU_FAULT_PERM_EXEC;
-
-	if (event->privileged)
-		perm |= IOMMU_FAULT_PERM_PRIV;
-
-	flt->type = IOMMU_FAULT_PAGE_REQ;
-	flt->prm = (struct iommu_fault_page_request) {
-		.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
-		.grpid = event->stag,
-		.perm = perm,
-		.addr = event->iova,
-	};
-
-	if (event->ssv) {
-		flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
-		flt->prm.pasid = event->ssid;
+	if (event->stall) {
+		if (event->read)
+			perm |= IOMMU_FAULT_PERM_READ;
+		else
+			perm |= IOMMU_FAULT_PERM_WRITE;
+
+		if (event->instruction)
+			perm |= IOMMU_FAULT_PERM_EXEC;
+
+		if (event->privileged)
+			perm |= IOMMU_FAULT_PERM_PRIV;
+
+		flt->type = IOMMU_FAULT_PAGE_REQ;
+		flt->prm = (struct iommu_fault_page_request){
+			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
+			.grpid = event->stag,
+			.perm = perm,
+			.addr = event->iova,
+		};
+
+		if (event->ssv) {
+			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+			flt->prm.pasid = event->ssid;
+		}
 	}
 
 	mutex_lock(&smmu->streams_mutex);
···
 		goto out_unlock;
 	}
 
-	ret = iommu_report_device_fault(master->dev, &fault_evt);
+	if (event->stall)
+		ret = iommu_report_device_fault(master->dev, &fault_evt);
+	else if (master->vmaster && !event->s2)
+		ret = arm_vmaster_report_event(master->vmaster, evt);
+	else
+		ret = -EOPNOTSUPP; /* Unhandled events should be pinned */
 out_unlock:
 	mutex_unlock(&smmu->streams_mutex);
 	return ret;
···
 	do {
 		while (!queue_remove_raw(q, evt)) {
 			arm_smmu_decode_event(smmu, evt, &event);
-			if (arm_smmu_handle_event(smmu, &event))
+			if (arm_smmu_handle_event(smmu, evt, &event))
 				arm_smmu_dump_event(smmu, evt, &event, &rs);
 
 			put_device(event.dev);
···
 	struct arm_smmu_domain *smmu_domain =
 		to_smmu_domain_devices(new_domain);
 	unsigned long flags;
+	int ret;
 
 	/*
 	 * arm_smmu_share_asid() must not see two domains pointing to the same
···
 	}
 
 	if (smmu_domain) {
+		if (new_domain->type == IOMMU_DOMAIN_NESTED) {
+			ret = arm_smmu_attach_prepare_vmaster(
+				state, to_smmu_nested_domain(new_domain));
+			if (ret)
+				return ret;
+		}
+
 		master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
-		if (!master_domain)
+		if (!master_domain) {
+			kfree(state->vmaster);
 			return -ENOMEM;
+		}
 		master_domain->master = master;
 		master_domain->ssid = state->ssid;
 		if (new_domain->type == IOMMU_DOMAIN_NESTED)
···
 			spin_unlock_irqrestore(&smmu_domain->devices_lock,
 					       flags);
 			kfree(master_domain);
+			kfree(state->vmaster);
 			return -EINVAL;
 		}
 
···
 	struct arm_smmu_master *master = state->master;
 
 	lockdep_assert_held(&arm_smmu_asid_lock);
+
+	arm_smmu_attach_commit_vmaster(state);
 
 	if (state->ats_enabled && !master->ats_enabled) {
 		arm_smmu_enable_ats(master);
···
 	struct arm_smmu_ste ste;
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
 
+	arm_smmu_master_clear_vmaster(master);
 	arm_smmu_make_bypass_ste(master->smmu, &ste);
 	arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
 	return 0;
···
 				 struct device *dev)
 {
 	struct arm_smmu_ste ste;
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
 
+	arm_smmu_master_clear_vmaster(master);
 	arm_smmu_make_abort_ste(&ste);
 	arm_smmu_attach_dev_ste(domain, dev, &ste,
 				STRTAB_STE_1_S1DSS_TERMINATE);
+36
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
···
 #define STRTAB_STE_1_S1COR		GENMASK_ULL(5, 4)
 #define STRTAB_STE_1_S1CSH		GENMASK_ULL(7, 6)
 
+#define STRTAB_STE_1_MEV		(1UL << 19)
 #define STRTAB_STE_1_S2FWB		(1UL << 25)
 #define STRTAB_STE_1_S1STALLD		(1UL << 27)
···
 	struct rb_node node;
 };
 
+struct arm_smmu_vmaster {
+	struct arm_vsmmu *vsmmu;
+	unsigned long vsid;
+};
+
 struct arm_smmu_event {
 	u8 stall : 1,
 	   ssv : 1,
···
 	struct arm_smmu_device *smmu;
 	struct device *dev;
 	struct arm_smmu_stream *streams;
+	struct arm_smmu_vmaster *vmaster; /* use smmu->streams_mutex */
 	/* Locked by the iommu core using the group mutex */
 	struct arm_smmu_ctx_desc_cfg cd_table;
 	unsigned int num_streams;
···
 	bool disable_ats;
 	ioasid_t ssid;
 	/* Resulting state */
+	struct arm_smmu_vmaster *vmaster;
 	bool ats_enabled;
 };
···
 				    struct iommu_domain *parent,
 				    struct iommufd_ctx *ictx,
 				    unsigned int viommu_type);
+int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				    struct arm_smmu_nested_domain *nested_domain);
+void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
+void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
 #else
 #define arm_smmu_hw_info NULL
 #define arm_vsmmu_alloc NULL
+
+static inline int
+arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+				struct arm_smmu_nested_domain *nested_domain)
+{
+	return 0;
+}
+
+static inline void
+arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state)
+{
+}
+
+static inline void
+arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
+{
+}
+
+static inline int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster,
+					   u64 *evt)
+{
+	return -EOPNOTSUPP;
+}
 #endif /* CONFIG_ARM_SMMU_V3_IOMMUFD */
 
 #endif /* _ARM_SMMU_V3_H */
+105 -101
drivers/iommu/dma-iommu.c
···
 	phys_addr_t phys;
 };
 
-enum iommu_dma_cookie_type {
-	IOMMU_DMA_IOVA_COOKIE,
-	IOMMU_DMA_MSI_COOKIE,
-};
-
 enum iommu_dma_queue_type {
 	IOMMU_DMA_OPTS_PER_CPU_QUEUE,
 	IOMMU_DMA_OPTS_SINGLE_QUEUE,
···
 };
 
 struct iommu_dma_cookie {
-	enum iommu_dma_cookie_type type;
+	struct iova_domain iovad;
+	struct list_head msi_page_list;
+	/* Flush queue */
 	union {
-		/* Full allocator for IOMMU_DMA_IOVA_COOKIE */
-		struct {
-			struct iova_domain iovad;
-			/* Flush queue */
-			union {
-				struct iova_fq *single_fq;
-				struct iova_fq __percpu *percpu_fq;
-			};
-			/* Number of TLB flushes that have been started */
-			atomic64_t fq_flush_start_cnt;
-			/* Number of TLB flushes that have been finished */
-			atomic64_t fq_flush_finish_cnt;
-			/* Timer to regularily empty the flush queues */
-			struct timer_list fq_timer;
-			/* 1 when timer is active, 0 when not */
-			atomic_t fq_timer_on;
-		};
-		/* Trivial linear page allocator for IOMMU_DMA_MSI_COOKIE */
-		dma_addr_t msi_iova;
+		struct iova_fq *single_fq;
+		struct iova_fq __percpu *percpu_fq;
 	};
-	struct list_head msi_page_list;
-
+	/* Number of TLB flushes that have been started */
+	atomic64_t fq_flush_start_cnt;
+	/* Number of TLB flushes that have been finished */
+	atomic64_t fq_flush_finish_cnt;
+	/* Timer to regularily empty the flush queues */
+	struct timer_list fq_timer;
+	/* 1 when timer is active, 0 when not */
+	atomic_t fq_timer_on;
 	/* Domain for flush queue callback; NULL if flush queue not in use */
-	struct iommu_domain *fq_domain;
+	struct iommu_domain *fq_domain;
 	/* Options for dma-iommu use */
-	struct iommu_dma_options options;
+	struct iommu_dma_options options;
+};
+
+struct iommu_dma_msi_cookie {
+	dma_addr_t msi_iova;
+	struct list_head msi_page_list;
 };
 
 static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);
···
 	return ret;
 }
 early_param("iommu.forcedac", iommu_dma_forcedac_setup);
-
-static int iommu_dma_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
-			    phys_addr_t msi_addr);
 
 /* Number of entries per flush queue */
 #define IOVA_DEFAULT_FQ_SIZE	256
···
 	return 0;
 }
 
-static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
-{
-	if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
-		return cookie->iovad.granule;
-	return PAGE_SIZE;
-}
-
-static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
-{
-	struct iommu_dma_cookie *cookie;
-
-	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
-	if (cookie) {
-		INIT_LIST_HEAD(&cookie->msi_page_list);
-		cookie->type = type;
-	}
-	return cookie;
-}
-
 /**
  * iommu_get_dma_cookie - Acquire DMA-API resources for a domain
  * @domain: IOMMU domain to prepare for DMA-API usage
  */
 int iommu_get_dma_cookie(struct iommu_domain *domain)
 {
-	if (domain->iova_cookie)
+	struct iommu_dma_cookie *cookie;
+
+	if (domain->cookie_type != IOMMU_COOKIE_NONE)
 		return -EEXIST;
 
-	domain->iova_cookie = cookie_alloc(IOMMU_DMA_IOVA_COOKIE);
-	if (!domain->iova_cookie)
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (!cookie)
 		return -ENOMEM;
 
-	iommu_domain_set_sw_msi(domain, iommu_dma_sw_msi);
+	INIT_LIST_HEAD(&cookie->msi_page_list);
+	domain->cookie_type = IOMMU_COOKIE_DMA_IOVA;
+	domain->iova_cookie = cookie;
 	return 0;
 }
···
  */
 int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
 {
-	struct iommu_dma_cookie *cookie;
+	struct iommu_dma_msi_cookie *cookie;
 
 	if (domain->type != IOMMU_DOMAIN_UNMANAGED)
 		return -EINVAL;
 
-	if (domain->iova_cookie)
+	if (domain->cookie_type != IOMMU_COOKIE_NONE)
 		return -EEXIST;
 
-	cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE);
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
 	if (!cookie)
 		return -ENOMEM;
 
 	cookie->msi_iova = base;
-	domain->iova_cookie = cookie;
-	iommu_domain_set_sw_msi(domain, iommu_dma_sw_msi);
+	INIT_LIST_HEAD(&cookie->msi_page_list);
+	domain->cookie_type = IOMMU_COOKIE_DMA_MSI;
+	domain->msi_cookie = cookie;
 	return 0;
 }
 EXPORT_SYMBOL(iommu_get_msi_cookie);
 
 /**
  * iommu_put_dma_cookie - Release a domain's DMA mapping resources
- * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie() or
- *          iommu_get_msi_cookie()
+ * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
  */
 void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iommu_dma_msi_page *msi, *tmp;
 
-#if IS_ENABLED(CONFIG_IRQ_MSI_IOMMU)
-	if (domain->sw_msi != iommu_dma_sw_msi)
-		return;
-#endif
-
-	if (!cookie)
-		return;
-
-	if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule) {
+	if (cookie->iovad.granule) {
 		iommu_dma_free_fq(cookie);
 		put_iova_domain(&cookie->iovad);
 	}
-
-	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
-		list_del(&msi->list);
+	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list)
 		kfree(msi);
-	}
 	kfree(cookie);
-	domain->iova_cookie = NULL;
+}
+
+/**
+ * iommu_put_msi_cookie - Release a domain's MSI mapping resources
+ * @domain: IOMMU domain previously prepared by iommu_get_msi_cookie()
+ */
+void iommu_put_msi_cookie(struct iommu_domain *domain)
+{
+	struct iommu_dma_msi_cookie *cookie = domain->msi_cookie;
+	struct iommu_dma_msi_page *msi, *tmp;
+
+	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list)
+		kfree(msi);
+	kfree(cookie);
 }
···
 	struct iova_domain *iovad;
 	int ret;
 
-	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+	if (!cookie || domain->cookie_type != IOMMU_COOKIE_DMA_IOVA)
 		return -EINVAL;
 
 	iovad = &cookie->iovad;
···
 	struct iova_domain *iovad = &cookie->iovad;
 	unsigned long shift, iova_len, iova;
 
-	if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
-		cookie->msi_iova += size;
-		return cookie->msi_iova - size;
+	if (domain->cookie_type == IOMMU_COOKIE_DMA_MSI) {
+		domain->msi_cookie->msi_iova += size;
+		return domain->msi_cookie->msi_iova - size;
 	}
 
 	shift = iova_shift(iovad);
···
 	return (dma_addr_t)iova << shift;
 }
 
-static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
-		dma_addr_t iova, size_t size, struct iommu_iotlb_gather *gather)
+static void iommu_dma_free_iova(struct iommu_domain *domain, dma_addr_t iova,
+				size_t size, struct iommu_iotlb_gather *gather)
 {
-	struct iova_domain *iovad = &cookie->iovad;
+	struct iova_domain *iovad = &domain->iova_cookie->iovad;
 
 	/* The MSI case is only ever cleaning up its most recent allocation */
-	if (cookie->type == IOMMU_DMA_MSI_COOKIE)
-		cookie->msi_iova -= size;
+	if (domain->cookie_type == IOMMU_COOKIE_DMA_MSI)
+		domain->msi_cookie->msi_iova -= size;
 	else if (gather && gather->queued)
-		queue_iova(cookie, iova_pfn(iovad, iova),
+		queue_iova(domain->iova_cookie, iova_pfn(iovad, iova),
 			   size >> iova_shift(iovad),
 			   &gather->freelist);
 	else
···
 	if (!iotlb_gather.queued)
 		iommu_iotlb_sync(domain, &iotlb_gather);
-	iommu_dma_free_iova(cookie, dma_addr, size, &iotlb_gather);
+	iommu_dma_free_iova(domain, dma_addr, size, &iotlb_gather);
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
···
 		return DMA_MAPPING_ERROR;
 
 	if (iommu_map(domain, iova, phys - iova_off, size, prot, GFP_ATOMIC)) {
-		iommu_dma_free_iova(cookie, iova, size, NULL);
+		iommu_dma_free_iova(domain, iova, size, NULL);
 		return DMA_MAPPING_ERROR;
 	}
 	return iova + iova_off;
···
 out_free_sg:
 	sg_free_table(sgt);
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, size, NULL);
+	iommu_dma_free_iova(domain, iova, size, NULL);
 out_free_pages:
 	__iommu_dma_free_pages(pages, count);
 	return NULL;
···
 		return __finalise_sg(dev, sg, nents, iova);
 
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, iova_len, NULL);
+	iommu_dma_free_iova(domain, iova, iova_len, NULL);
 out_restore_sg:
 	__invalidate_sg(sg, nents);
 out:
···
 	dev->dma_iommu = false;
 }
 
+static bool has_msi_cookie(const struct iommu_domain *domain)
+{
+	return domain && (domain->cookie_type == IOMMU_COOKIE_DMA_IOVA ||
+			  domain->cookie_type == IOMMU_COOKIE_DMA_MSI);
+}
+
+static size_t cookie_msi_granule(const struct iommu_domain *domain)
+{
+	switch (domain->cookie_type) {
+	case IOMMU_COOKIE_DMA_IOVA:
+		return domain->iova_cookie->iovad.granule;
+	case IOMMU_COOKIE_DMA_MSI:
+		return PAGE_SIZE;
+	default:
+		BUG();
+	};
+}
+
+static struct list_head *cookie_msi_pages(const struct iommu_domain *domain)
+{
+	switch (domain->cookie_type) {
+	case IOMMU_COOKIE_DMA_IOVA:
+		return &domain->iova_cookie->msi_page_list;
+	case IOMMU_COOKIE_DMA_MSI:
+		return &domain->msi_cookie->msi_page_list;
+	default:
+		BUG();
+	};
+}
+
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		phys_addr_t msi_addr, struct iommu_domain *domain)
 {
-	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct list_head *msi_page_list = cookie_msi_pages(domain);
 	struct iommu_dma_msi_page *msi_page;
 	dma_addr_t iova;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
-	size_t size = cookie_msi_granule(cookie);
+	size_t size = cookie_msi_granule(domain);
 
 	msi_addr &= ~(phys_addr_t)(size - 1);
-	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
+	list_for_each_entry(msi_page, msi_page_list, list)
 		if (msi_page->phys == msi_addr)
 			return msi_page;
 
···
 	INIT_LIST_HEAD(&msi_page->list);
 	msi_page->phys = msi_addr;
 	msi_page->iova = iova;
-	list_add(&msi_page->list, &cookie->msi_page_list);
+	list_add(&msi_page->list, msi_page_list);
 	return msi_page;
 
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, size, NULL);
+	iommu_dma_free_iova(domain, iova, size, NULL);
 out_free_page:
 	kfree(msi_page);
 	return NULL;
 }
 
-static int iommu_dma_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
-			    phys_addr_t msi_addr)
+int iommu_dma_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
+		     phys_addr_t msi_addr)
 {
 	struct device *dev = msi_desc_to_dev(desc);
 	const struct iommu_dma_msi_page *msi_page;
 
-	if (!domain->iova_cookie) {
+	if (!has_msi_cookie(domain)) {
 		msi_desc_set_iommu_msi_iova(desc, 0, 0);
 		return 0;
 	}
···
 	if (!msi_page)
 		return -ENOMEM;
 
-	msi_desc_set_iommu_msi_iova(
-		desc, msi_page->iova,
-		ilog2(cookie_msi_granule(domain->iova_cookie)));
+	msi_desc_set_iommu_msi_iova(desc, msi_page->iova,
+				    ilog2(cookie_msi_granule(domain)));
 	return 0;
 }
+14
drivers/iommu/dma-iommu.h
···
 
 int iommu_get_dma_cookie(struct iommu_domain *domain);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
+void iommu_put_msi_cookie(struct iommu_domain *domain);
 
 int iommu_dma_init_fq(struct iommu_domain *domain);
 
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
+
+int iommu_dma_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
+		     phys_addr_t msi_addr);
 
 extern bool iommu_dma_forcedac;
 
···
 {
 }
 
+static inline void iommu_put_msi_cookie(struct iommu_domain *domain)
+{
+}
+
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
+}
+
+static inline int iommu_dma_sw_msi(struct iommu_domain *domain,
+				   struct msi_desc *desc, phys_addr_t msi_addr)
+{
+	return -ENODEV;
 }
 
 #endif /* CONFIG_IOMMU_DMA */
+2 -1
drivers/iommu/intel/iommu.c
···
 	bool first_stage;
 
 	if (flags &
-	    (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING)))
+	    (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
+	       IOMMU_HWPT_ALLOC_PASID)))
 		return ERR_PTR(-EOPNOTSUPP);
 	if (nested_parent && !nested_supported(iommu))
 		return ERR_PTR(-EOPNOTSUPP);
+1 -1
drivers/iommu/intel/nested.c
···
 	struct dmar_domain *domain;
 	int ret;
 
-	if (!nested_supported(iommu) || flags)
+	if (!nested_supported(iommu) || flags & ~IOMMU_HWPT_ALLOC_PASID)
 		return ERR_PTR(-EOPNOTSUPP);
 
 	/* Must be nested domain */
+16
drivers/iommu/iommu-priv.h
···
 #define __LINUX_IOMMU_PRIV_H
 
 #include <linux/iommu.h>
+#include <linux/msi.h>
 
 static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
 {
···
 				struct iommu_group *group);
 int iommu_replace_group_handle(struct iommu_group *group,
 			       struct iommu_domain *new_domain,
+			       struct iommu_attach_handle *handle);
+
+#if IS_ENABLED(CONFIG_IOMMUFD_DRIVER_CORE) && IS_ENABLED(CONFIG_IRQ_MSI_IOMMU)
+int iommufd_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
+		   phys_addr_t msi_addr);
+#else /* !CONFIG_IOMMUFD_DRIVER_CORE || !CONFIG_IRQ_MSI_IOMMU */
+static inline int iommufd_sw_msi(struct iommu_domain *domain,
+				 struct msi_desc *desc, phys_addr_t msi_addr)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_IOMMUFD_DRIVER_CORE && CONFIG_IRQ_MSI_IOMMU */
+
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid,
 			       struct iommu_attach_handle *handle);
 #endif /* __LINUX_IOMMU_PRIV_H */
+1
drivers/iommu/iommu-sva.c
···
 	}
 
 	domain->type = IOMMU_DOMAIN_SVA;
+	domain->cookie_type = IOMMU_COOKIE_SVA;
 	mmgrab(mm);
 	domain->mm = mm;
 	domain->owner = ops;
+151 -9
drivers/iommu/iommu.c
···
 #include <linux/errno.h>
 #include <linux/host1x_context_bus.h>
 #include <linux/iommu.h>
+#include <linux/iommufd.h>
 #include <linux/idr.h>
 #include <linux/err.h>
 #include <linux/pci.h>
···
 	dev->iommu_group = NULL;
 	module_put(ops->owner);
 	dev_iommu_free(dev);
+}
+
+static struct iommu_domain *pasid_array_entry_to_domain(void *entry)
+{
+	if (xa_pointer_tag(entry) == IOMMU_PASID_ARRAY_DOMAIN)
+		return xa_untag_pointer(entry);
+	return ((struct iommu_attach_handle *)xa_untag_pointer(entry))->domain;
 }
 
 DEFINE_MUTEX(iommu_probe_device_lock);
···
 				 iommu_fault_handler_t handler,
 				 void *token)
 {
-	BUG_ON(!domain);
+	if (WARN_ON(!domain || domain->cookie_type != IOMMU_COOKIE_NONE))
+		return;
 
+	domain->cookie_type = IOMMU_COOKIE_FAULT_HANDLER;
 	domain->handler = handler;
 	domain->handler_token = token;
 }
···
 
 void iommu_domain_free(struct iommu_domain *domain)
 {
-	if (domain->type == IOMMU_DOMAIN_SVA)
+	switch (domain->cookie_type) {
+	case IOMMU_COOKIE_DMA_IOVA:
+		iommu_put_dma_cookie(domain);
+		break;
+	case IOMMU_COOKIE_DMA_MSI:
+		iommu_put_msi_cookie(domain);
+		break;
+	case IOMMU_COOKIE_SVA:
 		mmdrop(domain->mm);
-	iommu_put_dma_cookie(domain);
+		break;
+	default:
+		break;
+	}
 	if (domain->ops->free)
 		domain->ops->free(domain);
 }
···
 }
 
 static int __iommu_set_group_pasid(struct iommu_domain *domain,
-				   struct iommu_group *group, ioasid_t pasid)
+				   struct iommu_group *group, ioasid_t pasid,
+				   struct iommu_domain *old)
 {
 	struct group_device *device, *last_gdev;
 	int ret;
 
 	for_each_group_device(group, device) {
 		ret = domain->ops->set_dev_pasid(domain, device->dev,
-						 pasid, NULL);
+						 pasid, old);
 		if (ret)
 			goto err_revert;
 	}
···
 	for_each_group_device(group, device) {
 		if (device == last_gdev)
 			break;
-		iommu_remove_dev_pasid(device->dev, pasid, domain);
+		/*
+		 * If no old domain, undo the succeeded devices/pasid.
+		 * Otherwise, rollback the succeeded devices/pasid to the old
+		 * domain. And it is a driver bug to fail attaching with a
+		 * previously good domain.
+		 */
+		if (!old || WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+							    pasid, domain)))
+			iommu_remove_dev_pasid(device->dev, pasid, domain);
 	}
 	return ret;
 }
···
  * @dev: the attached device.
  * @pasid: the pasid of the device.
  * @handle: the attach handle.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
  *
  * Return: 0 on success, or an error.
  */
···
 	if (ret)
 		goto out_unlock;
 
-	ret = __iommu_set_group_pasid(domain, group, pasid);
+	ret = __iommu_set_group_pasid(domain, group, pasid, NULL);
 	if (ret) {
 		xa_release(&group->pasid_array, pasid);
 		goto out_unlock;
···
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
+
+/**
+ * iommu_replace_device_pasid - Replace the domain that a specific pasid
+ *                              of the device is attached to
+ * @domain: the new iommu domain
+ * @dev: the attached device.
+ * @pasid: the pasid of the device.
+ * @handle: the attach handle.
+ *
+ * This API allows the pasid to switch domains. The @pasid should have been
+ * attached. Otherwise, this fails. The pasid will keep the old configuration
+ * if replacement failed.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
+ *
+ * Return 0 on success, or an error.
+ */
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid,
+			       struct iommu_attach_handle *handle)
+{
+	/* Caller must be a probed driver on dev */
+	struct iommu_group *group = dev->iommu_group;
+	struct iommu_attach_handle *entry;
+	struct iommu_domain *curr_domain;
+	void *curr;
+	int ret;
+
+	if (!group)
+		return -ENODEV;
+
+	if (!domain->ops->set_dev_pasid)
+		return -EOPNOTSUPP;
+
+	if (dev_iommu_ops(dev) != domain->owner ||
+	    pasid == IOMMU_NO_PASID || !handle)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	entry = iommu_make_pasid_array_entry(domain, handle);
+	curr = xa_cmpxchg(&group->pasid_array, pasid, NULL,
+			  XA_ZERO_ENTRY, GFP_KERNEL);
+	if (xa_is_err(curr)) {
+		ret = xa_err(curr);
+		goto out_unlock;
+	}
+
+	/*
+	 * No domain (with or without handle) attached, hence not
+	 * a replace case.
+	 */
+	if (!curr) {
+		xa_release(&group->pasid_array, pasid);
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	/*
+	 * Reusing handle is problematic as there are paths that refers
+	 * the handle without lock. To avoid race, reject the callers that
+	 * attempt it.
3505 + */ 3506 + if (curr == entry) { 3507 + WARN_ON(1); 3508 + ret = -EINVAL; 3509 + goto out_unlock; 3510 + } 3511 + 3512 + curr_domain = pasid_array_entry_to_domain(curr); 3513 + ret = 0; 3514 + 3515 + if (curr_domain != domain) { 3516 + ret = __iommu_set_group_pasid(domain, group, 3517 + pasid, curr_domain); 3518 + if (ret) 3519 + goto out_unlock; 3520 + } 3521 + 3522 + /* 3523 + * The above xa_cmpxchg() reserved the memory, and the 3524 + * group->mutex is held, this cannot fail. 3525 + */ 3526 + WARN_ON(xa_is_err(xa_store(&group->pasid_array, 3527 + pasid, entry, GFP_KERNEL))); 3528 + 3529 + out_unlock: 3530 + mutex_unlock(&group->mutex); 3531 + return ret; 3532 + } 3533 + EXPORT_SYMBOL_NS_GPL(iommu_replace_device_pasid, "IOMMUFD_INTERNAL"); 3475 3534 3476 3535 /* 3477 3536 * iommu_detach_device_pasid() - Detach the domain from pasid of device ··· 3659 3536 * This is a variant of iommu_attach_group(). It allows the caller to provide 3660 3537 * an attach handle and use it when the domain is attached. This is currently 3661 3538 * used by IOMMUFD to deliver the I/O page faults. 3539 + * 3540 + * Caller should always provide a new handle to avoid race with the paths 3541 + * that have lockless reference to handle. 3662 3542 */ 3663 3543 int iommu_attach_group_handle(struct iommu_domain *domain, 3664 3544 struct iommu_group *group, ··· 3731 3605 * 3732 3606 * If the currently attached domain is a core domain (e.g. a default_domain), 3733 3607 * it will act just like the iommu_attach_group_handle(). 3608 + * 3609 + * Caller should always provide a new handle to avoid race with the paths 3610 + * that have lockless reference to handle. 
3734 3611 */ 3735 3612 int iommu_replace_group_handle(struct iommu_group *group, 3736 3613 struct iommu_domain *new_domain, ··· 3791 3662 return 0; 3792 3663 3793 3664 mutex_lock(&group->mutex); 3794 - if (group->domain && group->domain->sw_msi) 3795 - ret = group->domain->sw_msi(group->domain, desc, msi_addr); 3665 + /* An IDENTITY domain must pass through */ 3666 + if (group->domain && group->domain->type != IOMMU_DOMAIN_IDENTITY) { 3667 + switch (group->domain->cookie_type) { 3668 + case IOMMU_COOKIE_DMA_MSI: 3669 + case IOMMU_COOKIE_DMA_IOVA: 3670 + ret = iommu_dma_sw_msi(group->domain, desc, msi_addr); 3671 + break; 3672 + case IOMMU_COOKIE_IOMMUFD: 3673 + ret = iommufd_sw_msi(group->domain, desc, msi_addr); 3674 + break; 3675 + default: 3676 + ret = -EOPNOTSUPP; 3677 + break; 3678 + } 3679 + } 3796 3680 mutex_unlock(&group->mutex); 3797 3681 return ret; 3798 3682 }
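The new `pasid_array_entry_to_domain()` helper in the iommu.c hunk above depends on the xarray tagged-pointer convention: one array slot can hold either a bare `iommu_domain` pointer or an `iommu_attach_handle`, with the distinction packed into the pointer's low bits via `xa_tag_pointer()`/`xa_untag_pointer()`. A minimal userspace C sketch of the same trick; the tag names and struct layouts below are illustrative, not the kernel's `IOMMU_PASID_ARRAY_*` constants:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the tagged-pointer convention used by the pasid_array:
 * heap/stack objects are at least 4-byte aligned, so the two low bits
 * of a stored pointer are free to say what kind of entry it is.
 * TAG_DOMAIN/TAG_HANDLE are made-up names for this sketch.
 */
enum entry_tag {
	TAG_DOMAIN = 1,
	TAG_HANDLE = 2,
};

struct domain { int id; };
struct handle { struct domain *domain; };

static void *tag_pointer(void *p, unsigned int tag)
{
	return (void *)((uintptr_t)p | tag);
}

static unsigned int pointer_tag(const void *entry)
{
	return (uintptr_t)entry & 3;
}

static void *untag_pointer(const void *entry)
{
	return (void *)((uintptr_t)entry & ~(uintptr_t)3);
}

/* Mirrors pasid_array_entry_to_domain(): a domain-tagged entry is
 * returned directly, anything else is treated as an attach handle and
 * dereferenced to reach its domain. */
static struct domain *entry_to_domain(void *entry)
{
	if (pointer_tag(entry) == TAG_DOMAIN)
		return untag_pointer(entry);
	return ((struct handle *)untag_pointer(entry))->domain;
}
```

The point is only that a single `group->pasid_array` slot can carry both the object and its type, so lockless readers can decode an entry without a separate type field.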
+1 -1
drivers/iommu/iommufd/Kconfig
··· 1 1 # SPDX-License-Identifier: GPL-2.0-only 2 2 config IOMMUFD_DRIVER_CORE 3 - tristate 3 + bool 4 4 default (IOMMUFD_DRIVER || IOMMUFD) if IOMMUFD!=n 5 5 6 6 config IOMMUFD
+1 -1
drivers/iommu/iommufd/Makefile
··· 1 1 # SPDX-License-Identifier: GPL-2.0-only 2 2 iommufd-y := \ 3 3 device.o \ 4 - fault.o \ 4 + eventq.o \ 5 5 hw_pagetable.o \ 6 6 io_pagetable.o \ 7 7 ioas.o \
+294 -205
drivers/iommu/iommufd/device.c
··· 3 3 */ 4 4 #include <linux/iommu.h> 5 5 #include <linux/iommufd.h> 6 + #include <linux/pci-ats.h> 6 7 #include <linux/slab.h> 7 8 #include <uapi/linux/iommufd.h> 8 - #include <linux/msi.h> 9 9 10 10 #include "../iommu-priv.h" 11 11 #include "io_pagetable.h" ··· 18 18 "Allow IOMMUFD to bind to devices even if the platform cannot isolate " 19 19 "the MSI interrupt window. Enabling this is a security weakness."); 20 20 21 + struct iommufd_attach { 22 + struct iommufd_hw_pagetable *hwpt; 23 + struct xarray device_array; 24 + }; 25 + 21 26 static void iommufd_group_release(struct kref *kref) 22 27 { 23 28 struct iommufd_group *igroup = 24 29 container_of(kref, struct iommufd_group, ref); 25 30 26 - WARN_ON(igroup->hwpt || !list_empty(&igroup->device_list)); 31 + WARN_ON(!xa_empty(&igroup->pasid_attach)); 27 32 28 33 xa_cmpxchg(&igroup->ictx->groups, iommu_group_id(igroup->group), igroup, 29 34 NULL, GFP_KERNEL); ··· 95 90 96 91 kref_init(&new_igroup->ref); 97 92 mutex_init(&new_igroup->lock); 98 - INIT_LIST_HEAD(&new_igroup->device_list); 93 + xa_init(&new_igroup->pasid_attach); 99 94 new_igroup->sw_msi_start = PHYS_ADDR_MAX; 100 95 /* group reference moves into new_igroup */ 101 96 new_igroup->group = group; ··· 299 294 } 300 295 EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, "IOMMUFD"); 301 296 302 - /* 303 - * Get a iommufd_sw_msi_map for the msi physical address requested by the irq 304 - * layer. The mapping to IOVA is global to the iommufd file descriptor, every 305 - * domain that is attached to a device using the same MSI parameters will use 306 - * the same IOVA. 
307 - */ 308 - static __maybe_unused struct iommufd_sw_msi_map * 309 - iommufd_sw_msi_get_map(struct iommufd_ctx *ictx, phys_addr_t msi_addr, 310 - phys_addr_t sw_msi_start) 297 + static unsigned int iommufd_group_device_num(struct iommufd_group *igroup, 298 + ioasid_t pasid) 311 299 { 312 - struct iommufd_sw_msi_map *cur; 313 - unsigned int max_pgoff = 0; 300 + struct iommufd_attach *attach; 301 + struct iommufd_device *idev; 302 + unsigned int count = 0; 303 + unsigned long index; 314 304 315 - lockdep_assert_held(&ictx->sw_msi_lock); 305 + lockdep_assert_held(&igroup->lock); 316 306 317 - list_for_each_entry(cur, &ictx->sw_msi_list, sw_msi_item) { 318 - if (cur->sw_msi_start != sw_msi_start) 319 - continue; 320 - max_pgoff = max(max_pgoff, cur->pgoff + 1); 321 - if (cur->msi_addr == msi_addr) 322 - return cur; 323 - } 324 - 325 - if (ictx->sw_msi_id >= 326 - BITS_PER_BYTE * sizeof_field(struct iommufd_sw_msi_maps, bitmap)) 327 - return ERR_PTR(-EOVERFLOW); 328 - 329 - cur = kzalloc(sizeof(*cur), GFP_KERNEL); 330 - if (!cur) 331 - return ERR_PTR(-ENOMEM); 332 - 333 - cur->sw_msi_start = sw_msi_start; 334 - cur->msi_addr = msi_addr; 335 - cur->pgoff = max_pgoff; 336 - cur->id = ictx->sw_msi_id++; 337 - list_add_tail(&cur->sw_msi_item, &ictx->sw_msi_list); 338 - return cur; 307 + attach = xa_load(&igroup->pasid_attach, pasid); 308 + if (attach) 309 + xa_for_each(&attach->device_array, index, idev) 310 + count++; 311 + return count; 339 312 } 340 313 341 - static int iommufd_sw_msi_install(struct iommufd_ctx *ictx, 342 - struct iommufd_hwpt_paging *hwpt_paging, 343 - struct iommufd_sw_msi_map *msi_map) 344 - { 345 - unsigned long iova; 346 - 347 - lockdep_assert_held(&ictx->sw_msi_lock); 348 - 349 - iova = msi_map->sw_msi_start + msi_map->pgoff * PAGE_SIZE; 350 - if (!test_bit(msi_map->id, hwpt_paging->present_sw_msi.bitmap)) { 351 - int rc; 352 - 353 - rc = iommu_map(hwpt_paging->common.domain, iova, 354 - msi_map->msi_addr, PAGE_SIZE, 355 - IOMMU_WRITE | 
IOMMU_READ | IOMMU_MMIO, 356 - GFP_KERNEL_ACCOUNT); 357 - if (rc) 358 - return rc; 359 - __set_bit(msi_map->id, hwpt_paging->present_sw_msi.bitmap); 360 - } 361 - return 0; 362 - } 363 - 364 - /* 365 - * Called by the irq code if the platform translates the MSI address through the 366 - * IOMMU. msi_addr is the physical address of the MSI page. iommufd will 367 - * allocate a fd global iova for the physical page that is the same on all 368 - * domains and devices. 369 - */ 370 314 #ifdef CONFIG_IRQ_MSI_IOMMU 371 - int iommufd_sw_msi(struct iommu_domain *domain, struct msi_desc *desc, 372 - phys_addr_t msi_addr) 373 - { 374 - struct device *dev = msi_desc_to_dev(desc); 375 - struct iommufd_hwpt_paging *hwpt_paging; 376 - struct iommu_attach_handle *raw_handle; 377 - struct iommufd_attach_handle *handle; 378 - struct iommufd_sw_msi_map *msi_map; 379 - struct iommufd_ctx *ictx; 380 - unsigned long iova; 381 - int rc; 382 - 383 - /* 384 - * It is safe to call iommu_attach_handle_get() here because the iommu 385 - * core code invokes this under the group mutex which also prevents any 386 - * change of the attach handle for the duration of this function. 387 - */ 388 - iommu_group_mutex_assert(dev); 389 - 390 - raw_handle = 391 - iommu_attach_handle_get(dev->iommu_group, IOMMU_NO_PASID, 0); 392 - if (IS_ERR(raw_handle)) 393 - return 0; 394 - hwpt_paging = find_hwpt_paging(domain->iommufd_hwpt); 395 - 396 - handle = to_iommufd_handle(raw_handle); 397 - /* No IOMMU_RESV_SW_MSI means no change to the msi_msg */ 398 - if (handle->idev->igroup->sw_msi_start == PHYS_ADDR_MAX) 399 - return 0; 400 - 401 - ictx = handle->idev->ictx; 402 - guard(mutex)(&ictx->sw_msi_lock); 403 - /* 404 - * The input msi_addr is the exact byte offset of the MSI doorbell, we 405 - * assume the caller has checked that it is contained with a MMIO region 406 - * that is secure to map at PAGE_SIZE. 
407 - */ 408 - msi_map = iommufd_sw_msi_get_map(handle->idev->ictx, 409 - msi_addr & PAGE_MASK, 410 - handle->idev->igroup->sw_msi_start); 411 - if (IS_ERR(msi_map)) 412 - return PTR_ERR(msi_map); 413 - 414 - rc = iommufd_sw_msi_install(ictx, hwpt_paging, msi_map); 415 - if (rc) 416 - return rc; 417 - __set_bit(msi_map->id, handle->idev->igroup->required_sw_msi.bitmap); 418 - 419 - iova = msi_map->sw_msi_start + msi_map->pgoff * PAGE_SIZE; 420 - msi_desc_set_iommu_msi_iova(desc, iova, PAGE_SHIFT); 421 - return 0; 422 - } 423 - #endif 424 - 425 315 static int iommufd_group_setup_msi(struct iommufd_group *igroup, 426 316 struct iommufd_hwpt_paging *hwpt_paging) 427 317 { ··· 343 443 } 344 444 return 0; 345 445 } 446 + #else 447 + static inline int 448 + iommufd_group_setup_msi(struct iommufd_group *igroup, 449 + struct iommufd_hwpt_paging *hwpt_paging) 450 + { 451 + return 0; 452 + } 453 + #endif 454 + 455 + static bool 456 + iommufd_group_first_attach(struct iommufd_group *igroup, ioasid_t pasid) 457 + { 458 + lockdep_assert_held(&igroup->lock); 459 + return !xa_load(&igroup->pasid_attach, pasid); 460 + } 346 461 347 462 static int 348 463 iommufd_device_attach_reserved_iova(struct iommufd_device *idev, 349 464 struct iommufd_hwpt_paging *hwpt_paging) 350 465 { 466 + struct iommufd_group *igroup = idev->igroup; 351 467 int rc; 352 468 353 - lockdep_assert_held(&idev->igroup->lock); 469 + lockdep_assert_held(&igroup->lock); 354 470 355 471 rc = iopt_table_enforce_dev_resv_regions(&hwpt_paging->ioas->iopt, 356 472 idev->dev, 357 - &idev->igroup->sw_msi_start); 473 + &igroup->sw_msi_start); 358 474 if (rc) 359 475 return rc; 360 476 361 - if (list_empty(&idev->igroup->device_list)) { 362 - rc = iommufd_group_setup_msi(idev->igroup, hwpt_paging); 477 + if (iommufd_group_first_attach(igroup, IOMMU_NO_PASID)) { 478 + rc = iommufd_group_setup_msi(igroup, hwpt_paging); 363 479 if (rc) { 364 480 iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, 365 481 idev->dev); ··· 387 
471 388 472 /* The device attach/detach/replace helpers for attach_handle */ 389 473 474 + static bool iommufd_device_is_attached(struct iommufd_device *idev, 475 + ioasid_t pasid) 476 + { 477 + struct iommufd_attach *attach; 478 + 479 + attach = xa_load(&idev->igroup->pasid_attach, pasid); 480 + return xa_load(&attach->device_array, idev->obj.id); 481 + } 482 + 483 + static int iommufd_hwpt_pasid_compat(struct iommufd_hw_pagetable *hwpt, 484 + struct iommufd_device *idev, 485 + ioasid_t pasid) 486 + { 487 + struct iommufd_group *igroup = idev->igroup; 488 + 489 + lockdep_assert_held(&igroup->lock); 490 + 491 + if (pasid == IOMMU_NO_PASID) { 492 + unsigned long start = IOMMU_NO_PASID; 493 + 494 + if (!hwpt->pasid_compat && 495 + xa_find_after(&igroup->pasid_attach, 496 + &start, UINT_MAX, XA_PRESENT)) 497 + return -EINVAL; 498 + } else { 499 + struct iommufd_attach *attach; 500 + 501 + if (!hwpt->pasid_compat) 502 + return -EINVAL; 503 + 504 + attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID); 505 + if (attach && attach->hwpt && !attach->hwpt->pasid_compat) 506 + return -EINVAL; 507 + } 508 + 509 + return 0; 510 + } 511 + 390 512 static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt, 391 - struct iommufd_device *idev) 513 + struct iommufd_device *idev, 514 + ioasid_t pasid) 392 515 { 393 516 struct iommufd_attach_handle *handle; 394 517 int rc; 395 518 396 - lockdep_assert_held(&idev->igroup->lock); 519 + rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid); 520 + if (rc) 521 + return rc; 397 522 398 523 handle = kzalloc(sizeof(*handle), GFP_KERNEL); 399 524 if (!handle) ··· 447 490 } 448 491 449 492 handle->idev = idev; 450 - rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group, 451 - &handle->handle); 493 + if (pasid == IOMMU_NO_PASID) 494 + rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group, 495 + &handle->handle); 496 + else 497 + rc = iommu_attach_device_pasid(hwpt->domain, idev->dev, pasid, 498 + 
&handle->handle); 452 499 if (rc) 453 500 goto out_disable_iopf; 454 501 ··· 467 506 } 468 507 469 508 static struct iommufd_attach_handle * 470 - iommufd_device_get_attach_handle(struct iommufd_device *idev) 509 + iommufd_device_get_attach_handle(struct iommufd_device *idev, ioasid_t pasid) 471 510 { 472 511 struct iommu_attach_handle *handle; 473 512 474 513 lockdep_assert_held(&idev->igroup->lock); 475 514 476 515 handle = 477 - iommu_attach_handle_get(idev->igroup->group, IOMMU_NO_PASID, 0); 516 + iommu_attach_handle_get(idev->igroup->group, pasid, 0); 478 517 if (IS_ERR(handle)) 479 518 return NULL; 480 519 return to_iommufd_handle(handle); 481 520 } 482 521 483 522 static void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt, 484 - struct iommufd_device *idev) 523 + struct iommufd_device *idev, 524 + ioasid_t pasid) 485 525 { 486 526 struct iommufd_attach_handle *handle; 487 527 488 - handle = iommufd_device_get_attach_handle(idev); 489 - iommu_detach_group_handle(hwpt->domain, idev->igroup->group); 528 + handle = iommufd_device_get_attach_handle(idev, pasid); 529 + if (pasid == IOMMU_NO_PASID) 530 + iommu_detach_group_handle(hwpt->domain, idev->igroup->group); 531 + else 532 + iommu_detach_device_pasid(hwpt->domain, idev->dev, pasid); 533 + 490 534 if (hwpt->fault) { 491 535 iommufd_auto_response_faults(hwpt, handle); 492 536 iommufd_fault_iopf_disable(idev); ··· 500 534 } 501 535 502 536 static int iommufd_hwpt_replace_device(struct iommufd_device *idev, 537 + ioasid_t pasid, 503 538 struct iommufd_hw_pagetable *hwpt, 504 539 struct iommufd_hw_pagetable *old) 505 540 { 506 - struct iommufd_attach_handle *handle, *old_handle = 507 - iommufd_device_get_attach_handle(idev); 541 + struct iommufd_attach_handle *handle, *old_handle; 508 542 int rc; 543 + 544 + rc = iommufd_hwpt_pasid_compat(hwpt, idev, pasid); 545 + if (rc) 546 + return rc; 547 + 548 + old_handle = iommufd_device_get_attach_handle(idev, pasid); 509 549 510 550 handle = 
kzalloc(sizeof(*handle), GFP_KERNEL); 511 551 if (!handle) ··· 524 552 } 525 553 526 554 handle->idev = idev; 527 - rc = iommu_replace_group_handle(idev->igroup->group, hwpt->domain, 528 - &handle->handle); 555 + if (pasid == IOMMU_NO_PASID) 556 + rc = iommu_replace_group_handle(idev->igroup->group, 557 + hwpt->domain, &handle->handle); 558 + else 559 + rc = iommu_replace_device_pasid(hwpt->domain, idev->dev, 560 + pasid, &handle->handle); 529 561 if (rc) 530 562 goto out_disable_iopf; 531 563 ··· 551 575 } 552 576 553 577 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, 554 - struct iommufd_device *idev) 578 + struct iommufd_device *idev, ioasid_t pasid) 555 579 { 556 580 struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt); 581 + bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID; 582 + struct iommufd_group *igroup = idev->igroup; 583 + struct iommufd_hw_pagetable *old_hwpt; 584 + struct iommufd_attach *attach; 557 585 int rc; 558 586 559 - mutex_lock(&idev->igroup->lock); 587 + mutex_lock(&igroup->lock); 560 588 561 - if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt) { 562 - rc = -EINVAL; 589 + attach = xa_cmpxchg(&igroup->pasid_attach, pasid, NULL, 590 + XA_ZERO_ENTRY, GFP_KERNEL); 591 + if (xa_is_err(attach)) { 592 + rc = xa_err(attach); 563 593 goto err_unlock; 564 594 } 565 595 566 - if (hwpt_paging) { 596 + if (!attach) { 597 + attach = kzalloc(sizeof(*attach), GFP_KERNEL); 598 + if (!attach) { 599 + rc = -ENOMEM; 600 + goto err_release_pasid; 601 + } 602 + xa_init(&attach->device_array); 603 + } 604 + 605 + old_hwpt = attach->hwpt; 606 + 607 + rc = xa_insert(&attach->device_array, idev->obj.id, XA_ZERO_ENTRY, 608 + GFP_KERNEL); 609 + if (rc) { 610 + WARN_ON(rc == -EBUSY && !old_hwpt); 611 + goto err_free_attach; 612 + } 613 + 614 + if (old_hwpt && old_hwpt != hwpt) { 615 + rc = -EINVAL; 616 + goto err_release_devid; 617 + } 618 + 619 + if (attach_resv) { 567 620 rc = iommufd_device_attach_reserved_iova(idev, 
hwpt_paging); 568 621 if (rc) 569 - goto err_unlock; 622 + goto err_release_devid; 570 623 } 571 624 572 625 /* ··· 605 600 * reserved regions are only updated during individual device 606 601 * attachment. 607 602 */ 608 - if (list_empty(&idev->igroup->device_list)) { 609 - rc = iommufd_hwpt_attach_device(hwpt, idev); 603 + if (iommufd_group_first_attach(igroup, pasid)) { 604 + rc = iommufd_hwpt_attach_device(hwpt, idev, pasid); 610 605 if (rc) 611 606 goto err_unresv; 612 - idev->igroup->hwpt = hwpt; 607 + attach->hwpt = hwpt; 608 + WARN_ON(xa_is_err(xa_store(&igroup->pasid_attach, pasid, attach, 609 + GFP_KERNEL))); 613 610 } 614 611 refcount_inc(&hwpt->obj.users); 615 - list_add_tail(&idev->group_item, &idev->igroup->device_list); 616 - mutex_unlock(&idev->igroup->lock); 612 + WARN_ON(xa_is_err(xa_store(&attach->device_array, idev->obj.id, 613 + idev, GFP_KERNEL))); 614 + mutex_unlock(&igroup->lock); 617 615 return 0; 618 616 err_unresv: 619 - if (hwpt_paging) 617 + if (attach_resv) 620 618 iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev); 619 + err_release_devid: 620 + xa_release(&attach->device_array, idev->obj.id); 621 + err_free_attach: 622 + if (iommufd_group_first_attach(igroup, pasid)) 623 + kfree(attach); 624 + err_release_pasid: 625 + if (iommufd_group_first_attach(igroup, pasid)) 626 + xa_release(&igroup->pasid_attach, pasid); 621 627 err_unlock: 622 - mutex_unlock(&idev->igroup->lock); 628 + mutex_unlock(&igroup->lock); 623 629 return rc; 624 630 } 625 631 626 632 struct iommufd_hw_pagetable * 627 - iommufd_hw_pagetable_detach(struct iommufd_device *idev) 633 + iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid) 628 634 { 629 - struct iommufd_hw_pagetable *hwpt = idev->igroup->hwpt; 630 - struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt); 635 + struct iommufd_group *igroup = idev->igroup; 636 + struct iommufd_hwpt_paging *hwpt_paging; 637 + struct iommufd_hw_pagetable *hwpt; 638 + struct 
iommufd_attach *attach; 631 639 632 - mutex_lock(&idev->igroup->lock); 633 - list_del(&idev->group_item); 634 - if (list_empty(&idev->igroup->device_list)) { 635 - iommufd_hwpt_detach_device(hwpt, idev); 636 - idev->igroup->hwpt = NULL; 640 + mutex_lock(&igroup->lock); 641 + attach = xa_load(&igroup->pasid_attach, pasid); 642 + if (!attach) { 643 + mutex_unlock(&igroup->lock); 644 + return NULL; 637 645 } 638 - if (hwpt_paging) 646 + 647 + hwpt = attach->hwpt; 648 + hwpt_paging = find_hwpt_paging(hwpt); 649 + 650 + xa_erase(&attach->device_array, idev->obj.id); 651 + if (xa_empty(&attach->device_array)) { 652 + iommufd_hwpt_detach_device(hwpt, idev, pasid); 653 + xa_erase(&igroup->pasid_attach, pasid); 654 + kfree(attach); 655 + } 656 + if (hwpt_paging && pasid == IOMMU_NO_PASID) 639 657 iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev); 640 - mutex_unlock(&idev->igroup->lock); 658 + mutex_unlock(&igroup->lock); 641 659 642 660 /* Caller must destroy hwpt */ 643 661 return hwpt; 644 662 } 645 663 646 664 static struct iommufd_hw_pagetable * 647 - iommufd_device_do_attach(struct iommufd_device *idev, 665 + iommufd_device_do_attach(struct iommufd_device *idev, ioasid_t pasid, 648 666 struct iommufd_hw_pagetable *hwpt) 649 667 { 650 668 int rc; 651 669 652 - rc = iommufd_hw_pagetable_attach(hwpt, idev); 670 + rc = iommufd_hw_pagetable_attach(hwpt, idev, pasid); 653 671 if (rc) 654 672 return ERR_PTR(rc); 655 673 return NULL; ··· 682 654 iommufd_group_remove_reserved_iova(struct iommufd_group *igroup, 683 655 struct iommufd_hwpt_paging *hwpt_paging) 684 656 { 657 + struct iommufd_attach *attach; 685 658 struct iommufd_device *cur; 659 + unsigned long index; 686 660 687 661 lockdep_assert_held(&igroup->lock); 688 662 689 - list_for_each_entry(cur, &igroup->device_list, group_item) 663 + attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID); 664 + xa_for_each(&attach->device_array, index, cur) 690 665 iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, 
cur->dev); 691 666 } 692 667 ··· 698 667 struct iommufd_hwpt_paging *hwpt_paging) 699 668 { 700 669 struct iommufd_hwpt_paging *old_hwpt_paging; 670 + struct iommufd_attach *attach; 701 671 struct iommufd_device *cur; 672 + unsigned long index; 702 673 int rc; 703 674 704 675 lockdep_assert_held(&igroup->lock); 705 676 706 - old_hwpt_paging = find_hwpt_paging(igroup->hwpt); 677 + attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID); 678 + old_hwpt_paging = find_hwpt_paging(attach->hwpt); 707 679 if (!old_hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas) { 708 - list_for_each_entry(cur, &igroup->device_list, group_item) { 680 + xa_for_each(&attach->device_array, index, cur) { 709 681 rc = iopt_table_enforce_dev_resv_regions( 710 682 &hwpt_paging->ioas->iopt, cur->dev, NULL); 711 683 if (rc) ··· 727 693 } 728 694 729 695 static struct iommufd_hw_pagetable * 730 - iommufd_device_do_replace(struct iommufd_device *idev, 696 + iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid, 731 697 struct iommufd_hw_pagetable *hwpt) 732 698 { 733 699 struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt); 700 + bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID; 734 701 struct iommufd_hwpt_paging *old_hwpt_paging; 735 702 struct iommufd_group *igroup = idev->igroup; 736 703 struct iommufd_hw_pagetable *old_hwpt; 704 + struct iommufd_attach *attach; 737 705 unsigned int num_devices; 738 706 int rc; 739 707 740 - mutex_lock(&idev->igroup->lock); 708 + mutex_lock(&igroup->lock); 741 709 742 - if (igroup->hwpt == NULL) { 710 + attach = xa_load(&igroup->pasid_attach, pasid); 711 + if (!attach) { 743 712 rc = -EINVAL; 744 713 goto err_unlock; 745 714 } 746 715 747 - if (hwpt == igroup->hwpt) { 748 - mutex_unlock(&idev->igroup->lock); 716 + old_hwpt = attach->hwpt; 717 + 718 + WARN_ON(!old_hwpt || xa_empty(&attach->device_array)); 719 + 720 + if (!iommufd_device_is_attached(idev, pasid)) { 721 + rc = -EINVAL; 722 + goto err_unlock; 723 + } 724 
+ 725 + if (hwpt == old_hwpt) { 726 + mutex_unlock(&igroup->lock); 749 727 return NULL; 750 728 } 751 729 752 - old_hwpt = igroup->hwpt; 753 - if (hwpt_paging) { 730 + if (attach_resv) { 754 731 rc = iommufd_group_do_replace_reserved_iova(igroup, hwpt_paging); 755 732 if (rc) 756 733 goto err_unlock; 757 734 } 758 735 759 - rc = iommufd_hwpt_replace_device(idev, hwpt, old_hwpt); 736 + rc = iommufd_hwpt_replace_device(idev, pasid, hwpt, old_hwpt); 760 737 if (rc) 761 738 goto err_unresv; 762 739 763 740 old_hwpt_paging = find_hwpt_paging(old_hwpt); 764 - if (old_hwpt_paging && 741 + if (old_hwpt_paging && pasid == IOMMU_NO_PASID && 765 742 (!hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas)) 766 743 iommufd_group_remove_reserved_iova(igroup, old_hwpt_paging); 767 744 768 - igroup->hwpt = hwpt; 745 + attach->hwpt = hwpt; 769 746 770 - num_devices = list_count_nodes(&igroup->device_list); 747 + num_devices = iommufd_group_device_num(igroup, pasid); 771 748 /* 772 - * Move the refcounts held by the device_list to the new hwpt. Retain a 749 + * Move the refcounts held by the device_array to the new hwpt. Retain a 773 750 * refcount for this thread as the caller will free it. 
774 751 */ 775 752 refcount_add(num_devices, &hwpt->obj.users); 776 753 if (num_devices > 1) 777 754 WARN_ON(refcount_sub_and_test(num_devices - 1, 778 755 &old_hwpt->obj.users)); 779 - mutex_unlock(&idev->igroup->lock); 756 + mutex_unlock(&igroup->lock); 780 757 781 758 /* Caller must destroy old_hwpt */ 782 759 return old_hwpt; 783 760 err_unresv: 784 - if (hwpt_paging) 761 + if (attach_resv) 785 762 iommufd_group_remove_reserved_iova(igroup, hwpt_paging); 786 763 err_unlock: 787 - mutex_unlock(&idev->igroup->lock); 764 + mutex_unlock(&igroup->lock); 788 765 return ERR_PTR(rc); 789 766 } 790 767 791 768 typedef struct iommufd_hw_pagetable *(*attach_fn)( 792 - struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt); 769 + struct iommufd_device *idev, ioasid_t pasid, 770 + struct iommufd_hw_pagetable *hwpt); 793 771 794 772 /* 795 773 * When automatically managing the domains we search for a compatible domain in ··· 809 763 * Automatic domain selection will never pick a manually created domain. 
810 764 */ 811 765 static struct iommufd_hw_pagetable * 812 - iommufd_device_auto_get_domain(struct iommufd_device *idev, 766 + iommufd_device_auto_get_domain(struct iommufd_device *idev, ioasid_t pasid, 813 767 struct iommufd_ioas *ioas, u32 *pt_id, 814 768 attach_fn do_attach) 815 769 { ··· 838 792 hwpt = &hwpt_paging->common; 839 793 if (!iommufd_lock_obj(&hwpt->obj)) 840 794 continue; 841 - destroy_hwpt = (*do_attach)(idev, hwpt); 795 + destroy_hwpt = (*do_attach)(idev, pasid, hwpt); 842 796 if (IS_ERR(destroy_hwpt)) { 843 797 iommufd_put_object(idev->ictx, &hwpt->obj); 844 798 /* ··· 856 810 goto out_unlock; 857 811 } 858 812 859 - hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, 0, 860 - immediate_attach, NULL); 813 + hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, pasid, 814 + 0, immediate_attach, NULL); 861 815 if (IS_ERR(hwpt_paging)) { 862 816 destroy_hwpt = ERR_CAST(hwpt_paging); 863 817 goto out_unlock; ··· 865 819 hwpt = &hwpt_paging->common; 866 820 867 821 if (!immediate_attach) { 868 - destroy_hwpt = (*do_attach)(idev, hwpt); 822 + destroy_hwpt = (*do_attach)(idev, pasid, hwpt); 869 823 if (IS_ERR(destroy_hwpt)) 870 824 goto out_abort; 871 825 } else { ··· 886 840 return destroy_hwpt; 887 841 } 888 842 889 - static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, 890 - attach_fn do_attach) 843 + static int iommufd_device_change_pt(struct iommufd_device *idev, 844 + ioasid_t pasid, 845 + u32 *pt_id, attach_fn do_attach) 891 846 { 892 847 struct iommufd_hw_pagetable *destroy_hwpt; 893 848 struct iommufd_object *pt_obj; ··· 903 856 struct iommufd_hw_pagetable *hwpt = 904 857 container_of(pt_obj, struct iommufd_hw_pagetable, obj); 905 858 906 - destroy_hwpt = (*do_attach)(idev, hwpt); 859 + destroy_hwpt = (*do_attach)(idev, pasid, hwpt); 907 860 if (IS_ERR(destroy_hwpt)) 908 861 goto out_put_pt_obj; 909 862 break; ··· 912 865 struct iommufd_ioas *ioas = 913 866 container_of(pt_obj, struct iommufd_ioas, 
obj); 914 867 915 - destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id, 916 - do_attach); 868 + destroy_hwpt = iommufd_device_auto_get_domain(idev, pasid, ioas, 869 + pt_id, do_attach); 917 870 if (IS_ERR(destroy_hwpt)) 918 871 goto out_put_pt_obj; 919 872 break; ··· 935 888 } 936 889 937 890 /** 938 - * iommufd_device_attach - Connect a device to an iommu_domain 891 + * iommufd_device_attach - Connect a device/pasid to an iommu_domain 939 892 * @idev: device to attach 893 + * @pasid: pasid to attach 940 894 * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING 941 895 * Output the IOMMUFD_OBJ_HWPT_PAGING ID 942 896 * 943 - * This connects the device to an iommu_domain, either automatically or manually 944 - * selected. Once this completes the device could do DMA. 897 + * This connects the device/pasid to an iommu_domain, either automatically 898 + * or manually selected. Once this completes the device could do DMA with 899 + * @pasid. @pasid is IOMMU_NO_PASID if this attach is for no pasid usage. 945 900 * 946 901 * The caller should return the resulting pt_id back to userspace. 947 902 * This function is undone by calling iommufd_device_detach(). 
948 903 */ 949 - int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id) 904 + int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid, 905 + u32 *pt_id) 950 906 { 951 907 int rc; 952 908 953 - rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach); 909 + rc = iommufd_device_change_pt(idev, pasid, pt_id, 910 + &iommufd_device_do_attach); 954 911 if (rc) 955 912 return rc; ··· 968 917 EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, "IOMMUFD"); 969 918 970 919 /** 971 - * iommufd_device_replace - Change the device's iommu_domain 920 + * iommufd_device_replace - Change the device/pasid's iommu_domain 972 921 * @idev: device to change 922 + * @pasid: pasid to change 973 923 * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING 974 924 * Output the IOMMUFD_OBJ_HWPT_PAGING ID 975 925 * ··· 981 929 * 982 930 * If it fails then no change is made to the attachment. The iommu driver may 983 931 * implement this so there is no disruption in translation. This can only be 984 - * called if iommufd_device_attach() has already succeeded. 932 + * called if iommufd_device_attach() has already succeeded. @pasid is 933 + * IOMMU_NO_PASID for no pasid usage. 985 934 */ 986 - int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id) 935 + int iommufd_device_replace(struct iommufd_device *idev, ioasid_t pasid, 936 + u32 *pt_id) 987 937 { 988 - return iommufd_device_change_pt(idev, pt_id, 938 + return iommufd_device_change_pt(idev, pasid, pt_id, 989 939 &iommufd_device_do_replace); 990 940 } 991 941 EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, "IOMMUFD"); 992 942 993 943 /** 994 - * iommufd_device_detach - Disconnect a device to an iommu_domain 944 + * iommufd_device_detach - Disconnect a device/pasid to an iommu_domain 995 945 * @idev: device to detach 946 + * @pasid: pasid to detach 996 947 * 997 948 * Undo iommufd_device_attach(). This disconnects the idev from the previously 998 949 * attached pt_id.
The device returns back to a blocked DMA translation. 950 + * @pasid is IOMMU_NO_PASID for no pasid usage. 999 951 */ 1000 - void iommufd_device_detach(struct iommufd_device *idev) 952 + void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid) 1001 953 { 1002 954 struct iommufd_hw_pagetable *hwpt; 1003 955 1004 - hwpt = iommufd_hw_pagetable_detach(idev); 956 + hwpt = iommufd_hw_pagetable_detach(idev, pasid); 957 + if (!hwpt) 958 + return; 1005 959 iommufd_hw_pagetable_put(idev->ictx, hwpt); 1006 960 refcount_dec(&idev->obj.users); 1007 961 } ··· 1407 1349 struct io_pagetable *iopt; 1408 1350 struct iopt_area *area; 1409 1351 unsigned long last_iova; 1410 - int rc; 1352 + int rc = -EINVAL; 1411 1353 1412 1354 if (!length) 1413 1355 return -EINVAL; ··· 1463 1405 void *data; 1464 1406 int rc; 1465 1407 1466 - if (cmd->flags || cmd->__reserved) 1408 + if (cmd->flags || cmd->__reserved[0] || cmd->__reserved[1] || 1409 + cmd->__reserved[2]) 1467 1410 return -EOPNOTSUPP; 1468 1411 1469 1412 idev = iommufd_get_device(ucmd, cmd->dev_id); ··· 1520 1461 cmd->out_capabilities = 0; 1521 1462 if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING)) 1522 1463 cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING; 1464 + 1465 + cmd->out_max_pasid_log2 = 0; 1466 + /* 1467 + * Currently, all iommu drivers enable PASID in the probe_device() 1468 + * op if iommu and device supports it. So the max_pasids stored in 1469 + * dev->iommu indicates both PASID support and enable status. A 1470 + * non-zero dev->iommu->max_pasids means PASID is supported and 1471 + * enabled. The iommufd only reports PASID capability to userspace 1472 + * if it's enabled. 
1473 + */ 1474 + if (idev->dev->iommu->max_pasids) { 1475 + cmd->out_max_pasid_log2 = ilog2(idev->dev->iommu->max_pasids); 1476 + 1477 + if (dev_is_pci(idev->dev)) { 1478 + struct pci_dev *pdev = to_pci_dev(idev->dev); 1479 + int ctrl; 1480 + 1481 + ctrl = pci_pasid_status(pdev); 1482 + 1483 + WARN_ON_ONCE(ctrl < 0 || 1484 + !(ctrl & PCI_PASID_CTRL_ENABLE)); 1485 + 1486 + if (ctrl & PCI_PASID_CTRL_EXEC) 1487 + cmd->out_capabilities |= 1488 + IOMMU_HW_CAP_PCI_PASID_EXEC; 1489 + if (ctrl & PCI_PASID_CTRL_PRIV) 1490 + cmd->out_capabilities |= 1491 + IOMMU_HW_CAP_PCI_PASID_PRIV; 1492 + } 1493 + } 1523 1494 1524 1495 rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); 1525 1496 out_free:
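The hunk above extends IOMMU_GET_HW_INFO so userspace learns both the PASID width (`out_max_pasid_log2`) and the PCI EXEC/PRIV control bits via `out_capabilities`. A minimal userspace-side sketch of how a VMM could decode those fields follows; the capability bit values and all `pasid_info`/`decode_pasid_caps` names are assumptions for illustration only, the authoritative definitions live in include/uapi/linux/iommufd.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror enum iommufd_hw_capabilities in the uAPI header;
 * verify against include/uapi/linux/iommufd.h before use. */
#define IOMMU_HW_CAP_DIRTY_TRACKING (1 << 0)
#define IOMMU_HW_CAP_PCI_PASID_EXEC (1 << 1)
#define IOMMU_HW_CAP_PCI_PASID_PRIV (1 << 2)

/* Hypothetical helper struct, not part of the uAPI */
struct pasid_info {
	bool supported;      /* non-zero log2 => PASID supported and enabled */
	uint32_t num_pasids; /* PCIe PASID widths are powers of two */
	bool exec;           /* device may issue Execute-Requested TLPs */
	bool priv;           /* device may issue Privileged-Mode TLPs */
};

/* Decode the PASID-related outputs of IOMMU_GET_HW_INFO. The kernel
 * reports ilog2(dev->iommu->max_pasids), so 1 << log2 recovers the
 * count for the power-of-two widths PCIe defines. */
static struct pasid_info decode_pasid_caps(uint8_t out_max_pasid_log2,
					   uint64_t out_capabilities)
{
	struct pasid_info info = {
		.supported = out_max_pasid_log2 != 0,
		.num_pasids = out_max_pasid_log2 ?
			      (uint32_t)1 << out_max_pasid_log2 : 0,
		.exec = (out_capabilities & IOMMU_HW_CAP_PCI_PASID_EXEC) != 0,
		.priv = (out_capabilities & IOMMU_HW_CAP_PCI_PASID_PRIV) != 0,
	};
	return info;
}
```

Note the kernel only sets `out_max_pasid_log2` when PASID is actually enabled at probe time, so a zero value means "do not expose a vPASID capability to the guest".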
+198
drivers/iommu/iommufd/driver.c
··· 49 49 } 50 50 EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, "IOMMUFD"); 51 51 52 + /* Return -ENOENT if device is not associated to the vIOMMU */ 53 + int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu, 54 + struct device *dev, unsigned long *vdev_id) 55 + { 56 + struct iommufd_vdevice *vdev; 57 + unsigned long index; 58 + int rc = -ENOENT; 59 + 60 + if (WARN_ON_ONCE(!vdev_id)) 61 + return -EINVAL; 62 + 63 + xa_lock(&viommu->vdevs); 64 + xa_for_each(&viommu->vdevs, index, vdev) { 65 + if (vdev->dev == dev) { 66 + *vdev_id = vdev->id; 67 + rc = 0; 68 + break; 69 + } 70 + } 71 + xa_unlock(&viommu->vdevs); 72 + return rc; 73 + } 74 + EXPORT_SYMBOL_NS_GPL(iommufd_viommu_get_vdev_id, "IOMMUFD"); 75 + 76 + /* 77 + * Typically called in driver's threaded IRQ handler. 78 + * The @type and @event_data must be defined in include/uapi/linux/iommufd.h 79 + */ 80 + int iommufd_viommu_report_event(struct iommufd_viommu *viommu, 81 + enum iommu_veventq_type type, void *event_data, 82 + size_t data_len) 83 + { 84 + struct iommufd_veventq *veventq; 85 + struct iommufd_vevent *vevent; 86 + int rc = 0; 87 + 88 + if (WARN_ON_ONCE(!data_len || !event_data)) 89 + return -EINVAL; 90 + 91 + down_read(&viommu->veventqs_rwsem); 92 + 93 + veventq = iommufd_viommu_find_veventq(viommu, type); 94 + if (!veventq) { 95 + rc = -EOPNOTSUPP; 96 + goto out_unlock_veventqs; 97 + } 98 + 99 + spin_lock(&veventq->common.lock); 100 + if (veventq->num_events == veventq->depth) { 101 + vevent = &veventq->lost_events_header; 102 + goto out_set_header; 103 + } 104 + 105 + vevent = kzalloc(struct_size(vevent, event_data, data_len), GFP_ATOMIC); 106 + if (!vevent) { 107 + rc = -ENOMEM; 108 + vevent = &veventq->lost_events_header; 109 + goto out_set_header; 110 + } 111 + memcpy(vevent->event_data, event_data, data_len); 112 + vevent->data_len = data_len; 113 + veventq->num_events++; 114 + 115 + out_set_header: 116 + iommufd_vevent_handler(veventq, vevent); 117 + spin_unlock(&veventq->common.lock); 
118 + out_unlock_veventqs: 119 + up_read(&viommu->veventqs_rwsem); 120 + return rc; 121 + } 122 + EXPORT_SYMBOL_NS_GPL(iommufd_viommu_report_event, "IOMMUFD"); 123 + 124 + #ifdef CONFIG_IRQ_MSI_IOMMU 125 + /* 126 + * Get a iommufd_sw_msi_map for the msi physical address requested by the irq 127 + * layer. The mapping to IOVA is global to the iommufd file descriptor, every 128 + * domain that is attached to a device using the same MSI parameters will use 129 + * the same IOVA. 130 + */ 131 + static struct iommufd_sw_msi_map * 132 + iommufd_sw_msi_get_map(struct iommufd_ctx *ictx, phys_addr_t msi_addr, 133 + phys_addr_t sw_msi_start) 134 + { 135 + struct iommufd_sw_msi_map *cur; 136 + unsigned int max_pgoff = 0; 137 + 138 + lockdep_assert_held(&ictx->sw_msi_lock); 139 + 140 + list_for_each_entry(cur, &ictx->sw_msi_list, sw_msi_item) { 141 + if (cur->sw_msi_start != sw_msi_start) 142 + continue; 143 + max_pgoff = max(max_pgoff, cur->pgoff + 1); 144 + if (cur->msi_addr == msi_addr) 145 + return cur; 146 + } 147 + 148 + if (ictx->sw_msi_id >= 149 + BITS_PER_BYTE * sizeof_field(struct iommufd_sw_msi_maps, bitmap)) 150 + return ERR_PTR(-EOVERFLOW); 151 + 152 + cur = kzalloc(sizeof(*cur), GFP_KERNEL); 153 + if (!cur) 154 + return ERR_PTR(-ENOMEM); 155 + 156 + cur->sw_msi_start = sw_msi_start; 157 + cur->msi_addr = msi_addr; 158 + cur->pgoff = max_pgoff; 159 + cur->id = ictx->sw_msi_id++; 160 + list_add_tail(&cur->sw_msi_item, &ictx->sw_msi_list); 161 + return cur; 162 + } 163 + 164 + int iommufd_sw_msi_install(struct iommufd_ctx *ictx, 165 + struct iommufd_hwpt_paging *hwpt_paging, 166 + struct iommufd_sw_msi_map *msi_map) 167 + { 168 + unsigned long iova; 169 + 170 + lockdep_assert_held(&ictx->sw_msi_lock); 171 + 172 + iova = msi_map->sw_msi_start + msi_map->pgoff * PAGE_SIZE; 173 + if (!test_bit(msi_map->id, hwpt_paging->present_sw_msi.bitmap)) { 174 + int rc; 175 + 176 + rc = iommu_map(hwpt_paging->common.domain, iova, 177 + msi_map->msi_addr, PAGE_SIZE, 178 + 
IOMMU_WRITE | IOMMU_READ | IOMMU_MMIO, 179 + GFP_KERNEL_ACCOUNT); 180 + if (rc) 181 + return rc; 182 + __set_bit(msi_map->id, hwpt_paging->present_sw_msi.bitmap); 183 + } 184 + return 0; 185 + } 186 + EXPORT_SYMBOL_NS_GPL(iommufd_sw_msi_install, "IOMMUFD_INTERNAL"); 187 + 188 + /* 189 + * Called by the irq code if the platform translates the MSI address through the 190 + * IOMMU. msi_addr is the physical address of the MSI page. iommufd will 191 + * allocate a fd global iova for the physical page that is the same on all 192 + * domains and devices. 193 + */ 194 + int iommufd_sw_msi(struct iommu_domain *domain, struct msi_desc *desc, 195 + phys_addr_t msi_addr) 196 + { 197 + struct device *dev = msi_desc_to_dev(desc); 198 + struct iommufd_hwpt_paging *hwpt_paging; 199 + struct iommu_attach_handle *raw_handle; 200 + struct iommufd_attach_handle *handle; 201 + struct iommufd_sw_msi_map *msi_map; 202 + struct iommufd_ctx *ictx; 203 + unsigned long iova; 204 + int rc; 205 + 206 + /* 207 + * It is safe to call iommu_attach_handle_get() here because the iommu 208 + * core code invokes this under the group mutex which also prevents any 209 + * change of the attach handle for the duration of this function. 210 + */ 211 + iommu_group_mutex_assert(dev); 212 + 213 + raw_handle = 214 + iommu_attach_handle_get(dev->iommu_group, IOMMU_NO_PASID, 0); 215 + if (IS_ERR(raw_handle)) 216 + return 0; 217 + hwpt_paging = find_hwpt_paging(domain->iommufd_hwpt); 218 + 219 + handle = to_iommufd_handle(raw_handle); 220 + /* No IOMMU_RESV_SW_MSI means no change to the msi_msg */ 221 + if (handle->idev->igroup->sw_msi_start == PHYS_ADDR_MAX) 222 + return 0; 223 + 224 + ictx = handle->idev->ictx; 225 + guard(mutex)(&ictx->sw_msi_lock); 226 + /* 227 + * The input msi_addr is the exact byte offset of the MSI doorbell, we 228 + * assume the caller has checked that it is contained with a MMIO region 229 + * that is secure to map at PAGE_SIZE. 
230 + */ 231 + msi_map = iommufd_sw_msi_get_map(handle->idev->ictx, 232 + msi_addr & PAGE_MASK, 233 + handle->idev->igroup->sw_msi_start); 234 + if (IS_ERR(msi_map)) 235 + return PTR_ERR(msi_map); 236 + 237 + rc = iommufd_sw_msi_install(ictx, hwpt_paging, msi_map); 238 + if (rc) 239 + return rc; 240 + __set_bit(msi_map->id, handle->idev->igroup->required_sw_msi.bitmap); 241 + 242 + iova = msi_map->sw_msi_start + msi_map->pgoff * PAGE_SIZE; 243 + msi_desc_set_iommu_msi_iova(desc, iova, PAGE_SHIFT); 244 + return 0; 245 + } 246 + EXPORT_SYMBOL_NS_GPL(iommufd_sw_msi, "IOMMUFD"); 247 + #endif 248 + 52 249 MODULE_DESCRIPTION("iommufd code shared with builtin modules"); 250 + MODULE_IMPORT_NS("IOMMUFD_INTERNAL"); 53 251 MODULE_LICENSE("GPL");
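The key policy in `iommufd_viommu_report_event()` above is its overflow handling: at most `depth` real events are buffered, and once the queue is full (or a GFP_ATOMIC allocation fails) the driver's report collapses into the single `lost_events_header` sentinel instead of growing the queue or vanishing silently. A toy model of that policy, with illustrative names that are not the kernel's:

```c
#include <stdbool.h>

/* Toy model of the bounded vEVENTQ in iommufd_viommu_report_event():
 * names and layout are illustrative only. */
struct toy_veventq {
	unsigned int depth;      /* max buffered real events */
	unsigned int num_events; /* currently buffered real events */
	bool lost;               /* lost-events marker pending */
};

/* Returns true if the event was buffered as a real record, false if it
 * was folded into the lost-events marker (mirrors the kernel queuing
 * &veventq->lost_events_header when num_events == depth). */
static bool toy_report_event(struct toy_veventq *q)
{
	if (q->num_events == q->depth) {
		q->lost = true;
		return false;
	}
	q->num_events++;
	return true;
}
```

This is why userspace must treat a header carrying `IOMMU_VEVENTQ_FLAG_LOST_EVENTS` as "some events between the previous and next record were dropped", rather than as a normal event.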
+598
drivers/iommu/iommufd/eventq.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* Copyright (C) 2024 Intel Corporation 3 + */ 4 + #define pr_fmt(fmt) "iommufd: " fmt 5 + 6 + #include <linux/anon_inodes.h> 7 + #include <linux/file.h> 8 + #include <linux/fs.h> 9 + #include <linux/iommufd.h> 10 + #include <linux/module.h> 11 + #include <linux/mutex.h> 12 + #include <linux/pci.h> 13 + #include <linux/pci-ats.h> 14 + #include <linux/poll.h> 15 + #include <uapi/linux/iommufd.h> 16 + 17 + #include "../iommu-priv.h" 18 + #include "iommufd_private.h" 19 + 20 + /* IOMMUFD_OBJ_FAULT Functions */ 21 + 22 + int iommufd_fault_iopf_enable(struct iommufd_device *idev) 23 + { 24 + struct device *dev = idev->dev; 25 + int ret; 26 + 27 + /* 28 + * Once we turn on PCI/PRI support for VF, the response failure code 29 + * should not be forwarded to the hardware due to PRI being a shared 30 + * resource between PF and VFs. There is no coordination for this 31 + * shared capability. This waits for a vPRI reset to recover. 32 + */ 33 + if (dev_is_pci(dev)) { 34 + struct pci_dev *pdev = to_pci_dev(dev); 35 + 36 + if (pdev->is_virtfn && pci_pri_supported(pdev)) 37 + return -EINVAL; 38 + } 39 + 40 + mutex_lock(&idev->iopf_lock); 41 + /* Device iopf has already been on. 
*/ 42 + if (++idev->iopf_enabled > 1) { 43 + mutex_unlock(&idev->iopf_lock); 44 + return 0; 45 + } 46 + 47 + ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF); 48 + if (ret) 49 + --idev->iopf_enabled; 50 + mutex_unlock(&idev->iopf_lock); 51 + 52 + return ret; 53 + } 54 + 55 + void iommufd_fault_iopf_disable(struct iommufd_device *idev) 56 + { 57 + mutex_lock(&idev->iopf_lock); 58 + if (!WARN_ON(idev->iopf_enabled == 0)) { 59 + if (--idev->iopf_enabled == 0) 60 + iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF); 61 + } 62 + mutex_unlock(&idev->iopf_lock); 63 + } 64 + 65 + void iommufd_auto_response_faults(struct iommufd_hw_pagetable *hwpt, 66 + struct iommufd_attach_handle *handle) 67 + { 68 + struct iommufd_fault *fault = hwpt->fault; 69 + struct iopf_group *group, *next; 70 + struct list_head free_list; 71 + unsigned long index; 72 + 73 + if (!fault) 74 + return; 75 + INIT_LIST_HEAD(&free_list); 76 + 77 + mutex_lock(&fault->mutex); 78 + spin_lock(&fault->common.lock); 79 + list_for_each_entry_safe(group, next, &fault->common.deliver, node) { 80 + if (group->attach_handle != &handle->handle) 81 + continue; 82 + list_move(&group->node, &free_list); 83 + } 84 + spin_unlock(&fault->common.lock); 85 + 86 + list_for_each_entry_safe(group, next, &free_list, node) { 87 + list_del(&group->node); 88 + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); 89 + iopf_free_group(group); 90 + } 91 + 92 + xa_for_each(&fault->response, index, group) { 93 + if (group->attach_handle != &handle->handle) 94 + continue; 95 + xa_erase(&fault->response, index); 96 + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); 97 + iopf_free_group(group); 98 + } 99 + mutex_unlock(&fault->mutex); 100 + } 101 + 102 + void iommufd_fault_destroy(struct iommufd_object *obj) 103 + { 104 + struct iommufd_eventq *eventq = 105 + container_of(obj, struct iommufd_eventq, obj); 106 + struct iommufd_fault *fault = eventq_to_fault(eventq); 107 + struct iopf_group *group, *next; 108 + unsigned 
long index; 109 + 110 + /* 111 + * The iommufd object's reference count is zero at this point. 112 + * We can be confident that no other threads are currently 113 + * accessing this pointer. Therefore, acquiring the mutex here 114 + * is unnecessary. 115 + */ 116 + list_for_each_entry_safe(group, next, &fault->common.deliver, node) { 117 + list_del(&group->node); 118 + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); 119 + iopf_free_group(group); 120 + } 121 + xa_for_each(&fault->response, index, group) { 122 + xa_erase(&fault->response, index); 123 + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); 124 + iopf_free_group(group); 125 + } 126 + xa_destroy(&fault->response); 127 + mutex_destroy(&fault->mutex); 128 + } 129 + 130 + static void iommufd_compose_fault_message(struct iommu_fault *fault, 131 + struct iommu_hwpt_pgfault *hwpt_fault, 132 + struct iommufd_device *idev, 133 + u32 cookie) 134 + { 135 + hwpt_fault->flags = fault->prm.flags; 136 + hwpt_fault->dev_id = idev->obj.id; 137 + hwpt_fault->pasid = fault->prm.pasid; 138 + hwpt_fault->grpid = fault->prm.grpid; 139 + hwpt_fault->perm = fault->prm.perm; 140 + hwpt_fault->addr = fault->prm.addr; 141 + hwpt_fault->length = 0; 142 + hwpt_fault->cookie = cookie; 143 + } 144 + 145 + /* Fetch the first node out of the fault->deliver list */ 146 + static struct iopf_group * 147 + iommufd_fault_deliver_fetch(struct iommufd_fault *fault) 148 + { 149 + struct list_head *list = &fault->common.deliver; 150 + struct iopf_group *group = NULL; 151 + 152 + spin_lock(&fault->common.lock); 153 + if (!list_empty(list)) { 154 + group = list_first_entry(list, struct iopf_group, node); 155 + list_del(&group->node); 156 + } 157 + spin_unlock(&fault->common.lock); 158 + return group; 159 + } 160 + 161 + /* Restore a node back to the head of the fault->deliver list */ 162 + static void iommufd_fault_deliver_restore(struct iommufd_fault *fault, 163 + struct iopf_group *group) 164 + { 165 + spin_lock(&fault->common.lock); 166 + 
list_add(&group->node, &fault->common.deliver); 167 + spin_unlock(&fault->common.lock); 168 + } 169 + 170 + static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf, 171 + size_t count, loff_t *ppos) 172 + { 173 + size_t fault_size = sizeof(struct iommu_hwpt_pgfault); 174 + struct iommufd_eventq *eventq = filep->private_data; 175 + struct iommufd_fault *fault = eventq_to_fault(eventq); 176 + struct iommu_hwpt_pgfault data = {}; 177 + struct iommufd_device *idev; 178 + struct iopf_group *group; 179 + struct iopf_fault *iopf; 180 + size_t done = 0; 181 + int rc = 0; 182 + 183 + if (*ppos || count % fault_size) 184 + return -ESPIPE; 185 + 186 + mutex_lock(&fault->mutex); 187 + while ((group = iommufd_fault_deliver_fetch(fault))) { 188 + if (done >= count || 189 + group->fault_count * fault_size > count - done) { 190 + iommufd_fault_deliver_restore(fault, group); 191 + break; 192 + } 193 + 194 + rc = xa_alloc(&fault->response, &group->cookie, group, 195 + xa_limit_32b, GFP_KERNEL); 196 + if (rc) { 197 + iommufd_fault_deliver_restore(fault, group); 198 + break; 199 + } 200 + 201 + idev = to_iommufd_handle(group->attach_handle)->idev; 202 + list_for_each_entry(iopf, &group->faults, list) { 203 + iommufd_compose_fault_message(&iopf->fault, 204 + &data, idev, 205 + group->cookie); 206 + if (copy_to_user(buf + done, &data, fault_size)) { 207 + xa_erase(&fault->response, group->cookie); 208 + iommufd_fault_deliver_restore(fault, group); 209 + rc = -EFAULT; 210 + break; 211 + } 212 + done += fault_size; 213 + } 214 + } 215 + mutex_unlock(&fault->mutex); 216 + 217 + return done == 0 ? 
rc : done; 218 + } 219 + 220 + static ssize_t iommufd_fault_fops_write(struct file *filep, const char __user *buf, 221 + size_t count, loff_t *ppos) 222 + { 223 + size_t response_size = sizeof(struct iommu_hwpt_page_response); 224 + struct iommufd_eventq *eventq = filep->private_data; 225 + struct iommufd_fault *fault = eventq_to_fault(eventq); 226 + struct iommu_hwpt_page_response response; 227 + struct iopf_group *group; 228 + size_t done = 0; 229 + int rc = 0; 230 + 231 + if (*ppos || count % response_size) 232 + return -ESPIPE; 233 + 234 + mutex_lock(&fault->mutex); 235 + while (count > done) { 236 + rc = copy_from_user(&response, buf + done, response_size); 237 + if (rc) 238 + break; 239 + 240 + static_assert((int)IOMMUFD_PAGE_RESP_SUCCESS == 241 + (int)IOMMU_PAGE_RESP_SUCCESS); 242 + static_assert((int)IOMMUFD_PAGE_RESP_INVALID == 243 + (int)IOMMU_PAGE_RESP_INVALID); 244 + if (response.code != IOMMUFD_PAGE_RESP_SUCCESS && 245 + response.code != IOMMUFD_PAGE_RESP_INVALID) { 246 + rc = -EINVAL; 247 + break; 248 + } 249 + 250 + group = xa_erase(&fault->response, response.cookie); 251 + if (!group) { 252 + rc = -EINVAL; 253 + break; 254 + } 255 + 256 + iopf_group_response(group, response.code); 257 + iopf_free_group(group); 258 + done += response_size; 259 + } 260 + mutex_unlock(&fault->mutex); 261 + 262 + return done == 0 ? 
rc : done; 263 + } 264 + 265 + /* IOMMUFD_OBJ_VEVENTQ Functions */ 266 + 267 + void iommufd_veventq_abort(struct iommufd_object *obj) 268 + { 269 + struct iommufd_eventq *eventq = 270 + container_of(obj, struct iommufd_eventq, obj); 271 + struct iommufd_veventq *veventq = eventq_to_veventq(eventq); 272 + struct iommufd_viommu *viommu = veventq->viommu; 273 + struct iommufd_vevent *cur, *next; 274 + 275 + lockdep_assert_held_write(&viommu->veventqs_rwsem); 276 + 277 + list_for_each_entry_safe(cur, next, &eventq->deliver, node) { 278 + list_del(&cur->node); 279 + if (cur != &veventq->lost_events_header) 280 + kfree(cur); 281 + } 282 + 283 + refcount_dec(&viommu->obj.users); 284 + list_del(&veventq->node); 285 + } 286 + 287 + void iommufd_veventq_destroy(struct iommufd_object *obj) 288 + { 289 + struct iommufd_veventq *veventq = eventq_to_veventq( 290 + container_of(obj, struct iommufd_eventq, obj)); 291 + 292 + down_write(&veventq->viommu->veventqs_rwsem); 293 + iommufd_veventq_abort(obj); 294 + up_write(&veventq->viommu->veventqs_rwsem); 295 + } 296 + 297 + static struct iommufd_vevent * 298 + iommufd_veventq_deliver_fetch(struct iommufd_veventq *veventq) 299 + { 300 + struct iommufd_eventq *eventq = &veventq->common; 301 + struct list_head *list = &eventq->deliver; 302 + struct iommufd_vevent *vevent = NULL; 303 + 304 + spin_lock(&eventq->lock); 305 + if (!list_empty(list)) { 306 + struct iommufd_vevent *next; 307 + 308 + next = list_first_entry(list, struct iommufd_vevent, node); 309 + /* Make a copy of the lost_events_header for copy_to_user */ 310 + if (next == &veventq->lost_events_header) { 311 + vevent = kzalloc(sizeof(*vevent), GFP_ATOMIC); 312 + if (!vevent) 313 + goto out_unlock; 314 + } 315 + list_del(&next->node); 316 + if (vevent) 317 + memcpy(vevent, next, sizeof(*vevent)); 318 + else 319 + vevent = next; 320 + } 321 + out_unlock: 322 + spin_unlock(&eventq->lock); 323 + return vevent; 324 + } 325 + 326 + static void 
iommufd_veventq_deliver_restore(struct iommufd_veventq *veventq, 327 + struct iommufd_vevent *vevent) 328 + { 329 + struct iommufd_eventq *eventq = &veventq->common; 330 + struct list_head *list = &eventq->deliver; 331 + 332 + spin_lock(&eventq->lock); 333 + if (vevent_for_lost_events_header(vevent)) { 334 + /* Remove the copy of the lost_events_header */ 335 + kfree(vevent); 336 + vevent = NULL; 337 + /* An empty list needs the lost_events_header back */ 338 + if (list_empty(list)) 339 + vevent = &veventq->lost_events_header; 340 + } 341 + if (vevent) 342 + list_add(&vevent->node, list); 343 + spin_unlock(&eventq->lock); 344 + } 345 + 346 + static ssize_t iommufd_veventq_fops_read(struct file *filep, char __user *buf, 347 + size_t count, loff_t *ppos) 348 + { 349 + struct iommufd_eventq *eventq = filep->private_data; 350 + struct iommufd_veventq *veventq = eventq_to_veventq(eventq); 351 + struct iommufd_vevent_header *hdr; 352 + struct iommufd_vevent *cur; 353 + size_t done = 0; 354 + int rc = 0; 355 + 356 + if (*ppos) 357 + return -ESPIPE; 358 + 359 + while ((cur = iommufd_veventq_deliver_fetch(veventq))) { 360 + /* Validate the remaining bytes against the header size */ 361 + if (done >= count || sizeof(*hdr) > count - done) { 362 + iommufd_veventq_deliver_restore(veventq, cur); 363 + break; 364 + } 365 + hdr = &cur->header; 366 + 367 + /* If being a normal vEVENT, validate against the full size */ 368 + if (!vevent_for_lost_events_header(cur) && 369 + sizeof(hdr) + cur->data_len > count - done) { 370 + iommufd_veventq_deliver_restore(veventq, cur); 371 + break; 372 + } 373 + 374 + if (copy_to_user(buf + done, hdr, sizeof(*hdr))) { 375 + iommufd_veventq_deliver_restore(veventq, cur); 376 + rc = -EFAULT; 377 + break; 378 + } 379 + done += sizeof(*hdr); 380 + 381 + if (cur->data_len && 382 + copy_to_user(buf + done, cur->event_data, cur->data_len)) { 383 + iommufd_veventq_deliver_restore(veventq, cur); 384 + rc = -EFAULT; 385 + break; 386 + } 387 + 
spin_lock(&eventq->lock); 388 + if (!vevent_for_lost_events_header(cur)) 389 + veventq->num_events--; 390 + spin_unlock(&eventq->lock); 391 + done += cur->data_len; 392 + kfree(cur); 393 + } 394 + 395 + return done == 0 ? rc : done; 396 + } 397 + 398 + /* Common Event Queue Functions */ 399 + 400 + static __poll_t iommufd_eventq_fops_poll(struct file *filep, 401 + struct poll_table_struct *wait) 402 + { 403 + struct iommufd_eventq *eventq = filep->private_data; 404 + __poll_t pollflags = 0; 405 + 406 + if (eventq->obj.type == IOMMUFD_OBJ_FAULT) 407 + pollflags |= EPOLLOUT; 408 + 409 + poll_wait(filep, &eventq->wait_queue, wait); 410 + spin_lock(&eventq->lock); 411 + if (!list_empty(&eventq->deliver)) 412 + pollflags |= EPOLLIN | EPOLLRDNORM; 413 + spin_unlock(&eventq->lock); 414 + 415 + return pollflags; 416 + } 417 + 418 + static int iommufd_eventq_fops_release(struct inode *inode, struct file *filep) 419 + { 420 + struct iommufd_eventq *eventq = filep->private_data; 421 + 422 + refcount_dec(&eventq->obj.users); 423 + iommufd_ctx_put(eventq->ictx); 424 + return 0; 425 + } 426 + 427 + #define INIT_EVENTQ_FOPS(read_op, write_op) \ 428 + ((const struct file_operations){ \ 429 + .owner = THIS_MODULE, \ 430 + .open = nonseekable_open, \ 431 + .read = read_op, \ 432 + .write = write_op, \ 433 + .poll = iommufd_eventq_fops_poll, \ 434 + .release = iommufd_eventq_fops_release, \ 435 + }) 436 + 437 + static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name, 438 + struct iommufd_ctx *ictx, 439 + const struct file_operations *fops) 440 + { 441 + struct file *filep; 442 + int fdno; 443 + 444 + spin_lock_init(&eventq->lock); 445 + INIT_LIST_HEAD(&eventq->deliver); 446 + init_waitqueue_head(&eventq->wait_queue); 447 + 448 + filep = anon_inode_getfile(name, fops, eventq, O_RDWR); 449 + if (IS_ERR(filep)) 450 + return PTR_ERR(filep); 451 + 452 + eventq->ictx = ictx; 453 + iommufd_ctx_get(eventq->ictx); 454 + eventq->filep = filep; 455 + 
refcount_inc(&eventq->obj.users); 456 + 457 + fdno = get_unused_fd_flags(O_CLOEXEC); 458 + if (fdno < 0) 459 + fput(filep); 460 + return fdno; 461 + } 462 + 463 + static const struct file_operations iommufd_fault_fops = 464 + INIT_EVENTQ_FOPS(iommufd_fault_fops_read, iommufd_fault_fops_write); 465 + 466 + int iommufd_fault_alloc(struct iommufd_ucmd *ucmd) 467 + { 468 + struct iommu_fault_alloc *cmd = ucmd->cmd; 469 + struct iommufd_fault *fault; 470 + int fdno; 471 + int rc; 472 + 473 + if (cmd->flags) 474 + return -EOPNOTSUPP; 475 + 476 + fault = __iommufd_object_alloc(ucmd->ictx, fault, IOMMUFD_OBJ_FAULT, 477 + common.obj); 478 + if (IS_ERR(fault)) 479 + return PTR_ERR(fault); 480 + 481 + xa_init_flags(&fault->response, XA_FLAGS_ALLOC1); 482 + mutex_init(&fault->mutex); 483 + 484 + fdno = iommufd_eventq_init(&fault->common, "[iommufd-pgfault]", 485 + ucmd->ictx, &iommufd_fault_fops); 486 + if (fdno < 0) { 487 + rc = fdno; 488 + goto out_abort; 489 + } 490 + 491 + cmd->out_fault_id = fault->common.obj.id; 492 + cmd->out_fault_fd = fdno; 493 + 494 + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); 495 + if (rc) 496 + goto out_put_fdno; 497 + iommufd_object_finalize(ucmd->ictx, &fault->common.obj); 498 + 499 + fd_install(fdno, fault->common.filep); 500 + 501 + return 0; 502 + out_put_fdno: 503 + put_unused_fd(fdno); 504 + fput(fault->common.filep); 505 + out_abort: 506 + iommufd_object_abort_and_destroy(ucmd->ictx, &fault->common.obj); 507 + 508 + return rc; 509 + } 510 + 511 + int iommufd_fault_iopf_handler(struct iopf_group *group) 512 + { 513 + struct iommufd_hw_pagetable *hwpt; 514 + struct iommufd_fault *fault; 515 + 516 + hwpt = group->attach_handle->domain->iommufd_hwpt; 517 + fault = hwpt->fault; 518 + 519 + spin_lock(&fault->common.lock); 520 + list_add_tail(&group->node, &fault->common.deliver); 521 + spin_unlock(&fault->common.lock); 522 + 523 + wake_up_interruptible(&fault->common.wait_queue); 524 + 525 + return 0; 526 + } 527 + 528 + static const struct 
file_operations iommufd_veventq_fops = 529 + INIT_EVENTQ_FOPS(iommufd_veventq_fops_read, NULL); 530 + 531 + int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd) 532 + { 533 + struct iommu_veventq_alloc *cmd = ucmd->cmd; 534 + struct iommufd_veventq *veventq; 535 + struct iommufd_viommu *viommu; 536 + int fdno; 537 + int rc; 538 + 539 + if (cmd->flags || cmd->__reserved || 540 + cmd->type == IOMMU_VEVENTQ_TYPE_DEFAULT) 541 + return -EOPNOTSUPP; 542 + if (!cmd->veventq_depth) 543 + return -EINVAL; 544 + 545 + viommu = iommufd_get_viommu(ucmd, cmd->viommu_id); 546 + if (IS_ERR(viommu)) 547 + return PTR_ERR(viommu); 548 + 549 + down_write(&viommu->veventqs_rwsem); 550 + 551 + if (iommufd_viommu_find_veventq(viommu, cmd->type)) { 552 + rc = -EEXIST; 553 + goto out_unlock_veventqs; 554 + } 555 + 556 + veventq = __iommufd_object_alloc(ucmd->ictx, veventq, 557 + IOMMUFD_OBJ_VEVENTQ, common.obj); 558 + if (IS_ERR(veventq)) { 559 + rc = PTR_ERR(veventq); 560 + goto out_unlock_veventqs; 561 + } 562 + 563 + veventq->type = cmd->type; 564 + veventq->viommu = viommu; 565 + refcount_inc(&viommu->obj.users); 566 + veventq->depth = cmd->veventq_depth; 567 + list_add_tail(&veventq->node, &viommu->veventqs); 568 + veventq->lost_events_header.header.flags = 569 + IOMMU_VEVENTQ_FLAG_LOST_EVENTS; 570 + 571 + fdno = iommufd_eventq_init(&veventq->common, "[iommufd-viommu-event]", 572 + ucmd->ictx, &iommufd_veventq_fops); 573 + if (fdno < 0) { 574 + rc = fdno; 575 + goto out_abort; 576 + } 577 + 578 + cmd->out_veventq_id = veventq->common.obj.id; 579 + cmd->out_veventq_fd = fdno; 580 + 581 + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); 582 + if (rc) 583 + goto out_put_fdno; 584 + 585 + iommufd_object_finalize(ucmd->ictx, &veventq->common.obj); 586 + fd_install(fdno, veventq->common.filep); 587 + goto out_unlock_veventqs; 588 + 589 + out_put_fdno: 590 + put_unused_fd(fdno); 591 + fput(veventq->common.filep); 592 + out_abort: 593 + iommufd_object_abort_and_destroy(ucmd->ictx, 
&veventq->common.obj); 594 + out_unlock_veventqs: 595 + up_write(&viommu->veventqs_rwsem); 596 + iommufd_put_object(ucmd->ictx, &viommu->obj); 597 + return rc; 598 + }
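The fault FD protocol implemented by `iommufd_fault_fops_read()`/`iommufd_fault_fops_write()` above is a cookie round-trip: each `struct iommu_hwpt_pgfault` record read by userspace carries a cookie allocated into `fault->response`, and the matching `struct iommu_hwpt_page_response` written back must echo that cookie (reads and writes must also be whole multiples of the record size, or the kernel returns -ESPIPE). A hedged sketch of the handler side, using simplified local copies of the records (field set taken from the hunks above; the real layouts are in include/uapi/linux/iommufd.h):

```c
#include <stdint.h>

/* Simplified stand-ins for the uAPI records; the TOY_ response codes
 * are placeholders for IOMMUFD_PAGE_RESP_SUCCESS/INVALID, whose actual
 * values come from the uAPI header. */
struct toy_pgfault {
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
	uint32_t length;
	uint32_t cookie;
};

struct toy_page_response {
	uint32_t cookie;
	uint32_t code;
};

#define TOY_PAGE_RESP_SUCCESS 0
#define TOY_PAGE_RESP_INVALID 1

/* Build the response a handler would write() back after servicing one
 * fault record it read() from the fault FD. The cookie must be echoed
 * verbatim so the kernel's xa_erase(&fault->response, cookie) finds
 * the pending group; an unknown cookie makes the write fail -EINVAL. */
static struct toy_page_response
toy_service_fault(const struct toy_pgfault *evt, int mapped_ok)
{
	struct toy_page_response rsp = {
		.cookie = evt->cookie,
		.code = mapped_ok ? TOY_PAGE_RESP_SUCCESS :
				    TOY_PAGE_RESP_INVALID,
	};
	return rsp;
}
```

The vEVENTQ read path is simpler by design: it is read-only (its fops pass `NULL` for the write op), so lost or unwanted events need no acknowledgement from userspace.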
-342
drivers/iommu/iommufd/fault.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* Copyright (C) 2024 Intel Corporation 3 - */ 4 - #define pr_fmt(fmt) "iommufd: " fmt 5 - 6 - #include <linux/anon_inodes.h> 7 - #include <linux/file.h> 8 - #include <linux/fs.h> 9 - #include <linux/iommufd.h> 10 - #include <linux/module.h> 11 - #include <linux/mutex.h> 12 - #include <linux/pci.h> 13 - #include <linux/pci-ats.h> 14 - #include <linux/poll.h> 15 - #include <uapi/linux/iommufd.h> 16 - 17 - #include "../iommu-priv.h" 18 - #include "iommufd_private.h" 19 - 20 - int iommufd_fault_iopf_enable(struct iommufd_device *idev) 21 - { 22 - struct device *dev = idev->dev; 23 - int ret; 24 - 25 - /* 26 - * Once we turn on PCI/PRI support for VF, the response failure code 27 - * should not be forwarded to the hardware due to PRI being a shared 28 - * resource between PF and VFs. There is no coordination for this 29 - * shared capability. This waits for a vPRI reset to recover. 30 - */ 31 - if (dev_is_pci(dev)) { 32 - struct pci_dev *pdev = to_pci_dev(dev); 33 - 34 - if (pdev->is_virtfn && pci_pri_supported(pdev)) 35 - return -EINVAL; 36 - } 37 - 38 - mutex_lock(&idev->iopf_lock); 39 - /* Device iopf has already been on. 
- */
-	if (++idev->iopf_enabled > 1) {
-		mutex_unlock(&idev->iopf_lock);
-		return 0;
-	}
-
-	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
-	if (ret)
-		--idev->iopf_enabled;
-	mutex_unlock(&idev->iopf_lock);
-
-	return ret;
-}
-
-void iommufd_fault_iopf_disable(struct iommufd_device *idev)
-{
-	mutex_lock(&idev->iopf_lock);
-	if (!WARN_ON(idev->iopf_enabled == 0)) {
-		if (--idev->iopf_enabled == 0)
-			iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF);
-	}
-	mutex_unlock(&idev->iopf_lock);
-}
-
-void iommufd_auto_response_faults(struct iommufd_hw_pagetable *hwpt,
-				  struct iommufd_attach_handle *handle)
-{
-	struct iommufd_fault *fault = hwpt->fault;
-	struct iopf_group *group, *next;
-	struct list_head free_list;
-	unsigned long index;
-
-	if (!fault)
-		return;
-	INIT_LIST_HEAD(&free_list);
-
-	mutex_lock(&fault->mutex);
-	spin_lock(&fault->lock);
-	list_for_each_entry_safe(group, next, &fault->deliver, node) {
-		if (group->attach_handle != &handle->handle)
-			continue;
-		list_move(&group->node, &free_list);
-	}
-	spin_unlock(&fault->lock);
-
-	list_for_each_entry_safe(group, next, &free_list, node) {
-		list_del(&group->node);
-		iopf_group_response(group, IOMMU_PAGE_RESP_INVALID);
-		iopf_free_group(group);
-	}
-
-	xa_for_each(&fault->response, index, group) {
-		if (group->attach_handle != &handle->handle)
-			continue;
-		xa_erase(&fault->response, index);
-		iopf_group_response(group, IOMMU_PAGE_RESP_INVALID);
-		iopf_free_group(group);
-	}
-	mutex_unlock(&fault->mutex);
-}
-
-void iommufd_fault_destroy(struct iommufd_object *obj)
-{
-	struct iommufd_fault *fault = container_of(obj, struct iommufd_fault, obj);
-	struct iopf_group *group, *next;
-	unsigned long index;
-
-	/*
-	 * The iommufd object's reference count is zero at this point.
-	 * We can be confident that no other threads are currently
-	 * accessing this pointer. Therefore, acquiring the mutex here
-	 * is unnecessary.
-	 */
-	list_for_each_entry_safe(group, next, &fault->deliver, node) {
-		list_del(&group->node);
-		iopf_group_response(group, IOMMU_PAGE_RESP_INVALID);
-		iopf_free_group(group);
-	}
-	xa_for_each(&fault->response, index, group) {
-		xa_erase(&fault->response, index);
-		iopf_group_response(group, IOMMU_PAGE_RESP_INVALID);
-		iopf_free_group(group);
-	}
-	xa_destroy(&fault->response);
-	mutex_destroy(&fault->mutex);
-}
-
-static void iommufd_compose_fault_message(struct iommu_fault *fault,
-					  struct iommu_hwpt_pgfault *hwpt_fault,
-					  struct iommufd_device *idev,
-					  u32 cookie)
-{
-	hwpt_fault->flags = fault->prm.flags;
-	hwpt_fault->dev_id = idev->obj.id;
-	hwpt_fault->pasid = fault->prm.pasid;
-	hwpt_fault->grpid = fault->prm.grpid;
-	hwpt_fault->perm = fault->prm.perm;
-	hwpt_fault->addr = fault->prm.addr;
-	hwpt_fault->length = 0;
-	hwpt_fault->cookie = cookie;
-}
-
-static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf,
-				       size_t count, loff_t *ppos)
-{
-	size_t fault_size = sizeof(struct iommu_hwpt_pgfault);
-	struct iommufd_fault *fault = filep->private_data;
-	struct iommu_hwpt_pgfault data = {};
-	struct iommufd_device *idev;
-	struct iopf_group *group;
-	struct iopf_fault *iopf;
-	size_t done = 0;
-	int rc = 0;
-
-	if (*ppos || count % fault_size)
-		return -ESPIPE;
-
-	mutex_lock(&fault->mutex);
-	while ((group = iommufd_fault_deliver_fetch(fault))) {
-		if (done >= count ||
-		    group->fault_count * fault_size > count - done) {
-			iommufd_fault_deliver_restore(fault, group);
-			break;
-		}
-
-		rc = xa_alloc(&fault->response, &group->cookie, group,
-			      xa_limit_32b, GFP_KERNEL);
-		if (rc) {
-			iommufd_fault_deliver_restore(fault, group);
-			break;
-		}
-
-		idev = to_iommufd_handle(group->attach_handle)->idev;
-		list_for_each_entry(iopf, &group->faults, list) {
-			iommufd_compose_fault_message(&iopf->fault,
-						      &data, idev,
-						      group->cookie);
-			if (copy_to_user(buf + done, &data, fault_size)) {
-				xa_erase(&fault->response, group->cookie);
-				iommufd_fault_deliver_restore(fault, group);
-				rc = -EFAULT;
-				break;
-			}
-			done += fault_size;
-		}
-	}
-	mutex_unlock(&fault->mutex);
-
-	return done == 0 ? rc : done;
-}
-
-static ssize_t iommufd_fault_fops_write(struct file *filep, const char __user *buf,
-					size_t count, loff_t *ppos)
-{
-	size_t response_size = sizeof(struct iommu_hwpt_page_response);
-	struct iommufd_fault *fault = filep->private_data;
-	struct iommu_hwpt_page_response response;
-	struct iopf_group *group;
-	size_t done = 0;
-	int rc = 0;
-
-	if (*ppos || count % response_size)
-		return -ESPIPE;
-
-	mutex_lock(&fault->mutex);
-	while (count > done) {
-		rc = copy_from_user(&response, buf + done, response_size);
-		if (rc)
-			break;
-
-		static_assert((int)IOMMUFD_PAGE_RESP_SUCCESS ==
-			      (int)IOMMU_PAGE_RESP_SUCCESS);
-		static_assert((int)IOMMUFD_PAGE_RESP_INVALID ==
-			      (int)IOMMU_PAGE_RESP_INVALID);
-		if (response.code != IOMMUFD_PAGE_RESP_SUCCESS &&
-		    response.code != IOMMUFD_PAGE_RESP_INVALID) {
-			rc = -EINVAL;
-			break;
-		}
-
-		group = xa_erase(&fault->response, response.cookie);
-		if (!group) {
-			rc = -EINVAL;
-			break;
-		}
-
-		iopf_group_response(group, response.code);
-		iopf_free_group(group);
-		done += response_size;
-	}
-	mutex_unlock(&fault->mutex);
-
-	return done == 0 ? rc : done;
-}
-
-static __poll_t iommufd_fault_fops_poll(struct file *filep,
-					struct poll_table_struct *wait)
-{
-	struct iommufd_fault *fault = filep->private_data;
-	__poll_t pollflags = EPOLLOUT;
-
-	poll_wait(filep, &fault->wait_queue, wait);
-	spin_lock(&fault->lock);
-	if (!list_empty(&fault->deliver))
-		pollflags |= EPOLLIN | EPOLLRDNORM;
-	spin_unlock(&fault->lock);
-
-	return pollflags;
-}
-
-static int iommufd_fault_fops_release(struct inode *inode, struct file *filep)
-{
-	struct iommufd_fault *fault = filep->private_data;
-
-	refcount_dec(&fault->obj.users);
-	iommufd_ctx_put(fault->ictx);
-	return 0;
-}
-
-static const struct file_operations iommufd_fault_fops = {
-	.owner = THIS_MODULE,
-	.open = nonseekable_open,
-	.read = iommufd_fault_fops_read,
-	.write = iommufd_fault_fops_write,
-	.poll = iommufd_fault_fops_poll,
-	.release = iommufd_fault_fops_release,
-};
-
-int iommufd_fault_alloc(struct iommufd_ucmd *ucmd)
-{
-	struct iommu_fault_alloc *cmd = ucmd->cmd;
-	struct iommufd_fault *fault;
-	struct file *filep;
-	int fdno;
-	int rc;
-
-	if (cmd->flags)
-		return -EOPNOTSUPP;
-
-	fault = iommufd_object_alloc(ucmd->ictx, fault, IOMMUFD_OBJ_FAULT);
-	if (IS_ERR(fault))
-		return PTR_ERR(fault);
-
-	fault->ictx = ucmd->ictx;
-	INIT_LIST_HEAD(&fault->deliver);
-	xa_init_flags(&fault->response, XA_FLAGS_ALLOC1);
-	mutex_init(&fault->mutex);
-	spin_lock_init(&fault->lock);
-	init_waitqueue_head(&fault->wait_queue);
-
-	filep = anon_inode_getfile("[iommufd-pgfault]", &iommufd_fault_fops,
-				   fault, O_RDWR);
-	if (IS_ERR(filep)) {
-		rc = PTR_ERR(filep);
-		goto out_abort;
-	}
-
-	refcount_inc(&fault->obj.users);
-	iommufd_ctx_get(fault->ictx);
-	fault->filep = filep;
-
-	fdno = get_unused_fd_flags(O_CLOEXEC);
-	if (fdno < 0) {
-		rc = fdno;
-		goto out_fput;
-	}
-
-	cmd->out_fault_id = fault->obj.id;
-	cmd->out_fault_fd = fdno;
-
-	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
-	if (rc)
-		goto out_put_fdno;
-	iommufd_object_finalize(ucmd->ictx, &fault->obj);
-
-	fd_install(fdno, fault->filep);
-
-	return 0;
-out_put_fdno:
-	put_unused_fd(fdno);
-out_fput:
-	fput(filep);
-out_abort:
-	iommufd_object_abort_and_destroy(ucmd->ictx, &fault->obj);
-
-	return rc;
-}
-
-int iommufd_fault_iopf_handler(struct iopf_group *group)
-{
-	struct iommufd_hw_pagetable *hwpt;
-	struct iommufd_fault *fault;
-
-	hwpt = group->attach_handle->domain->iommufd_hwpt;
-	fault = hwpt->fault;
-
-	spin_lock(&fault->lock);
-	list_add_tail(&group->node, &fault->deliver);
-	spin_unlock(&fault->lock);
-
-	wake_up_interruptible(&fault->wait_queue);
-
-	return 0;
-}
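The read/write fops above define the fault FD protocol: read() returns a stream of fixed-size `struct iommu_hwpt_pgfault` records (partial reads are rejected with -ESPIPE), and userspace acknowledges each fault group by writing back a response keyed by the record's `cookie`. A minimal sketch of the userspace parsing side is below; the `hwpt_pgfault` struct here is a local stand-in for the UAPI layout in `<linux/iommufd.h>` so the sketch compiles without kernel headers, with field names taken from `iommufd_compose_fault_message()` above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the UAPI struct iommu_hwpt_pgfault (assumed layout) */
struct hwpt_pgfault {
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint32_t __reserved;
	uint64_t addr;
	uint32_t length;
	uint32_t cookie;
};

/*
 * Split a read() buffer into fault records. Returns the record count, or
 * -1 when the byte count is not a whole multiple of the record size,
 * mirroring the kernel's "count % fault_size" check in the read fop.
 */
static int parse_fault_records(const void *buf, size_t count,
			       struct hwpt_pgfault *out, size_t max)
{
	size_t n = count / sizeof(struct hwpt_pgfault);
	size_t i;

	if (count % sizeof(struct hwpt_pgfault) || n > max)
		return -1;
	for (i = 0; i < n; i++)
		memcpy(&out[i], (const char *)buf + i * sizeof(out[0]),
		       sizeof(out[0]));
	return (int)n;
}
```

In a real consumer the `cookie` from each record is echoed back through a write() of `struct iommu_hwpt_page_response`, which is what lets `iommufd_fault_fops_write()` find the pending group in the response xarray.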
drivers/iommu/iommufd/hw_pagetable.c (+26 -16)
···
 	iommu_domain_free(hwpt->domain);
 
 	if (hwpt->fault)
-		refcount_dec(&hwpt->fault->obj.users);
+		refcount_dec(&hwpt->fault->common.obj.users);
 }
 
 void iommufd_hwpt_paging_destroy(struct iommufd_object *obj)
···
  * @ictx: iommufd context
  * @ioas: IOAS to associate the domain with
  * @idev: Device to get an iommu_domain for
+ * @pasid: PASID to get an iommu_domain for
  * @flags: Flags from userspace
  * @immediate_attach: True if idev should be attached to the hwpt
  * @user_data: The user provided driver specific data describing the domain to
···
  */
 struct iommufd_hwpt_paging *
 iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
-			  struct iommufd_device *idev, u32 flags,
-			  bool immediate_attach,
+			  struct iommufd_device *idev, ioasid_t pasid,
+			  u32 flags, bool immediate_attach,
 			  const struct iommu_user_data *user_data)
 {
 	const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
 				IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
-				IOMMU_HWPT_FAULT_ID_VALID;
+				IOMMU_HWPT_FAULT_ID_VALID |
+				IOMMU_HWPT_ALLOC_PASID;
 	const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
 	struct iommufd_hwpt_paging *hwpt_paging;
 	struct iommufd_hw_pagetable *hwpt;
···
 	if ((flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) &&
 	    !device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING))
 		return ERR_PTR(-EOPNOTSUPP);
+	if ((flags & IOMMU_HWPT_FAULT_ID_VALID) &&
+	    (flags & IOMMU_HWPT_ALLOC_NEST_PARENT))
+		return ERR_PTR(-EOPNOTSUPP);
 
 	hwpt_paging = __iommufd_object_alloc(
 		ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj);
 	if (IS_ERR(hwpt_paging))
 		return ERR_CAST(hwpt_paging);
 	hwpt = &hwpt_paging->common;
+	hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
 
 	INIT_LIST_HEAD(&hwpt_paging->hwpt_item);
 	/* Pairs with iommufd_hw_pagetable_destroy() */
···
 			goto out_abort;
 		}
 	}
-	iommu_domain_set_sw_msi(hwpt->domain, iommufd_sw_msi);
+	hwpt->domain->iommufd_hwpt = hwpt;
+	hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
 
 	/*
 	 * Set the coherency mode before we do iopt_table_add_domain() as some
···
 	 * sequence. Once those drivers are fixed this should be removed.
 	 */
 	if (immediate_attach) {
-		rc = iommufd_hw_pagetable_attach(hwpt, idev);
+		rc = iommufd_hw_pagetable_attach(hwpt, idev, pasid);
 		if (rc)
 			goto out_abort;
 	}
···
 
 out_detach:
 	if (immediate_attach)
-		iommufd_hw_pagetable_detach(idev);
+		iommufd_hw_pagetable_detach(idev, pasid);
 out_abort:
 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
 	return ERR_PTR(rc);
···
 	struct iommufd_hw_pagetable *hwpt;
 	int rc;
 
-	if ((flags & ~IOMMU_HWPT_FAULT_ID_VALID) ||
+	if ((flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID)) ||
 	    !user_data->len || !ops->domain_alloc_nested)
 		return ERR_PTR(-EOPNOTSUPP);
 	if (parent->auto_domain || !parent->nest_parent ||
···
 	if (IS_ERR(hwpt_nested))
 		return ERR_CAST(hwpt_nested);
 	hwpt = &hwpt_nested->common;
+	hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
 
 	refcount_inc(&parent->common.obj.users);
 	hwpt_nested->parent = parent;
···
 		goto out_abort;
 	}
 	hwpt->domain->owner = ops;
-	iommu_domain_set_sw_msi(hwpt->domain, iommufd_sw_msi);
+	hwpt->domain->iommufd_hwpt = hwpt;
+	hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
 
 	if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
 		rc = -EINVAL;
···
 	struct iommufd_hw_pagetable *hwpt;
 	int rc;
 
-	if (flags & ~IOMMU_HWPT_FAULT_ID_VALID)
+	if (flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID))
 		return ERR_PTR(-EOPNOTSUPP);
 	if (!user_data->len)
 		return ERR_PTR(-EOPNOTSUPP);
···
 	if (IS_ERR(hwpt_nested))
 		return ERR_CAST(hwpt_nested);
 	hwpt = &hwpt_nested->common;
+	hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
 
 	hwpt_nested->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
···
 		hwpt->domain = NULL;
 		goto out_abort;
 	}
+	hwpt->domain->iommufd_hwpt = hwpt;
 	hwpt->domain->owner = viommu->iommu_dev->ops;
-	iommu_domain_set_sw_msi(hwpt->domain, iommufd_sw_msi);
+	hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
 
 	if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
 		rc = -EINVAL;
···
 	ioas = container_of(pt_obj, struct iommufd_ioas, obj);
 	mutex_lock(&ioas->mutex);
 	hwpt_paging = iommufd_hwpt_paging_alloc(
-		ucmd->ictx, ioas, idev, cmd->flags, false,
-		user_data.len ? &user_data : NULL);
+		ucmd->ictx, ioas, idev, IOMMU_NO_PASID, cmd->flags,
+		false, user_data.len ? &user_data : NULL);
 	if (IS_ERR(hwpt_paging)) {
 		rc = PTR_ERR(hwpt_paging);
 		goto out_unlock;
···
 		}
 		hwpt->fault = fault;
 		hwpt->domain->iopf_handler = iommufd_fault_iopf_handler;
-		refcount_inc(&fault->obj.users);
-		iommufd_put_object(ucmd->ictx, &fault->obj);
+		refcount_inc(&fault->common.obj.users);
+		iommufd_put_object(ucmd->ictx, &fault->common.obj);
 	}
-	hwpt->domain->iommufd_hwpt = hwpt;
 
 	cmd->out_hwpt_id = hwpt->obj.id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
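The hw_pagetable.c changes above extend each allocation path's flag mask with `IOMMU_HWPT_ALLOC_PASID` and add the new rule that a fault queue cannot be combined with a nesting parent (no driver supports faults on nested parents today). A small sketch of that validation logic is below; the flag bit values are local assumptions standing in for the UAPI definitions in `<linux/iommufd.h>`, not a copy of the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed local stand-ins for the UAPI IOMMU_HWPT_* flag bits */
#define HWPT_ALLOC_NEST_PARENT    (1u << 0)
#define HWPT_ALLOC_DIRTY_TRACKING (1u << 1)
#define HWPT_FAULT_ID_VALID       (1u << 2)
#define HWPT_ALLOC_PASID          (1u << 3)

/*
 * Mirror of the paging-domain flag checks above: any unknown flag is
 * rejected, and IOMMU_HWPT_FAULT_ID_VALID may not be combined with
 * IOMMU_HWPT_ALLOC_NEST_PARENT.
 */
static int check_paging_alloc_flags(uint32_t flags)
{
	const uint32_t valid = HWPT_ALLOC_NEST_PARENT |
			       HWPT_ALLOC_DIRTY_TRACKING |
			       HWPT_FAULT_ID_VALID |
			       HWPT_ALLOC_PASID;

	if (flags & ~valid)
		return -EOPNOTSUPP;
	if ((flags & HWPT_FAULT_ID_VALID) && (flags & HWPT_ALLOC_NEST_PARENT))
		return -EOPNOTSUPP;
	return 0;
}
```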
drivers/iommu/iommufd/iommufd_private.h (+114 -42)
···
 	DECLARE_BITMAP(bitmap, 64);
 };
 
-int iommufd_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
-		   phys_addr_t msi_addr);
+#ifdef CONFIG_IRQ_MSI_IOMMU
+int iommufd_sw_msi_install(struct iommufd_ctx *ictx,
+			   struct iommufd_hwpt_paging *hwpt_paging,
+			   struct iommufd_sw_msi_map *msi_map);
+#endif
 
 struct iommufd_ctx {
 	struct file *file;
···
 	struct iommufd_object obj;
 	struct iommu_domain *domain;
 	struct iommufd_fault *fault;
+	bool pasid_compat : 1;
 };
 
 struct iommufd_hwpt_paging {
···
 
 struct iommufd_hwpt_paging *
 iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
-			  struct iommufd_device *idev, u32 flags,
-			  bool immediate_attach,
+			  struct iommufd_device *idev, ioasid_t pasid,
+			  u32 flags, bool immediate_attach,
 			  const struct iommu_user_data *user_data);
 int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
-				struct iommufd_device *idev);
+				struct iommufd_device *idev, ioasid_t pasid);
 struct iommufd_hw_pagetable *
-iommufd_hw_pagetable_detach(struct iommufd_device *idev);
+iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid);
 void iommufd_hwpt_paging_destroy(struct iommufd_object *obj);
 void iommufd_hwpt_paging_abort(struct iommufd_object *obj);
 void iommufd_hwpt_nested_destroy(struct iommufd_object *obj);
···
 	refcount_dec(&hwpt->obj.users);
 }
 
+struct iommufd_attach;
+
 struct iommufd_group {
 	struct kref ref;
 	struct mutex lock;
 	struct iommufd_ctx *ictx;
 	struct iommu_group *group;
-	struct iommufd_hw_pagetable *hwpt;
-	struct list_head device_list;
+	struct xarray pasid_attach;
 	struct iommufd_sw_msi_maps required_sw_msi;
 	phys_addr_t sw_msi_start;
 };
···
 			     u32 iopt_access_list_id);
 void iommufd_access_destroy_object(struct iommufd_object *obj);
 
-/*
- * An iommufd_fault object represents an interface to deliver I/O page faults
- * to the user space. These objects are created/destroyed by the user space and
- * associated with hardware page table objects during page-table allocation.
- */
-struct iommufd_fault {
+struct iommufd_eventq {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
 	struct file *filep;
 
 	spinlock_t lock; /* protects the deliver list */
 	struct list_head deliver;
-	struct mutex mutex; /* serializes response flows */
-	struct xarray response;
 
 	struct wait_queue_head wait_queue;
 };
-
-/* Fetch the first node out of the fault->deliver list */
-static inline struct iopf_group *
-iommufd_fault_deliver_fetch(struct iommufd_fault *fault)
-{
-	struct list_head *list = &fault->deliver;
-	struct iopf_group *group = NULL;
-
-	spin_lock(&fault->lock);
-	if (!list_empty(list)) {
-		group = list_first_entry(list, struct iopf_group, node);
-		list_del(&group->node);
-	}
-	spin_unlock(&fault->lock);
-	return group;
-}
-
-/* Restore a node back to the head of the fault->deliver list */
-static inline void iommufd_fault_deliver_restore(struct iommufd_fault *fault,
-						 struct iopf_group *group)
-{
-	spin_lock(&fault->lock);
-	list_add(&group->node, &fault->deliver);
-	spin_unlock(&fault->lock);
-}
 
 struct iommufd_attach_handle {
 	struct iommu_attach_handle handle;
···
 /* Convert an iommu attach handle to iommufd handle. */
 #define to_iommufd_handle(hdl) container_of(hdl, struct iommufd_attach_handle, handle)
 
+/*
+ * An iommufd_fault object represents an interface to deliver I/O page faults
+ * to the user space. These objects are created/destroyed by the user space and
+ * associated with hardware page table objects during page-table allocation.
+ */
+struct iommufd_fault {
+	struct iommufd_eventq common;
+	struct mutex mutex; /* serializes response flows */
+	struct xarray response;
+};
+
+static inline struct iommufd_fault *
+eventq_to_fault(struct iommufd_eventq *eventq)
+{
+	return container_of(eventq, struct iommufd_fault, common);
+}
+
 static inline struct iommufd_fault *
 iommufd_get_fault(struct iommufd_ucmd *ucmd, u32 id)
 {
 	return container_of(iommufd_get_object(ucmd->ictx, id,
 					       IOMMUFD_OBJ_FAULT),
-			    struct iommufd_fault, obj);
+			    struct iommufd_fault, common.obj);
 }
 
 int iommufd_fault_alloc(struct iommufd_ucmd *ucmd);
···
 void iommufd_auto_response_faults(struct iommufd_hw_pagetable *hwpt,
 				  struct iommufd_attach_handle *handle);
 
+/* An iommufd_vevent represents a vIOMMU event in an iommufd_veventq */
+struct iommufd_vevent {
+	struct iommufd_vevent_header header;
+	struct list_head node; /* for iommufd_eventq::deliver */
+	ssize_t data_len;
+	u64 event_data[] __counted_by(data_len);
+};
+
+#define vevent_for_lost_events_header(vevent) \
+	(vevent->header.flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)
+
+/*
+ * An iommufd_veventq object represents an interface to deliver vIOMMU events to
+ * the user space. It is created/destroyed by the user space and associated with
+ * a vIOMMU object during the allocations.
+ */
+struct iommufd_veventq {
+	struct iommufd_eventq common;
+	struct iommufd_viommu *viommu;
+	struct list_head node; /* for iommufd_viommu::veventqs */
+	struct iommufd_vevent lost_events_header;
+
+	unsigned int type;
+	unsigned int depth;
+
+	/* Use common.lock for protection */
+	u32 num_events;
+	u32 sequence;
+};
+
+static inline struct iommufd_veventq *
+eventq_to_veventq(struct iommufd_eventq *eventq)
+{
+	return container_of(eventq, struct iommufd_veventq, common);
+}
+
+static inline struct iommufd_veventq *
+iommufd_get_veventq(struct iommufd_ucmd *ucmd, u32 id)
+{
+	return container_of(iommufd_get_object(ucmd->ictx, id,
+					       IOMMUFD_OBJ_VEVENTQ),
+			    struct iommufd_veventq, common.obj);
+}
+
+int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd);
+void iommufd_veventq_destroy(struct iommufd_object *obj);
+void iommufd_veventq_abort(struct iommufd_object *obj);
+
+static inline void iommufd_vevent_handler(struct iommufd_veventq *veventq,
+					  struct iommufd_vevent *vevent)
+{
+	struct iommufd_eventq *eventq = &veventq->common;
+
+	lockdep_assert_held(&eventq->lock);
+
+	/*
+	 * Remove the lost_events_header and add the new node at the same time.
+	 * Note the new node can be lost_events_header, for a sequence update.
+	 */
+	if (list_is_last(&veventq->lost_events_header.node, &eventq->deliver))
+		list_del(&veventq->lost_events_header.node);
+	list_add_tail(&vevent->node, &eventq->deliver);
+	vevent->header.sequence = veventq->sequence;
+	veventq->sequence = (veventq->sequence + 1) & INT_MAX;
+
+	wake_up_interruptible(&eventq->wait_queue);
+}
+
 static inline struct iommufd_viommu *
 iommufd_get_viommu(struct iommufd_ucmd *ucmd, u32 id)
 {
 	return container_of(iommufd_get_object(ucmd->ictx, id,
 					       IOMMUFD_OBJ_VIOMMU),
 			    struct iommufd_viommu, obj);
 }
+
+static inline struct iommufd_veventq *
+iommufd_viommu_find_veventq(struct iommufd_viommu *viommu, u32 type)
+{
+	struct iommufd_veventq *veventq, *next;
+
+	lockdep_assert_held(&viommu->veventqs_rwsem);
+
+	list_for_each_entry_safe(veventq, next, &viommu->veventqs, node) {
+		if (veventq->type == type)
+			return veventq;
+	}
+	return NULL;
+}
 
 int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
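`iommufd_vevent_handler()` above stamps each event with a per-queue sequence number and advances it with `(sequence + 1) & INT_MAX`, so the counter stays a non-negative 31-bit value and wraps to 0 after INT_MAX (which is what lets userspace detect gaps after a LOST_EVENTS marker). A minimal standalone sketch of that update:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Reproduce the vevent sequence update: increment, then mask with
 * INT_MAX so the value wraps from INT_MAX back to 0 instead of going
 * negative. uint32_t avoids signed-overflow UB at the wrap point.
 */
static uint32_t veventq_next_sequence(uint32_t sequence)
{
	return (sequence + 1) & INT_MAX;
}
```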
drivers/iommu/iommufd/iommufd_test.h (+40)
···
 	IOMMU_TEST_OP_MD_CHECK_IOTLB,
 	IOMMU_TEST_OP_TRIGGER_IOPF,
 	IOMMU_TEST_OP_DEV_CHECK_CACHE,
+	IOMMU_TEST_OP_TRIGGER_VEVENT,
+	IOMMU_TEST_OP_PASID_ATTACH,
+	IOMMU_TEST_OP_PASID_REPLACE,
+	IOMMU_TEST_OP_PASID_DETACH,
+	IOMMU_TEST_OP_PASID_CHECK_HWPT,
 };
 
 enum {
···
 enum {
 	MOCK_FLAGS_DEVICE_NO_DIRTY = 1 << 0,
 	MOCK_FLAGS_DEVICE_HUGE_IOVA = 1 << 1,
+	MOCK_FLAGS_DEVICE_PASID = 1 << 2,
 };
 
 enum {
···
 	MOCK_DEV_CACHE_ID_MAX = 3,
 	MOCK_DEV_CACHE_NUM = 4,
 };
+
+/* Reserved for special pasid replace test */
+#define IOMMU_TEST_PASID_RESERVED 1024
 
 struct iommu_test_cmd {
 	__u32 size;
···
 		struct {
 			__u32 id;
 			__u32 cache;
 		} check_dev_cache;
+		struct {
+			__u32 dev_id;
+		} trigger_vevent;
+		struct {
+			__u32 pasid;
+			__u32 pt_id;
+			/* @id is stdev_id */
+		} pasid_attach;
+		struct {
+			__u32 pasid;
+			__u32 pt_id;
+			/* @id is stdev_id */
+		} pasid_replace;
+		struct {
+			__u32 pasid;
+			/* @id is stdev_id */
+		} pasid_detach;
+		struct {
+			__u32 pasid;
+			__u32 hwpt_id;
+			/* @id is stdev_id */
+		} pasid_check;
 	};
 	__u32 last;
 };
 #define IOMMU_TEST_CMD _IO(IOMMUFD_TYPE, IOMMUFD_CMD_BASE + 32)
+
+/* Mock device/iommu PASID width */
+#define MOCK_PASID_WIDTH 20
 
 /* Mock structs for IOMMU_DEVICE_GET_HW_INFO ioctl */
 #define IOMMU_HW_INFO_TYPE_SELFTEST 0xfeedbeef
···
 	__u32 flags;
 	__u32 vdev_id;
 	__u32 cache_id;
 };
+
+#define IOMMU_VEVENTQ_TYPE_SELFTEST 0xbeefbeef
+
+struct iommu_viommu_event_selftest {
+	__u32 virt_id;
+};
 
 #endif
drivers/iommu/iommufd/main.c (+7)
···
 	struct iommu_ioas_unmap unmap;
 	struct iommu_option option;
 	struct iommu_vdevice_alloc vdev;
+	struct iommu_veventq_alloc veventq;
 	struct iommu_vfio_ioas vfio_ioas;
 	struct iommu_viommu_alloc viommu;
 #ifdef CONFIG_IOMMUFD_TEST
···
 	IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option, val64),
 	IOCTL_OP(IOMMU_VDEVICE_ALLOC, iommufd_vdevice_alloc_ioctl,
 		 struct iommu_vdevice_alloc, virt_id),
+	IOCTL_OP(IOMMU_VEVENTQ_ALLOC, iommufd_veventq_alloc,
+		 struct iommu_veventq_alloc, out_veventq_fd),
 	IOCTL_OP(IOMMU_VFIO_IOAS, iommufd_vfio_ioas, struct iommu_vfio_ioas,
 		 __reserved),
 	IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
···
 	},
 	[IOMMUFD_OBJ_VDEVICE] = {
 		.destroy = iommufd_vdevice_destroy,
 	},
+	[IOMMUFD_OBJ_VEVENTQ] = {
+		.destroy = iommufd_veventq_destroy,
+		.abort = iommufd_veventq_abort,
+	},
 	[IOMMUFD_OBJ_VIOMMU] = {
 		.destroy = iommufd_viommu_destroy,
drivers/iommu/iommufd/selftest.c (+275 -22)
···
 struct mock_dev {
 	struct device dev;
+	struct mock_viommu *viommu;
+	struct rw_semaphore viommu_rwsem;
 	unsigned long flags;
+	unsigned long vdev_id;
 	int id;
 	u32 cache[MOCK_DEV_CACHE_NUM];
+	atomic_t pasid_1024_fake_error;
 };
 
 static inline struct mock_dev *to_mock_dev(struct device *dev)
···
 				    struct device *dev)
 {
 	struct mock_dev *mdev = to_mock_dev(dev);
+	struct mock_viommu *new_viommu = NULL;
+	unsigned long vdev_id = 0;
+	int rc;
 
 	if (domain->dirty_ops && (mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY))
 		return -EINVAL;
+
+	iommu_group_mutex_assert(dev);
+	if (domain->type == IOMMU_DOMAIN_NESTED) {
+		new_viommu = to_mock_nested(domain)->mock_viommu;
+		if (new_viommu) {
+			rc = iommufd_viommu_get_vdev_id(&new_viommu->core, dev,
+							&vdev_id);
+			if (rc)
+				return rc;
+		}
+	}
+	if (new_viommu != mdev->viommu) {
+		down_write(&mdev->viommu_rwsem);
+		mdev->viommu = new_viommu;
+		mdev->vdev_id = vdev_id;
+		up_write(&mdev->viommu_rwsem);
+	}
+
+	return 0;
+}
+
+static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
+					 struct device *dev, ioasid_t pasid,
+					 struct iommu_domain *old)
+{
+	struct mock_dev *mdev = to_mock_dev(dev);
+
+	/*
+	 * Per the first attach with pasid 1024, set the
+	 * mdev->pasid_1024_fake_error. Hence the second call of this op
+	 * can fake an error to validate the error path of the core. This
+	 * is helpful to test the case in which the iommu core needs to
+	 * rollback to the old domain due to driver failure. e.g. replace.
+	 * User should be careful about the third call of this op, it shall
+	 * succeed since the mdev->pasid_1024_fake_error is cleared in the
+	 * second call.
+	 */
+	if (pasid == 1024) {
+		if (domain->type == IOMMU_DOMAIN_BLOCKED) {
+			atomic_set(&mdev->pasid_1024_fake_error, 0);
+		} else if (atomic_read(&mdev->pasid_1024_fake_error)) {
+			/*
+			 * Clear the flag, and fake an error to fail the
+			 * replacement.
+			 */
+			atomic_set(&mdev->pasid_1024_fake_error, 0);
+			return -ENOMEM;
+		} else {
+			/* Set the flag to fake an error in next call */
+			atomic_set(&mdev->pasid_1024_fake_error, 1);
+		}
+	}
 
 	return 0;
 }
 
 static const struct iommu_domain_ops mock_blocking_ops = {
 	.attach_dev = mock_domain_nop_attach,
+	.set_dev_pasid = mock_domain_set_dev_pasid_nop
 };
 
 static struct iommu_domain mock_blocking_domain = {
···
 	struct mock_iommu_domain_nested *mock_nested;
 	struct mock_iommu_domain *mock_parent;
 
-	if (flags)
+	if (flags & ~IOMMU_HWPT_ALLOC_PASID)
 		return ERR_PTR(-EOPNOTSUPP);
 	if (!parent || parent->ops != mock_ops.default_domain_ops)
 		return ERR_PTR(-EINVAL);
···
 {
 	bool has_dirty_flag = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
 	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
-				 IOMMU_HWPT_ALLOC_NEST_PARENT;
+				 IOMMU_HWPT_ALLOC_NEST_PARENT |
+				 IOMMU_HWPT_ALLOC_PASID;
 	struct mock_dev *mdev = to_mock_dev(dev);
 	bool no_dirty_ops = mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY;
 	struct mock_iommu_domain *mock;
···
 	struct mock_viommu *mock_viommu = to_mock_viommu(viommu);
 	struct mock_iommu_domain_nested *mock_nested;
 
-	if (flags)
+	if (flags & ~IOMMU_HWPT_ALLOC_PASID)
 		return ERR_PTR(-EOPNOTSUPP);
 
 	mock_nested = __mock_domain_alloc_nested(user_data);
···
 		.map_pages = mock_domain_map_pages,
 		.unmap_pages = mock_domain_unmap_pages,
 		.iova_to_phys = mock_domain_iova_to_phys,
+		.set_dev_pasid = mock_domain_set_dev_pasid_nop,
 	},
 };
···
 	.free = mock_domain_free_nested,
 	.attach_dev = mock_domain_nop_attach,
 	.cache_invalidate_user = mock_domain_cache_invalidate_user,
+	.set_dev_pasid = mock_domain_set_dev_pasid_nop,
 };
 
 static inline struct iommufd_hw_pagetable *
···
 static struct mock_dev *mock_dev_create(unsigned long dev_flags)
 {
+	struct property_entry prop[] = {
+		PROPERTY_ENTRY_U32("pasid-num-bits", 0),
+		{},
+	};
+	const u32 valid_flags = MOCK_FLAGS_DEVICE_NO_DIRTY |
+				MOCK_FLAGS_DEVICE_HUGE_IOVA |
+				MOCK_FLAGS_DEVICE_PASID;
 	struct mock_dev *mdev;
 	int rc, i;
 
-	if (dev_flags &
-	    ~(MOCK_FLAGS_DEVICE_NO_DIRTY | MOCK_FLAGS_DEVICE_HUGE_IOVA))
+	if (dev_flags & ~valid_flags)
 		return ERR_PTR(-EINVAL);
 
 	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
 	if (!mdev)
 		return ERR_PTR(-ENOMEM);
 
+	init_rwsem(&mdev->viommu_rwsem);
 	device_initialize(&mdev->dev);
 	mdev->flags = dev_flags;
 	mdev->dev.release = mock_dev_release;
···
 	rc = dev_set_name(&mdev->dev, "iommufd_mock%u", mdev->id);
 	if (rc)
 		goto err_put;
+
+	if (dev_flags & MOCK_FLAGS_DEVICE_PASID)
+		prop[0] = PROPERTY_ENTRY_U32("pasid-num-bits", MOCK_PASID_WIDTH);
+
+	rc = device_create_managed_software_node(&mdev->dev, prop, NULL);
+	if (rc) {
+		dev_err(&mdev->dev, "add pasid-num-bits property failed, rc: %d", rc);
+		goto err_put;
+	}
 
 	rc = device_add(&mdev->dev);
 	if (rc)
···
 	}
 	sobj->idev.idev = idev;
 
-	rc = iommufd_device_attach(idev, &pt_id);
+	rc = iommufd_device_attach(idev, IOMMU_NO_PASID, &pt_id);
 	if (rc)
 		goto out_unbind;
···
 	return 0;
 
 out_detach:
-	iommufd_device_detach(idev);
+	iommufd_device_detach(idev, IOMMU_NO_PASID);
 out_unbind:
 	iommufd_device_unbind(idev);
 out_mdev:
···
 	return rc;
 }
 
-/* Replace the mock domain with a manually allocated hw_pagetable */
-static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
-					    unsigned int device_id, u32 pt_id,
-					    struct iommu_test_cmd *cmd)
+static struct selftest_obj *
+iommufd_test_get_selftest_obj(struct iommufd_ctx *ictx, u32 id)
 {
 	struct iommufd_object *dev_obj;
 	struct selftest_obj *sobj;
-	int rc;
 
 	/*
 	 * Prefer to use the OBJ_SELFTEST because the destroy_rwsem will ensure
 	 * it doesn't race with detach, which is not allowed.
 	 */
-	dev_obj =
-		iommufd_get_object(ucmd->ictx, device_id, IOMMUFD_OBJ_SELFTEST);
+	dev_obj = iommufd_get_object(ictx, id, IOMMUFD_OBJ_SELFTEST);
 	if (IS_ERR(dev_obj))
-		return PTR_ERR(dev_obj);
+		return ERR_CAST(dev_obj);
 
 	sobj = to_selftest_obj(dev_obj);
 	if (sobj->type != TYPE_IDEV) {
-		rc = -EINVAL;
-		goto out_dev_obj;
+		iommufd_put_object(ictx, dev_obj);
+		return ERR_PTR(-EINVAL);
 	}
+	return sobj;
+}
 
-	rc = iommufd_device_replace(sobj->idev.idev, &pt_id);
+/* Replace the mock domain with a manually allocated hw_pagetable */
+static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd,
+					    unsigned int device_id, u32 pt_id,
+					    struct iommu_test_cmd *cmd)
+{
+	struct selftest_obj *sobj;
+	int rc;
+
+	sobj = iommufd_test_get_selftest_obj(ucmd->ictx, device_id);
+	if (IS_ERR(sobj))
+		return PTR_ERR(sobj);
+
+	rc = iommufd_device_replace(sobj->idev.idev, IOMMU_NO_PASID, &pt_id);
 	if (rc)
-		goto out_dev_obj;
+		goto out_sobj;
 
 	cmd->mock_domain_replace.pt_id = pt_id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 
-out_dev_obj:
-	iommufd_put_object(ucmd->ictx, dev_obj);
+out_sobj:
+	iommufd_put_object(ucmd->ictx, &sobj->obj);
 	return rc;
 }
···
 	return 0;
 }
 
+static int iommufd_test_trigger_vevent(struct iommufd_ucmd *ucmd,
+				       struct iommu_test_cmd *cmd)
+{
+	struct iommu_viommu_event_selftest test = {};
+	struct iommufd_device *idev;
+	struct mock_dev *mdev;
+	int rc = -ENOENT;
+
+	idev = iommufd_get_device(ucmd, cmd->trigger_vevent.dev_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+	mdev = to_mock_dev(idev->dev);
+
+	down_read(&mdev->viommu_rwsem);
+	if (!mdev->viommu || !mdev->vdev_id)
+		goto out_unlock;
+
+	test.virt_id = mdev->vdev_id;
+	rc = iommufd_viommu_report_event(&mdev->viommu->core,
+					 IOMMU_VEVENTQ_TYPE_SELFTEST, &test,
+					 sizeof(test));
+out_unlock:
+	up_read(&mdev->viommu_rwsem);
+	iommufd_put_object(ucmd->ictx, &idev->obj);
+
+	return rc;
+}
+
+static inline struct iommufd_hw_pagetable *
+iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id)
+{
+	struct iommufd_object *pt_obj;
+
+	pt_obj = iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_ANY);
+	if (IS_ERR(pt_obj))
+		return ERR_CAST(pt_obj);
+
+	if (pt_obj->type != IOMMUFD_OBJ_HWPT_NESTED &&
+	    pt_obj->type != IOMMUFD_OBJ_HWPT_PAGING) {
+		iommufd_put_object(ucmd->ictx, pt_obj);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return container_of(pt_obj, struct iommufd_hw_pagetable, obj);
+}
+
+static int iommufd_test_pasid_check_hwpt(struct iommufd_ucmd *ucmd,
+					 struct iommu_test_cmd *cmd)
+{
+	u32 hwpt_id = cmd->pasid_check.hwpt_id;
+	struct iommu_domain *attached_domain;
+	struct iommu_attach_handle *handle;
+	struct iommufd_hw_pagetable *hwpt;
+	struct selftest_obj *sobj;
+	struct mock_dev *mdev;
+	int rc = 0;
+
+	sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+	if (IS_ERR(sobj))
+		return PTR_ERR(sobj);
+
+	mdev = sobj->idev.mock_dev;
+
+	handle = iommu_attach_handle_get(mdev->dev.iommu_group,
+					 cmd->pasid_check.pasid, 0);
+	if (IS_ERR(handle))
+		attached_domain = NULL;
+	else
+		attached_domain = handle->domain;
+
+	/* hwpt_id == 0 means to check if pasid is detached */
+	if (!hwpt_id) {
+		if (attached_domain)
+			rc = -EINVAL;
+		goto out_sobj;
+	}
+
+	hwpt = iommufd_get_hwpt(ucmd, hwpt_id);
+	if (IS_ERR(hwpt)) {
+		rc = PTR_ERR(hwpt);
+		goto out_sobj;
+	}
+
+	if (attached_domain != hwpt->domain)
+		rc = -EINVAL;
+
+	iommufd_put_object(ucmd->ictx, &hwpt->obj);
+out_sobj:
+	iommufd_put_object(ucmd->ictx, &sobj->obj);
+	return rc;
+}
+
+static int iommufd_test_pasid_attach(struct iommufd_ucmd *ucmd,
+				     struct iommu_test_cmd *cmd)
+{
+	struct selftest_obj *sobj;
+	int rc;
+
+	sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+	if (IS_ERR(sobj))
+		return PTR_ERR(sobj);
+
+	rc = iommufd_device_attach(sobj->idev.idev, cmd->pasid_attach.pasid,
+				   &cmd->pasid_attach.pt_id);
+	if (rc)
+		goto out_sobj;
+
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+	if (rc)
+		iommufd_device_detach(sobj->idev.idev,
+				      cmd->pasid_attach.pasid);
+
+out_sobj:
+	iommufd_put_object(ucmd->ictx, &sobj->obj);
+	return rc;
+}
+
+static int iommufd_test_pasid_replace(struct iommufd_ucmd *ucmd,
+				      struct iommu_test_cmd *cmd)
+{
+	struct selftest_obj *sobj;
+	int rc;
+
+	sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+	if (IS_ERR(sobj))
+		return PTR_ERR(sobj);
+
+	rc = iommufd_device_replace(sobj->idev.idev, cmd->pasid_attach.pasid,
+				    &cmd->pasid_attach.pt_id);
+	if (rc)
+		goto out_sobj;
+
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+out_sobj:
+	iommufd_put_object(ucmd->ictx, &sobj->obj);
+	return rc;
+}
+
+static int iommufd_test_pasid_detach(struct iommufd_ucmd *ucmd,
+				     struct iommu_test_cmd *cmd)
+{
+	struct selftest_obj *sobj;
+
+	sobj = iommufd_test_get_selftest_obj(ucmd->ictx, cmd->id);
+	if (IS_ERR(sobj))
+		return PTR_ERR(sobj);
+
+	iommufd_device_detach(sobj->idev.idev, cmd->pasid_detach.pasid);
+	iommufd_put_object(ucmd->ictx, &sobj->obj);
+	return 0;
+}
+
 void iommufd_selftest_destroy(struct iommufd_object *obj)
 {
 	struct selftest_obj *sobj = to_selftest_obj(obj);
 
 	switch (sobj->type) {
 	case TYPE_IDEV:
-		iommufd_device_detach(sobj->idev.idev);
+		iommufd_device_detach(sobj->idev.idev, IOMMU_NO_PASID);
 		iommufd_device_unbind(sobj->idev.idev);
 		mock_dev_destroy(sobj->idev.mock_dev);
 		break;
···
 					  cmd->dirty.flags);
 	case IOMMU_TEST_OP_TRIGGER_IOPF:
 		return iommufd_test_trigger_iopf(ucmd, cmd);
+	case IOMMU_TEST_OP_TRIGGER_VEVENT:
+		return iommufd_test_trigger_vevent(ucmd, cmd);
+	case IOMMU_TEST_OP_PASID_ATTACH:
+		return iommufd_test_pasid_attach(ucmd, cmd);
+	case IOMMU_TEST_OP_PASID_REPLACE:
+		return iommufd_test_pasid_replace(ucmd, cmd);
+	case IOMMU_TEST_OP_PASID_DETACH:
+		return iommufd_test_pasid_detach(ucmd, cmd);
+	case IOMMU_TEST_OP_PASID_CHECK_HWPT:
+		return iommufd_test_pasid_check_hwpt(ucmd, cmd);
 	default:
 		return -EOPNOTSUPP;
 	}
···
 	init_completion(&mock_iommu.complete);
 
 	mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq");
+
mock_iommu.iommu_dev.max_pasids = (1 << MOCK_PASID_WIDTH); 1979 1728 1980 1729 return 0; 1981 1730
drivers/iommu/iommufd/viommu.c (+2)
···
 	viommu->ictx = ucmd->ictx;
 	viommu->hwpt = hwpt_paging;
 	refcount_inc(&viommu->hwpt->common.obj.users);
+	INIT_LIST_HEAD(&viommu->veventqs);
+	init_rwsem(&viommu->veventqs_rwsem);
 	/*
 	 * It is the most likely case that a physical IOMMU is unpluggable. A
 	 * pluggable IOMMU instance (if exists) is responsible for refcounting
drivers/pci/ats.c (+33)
···
 	return (1 << FIELD_GET(PCI_PASID_CAP_WIDTH, supported));
 }
 EXPORT_SYMBOL_GPL(pci_max_pasids);
+
+/**
+ * pci_pasid_status - Check the PASID status
+ * @pdev: PCI device structure
+ *
+ * Returns a negative value when no PASID capability is present.
+ * Otherwise the value of the control register is returned.
+ * Status reported are:
+ *
+ * PCI_PASID_CTRL_ENABLE - PASID enabled
+ * PCI_PASID_CTRL_EXEC - Execute permission enabled
+ * PCI_PASID_CTRL_PRIV - Privileged mode enabled
+ */
+int pci_pasid_status(struct pci_dev *pdev)
+{
+	int pasid;
+	u16 ctrl;
+
+	if (pdev->is_virtfn)
+		pdev = pci_physfn(pdev);
+
+	pasid = pdev->pasid_cap;
+	if (!pasid)
+		return -EINVAL;
+
+	pci_read_config_word(pdev, pasid + PCI_PASID_CTRL, &ctrl);
+
+	ctrl &= PCI_PASID_CTRL_ENABLE | PCI_PASID_CTRL_EXEC |
+		PCI_PASID_CTRL_PRIV;
+
+	return ctrl;
+}
+EXPORT_SYMBOL_GPL(pci_pasid_status);
 #endif /* CONFIG_PCI_PASID */
drivers/vfio/device_cdev.c (+52 -8)
···
 int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
			     struct vfio_device_attach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_attach_iommufd_pt attach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;
 	int ret;

 	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
···
 	if (copy_from_user(&attach, arg, minsz))
 		return -EFAULT;

-	if (attach.argsz < minsz || attach.flags)
+	if (attach.argsz < minsz)
 		return -EINVAL;

+	if (attach.flags & ~VFIO_DEVICE_ATTACH_PASID)
+		return -EINVAL;
+
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID) {
+		if (!device->ops->pasid_attach_ioas)
+			return -EOPNOTSUPP;
+		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
+	}
+
+	if (xend) {
+		if (attach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&attach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
 	mutex_lock(&device->dev_set->lock);
-	ret = device->ops->attach_ioas(device, &attach.pt_id);
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
+		ret = device->ops->pasid_attach_ioas(device,
+						     attach.pasid,
+						     &attach.pt_id);
+	else
+		ret = device->ops->attach_ioas(device, &attach.pt_id);
 	if (ret)
 		goto out_unlock;
···
 int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
			     struct vfio_device_detach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_detach_iommufd_pt detach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;

 	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);

 	if (copy_from_user(&detach, arg, minsz))
 		return -EFAULT;

-	if (detach.argsz < minsz || detach.flags)
+	if (detach.argsz < minsz)
 		return -EINVAL;

+	if (detach.flags & ~VFIO_DEVICE_DETACH_PASID)
+		return -EINVAL;
+
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID) {
+		if (!device->ops->pasid_detach_ioas)
+			return -EOPNOTSUPP;
+		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
+	}
+
+	if (xend) {
+		if (detach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&detach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
 	mutex_lock(&device->dev_set->lock);
-	device->ops->detach_ioas(device);
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
+		device->ops->pasid_detach_ioas(device, detach.pasid);
+	else
+		device->ops->detach_ioas(device);
 	mutex_unlock(&device->dev_set->lock);

 	return 0;
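The attach/detach handlers above use a two-step copy: read the base struct up to `minsz`, inspect `flags`, then copy only the flag-gated tail up to `xend`. A minimal userspace model of that extensible-ioctl-struct pattern follows; the struct, flag, and function names (`demo_attach`, `DEMO_FLAG_PASID`, `demo_parse`) are hypothetical stand-ins, not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical extensible struct, modeled on vfio_device_attach_iommufd_pt */
struct demo_attach {
	unsigned int argsz;
	unsigned int flags;
#define DEMO_FLAG_PASID (1u << 0)
	unsigned int pt_id;
	unsigned int pasid; /* tail field, valid only with DEMO_FLAG_PASID */
};

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Returns 0 on success, a negative errno-style value on failure. */
static int demo_parse(const void *user, size_t user_len,
		      struct demo_attach *out)
{
	size_t minsz = offsetofend(struct demo_attach, pt_id);
	size_t xend = 0;

	if (user_len < minsz)
		return -22; /* -EINVAL */
	/* step 1: copy only the base struct */
	memcpy(out, user, minsz);

	if (out->argsz < minsz)
		return -22;
	if (out->flags & ~DEMO_FLAG_PASID)
		return -22; /* unknown flag */
	if (out->flags & DEMO_FLAG_PASID)
		xend = offsetofend(struct demo_attach, pasid);

	if (xend) {
		if (out->argsz < xend || user_len < xend)
			return -22;
		/* step 2: copy only the flag-gated tail, as the handler does */
		memcpy((char *)out + minsz, (const char *)user + minsz,
		       xend - minsz);
	}
	return 0;
}
```

This mirrors why old userspace keeps working: a caller that never sets the flag is never asked to provide the larger struct.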
drivers/vfio/iommufd.c (+56 -4)
···
 	if (IS_ERR(idev))
 		return PTR_ERR(idev);
 	vdev->iommufd_device = idev;
+	ida_init(&vdev->pasids);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind);

 void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
 {
+	int pasid;
+
 	lockdep_assert_held(&vdev->dev_set->lock);

+	while ((pasid = ida_find_first(&vdev->pasids)) >= 0) {
+		iommufd_device_detach(vdev->iommufd_device, pasid);
+		ida_free(&vdev->pasids, pasid);
+	}
+
 	if (vdev->iommufd_attached) {
-		iommufd_device_detach(vdev->iommufd_device);
+		iommufd_device_detach(vdev->iommufd_device, IOMMU_NO_PASID);
 		vdev->iommufd_attached = false;
 	}
 	iommufd_device_unbind(vdev->iommufd_device);
···
 		return -EINVAL;

 	if (vdev->iommufd_attached)
-		rc = iommufd_device_replace(vdev->iommufd_device, pt_id);
+		rc = iommufd_device_replace(vdev->iommufd_device,
+					    IOMMU_NO_PASID, pt_id);
 	else
-		rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
+		rc = iommufd_device_attach(vdev->iommufd_device,
+					   IOMMU_NO_PASID, pt_id);
 	if (rc)
 		return rc;
 	vdev->iommufd_attached = true;
···
 	if (WARN_ON(!vdev->iommufd_device) || !vdev->iommufd_attached)
 		return;

-	iommufd_device_detach(vdev->iommufd_device);
+	iommufd_device_detach(vdev->iommufd_device, IOMMU_NO_PASID);
 	vdev->iommufd_attached = false;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
+
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id)
+{
+	int rc;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return -EINVAL;
+
+	if (ida_exists(&vdev->pasids, pasid))
+		return iommufd_device_replace(vdev->iommufd_device,
+					      pasid, pt_id);
+
+	rc = ida_alloc_range(&vdev->pasids, pasid, pasid, GFP_KERNEL);
+	if (rc < 0)
+		return rc;
+
+	rc = iommufd_device_attach(vdev->iommufd_device, pasid, pt_id);
+	if (rc)
+		ida_free(&vdev->pasids, pasid);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_attach_ioas);
+
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return;
+
+	if (!ida_exists(&vdev->pasids, pasid))
+		return;
+
+	iommufd_device_detach(vdev->iommufd_device, pasid);
+	ida_free(&vdev->pasids, pasid);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_detach_ioas);

 /*
  * The emulated standard ops mean that vfio_device is going to use the
drivers/vfio/pci/vfio_pci.c (+2)
···
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
+	.pasid_attach_ioas	= vfio_iommufd_physical_pasid_attach_ioas,
+	.pasid_detach_ioas	= vfio_iommufd_physical_pasid_detach_ioas,
 };

 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
include/linux/idr.h (+11)
···
 int ida_alloc_range(struct ida *, unsigned int min, unsigned int max, gfp_t);
 void ida_free(struct ida *, unsigned int id);
 void ida_destroy(struct ida *ida);
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max);

 /**
  * ida_alloc() - Allocate an unused ID.
···
 static inline bool ida_is_empty(const struct ida *ida)
 {
 	return xa_empty(&ida->xa);
 }
+
+static inline bool ida_exists(struct ida *ida, unsigned int id)
+{
+	return ida_find_first_range(ida, id, id) == id;
+}
+
+static inline int ida_find_first(struct ida *ida)
+{
+	return ida_find_first_range(ida, 0, ~0);
+}
 #endif /* __IDR_H__ */
include/linux/iommu.h (+15 -20)
···
 struct notifier_block;
 struct iommu_sva;
 struct iommu_dma_cookie;
+struct iommu_dma_msi_cookie;
 struct iommu_fault_param;
 struct iommufd_ctx;
 struct iommufd_viommu;
···
 	bool force_aperture;       /* DMA only allowed in mappable range? */
 };

+enum iommu_domain_cookie_type {
+	IOMMU_COOKIE_NONE,
+	IOMMU_COOKIE_DMA_IOVA,
+	IOMMU_COOKIE_DMA_MSI,
+	IOMMU_COOKIE_FAULT_HANDLER,
+	IOMMU_COOKIE_SVA,
+	IOMMU_COOKIE_IOMMUFD,
+};
+
 /* Domain feature flags */
 #define __IOMMU_DOMAIN_PAGING	(1U << 0)  /* Support for iommu_map/unmap */
 #define __IOMMU_DOMAIN_DMA_API	(1U << 1)  /* Domain for use in DMA-API
···
 struct iommu_domain {
 	unsigned type;
+	enum iommu_domain_cookie_type cookie_type;
 	const struct iommu_domain_ops *ops;
 	const struct iommu_dirty_ops *dirty_ops;
 	const struct iommu_ops *owner; /* Whose domain_alloc we came from */
 	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
 	struct iommu_domain_geometry geometry;
-	struct iommu_dma_cookie *iova_cookie;
 	int (*iopf_handler)(struct iopf_group *group);

-#if IS_ENABLED(CONFIG_IRQ_MSI_IOMMU)
-	int (*sw_msi)(struct iommu_domain *domain, struct msi_desc *desc,
-		      phys_addr_t msi_addr);
-#endif
-
-	union { /* Pointer usable by owner of the domain */
-		struct iommufd_hw_pagetable *iommufd_hwpt; /* iommufd */
-	};
-	union { /* Fault handler */
+	union { /* cookie */
+		struct iommu_dma_cookie *iova_cookie;
+		struct iommu_dma_msi_cookie *msi_cookie;
+		struct iommufd_hw_pagetable *iommufd_hwpt;
 		struct {
 			iommu_fault_handler_t handler;
 			void *handler_token;
···
 		};
 	};
 };
-
-static inline void iommu_domain_set_sw_msi(
-	struct iommu_domain *domain,
-	int (*sw_msi)(struct iommu_domain *domain, struct msi_desc *desc,
-		      phys_addr_t msi_addr))
-{
-#if IS_ENABLED(CONFIG_IRQ_MSI_IOMMU)
-	domain->sw_msi = sw_msi;
-#endif
-}

 static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
 {
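The "cookie_type" change above collapses several per-owner pointers into one union discriminated by an enum, so code can check which owner's data is live before touching it. A generic userspace sketch of that tagged-union pattern (all `demo_*` names are illustrative, not the kernel's):

```c
#include <assert.h>

/* Discriminant: says which union member below is live */
enum demo_cookie_type {
	DEMO_COOKIE_NONE,
	DEMO_COOKIE_DMA_IOVA,
	DEMO_COOKIE_FAULT_HANDLER,
};

struct demo_domain {
	enum demo_cookie_type cookie_type;
	union {
		void *iova_cookie;		/* DEMO_COOKIE_DMA_IOVA */
		struct {			/* DEMO_COOKIE_FAULT_HANDLER */
			int (*handler)(int fault);
			void *token;
		};
	};
};

static int demo_handler(int fault)
{
	return fault * 2;
}

/* Dispatch on the discriminant instead of guessing which field is valid. */
static int demo_dispatch(struct demo_domain *d, int fault)
{
	if (d->cookie_type != DEMO_COOKIE_FAULT_HANDLER)
		return -1;
	return d->handler(fault);
}
```

The benefit mirrors the patch: a single check of `cookie_type` replaces ad-hoc knowledge of "who owns this domain", and the overlapping storage costs nothing extra.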
include/linux/iommufd.h (+29 -3)
···
 #include <linux/err.h>
 #include <linux/errno.h>
+#include <linux/iommu.h>
 #include <linux/refcount.h>
 #include <linux/types.h>
 #include <linux/xarray.h>
+#include <uapi/linux/iommufd.h>

 struct device;
 struct file;
···
 	IOMMUFD_OBJ_FAULT,
 	IOMMUFD_OBJ_VIOMMU,
 	IOMMUFD_OBJ_VDEVICE,
+	IOMMUFD_OBJ_VEVENTQ,
 #ifdef CONFIG_IOMMUFD_TEST
 	IOMMUFD_OBJ_SELFTEST,
 #endif
···
 		  struct device *dev, u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);

-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);
-int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id);
-void iommufd_device_detach(struct iommufd_device *idev);
+int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid,
+			  u32 *pt_id);
+int iommufd_device_replace(struct iommufd_device *idev, ioasid_t pasid,
+			   u32 *pt_id);
+void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid);

 struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev);
 u32 iommufd_device_to_id(struct iommufd_device *idev);
···
 	const struct iommufd_viommu_ops *ops;

 	struct xarray vdevs;
+	struct list_head veventqs;
+	struct rw_semaphore veventqs_rwsem;

 	unsigned int type;
 };
···
 			      enum iommufd_object_type type);
 struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 				       unsigned long vdev_id);
+int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
+			       struct device *dev, unsigned long *vdev_id);
+int iommufd_viommu_report_event(struct iommufd_viommu *viommu,
+				enum iommu_veventq_type type, void *event_data,
+				size_t data_len);
 #else /* !CONFIG_IOMMUFD_DRIVER_CORE */
 static inline struct iommufd_object *
 _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size,
···
 iommufd_viommu_find_dev(struct iommufd_viommu *viommu, unsigned long vdev_id)
 {
 	return NULL;
+}
+
+static inline int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
+					     struct device *dev,
+					     unsigned long *vdev_id)
+{
+	return -ENOENT;
+}
+
+static inline int iommufd_viommu_report_event(struct iommufd_viommu *viommu,
+					      enum iommu_veventq_type type,
+					      void *event_data, size_t data_len)
+{
+	return -EOPNOTSUPP;
 }
 #endif /* CONFIG_IOMMUFD_DRIVER_CORE */
include/linux/pci-ats.h (+3)
···
 void pci_disable_pasid(struct pci_dev *pdev);
 int pci_pasid_features(struct pci_dev *pdev);
 int pci_max_pasids(struct pci_dev *pdev);
+int pci_pasid_status(struct pci_dev *pdev);
 #else /* CONFIG_PCI_PASID */
 static inline int pci_enable_pasid(struct pci_dev *pdev, int features)
 { return -EINVAL; }
···
 static inline int pci_pasid_features(struct pci_dev *pdev)
 { return -EINVAL; }
 static inline int pci_max_pasids(struct pci_dev *pdev)
+{ return -EINVAL; }
+static inline int pci_pasid_status(struct pci_dev *pdev)
 { return -EINVAL; }
 #endif /* CONFIG_PCI_PASID */
include/linux/vfio.h (+14)
···
 	struct inode *inode;
 #if IS_ENABLED(CONFIG_IOMMUFD)
 	struct iommufd_device *iommufd_device;
+	struct ida pasids;
 	u8 iommufd_attached:1;
 #endif
 	u8 cdev_opened:1;
···
 *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
 *		 called.
 * @detach_ioas: Opposite of attach_ioas
+ * @pasid_attach_ioas: The pasid variation of attach_ioas
+ * @pasid_detach_ioas: Opposite of pasid_attach_ioas
 * @open_device: Called when the first file descriptor is opened for this device
 * @close_device: Opposite of open_device
 * @read: Perform read(2) on device file descriptor
···
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
 	void	(*detach_ioas)(struct vfio_device *vdev);
+	int	(*pasid_attach_ioas)(struct vfio_device *vdev, u32 pasid,
+				     u32 *pt_id);
+	void	(*pasid_detach_ioas)(struct vfio_device *vdev, u32 pasid);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
···
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id);
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
···
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #define vfio_iommufd_physical_detach_ioas \
 	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_physical_pasid_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 pasid, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_pasid_detach_ioas \
+	((void (*)(struct vfio_device *vdev, u32 pasid)) NULL)
 #define vfio_iommufd_emulated_bind \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \
 		  u32 *out_device_id)) NULL)
include/uapi/linux/iommufd.h (+128 -1)
···
 	IOMMUFD_CMD_VIOMMU_ALLOC = 0x90,
 	IOMMUFD_CMD_VDEVICE_ALLOC = 0x91,
 	IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
+	IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
 };

 /**
···
 *				Any domain attached to the non-PASID part of the
 *				device must also be flagged, otherwise attaching a
 *				PASID will be blocked.
+ *				For a user that wants to attach PASIDs, an IOAS is
+ *				not recommended for either the non-PASID part or
+ *				the PASID part of the device.
 *				If IOMMU does not support PASID it will return
 *				error (-EOPNOTSUPP).
 */
···
 *				   IOMMU_HWPT_GET_DIRTY_BITMAP
 *				   IOMMU_HWPT_SET_DIRTY_TRACKING
 *
+ * @IOMMU_HW_CAP_PCI_PASID_EXEC: Execute Permission Supported; the user ignores
+ *				 it when struct iommu_hw_info::out_max_pasid_log2
+ *				 is zero.
+ * @IOMMU_HW_CAP_PCI_PASID_PRIV: Privileged Mode Supported; the user ignores it
+ *				 when struct iommu_hw_info::out_max_pasid_log2
+ *				 is zero.
 */
 enum iommufd_hw_capabilities {
 	IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0,
+	IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1,
+	IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2,
 };

 /**
···
 *		     iommu_hw_info_type.
 * @out_capabilities: Output the generic iommu capability info type as defined
 *		      in the enum iommu_hw_capabilities.
+ * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support.
+ *			PCI devices turn to out_capabilities to check whether a
+ *			specific capability is supported or not.
 * @__reserved: Must be 0
 *
 * Query an iommu type specific hardware information data from an iommu behind
···
 	__u32 data_len;
 	__aligned_u64 data_uptr;
 	__u32 out_data_type;
-	__u32 __reserved;
+	__u8 out_max_pasid_log2;
+	__u8 __reserved[3];
 	__aligned_u64 out_capabilities;
 };
 #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO)
···
 #define IOMMU_IOAS_CHANGE_PROCESS \
 	_IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_CHANGE_PROCESS)

+/**
+ * enum iommu_veventq_flag - flag for struct iommufd_vevent_header
+ * @IOMMU_VEVENTQ_FLAG_LOST_EVENTS: vEVENTQ has lost vEVENTs
+ */
+enum iommu_veventq_flag {
+	IOMMU_VEVENTQ_FLAG_LOST_EVENTS = (1U << 0),
+};
+
+/**
+ * struct iommufd_vevent_header - Virtual Event Header for a vEVENTQ Status
+ * @flags: Combination of enum iommu_veventq_flag
+ * @sequence: The sequence index of a vEVENT in the vEVENTQ, with a range of
+ *            [0, INT_MAX] where the following index of INT_MAX is 0
+ *
+ * Each iommufd_vevent_header reports a sequence index of the following vEVENT:
+ *
+ * +----------------------+-------+----------------------+-------+---+-------+
+ * | header0 {sequence=0} | data0 | header1 {sequence=1} | data1 |...| dataN |
+ * +----------------------+-------+----------------------+-------+---+-------+
+ *
+ * And this sequence index is expected to be monotonic to the sequence index of
+ * the previous vEVENT. If two adjacent sequence indexes have a delta larger
+ * than 1, it means that delta - 1 vEVENTs have been lost, e.g. two lost
+ * vEVENTs:
+ *
+ * +-----+----------------------+-------+----------------------+-------+-----+
+ * | ... | header3 {sequence=3} | data3 | header6 {sequence=6} | data6 | ... |
+ * +-----+----------------------+-------+----------------------+-------+-----+
+ *
+ * If a vEVENT is lost at the tail of the vEVENTQ and there is no following
+ * vEVENT providing the next sequence index, an IOMMU_VEVENTQ_FLAG_LOST_EVENTS
+ * header would be added to the tail, and no data would follow this header:
+ *
+ * +--+----------------------+-------+-----------------------------------------+
+ * |..| header3 {sequence=3} | data3 | header4 {flags=LOST_EVENTS, sequence=4} |
+ * +--+----------------------+-------+-----------------------------------------+
+ */
+struct iommufd_vevent_header {
+	__u32 flags;
+	__u32 sequence;
+};
+
+/**
+ * enum iommu_veventq_type - Virtual Event Queue Type
+ * @IOMMU_VEVENTQ_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_VEVENTQ_TYPE_ARM_SMMUV3: ARM SMMUv3 Virtual Event Queue
+ */
+enum iommu_veventq_type {
+	IOMMU_VEVENTQ_TYPE_DEFAULT = 0,
+	IOMMU_VEVENTQ_TYPE_ARM_SMMUV3 = 1,
+};
+
+/**
+ * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
+ *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
+ * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
+ *       Reported event records: (Refer to "7.3 Event records" in SMMUv3 HW Spec)
+ *       - 0x04 C_BAD_STE
+ *       - 0x06 F_STREAM_DISABLED
+ *       - 0x08 C_BAD_SUBSTREAMID
+ *       - 0x0a C_BAD_CD
+ *       - 0x10 F_TRANSLATION
+ *       - 0x11 F_ADDR_SIZE
+ *       - 0x12 F_ACCESS
+ *       - 0x13 F_PERMISSION
+ *
+ * The StreamID field reports a virtual device ID. To receive a virtual event
+ * for a device, a vDEVICE must be allocated via IOMMU_VDEVICE_ALLOC.
+ */
+struct iommu_vevent_arm_smmuv3 {
+	__aligned_le64 evt[4];
+};
+
+/**
+ * struct iommu_veventq_alloc - ioctl(IOMMU_VEVENTQ_ALLOC)
+ * @size: sizeof(struct iommu_veventq_alloc)
+ * @flags: Must be 0
+ * @viommu_id: virtual IOMMU ID to associate the vEVENTQ with
+ * @type: Type of the vEVENTQ. Must be defined in enum iommu_veventq_type
+ * @veventq_depth: Maximum number of events in the vEVENTQ
+ * @out_veventq_id: The ID of the new vEVENTQ
+ * @out_veventq_fd: The fd of the new vEVENTQ. User space must close the
+ *                  successfully returned fd after using it
+ * @__reserved: Must be 0
+ *
+ * Explicitly allocate a virtual event queue interface for a vIOMMU. A vIOMMU
+ * can have multiple FDs for different types, but is confined to one per @type.
+ * User space should open the @out_veventq_fd to read vEVENTs out of a vEVENTQ,
+ * if there are vEVENTs available. A vEVENTQ will lose events due to overflow,
+ * if the number of the vEVENTs hits @veventq_depth.
+ *
+ * Each vEVENT in a vEVENTQ encloses a struct iommufd_vevent_header followed by
+ * a type-specific data structure, in a normal case:
+ *
+ * +-+---------+-------+---------+-------+-----+---------+-------+-+
+ * | | header0 | data0 | header1 | data1 | ... | headerN | dataN | |
+ * +-+---------+-------+---------+-------+-----+---------+-------+-+
+ *
+ * unless a trailing IOMMU_VEVENTQ_FLAG_LOST_EVENTS header is logged (refer to
+ * struct iommufd_vevent_header).
+ */
+struct iommu_veventq_alloc {
+	__u32 size;
+	__u32 flags;
+	__u32 viommu_id;
+	__u32 type;
+	__u32 veventq_depth;
+	__u32 out_veventq_id;
+	__u32 out_veventq_fd;
+	__u32 __reserved;
+};
+#define IOMMU_VEVENTQ_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VEVENTQ_ALLOC)
 #endif
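The vEVENTQ uapi above specifies a byte stream of {header, data} records with a monotonic sequence in [0, INT_MAX] that wraps back to 0, where a sequence gap of delta means delta - 1 events were lost and a LOST_EVENTS header carries no data. A userspace sketch of consuming such a stream follows; `demo_walk`, `DEMO_DATA_LEN`, and the fixed per-event data size are simplifying assumptions (a real reader would size the data off the vEVENTQ type, e.g. sizeof(struct iommu_vevent_arm_smmuv3)):

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct iommufd_vevent_header from the uapi above */
struct vevent_header {
	uint32_t flags;
	uint32_t sequence;
};
#define VEVENT_FLAG_LOST_EVENTS (1u << 0)
#define DEMO_DATA_LEN 32 /* assumed fixed record size for this sketch */

/* Events lost between two sequence numbers, with wrap after INT_MAX. */
static long long lost_between(uint32_t prev, uint32_t cur)
{
	long long delta = (long long)cur - (long long)prev;

	if (delta <= 0) /* wrapped past INT_MAX back toward 0 */
		delta += (long long)INT_MAX + 1;
	return delta - 1;
}

/*
 * Walk a buffer read() from the vEVENTQ fd; return the total number of
 * lost events, or -1 on a truncated record. prev_seq is the last
 * sequence seen before this buffer (so the first record counts too).
 */
static long long demo_walk(const uint8_t *buf, size_t len, uint32_t prev_seq)
{
	long long lost = 0;
	size_t off = 0;

	while (off + sizeof(struct vevent_header) <= len) {
		const struct vevent_header *h =
			(const struct vevent_header *)(buf + off);

		lost += lost_between(prev_seq, h->sequence);
		prev_seq = h->sequence;
		off += sizeof(*h);

		if (h->flags & VEVENT_FLAG_LOST_EVENTS)
			continue; /* a LOST_EVENTS header carries no data */
		if (off + DEMO_DATA_LEN > len)
			return -1; /* truncated record */
		off += DEMO_DATA_LEN;
	}
	return lost;
}
```

For example, seeing sequence 3 then sequence 6 reports two lost events, matching the header3/header6 diagram in the uapi comment.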
include/uapi/linux/vfio.h (+19 -10)
···
 * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19,
 *					struct vfio_device_attach_iommufd_pt)
 * @argsz: User filled size of this data.
- * @flags: Must be 0.
+ * @flags: Flags for attach.
 * @pt_id: Input the target id which can represent an ioas or a hwpt
 *	   allocated via iommufd subsystem.
 *	   Output the input ioas id or the attached hwpt id which could
 *	   be the specified hwpt itself or a hwpt automatically created
 *	   for the specified ioas by kernel during the attachment.
+ * @pasid: The pasid to be attached, only meaningful when
+ *	   VFIO_DEVICE_ATTACH_PASID is set in @flags
 *
 * Associate the device with an address space within the bound iommufd.
 * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. This is only
 * allowed on cdev fds.
 *
- * If a vfio device is currently attached to a valid hw_pagetable, without doing
- * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl
- * passing in another hw_pagetable (hwpt) id is allowed. This action, also known
- * as a hw_pagetable replacement, will replace the device's currently attached
- * hw_pagetable with a new hw_pagetable corresponding to the given pt_id.
+ * If a vfio device or a pasid of this device is currently attached to a valid
+ * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed.
+ * This action, also known as a hw_pagetable replacement, will replace the
+ * currently attached hwpt of the device or the pasid of this device with a new
+ * hwpt corresponding to the given pt_id.
 *
 * Return: 0 on success, -errno on failure.
 */
 struct vfio_device_attach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_ATTACH_PASID	(1 << 0)
 	__u32	pt_id;
+	__u32	pasid;
 };

 #define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 19)
···
 * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
 *					struct vfio_device_detach_iommufd_pt)
 * @argsz: User filled size of this data.
- * @flags: Must be 0.
+ * @flags: Flags for detach.
+ * @pasid: The pasid to be detached, only meaningful when
+ *	   VFIO_DEVICE_DETACH_PASID is set in @flags
 *
- * Remove the association of the device and its current associated address
- * space. After it, the device should be in a blocking DMA state. This is only
- * allowed on cdev fds.
+ * Remove the association of the device or a pasid of the device and its current
+ * associated address space. After it, the device or the pasid should be in a
+ * blocking DMA state. This is only allowed on cdev fds.
 *
 * Return: 0 on success, -errno on failure.
 */
 struct vfio_device_detach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_DETACH_PASID	(1 << 0)
 	__u32	pasid;
 };

 #define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)
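From the VMM side, the extended struct above is filled with `argsz` covering the new `pasid` tail whenever the flag is set. A sketch of that fill logic (the struct mirrors the uapi above; `demo_fill` is a hypothetical helper and the actual ioctl(2) call is omitted since it needs a real cdev fd):

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct vfio_device_attach_iommufd_pt from the uapi above */
struct vfio_device_attach_iommufd_pt {
	uint32_t argsz;
	uint32_t flags;
#define VFIO_DEVICE_ATTACH_PASID	(1 << 0)
	uint32_t pt_id;
	uint32_t pasid;
};

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Prepare the ioctl argument; argsz tells the kernel how far to read. */
static struct vfio_device_attach_iommufd_pt
demo_fill(uint32_t pt_id, int use_pasid, uint32_t pasid)
{
	struct vfio_device_attach_iommufd_pt attach = {
		.argsz = sizeof(attach), /* covers the pasid tail */
		.pt_id = pt_id,
	};

	if (use_pasid) {
		attach.flags = VFIO_DEVICE_ATTACH_PASID;
		attach.pasid = pasid;
	}
	return attach;
}
```

An old kernel that does not know the flag rejects it via the `flags & ~VFIO_DEVICE_ATTACH_PASID` check, so userspace gets -EINVAL rather than silent misbehavior.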
lib/idr.c (+67)
···
 EXPORT_SYMBOL(ida_alloc_range);

 /**
+ * ida_find_first_range - Get the lowest used ID.
+ * @ida: IDA handle.
+ * @min: Lowest ID to get.
+ * @max: Highest ID to get.
+ *
+ * Get the lowest used ID between @min and @max, inclusive. The returned
+ * ID will not exceed %INT_MAX, even if @max is larger.
+ *
+ * Context: Any context. Takes and releases the xa_lock.
+ * Return: The lowest used ID, or errno if no used ID is found.
+ */
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max)
+{
+	unsigned long index = min / IDA_BITMAP_BITS;
+	unsigned int offset = min % IDA_BITMAP_BITS;
+	unsigned long *addr, size, bit;
+	unsigned long tmp = 0;
+	unsigned long flags;
+	void *entry;
+	int ret;
+
+	if ((int)min < 0)
+		return -EINVAL;
+	if ((int)max < 0)
+		max = INT_MAX;
+
+	xa_lock_irqsave(&ida->xa, flags);
+
+	entry = xa_find(&ida->xa, &index, max / IDA_BITMAP_BITS, XA_PRESENT);
+	if (!entry) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (index > min / IDA_BITMAP_BITS)
+		offset = 0;
+	if (index * IDA_BITMAP_BITS + offset > max) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (xa_is_value(entry)) {
+		tmp = xa_to_value(entry);
+		addr = &tmp;
+		size = BITS_PER_XA_VALUE;
+	} else {
+		addr = ((struct ida_bitmap *)entry)->bitmap;
+		size = IDA_BITMAP_BITS;
+	}
+
+	bit = find_next_bit(addr, size, offset);
+
+	xa_unlock_irqrestore(&ida->xa, flags);
+
+	if (bit == size ||
+	    index * IDA_BITMAP_BITS + bit > max)
+		return -ENOENT;
+
+	return index * IDA_BITMAP_BITS + bit;
+
+err_unlock:
+	xa_unlock_irqrestore(&ida->xa, flags);
+	return ret;
+}
+EXPORT_SYMBOL(ida_find_first_range);
+
+/**
 * ida_free() - Release an allocated ID.
 * @ida: IDA handle.
 * @id: Previously allocated ID.
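The contract of ida_find_first_range() above is simple even though the xarray walk is not: return the lowest set ID in the inclusive [min, max] window, -EINVAL for a negative min, -ENOENT when the window holds no ID. A userspace model over a flat bitmap makes the semantics easy to check (all `demo_*` names and `DEMO_BITS` are invented for this sketch; the real function scans one IDA bitmap chunk at a time under the xa_lock):

```c
#include <limits.h>

#define DEMO_BITS 2048
#define DEMO_ENOENT (-2)
#define DEMO_EINVAL (-22)
#define BITS_PER_WORD (sizeof(unsigned long) * 8)

static unsigned long demo_map[DEMO_BITS / BITS_PER_WORD];

/* Mark an ID as used, like a successful ida_alloc_range(min == max). */
static void demo_set(unsigned int id)
{
	demo_map[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

/* Same contract as ida_find_first_range(): inclusive [min, max]. */
static int demo_find_first_range(unsigned int min, unsigned int max)
{
	if ((int)min < 0)
		return DEMO_EINVAL;
	if (max >= DEMO_BITS)
		max = DEMO_BITS - 1; /* model's capacity stands in for INT_MAX */

	for (unsigned int id = min; id <= max; id++)
		if (demo_map[id / BITS_PER_WORD] &
		    (1UL << (id % BITS_PER_WORD)))
			return (int)id;
	return DEMO_ENOENT; /* also covers min > max: the loop never runs */
}
```

This is the behavior vfio relies on: `ida_exists(ida, id)` is just `ida_find_first_range(ida, id, id) == id`, and `ida_find_first()` widens the window to the whole space.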
+70
lib/test_ida.c
···
189 189 	IDA_BUG_ON(ida, !ida_is_empty(ida));
190 190 }
191 191 
192 + /*
193 +  * Check ida_find_first_range() and variants.
194 +  */
195 + static void ida_check_find_first(struct ida *ida)
196 + {
197 + 	/* IDA is empty; none of the IDs below should exist */
198 + 	IDA_BUG_ON(ida, ida_exists(ida, 0));
199 + 	IDA_BUG_ON(ida, ida_exists(ida, 3));
200 + 	IDA_BUG_ON(ida, ida_exists(ida, 63));
201 + 	IDA_BUG_ON(ida, ida_exists(ida, 1023));
202 + 	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
203 + 
204 + 	/* IDA contains a single value entry */
205 + 	IDA_BUG_ON(ida, ida_alloc_min(ida, 3, GFP_KERNEL) != 3);
206 + 	IDA_BUG_ON(ida, ida_exists(ida, 0));
207 + 	IDA_BUG_ON(ida, !ida_exists(ida, 3));
208 + 	IDA_BUG_ON(ida, ida_exists(ida, 63));
209 + 	IDA_BUG_ON(ida, ida_exists(ida, 1023));
210 + 	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
211 + 
212 + 	IDA_BUG_ON(ida, ida_alloc_min(ida, 63, GFP_KERNEL) != 63);
213 + 	IDA_BUG_ON(ida, ida_exists(ida, 0));
214 + 	IDA_BUG_ON(ida, !ida_exists(ida, 3));
215 + 	IDA_BUG_ON(ida, !ida_exists(ida, 63));
216 + 	IDA_BUG_ON(ida, ida_exists(ida, 1023));
217 + 	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
218 + 
219 + 	/* IDA contains a single bitmap */
220 + 	IDA_BUG_ON(ida, ida_alloc_min(ida, 1023, GFP_KERNEL) != 1023);
221 + 	IDA_BUG_ON(ida, ida_exists(ida, 0));
222 + 	IDA_BUG_ON(ida, !ida_exists(ida, 3));
223 + 	IDA_BUG_ON(ida, !ida_exists(ida, 63));
224 + 	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
225 + 	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
226 + 
227 + 	/* IDA contains a tree */
228 + 	IDA_BUG_ON(ida, ida_alloc_min(ida, (1 << 20) - 1, GFP_KERNEL) != (1 << 20) - 1);
229 + 	IDA_BUG_ON(ida, ida_exists(ida, 0));
230 + 	IDA_BUG_ON(ida, !ida_exists(ida, 3));
231 + 	IDA_BUG_ON(ida, !ida_exists(ida, 63));
232 + 	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
233 + 	IDA_BUG_ON(ida, !ida_exists(ida, (1 << 20) - 1));
234 + 
235 + 	/* Now try to find first */
236 + 	IDA_BUG_ON(ida, ida_find_first(ida) != 3);
237 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, -1, 2) != -EINVAL);
238 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 2) != -ENOENT); // no used ID
239 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 3) != 3);
240 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 1, 3) != 3);
241 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 3, 3) != 3);
242 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 2, 4) != 3);
243 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 3) != -ENOENT); // min > max, fail
244 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 60) != -ENOENT); // no used ID
245 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 64) != 63);
246 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 63, 63) != 63);
247 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 64, 1026) != 1023);
248 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, 1023) != 1023);
249 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, (1 << 20) - 1) != 1023);
250 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, 1024, (1 << 20) - 1) != (1 << 20) - 1);
251 + 	IDA_BUG_ON(ida, ida_find_first_range(ida, (1 << 20), INT_MAX) != -ENOENT);
252 + 
253 + 	ida_free(ida, 3);
254 + 	ida_free(ida, 63);
255 + 	ida_free(ida, 1023);
256 + 	ida_free(ida, (1 << 20) - 1);
257 + 
258 + 	IDA_BUG_ON(ida, !ida_is_empty(ida));
259 + }
260 + 
192 261 static DEFINE_IDA(ida);
193 262 
194 263 static int ida_checks(void)
···
271 202 	ida_check_max(&ida);
272 203 	ida_check_conv(&ida);
273 204 	ida_check_bad_free(&ida);
205 + 	ida_check_find_first(&ida);
274 206 
275 207 	printk("IDA: %u of %u tests passed\n", tests_passed, tests_run);
276 208 	return (tests_run != tests_passed) ? 0 : -EINVAL;
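The hunk above pins down the return-value contract of the new ida_find_first_range(): -EINVAL for a negative bound, -ENOENT when no allocated ID falls in [min, max], otherwise the lowest allocated ID in the range. As a rough userspace illustration of that contract (hypothetical toy_* helpers over a flat array — the real IDA stores IDs in an xarray of bitmaps, so only the observable behavior is mirrored):

```c
#include <errno.h>

/*
 * Hypothetical flat-array model of the ida_find_first_range() return
 * contract exercised by the selftest: -EINVAL for a negative bound,
 * -ENOENT when no ID in [min, max] is in use, otherwise the lowest
 * in-use ID. Not the kernel implementation.
 */
#define TOY_IDA_LIMIT 2048

static unsigned char toy_used[TOY_IDA_LIMIT];	/* 1 = ID allocated */

static void toy_alloc(int id)
{
	toy_used[id] = 1;
}

static int toy_find_first_range(int min, int max)
{
	int id;

	if (min < 0 || max < 0)
		return -EINVAL;
	/* min > max simply scans nothing and reports -ENOENT */
	for (id = min; id <= max && id < TOY_IDA_LIMIT; id++)
		if (toy_used[id])
			return id;
	return -ENOENT;
}
```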
+365
tools/testing/selftests/iommu/iommufd.c
··· 342 342 uint32_t hwpt_id; 343 343 uint32_t device_id; 344 344 uint64_t base_iova; 345 + uint32_t device_pasid_id; 345 346 }; 346 347 347 348 FIXTURE_VARIANT(iommufd_ioas) 348 349 { 349 350 unsigned int mock_domains; 350 351 unsigned int memory_limit; 352 + bool pasid_capable; 351 353 }; 352 354 353 355 FIXTURE_SETUP(iommufd_ioas) ··· 374 372 IOMMU_TEST_DEV_CACHE_DEFAULT); 375 373 self->base_iova = MOCK_APERTURE_START; 376 374 } 375 + 376 + if (variant->pasid_capable) 377 + test_cmd_mock_domain_flags(self->ioas_id, 378 + MOCK_FLAGS_DEVICE_PASID, 379 + NULL, NULL, 380 + &self->device_pasid_id); 377 381 } 378 382 379 383 FIXTURE_TEARDOWN(iommufd_ioas) ··· 395 387 FIXTURE_VARIANT_ADD(iommufd_ioas, mock_domain) 396 388 { 397 389 .mock_domains = 1, 390 + .pasid_capable = true, 398 391 }; 399 392 400 393 FIXTURE_VARIANT_ADD(iommufd_ioas, two_mock_domain) ··· 447 438 test_err_hwpt_alloc(ENOENT, self->ioas_id, self->device_id, 0, 448 439 &test_hwpt_id); 449 440 test_err_hwpt_alloc(EINVAL, self->device_id, self->device_id, 0, 441 + &test_hwpt_id); 442 + test_err_hwpt_alloc(EOPNOTSUPP, self->device_id, self->ioas_id, 443 + IOMMU_HWPT_ALLOC_NEST_PARENT | 444 + IOMMU_HWPT_FAULT_ID_VALID, 450 445 &test_hwpt_id); 451 446 452 447 test_cmd_hwpt_alloc(self->device_id, self->ioas_id, ··· 761 748 } buffer_smaller; 762 749 763 750 if (self->device_id) { 751 + uint8_t max_pasid = 0; 752 + 764 753 /* Provide a zero-size user_buffer */ 765 754 test_cmd_get_hw_info(self->device_id, NULL, 0); 766 755 /* Provide a user_buffer with exact size */ ··· 777 762 * the fields within the size range still gets updated. 
778 763 */ 779 764 test_cmd_get_hw_info(self->device_id, &buffer_smaller, sizeof(buffer_smaller)); 765 + test_cmd_get_hw_info_pasid(self->device_id, &max_pasid); 766 + ASSERT_EQ(0, max_pasid); 767 + if (variant->pasid_capable) { 768 + test_cmd_get_hw_info_pasid(self->device_pasid_id, 769 + &max_pasid); 770 + ASSERT_EQ(MOCK_PASID_WIDTH, max_pasid); 771 + } 780 772 } else { 781 773 test_err_get_hw_info(ENOENT, self->device_id, 782 774 &buffer_exact, sizeof(buffer_exact)); ··· 2758 2736 uint32_t iopf_hwpt_id; 2759 2737 uint32_t fault_id; 2760 2738 uint32_t fault_fd; 2739 + uint32_t vdev_id; 2761 2740 2762 2741 if (self->device_id) { 2763 2742 test_ioctl_fault_alloc(&fault_id, &fault_fd); ··· 2775 2752 &iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, &data, 2776 2753 sizeof(data)); 2777 2754 2755 + /* Must allocate vdevice before attaching to a nested hwpt */ 2756 + test_err_mock_domain_replace(ENOENT, self->stdev_id, 2757 + iopf_hwpt_id); 2758 + test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id); 2778 2759 test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id); 2779 2760 EXPECT_ERRNO(EBUSY, 2780 2761 _test_ioctl_destroy(self->fd, iopf_hwpt_id)); ··· 2796 2769 uint32_t viommu_id = self->viommu_id; 2797 2770 uint32_t dev_id = self->device_id; 2798 2771 uint32_t vdev_id = 0; 2772 + uint32_t veventq_id; 2773 + uint32_t veventq_fd; 2774 + int prev_seq = -1; 2799 2775 2800 2776 if (dev_id) { 2777 + /* Must allocate vdevice before attaching to a nested hwpt */ 2778 + test_err_mock_domain_replace(ENOENT, self->stdev_id, 2779 + self->nested_hwpt_id); 2780 + 2781 + /* Allocate a vEVENTQ with veventq_depth=2 */ 2782 + test_cmd_veventq_alloc(viommu_id, IOMMU_VEVENTQ_TYPE_SELFTEST, 2783 + &veventq_id, &veventq_fd); 2784 + test_err_veventq_alloc(EEXIST, viommu_id, 2785 + IOMMU_VEVENTQ_TYPE_SELFTEST, NULL, NULL); 2801 2786 /* Set vdev_id to 0x99, unset it, and set to 0x88 */ 2802 2787 test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id); 2788 + 
test_cmd_mock_domain_replace(self->stdev_id, 2789 + self->nested_hwpt_id); 2790 + test_cmd_trigger_vevents(dev_id, 1); 2791 + test_cmd_read_vevents(veventq_fd, 1, 0x99, &prev_seq); 2803 2792 test_err_vdevice_alloc(EEXIST, viommu_id, dev_id, 0x99, 2804 2793 &vdev_id); 2794 + test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); 2805 2795 test_ioctl_destroy(vdev_id); 2796 + 2797 + /* Try again with 0x88 */ 2806 2798 test_cmd_vdevice_alloc(viommu_id, dev_id, 0x88, &vdev_id); 2799 + test_cmd_mock_domain_replace(self->stdev_id, 2800 + self->nested_hwpt_id); 2801 + /* Trigger an overflow with three events */ 2802 + test_cmd_trigger_vevents(dev_id, 3); 2803 + test_err_read_vevents(EOVERFLOW, veventq_fd, 3, 0x88, 2804 + &prev_seq); 2805 + /* Overflow must be gone after the previous reads */ 2806 + test_cmd_trigger_vevents(dev_id, 1); 2807 + test_cmd_read_vevents(veventq_fd, 1, 0x88, &prev_seq); 2808 + close(veventq_fd); 2809 + test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); 2807 2810 test_ioctl_destroy(vdev_id); 2811 + test_ioctl_destroy(veventq_id); 2808 2812 } else { 2809 2813 test_err_vdevice_alloc(ENOENT, viommu_id, dev_id, 0x99, NULL); 2810 2814 } ··· 3012 2954 test_cmd_dev_check_cache_all(dev_id, 0); 3013 2955 test_ioctl_destroy(vdev_id); 3014 2956 } 2957 + } 2958 + 2959 + FIXTURE(iommufd_device_pasid) 2960 + { 2961 + int fd; 2962 + uint32_t ioas_id; 2963 + uint32_t hwpt_id; 2964 + uint32_t stdev_id; 2965 + uint32_t device_id; 2966 + uint32_t no_pasid_stdev_id; 2967 + uint32_t no_pasid_device_id; 2968 + }; 2969 + 2970 + FIXTURE_VARIANT(iommufd_device_pasid) 2971 + { 2972 + bool pasid_capable; 2973 + }; 2974 + 2975 + FIXTURE_SETUP(iommufd_device_pasid) 2976 + { 2977 + self->fd = open("/dev/iommu", O_RDWR); 2978 + ASSERT_NE(-1, self->fd); 2979 + test_ioctl_ioas_alloc(&self->ioas_id); 2980 + 2981 + test_cmd_mock_domain_flags(self->ioas_id, 2982 + MOCK_FLAGS_DEVICE_PASID, 2983 + &self->stdev_id, &self->hwpt_id, 2984 + &self->device_id); 2985 + if 
(!variant->pasid_capable) 2986 + test_cmd_mock_domain_flags(self->ioas_id, 0, 2987 + &self->no_pasid_stdev_id, NULL, 2988 + &self->no_pasid_device_id); 2989 + } 2990 + 2991 + FIXTURE_TEARDOWN(iommufd_device_pasid) 2992 + { 2993 + teardown_iommufd(self->fd, _metadata); 2994 + } 2995 + 2996 + FIXTURE_VARIANT_ADD(iommufd_device_pasid, no_pasid) 2997 + { 2998 + .pasid_capable = false, 2999 + }; 3000 + 3001 + FIXTURE_VARIANT_ADD(iommufd_device_pasid, has_pasid) 3002 + { 3003 + .pasid_capable = true, 3004 + }; 3005 + 3006 + TEST_F(iommufd_device_pasid, pasid_attach) 3007 + { 3008 + struct iommu_hwpt_selftest data = { 3009 + .iotlb = IOMMU_TEST_IOTLB_DEFAULT, 3010 + }; 3011 + uint32_t nested_hwpt_id[3] = {}; 3012 + uint32_t parent_hwpt_id = 0; 3013 + uint32_t fault_id, fault_fd; 3014 + uint32_t s2_hwpt_id = 0; 3015 + uint32_t iopf_hwpt_id; 3016 + uint32_t pasid = 100; 3017 + uint32_t viommu_id; 3018 + 3019 + /* 3020 + * Negative, detach pasid without attaching, this is not expected. 3021 + * But it should not result in failure anyway. 
3022 + */ 3023 + test_cmd_pasid_detach(pasid); 3024 + 3025 + /* Allocate two nested hwpts sharing one common parent hwpt */ 3026 + test_cmd_hwpt_alloc(self->device_id, self->ioas_id, 3027 + IOMMU_HWPT_ALLOC_NEST_PARENT, 3028 + &parent_hwpt_id); 3029 + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 3030 + IOMMU_HWPT_ALLOC_PASID, 3031 + &nested_hwpt_id[0], 3032 + IOMMU_HWPT_DATA_SELFTEST, 3033 + &data, sizeof(data)); 3034 + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 3035 + IOMMU_HWPT_ALLOC_PASID, 3036 + &nested_hwpt_id[1], 3037 + IOMMU_HWPT_DATA_SELFTEST, 3038 + &data, sizeof(data)); 3039 + 3040 + /* Fault related preparation */ 3041 + test_ioctl_fault_alloc(&fault_id, &fault_fd); 3042 + test_cmd_hwpt_alloc_iopf(self->device_id, parent_hwpt_id, fault_id, 3043 + IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID, 3044 + &iopf_hwpt_id, 3045 + IOMMU_HWPT_DATA_SELFTEST, &data, 3046 + sizeof(data)); 3047 + 3048 + /* Allocate a regular nested hwpt based on viommu */ 3049 + test_cmd_viommu_alloc(self->device_id, parent_hwpt_id, 3050 + IOMMU_VIOMMU_TYPE_SELFTEST, 3051 + &viommu_id); 3052 + test_cmd_hwpt_alloc_nested(self->device_id, viommu_id, 3053 + IOMMU_HWPT_ALLOC_PASID, 3054 + &nested_hwpt_id[2], 3055 + IOMMU_HWPT_DATA_SELFTEST, &data, 3056 + sizeof(data)); 3057 + 3058 + test_cmd_hwpt_alloc(self->device_id, self->ioas_id, 3059 + IOMMU_HWPT_ALLOC_PASID, 3060 + &s2_hwpt_id); 3061 + 3062 + /* Attach RID to non-pasid compat domain, */ 3063 + test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); 3064 + /* then attach to pasid should fail */ 3065 + test_err_pasid_attach(EINVAL, pasid, s2_hwpt_id); 3066 + 3067 + /* Attach RID to pasid compat domain, */ 3068 + test_cmd_mock_domain_replace(self->stdev_id, s2_hwpt_id); 3069 + /* then attach to pasid should succeed, */ 3070 + test_cmd_pasid_attach(pasid, nested_hwpt_id[0]); 3071 + /* but attach RID to non-pasid compat domain should fail now. 
*/
3072 + 	test_err_mock_domain_replace(EINVAL, self->stdev_id, parent_hwpt_id);
3073 + 	/*
3074 + 	 * Detach the hwpt from pasid 100 and check that pasid 100
3075 + 	 * now has a null domain.
3076 + 	 */
3077 + 	test_cmd_pasid_detach(pasid);
3078 + 	ASSERT_EQ(0,
3079 + 		  test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
3080 + 					    pasid, 0));
3081 + 	/* RID is attached to a pasid-compat domain, pasid path is not used */
3082 + 
3083 + 	if (!variant->pasid_capable) {
3084 + 		/*
3085 + 		 * A PASID-compatible domain can be used by a
3086 + 		 * non-PASID-capable device.
3087 + 		 */
3088 + 		test_cmd_mock_domain_replace(self->no_pasid_stdev_id, nested_hwpt_id[0]);
3089 + 		test_cmd_mock_domain_replace(self->no_pasid_stdev_id, self->ioas_id);
3090 + 		/*
3091 + 		 * Attaching a hwpt to pasid 100 of a non-PASID-capable device
3092 + 		 * should fail, whether the domain is pasid-compat or not.
3093 + 		 */
3094 + 		EXPECT_ERRNO(EINVAL,
3095 + 			     _test_cmd_pasid_attach(self->fd, self->no_pasid_stdev_id,
3096 + 						    pasid, parent_hwpt_id));
3097 + 		EXPECT_ERRNO(EINVAL,
3098 + 			     _test_cmd_pasid_attach(self->fd, self->no_pasid_stdev_id,
3099 + 						    pasid, s2_hwpt_id));
3100 + 	}
3101 + 
3102 + 	/*
3103 + 	 * Attaching a non-pasid-compat hwpt to a pasid-capable device
3104 + 	 * should fail and leave a null domain.
3105 + 	 */
3106 + 	test_err_pasid_attach(EINVAL, pasid, parent_hwpt_id);
3107 + 	ASSERT_EQ(0,
3108 + 		  test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
3109 + 					    pasid, 0));
3110 + 
3111 + 	/*
3112 + 	 * Attaching the ioas to pasid 100 should fail and leave the
3113 + 	 * domain null.
3114 + 	 */
3115 + 	test_err_pasid_attach(EINVAL, pasid, self->ioas_id);
3116 + 	ASSERT_EQ(0,
3117 + 		  test_cmd_pasid_check_hwpt(self->fd, self->stdev_id,
3118 + 					    pasid, 0));
3119 + 
3120 + 	/*
3121 + 	 * Attach the s2_hwpt to pasid 100, should succeed, domain should
3122 + 	 * be valid.
3123 + */ 3124 + test_cmd_pasid_attach(pasid, s2_hwpt_id); 3125 + ASSERT_EQ(0, 3126 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3127 + pasid, s2_hwpt_id)); 3128 + 3129 + /* 3130 + * Try attach pasid 100 with another hwpt, should FAIL 3131 + * as attach does not allow overwrite, use REPLACE instead. 3132 + */ 3133 + test_err_pasid_attach(EBUSY, pasid, nested_hwpt_id[0]); 3134 + 3135 + /* 3136 + * Detach hwpt from pasid 100 for next test, should succeed, 3137 + * and have null domain. 3138 + */ 3139 + test_cmd_pasid_detach(pasid); 3140 + ASSERT_EQ(0, 3141 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3142 + pasid, 0)); 3143 + 3144 + /* 3145 + * Attach nested hwpt to pasid 100, should succeed, domain 3146 + * should be valid. 3147 + */ 3148 + test_cmd_pasid_attach(pasid, nested_hwpt_id[0]); 3149 + ASSERT_EQ(0, 3150 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3151 + pasid, nested_hwpt_id[0])); 3152 + 3153 + /* Attach to pasid 100 which has been attached, should fail. */ 3154 + test_err_pasid_attach(EBUSY, pasid, nested_hwpt_id[0]); 3155 + 3156 + /* cleanup pasid 100 */ 3157 + test_cmd_pasid_detach(pasid); 3158 + 3159 + /* Replace tests */ 3160 + 3161 + pasid = 200; 3162 + /* 3163 + * Replace pasid 200 without attaching it, should fail 3164 + * with -EINVAL. 3165 + */ 3166 + test_err_pasid_replace(EINVAL, pasid, s2_hwpt_id); 3167 + 3168 + /* 3169 + * Attach the s2 hwpt to pasid 200, should succeed, domain should 3170 + * be valid. 3171 + */ 3172 + test_cmd_pasid_attach(pasid, s2_hwpt_id); 3173 + ASSERT_EQ(0, 3174 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3175 + pasid, s2_hwpt_id)); 3176 + 3177 + /* 3178 + * Replace pasid 200 with self->ioas_id, should fail 3179 + * and domain should be the prior s2 hwpt. 
3180 + */ 3181 + test_err_pasid_replace(EINVAL, pasid, self->ioas_id); 3182 + ASSERT_EQ(0, 3183 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3184 + pasid, s2_hwpt_id)); 3185 + 3186 + /* 3187 + * Replace a nested hwpt for pasid 200, should succeed, 3188 + * and have valid domain. 3189 + */ 3190 + test_cmd_pasid_replace(pasid, nested_hwpt_id[0]); 3191 + ASSERT_EQ(0, 3192 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3193 + pasid, nested_hwpt_id[0])); 3194 + 3195 + /* 3196 + * Replace with another nested hwpt for pasid 200, should 3197 + * succeed, and have valid domain. 3198 + */ 3199 + test_cmd_pasid_replace(pasid, nested_hwpt_id[1]); 3200 + ASSERT_EQ(0, 3201 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3202 + pasid, nested_hwpt_id[1])); 3203 + 3204 + /* cleanup pasid 200 */ 3205 + test_cmd_pasid_detach(pasid); 3206 + 3207 + /* Negative Tests for pasid replace, use pasid 1024 */ 3208 + 3209 + /* 3210 + * Attach the s2 hwpt to pasid 1024, should succeed, domain should 3211 + * be valid. 3212 + */ 3213 + pasid = 1024; 3214 + test_cmd_pasid_attach(pasid, s2_hwpt_id); 3215 + ASSERT_EQ(0, 3216 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3217 + pasid, s2_hwpt_id)); 3218 + 3219 + /* 3220 + * Replace pasid 1024 with nested_hwpt_id[0], should fail, 3221 + * but have the old valid domain. This is a designed 3222 + * negative case. Normally, this shall succeed. 3223 + */ 3224 + test_err_pasid_replace(ENOMEM, pasid, nested_hwpt_id[0]); 3225 + ASSERT_EQ(0, 3226 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3227 + pasid, s2_hwpt_id)); 3228 + 3229 + /* cleanup pasid 1024 */ 3230 + test_cmd_pasid_detach(pasid); 3231 + 3232 + /* Attach to iopf-capable hwpt */ 3233 + 3234 + /* 3235 + * Attach an iopf hwpt to pasid 2048, should succeed, domain should 3236 + * be valid. 
3237 + */ 3238 + pasid = 2048; 3239 + test_cmd_pasid_attach(pasid, iopf_hwpt_id); 3240 + ASSERT_EQ(0, 3241 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3242 + pasid, iopf_hwpt_id)); 3243 + 3244 + test_cmd_trigger_iopf_pasid(self->device_id, pasid, fault_fd); 3245 + 3246 + /* 3247 + * Replace with s2_hwpt_id for pasid 2048, should 3248 + * succeed, and have valid domain. 3249 + */ 3250 + test_cmd_pasid_replace(pasid, s2_hwpt_id); 3251 + ASSERT_EQ(0, 3252 + test_cmd_pasid_check_hwpt(self->fd, self->stdev_id, 3253 + pasid, s2_hwpt_id)); 3254 + 3255 + /* cleanup pasid 2048 */ 3256 + test_cmd_pasid_detach(pasid); 3257 + 3258 + test_ioctl_destroy(iopf_hwpt_id); 3259 + close(fault_fd); 3260 + test_ioctl_destroy(fault_id); 3261 + 3262 + /* Detach the s2_hwpt_id from RID */ 3263 + test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); 3015 3264 } 3016 3265 3017 3266 TEST_HARNESS_MAIN
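The pasid_attach test above encodes a small set of ordering rules: attach refuses to overwrite an existing attachment (EBUSY), replace requires an existing attachment (EINVAL), and detaching an unattached pasid is a harmless no-op. A standalone sketch of that state machine (hypothetical model_* names, not the iommufd API; hwpt_id 0 stands in for "not attached"):

```c
#include <errno.h>

/*
 * Standalone sketch of the pasid attach/replace/detach ordering rules
 * the test relies on (hypothetical model, not the iommufd API):
 * attach refuses to overwrite an existing attachment, replace demands
 * one, and detach of an unattached pasid quietly succeeds.
 */
#define MODEL_MAX_PASID 4096

static unsigned int pasid_hwpt[MODEL_MAX_PASID];	/* 0 = not attached */

static int model_pasid_attach(unsigned int pasid, unsigned int hwpt_id)
{
	if (pasid_hwpt[pasid])
		return -EBUSY;		/* attach never overwrites */
	pasid_hwpt[pasid] = hwpt_id;
	return 0;
}

static int model_pasid_replace(unsigned int pasid, unsigned int hwpt_id)
{
	if (!pasid_hwpt[pasid])
		return -EINVAL;		/* replace needs a prior attach */
	pasid_hwpt[pasid] = hwpt_id;
	return 0;
}

static void model_pasid_detach(unsigned int pasid)
{
	pasid_hwpt[pasid] = 0;		/* no-op if already detached */
}
```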
+58 -15
tools/testing/selftests/iommu/iommufd_fail_nth.c
··· 209 209 { 210 210 int fd; 211 211 uint32_t access_id; 212 + uint32_t stdev_id; 213 + uint32_t pasid; 212 214 }; 213 215 214 216 FIXTURE_SETUP(basic_fail_nth) 215 217 { 216 218 self->fd = -1; 217 219 self->access_id = 0; 220 + self->stdev_id = 0; 221 + self->pasid = 0; //test should use a non-zero value 218 222 } 219 223 220 224 FIXTURE_TEARDOWN(basic_fail_nth) ··· 230 226 rc = _test_cmd_destroy_access(self->access_id); 231 227 assert(rc == 0); 232 228 } 229 + if (self->pasid && self->stdev_id) 230 + _test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid); 233 231 teardown_iommufd(self->fd, _metadata); 234 232 } 235 233 ··· 626 620 }; 627 621 struct iommu_test_hw_info info; 628 622 uint32_t fault_id, fault_fd; 623 + uint32_t veventq_id, veventq_fd; 629 624 uint32_t fault_hwpt_id; 625 + uint32_t test_hwpt_id; 630 626 uint32_t ioas_id; 631 627 uint32_t ioas_id2; 632 - uint32_t stdev_id; 633 628 uint32_t idev_id; 634 629 uint32_t hwpt_id; 635 630 uint32_t viommu_id; ··· 661 654 662 655 fail_nth_enable(); 663 656 664 - if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, NULL, 665 - &idev_id)) 657 + if (_test_cmd_mock_domain_flags(self->fd, ioas_id, 658 + MOCK_FLAGS_DEVICE_PASID, 659 + &self->stdev_id, NULL, &idev_id)) 666 660 return -1; 667 661 668 - if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) 669 - return -1; 670 - 671 - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 0, &hwpt_id, 672 - IOMMU_HWPT_DATA_NONE, 0, 0)) 673 - return -1; 674 - 675 - if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL)) 676 - return -1; 677 - 678 - if (_test_cmd_mock_domain_replace(self->fd, stdev_id, hwpt_id, NULL)) 662 + if (_test_cmd_get_hw_info(self->fd, idev_id, &info, 663 + sizeof(info), NULL, NULL)) 679 664 return -1; 680 665 681 666 if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 682 - IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id, 667 + IOMMU_HWPT_ALLOC_PASID, &hwpt_id, 668 + IOMMU_HWPT_DATA_NONE, 0, 0)) 669 + 
return -1; 670 + 671 + if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, ioas_id2, NULL)) 672 + return -1; 673 + 674 + if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, hwpt_id, NULL)) 675 + return -1; 676 + 677 + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 678 + IOMMU_HWPT_ALLOC_NEST_PARENT | 679 + IOMMU_HWPT_ALLOC_PASID, 680 + &hwpt_id, 683 681 IOMMU_HWPT_DATA_NONE, 0, 0)) 684 682 return -1; 685 683 ··· 703 691 IOMMU_HWPT_FAULT_ID_VALID, &fault_hwpt_id, 704 692 IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data))) 705 693 return -1; 694 + 695 + if (_test_cmd_veventq_alloc(self->fd, viommu_id, 696 + IOMMU_VEVENTQ_TYPE_SELFTEST, &veventq_id, 697 + &veventq_fd)) 698 + return -1; 699 + close(veventq_fd); 700 + 701 + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 702 + IOMMU_HWPT_ALLOC_PASID, 703 + &test_hwpt_id, 704 + IOMMU_HWPT_DATA_NONE, 0, 0)) 705 + return -1; 706 + 707 + /* Tests for pasid attach/replace/detach */ 708 + 709 + self->pasid = 200; 710 + 711 + if (_test_cmd_pasid_attach(self->fd, self->stdev_id, 712 + self->pasid, hwpt_id)) { 713 + self->pasid = 0; 714 + return -1; 715 + } 716 + 717 + if (_test_cmd_pasid_replace(self->fd, self->stdev_id, 718 + self->pasid, test_hwpt_id)) 719 + return -1; 720 + 721 + if (_test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid)) 722 + return -1; 723 + 724 + self->pasid = 0; 706 725 707 726 return 0; 708 727 }
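The fail_nth changes above extend the injected-failure walk to the new PASID paths. The shape of that technique in miniature (hypothetical helpers, not the kselftest fixture): every fallible call passes through a counter, and the sequence is re-run once per N, failing exactly the Nth call, to prove each error path unwinds cleanly.

```c
/*
 * Miniature model of fail-nth style error injection (hypothetical
 * helpers, not the kselftest harness): fail exactly the Nth fallible
 * call, re-running the sequence until a run completes cleanly.
 */
static int fail_nth;	/* 0 = no injection */
static int call_count;

static int maybe_fail(void)
{
	if (fail_nth && ++call_count == fail_nth)
		return -1;	/* injected failure */
	return 0;
}

static int run_sequence(void)
{
	if (maybe_fail())	/* e.g. allocate a domain */
		return -1;
	if (maybe_fail())	/* e.g. attach a pasid */
		return -1;
	if (maybe_fail())	/* e.g. replace the pasid's domain */
		return -1;
	return 0;
}

/* Walk N upward until the sequence survives; count injected failures */
static int count_injection_points(void)
{
	int n, failures = 0;

	for (n = 1; ; n++) {
		fail_nth = n;
		call_count = 0;
		if (run_sequence() == 0)
			break;	/* N exceeded the number of fallible calls */
		failures++;
	}
	fail_nth = 0;
	return failures;
}
```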
+222 -7
tools/testing/selftests/iommu/iommufd_utils.h
··· 9 9 #include <sys/ioctl.h> 10 10 #include <stdint.h> 11 11 #include <assert.h> 12 + #include <poll.h> 12 13 13 14 #include "../kselftest_harness.h" 14 15 #include "../../../../drivers/iommu/iommufd/iommufd_test.h" ··· 758 757 759 758 /* @data can be NULL */ 760 759 static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, 761 - size_t data_len, uint32_t *capabilities) 760 + size_t data_len, uint32_t *capabilities, 761 + uint8_t *max_pasid) 762 762 { 763 763 struct iommu_test_hw_info *info = (struct iommu_test_hw_info *)data; 764 764 struct iommu_hw_info cmd = { ··· 804 802 assert(!info->flags); 805 803 } 806 804 805 + if (max_pasid) 806 + *max_pasid = cmd.out_max_pasid_log2; 807 + 807 808 if (capabilities) 808 809 *capabilities = cmd.out_capabilities; 809 810 ··· 815 810 816 811 #define test_cmd_get_hw_info(device_id, data, data_len) \ 817 812 ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data, \ 818 - data_len, NULL)) 813 + data_len, NULL, NULL)) 819 814 820 815 #define test_err_get_hw_info(_errno, device_id, data, data_len) \ 821 816 EXPECT_ERRNO(_errno, _test_cmd_get_hw_info(self->fd, device_id, data, \ 822 - data_len, NULL)) 817 + data_len, NULL, NULL)) 823 818 824 819 #define test_cmd_get_hw_capabilities(device_id, caps, mask) \ 825 - ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) 820 + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \ 821 + 0, &caps, NULL)) 822 + 823 + #define test_cmd_get_hw_info_pasid(device_id, max_pasid) \ 824 + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \ 825 + 0, NULL, max_pasid)) 826 826 827 827 static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd) 828 828 { ··· 852 842 ASSERT_NE(0, *(fault_fd)); \ 853 843 }) 854 844 855 - static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd) 845 + static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 pasid, 846 + __u32 fault_fd) 856 847 { 857 848 struct 
iommu_test_cmd trigger_iopf_cmd = { 858 849 .size = sizeof(trigger_iopf_cmd), 859 850 .op = IOMMU_TEST_OP_TRIGGER_IOPF, 860 851 .trigger_iopf = { 861 852 .dev_id = device_id, 862 - .pasid = 0x1, 853 + .pasid = pasid, 863 854 .grpid = 0x2, 864 855 .perm = IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE, 865 856 .addr = 0xdeadbeaf, ··· 891 880 } 892 881 893 882 #define test_cmd_trigger_iopf(device_id, fault_fd) \ 894 - ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd)) 883 + ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, 0x1, fault_fd)) 884 + #define test_cmd_trigger_iopf_pasid(device_id, pasid, fault_fd) \ 885 + ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, \ 886 + pasid, fault_fd)) 895 887 896 888 static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id, 897 889 __u32 type, __u32 flags, __u32 *viommu_id) ··· 950 936 EXPECT_ERRNO(_errno, \ 951 937 _test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, \ 952 938 virt_id, vdev_id)) 939 + 940 + static int _test_cmd_veventq_alloc(int fd, __u32 viommu_id, __u32 type, 941 + __u32 *veventq_id, __u32 *veventq_fd) 942 + { 943 + struct iommu_veventq_alloc cmd = { 944 + .size = sizeof(cmd), 945 + .type = type, 946 + .veventq_depth = 2, 947 + .viommu_id = viommu_id, 948 + }; 949 + int ret; 950 + 951 + ret = ioctl(fd, IOMMU_VEVENTQ_ALLOC, &cmd); 952 + if (ret) 953 + return ret; 954 + if (veventq_id) 955 + *veventq_id = cmd.out_veventq_id; 956 + if (veventq_fd) 957 + *veventq_fd = cmd.out_veventq_fd; 958 + return 0; 959 + } 960 + 961 + #define test_cmd_veventq_alloc(viommu_id, type, veventq_id, veventq_fd) \ 962 + ASSERT_EQ(0, _test_cmd_veventq_alloc(self->fd, viommu_id, type, \ 963 + veventq_id, veventq_fd)) 964 + #define test_err_veventq_alloc(_errno, viommu_id, type, veventq_id, \ 965 + veventq_fd) \ 966 + EXPECT_ERRNO(_errno, \ 967 + _test_cmd_veventq_alloc(self->fd, viommu_id, type, \ 968 + veventq_id, veventq_fd)) 969 + 970 + static int _test_cmd_trigger_vevents(int 
fd, __u32 dev_id, __u32 nvevents) 971 + { 972 + struct iommu_test_cmd trigger_vevent_cmd = { 973 + .size = sizeof(trigger_vevent_cmd), 974 + .op = IOMMU_TEST_OP_TRIGGER_VEVENT, 975 + .trigger_vevent = { 976 + .dev_id = dev_id, 977 + }, 978 + }; 979 + int ret; 980 + 981 + while (nvevents--) { 982 + ret = ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_VEVENT), 983 + &trigger_vevent_cmd); 984 + if (ret < 0) 985 + return -1; 986 + } 987 + return ret; 988 + } 989 + 990 + #define test_cmd_trigger_vevents(dev_id, nvevents) \ 991 + ASSERT_EQ(0, _test_cmd_trigger_vevents(self->fd, dev_id, nvevents)) 992 + 993 + static int _test_cmd_read_vevents(int fd, __u32 event_fd, __u32 nvevents, 994 + __u32 virt_id, int *prev_seq) 995 + { 996 + struct pollfd pollfd = { .fd = event_fd, .events = POLLIN }; 997 + struct iommu_viommu_event_selftest *event; 998 + struct iommufd_vevent_header *hdr; 999 + ssize_t bytes; 1000 + void *data; 1001 + int ret, i; 1002 + 1003 + ret = poll(&pollfd, 1, 1000); 1004 + if (ret < 0) 1005 + return -1; 1006 + 1007 + data = calloc(nvevents, sizeof(*hdr) + sizeof(*event)); 1008 + if (!data) { 1009 + errno = ENOMEM; 1010 + return -1; 1011 + } 1012 + 1013 + bytes = read(event_fd, data, 1014 + nvevents * (sizeof(*hdr) + sizeof(*event))); 1015 + if (bytes <= 0) { 1016 + errno = EFAULT; 1017 + ret = -1; 1018 + goto out_free; 1019 + } 1020 + 1021 + for (i = 0; i < nvevents; i++) { 1022 + hdr = data + i * (sizeof(*hdr) + sizeof(*event)); 1023 + 1024 + if (hdr->flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS || 1025 + hdr->sequence - *prev_seq > 1) { 1026 + *prev_seq = hdr->sequence; 1027 + errno = EOVERFLOW; 1028 + ret = -1; 1029 + goto out_free; 1030 + } 1031 + *prev_seq = hdr->sequence; 1032 + event = data + sizeof(*hdr); 1033 + if (event->virt_id != virt_id) { 1034 + errno = EINVAL; 1035 + ret = -1; 1036 + goto out_free; 1037 + } 1038 + } 1039 + 1040 + ret = 0; 1041 + out_free: 1042 + free(data); 1043 + return ret; 1044 + } 1045 + 1046 + #define 
test_cmd_read_vevents(event_fd, nvevents, virt_id, prev_seq) \ 1047 + ASSERT_EQ(0, _test_cmd_read_vevents(self->fd, event_fd, nvevents, \ 1048 + virt_id, prev_seq)) 1049 + #define test_err_read_vevents(_errno, event_fd, nvevents, virt_id, prev_seq) \ 1050 + EXPECT_ERRNO(_errno, \ 1051 + _test_cmd_read_vevents(self->fd, event_fd, nvevents, \ 1052 + virt_id, prev_seq)) 1053 + 1054 + static int _test_cmd_pasid_attach(int fd, __u32 stdev_id, __u32 pasid, 1055 + __u32 pt_id) 1056 + { 1057 + struct iommu_test_cmd test_attach = { 1058 + .size = sizeof(test_attach), 1059 + .op = IOMMU_TEST_OP_PASID_ATTACH, 1060 + .id = stdev_id, 1061 + .pasid_attach = { 1062 + .pasid = pasid, 1063 + .pt_id = pt_id, 1064 + }, 1065 + }; 1066 + 1067 + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_ATTACH), 1068 + &test_attach); 1069 + } 1070 + 1071 + #define test_cmd_pasid_attach(pasid, hwpt_id) \ 1072 + ASSERT_EQ(0, _test_cmd_pasid_attach(self->fd, self->stdev_id, \ 1073 + pasid, hwpt_id)) 1074 + 1075 + #define test_err_pasid_attach(_errno, pasid, hwpt_id) \ 1076 + EXPECT_ERRNO(_errno, \ 1077 + _test_cmd_pasid_attach(self->fd, self->stdev_id, \ 1078 + pasid, hwpt_id)) 1079 + 1080 + static int _test_cmd_pasid_replace(int fd, __u32 stdev_id, __u32 pasid, 1081 + __u32 pt_id) 1082 + { 1083 + struct iommu_test_cmd test_replace = { 1084 + .size = sizeof(test_replace), 1085 + .op = IOMMU_TEST_OP_PASID_REPLACE, 1086 + .id = stdev_id, 1087 + .pasid_replace = { 1088 + .pasid = pasid, 1089 + .pt_id = pt_id, 1090 + }, 1091 + }; 1092 + 1093 + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_REPLACE), 1094 + &test_replace); 1095 + } 1096 + 1097 + #define test_cmd_pasid_replace(pasid, hwpt_id) \ 1098 + ASSERT_EQ(0, _test_cmd_pasid_replace(self->fd, self->stdev_id, \ 1099 + pasid, hwpt_id)) 1100 + 1101 + #define test_err_pasid_replace(_errno, pasid, hwpt_id) \ 1102 + EXPECT_ERRNO(_errno, \ 1103 + _test_cmd_pasid_replace(self->fd, self->stdev_id, \ 1104 + pasid, hwpt_id)) 1105 + 1106 + static int 
_test_cmd_pasid_detach(int fd, __u32 stdev_id, __u32 pasid) 1107 + { 1108 + struct iommu_test_cmd test_detach = { 1109 + .size = sizeof(test_detach), 1110 + .op = IOMMU_TEST_OP_PASID_DETACH, 1111 + .id = stdev_id, 1112 + .pasid_detach = { 1113 + .pasid = pasid, 1114 + }, 1115 + }; 1116 + 1117 + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_DETACH), 1118 + &test_detach); 1119 + } 1120 + 1121 + #define test_cmd_pasid_detach(pasid) \ 1122 + ASSERT_EQ(0, _test_cmd_pasid_detach(self->fd, self->stdev_id, pasid)) 1123 + 1124 + static int test_cmd_pasid_check_hwpt(int fd, __u32 stdev_id, __u32 pasid, 1125 + __u32 hwpt_id) 1126 + { 1127 + struct iommu_test_cmd test_pasid_check = { 1128 + .size = sizeof(test_pasid_check), 1129 + .op = IOMMU_TEST_OP_PASID_CHECK_HWPT, 1130 + .id = stdev_id, 1131 + .pasid_check = { 1132 + .pasid = pasid, 1133 + .hwpt_id = hwpt_id, 1134 + }, 1135 + }; 1136 + 1137 + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_CHECK_HWPT), 1138 + &test_pasid_check); 1139 + }
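_test_cmd_read_vevents() above detects dropped events two ways: the queue sets IOMMU_VEVENTQ_FLAG_LOST_EVENTS explicitly, or consecutive per-queue sequence numbers jump by more than one. That predicate in isolation (hypothetical standalone helper, not kernel ABI):

```c
#include <stdbool.h>

/*
 * Lost-event predicate mirroring the check in _test_cmd_read_vevents()
 * (hypothetical standalone helper, not kernel ABI): loss is reported
 * either explicitly via the queue's lost-events flag, or implicitly by
 * a gap of more than one between consecutive sequence numbers.
 */
static bool vevent_lost(unsigned int prev_seq, unsigned int seq,
			bool lost_flag)
{
	return lost_flag || seq - prev_seq > 1;
}
```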