Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'vfio-v6.18-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

- Use fdinfo to expose the sysfs path of a device represented by a vfio
device file (Alex Mastro)

- Mark vfio-fsl-mc, vfio-amba, and the reset functions for
vfio-platform for removal as these are either orphaned or believed to
be unused (Alex Williamson)

- Add reviewers for vfio-platform to save it from also being marked for
removal (Mostafa Saleh, Pranjal Shrivastava)

- VFIO selftests, including basic sanity testing and minimal userspace
drivers for testing against real hardware. This is also expected to
provide integration with KVM selftests for KVM-VFIO interfaces (David
Matlack, Josh Hilke)

- Fix drivers/cdx and vfio/cdx to build without CONFIG_GENERIC_MSI_IRQ
(Nipun Gupta)

- Fix reference leak in hisi_acc (Miaoqian Lin)

- Use consistent return for unsupported device feature (Alex Mastro)

- Unwind using the correct memory free callback in vfio/pds (Zilin
Guan)

- Use the IRQ_DISABLE_UNLAZY flag to improve handling of pre-PCI 2.3 INTx
   and resolve a stalled interrupt on ppc64 (Timothy Pearson)

- Enable GB300 in nvgrace-gpu vfio-pci variant driver (Tushar Dave)

- Misc:
- Drop unnecessary ternary conversion in vfio/pci (Xichao Zhao)
- Grammatical fix in nvgrace-gpu (Morduan Zang)
- Update Shameer's email address (Shameer Kolothum)
- Fix document build warning (Alex Williamson)

* tag 'vfio-v6.18-rc1' of https://github.com/awilliam/linux-vfio: (48 commits)
vfio/nvgrace-gpu: Add GB300 SKU to the devid table
vfio/pci: Fix INTx handling on legacy non-PCI 2.3 devices
vfio/pds: replace bitmap_free with vfree
vfio: return -ENOTTY for unsupported device feature
hisi_acc_vfio_pci: Fix reference leak in hisi_acc_vfio_debug_init
vfio/platform: Mark reset drivers for removal
vfio/amba: Mark for removal
MAINTAINERS: Add myself as VFIO-platform reviewer
MAINTAINERS: Add myself as VFIO-platform reviewer
docs: proc.rst: Fix VFIO Device title formatting
vfio: selftests: Fix .gitignore for already tracked files
vfio/cdx: update driver to build without CONFIG_GENERIC_MSI_IRQ
cdx: don't select CONFIG_GENERIC_MSI_IRQ
MAINTAINERS: Update Shameer Kolothum's email address
vfio: selftests: Add a script to help with running VFIO selftests
vfio: selftests: Make iommufd the default iommu_mode
vfio: selftests: Add iommufd mode
vfio: selftests: Add iommufd_compat_type1{,v2} modes
vfio: selftests: Add vfio_type1v2_mode
vfio: selftests: Replicate tests across all iommu_modes
...

+3319 -23
+1
.mailmap
···
 Sergey Senozhatsky <senozhatsky@chromium.org> <senozhatsky@google.com>
 Seth Forshee <sforshee@kernel.org> <seth.forshee@canonical.com>
 Shakeel Butt <shakeel.butt@linux.dev> <shakeelb@google.com>
+Shameer Kolothum <skolothumtho@nvidia.com> <shameerali.kolothum.thodi@huawei.com>
 Shannon Nelson <sln@onemain.com> <shannon.nelson@amd.com>
 Shannon Nelson <sln@onemain.com> <snelson@pensando.io>
 Shannon Nelson <sln@onemain.com> <shannon.nelson@intel.com>
+14
Documentation/filesystems/proc.rst
···
 where 'size' is the size of the DMA buffer in bytes. 'count' is the file count of
 the DMA buffer file. 'exp_name' is the name of the DMA buffer exporter.
 
+VFIO Device files
+~~~~~~~~~~~~~~~~~
+
+::
+
+    pos:	0
+    flags:	02000002
+    mnt_id:	17
+    ino:	5122
+    vfio-device-syspath: /sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:05.0/0000:e8:00.0
+
+where 'vfio-device-syspath' is the sysfs path corresponding to the VFIO device
+file.
+
 3.9	/proc/<pid>/map_files - Information about memory mapped files
 ---------------------------------------------------------------------
 This directory contains symbolic links which represent memory mapped files
+11 -3
MAINTAINERS
···
 F:	include/linux/vfio.h
 F:	include/linux/vfio_pci_core.h
 F:	include/uapi/linux/vfio.h
+F:	tools/testing/selftests/vfio/
 
 VFIO FSL-MC DRIVER
 L:	kvm@vger.kernel.org
-S:	Orphan
+S:	Obsolete
 F:	drivers/vfio/fsl-mc/
 
 VFIO HISILICON PCI DRIVER
 M:	Longfang Liu <liulongfang@huawei.com>
-M:	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
 L:	kvm@vger.kernel.org
 S:	Maintained
 F:	drivers/vfio/pci/hisilicon/
···
 VFIO PCI DEVICE SPECIFIC DRIVERS
 R:	Jason Gunthorpe <jgg@nvidia.com>
 R:	Yishai Hadas <yishaih@nvidia.com>
-R:	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
+R:	Shameer Kolothum <skolothumtho@nvidia.com>
 R:	Kevin Tian <kevin.tian@intel.com>
 L:	kvm@vger.kernel.org
 S:	Maintained
···
 VFIO PLATFORM DRIVER
 M:	Eric Auger <eric.auger@redhat.com>
+R:	Mostafa Saleh <smostafa@google.com>
+R:	Pranjal Shrivastava <praan@google.com>
 L:	kvm@vger.kernel.org
 S:	Maintained
 F:	drivers/vfio/platform/
···
 L:	qat-linux@intel.com
 S:	Supported
 F:	drivers/vfio/pci/qat/
+
+VFIO SELFTESTS
+M:	David Matlack <dmatlack@google.com>
+L:	kvm@vger.kernel.org
+S:	Maintained
+F:	tools/testing/selftests/vfio/
 
 VFIO VIRTIO PCI DRIVER
 M:	Yishai Hadas <yishaih@nvidia.com>
-1
drivers/cdx/Kconfig
···
 config CDX_BUS
 	bool "CDX Bus driver"
 	depends on OF && ARM64 || COMPILE_TEST
-	select GENERIC_MSI_IRQ
 	help
 	  Driver to enable Composable DMA Transfer(CDX) Bus. CDX bus
 	  exposes Fabric devices which uses composable DMA IP to the
+2 -2
drivers/cdx/cdx.c
···
 	/*
 	 * Setup MSI device data so that generic MSI alloc/free can
 	 * be used by the device driver.
 	 */
-	if (cdx->msi_domain) {
+	if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ) && cdx->msi_domain) {
 		error = msi_setup_device_data(&cdx_dev->dev);
 		if (error)
 			return error;
···
 		     ((cdx->id << CDX_CONTROLLER_ID_SHIFT) | (cdx_dev->bus_num & CDX_BUS_NUM_MASK)),
 		     cdx_dev->dev_num);
 
-	if (cdx->msi_domain) {
+	if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ) && cdx->msi_domain) {
 		cdx_dev->num_msi = dev_params->num_msi;
 		dev_set_msi_domain(&cdx_dev->dev, cdx->msi_domain);
 	}
-1
drivers/cdx/controller/Kconfig
···
 config CDX_CONTROLLER
 	tristate "CDX bus controller"
 	depends on HAS_DMA
-	select GENERIC_MSI_IRQ
 	select REMOTEPROC
 	select RPMSG
 	help
+2 -1
drivers/cdx/controller/cdx_controller.c
···
 	cdx->ops = &cdx_ops;
 
 	/* Create MSI domain */
-	cdx->msi_domain = cdx_msi_domain_init(&pdev->dev);
+	if (IS_ENABLED(CONFIG_GENERIC_MSI_IRQ))
+		cdx->msi_domain = cdx_msi_domain_init(&pdev->dev);
 	if (!cdx->msi_domain) {
 		ret = dev_err_probe(&pdev->dev, -ENODEV, "cdx_msi_domain_init() failed");
 		goto cdx_msi_fail;
+4
drivers/dma/idxd/registers.h
···
 #ifndef _IDXD_REGISTERS_H_
 #define _IDXD_REGISTERS_H_
 
+#ifdef __KERNEL__
 #include <uapi/linux/idxd.h>
+#else
+#include <linux/idxd.h>
+#endif
 
 /* PCI Config */
 #define PCI_DEVICE_ID_INTEL_DSA_GNRD	0x11fb
+2
drivers/dma/ioat/dma.h
···
 #define IOAT_DMA_DCA_ANY_CPU		~0
 
+int system_has_dca_enabled(struct pci_dev *pdev);
+
 #define to_ioatdma_device(dev) container_of(dev, struct ioatdma_device, dma_dev)
 #define to_dev(ioat_chan) (&(ioat_chan)->ioat_dma->pdev->dev)
 #define to_pdev(ioat_chan) ((ioat_chan)->ioat_dma->pdev)
-3
drivers/dma/ioat/hw.h
···
 #define IOAT_VER_3_3            0x33    /* Version 3.3 */
 #define IOAT_VER_3_4            0x34    /* Version 3.4 */
 
-
-int system_has_dca_enabled(struct pci_dev *pdev);
-
 #define IOAT_DESC_SZ	64
 
 struct ioat_dma_descriptor {
+5 -1
drivers/vfio/cdx/Makefile
···
 obj-$(CONFIG_VFIO_CDX) += vfio-cdx.o
 
-vfio-cdx-objs := main.o intr.o
+vfio-cdx-objs := main.o
+
+ifdef CONFIG_GENERIC_MSI_IRQ
+vfio-cdx-objs += intr.o
+endif
+14
drivers/vfio/cdx/private.h
···
 	u8 config_msi;
 };
 
+#ifdef CONFIG_GENERIC_MSI_IRQ
 int vfio_cdx_set_irqs_ioctl(struct vfio_cdx_device *vdev,
 			    u32 flags, unsigned int index,
 			    unsigned int start, unsigned int count,
 			    void *data);
 
 void vfio_cdx_irqs_cleanup(struct vfio_cdx_device *vdev);
+#else
+static int vfio_cdx_set_irqs_ioctl(struct vfio_cdx_device *vdev,
+				   u32 flags, unsigned int index,
+				   unsigned int start, unsigned int count,
+				   void *data)
+{
+	return -EINVAL;
+}
+
+static void vfio_cdx_irqs_cleanup(struct vfio_cdx_device *vdev)
+{
+}
+#endif
 
 #endif /* VFIO_CDX_PRIVATE_H */
+4 -1
drivers/vfio/fsl-mc/Kconfig
···
 	depends on FSL_MC_BUS
 
 config VFIO_FSL_MC
-	tristate "VFIO support for QorIQ DPAA2 fsl-mc bus devices"
+	tristate "VFIO support for QorIQ DPAA2 fsl-mc bus devices (DEPRECATED)"
 	select EVENTFD
 	help
+	  The vfio-fsl-mc driver is deprecated and will be removed in a
+	  future kernel release.
+
 	  Driver to enable support for the VFIO QorIQ DPAA2 fsl-mc
 	  (Management Complex) devices. This is required to passthrough
 	  fsl-mc bus devices using the VFIO framework.
+2
drivers/vfio/fsl-mc/vfio_fsl_mc.c
···
 	struct device *dev = &mc_dev->dev;
 	int ret;
 
+	dev_err_once(dev, "DEPRECATION: vfio-fsl-mc is deprecated and will be removed in a future kernel release\n");
+
 	vdev = vfio_alloc_device(vfio_fsl_mc_device, vdev, dev,
 				 &vfio_fsl_mc_ops);
 	if (IS_ERR(vdev))
+5 -1
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
···
 	}
 
 	migf = kzalloc(sizeof(*migf), GFP_KERNEL);
-	if (!migf)
+	if (!migf) {
+		dput(vfio_dev_migration);
 		return;
+	}
 	hisi_acc_vdev->debug_migf = migf;
 
 	vfio_hisi_acc = debugfs_create_dir("hisi_acc", vfio_dev_migration);
···
 				    hisi_acc_vf_migf_read);
 	debugfs_create_devm_seqfile(dev, "cmd_state", vfio_hisi_acc,
 				    hisi_acc_vf_debug_cmd);
+
+	dput(vfio_dev_migration);
 }
 
 static void hisi_acc_vf_debugfs_exit(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+3 -1
drivers/vfio/pci/nvgrace-gpu/main.c
···
 	info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
 	/*
 	 * The region memory size may not be power-of-2 aligned.
-	 * Given that the memory as a BAR and may not be
+	 * Given that the memory is a BAR and may not be
 	 * aligned, roundup to the next power-of-2.
 	 */
 	info.size = memregion->bar_size;
···
 	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x2348) },
 	/* GB200 SKU */
 	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x2941) },
+	/* GB300 SKU */
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x31C2) },
 	{}
 };
+1 -1
drivers/vfio/pci/pds/dirty.c
···
 	host_ack_bmp = vzalloc(bytes);
 	if (!host_ack_bmp) {
-		bitmap_free(host_seq_bmp);
+		vfree(host_seq_bmp);
 		return -ENOMEM;
 	}
+8 -1
drivers/vfio/pci/vfio_pci_intrs.c
···
 	vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
 
+	if (!vdev->pci_2_3)
+		irq_set_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
+
 	ret = request_irq(pdev->irq, vfio_intx_handler,
 			  irqflags, ctx->name, ctx);
 	if (ret) {
+		if (!vdev->pci_2_3)
+			irq_clear_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
 		vdev->irq_type = VFIO_PCI_NUM_IRQS;
 		kfree(name);
 		vfio_irq_ctx_free(vdev, ctx, 0);
···
 	vfio_virqfd_disable(&ctx->unmask);
 	vfio_virqfd_disable(&ctx->mask);
 	free_irq(pdev->irq, ctx);
+	if (!vdev->pci_2_3)
+		irq_clear_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
 	if (ctx->trigger)
 		eventfd_ctx_put(ctx->trigger);
 	kfree(ctx->name);
···
 {
 	struct vfio_pci_irq_ctx *ctx;
 	unsigned int i;
-	bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false;
+	bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX);
 
 	if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) {
 		vfio_msi_disable(vdev, msix);
+4 -1
drivers/vfio/platform/Kconfig
···
 	  If you don't know what to do here, say N.
 
 config VFIO_AMBA
-	tristate "VFIO support for AMBA devices"
+	tristate "VFIO support for AMBA devices (DEPRECATED)"
 	depends on ARM_AMBA || COMPILE_TEST
 	select VFIO_PLATFORM_BASE
 	help
+	  The vfio-amba driver is deprecated and will be removed in a
+	  future kernel release.
+
 	  Support for ARM AMBA devices with VFIO. This is required to make
 	  use of ARM AMBA devices present on the system using the VFIO
 	  framework.
+3 -3
drivers/vfio/platform/reset/Kconfig
···
 # SPDX-License-Identifier: GPL-2.0-only
 if VFIO_PLATFORM
 config VFIO_PLATFORM_CALXEDAXGMAC_RESET
-	tristate "VFIO support for calxeda xgmac reset"
+	tristate "VFIO support for calxeda xgmac reset (DEPRECATED)"
 	help
 	  Enables the VFIO platform driver to handle reset for Calxeda xgmac
 
 	  If you don't know what to do here, say N.
 
 config VFIO_PLATFORM_AMDXGBE_RESET
-	tristate "VFIO support for AMD XGBE reset"
+	tristate "VFIO support for AMD XGBE reset (DEPRECATED)"
 	help
 	  Enables the VFIO platform driver to handle reset for AMD XGBE
 
 	  If you don't know what to do here, say N.
 
 config VFIO_PLATFORM_BCMFLEXRM_RESET
-	tristate "VFIO support for Broadcom FlexRM reset"
+	tristate "VFIO support for Broadcom FlexRM reset (DEPRECATED)"
 	depends on ARCH_BCM_IPROC || COMPILE_TEST
 	default ARCH_BCM_IPROC
 	help
+2
drivers/vfio/platform/reset/vfio_platform_amdxgbe.c
···
 	u32 dma_mr_value, pcs_value, value;
 	unsigned int count;
 
+	dev_err_once(vdev->device, "DEPRECATION: VFIO AMD XGBE platform reset is deprecated and will be removed in a future kernel release\n");
+
 	if (!xgmac_regs->ioaddr) {
 		xgmac_regs->ioaddr =
 			ioremap(xgmac_regs->addr, xgmac_regs->size);
+2
drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c
···
 	int rc = 0, ret = 0, ring_num = 0;
 	struct vfio_platform_region *reg = &vdev->regions[0];
 
+	dev_err_once(vdev->device, "DEPRECATION: VFIO Broadcom FlexRM platform reset is deprecated and will be removed in a future kernel release\n");
+
 	/* Map FlexRM ring registers if not mapped */
 	if (!reg->ioaddr) {
 		reg->ioaddr = ioremap(reg->addr, reg->size);
+2
drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c
···
 {
 	struct vfio_platform_region *reg = &vdev->regions[0];
 
+	dev_err_once(vdev->device, "DEPRECATION: VFIO Calxeda xgmac platform reset is deprecated and will be removed in a future kernel release\n");
+
 	if (!reg->ioaddr) {
 		reg->ioaddr =
 			ioremap(reg->addr, reg->size);
+2
drivers/vfio/platform/vfio_amba.c
···
 	struct vfio_platform_device *vdev;
 	int ret;
 
+	dev_err_once(&adev->dev, "DEPRECATION: vfio-amba is deprecated and will be removed in a future kernel release\n");
+
 	vdev = vfio_alloc_device(vfio_platform_device, vdev, &adev->dev,
 				 &vfio_amba_ops);
 	if (IS_ERR(vdev))
+21 -1
drivers/vfio/vfio_main.c
···
 #include <linux/pseudo_fs.h>
 #include <linux/rwsem.h>
 #include <linux/sched.h>
+#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/stat.h>
 #include <linux/string.h>
···
 					     feature.argsz - minsz);
 	default:
 		if (unlikely(!device->ops->device_feature))
-			return -EINVAL;
+			return -ENOTTY;
 		return device->ops->device_feature(device, feature.flags,
 						   arg->data,
 						   feature.argsz - minsz);
···
 	return device->ops->mmap(device, vma);
 }
 
+#ifdef CONFIG_PROC_FS
+static void vfio_device_show_fdinfo(struct seq_file *m, struct file *filep)
+{
+	char *path;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
+
+	path = kobject_get_path(&device->dev->kobj, GFP_KERNEL);
+	if (!path)
+		return;
+
+	seq_printf(m, "vfio-device-syspath: /sys%s\n", path);
+	kfree(path);
+}
+#endif
+
 const struct file_operations vfio_device_fops = {
 	.owner = THIS_MODULE,
 	.open = vfio_device_fops_cdev_open,
···
 	.unlocked_ioctl	= vfio_device_fops_unl_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.mmap		= vfio_device_fops_mmap,
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= vfio_device_show_fdinfo,
+#endif
 };
 
 static struct vfio_device *vfio_device_from_file(struct file *file)
+101
tools/arch/x86/include/asm/io.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_ASM_X86_IO_H
+#define _TOOLS_ASM_X86_IO_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include "special_insns.h"
+
+#define build_mmio_read(name, size, type, reg, barrier) \
+static inline type name(const volatile void __iomem *addr) \
+{ type ret; asm volatile("mov" size " %1,%0":reg (ret) \
+:"m" (*(volatile type __force *)addr) barrier); return ret; }
+
+#define build_mmio_write(name, size, type, reg, barrier) \
+static inline void name(type val, volatile void __iomem *addr) \
+{ asm volatile("mov" size " %0,%1": :reg (val), \
+"m" (*(volatile type __force *)addr) barrier); }
+
+build_mmio_read(readb, "b", unsigned char, "=q", :"memory")
+build_mmio_read(readw, "w", unsigned short, "=r", :"memory")
+build_mmio_read(readl, "l", unsigned int, "=r", :"memory")
+
+build_mmio_read(__readb, "b", unsigned char, "=q", )
+build_mmio_read(__readw, "w", unsigned short, "=r", )
+build_mmio_read(__readl, "l", unsigned int, "=r", )
+
+build_mmio_write(writeb, "b", unsigned char, "q", :"memory")
+build_mmio_write(writew, "w", unsigned short, "r", :"memory")
+build_mmio_write(writel, "l", unsigned int, "r", :"memory")
+
+build_mmio_write(__writeb, "b", unsigned char, "q", )
+build_mmio_write(__writew, "w", unsigned short, "r", )
+build_mmio_write(__writel, "l", unsigned int, "r", )
+
+#define readb readb
+#define readw readw
+#define readl readl
+#define readb_relaxed(a) __readb(a)
+#define readw_relaxed(a) __readw(a)
+#define readl_relaxed(a) __readl(a)
+#define __raw_readb __readb
+#define __raw_readw __readw
+#define __raw_readl __readl
+
+#define writeb writeb
+#define writew writew
+#define writel writel
+#define writeb_relaxed(v, a) __writeb(v, a)
+#define writew_relaxed(v, a) __writew(v, a)
+#define writel_relaxed(v, a) __writel(v, a)
+#define __raw_writeb __writeb
+#define __raw_writew __writew
+#define __raw_writel __writel
+
+#ifdef __x86_64__
+
+build_mmio_read(readq, "q", u64, "=r", :"memory")
+build_mmio_read(__readq, "q", u64, "=r", )
+build_mmio_write(writeq, "q", u64, "r", :"memory")
+build_mmio_write(__writeq, "q", u64, "r", )
+
+#define readq_relaxed(a)	__readq(a)
+#define writeq_relaxed(v, a)	__writeq(v, a)
+
+#define __raw_readq		__readq
+#define __raw_writeq		__writeq
+
+/* Let people know that we have them */
+#define readq			readq
+#define writeq			writeq
+
+#endif /* __x86_64__ */
+
+#include <asm-generic/io.h>
+
+/**
+ * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
+ * @dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: source
+ * @count: number of 512 bits quantities to submit
+ *
+ * Submit data from kernel space to MMIO space, in units of 512 bits at a
+ * time.  Order of access is not guaranteed, nor is a memory barrier
+ * performed afterwards.
+ *
+ * Warning: Do not use this helper unless your driver has checked that the CPU
+ * instruction is supported on the platform.
+ */
+static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
+				    size_t count)
+{
+	const u8 *from = src;
+	const u8 *end = from + count * 64;
+
+	while (from < end) {
+		movdir64b(dst, from);
+		from += 64;
+	}
+}
+
+#endif /* _TOOLS_ASM_X86_IO_H */
+27
tools/arch/x86/include/asm/special_insns.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_ASM_X86_SPECIAL_INSNS_H
+#define _TOOLS_ASM_X86_SPECIAL_INSNS_H
+
+/* The dst parameter must be 64-bytes aligned */
+static inline void movdir64b(void *dst, const void *src)
+{
+	const struct { char _[64]; } *__src = src;
+	struct { char _[64]; } *__dst = dst;
+
+	/*
+	 * MOVDIR64B %(rdx), rax.
+	 *
+	 * Both __src and __dst must be memory constraints in order to tell the
+	 * compiler that no other memory accesses should be reordered around
+	 * this one.
+	 *
+	 * Also, both must be supplied as lvalues because this tells
+	 * the compiler what the object is (its size) the instruction accesses.
+	 * I.e., not the pointers but what they point to, thus the deref'ing '*'.
+	 */
+	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
+		     : "+m" (*__dst)
+		     : "m" (*__src), "a" (__dst), "d" (__src));
+}
+
+#endif /* _TOOLS_ASM_X86_SPECIAL_INSNS_H */
+482
tools/include/asm-generic/io.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef _TOOLS_ASM_GENERIC_IO_H 3 + #define _TOOLS_ASM_GENERIC_IO_H 4 + 5 + #include <asm/barrier.h> 6 + #include <asm/byteorder.h> 7 + 8 + #include <linux/compiler.h> 9 + #include <linux/kernel.h> 10 + #include <linux/types.h> 11 + 12 + #ifndef mmiowb_set_pending 13 + #define mmiowb_set_pending() do { } while (0) 14 + #endif 15 + 16 + #ifndef __io_br 17 + #define __io_br() barrier() 18 + #endif 19 + 20 + /* prevent prefetching of coherent DMA data ahead of a dma-complete */ 21 + #ifndef __io_ar 22 + #ifdef rmb 23 + #define __io_ar(v) rmb() 24 + #else 25 + #define __io_ar(v) barrier() 26 + #endif 27 + #endif 28 + 29 + /* flush writes to coherent DMA data before possibly triggering a DMA read */ 30 + #ifndef __io_bw 31 + #ifdef wmb 32 + #define __io_bw() wmb() 33 + #else 34 + #define __io_bw() barrier() 35 + #endif 36 + #endif 37 + 38 + /* serialize device access against a spin_unlock, usually handled there. */ 39 + #ifndef __io_aw 40 + #define __io_aw() mmiowb_set_pending() 41 + #endif 42 + 43 + #ifndef __io_pbw 44 + #define __io_pbw() __io_bw() 45 + #endif 46 + 47 + #ifndef __io_paw 48 + #define __io_paw() __io_aw() 49 + #endif 50 + 51 + #ifndef __io_pbr 52 + #define __io_pbr() __io_br() 53 + #endif 54 + 55 + #ifndef __io_par 56 + #define __io_par(v) __io_ar(v) 57 + #endif 58 + 59 + #ifndef _THIS_IP_ 60 + #define _THIS_IP_ 0 61 + #endif 62 + 63 + static inline void log_write_mmio(u64 val, u8 width, volatile void __iomem *addr, 64 + unsigned long caller_addr, unsigned long caller_addr0) {} 65 + static inline void log_post_write_mmio(u64 val, u8 width, volatile void __iomem *addr, 66 + unsigned long caller_addr, unsigned long caller_addr0) {} 67 + static inline void log_read_mmio(u8 width, const volatile void __iomem *addr, 68 + unsigned long caller_addr, unsigned long caller_addr0) {} 69 + static inline void log_post_read_mmio(u64 val, u8 width, const volatile void __iomem *addr, 70 + unsigned long caller_addr, 
unsigned long caller_addr0) {} 71 + 72 + /* 73 + * __raw_{read,write}{b,w,l,q}() access memory in native endianness. 74 + * 75 + * On some architectures memory mapped IO needs to be accessed differently. 76 + * On the simple architectures, we just read/write the memory location 77 + * directly. 78 + */ 79 + 80 + #ifndef __raw_readb 81 + #define __raw_readb __raw_readb 82 + static inline u8 __raw_readb(const volatile void __iomem *addr) 83 + { 84 + return *(const volatile u8 __force *)addr; 85 + } 86 + #endif 87 + 88 + #ifndef __raw_readw 89 + #define __raw_readw __raw_readw 90 + static inline u16 __raw_readw(const volatile void __iomem *addr) 91 + { 92 + return *(const volatile u16 __force *)addr; 93 + } 94 + #endif 95 + 96 + #ifndef __raw_readl 97 + #define __raw_readl __raw_readl 98 + static inline u32 __raw_readl(const volatile void __iomem *addr) 99 + { 100 + return *(const volatile u32 __force *)addr; 101 + } 102 + #endif 103 + 104 + #ifndef __raw_readq 105 + #define __raw_readq __raw_readq 106 + static inline u64 __raw_readq(const volatile void __iomem *addr) 107 + { 108 + return *(const volatile u64 __force *)addr; 109 + } 110 + #endif 111 + 112 + #ifndef __raw_writeb 113 + #define __raw_writeb __raw_writeb 114 + static inline void __raw_writeb(u8 value, volatile void __iomem *addr) 115 + { 116 + *(volatile u8 __force *)addr = value; 117 + } 118 + #endif 119 + 120 + #ifndef __raw_writew 121 + #define __raw_writew __raw_writew 122 + static inline void __raw_writew(u16 value, volatile void __iomem *addr) 123 + { 124 + *(volatile u16 __force *)addr = value; 125 + } 126 + #endif 127 + 128 + #ifndef __raw_writel 129 + #define __raw_writel __raw_writel 130 + static inline void __raw_writel(u32 value, volatile void __iomem *addr) 131 + { 132 + *(volatile u32 __force *)addr = value; 133 + } 134 + #endif 135 + 136 + #ifndef __raw_writeq 137 + #define __raw_writeq __raw_writeq 138 + static inline void __raw_writeq(u64 value, volatile void __iomem *addr) 139 + { 140 + 
*(volatile u64 __force *)addr = value; 141 + } 142 + #endif 143 + 144 + /* 145 + * {read,write}{b,w,l,q}() access little endian memory and return result in 146 + * native endianness. 147 + */ 148 + 149 + #ifndef readb 150 + #define readb readb 151 + static inline u8 readb(const volatile void __iomem *addr) 152 + { 153 + u8 val; 154 + 155 + log_read_mmio(8, addr, _THIS_IP_, _RET_IP_); 156 + __io_br(); 157 + val = __raw_readb(addr); 158 + __io_ar(val); 159 + log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_); 160 + return val; 161 + } 162 + #endif 163 + 164 + #ifndef readw 165 + #define readw readw 166 + static inline u16 readw(const volatile void __iomem *addr) 167 + { 168 + u16 val; 169 + 170 + log_read_mmio(16, addr, _THIS_IP_, _RET_IP_); 171 + __io_br(); 172 + val = __le16_to_cpu((__le16 __force)__raw_readw(addr)); 173 + __io_ar(val); 174 + log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_); 175 + return val; 176 + } 177 + #endif 178 + 179 + #ifndef readl 180 + #define readl readl 181 + static inline u32 readl(const volatile void __iomem *addr) 182 + { 183 + u32 val; 184 + 185 + log_read_mmio(32, addr, _THIS_IP_, _RET_IP_); 186 + __io_br(); 187 + val = __le32_to_cpu((__le32 __force)__raw_readl(addr)); 188 + __io_ar(val); 189 + log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_); 190 + return val; 191 + } 192 + #endif 193 + 194 + #ifndef readq 195 + #define readq readq 196 + static inline u64 readq(const volatile void __iomem *addr) 197 + { 198 + u64 val; 199 + 200 + log_read_mmio(64, addr, _THIS_IP_, _RET_IP_); 201 + __io_br(); 202 + val = __le64_to_cpu((__le64 __force)__raw_readq(addr)); 203 + __io_ar(val); 204 + log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_); 205 + return val; 206 + } 207 + #endif 208 + 209 + #ifndef writeb 210 + #define writeb writeb 211 + static inline void writeb(u8 value, volatile void __iomem *addr) 212 + { 213 + log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); 214 + __io_bw(); 215 + __raw_writeb(value, addr); 216 + 
__io_aw(); 217 + log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); 218 + } 219 + #endif 220 + 221 + #ifndef writew 222 + #define writew writew 223 + static inline void writew(u16 value, volatile void __iomem *addr) 224 + { 225 + log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); 226 + __io_bw(); 227 + __raw_writew((u16 __force)cpu_to_le16(value), addr); 228 + __io_aw(); 229 + log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); 230 + } 231 + #endif 232 + 233 + #ifndef writel 234 + #define writel writel 235 + static inline void writel(u32 value, volatile void __iomem *addr) 236 + { 237 + log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); 238 + __io_bw(); 239 + __raw_writel((u32 __force)__cpu_to_le32(value), addr); 240 + __io_aw(); 241 + log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); 242 + } 243 + #endif 244 + 245 + #ifndef writeq 246 + #define writeq writeq 247 + static inline void writeq(u64 value, volatile void __iomem *addr) 248 + { 249 + log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); 250 + __io_bw(); 251 + __raw_writeq((u64 __force)__cpu_to_le64(value), addr); 252 + __io_aw(); 253 + log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); 254 + } 255 + #endif 256 + 257 + /* 258 + * {read,write}{b,w,l,q}_relaxed() are like the regular version, but 259 + * are not guaranteed to provide ordering against spinlocks or memory 260 + * accesses. 
 */
#ifndef readb_relaxed
#define readb_relaxed readb_relaxed
static inline u8 readb_relaxed(const volatile void __iomem *addr)
{
	u8 val;

	log_read_mmio(8, addr, _THIS_IP_, _RET_IP_);
	val = __raw_readb(addr);
	log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_);
	return val;
}
#endif

#ifndef readw_relaxed
#define readw_relaxed readw_relaxed
static inline u16 readw_relaxed(const volatile void __iomem *addr)
{
	u16 val;

	log_read_mmio(16, addr, _THIS_IP_, _RET_IP_);
	val = __le16_to_cpu((__le16 __force)__raw_readw(addr));
	log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_);
	return val;
}
#endif

#ifndef readl_relaxed
#define readl_relaxed readl_relaxed
static inline u32 readl_relaxed(const volatile void __iomem *addr)
{
	u32 val;

	log_read_mmio(32, addr, _THIS_IP_, _RET_IP_);
	val = __le32_to_cpu((__le32 __force)__raw_readl(addr));
	log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_);
	return val;
}
#endif

#if defined(readq) && !defined(readq_relaxed)
#define readq_relaxed readq_relaxed
static inline u64 readq_relaxed(const volatile void __iomem *addr)
{
	u64 val;

	log_read_mmio(64, addr, _THIS_IP_, _RET_IP_);
	val = __le64_to_cpu((__le64 __force)__raw_readq(addr));
	log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_);
	return val;
}
#endif

#ifndef writeb_relaxed
#define writeb_relaxed writeb_relaxed
static inline void writeb_relaxed(u8 value, volatile void __iomem *addr)
{
	log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
	__raw_writeb(value, addr);
	log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_);
}
#endif

#ifndef writew_relaxed
#define writew_relaxed writew_relaxed
static inline void writew_relaxed(u16 value, volatile void __iomem *addr)
{
	log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
	__raw_writew((u16 __force)cpu_to_le16(value), addr);
	log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_);
}
#endif

#ifndef writel_relaxed
#define writel_relaxed writel_relaxed
static inline void writel_relaxed(u32 value, volatile void __iomem *addr)
{
	log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
	__raw_writel((u32 __force)__cpu_to_le32(value), addr);
	log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_);
}
#endif

#if defined(writeq) && !defined(writeq_relaxed)
#define writeq_relaxed writeq_relaxed
static inline void writeq_relaxed(u64 value, volatile void __iomem *addr)
{
	log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
	__raw_writeq((u64 __force)__cpu_to_le64(value), addr);
	log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_);
}
#endif

/*
 * {read,write}s{b,w,l,q}() repeatedly access the same memory address in
 * native endianness in 8-, 16-, 32- or 64-bit chunks (@count times).
 */
#ifndef readsb
#define readsb readsb
static inline void readsb(const volatile void __iomem *addr, void *buffer,
			  unsigned int count)
{
	if (count) {
		u8 *buf = buffer;

		do {
			u8 x = __raw_readb(addr);
			*buf++ = x;
		} while (--count);
	}
}
#endif

#ifndef readsw
#define readsw readsw
static inline void readsw(const volatile void __iomem *addr, void *buffer,
			  unsigned int count)
{
	if (count) {
		u16 *buf = buffer;

		do {
			u16 x = __raw_readw(addr);
			*buf++ = x;
		} while (--count);
	}
}
#endif

#ifndef readsl
#define readsl readsl
static inline void readsl(const volatile void __iomem *addr, void *buffer,
			  unsigned int count)
{
	if (count) {
		u32 *buf = buffer;

		do {
			u32 x = __raw_readl(addr);
			*buf++ = x;
		} while (--count);
	}
}
#endif

#ifndef readsq
#define readsq readsq
static inline void readsq(const volatile void __iomem *addr, void *buffer,
			  unsigned int count)
{
	if (count) {
		u64 *buf = buffer;

		do {
			u64 x = __raw_readq(addr);
			*buf++ = x;
		} while (--count);
	}
}
#endif

#ifndef writesb
#define writesb writesb
static inline void writesb(volatile void __iomem *addr, const void *buffer,
			   unsigned int count)
{
	if (count) {
		const u8 *buf = buffer;

		do {
			__raw_writeb(*buf++, addr);
		} while (--count);
	}
}
#endif

#ifndef writesw
#define writesw writesw
static inline void writesw(volatile void __iomem *addr, const void *buffer,
			   unsigned int count)
{
	if (count) {
		const u16 *buf = buffer;

		do {
			__raw_writew(*buf++, addr);
		} while (--count);
	}
}
#endif

#ifndef writesl
#define writesl writesl
static inline void writesl(volatile void __iomem *addr, const void *buffer,
			   unsigned int count)
{
	if (count) {
		const u32 *buf = buffer;

		do {
			__raw_writel(*buf++, addr);
		} while (--count);
	}
}
#endif

#ifndef writesq
#define writesq writesq
static inline void writesq(volatile void __iomem *addr, const void *buffer,
			   unsigned int count)
{
	if (count) {
		const u64 *buf = buffer;

		do {
			__raw_writeq(*buf++, addr);
		} while (--count);
	}
}
#endif

#endif /* _TOOLS_ASM_GENERIC_IO_H */
tools/include/asm/io.h (new file, +11)

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _TOOLS_ASM_IO_H
#define _TOOLS_ASM_IO_H

#if defined(__i386__) || defined(__x86_64__)
#include "../../arch/x86/include/asm/io.h"
#else
#include <asm-generic/io.h>
#endif

#endif /* _TOOLS_ASM_IO_H */
tools/include/linux/compiler.h (+4)

 # define __force
 #endif

+#ifndef __iomem
+# define __iomem
+#endif
+
 #ifndef __weak
 # define __weak __attribute__((weak))
 #endif
tools/include/linux/io.h (+3 -1)

 #ifndef _TOOLS_IO_H
 #define _TOOLS_IO_H

-#endif
+#include <asm/io.h>
+
+#endif /* _TOOLS_IO_H */
tools/testing/selftests/Makefile (+1)

 TARGETS += user_events
 TARGETS += vDSO
 TARGETS += mm
+TARGETS += vfio
 TARGETS += x86
 TARGETS += x86/bugs
 TARGETS += zram
tools/testing/selftests/vfio/.gitignore (new file, +10)

# SPDX-License-Identifier: GPL-2.0-only
*
!/**/
!*.c
!*.h
!*.S
!*.sh
!*.mk
!.gitignore
!Makefile
tools/testing/selftests/vfio/Makefile (new file, +21)

CFLAGS = $(KHDR_INCLUDES)
TEST_GEN_PROGS += vfio_dma_mapping_test
TEST_GEN_PROGS += vfio_iommufd_setup_test
TEST_GEN_PROGS += vfio_pci_device_test
TEST_GEN_PROGS += vfio_pci_driver_test
TEST_PROGS_EXTENDED := run.sh
include ../lib.mk
include lib/libvfio.mk

CFLAGS += -I$(top_srcdir)/tools/include
CFLAGS += -MD
CFLAGS += $(EXTRA_CFLAGS)

$(TEST_GEN_PROGS): %: %.o $(LIBVFIO_O)
	$(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $< $(LIBVFIO_O) $(LDLIBS) -o $@

TEST_GEN_PROGS_O = $(patsubst %, %.o, $(TEST_GEN_PROGS))
TEST_DEP_FILES = $(patsubst %.o, %.d, $(TEST_GEN_PROGS_O) $(LIBVFIO_O))
-include $(TEST_DEP_FILES)

EXTRA_CLEAN += $(TEST_GEN_PROGS_O) $(TEST_DEP_FILES)
tools/testing/selftests/vfio/lib/drivers/dsa/dsa.c (new file, +416)

// SPDX-License-Identifier: GPL-2.0-only
#include <stdint.h>
#include <unistd.h>

#include <linux/bits.h>
#include <linux/errno.h>
#include <linux/idxd.h>
#include <linux/io.h>
#include <linux/pci_ids.h>
#include <linux/sizes.h>

#include <vfio_util.h>

#include "registers.h"

/* Vectors 1+ are available for work queue completion interrupts. */
#define MSIX_VECTOR 1

struct dsa_state {
	/* Descriptors for copy and batch operations. */
	struct dsa_hw_desc batch[32];
	struct dsa_hw_desc copy[1024];

	/* Completion records for copy and batch operations. */
	struct dsa_completion_record copy_completion;
	struct dsa_completion_record batch_completion;

	/* Cached device registers (and derived data) for easy access */
	union gen_cap_reg gen_cap;
	union wq_cap_reg wq_cap;
	union group_cap_reg group_cap;
	union engine_cap_reg engine_cap;
	union offsets_reg table_offsets;
	void *wqcfg_table;
	void *grpcfg_table;
	u64 max_batches;
	u64 max_copies_per_batch;

	/* The number of ongoing memcpy operations. */
	u64 memcpy_count;

	/* Buffers used by dsa_send_msi() to generate an interrupt */
	u64 send_msi_src;
	u64 send_msi_dst;
};

static inline struct dsa_state *to_dsa_state(struct vfio_pci_device *device)
{
	return device->driver.region.vaddr;
}

static bool dsa_int_handle_request_required(struct vfio_pci_device *device)
{
	void *bar0 = device->bars[0].vaddr;
	union gen_cap_reg gen_cap;
	u32 cmd_cap;

	gen_cap.bits = readq(bar0 + IDXD_GENCAP_OFFSET);
	if (!gen_cap.cmd_cap)
		return false;

	cmd_cap = readl(bar0 + IDXD_CMDCAP_OFFSET);
	return (cmd_cap >> IDXD_CMD_REQUEST_INT_HANDLE) & 1;
}

static int dsa_probe(struct vfio_pci_device *device)
{
	if (!vfio_pci_device_match(device, PCI_VENDOR_ID_INTEL,
				   PCI_DEVICE_ID_INTEL_DSA_SPR0))
		return -EINVAL;

	if (dsa_int_handle_request_required(device)) {
		printf("Device requires requesting interrupt handles\n");
		return -EINVAL;
	}

	return 0;
}

static void dsa_check_sw_err(struct vfio_pci_device *device)
{
	void *reg = device->bars[0].vaddr + IDXD_SWERR_OFFSET;
	union sw_err_reg err = {};
	int i;

	for (i = 0; i < ARRAY_SIZE(err.bits); i++) {
		err.bits[i] = readq(reg + offsetof(union sw_err_reg, bits[i]));

		/* No errors */
		if (i == 0 && !err.valid)
			return;
	}

	fprintf(stderr, "SWERR: 0x%016lx 0x%016lx 0x%016lx 0x%016lx\n",
		err.bits[0], err.bits[1], err.bits[2], err.bits[3]);

	fprintf(stderr, " valid: 0x%x\n", err.valid);
	fprintf(stderr, " overflow: 0x%x\n", err.overflow);
	fprintf(stderr, " desc_valid: 0x%x\n", err.desc_valid);
	fprintf(stderr, " wq_idx_valid: 0x%x\n", err.wq_idx_valid);
	fprintf(stderr, " batch: 0x%x\n", err.batch);
	fprintf(stderr, " fault_rw: 0x%x\n", err.fault_rw);
	fprintf(stderr, " priv: 0x%x\n", err.priv);
	fprintf(stderr, " error: 0x%x\n", err.error);
	fprintf(stderr, " wq_idx: 0x%x\n", err.wq_idx);
	fprintf(stderr, " operation: 0x%x\n", err.operation);
	fprintf(stderr, " pasid: 0x%x\n", err.pasid);
	fprintf(stderr, " batch_idx: 0x%x\n", err.batch_idx);
	fprintf(stderr, " invalid_flags: 0x%x\n", err.invalid_flags);
	fprintf(stderr, " fault_addr: 0x%lx\n", err.fault_addr);

	VFIO_FAIL("Software Error Detected!\n");
}

static void dsa_command(struct vfio_pci_device *device, u32 cmd)
{
	union idxd_command_reg cmd_reg = { .cmd = cmd };
	u32 sleep_ms = 1, attempts = 5000 / sleep_ms;
	void *bar0 = device->bars[0].vaddr;
	u32 status;
	u8 err;

	writel(cmd_reg.bits, bar0 + IDXD_CMD_OFFSET);

	for (;;) {
		dsa_check_sw_err(device);

		status = readl(bar0 + IDXD_CMDSTS_OFFSET);
		if (!(status & IDXD_CMDSTS_ACTIVE))
			break;

		VFIO_ASSERT_GT(--attempts, 0);
		usleep(sleep_ms * 1000);
	}

	err = status & IDXD_CMDSTS_ERR_MASK;
	VFIO_ASSERT_EQ(err, 0, "Error issuing command 0x%x: 0x%x\n", cmd, err);
}

static void dsa_wq_init(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);
	union wq_cap_reg wq_cap = dsa->wq_cap;
	union wqcfg wqcfg;
	u64 wqcfg_size;
	int i;

	VFIO_ASSERT_GT((u32)wq_cap.num_wqs, 0);

	wqcfg = (union wqcfg) {
		.wq_size = wq_cap.total_wq_size,
		.mode = 1,
		.priority = 1,
		/*
		 * Disable Address Translation Service (if enabled) so that VFIO
		 * selftests using this driver can generate I/O page faults.
		 */
		.wq_ats_disable = wq_cap.wq_ats_support,
		.max_xfer_shift = dsa->gen_cap.max_xfer_shift,
		.max_batch_shift = dsa->gen_cap.max_batch_shift,
		.op_config[0] = BIT(DSA_OPCODE_MEMMOVE) | BIT(DSA_OPCODE_BATCH),
	};

	wqcfg_size = 1UL << (wq_cap.wqcfg_size + IDXD_WQCFG_MIN);

	for (i = 0; i < wqcfg_size / sizeof(wqcfg.bits[0]); i++)
		writel(wqcfg.bits[i], dsa->wqcfg_table + offsetof(union wqcfg, bits[i]));
}

static void dsa_group_init(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);
	union group_cap_reg group_cap = dsa->group_cap;
	union engine_cap_reg engine_cap = dsa->engine_cap;

	VFIO_ASSERT_GT((u32)group_cap.num_groups, 0);
	VFIO_ASSERT_GT((u32)engine_cap.num_engines, 0);

	/* Assign work queue 0 and engine 0 to group 0 */
	writeq(1, dsa->grpcfg_table + offsetof(struct grpcfg, wqs[0]));
	writeq(1, dsa->grpcfg_table + offsetof(struct grpcfg, engines));
}

static void dsa_register_cache_init(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);
	void *bar0 = device->bars[0].vaddr;

	dsa->gen_cap.bits = readq(bar0 + IDXD_GENCAP_OFFSET);
	dsa->wq_cap.bits = readq(bar0 + IDXD_WQCAP_OFFSET);
	dsa->group_cap.bits = readq(bar0 + IDXD_GRPCAP_OFFSET);
	dsa->engine_cap.bits = readq(bar0 + IDXD_ENGCAP_OFFSET);

	dsa->table_offsets.bits[0] = readq(bar0 + IDXD_TABLE_OFFSET);
	dsa->table_offsets.bits[1] = readq(bar0 + IDXD_TABLE_OFFSET + 8);

	dsa->wqcfg_table = bar0 + dsa->table_offsets.wqcfg * IDXD_TABLE_MULT;
	dsa->grpcfg_table = bar0 + dsa->table_offsets.grpcfg * IDXD_TABLE_MULT;

	dsa->max_batches = 1U << (dsa->wq_cap.total_wq_size + IDXD_WQCFG_MIN);
	dsa->max_batches = min(dsa->max_batches, ARRAY_SIZE(dsa->batch));

	dsa->max_copies_per_batch = 1UL << dsa->gen_cap.max_batch_shift;
	dsa->max_copies_per_batch = min(dsa->max_copies_per_batch, ARRAY_SIZE(dsa->copy));
}

static void dsa_init(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);

	VFIO_ASSERT_GE(device->driver.region.size, sizeof(*dsa));

	vfio_pci_config_writew(device, PCI_COMMAND,
			       PCI_COMMAND_MEMORY |
			       PCI_COMMAND_MASTER |
			       PCI_COMMAND_INTX_DISABLE);

	dsa_command(device, IDXD_CMD_RESET_DEVICE);

	dsa_register_cache_init(device);
	dsa_wq_init(device);
	dsa_group_init(device);

	dsa_command(device, IDXD_CMD_ENABLE_DEVICE);
	dsa_command(device, IDXD_CMD_ENABLE_WQ);

	vfio_pci_msix_enable(device, MSIX_VECTOR, 1);

	device->driver.max_memcpy_count =
		dsa->max_batches * dsa->max_copies_per_batch;
	device->driver.max_memcpy_size = 1UL << dsa->gen_cap.max_xfer_shift;
	device->driver.msi = MSIX_VECTOR;
}

static void dsa_remove(struct vfio_pci_device *device)
{
	dsa_command(device, IDXD_CMD_RESET_DEVICE);
	vfio_pci_msix_disable(device);
}

static int dsa_completion_wait(struct vfio_pci_device *device,
			       struct dsa_completion_record *completion)
{
	u8 status;

	for (;;) {
		dsa_check_sw_err(device);

		status = READ_ONCE(completion->status);
		if (status)
			break;

		usleep(1000);
	}

	if (status == DSA_COMP_SUCCESS)
		return 0;

	printf("Error detected during memcpy operation: 0x%x\n", status);
	return -1;
}

static void dsa_copy_desc_init(struct vfio_pci_device *device,
			       struct dsa_hw_desc *desc,
			       iova_t src, iova_t dst, u64 size,
			       bool interrupt)
{
	struct dsa_state *dsa = to_dsa_state(device);
	u16 flags;

	flags = IDXD_OP_FLAG_CRAV | IDXD_OP_FLAG_RCR;

	if (interrupt)
		flags |= IDXD_OP_FLAG_RCI;

	*desc = (struct dsa_hw_desc) {
		.opcode = DSA_OPCODE_MEMMOVE,
		.flags = flags,
		.priv = 1,
		.src_addr = src,
		.dst_addr = dst,
		.xfer_size = size,
		.completion_addr = to_iova(device, &dsa->copy_completion),
		.int_handle = interrupt ? MSIX_VECTOR : 0,
	};
}

static void dsa_batch_desc_init(struct vfio_pci_device *device,
				struct dsa_hw_desc *desc,
				u64 count)
{
	struct dsa_state *dsa = to_dsa_state(device);

	*desc = (struct dsa_hw_desc) {
		.opcode = DSA_OPCODE_BATCH,
		.flags = IDXD_OP_FLAG_CRAV,
		.priv = 1,
		.completion_addr = to_iova(device, &dsa->batch_completion),
		.desc_list_addr = to_iova(device, &dsa->copy[0]),
		.desc_count = count,
	};
}

static void dsa_desc_write(struct vfio_pci_device *device, struct dsa_hw_desc *desc)
{
	/* Write the contents (not address) of the 64-byte descriptor to the device. */
	iosubmit_cmds512(device->bars[2].vaddr, desc, 1);
}

static void dsa_memcpy_one(struct vfio_pci_device *device,
			   iova_t src, iova_t dst, u64 size, bool interrupt)
{
	struct dsa_state *dsa = to_dsa_state(device);

	memset(&dsa->copy_completion, 0, sizeof(dsa->copy_completion));

	dsa_copy_desc_init(device, &dsa->copy[0], src, dst, size, interrupt);
	dsa_desc_write(device, &dsa->copy[0]);
}

static void dsa_memcpy_batch(struct vfio_pci_device *device,
			     iova_t src, iova_t dst, u64 size, u64 count)
{
	struct dsa_state *dsa = to_dsa_state(device);
	int i;

	memset(&dsa->batch_completion, 0, sizeof(dsa->batch_completion));

	for (i = 0; i < ARRAY_SIZE(dsa->copy); i++) {
		struct dsa_hw_desc *copy_desc = &dsa->copy[i];

		dsa_copy_desc_init(device, copy_desc, src, dst, size, false);

		/* Don't request completions for individual copies. */
		copy_desc->flags &= ~IDXD_OP_FLAG_RCR;
	}

	for (i = 0; i < ARRAY_SIZE(dsa->batch) && count; i++) {
		struct dsa_hw_desc *batch_desc = &dsa->batch[i];
		int nr_copies;

		nr_copies = min(count, dsa->max_copies_per_batch);
		count -= nr_copies;

		/*
		 * Batches must have at least 2 copies, so handle the case where
		 * there is exactly 1 copy left by doing one less copy in this
		 * batch and then 2 in the next.
		 */
		if (count == 1) {
			nr_copies--;
			count++;
		}

		dsa_batch_desc_init(device, batch_desc, nr_copies);

		/* Request a completion for the last batch. */
		if (!count)
			batch_desc->flags |= IDXD_OP_FLAG_RCR;

		dsa_desc_write(device, batch_desc);
	}

	VFIO_ASSERT_EQ(count, 0, "Failed to start %lu copies.\n", count);
}

static void dsa_memcpy_start(struct vfio_pci_device *device,
			     iova_t src, iova_t dst, u64 size, u64 count)
{
	struct dsa_state *dsa = to_dsa_state(device);

	/* DSA devices require at least 2 copies per batch. */
	if (count == 1)
		dsa_memcpy_one(device, src, dst, size, false);
	else
		dsa_memcpy_batch(device, src, dst, size, count);

	dsa->memcpy_count = count;
}

static int dsa_memcpy_wait(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);
	int r;

	if (dsa->memcpy_count == 1)
		r = dsa_completion_wait(device, &dsa->copy_completion);
	else
		r = dsa_completion_wait(device, &dsa->batch_completion);

	dsa->memcpy_count = 0;

	return r;
}

static void dsa_send_msi(struct vfio_pci_device *device)
{
	struct dsa_state *dsa = to_dsa_state(device);

	dsa_memcpy_one(device,
		       to_iova(device, &dsa->send_msi_src),
		       to_iova(device, &dsa->send_msi_dst),
		       sizeof(dsa->send_msi_src), true);

	VFIO_ASSERT_EQ(dsa_completion_wait(device, &dsa->copy_completion), 0);
}

const struct vfio_pci_driver_ops dsa_ops = {
	.name = "dsa",
	.probe = dsa_probe,
	.init = dsa_init,
	.remove = dsa_remove,
	.memcpy_start = dsa_memcpy_start,
	.memcpy_wait = dsa_memcpy_wait,
	.send_msi = dsa_send_msi,
};
tools/testing/selftests/vfio/lib/drivers/ioat/ioat.c (new file, +235)

// SPDX-License-Identifier: GPL-2.0-only
#include <stdint.h>
#include <unistd.h>

#include <linux/errno.h>
#include <linux/io.h>
#include <linux/pci_ids.h>
#include <linux/sizes.h>

#include <vfio_util.h>

#include "hw.h"
#include "registers.h"

#define IOAT_DMACOUNT_MAX UINT16_MAX

struct ioat_state {
	/* Single descriptor used to issue DMA memcpy operations */
	struct ioat_dma_descriptor desc;

	/* Copy buffers used by ioat_send_msi() to generate an interrupt. */
	u64 send_msi_src;
	u64 send_msi_dst;
};

static inline struct ioat_state *to_ioat_state(struct vfio_pci_device *device)
{
	return device->driver.region.vaddr;
}

static inline void *ioat_channel_registers(struct vfio_pci_device *device)
{
	return device->bars[0].vaddr + IOAT_CHANNEL_MMIO_SIZE;
}

static int ioat_probe(struct vfio_pci_device *device)
{
	u8 version;
	int r;

	if (!vfio_pci_device_match(device, PCI_VENDOR_ID_INTEL,
				   PCI_DEVICE_ID_INTEL_IOAT_SKX))
		return -EINVAL;

	VFIO_ASSERT_NOT_NULL(device->bars[0].vaddr);

	version = readb(device->bars[0].vaddr + IOAT_VER_OFFSET);
	switch (version) {
	case IOAT_VER_3_2:
	case IOAT_VER_3_3:
		r = 0;
		break;
	default:
		printf("ioat: Unsupported version: 0x%x\n", version);
		r = -EINVAL;
	}
	return r;
}

static u64 ioat_channel_status(void *bar)
{
	return readq(bar + IOAT_CHANSTS_OFFSET) & IOAT_CHANSTS_STATUS;
}

static void ioat_clear_errors(struct vfio_pci_device *device)
{
	void *registers = ioat_channel_registers(device);
	u32 errors;

	errors = vfio_pci_config_readl(device, IOAT_PCI_CHANERR_INT_OFFSET);
	vfio_pci_config_writel(device, IOAT_PCI_CHANERR_INT_OFFSET, errors);

	errors = vfio_pci_config_readl(device, IOAT_PCI_DMAUNCERRSTS_OFFSET);
	vfio_pci_config_writel(device, IOAT_PCI_CHANERR_INT_OFFSET, errors);

	errors = readl(registers + IOAT_CHANERR_OFFSET);
	writel(errors, registers + IOAT_CHANERR_OFFSET);
}

static void ioat_reset(struct vfio_pci_device *device)
{
	void *registers = ioat_channel_registers(device);
	u32 sleep_ms = 1, attempts = 5000 / sleep_ms;
	u8 chancmd;

	ioat_clear_errors(device);

	writeb(IOAT_CHANCMD_RESET, registers + IOAT2_CHANCMD_OFFSET);

	for (;;) {
		chancmd = readb(registers + IOAT2_CHANCMD_OFFSET);
		if (!(chancmd & IOAT_CHANCMD_RESET))
			break;

		VFIO_ASSERT_GT(--attempts, 0);
		usleep(sleep_ms * 1000);
	}

	VFIO_ASSERT_EQ(ioat_channel_status(registers), IOAT_CHANSTS_HALTED);
}

static void ioat_init(struct vfio_pci_device *device)
{
	struct ioat_state *ioat = to_ioat_state(device);
	u8 intrctrl;

	VFIO_ASSERT_GE(device->driver.region.size, sizeof(*ioat));

	vfio_pci_config_writew(device, PCI_COMMAND,
			       PCI_COMMAND_MEMORY |
			       PCI_COMMAND_MASTER |
			       PCI_COMMAND_INTX_DISABLE);

	ioat_reset(device);

	/* Enable the use of MSI-X interrupts for channel interrupts. */
	intrctrl = IOAT_INTRCTRL_MSIX_VECTOR_CONTROL;
	writeb(intrctrl, device->bars[0].vaddr + IOAT_INTRCTRL_OFFSET);

	vfio_pci_msix_enable(device, 0, device->msix_info.count);

	device->driver.msi = 0;
	device->driver.max_memcpy_size =
		1UL << readb(device->bars[0].vaddr + IOAT_XFERCAP_OFFSET);
	device->driver.max_memcpy_count = IOAT_DMACOUNT_MAX;
}

static void ioat_remove(struct vfio_pci_device *device)
{
	ioat_reset(device);
	vfio_pci_msix_disable(device);
}

static void ioat_handle_error(struct vfio_pci_device *device)
{
	void *registers = ioat_channel_registers(device);

	printf("Error detected during memcpy operation!\n"
	       " CHANERR: 0x%x\n"
	       " CHANERR_INT: 0x%x\n"
	       " DMAUNCERRSTS: 0x%x\n",
	       readl(registers + IOAT_CHANERR_OFFSET),
	       vfio_pci_config_readl(device, IOAT_PCI_CHANERR_INT_OFFSET),
	       vfio_pci_config_readl(device, IOAT_PCI_DMAUNCERRSTS_OFFSET));

	ioat_reset(device);
}

static int ioat_memcpy_wait(struct vfio_pci_device *device)
{
	void *registers = ioat_channel_registers(device);
	u64 status;
	int r = 0;

	/* Wait until all operations complete. */
	for (;;) {
		status = ioat_channel_status(registers);
		if (status == IOAT_CHANSTS_DONE)
			break;

		if (status == IOAT_CHANSTS_HALTED) {
			ioat_handle_error(device);
			return -1;
		}
	}

	/* Put the channel into the SUSPENDED state. */
	writeb(IOAT_CHANCMD_SUSPEND, registers + IOAT2_CHANCMD_OFFSET);
	for (;;) {
		status = ioat_channel_status(registers);
		if (status == IOAT_CHANSTS_SUSPENDED)
			break;
	}

	return r;
}

static void __ioat_memcpy_start(struct vfio_pci_device *device,
				iova_t src, iova_t dst, u64 size,
				u16 count, bool interrupt)
{
	void *registers = ioat_channel_registers(device);
	struct ioat_state *ioat = to_ioat_state(device);
	u64 desc_iova;
	u16 chanctrl;

	desc_iova = to_iova(device, &ioat->desc);
	ioat->desc = (struct ioat_dma_descriptor) {
		.ctl_f.op = IOAT_OP_COPY,
		.ctl_f.int_en = interrupt,
		.src_addr = src,
		.dst_addr = dst,
		.size = size,
		.next = desc_iova,
	};

	/* Tell the device the address of the descriptor. */
	writeq(desc_iova, registers + IOAT2_CHAINADDR_OFFSET);

	/* (Re)Enable the channel interrupt and abort on any errors */
	chanctrl = IOAT_CHANCTRL_INT_REARM | IOAT_CHANCTRL_ANY_ERR_ABORT_EN;
	writew(chanctrl, registers + IOAT_CHANCTRL_OFFSET);

	/* Kick off @count DMA copy operation(s). */
	writew(count, registers + IOAT_CHAN_DMACOUNT_OFFSET);
}

static void ioat_memcpy_start(struct vfio_pci_device *device,
			      iova_t src, iova_t dst, u64 size,
			      u64 count)
{
	__ioat_memcpy_start(device, src, dst, size, count, false);
}

static void ioat_send_msi(struct vfio_pci_device *device)
{
	struct ioat_state *ioat = to_ioat_state(device);

	__ioat_memcpy_start(device,
			    to_iova(device, &ioat->send_msi_src),
			    to_iova(device, &ioat->send_msi_dst),
			    sizeof(ioat->send_msi_src), 1, true);

	VFIO_ASSERT_EQ(ioat_memcpy_wait(device), 0);
}

const struct vfio_pci_driver_ops ioat_ops = {
	.name = "ioat",
	.probe = ioat_probe,
	.init = ioat_init,
	.remove = ioat_remove,
	.memcpy_start = ioat_memcpy_start,
	.memcpy_wait = ioat_memcpy_wait,
	.send_msi = ioat_send_msi,
};
tools/testing/selftests/vfio/lib/include/vfio_util.h (new file, +295)

/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef SELFTESTS_VFIO_LIB_INCLUDE_VFIO_UTIL_H
#define SELFTESTS_VFIO_LIB_INCLUDE_VFIO_UTIL_H

#include <fcntl.h>
#include <string.h>
#include <linux/vfio.h>
#include <linux/list.h>
#include <linux/pci_regs.h>

#include "../../../kselftest.h"

#define VFIO_LOG_AND_EXIT(...) do {		\
	fprintf(stderr, " " __VA_ARGS__);	\
	fprintf(stderr, "\n");			\
	exit(KSFT_FAIL);			\
} while (0)

#define VFIO_ASSERT_OP(_lhs, _rhs, _op, ...) do {				\
	typeof(_lhs) __lhs = (_lhs);						\
	typeof(_rhs) __rhs = (_rhs);						\
										\
	if (__lhs _op __rhs)							\
		break;								\
										\
	fprintf(stderr, "%s:%u: Assertion Failure\n\n", __FILE__, __LINE__);	\
	fprintf(stderr, " Expression: " #_lhs " " #_op " " #_rhs "\n");		\
	fprintf(stderr, " Observed: %#lx %s %#lx\n",				\
		(u64)__lhs, #_op, (u64)__rhs);					\
	fprintf(stderr, " [errno: %d - %s]\n", errno, strerror(errno));		\
	VFIO_LOG_AND_EXIT(__VA_ARGS__);						\
} while (0)

#define VFIO_ASSERT_EQ(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, ==, ##__VA_ARGS__)
#define VFIO_ASSERT_NE(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, !=, ##__VA_ARGS__)
#define VFIO_ASSERT_LT(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, <, ##__VA_ARGS__)
#define VFIO_ASSERT_LE(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, <=, ##__VA_ARGS__)
#define VFIO_ASSERT_GT(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, >, ##__VA_ARGS__)
#define VFIO_ASSERT_GE(_a, _b, ...) VFIO_ASSERT_OP(_a, _b, >=, ##__VA_ARGS__)
#define VFIO_ASSERT_TRUE(_a, ...) VFIO_ASSERT_NE(false, (_a), ##__VA_ARGS__)
#define VFIO_ASSERT_FALSE(_a, ...) VFIO_ASSERT_EQ(false, (_a), ##__VA_ARGS__)
#define VFIO_ASSERT_NULL(_a, ...) VFIO_ASSERT_EQ(NULL, _a, ##__VA_ARGS__)
#define VFIO_ASSERT_NOT_NULL(_a, ...) VFIO_ASSERT_NE(NULL, _a, ##__VA_ARGS__)

#define VFIO_FAIL(_fmt, ...) do {				\
	fprintf(stderr, "%s:%u: FAIL\n\n", __FILE__, __LINE__);	\
	VFIO_LOG_AND_EXIT(_fmt, ##__VA_ARGS__);			\
} while (0)

struct vfio_iommu_mode {
	const char *name;
	const char *container_path;
	unsigned long iommu_type;
};

/*
 * Generator for VFIO selftests fixture variants that replicate across all
 * possible IOMMU modes. Tests must define FIXTURE_VARIANT_ADD_IOMMU_MODE()
 * which should then use FIXTURE_VARIANT_ADD() to create the variant.
 */
#define FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(...)				\
	FIXTURE_VARIANT_ADD_IOMMU_MODE(vfio_type1_iommu, ##__VA_ARGS__);	\
	FIXTURE_VARIANT_ADD_IOMMU_MODE(vfio_type1v2_iommu, ##__VA_ARGS__);	\
	FIXTURE_VARIANT_ADD_IOMMU_MODE(iommufd_compat_type1, ##__VA_ARGS__);	\
	FIXTURE_VARIANT_ADD_IOMMU_MODE(iommufd_compat_type1v2, ##__VA_ARGS__);	\
	FIXTURE_VARIANT_ADD_IOMMU_MODE(iommufd, ##__VA_ARGS__)

struct vfio_pci_bar {
	struct vfio_region_info info;
	void *vaddr;
};

typedef u64 iova_t;

#define INVALID_IOVA UINT64_MAX

struct vfio_dma_region {
	struct list_head link;
	void *vaddr;
	iova_t iova;
	u64 size;
};

struct vfio_pci_device;

struct vfio_pci_driver_ops {
	const char *name;

	/**
	 * @probe() - Check if the driver supports the given device.
	 *
	 * Return: 0 on success, non-0 on failure.
	 */
	int (*probe)(struct vfio_pci_device *device);

	/**
	 * @init() - Initialize the driver for @device.
	 *
	 * Must be called after device->driver.region has been initialized.
	 */
	void (*init)(struct vfio_pci_device *device);

	/**
	 * remove() - Deinitialize the driver for @device.
	 */
	void (*remove)(struct vfio_pci_device *device);

	/**
	 * memcpy_start() - Kick off @count repeated memcpy operations from
	 * [@src, @src + @size) to [@dst, @dst + @size).
	 *
	 * Guarantees:
	 *  - The device will attempt DMA reads on [src, src + size).
	 *  - The device will attempt DMA writes on [dst, dst + size).
	 *  - The device will not generate any interrupts.
	 *
	 * memcpy_start() returns immediately, it does not wait for the
	 * copies to complete.
	 */
	void (*memcpy_start)(struct vfio_pci_device *device,
			     iova_t src, iova_t dst, u64 size, u64 count);

	/**
	 * memcpy_wait() - Wait until the memcpy operations started by
	 * memcpy_start() have finished.
	 *
	 * Guarantees:
	 *  - All in-flight DMAs initiated by memcpy_start() are fully complete
	 *    before memcpy_wait() returns.
	 *
	 * Returns non-0 if the driver detects that an error occurred during the
	 * memcpy, 0 otherwise.
	 */
	int (*memcpy_wait)(struct vfio_pci_device *device);

	/**
	 * send_msi() - Make the device send the MSI device->driver.msi.
	 *
	 * Guarantees:
	 *  - The device will send the MSI once.
	 */
	void (*send_msi)(struct vfio_pci_device *device);
};

struct vfio_pci_driver {
	const struct vfio_pci_driver_ops *ops;
	bool initialized;
	bool memcpy_in_progress;

	/* Region to be used by the driver (e.g. for in-memory descriptors) */
	struct vfio_dma_region region;

	/* The maximum size that can be passed to memcpy_start(). */
	u64 max_memcpy_size;

	/* The maximum count that can be passed to memcpy_start(). */
	u64 max_memcpy_count;

	/* The MSI vector the device will signal in ops->send_msi(). */
	int msi;
};

struct vfio_pci_device {
	int fd;

	const struct vfio_iommu_mode *iommu_mode;
	int group_fd;
	int container_fd;

	int iommufd;
	u32 ioas_id;

	struct vfio_device_info info;
	struct vfio_region_info config_space;
	struct vfio_pci_bar bars[PCI_STD_NUM_BARS];

	struct vfio_irq_info msi_info;
	struct vfio_irq_info msix_info;

	struct list_head dma_regions;

	/* eventfds for MSI and MSI-x interrupts */
	int msi_eventfds[PCI_MSIX_FLAGS_QSIZE + 1];

	struct vfio_pci_driver driver;
};

/*
 * Return the BDF string of the device that the test should use.
 *
 * If a BDF string is provided by the user on the command line (as the last
 * element of argv[]), then this function will return that and decrement argc
 * by 1.
 *
 * Otherwise this function will attempt to use the environment variable
 * $VFIO_SELFTESTS_BDF.
 *
 * If BDF cannot be determined then the test will exit with KSFT_SKIP.
 */
const char *vfio_selftests_get_bdf(int *argc, char *argv[]);
const char *vfio_pci_get_cdev_path(const char *bdf);

extern const char *default_iommu_mode;

struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_mode);
void vfio_pci_device_cleanup(struct vfio_pci_device *device);
void vfio_pci_device_reset(struct vfio_pci_device *device);

void vfio_pci_dma_map(struct vfio_pci_device *device,
		      struct vfio_dma_region *region);
void vfio_pci_dma_unmap(struct vfio_pci_device *device,
			struct vfio_dma_region *region);

void vfio_pci_config_access(struct vfio_pci_device *device, bool write,
			    size_t config, size_t size, void *data);

#define vfio_pci_config_read(_device, _offset, _type) ({			\
	_type __data;								\
	vfio_pci_config_access((_device), false, _offset, sizeof(__data), &__data); \
	__data;									\
})

#define vfio_pci_config_readb(_d, _o) vfio_pci_config_read(_d, _o, u8)
#define vfio_pci_config_readw(_d, _o) vfio_pci_config_read(_d, _o, u16)
#define vfio_pci_config_readl(_d, _o) vfio_pci_config_read(_d, _o, u32)

#define vfio_pci_config_write(_device, _offset, _value, _type) do {		\
	_type __data = (_value);						\
	vfio_pci_config_access((_device), true, _offset, sizeof(_type), &__data); \
} while (0)

#define vfio_pci_config_writeb(_d, _o, _v) vfio_pci_config_write(_d, _o, _v, u8)
#define vfio_pci_config_writew(_d, _o, _v) vfio_pci_config_write(_d, _o, _v, u16)
#define vfio_pci_config_writel(_d, _o, _v) vfio_pci_config_write(_d, _o, _v, u32)

void vfio_pci_irq_enable(struct vfio_pci_device *device, u32 index,
			 u32 vector, int count);
void vfio_pci_irq_disable(struct vfio_pci_device *device, u32 index);
void vfio_pci_irq_trigger(struct vfio_pci_device *device, u32 index, u32 vector);

static inline void fcntl_set_nonblock(int fd)
{
	int r;

	r = fcntl(fd, F_GETFL, 0);
	VFIO_ASSERT_NE(r, -1, "F_GETFL failed for fd %d\n", fd);

	r = fcntl(fd, F_SETFL, r | O_NONBLOCK);
	VFIO_ASSERT_NE(r, -1, "F_SETFL O_NONBLOCK failed for fd %d\n", fd);
}

static inline void vfio_pci_msi_enable(struct vfio_pci_device *device,
				       u32 vector, int count)
{
	vfio_pci_irq_enable(device, VFIO_PCI_MSI_IRQ_INDEX, vector, count);
}

static inline void vfio_pci_msi_disable(struct vfio_pci_device *device)
{
	vfio_pci_irq_disable(device, VFIO_PCI_MSI_IRQ_INDEX);
}

static inline void vfio_pci_msix_enable(struct vfio_pci_device *device,
					u32 vector, int count)
{
	vfio_pci_irq_enable(device, VFIO_PCI_MSIX_IRQ_INDEX, vector, count);
}

static inline void vfio_pci_msix_disable(struct vfio_pci_device *device)
{
	vfio_pci_irq_disable(device, VFIO_PCI_MSIX_IRQ_INDEX);
}

iova_t __to_iova(struct vfio_pci_device *device, void *vaddr);
iova_t to_iova(struct vfio_pci_device *device, void *vaddr);

static inline bool vfio_pci_device_match(struct vfio_pci_device *device,
					 u16 vendor_id, u16 device_id)
{
	return (vendor_id == vfio_pci_config_readw(device, PCI_VENDOR_ID)) &&
	       (device_id == vfio_pci_config_readw(device, PCI_DEVICE_ID));
}

void vfio_pci_driver_probe(struct vfio_pci_device *device);
void vfio_pci_driver_init(struct vfio_pci_device *device);
void vfio_pci_driver_remove(struct vfio_pci_device *device);
int vfio_pci_driver_memcpy(struct vfio_pci_device *device,
			   iova_t src, iova_t dst, u64 size);
void vfio_pci_driver_memcpy_start(struct vfio_pci_device *device,
				  iova_t src, iova_t dst, u64 size,
				  u64 count);
int vfio_pci_driver_memcpy_wait(struct vfio_pci_device *device);
void vfio_pci_driver_send_msi(struct vfio_pci_device *device);

#endif /* SELFTESTS_VFIO_LIB_INCLUDE_VFIO_UTIL_H */
tools/testing/selftests/vfio/lib/libvfio.mk (+24)
include $(top_srcdir)/scripts/subarch.include
ARCH ?= $(SUBARCH)

VFIO_DIR := $(selfdir)/vfio

LIBVFIO_C := lib/vfio_pci_device.c
LIBVFIO_C += lib/vfio_pci_driver.c

ifeq ($(ARCH:x86_64=x86),x86)
LIBVFIO_C += lib/drivers/ioat/ioat.c
LIBVFIO_C += lib/drivers/dsa/dsa.c
endif

LIBVFIO_O := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBVFIO_C))

LIBVFIO_O_DIRS := $(shell dirname $(LIBVFIO_O) | uniq)
$(shell mkdir -p $(LIBVFIO_O_DIRS))

CFLAGS += -I$(VFIO_DIR)/lib/include

$(LIBVFIO_O): $(OUTPUT)/%.o : $(VFIO_DIR)/%.c
	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@

EXTRA_CLEAN += $(LIBVFIO_O)
tools/testing/selftests/vfio/lib/vfio_pci_device.c (+594)
// SPDX-License-Identifier: GPL-2.0-only
#include <dirent.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#include <uapi/linux/types.h>
#include <linux/limits.h>
#include <linux/mman.h>
#include <linux/types.h>
#include <linux/vfio.h>
#include <linux/iommufd.h>

#include "../../../kselftest.h"
#include <vfio_util.h>

#define PCI_SYSFS_PATH "/sys/bus/pci/devices"

#define ioctl_assert(_fd, _op, _arg) do {					\
	void *__arg = (_arg);							\
	int __ret = ioctl((_fd), (_op), (__arg));				\
	VFIO_ASSERT_EQ(__ret, 0, "ioctl(%s, %s, %s) returned %d\n", #_fd, #_op, #_arg, __ret); \
} while (0)

iova_t __to_iova(struct vfio_pci_device *device, void *vaddr)
{
	struct vfio_dma_region *region;

	list_for_each_entry(region, &device->dma_regions, link) {
		if (vaddr < region->vaddr)
			continue;

		if (vaddr >= region->vaddr + region->size)
			continue;

		return region->iova + (vaddr - region->vaddr);
	}

	return INVALID_IOVA;
}

iova_t to_iova(struct vfio_pci_device *device, void *vaddr)
{
	iova_t iova;

	iova = __to_iova(device, vaddr);
	VFIO_ASSERT_NE(iova, INVALID_IOVA, "%p is not mapped into device.\n", vaddr);

	return iova;
}

static void vfio_pci_irq_set(struct vfio_pci_device *device,
			     u32 index, u32 vector, u32 count, int *fds)
{
	u8 buf[sizeof(struct vfio_irq_set) + sizeof(int) * count] = {};
	struct vfio_irq_set *irq = (void *)&buf;
	int *irq_fds = (void *)&irq->data;

	irq->argsz = sizeof(buf);
	irq->flags = VFIO_IRQ_SET_ACTION_TRIGGER;
	irq->index = index;
	irq->start = vector;
	irq->count = count;

	if (count) {
		irq->flags |= VFIO_IRQ_SET_DATA_EVENTFD;
		memcpy(irq_fds, fds, sizeof(int) * count);
	} else {
		irq->flags |= VFIO_IRQ_SET_DATA_NONE;
	}

	ioctl_assert(device->fd, VFIO_DEVICE_SET_IRQS, irq);
}

void vfio_pci_irq_trigger(struct vfio_pci_device *device, u32 index, u32 vector)
{
	struct vfio_irq_set irq = {
		.argsz = sizeof(irq),
		.flags = VFIO_IRQ_SET_ACTION_TRIGGER | VFIO_IRQ_SET_DATA_NONE,
		.index = index,
		.start = vector,
		.count = 1,
	};

	ioctl_assert(device->fd, VFIO_DEVICE_SET_IRQS, &irq);
}

static void check_supported_irq_index(u32 index)
{
	/* VFIO selftests only supports MSI and MSI-x for now. */
	VFIO_ASSERT_TRUE(index == VFIO_PCI_MSI_IRQ_INDEX ||
			 index == VFIO_PCI_MSIX_IRQ_INDEX,
			 "Unsupported IRQ index: %u\n", index);
}

void vfio_pci_irq_enable(struct vfio_pci_device *device, u32 index, u32 vector,
			 int count)
{
	int i;

	check_supported_irq_index(index);

	for (i = vector; i < vector + count; i++) {
		VFIO_ASSERT_LT(device->msi_eventfds[i], 0);
		device->msi_eventfds[i] = eventfd(0, 0);
		VFIO_ASSERT_GE(device->msi_eventfds[i], 0);
	}

	vfio_pci_irq_set(device, index, vector, count, device->msi_eventfds + vector);
}

void vfio_pci_irq_disable(struct vfio_pci_device *device, u32 index)
{
	int i;

	check_supported_irq_index(index);

	for (i = 0; i < ARRAY_SIZE(device->msi_eventfds); i++) {
		if (device->msi_eventfds[i] < 0)
			continue;

		VFIO_ASSERT_EQ(close(device->msi_eventfds[i]), 0);
		device->msi_eventfds[i] = -1;
	}

	vfio_pci_irq_set(device, index, 0, 0, NULL);
}

static void vfio_pci_irq_get(struct vfio_pci_device *device, u32 index,
			     struct vfio_irq_info *irq_info)
{
	irq_info->argsz = sizeof(*irq_info);
	irq_info->index = index;

	ioctl_assert(device->fd, VFIO_DEVICE_GET_IRQ_INFO, irq_info);
}

static void vfio_iommu_dma_map(struct vfio_pci_device *device,
			       struct vfio_dma_region *region)
{
	struct vfio_iommu_type1_dma_map args = {
		.argsz = sizeof(args),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (u64)region->vaddr,
		.iova = region->iova,
		.size = region->size,
	};

	ioctl_assert(device->container_fd, VFIO_IOMMU_MAP_DMA, &args);
}

static void iommufd_dma_map(struct vfio_pci_device *device,
			    struct vfio_dma_region *region)
{
	struct iommu_ioas_map args = {
		.size = sizeof(args),
		.flags = IOMMU_IOAS_MAP_READABLE |
			 IOMMU_IOAS_MAP_WRITEABLE |
			 IOMMU_IOAS_MAP_FIXED_IOVA,
		.user_va = (u64)region->vaddr,
		.iova = region->iova,
		.length = region->size,
		.ioas_id = device->ioas_id,
	};

	ioctl_assert(device->iommufd, IOMMU_IOAS_MAP, &args);
}

void vfio_pci_dma_map(struct vfio_pci_device *device,
		      struct vfio_dma_region *region)
{
	if (device->iommufd)
		iommufd_dma_map(device, region);
	else
		vfio_iommu_dma_map(device, region);

	list_add(&region->link, &device->dma_regions);
}

static void vfio_iommu_dma_unmap(struct vfio_pci_device *device,
				 struct vfio_dma_region *region)
{
	struct vfio_iommu_type1_dma_unmap args = {
		.argsz = sizeof(args),
		.iova = region->iova,
		.size = region->size,
	};

	ioctl_assert(device->container_fd, VFIO_IOMMU_UNMAP_DMA, &args);
}

static void iommufd_dma_unmap(struct vfio_pci_device *device,
			      struct vfio_dma_region *region)
{
	struct iommu_ioas_unmap args = {
		.size = sizeof(args),
		.iova = region->iova,
		.length = region->size,
		.ioas_id = device->ioas_id,
	};

	ioctl_assert(device->iommufd, IOMMU_IOAS_UNMAP, &args);
}

void vfio_pci_dma_unmap(struct vfio_pci_device *device,
			struct vfio_dma_region *region)
{
	if (device->iommufd)
		iommufd_dma_unmap(device, region);
	else
		vfio_iommu_dma_unmap(device, region);

	list_del(&region->link);
}

static void vfio_pci_region_get(struct vfio_pci_device *device, int index,
				struct vfio_region_info *info)
{
	memset(info, 0, sizeof(*info));

	info->argsz = sizeof(*info);
	info->index = index;

	ioctl_assert(device->fd, VFIO_DEVICE_GET_REGION_INFO, info);
}

static void vfio_pci_bar_map(struct vfio_pci_device *device, int index)
{
	struct vfio_pci_bar *bar = &device->bars[index];
	int prot = 0;

	VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
	VFIO_ASSERT_NULL(bar->vaddr);
	VFIO_ASSERT_TRUE(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP);

	if (bar->info.flags & VFIO_REGION_INFO_FLAG_READ)
		prot |= PROT_READ;
	if (bar->info.flags & VFIO_REGION_INFO_FLAG_WRITE)
		prot |= PROT_WRITE;

	bar->vaddr = mmap(NULL, bar->info.size, prot, MAP_FILE | MAP_SHARED,
			  device->fd, bar->info.offset);
	VFIO_ASSERT_NE(bar->vaddr, MAP_FAILED);
}

static void vfio_pci_bar_unmap(struct vfio_pci_device *device, int index)
{
	struct vfio_pci_bar *bar = &device->bars[index];

	VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
	VFIO_ASSERT_NOT_NULL(bar->vaddr);

	VFIO_ASSERT_EQ(munmap(bar->vaddr, bar->info.size), 0);
	bar->vaddr = NULL;
}

static void vfio_pci_bar_unmap_all(struct vfio_pci_device *device)
{
	int i;

	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
		if (device->bars[i].vaddr)
			vfio_pci_bar_unmap(device, i);
	}
}

void vfio_pci_config_access(struct vfio_pci_device *device, bool write,
			    size_t config, size_t size, void *data)
{
	struct vfio_region_info *config_space = &device->config_space;
	int ret;

	if (write)
		ret = pwrite(device->fd, data, size, config_space->offset + config);
	else
		ret = pread(device->fd, data, size, config_space->offset + config);

	VFIO_ASSERT_EQ(ret, size, "Failed to %s PCI config space: 0x%lx\n",
		       write ? "write to" : "read from", config);
}

void vfio_pci_device_reset(struct vfio_pci_device *device)
{
	ioctl_assert(device->fd, VFIO_DEVICE_RESET, NULL);
}

static unsigned int vfio_pci_get_group_from_dev(const char *bdf)
{
	char dev_iommu_group_path[PATH_MAX] = {0};
	char sysfs_path[PATH_MAX] = {0};
	unsigned int group;
	int ret;

	snprintf(sysfs_path, PATH_MAX, "%s/%s/iommu_group", PCI_SYSFS_PATH, bdf);

	ret = readlink(sysfs_path, dev_iommu_group_path, sizeof(dev_iommu_group_path));
	VFIO_ASSERT_NE(ret, -1, "Failed to get the IOMMU group for device: %s\n", bdf);

	ret = sscanf(basename(dev_iommu_group_path), "%u", &group);
	VFIO_ASSERT_EQ(ret, 1, "Failed to get the IOMMU group for device: %s\n", bdf);

	return group;
}

static void vfio_pci_group_setup(struct vfio_pci_device *device, const char *bdf)
{
	struct vfio_group_status group_status = {
		.argsz = sizeof(group_status),
	};
	char group_path[32];
	int group;

	group = vfio_pci_get_group_from_dev(bdf);
	snprintf(group_path, sizeof(group_path), "/dev/vfio/%d", group);

	device->group_fd = open(group_path, O_RDWR);
	VFIO_ASSERT_GE(device->group_fd, 0, "open(%s) failed\n", group_path);

	ioctl_assert(device->group_fd, VFIO_GROUP_GET_STATUS, &group_status);
	VFIO_ASSERT_TRUE(group_status.flags & VFIO_GROUP_FLAGS_VIABLE);

	ioctl_assert(device->group_fd, VFIO_GROUP_SET_CONTAINER, &device->container_fd);
}

static void vfio_pci_container_setup(struct vfio_pci_device *device, const char *bdf)
{
	unsigned long iommu_type = device->iommu_mode->iommu_type;
	const char *path = device->iommu_mode->container_path;
	int version;
	int ret;

	device->container_fd = open(path, O_RDWR);
	VFIO_ASSERT_GE(device->container_fd, 0, "open(%s) failed\n", path);

	version = ioctl(device->container_fd, VFIO_GET_API_VERSION);
	VFIO_ASSERT_EQ(version, VFIO_API_VERSION, "Unsupported version: %d\n", version);

	vfio_pci_group_setup(device, bdf);

	ret = ioctl(device->container_fd, VFIO_CHECK_EXTENSION, iommu_type);
	VFIO_ASSERT_GT(ret, 0, "VFIO IOMMU type %lu not supported\n", iommu_type);

	ioctl_assert(device->container_fd, VFIO_SET_IOMMU, (void *)iommu_type);

	device->fd = ioctl(device->group_fd, VFIO_GROUP_GET_DEVICE_FD, bdf);
	VFIO_ASSERT_GE(device->fd, 0);
}

static void vfio_pci_device_setup(struct vfio_pci_device *device)
{
	int i;

	device->info.argsz = sizeof(device->info);
	ioctl_assert(device->fd, VFIO_DEVICE_GET_INFO, &device->info);

	vfio_pci_region_get(device, VFIO_PCI_CONFIG_REGION_INDEX, &device->config_space);

	/* Sanity check VFIO does not advertise mmap for config space */
	VFIO_ASSERT_TRUE(!(device->config_space.flags & VFIO_REGION_INFO_FLAG_MMAP),
			 "PCI config space should not support mmap()\n");

	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
		struct vfio_pci_bar *bar = device->bars + i;

		vfio_pci_region_get(device, i, &bar->info);
		if (bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP)
			vfio_pci_bar_map(device, i);
	}

	vfio_pci_irq_get(device, VFIO_PCI_MSI_IRQ_INDEX, &device->msi_info);
	vfio_pci_irq_get(device, VFIO_PCI_MSIX_IRQ_INDEX, &device->msix_info);

	for (i = 0; i < ARRAY_SIZE(device->msi_eventfds); i++)
		device->msi_eventfds[i] = -1;
}

const char *vfio_pci_get_cdev_path(const char *bdf)
{
	char dir_path[PATH_MAX];
	struct dirent *entry;
	char *cdev_path;
	DIR *dir;

	cdev_path = calloc(PATH_MAX, 1);
	VFIO_ASSERT_NOT_NULL(cdev_path);

	snprintf(dir_path, sizeof(dir_path), "/sys/bus/pci/devices/%s/vfio-dev/", bdf);

	dir = opendir(dir_path);
	VFIO_ASSERT_NOT_NULL(dir, "Failed to open directory %s\n", dir_path);

	while ((entry = readdir(dir)) != NULL) {
		/* Find the file that starts with "vfio" */
		if (strncmp("vfio", entry->d_name, 4))
			continue;

		snprintf(cdev_path, PATH_MAX, "/dev/vfio/devices/%s", entry->d_name);
		break;
	}

	VFIO_ASSERT_NE(cdev_path[0], 0, "Failed to find vfio cdev file.\n");
	VFIO_ASSERT_EQ(closedir(dir), 0);

	return cdev_path;
}

/* Reminder: Keep in sync with FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(). */
static const struct vfio_iommu_mode iommu_modes[] = {
	{
		.name = "vfio_type1_iommu",
		.container_path = "/dev/vfio/vfio",
		.iommu_type = VFIO_TYPE1_IOMMU,
	},
	{
		.name = "vfio_type1v2_iommu",
		.container_path = "/dev/vfio/vfio",
		.iommu_type = VFIO_TYPE1v2_IOMMU,
	},
	{
		.name = "iommufd_compat_type1",
		.container_path = "/dev/iommu",
		.iommu_type = VFIO_TYPE1_IOMMU,
	},
	{
		.name = "iommufd_compat_type1v2",
		.container_path = "/dev/iommu",
		.iommu_type = VFIO_TYPE1v2_IOMMU,
	},
	{
		.name = "iommufd",
	},
};

const char *default_iommu_mode = "iommufd";

static const struct vfio_iommu_mode *lookup_iommu_mode(const char *iommu_mode)
{
	int i;

	if (!iommu_mode)
		iommu_mode = default_iommu_mode;

	for (i = 0; i < ARRAY_SIZE(iommu_modes); i++) {
		if (strcmp(iommu_mode, iommu_modes[i].name))
			continue;

		return &iommu_modes[i];
	}

	VFIO_FAIL("Unrecognized IOMMU mode: %s\n", iommu_mode);
}

static void vfio_device_bind_iommufd(int device_fd, int iommufd)
{
	struct vfio_device_bind_iommufd args = {
		.argsz = sizeof(args),
		.iommufd = iommufd,
	};

	ioctl_assert(device_fd, VFIO_DEVICE_BIND_IOMMUFD, &args);
}

static u32 iommufd_ioas_alloc(int iommufd)
{
	struct iommu_ioas_alloc args = {
		.size = sizeof(args),
	};

	ioctl_assert(iommufd, IOMMU_IOAS_ALLOC, &args);
	return args.out_ioas_id;
}

static void vfio_device_attach_iommufd_pt(int device_fd, u32 pt_id)
{
	struct vfio_device_attach_iommufd_pt args = {
		.argsz = sizeof(args),
		.pt_id = pt_id,
	};

	ioctl_assert(device_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &args);
}

static void vfio_pci_iommufd_setup(struct vfio_pci_device *device, const char *bdf)
{
	const char *cdev_path = vfio_pci_get_cdev_path(bdf);

	device->fd = open(cdev_path, O_RDWR);
	VFIO_ASSERT_GE(device->fd, 0);
	free((void *)cdev_path);

	/*
	 * Require device->iommufd to be >0 so that a simple non-0 check can be
	 * used to check if iommufd is enabled. In practice open() will never
	 * return 0 unless stdin is closed.
	 */
	device->iommufd = open("/dev/iommu", O_RDWR);
	VFIO_ASSERT_GT(device->iommufd, 0);

	vfio_device_bind_iommufd(device->fd, device->iommufd);
	device->ioas_id = iommufd_ioas_alloc(device->iommufd);
	vfio_device_attach_iommufd_pt(device->fd, device->ioas_id);
}

struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_mode)
{
	struct vfio_pci_device *device;

	device = calloc(1, sizeof(*device));
	VFIO_ASSERT_NOT_NULL(device);

	INIT_LIST_HEAD(&device->dma_regions);

	device->iommu_mode = lookup_iommu_mode(iommu_mode);

	if (device->iommu_mode->container_path)
		vfio_pci_container_setup(device, bdf);
	else
		vfio_pci_iommufd_setup(device, bdf);

	vfio_pci_device_setup(device);
	vfio_pci_driver_probe(device);

	return device;
}

void vfio_pci_device_cleanup(struct vfio_pci_device *device)
{
	int i;

	if (device->driver.initialized)
		vfio_pci_driver_remove(device);

	vfio_pci_bar_unmap_all(device);

	VFIO_ASSERT_EQ(close(device->fd), 0);

	for (i = 0; i < ARRAY_SIZE(device->msi_eventfds); i++) {
		if (device->msi_eventfds[i] < 0)
			continue;

		VFIO_ASSERT_EQ(close(device->msi_eventfds[i]), 0);
	}

	if (device->iommufd) {
		VFIO_ASSERT_EQ(close(device->iommufd), 0);
	} else {
		VFIO_ASSERT_EQ(close(device->group_fd), 0);
		VFIO_ASSERT_EQ(close(device->container_fd), 0);
	}

	free(device);
}

static bool is_bdf(const char *str)
{
	unsigned int s, b, d, f;
	int length, count;

	count = sscanf(str, "%4x:%2x:%2x.%2x%n", &s, &b, &d, &f, &length);
	return count == 4 && length == strlen(str);
}

const char *vfio_selftests_get_bdf(int *argc, char *argv[])
{
	char *bdf;

	if (*argc > 1 && is_bdf(argv[*argc - 1]))
		return argv[--(*argc)];

	bdf = getenv("VFIO_SELFTESTS_BDF");
	if (bdf) {
		VFIO_ASSERT_TRUE(is_bdf(bdf), "Invalid BDF: %s\n", bdf);
		return bdf;
	}

	fprintf(stderr, "Unable to determine which device to use, skipping test.\n");
	fprintf(stderr, "\n");
	fprintf(stderr, "To pass the device address via environment variable:\n");
	fprintf(stderr, "\n");
	fprintf(stderr, "  export VFIO_SELFTESTS_BDF=segment:bus:device.function\n");
	fprintf(stderr, "  %s [options]\n", argv[0]);
	fprintf(stderr, "\n");
	fprintf(stderr, "To pass the device address via argv:\n");
	fprintf(stderr, "\n");
	fprintf(stderr, "  %s [options] segment:bus:device.function\n", argv[0]);
	fprintf(stderr, "\n");
	exit(KSFT_SKIP);
}
tools/testing/selftests/vfio/lib/vfio_pci_driver.c (+126)
// SPDX-License-Identifier: GPL-2.0-only
#include <stdio.h>

#include "../../../kselftest.h"
#include <vfio_util.h>

#ifdef __x86_64__
extern struct vfio_pci_driver_ops dsa_ops;
extern struct vfio_pci_driver_ops ioat_ops;
#endif

static struct vfio_pci_driver_ops *driver_ops[] = {
#ifdef __x86_64__
	&dsa_ops,
	&ioat_ops,
#endif
};

void vfio_pci_driver_probe(struct vfio_pci_device *device)
{
	struct vfio_pci_driver_ops *ops;
	int i;

	VFIO_ASSERT_NULL(device->driver.ops);

	for (i = 0; i < ARRAY_SIZE(driver_ops); i++) {
		ops = driver_ops[i];

		if (ops->probe(device))
			continue;

		printf("Driver found: %s\n", ops->name);
		device->driver.ops = ops;
	}
}

static void vfio_check_driver_op(struct vfio_pci_driver *driver, void *op,
				 const char *op_name)
{
	VFIO_ASSERT_NOT_NULL(driver->ops);
	VFIO_ASSERT_NOT_NULL(op, "Driver has no %s()\n", op_name);
	VFIO_ASSERT_EQ(driver->initialized, op != driver->ops->init);
	VFIO_ASSERT_EQ(driver->memcpy_in_progress, op == driver->ops->memcpy_wait);
}

#define VFIO_CHECK_DRIVER_OP(_driver, _op) do {				\
	struct vfio_pci_driver *__driver = (_driver);			\
	vfio_check_driver_op(__driver, __driver->ops->_op, #_op);	\
} while (0)

void vfio_pci_driver_init(struct vfio_pci_device *device)
{
	struct vfio_pci_driver *driver = &device->driver;

	VFIO_ASSERT_NOT_NULL(driver->region.vaddr);
	VFIO_CHECK_DRIVER_OP(driver, init);

	driver->ops->init(device);

	driver->initialized = true;

	printf("%s: region: vaddr %p, iova 0x%lx, size 0x%lx\n",
	       driver->ops->name,
	       driver->region.vaddr,
	       driver->region.iova,
	       driver->region.size);

	printf("%s: max_memcpy_size 0x%lx, max_memcpy_count 0x%lx\n",
	       driver->ops->name,
	       driver->max_memcpy_size,
	       driver->max_memcpy_count);
}

void vfio_pci_driver_remove(struct vfio_pci_device *device)
{
	struct vfio_pci_driver *driver = &device->driver;

	VFIO_CHECK_DRIVER_OP(driver, remove);

	driver->ops->remove(device);
	driver->initialized = false;
}

void vfio_pci_driver_send_msi(struct vfio_pci_device *device)
{
	struct vfio_pci_driver *driver = &device->driver;

	VFIO_CHECK_DRIVER_OP(driver, send_msi);

	driver->ops->send_msi(device);
}

void vfio_pci_driver_memcpy_start(struct vfio_pci_device *device,
				  iova_t src, iova_t dst, u64 size,
				  u64 count)
{
	struct vfio_pci_driver *driver = &device->driver;

	VFIO_ASSERT_LE(size, driver->max_memcpy_size);
	VFIO_ASSERT_LE(count, driver->max_memcpy_count);
	VFIO_CHECK_DRIVER_OP(driver, memcpy_start);

	driver->ops->memcpy_start(device, src, dst, size, count);
	driver->memcpy_in_progress = true;
}

int vfio_pci_driver_memcpy_wait(struct vfio_pci_device *device)
{
	struct vfio_pci_driver *driver = &device->driver;
	int r;

	VFIO_CHECK_DRIVER_OP(driver, memcpy_wait);

	r = driver->ops->memcpy_wait(device);
	driver->memcpy_in_progress = false;

	return r;
}

int vfio_pci_driver_memcpy(struct vfio_pci_device *device,
			   iova_t src, iova_t dst, u64 size)
{
	vfio_pci_driver_memcpy_start(device, src, dst, size, 1);

	return vfio_pci_driver_memcpy_wait(device);
}
tools/testing/selftests/vfio/run.sh (+109)
# SPDX-License-Identifier: GPL-2.0-or-later

# Global variables initialized in main() and then used during cleanup() when
# the script exits.
declare DEVICE_BDF
declare NEW_DRIVER
declare OLD_DRIVER
declare OLD_NUMVFS
declare DRIVER_OVERRIDE

function write_to() {
	# Unfortunately set -x does not show redirects so use echo to manually
	# tell the user what commands are being run.
	echo "+ echo \"${2}\" > ${1}"
	echo "${2}" > ${1}
}

function bind() {
	write_to /sys/bus/pci/drivers/${2}/bind ${1}
}

function unbind() {
	write_to /sys/bus/pci/drivers/${2}/unbind ${1}
}

function set_sriov_numvfs() {
	write_to /sys/bus/pci/devices/${1}/sriov_numvfs ${2}
}

function set_driver_override() {
	write_to /sys/bus/pci/devices/${1}/driver_override ${2}
}

function clear_driver_override() {
	set_driver_override ${1} ""
}

function cleanup() {
	if [ "${NEW_DRIVER}" ]; then unbind ${DEVICE_BDF} ${NEW_DRIVER} ; fi
	if [ "${DRIVER_OVERRIDE}" ]; then clear_driver_override ${DEVICE_BDF} ; fi
	if [ "${OLD_DRIVER}" ]; then bind ${DEVICE_BDF} ${OLD_DRIVER} ; fi
	if [ "${OLD_NUMVFS}" ]; then set_sriov_numvfs ${DEVICE_BDF} ${OLD_NUMVFS} ; fi
}

function usage() {
	echo "usage: $0 [-d segment:bus:device.function] [-s] [-h] [cmd ...]" >&2
	echo >&2
	echo "  -d: The BDF of the device to use for the test (required)" >&2
	echo "  -h: Show this help message" >&2
	echo "  -s: Drop into a shell rather than running a command" >&2
	echo >&2
	echo " cmd: The command to run and arguments to pass to it." >&2
	echo "      Required when not using -s. The SBDF will be " >&2
	echo "      appended to the argument list." >&2
	exit 1
}

function main() {
	local shell

	while getopts "d:hs" opt; do
		case $opt in
			d) DEVICE_BDF="$OPTARG" ;;
			s) shell=true ;;
			*) usage ;;
		esac
	done

	# Shift past all optional arguments.
	shift $((OPTIND - 1))

	# Check that the user passed in the command to run.
	[ ! "${shell}" ] && [ $# = 0 ] && usage

	# Check that the user passed in a BDF.
	[ "${DEVICE_BDF}" ] || usage

	trap cleanup EXIT
	set -e

	test -d /sys/bus/pci/devices/${DEVICE_BDF}

	if [ -f /sys/bus/pci/devices/${DEVICE_BDF}/sriov_numvfs ]; then
		OLD_NUMVFS=$(cat /sys/bus/pci/devices/${DEVICE_BDF}/sriov_numvfs)
		set_sriov_numvfs ${DEVICE_BDF} 0
	fi

	if [ -L /sys/bus/pci/devices/${DEVICE_BDF}/driver ]; then
		OLD_DRIVER=$(basename $(readlink -m /sys/bus/pci/devices/${DEVICE_BDF}/driver))
		unbind ${DEVICE_BDF} ${OLD_DRIVER}
	fi

	set_driver_override ${DEVICE_BDF} vfio-pci
	DRIVER_OVERRIDE=true

	bind ${DEVICE_BDF} vfio-pci
	NEW_DRIVER=vfio-pci

	echo
	if [ "${shell}" ]; then
		echo "Dropping into ${SHELL} with VFIO_SELFTESTS_BDF=${DEVICE_BDF}"
		VFIO_SELFTESTS_BDF=${DEVICE_BDF} ${SHELL}
	else
		"$@" ${DEVICE_BDF}
	fi
	echo
}

main "$@"
tools/testing/selftests/vfio/vfio_dma_mapping_test.c (+199)
// SPDX-License-Identifier: GPL-2.0-only
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#include <linux/limits.h>
#include <linux/mman.h>
#include <linux/sizes.h>
#include <linux/vfio.h>

#include <vfio_util.h>

#include "../kselftest_harness.h"

static const char *device_bdf;

struct iommu_mapping {
	u64 pgd;
	u64 p4d;
	u64 pud;
	u64 pmd;
	u64 pte;
};

static void parse_next_value(char **line, u64 *value)
{
	char *token;

	token = strtok_r(*line, " \t|\n", line);
	if (!token)
		return;

	/* Caller verifies `value`. No need to check return value. */
	sscanf(token, "0x%lx", value);
}

static int intel_iommu_mapping_get(const char *bdf, u64 iova,
				   struct iommu_mapping *mapping)
{
	char iommu_mapping_path[PATH_MAX], line[PATH_MAX];
	u64 line_iova = -1;
	int ret = -ENOENT;
	FILE *file;
	char *rest;

	snprintf(iommu_mapping_path, sizeof(iommu_mapping_path),
		 "/sys/kernel/debug/iommu/intel/%s/domain_translation_struct",
		 bdf);

	printf("Searching for IOVA 0x%lx in %s\n", iova, iommu_mapping_path);

	file = fopen(iommu_mapping_path, "r");
	VFIO_ASSERT_NOT_NULL(file, "fopen(%s) failed", iommu_mapping_path);

	while (fgets(line, sizeof(line), file)) {
		rest = line;

		parse_next_value(&rest, &line_iova);
		if (line_iova != (iova / getpagesize()))
			continue;

		/*
		 * Ensure each struct field is initialized in case of empty
		 * page table values.
		 */
		memset(mapping, 0, sizeof(*mapping));
		parse_next_value(&rest, &mapping->pgd);
		parse_next_value(&rest, &mapping->p4d);
		parse_next_value(&rest, &mapping->pud);
		parse_next_value(&rest, &mapping->pmd);
		parse_next_value(&rest, &mapping->pte);

		ret = 0;
		break;
	}

	fclose(file);

	if (ret)
		printf("IOVA not found\n");

	return ret;
}

static int iommu_mapping_get(const char *bdf, u64 iova,
			     struct iommu_mapping *mapping)
{
	if (!access("/sys/kernel/debug/iommu/intel", F_OK))
		return intel_iommu_mapping_get(bdf, iova, mapping);

	return -EOPNOTSUPP;
}

FIXTURE(vfio_dma_mapping_test) {
	struct vfio_pci_device *device;
};

FIXTURE_VARIANT(vfio_dma_mapping_test) {
	const char *iommu_mode;
	u64 size;
	int mmap_flags;
};

#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode, _name, _size, _mmap_flags) \
FIXTURE_VARIANT_ADD(vfio_dma_mapping_test, _iommu_mode ## _ ## _name) {	\
	.iommu_mode = #_iommu_mode,					\
	.size = (_size),						\
	.mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE | (_mmap_flags),	\
}

FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous, 0, 0);
FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous_hugetlb_2mb, SZ_2M, MAP_HUGETLB | MAP_HUGE_2MB);
FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous_hugetlb_1gb, SZ_1G, MAP_HUGETLB | MAP_HUGE_1GB);

FIXTURE_SETUP(vfio_dma_mapping_test)
{
	self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode);
}

FIXTURE_TEARDOWN(vfio_dma_mapping_test)
{
	vfio_pci_device_cleanup(self->device);
}

TEST_F(vfio_dma_mapping_test, dma_map_unmap)
{
	const u64 size = variant->size ?: getpagesize();
	const int flags = variant->mmap_flags;
	struct vfio_dma_region region;
	struct iommu_mapping mapping;
	u64 mapping_size = size;
	int rc;

	region.vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);

	/* Skip the test if there aren't enough HugeTLB pages available. */
	if (flags & MAP_HUGETLB && region.vaddr == MAP_FAILED)
		SKIP(return, "mmap() failed: %s (%d)\n", strerror(errno), errno);
	else
		ASSERT_NE(region.vaddr, MAP_FAILED);

	region.iova = (u64)region.vaddr;
	region.size = size;

	vfio_pci_dma_map(self->device, &region);
	printf("Mapped HVA %p (size 0x%lx) at IOVA 0x%lx\n", region.vaddr, size, region.iova);

	ASSERT_EQ(region.iova, to_iova(self->device, region.vaddr));

	rc = iommu_mapping_get(device_bdf, region.iova, &mapping);
	if (rc == -EOPNOTSUPP)
		goto unmap;

	/*
	 * IOMMUFD compatibility-mode does not support huge mappings when
	 * using VFIO_TYPE1_IOMMU.
	 */
	if (!strcmp(variant->iommu_mode, "iommufd_compat_type1"))
		mapping_size = SZ_4K;

	ASSERT_EQ(0, rc);
	printf("Found IOMMU mappings for IOVA 0x%lx:\n", region.iova);
	printf("PGD: 0x%016lx\n", mapping.pgd);
	printf("P4D: 0x%016lx\n", mapping.p4d);
	printf("PUD: 0x%016lx\n", mapping.pud);
	printf("PMD: 0x%016lx\n", mapping.pmd);
	printf("PTE: 0x%016lx\n", mapping.pte);

	switch (mapping_size) {
	case SZ_4K:
		ASSERT_NE(0, mapping.pte);
		break;
	case SZ_2M:
		ASSERT_EQ(0, mapping.pte);
		ASSERT_NE(0, mapping.pmd);
		break;
	case SZ_1G:
		ASSERT_EQ(0, mapping.pte);
		ASSERT_EQ(0, mapping.pmd);
		ASSERT_NE(0, mapping.pud);
		break;
	default:
		VFIO_FAIL("Unrecognized size: 0x%lx\n", mapping_size);
	}

unmap:
	vfio_pci_dma_unmap(self->device, &region);
	printf("Unmapped IOVA 0x%lx\n", region.iova);
	ASSERT_EQ(INVALID_IOVA, __to_iova(self->device, region.vaddr));
	ASSERT_NE(0, iommu_mapping_get(device_bdf, region.iova, &mapping));

	ASSERT_TRUE(!munmap(region.vaddr, size));
}

int main(int argc, char *argv[])
{
	device_bdf = vfio_selftests_get_bdf(&argc, argv);
	return test_harness_run(argc, argv);
}
tools/testing/selftests/vfio/vfio_iommufd_setup_test.c (new file, +127 lines)
// SPDX-License-Identifier: GPL-2.0
#include <uapi/linux/types.h>
#include <linux/limits.h>
#include <linux/sizes.h>
#include <linux/vfio.h>
#include <linux/iommufd.h>

#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <vfio_util.h>
#include "../kselftest_harness.h"

static const char iommu_dev_path[] = "/dev/iommu";
static const char *cdev_path;

static int vfio_device_bind_iommufd_ioctl(int cdev_fd, int iommufd)
{
	struct vfio_device_bind_iommufd bind_args = {
		.argsz = sizeof(bind_args),
		.iommufd = iommufd,
	};

	return ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind_args);
}

static int vfio_device_get_info_ioctl(int cdev_fd)
{
	struct vfio_device_info info_args = { .argsz = sizeof(info_args) };

	return ioctl(cdev_fd, VFIO_DEVICE_GET_INFO, &info_args);
}

static int vfio_device_ioas_alloc_ioctl(int iommufd, struct iommu_ioas_alloc *alloc_args)
{
	*alloc_args = (struct iommu_ioas_alloc){
		.size = sizeof(struct iommu_ioas_alloc),
	};

	return ioctl(iommufd, IOMMU_IOAS_ALLOC, alloc_args);
}

static int vfio_device_attach_iommufd_pt_ioctl(int cdev_fd, u32 pt_id)
{
	struct vfio_device_attach_iommufd_pt attach_args = {
		.argsz = sizeof(attach_args),
		.pt_id = pt_id,
	};

	return ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_args);
}

static int vfio_device_detach_iommufd_pt_ioctl(int cdev_fd)
{
	struct vfio_device_detach_iommufd_pt detach_args = {
		.argsz = sizeof(detach_args),
	};

	return ioctl(cdev_fd, VFIO_DEVICE_DETACH_IOMMUFD_PT, &detach_args);
}

FIXTURE(vfio_cdev) {
	int cdev_fd;
	int iommufd;
};

FIXTURE_SETUP(vfio_cdev)
{
	ASSERT_LE(0, (self->cdev_fd = open(cdev_path, O_RDWR, 0)));
	ASSERT_LE(0, (self->iommufd = open(iommu_dev_path, O_RDWR, 0)));
}

FIXTURE_TEARDOWN(vfio_cdev)
{
	ASSERT_EQ(0, close(self->cdev_fd));
	ASSERT_EQ(0, close(self->iommufd));
}

TEST_F(vfio_cdev, bind)
{
	ASSERT_EQ(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, self->iommufd));
	ASSERT_EQ(0, vfio_device_get_info_ioctl(self->cdev_fd));
}

TEST_F(vfio_cdev, get_info_without_bind_fails)
{
	ASSERT_NE(0, vfio_device_get_info_ioctl(self->cdev_fd));
}

TEST_F(vfio_cdev, bind_bad_iommufd_fails)
{
	ASSERT_NE(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, -2));
}

TEST_F(vfio_cdev, repeated_bind_fails)
{
	ASSERT_EQ(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, self->iommufd));
	ASSERT_NE(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, self->iommufd));
}

TEST_F(vfio_cdev, attach_detatch_pt)
{
	struct iommu_ioas_alloc alloc_args;

	ASSERT_EQ(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, self->iommufd));
	ASSERT_EQ(0, vfio_device_ioas_alloc_ioctl(self->iommufd, &alloc_args));
	ASSERT_EQ(0, vfio_device_attach_iommufd_pt_ioctl(self->cdev_fd, alloc_args.out_ioas_id));
	ASSERT_EQ(0, vfio_device_detach_iommufd_pt_ioctl(self->cdev_fd));
}

TEST_F(vfio_cdev, attach_invalid_pt_fails)
{
	ASSERT_EQ(0, vfio_device_bind_iommufd_ioctl(self->cdev_fd, self->iommufd));
	ASSERT_NE(0, vfio_device_attach_iommufd_pt_ioctl(self->cdev_fd, UINT32_MAX));
}

int main(int argc, char *argv[])
{
	const char *device_bdf = vfio_selftests_get_bdf(&argc, argv);

	cdev_path = vfio_pci_get_cdev_path(device_bdf);
	printf("Using cdev device %s\n", cdev_path);

	return test_harness_run(argc, argv);
}
tools/testing/selftests/vfio/vfio_pci_device_test.c (new file, +176 lines)
// SPDX-License-Identifier: GPL-2.0-only
#include <fcntl.h>
#include <stdlib.h>

#include <sys/ioctl.h>
#include <sys/mman.h>

#include <linux/limits.h>
#include <linux/pci_regs.h>
#include <linux/sizes.h>
#include <linux/vfio.h>

#include <vfio_util.h>

#include "../kselftest_harness.h"

static const char *device_bdf;

/*
 * Limit the number of MSIs enabled/disabled by the test regardless of the
 * number of MSIs the device itself supports, e.g. to avoid hitting IRTE limits.
 */
#define MAX_TEST_MSI 16U

FIXTURE(vfio_pci_device_test) {
	struct vfio_pci_device *device;
};

FIXTURE_SETUP(vfio_pci_device_test)
{
	self->device = vfio_pci_device_init(device_bdf, default_iommu_mode);
}

FIXTURE_TEARDOWN(vfio_pci_device_test)
{
	vfio_pci_device_cleanup(self->device);
}

#define read_pci_id_from_sysfs(_file) ({ \
	char __sysfs_path[PATH_MAX]; \
	char __buf[32]; \
	int __fd; \
	\
	snprintf(__sysfs_path, PATH_MAX, "/sys/bus/pci/devices/%s/%s", device_bdf, _file); \
	ASSERT_GT((__fd = open(__sysfs_path, O_RDONLY)), 0); \
	ASSERT_GT(read(__fd, __buf, ARRAY_SIZE(__buf)), 0); \
	ASSERT_EQ(0, close(__fd)); \
	(u16)strtoul(__buf, NULL, 0); \
})

TEST_F(vfio_pci_device_test, config_space_read_write)
{
	u16 vendor, device;
	u16 command;

	/* Check that Vendor and Device match what the kernel reports. */
	vendor = read_pci_id_from_sysfs("vendor");
	device = read_pci_id_from_sysfs("device");
	ASSERT_TRUE(vfio_pci_device_match(self->device, vendor, device));

	printf("Vendor: %04x, Device: %04x\n", vendor, device);

	command = vfio_pci_config_readw(self->device, PCI_COMMAND);
	ASSERT_FALSE(command & PCI_COMMAND_MASTER);

	vfio_pci_config_writew(self->device, PCI_COMMAND, command | PCI_COMMAND_MASTER);
	command = vfio_pci_config_readw(self->device, PCI_COMMAND);
	ASSERT_TRUE(command & PCI_COMMAND_MASTER);
	printf("Enabled Bus Mastering (command: %04x)\n", command);

	vfio_pci_config_writew(self->device, PCI_COMMAND, command & ~PCI_COMMAND_MASTER);
	command = vfio_pci_config_readw(self->device, PCI_COMMAND);
	ASSERT_FALSE(command & PCI_COMMAND_MASTER);
	printf("Disabled Bus Mastering (command: %04x)\n", command);
}

TEST_F(vfio_pci_device_test, validate_bars)
{
	struct vfio_pci_bar *bar;
	int i;

	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
		bar = &self->device->bars[i];

		if (!(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP)) {
			printf("BAR %d does not support mmap()\n", i);
			ASSERT_EQ(NULL, bar->vaddr);
			continue;
		}

		/*
		 * BARs that support mmap() should be automatically mapped by
		 * vfio_pci_device_init().
		 */
		ASSERT_NE(NULL, bar->vaddr);
		ASSERT_NE(0, bar->info.size);
		printf("BAR %d mapped at %p (size 0x%llx)\n", i, bar->vaddr, bar->info.size);
	}
}

FIXTURE(vfio_pci_irq_test) {
	struct vfio_pci_device *device;
};

FIXTURE_VARIANT(vfio_pci_irq_test) {
	int irq_index;
};

FIXTURE_VARIANT_ADD(vfio_pci_irq_test, msi) {
	.irq_index = VFIO_PCI_MSI_IRQ_INDEX,
};

FIXTURE_VARIANT_ADD(vfio_pci_irq_test, msix) {
	.irq_index = VFIO_PCI_MSIX_IRQ_INDEX,
};

FIXTURE_SETUP(vfio_pci_irq_test)
{
	self->device = vfio_pci_device_init(device_bdf, default_iommu_mode);
}

FIXTURE_TEARDOWN(vfio_pci_irq_test)
{
	vfio_pci_device_cleanup(self->device);
}

TEST_F(vfio_pci_irq_test, enable_trigger_disable)
{
	bool msix = variant->irq_index == VFIO_PCI_MSIX_IRQ_INDEX;
	int msi_eventfd;
	u32 count;
	u64 value;
	int i;

	if (msix)
		count = self->device->msix_info.count;
	else
		count = self->device->msi_info.count;

	count = min(count, MAX_TEST_MSI);

	if (!count)
		SKIP(return, "MSI%s: not supported\n", msix ? "-x" : "");

	vfio_pci_irq_enable(self->device, variant->irq_index, 0, count);
	printf("MSI%s: enabled %d interrupts\n", msix ? "-x" : "", count);

	for (i = 0; i < count; i++) {
		msi_eventfd = self->device->msi_eventfds[i];

		fcntl_set_nonblock(msi_eventfd);
		ASSERT_EQ(-1, read(msi_eventfd, &value, 8));
		ASSERT_EQ(EAGAIN, errno);

		vfio_pci_irq_trigger(self->device, variant->irq_index, i);

		ASSERT_EQ(8, read(msi_eventfd, &value, 8));
		ASSERT_EQ(1, value);
	}

	vfio_pci_irq_disable(self->device, variant->irq_index);
}

TEST_F(vfio_pci_device_test, reset)
{
	if (!(self->device->info.flags & VFIO_DEVICE_FLAGS_RESET))
		SKIP(return, "Device does not support reset\n");

	vfio_pci_device_reset(self->device);
}

int main(int argc, char *argv[])
{
	device_bdf = vfio_selftests_get_bdf(&argc, argv);
	return test_harness_run(argc, argv);
}
tools/testing/selftests/vfio/vfio_pci_driver_test.c (new file, +244 lines)
// SPDX-License-Identifier: GPL-2.0-only
#include <sys/ioctl.h>
#include <sys/mman.h>

#include <linux/sizes.h>
#include <linux/vfio.h>

#include <vfio_util.h>

#include "../kselftest_harness.h"

static const char *device_bdf;

#define ASSERT_NO_MSI(_eventfd) do { \
	u64 __value; \
	\
	ASSERT_EQ(-1, read(_eventfd, &__value, 8)); \
	ASSERT_EQ(EAGAIN, errno); \
} while (0)

static void region_setup(struct vfio_pci_device *device,
			 struct vfio_dma_region *region, u64 size)
{
	const int flags = MAP_SHARED | MAP_ANONYMOUS;
	const int prot = PROT_READ | PROT_WRITE;
	void *vaddr;

	vaddr = mmap(NULL, size, prot, flags, -1, 0);
	VFIO_ASSERT_NE(vaddr, MAP_FAILED);

	region->vaddr = vaddr;
	region->iova = (u64)vaddr;
	region->size = size;

	vfio_pci_dma_map(device, region);
}

static void region_teardown(struct vfio_pci_device *device,
			    struct vfio_dma_region *region)
{
	vfio_pci_dma_unmap(device, region);
	VFIO_ASSERT_EQ(munmap(region->vaddr, region->size), 0);
}

FIXTURE(vfio_pci_driver_test) {
	struct vfio_pci_device *device;
	struct vfio_dma_region memcpy_region;
	void *vaddr;
	int msi_fd;

	u64 size;
	void *src;
	void *dst;
	iova_t src_iova;
	iova_t dst_iova;
	iova_t unmapped_iova;
};

FIXTURE_VARIANT(vfio_pci_driver_test) {
	const char *iommu_mode;
};

#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode) \
FIXTURE_VARIANT_ADD(vfio_pci_driver_test, _iommu_mode) { \
	.iommu_mode = #_iommu_mode, \
}

FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES();

FIXTURE_SETUP(vfio_pci_driver_test)
{
	struct vfio_pci_driver *driver;

	self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode);

	driver = &self->device->driver;

	region_setup(self->device, &self->memcpy_region, SZ_1G);
	region_setup(self->device, &driver->region, SZ_2M);

	/* Any IOVA that doesn't overlap memcpy_region and driver->region. */
	self->unmapped_iova = 8UL * SZ_1G;

	vfio_pci_driver_init(self->device);
	self->msi_fd = self->device->msi_eventfds[driver->msi];

	/*
	 * Use the maximum size supported by the device for memcpy operations,
	 * slimmed down to fit into the memcpy region (divided by 2 so src and
	 * dst regions do not overlap).
	 */
	self->size = self->device->driver.max_memcpy_size;
	self->size = min(self->size, self->memcpy_region.size / 2);

	self->src = self->memcpy_region.vaddr;
	self->dst = self->src + self->size;

	self->src_iova = to_iova(self->device, self->src);
	self->dst_iova = to_iova(self->device, self->dst);
}

FIXTURE_TEARDOWN(vfio_pci_driver_test)
{
	struct vfio_pci_driver *driver = &self->device->driver;

	vfio_pci_driver_remove(self->device);

	region_teardown(self->device, &self->memcpy_region);
	region_teardown(self->device, &driver->region);

	vfio_pci_device_cleanup(self->device);
}

TEST_F(vfio_pci_driver_test, init_remove)
{
	int i;

	for (i = 0; i < 10; i++) {
		vfio_pci_driver_remove(self->device);
		vfio_pci_driver_init(self->device);
	}
}

TEST_F(vfio_pci_driver_test, memcpy_success)
{
	fcntl_set_nonblock(self->msi_fd);

	memset(self->src, 'x', self->size);
	memset(self->dst, 'y', self->size);

	ASSERT_EQ(0, vfio_pci_driver_memcpy(self->device,
					    self->src_iova,
					    self->dst_iova,
					    self->size));

	ASSERT_EQ(0, memcmp(self->src, self->dst, self->size));
	ASSERT_NO_MSI(self->msi_fd);
}

TEST_F(vfio_pci_driver_test, memcpy_from_unmapped_iova)
{
	fcntl_set_nonblock(self->msi_fd);

	/*
	 * Ignore the return value since not all devices will detect and report
	 * accesses to unmapped IOVAs as errors.
	 */
	vfio_pci_driver_memcpy(self->device, self->unmapped_iova,
			       self->dst_iova, self->size);

	ASSERT_NO_MSI(self->msi_fd);
}

TEST_F(vfio_pci_driver_test, memcpy_to_unmapped_iova)
{
	fcntl_set_nonblock(self->msi_fd);

	/*
	 * Ignore the return value since not all devices will detect and report
	 * accesses to unmapped IOVAs as errors.
	 */
	vfio_pci_driver_memcpy(self->device, self->src_iova,
			       self->unmapped_iova, self->size);

	ASSERT_NO_MSI(self->msi_fd);
}

TEST_F(vfio_pci_driver_test, send_msi)
{
	u64 value;

	vfio_pci_driver_send_msi(self->device);
	ASSERT_EQ(8, read(self->msi_fd, &value, 8));
	ASSERT_EQ(1, value);
}

TEST_F(vfio_pci_driver_test, mix_and_match)
{
	u64 value;
	int i;

	for (i = 0; i < 10; i++) {
		memset(self->src, 'x', self->size);
		memset(self->dst, 'y', self->size);

		ASSERT_EQ(0, vfio_pci_driver_memcpy(self->device,
						    self->src_iova,
						    self->dst_iova,
						    self->size));

		ASSERT_EQ(0, memcmp(self->src, self->dst, self->size));

		vfio_pci_driver_memcpy(self->device,
				       self->unmapped_iova,
				       self->dst_iova,
				       self->size);

		vfio_pci_driver_send_msi(self->device);
		ASSERT_EQ(8, read(self->msi_fd, &value, 8));
		ASSERT_EQ(1, value);
	}
}

TEST_F_TIMEOUT(vfio_pci_driver_test, memcpy_storm, 60)
{
	struct vfio_pci_driver *driver = &self->device->driver;
	u64 total_size;
	u64 count;

	fcntl_set_nonblock(self->msi_fd);

	/*
	 * Perform up to 250GiB worth of DMA reads and writes across several
	 * memcpy operations. Some devices can support even more but the test
	 * will take too long.
	 */
	total_size = 250UL * SZ_1G;
	count = min(total_size / self->size, driver->max_memcpy_count);

	printf("Kicking off %lu memcpys of size 0x%lx\n", count, self->size);
	vfio_pci_driver_memcpy_start(self->device,
				     self->src_iova,
				     self->dst_iova,
				     self->size, count);

	ASSERT_EQ(0, vfio_pci_driver_memcpy_wait(self->device));
	ASSERT_NO_MSI(self->msi_fd);
}

int main(int argc, char *argv[])
{
	struct vfio_pci_device *device;

	device_bdf = vfio_selftests_get_bdf(&argc, argv);

	device = vfio_pci_device_init(device_bdf, default_iommu_mode);
	if (!device->driver.ops) {
		fprintf(stderr, "No driver found for device %s\n", device_bdf);
		return KSFT_SKIP;
	}
	vfio_pci_device_cleanup(device);

	return test_harness_run(argc, argv);
}