Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

- Page table code for AMD IOMMU now supports large pages where smaller
page-sizes were mapped before. VFIO had to work around that in the
  past, and I included a patch to remove that workaround (acked by Alex Williamson)

- Patches to unmodularize a couple of IOMMU drivers that would never
work as modules anyway.

- Work to unify the iommu-related pointers in 'struct device' into
one pointer. This work is not finished yet, but will probably be in
the next cycle.

- NUMA aware allocation in iommu-dma code

- Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver

- Scalable mode support for the Intel VT-d driver

- PM runtime improvements for the ARM-SMMU driver

- Support for the QCOM-SMMUv2 IOMMU hardware from Qualcomm

- Various smaller fixes and improvements

* tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (78 commits)
iommu: Check for iommu_ops == NULL in iommu_probe_device()
ACPI/IORT: Don't call iommu_ops->add_device directly
iommu/of: Don't call iommu_ops->add_device directly
iommu: Consolitate ->add/remove_device() calls
iommu/sysfs: Rename iommu_release_device()
dmaengine: sh: rcar-dmac: Use device_iommu_mapped()
xhci: Use device_iommu_mapped()
powerpc/iommu: Use device_iommu_mapped()
ACPI/IORT: Use device_iommu_mapped()
iommu/of: Use device_iommu_mapped()
driver core: Introduce device_iommu_mapped() function
iommu/tegra: Use helper functions to access dev->iommu_fwspec
iommu/qcom: Use helper functions to access dev->iommu_fwspec
iommu/of: Use helper functions to access dev->iommu_fwspec
iommu/mediatek: Use helper functions to access dev->iommu_fwspec
iommu/ipmmu-vmsa: Use helper functions to access dev->iommu_fwspec
iommu/dma: Use helper functions to access dev->iommu_fwspec
iommu/arm-smmu: Use helper functions to access dev->iommu_fwspec
ACPI/IORT: Use helper functions to access dev->iommu_fwspec
iommu: Introduce wrappers around dev->iommu_fwspec
...

+1613 -918
+6 -6
Documentation/admin-guide/kernel-parameters.txt
···
 			By default, super page will be supported if Intel IOMMU
 			has the capability. With this option, super page will
 			not be supported.
-		ecs_off [Default Off]
-			By default, extended context tables will be supported if
-			the hardware advertises that it has support both for the
-			extended tables themselves, and also PASID support. With
-			this option set, extended tables will not be used even
-			on hardware which claims to support them.
+		sm_off [Default Off]
+			By default, scalable mode will be supported if the
+			hardware advertises that it has support for the scalable
+			mode translation. With this option set, scalable mode
+			will not be used even on hardware which claims to support
+			it.
 		tboot_noforce [Default Off]
 			Do not force the Intel IOMMU enabled under tboot.
 			By default, tboot will force Intel IOMMU on, which
+43
Documentation/devicetree/bindings/iommu/arm,smmu.txt
···
 		"arm,mmu-401"
 		"arm,mmu-500"
 		"cavium,smmu-v2"
+		"qcom,smmu-v2"

 		depending on the particular implementation and/or the
 		version of the architecture implemented.
+
+		Qcom SoCs must contain, as below, SoC-specific compatibles
+		along with "qcom,smmu-v2":
+		"qcom,msm8996-smmu-v2", "qcom,smmu-v2",
+		"qcom,sdm845-smmu-v2", "qcom,smmu-v2".
+
+		Qcom SoCs implementing "arm,mmu-500" must also include,
+		as below, SoC-specific compatibles:
+		"qcom,sdm845-smmu-500", "arm,mmu-500"

 - reg : Base address and size of the SMMU.
···
 		property is not valid for SMMUs using stream indexing,
 		or using stream matching with #iommu-cells = <2>, and
 		may be ignored if present in such cases.
+
+- clock-names:    List of the names of clocks input to the device. The
+                  required list depends on particular implementation and
+                  is as follows:
+                  - for "qcom,smmu-v2":
+                    - "bus": clock required for downstream bus access and
+                             for the smmu ptw,
+                    - "iface": clock required to access smmu's registers
+                               through the TCU's programming interface.
+                  - unspecified for other implementations.
+
+- clocks:         Specifiers for all clocks listed in the clock-names property,
+                  as per generic clock bindings.
+
+- power-domains:  Specifiers for power domains required to be powered on for
+                  the SMMU to operate, as per generic power domain bindings.

 ** Deprecated properties:
···
 		iommu-map = <0 &smmu3 0 0x400>;
 		...
 	};
+
+	/* Qcom's arm,smmu-v2 implementation */
+	smmu4: iommu@d00000 {
+		compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
+		reg = <0xd00000 0x10000>;
+
+		#global-interrupts = <1>;
+		interrupts = <GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 320 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 321 IRQ_TYPE_LEVEL_HIGH>;
+		#iommu-cells = <1>;
+		power-domains = <&mmcc MDSS_GDSC>;
+
+		clocks = <&mmcc SMMU_MDP_AXI_CLK>,
+			 <&mmcc SMMU_MDP_AHB_CLK>;
+		clock-names = "bus", "iface";
+	};
+2
Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt
···
 	- "renesas,ipmmu-r8a7743" for the R8A7743 (RZ/G1M) IPMMU.
 	- "renesas,ipmmu-r8a7744" for the R8A7744 (RZ/G1N) IPMMU.
 	- "renesas,ipmmu-r8a7745" for the R8A7745 (RZ/G1E) IPMMU.
+	- "renesas,ipmmu-r8a774a1" for the R8A774A1 (RZ/G2M) IPMMU.
+	- "renesas,ipmmu-r8a774c0" for the R8A774C0 (RZ/G2E) IPMMU.
 	- "renesas,ipmmu-r8a7790" for the R8A7790 (R-Car H2) IPMMU.
 	- "renesas,ipmmu-r8a7791" for the R8A7791 (R-Car M2-W) IPMMU.
 	- "renesas,ipmmu-r8a7793" for the R8A7793 (R-Car M2-N) IPMMU.
+1 -1
arch/powerpc/kernel/eeh.c
···
 	if (!dev)
 		return 0;

-	if (dev->iommu_group) {
+	if (device_iommu_mapped(dev)) {
 		*ppdev = pdev;
 		return 1;
 	}
+2 -2
arch/powerpc/kernel/iommu.c
···
 	if (!device_is_registered(dev))
 		return -ENOENT;

-	if (dev->iommu_group) {
+	if (device_iommu_mapped(dev)) {
 		pr_debug("%s: Skipping device %s with iommu group %d\n",
 			 __func__, dev_name(dev),
 			 iommu_group_id(dev->iommu_group));
···
 	 * and we needn't detach them from the associated
 	 * IOMMU groups
 	 */
-	if (!dev->iommu_group) {
+	if (!device_iommu_mapped(dev)) {
 		pr_debug("iommu_tce: skipping device %s with no tbl\n",
 			 dev_name(dev));
 		return;
+1 -1
arch/x86/kernel/tboot.c
···
  *
  */

-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>
 #include <linux/init_task.h>
 #include <linux/spinlock.h>
 #include <linux/export.h>
+12 -11
drivers/acpi/arm64/iort.c
···
 static struct acpi_iort_node *iort_get_msi_resv_iommu(struct device *dev)
 {
 	struct acpi_iort_node *iommu;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);

 	iommu = iort_get_iort_node(fwspec->iommu_fwnode);
···
 	return NULL;
 }

-static inline const struct iommu_ops *iort_fwspec_iommu_ops(
-				struct iommu_fwspec *fwspec)
+static inline const struct iommu_ops *iort_fwspec_iommu_ops(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
 	return (fwspec && fwspec->ops) ? fwspec->ops : NULL;
 }
···
 {
 	int err = 0;

-	if (ops->add_device && dev->bus && !dev->iommu_group)
-		err = ops->add_device(dev);
+	if (dev->bus && !device_iommu_mapped(dev))
+		err = iommu_probe_device(dev);

 	return err;
 }
···
  */
 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct acpi_iort_its_group *its;
 	struct acpi_iort_node *iommu_node, *its_node = NULL;
 	int i, resv = 0;
···
 	 * a given PCI or named component may map IDs to.
 	 */

-	for (i = 0; i < dev->iommu_fwspec->num_ids; i++) {
+	for (i = 0; i < fwspec->num_ids; i++) {
 		its_node = iort_node_map_id(iommu_node,
-					dev->iommu_fwspec->ids[i],
+					fwspec->ids[i],
 					NULL, IORT_MSI_TYPE);
 		if (its_node)
 			break;
···
 	return (resv == its->its_count) ? resv : -ENODEV;
 }
 #else
-static inline const struct iommu_ops *iort_fwspec_iommu_ops(
-				struct iommu_fwspec *fwspec)
+static inline const struct iommu_ops *iort_fwspec_iommu_ops(struct device *dev)
 { return NULL; }
 static inline int iort_add_device_replay(const struct iommu_ops *ops,
 					 struct device *dev)
···
 	 * If we already translated the fwspec there
 	 * is nothing left to do, return the iommu_ops.
 	 */
-	ops = iort_fwspec_iommu_ops(dev->iommu_fwspec);
+	ops = iort_fwspec_iommu_ops(dev);
 	if (ops)
 		return ops;
···
 	 * add_device callback for dev, replay it to get things in order.
 	 */
 	if (!err) {
-		ops = iort_fwspec_iommu_ops(dev->iommu_fwspec);
+		ops = iort_fwspec_iommu_ops(dev);
 		err = iort_add_device_replay(ops, dev);
 	}
+1 -1
drivers/dma/sh/rcar-dmac.c
···
 	 * level we can't disable it selectively, so ignore channel 0 for now if
 	 * the device is part of an IOMMU group.
 	 */
-	if (pdev->dev.iommu_group) {
+	if (device_iommu_mapped(&pdev->dev)) {
 		dmac->n_channels--;
 		channels_offset = 1;
 	}
+1 -1
drivers/gpu/drm/i915/i915_gem_execbuffer.c
···
  *
  */

-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>
 #include <linux/reservation.h>
 #include <linux/sync_file.h>
 #include <linux/uaccess.h>
+1 -1
drivers/gpu/drm/i915/intel_display.c
···
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_rect.h>
 #include <drm/drm_atomic_uapi.h>
-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>
 #include <linux/reservation.h>

 /* Primary plane formats for gen <= 3 */
+1 -1
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
···
 #include <drm/ttm/ttm_placement.h>
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_module.h>
-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>

 #define VMWGFX_DRIVER_DESC "Linux drm driver for VMware graphics devices"
 #define VMWGFX_CHIP_SVGAII 0
+173 -102
drivers/iommu/amd_iommu.c
···
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */

+#define pr_fmt(fmt)     "AMD-Vi: " fmt
+
 #include <linux/ratelimit.h>
 #include <linux/pci.h>
 #include <linux/acpi.h>
···
 		return pci_alias;
 	}

-	pr_info("AMD-Vi: Using IVRS reported alias %02x:%02x.%d "
+	pr_info("Using IVRS reported alias %02x:%02x.%d "
 		"for device %s[%04x:%04x], kernel reported alias "
 		"%02x:%02x.%d\n", PCI_BUS_NUM(ivrs_alias), PCI_SLOT(ivrs_alias),
 		PCI_FUNC(ivrs_alias), dev_name(dev), pdev->vendor, pdev->device,
···
 	if (pci_alias == devid &&
 	    PCI_BUS_NUM(ivrs_alias) == pdev->bus->number) {
 		pci_add_dma_alias(pdev, ivrs_alias & 0xff);
-		pr_info("AMD-Vi: Added PCI DMA alias %02x.%d for %s\n",
+		pr_info("Added PCI DMA alias %02x.%d for %s\n",
 			PCI_SLOT(ivrs_alias), PCI_FUNC(ivrs_alias),
 			dev_name(dev));
 	}
···
 	dev_data->alias = get_alias(dev);

-	if (dev_is_pci(dev) && pci_iommuv2_capable(to_pci_dev(dev))) {
+	/*
+	 * By default we use passthrough mode for IOMMUv2 capable device.
+	 * But if amd_iommu=force_isolation is set (e.g. to debug DMA to
+	 * invalid address), we ignore the capability for the device so
+	 * it'll be forced to go into translation mode.
+	 */
+	if ((iommu_pass_through || !amd_iommu_force_isolation) &&
+	    dev_is_pci(dev) && pci_iommuv2_capable(to_pci_dev(dev))) {
 		struct amd_iommu *iommu;

 		iommu = amd_iommu_rlookup_table[dev_data->devid];
···
 	int i;

 	for (i = 0; i < 4; ++i)
-		pr_err("AMD-Vi: DTE[%d]: %016llx\n", i,
+		pr_err("DTE[%d]: %016llx\n", i,
 			amd_iommu_dev_table[devid].data[i]);
 }
···
 	int i;

 	for (i = 0; i < 4; ++i)
-		pr_err("AMD-Vi: CMD[%d]: %08x\n", i, cmd->data[i]);
+		pr_err("CMD[%d]: %08x\n", i, cmd->data[i]);
 }

 static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
···
 	dev_data = get_dev_data(&pdev->dev);

 	if (dev_data && __ratelimit(&dev_data->rs)) {
-		dev_err(&pdev->dev, "AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%016llx flags=0x%04x]\n",
+		dev_err(&pdev->dev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
 			domain_id, address, flags);
 	} else if (printk_ratelimit()) {
-		pr_err("AMD-Vi: Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%016llx flags=0x%04x]\n",
+		pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			domain_id, address, flags);
 	}
···
 	if (type == 0) {
 		/* Did we hit the erratum? */
 		if (++count == LOOP_TIMEOUT) {
-			pr_err("AMD-Vi: No event written to event log\n");
+			pr_err("No event written to event log\n");
 			return;
 		}
 		udelay(1);
···
 	if (type == EVENT_TYPE_IO_FAULT) {
 		amd_iommu_report_page_fault(devid, pasid, address, flags);
 		return;
-	} else {
-		dev_err(dev, "AMD-Vi: Event logged [");
 	}

 	switch (type) {
 	case EVENT_TYPE_ILL_DEV:
-		dev_err(dev, "ILLEGAL_DEV_TABLE_ENTRY device=%02x:%02x.%x pasid=0x%05x address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			pasid, address, flags);
 		dump_dte_entry(devid);
 		break;
 	case EVENT_TYPE_DEV_TAB_ERR:
-		dev_err(dev, "DEV_TAB_HARDWARE_ERROR device=%02x:%02x.%x "
-			"address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%02x:%02x.%x "
+			"address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			address, flags);
 		break;
 	case EVENT_TYPE_PAGE_TAB_ERR:
-		dev_err(dev, "PAGE_TAB_HARDWARE_ERROR device=%02x:%02x.%x domain=0x%04x address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [PAGE_TAB_HARDWARE_ERROR device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			pasid, address, flags);
 		break;
 	case EVENT_TYPE_ILL_CMD:
-		dev_err(dev, "ILLEGAL_COMMAND_ERROR address=0x%016llx]\n", address);
+		dev_err(dev, "Event logged [ILLEGAL_COMMAND_ERROR address=0x%llx]\n", address);
 		dump_command(address);
 		break;
 	case EVENT_TYPE_CMD_HARD_ERR:
-		dev_err(dev, "COMMAND_HARDWARE_ERROR address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [COMMAND_HARDWARE_ERROR address=0x%llx flags=0x%04x]\n",
 			address, flags);
 		break;
 	case EVENT_TYPE_IOTLB_INV_TO:
-		dev_err(dev, "IOTLB_INV_TIMEOUT device=%02x:%02x.%x address=0x%016llx]\n",
+		dev_err(dev, "Event logged [IOTLB_INV_TIMEOUT device=%02x:%02x.%x address=0x%llx]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			address);
 		break;
 	case EVENT_TYPE_INV_DEV_REQ:
-		dev_err(dev, "INVALID_DEVICE_REQUEST device=%02x:%02x.%x pasid=0x%05x address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [INVALID_DEVICE_REQUEST device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			pasid, address, flags);
 		break;
···
 		pasid = ((event[0] >> 16) & 0xFFFF)
 			| ((event[1] << 6) & 0xF0000);
 		tag = event[1] & 0x03FF;
-		dev_err(dev, "INVALID_PPR_REQUEST device=%02x:%02x.%x pasid=0x%05x address=0x%016llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [INVALID_PPR_REQUEST device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			pasid, address, flags);
 		break;
 	default:
-		dev_err(dev, "UNKNOWN event[0]=0x%08x event[1]=0x%08x event[2]=0x%08x event[3]=0x%08x\n",
+		dev_err(dev, "Event logged [UNKNOWN event[0]=0x%08x event[1]=0x%08x event[2]=0x%08x event[3]=0x%08x\n",
 			event[0], event[1], event[2], event[3]);
 	}
···
 	struct amd_iommu_fault fault;

 	if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
-		pr_err_ratelimited("AMD-Vi: Unknown PPR request received\n");
+		pr_err_ratelimited("Unknown PPR request received\n");
 		return;
 	}
···
 		if (!iommu_ga_log_notifier)
 			break;

-		pr_debug("AMD-Vi: %s: devid=%#x, ga_tag=%#x\n",
+		pr_debug("%s: devid=%#x, ga_tag=%#x\n",
 			 __func__, GA_DEVID(log_entry),
 			 GA_TAG(log_entry));

 		if (iommu_ga_log_notifier(GA_TAG(log_entry)) != 0)
-			pr_err("AMD-Vi: GA log notifier failed.\n");
+			pr_err("GA log notifier failed.\n");
 		break;
 	default:
 		break;
···
 		iommu->mmio_base + MMIO_STATUS_OFFSET);

 	if (status & MMIO_STATUS_EVT_INT_MASK) {
-		pr_devel("AMD-Vi: Processing IOMMU Event Log\n");
+		pr_devel("Processing IOMMU Event Log\n");
 		iommu_poll_events(iommu);
 	}

 	if (status & MMIO_STATUS_PPR_INT_MASK) {
-		pr_devel("AMD-Vi: Processing IOMMU PPR Log\n");
+		pr_devel("Processing IOMMU PPR Log\n");
 		iommu_poll_ppr_log(iommu);
 	}

 #ifdef CONFIG_IRQ_REMAP
 	if (status & MMIO_STATUS_GALOG_INT_MASK) {
-		pr_devel("AMD-Vi: Processing IOMMU GA Log\n");
+		pr_devel("Processing IOMMU GA Log\n");
 		iommu_poll_ga_log(iommu);
 	}
 #endif
···
 	}

 	if (i == LOOP_TIMEOUT) {
-		pr_alert("AMD-Vi: Completion-Wait loop timed out\n");
+		pr_alert("Completion-Wait loop timed out\n");
 		return -EIO;
 	}
···
 		/* Skip udelay() the first time around */
 		if (count++) {
 			if (count == LOOP_TIMEOUT) {
-				pr_err("AMD-Vi: Command buffer timeout\n");
+				pr_err("Command buffer timeout\n");
 				return -EIO;
 			}
···
  *
  ****************************************************************************/

+static void free_page_list(struct page *freelist)
+{
+	while (freelist != NULL) {
+		unsigned long p = (unsigned long)page_address(freelist);
+		freelist = freelist->freelist;
+		free_page(p);
+	}
+}
+
+static struct page *free_pt_page(unsigned long pt, struct page *freelist)
+{
+	struct page *p = virt_to_page((void *)pt);
+
+	p->freelist = freelist;
+
+	return p;
+}
+
+#define DEFINE_FREE_PT_FN(LVL, FN)						\
+static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist)	\
+{										\
+	unsigned long p;							\
+	u64 *pt;								\
+	int i;									\
+										\
+	pt = (u64 *)__pt;							\
+										\
+	for (i = 0; i < 512; ++i) {						\
+		/* PTE present? */						\
+		if (!IOMMU_PTE_PRESENT(pt[i]))					\
+			continue;						\
+										\
+		/* Large PTE? */						\
+		if (PM_PTE_LEVEL(pt[i]) == 0 ||					\
+		    PM_PTE_LEVEL(pt[i]) == 7)					\
+			continue;						\
+										\
+		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);			\
+		freelist = FN(p, freelist);					\
+	}									\
+										\
+	return free_pt_page((unsigned long)pt, freelist);			\
+}
+
+DEFINE_FREE_PT_FN(l2, free_pt_page)
+DEFINE_FREE_PT_FN(l3, free_pt_l2)
+DEFINE_FREE_PT_FN(l4, free_pt_l3)
+DEFINE_FREE_PT_FN(l5, free_pt_l4)
+DEFINE_FREE_PT_FN(l6, free_pt_l5)
+
+static struct page *free_sub_pt(unsigned long root, int mode,
+				struct page *freelist)
+{
+	switch (mode) {
+	case PAGE_MODE_NONE:
+	case PAGE_MODE_7_LEVEL:
+		break;
+	case PAGE_MODE_1_LEVEL:
+		freelist = free_pt_page(root, freelist);
+		break;
+	case PAGE_MODE_2_LEVEL:
+		freelist = free_pt_l2(root, freelist);
+		break;
+	case PAGE_MODE_3_LEVEL:
+		freelist = free_pt_l3(root, freelist);
+		break;
+	case PAGE_MODE_4_LEVEL:
+		freelist = free_pt_l4(root, freelist);
+		break;
+	case PAGE_MODE_5_LEVEL:
+		freelist = free_pt_l5(root, freelist);
+		break;
+	case PAGE_MODE_6_LEVEL:
+		freelist = free_pt_l6(root, freelist);
+		break;
+	default:
+		BUG();
+	}
+
+	return freelist;
+}
+
+static void free_pagetable(struct protection_domain *domain)
+{
+	unsigned long root = (unsigned long)domain->pt_root;
+	struct page *freelist = NULL;
+
+	BUG_ON(domain->mode < PAGE_MODE_NONE ||
+	       domain->mode > PAGE_MODE_6_LEVEL);
+
+	free_sub_pt(root, domain->mode, freelist);
+
+	free_page_list(freelist);
+}
+
 /*
  * This function is used to add another level to an IO page table. Adding
  * another level increases the size of the address space by 9 bits to a size up
···
 	while (level > end_lvl) {
 		u64 __pte, __npte;
+		int pte_level;

-		__pte = *pte;
+		__pte     = *pte;
+		pte_level = PM_PTE_LEVEL(__pte);

-		if (!IOMMU_PTE_PRESENT(__pte)) {
+		if (!IOMMU_PTE_PRESENT(__pte) ||
+		    pte_level == PAGE_MODE_7_LEVEL) {
 			page = (u64 *)get_zeroed_page(gfp);
 			if (!page)
 				return NULL;
···
 			__npte = PM_LEVEL_PDE(level, iommu_virt_to_phys(page));

 			/* pte could have been changed somewhere. */
-			if (cmpxchg64(pte, __pte, __npte) != __pte) {
+			if (cmpxchg64(pte, __pte, __npte) != __pte)
 				free_page((unsigned long)page);
-				continue;
-			}
+			else if (pte_level == PAGE_MODE_7_LEVEL)
+				domain->updated = true;
+
+			continue;
 		}

 		/* No level skipping support yet */
-		if (PM_PTE_LEVEL(*pte) != level)
+		if (pte_level != level)
 			return NULL;

 		level -= 1;

-		pte = IOMMU_PTE_PAGE(*pte);
+		pte = IOMMU_PTE_PAGE(__pte);

 		if (pte_page && level == end_lvl)
 			*pte_page = pte;
···
 	return pte;
 }

+static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
+{
+	unsigned long pt;
+	int mode;
+
+	while (cmpxchg64(pte, pteval, 0) != pteval) {
+		pr_warn("AMD-Vi: IOMMU pte changed since we read it\n");
+		pteval = *pte;
+	}
+
+	if (!IOMMU_PTE_PRESENT(pteval))
+		return freelist;
+
+	pt   = (unsigned long)IOMMU_PTE_PAGE(pteval);
+	mode = IOMMU_PTE_MODE(pteval);
+
+	return free_sub_pt(pt, mode, freelist);
+}
+
 /*
  * Generic mapping functions. It maps a physical address into a DMA
  * address space. It allocates the page table pages if necessary.
···
 			  int prot,
 			  gfp_t gfp)
 {
+	struct page *freelist = NULL;
 	u64 __pte, *pte;
 	int i, count;
···
 		return -ENOMEM;

 	for (i = 0; i < count; ++i)
-		if (IOMMU_PTE_PRESENT(pte[i]))
-			return -EBUSY;
+		freelist = free_clear_pte(&pte[i], pte[i], freelist);
+
+	if (freelist != NULL)
+		dom->updated = true;

 	if (count > 1) {
 		__pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
···
 		pte[i] = __pte;

 	update_domain(dom);
+
+	/* Everything flushed out, free pages now */
+	free_page_list(freelist);

 	return 0;
 }
···
 	if (id > 0 && id < MAX_DOMAIN_ID)
 		__clear_bit(id, amd_iommu_pd_alloc_bitmap);
 	spin_unlock(&pd_bitmap_lock);
-}
-
-#define DEFINE_FREE_PT_FN(LVL, FN)				\
-static void free_pt_##LVL (unsigned long __pt)			\
-{								\
-	unsigned long p;					\
-	u64 *pt;						\
-	int i;							\
-								\
-	pt = (u64 *)__pt;					\
-								\
-	for (i = 0; i < 512; ++i) {				\
-		/* PTE present? */				\
-		if (!IOMMU_PTE_PRESENT(pt[i]))			\
-			continue;				\
-								\
-		/* Large PTE? */				\
-		if (PM_PTE_LEVEL(pt[i]) == 0 ||			\
-		    PM_PTE_LEVEL(pt[i]) == 7)			\
-			continue;				\
-								\
-		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);	\
-		FN(p);						\
-	}							\
-	free_page((unsigned long)pt);				\
-}
-
-DEFINE_FREE_PT_FN(l2, free_page)
-DEFINE_FREE_PT_FN(l3, free_pt_l2)
-DEFINE_FREE_PT_FN(l4, free_pt_l3)
-DEFINE_FREE_PT_FN(l5, free_pt_l4)
-DEFINE_FREE_PT_FN(l6, free_pt_l5)
-
-static void free_pagetable(struct protection_domain *domain)
-{
-	unsigned long root = (unsigned long)domain->pt_root;
-
-	switch (domain->mode) {
-	case PAGE_MODE_NONE:
-		break;
-	case PAGE_MODE_1_LEVEL:
-		free_page(root);
-		break;
-	case PAGE_MODE_2_LEVEL:
-		free_pt_l2(root);
-		break;
-	case PAGE_MODE_3_LEVEL:
-		free_pt_l3(root);
-		break;
-	case PAGE_MODE_4_LEVEL:
-		free_pt_l4(root);
-		break;
-	case PAGE_MODE_5_LEVEL:
-		free_pt_l5(root);
-		break;
-	case PAGE_MODE_6_LEVEL:
-		free_pt_l6(root);
-		break;
-	default:
-		BUG();
-	}
 }

 static void free_gcr3_tbl_level1(u64 *tbl)
···
 	iommu_detected = 1;

 	if (amd_iommu_unmap_flush)
-		pr_info("AMD-Vi: IO/TLB flush on unmap enabled\n");
+		pr_info("IO/TLB flush on unmap enabled\n");
 	else
-		pr_info("AMD-Vi: Lazy IO/TLB flushing enabled\n");
+		pr_info("Lazy IO/TLB flushing enabled\n");

 	return 0;
···
 	case IOMMU_DOMAIN_DMA:
 		dma_domain = dma_ops_domain_alloc();
 		if (!dma_domain) {
-			pr_err("AMD-Vi: Failed to allocate\n");
+			pr_err("Failed to allocate\n");
 			return NULL;
 		}
 		pdomain = &dma_domain->domain;
···
 	 * legacy mode. So, we force legacy mode instead.
 	 */
 	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir)) {
-		pr_debug("AMD-Vi: %s: Fall back to using intr legacy remap\n",
+		pr_debug("%s: Fall back to using intr legacy remap\n",
 			 __func__);
 		pi_data->is_guest_mode = false;
 	}
+33 -31
drivers/iommu/amd_iommu_init.c
···
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */

+#define pr_fmt(fmt)     "AMD-Vi: " fmt
+
 #include <linux/pci.h>
 #include <linux/acpi.h>
 #include <linux/list.h>
···
 static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
 {
 	if (!request_mem_region(address, end, "amd_iommu")) {
-		pr_err("AMD-Vi: Can not reserve memory region %llx-%llx for mmio\n",
+		pr_err("Can not reserve memory region %llx-%llx for mmio\n",
 			address, end);
-		pr_err("AMD-Vi: This is a BIOS bug. Please contact your hardware vendor\n");
+		pr_err("This is a BIOS bug. Please contact your hardware vendor\n");
 		return NULL;
 	}
···
 	u32 ivhd_size = get_ivhd_header_size(h);

 	if (!ivhd_size) {
-		pr_err("AMD-Vi: Unsupported IVHD type %#x\n", h->type);
+		pr_err("Unsupported IVHD type %#x\n", h->type);
 		return -EINVAL;
 	}
···
 		checksum += p[i];
 	if (checksum != 0) {
 		/* ACPI table corrupt */
-		pr_err(FW_BUG "AMD-Vi: IVRS invalid checksum\n");
+		pr_err(FW_BUG "IVRS invalid checksum\n");
 		return -ENODEV;
 	}
···
 		if (!(entry->id == id && entry->cmd_line))
 			continue;

-		pr_info("AMD-Vi: Command-line override present for %s id %d - ignoring\n",
+		pr_info("Command-line override present for %s id %d - ignoring\n",
 			type == IVHD_SPECIAL_IOAPIC ? "IOAPIC" : "HPET", id);

 		*devid = entry->devid;
···
 		    !entry->cmd_line)
 			continue;

-		pr_info("AMD-Vi: Command-line override for hid:%s uid:%s\n",
+		pr_info("Command-line override for hid:%s uid:%s\n",
 			hid, uid);
 		*devid = entry->devid;
 		return 0;
···
 	entry->cmd_line = cmd_line;
 	entry->root_devid = (entry->devid & (~0x7));

-	pr_info("AMD-Vi:%s, add hid:%s, uid:%s, rdevid:%d\n",
+	pr_info("%s, add hid:%s, uid:%s, rdevid:%d\n",
 		entry->cmd_line ? "cmd" : "ivrs",
 		entry->hid, entry->uid, entry->root_devid);
···
 	 */
 	ivhd_size = get_ivhd_header_size(h);
 	if (!ivhd_size) {
-		pr_err("AMD-Vi: Unsupported IVHD type %#x\n", h->type);
+		pr_err("Unsupported IVHD type %#x\n", h->type);
 		return -EINVAL;
 	}
···
 	pci_write_config_dword(iommu->dev, 0xf0, 0x90 | (1 << 8));

 	pci_write_config_dword(iommu->dev, 0xf4, value | 0x4);
-	pr_info("AMD-Vi: Applying erratum 746 workaround for IOMMU at %s\n",
+	pr_info("Applying erratum 746 workaround for IOMMU at %s\n",
 		dev_name(&iommu->dev->dev));

 	/* Clear the enable writing bit */
···
 	/* Set L2_DEBUG_3[AtsIgnoreIWDis] = 1 */
 	iommu_write_l2(iommu, 0x47, value | BIT(0));

-	pr_info("AMD-Vi: Applying ATS write check workaround for IOMMU at %s\n",
+	pr_info("Applying ATS write check workaround for IOMMU at %s\n",
 		dev_name(&iommu->dev->dev));
 }
···
 	iommu->index = amd_iommus_present++;

 	if (unlikely(iommu->index >= MAX_IOMMUS)) {
-		WARN(1, "AMD-Vi: System has more IOMMUs than supported by this driver\n");
+		WARN(1, "System has more IOMMUs than supported by this driver\n");
 		return -ENOSYS;
 	}
···
 	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
 	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
 	    (val != val2)) {
-		pr_err("AMD-Vi: Unable to write to IOMMU perf counter.\n");
+		pr_err("Unable to write to IOMMU perf counter.\n");
 		amd_iommu_pc_present = false;
 		return;
 	}

-	pr_info("AMD-Vi: IOMMU performance counters supported\n");
+	pr_info("IOMMU performance counters supported\n");

 	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
 	iommu->max_banks = (u8) ((val >> 12) & 0x3f);
···
 	for_each_iommu(iommu) {
 		int i;

-		pr_info("AMD-Vi: Found IOMMU at %s cap 0x%hx\n",
+		pr_info("Found IOMMU at %s cap 0x%hx\n",
 			dev_name(&iommu->dev->dev), iommu->cap_ptr);

 		if (iommu->cap & (1 << IOMMU_CAP_EFR)) {
-			pr_info("AMD-Vi: Extended features (%#llx):\n",
+			pr_info("Extended features (%#llx):\n",
 				iommu->features);
 			for (i = 0; i < ARRAY_SIZE(feat_str); ++i) {
 				if (iommu_feature(iommu, (1ULL << i)))
···
 		}
 	}
 	if (irq_remapping_enabled) {
-		pr_info("AMD-Vi: Interrupt remapping enabled\n");
+		pr_info("Interrupt remapping enabled\n");
 		if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
-			pr_info("AMD-Vi: virtual APIC enabled\n");
+			pr_info("Virtual APIC enabled\n");
 		if (amd_iommu_xt_mode == IRQ_REMAP_X2APIC_MODE)
-			pr_info("AMD-Vi: X2APIC enabled\n");
+			pr_info("X2APIC enabled\n");
 	}
 }
···

 		devid = get_ioapic_devid(id);
 		if (devid < 0) {
-			pr_err("%sAMD-Vi: IOAPIC[%d] not in IVRS table\n",
+			pr_err("%s: IOAPIC[%d] not in IVRS table\n",
 				fw_bug, id);
 			ret = false;
 		} else if (devid == IOAPIC_SB_DEVID) {
···
 		 * when the BIOS is buggy and provides us the wrong
 		 * device id for the IOAPIC in the system.
 		 */
-		pr_err("%sAMD-Vi: No southbridge IOAPIC found\n", fw_bug);
+		pr_err("%s: No southbridge IOAPIC found\n", fw_bug);
 	}

 	if (!ret)
-		pr_err("AMD-Vi: Disabling interrupt remapping\n");
+		pr_err("Disabling interrupt remapping\n");

 	return ret;
 }
···
 		return -ENODEV;
 	else if (ACPI_FAILURE(status)) {
 		const char *err = acpi_format_exception(status);
-		pr_err("AMD-Vi: IVRS table error: %s\n", err);
+		pr_err("IVRS table error: %s\n", err);
 		return -EINVAL;
 	}
···
 		return false;
 	else if (ACPI_FAILURE(status)) {
 		const char *err = acpi_format_exception(status);
-		pr_err("AMD-Vi: IVRS table error: %s\n", err);
+		pr_err("IVRS table error: %s\n", err);
 		return false;
 	}
···
 		ret = early_amd_iommu_init();
 		init_state = ret ? IOMMU_INIT_ERROR : IOMMU_ACPI_FINISHED;
 		if (init_state == IOMMU_ACPI_FINISHED && amd_iommu_disabled) {
-			pr_info("AMD-Vi: AMD IOMMU disabled on kernel command-line\n");
+			pr_info("AMD IOMMU disabled on kernel command-line\n");
 			free_dma_resources();
 			free_iommu_resources();
 			init_state = IOMMU_CMDLINE_DISABLED;
···
 	    (boot_cpu_data.microcode <= 0x080011ff))
 		return true;

-	pr_notice("AMD-Vi: IOMMU not currently supported when SME is active\n");
+	pr_notice("IOMMU not currently supported when SME is active\n");

 	return false;
 }
···
 	ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn);

 	if (ret != 4) {
-		pr_err("AMD-Vi: Invalid command line: ivrs_ioapic%s\n", str);
+		pr_err("Invalid command line: ivrs_ioapic%s\n", str);
 		return 1;
 	}

 	if (early_ioapic_map_size == EARLY_MAP_SIZE) {
-		pr_err("AMD-Vi: Early IOAPIC map overflow - ignoring ivrs_ioapic%s\n",
+		pr_err("Early IOAPIC map overflow - ignoring ivrs_ioapic%s\n",
 			str);
 		return 1;
 	}
···
 	ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn);

 	if (ret != 4) {
-		pr_err("AMD-Vi: Invalid command line: ivrs_hpet%s\n", str);
+		pr_err("Invalid command line: ivrs_hpet%s\n", str);
 		return 1;
 	}

 	if (early_hpet_map_size == EARLY_MAP_SIZE) {
-		pr_err("AMD-Vi: Early HPET map overflow - ignoring ivrs_hpet%s\n",
+		pr_err("Early HPET map overflow - ignoring ivrs_hpet%s\n",
 			str);
 		return 1;
 	}
···
 	ret = sscanf(str, "[%x:%x.%x]=%s", &bus, &dev, &fn, acpiid);
 	if (ret != 4) {
-		pr_err("AMD-Vi: Invalid command line: ivrs_acpihid(%s)\n", str);
+		pr_err("Invalid command line: ivrs_acpihid(%s)\n", str);
 		return 1;
 	}
···
 	uid = p;

 	if (!hid || !(*hid) || !uid) {
-		pr_err("AMD-Vi: Invalid command line: hid or uid\n");
+		pr_err("Invalid command line: hid or uid\n");
 		return 1;
 	}
drivers/iommu/amd_iommu_types.h (+1)
···
 #define PAGE_MODE_4_LEVEL 0x04
 #define PAGE_MODE_5_LEVEL 0x05
 #define PAGE_MODE_6_LEVEL 0x06
+#define PAGE_MODE_7_LEVEL 0x07

 #define PM_LEVEL_SHIFT(x)	(12 + ((x) * 9))
 #define PM_LEVEL_SIZE(x)	(((x) < 6) ? \
drivers/iommu/amd_iommu_v2.c (+2)
···
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */

+#define pr_fmt(fmt)	"AMD-Vi: " fmt
+
 #include <linux/mmu_notifier.h>
 #include <linux/amd-iommu.h>
 #include <linux/mm_types.h>
drivers/iommu/arm-smmu-v3.c (+37 -26)
···
 #include <linux/interrupt.h>
 #include <linux/iommu.h>
 #include <linux/iopoll.h>
-#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
 #include <linux/msi.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
···
 #define MSI_IOVA_BASE		0x8000000
 #define MSI_IOVA_LENGTH		0x100000

+/*
+ * not really modular, but the easiest way to keep compat with existing
+ * bootargs behaviour is to continue using module_param_named here.
+ */
 static bool disable_bypass = 1;
 module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
···
 	struct arm_smmu_strtab_cfg	strtab_cfg;

-	u32				sync_count;
+	/* Hi16xx adds an extra 32 bits of goodness to its MSI payload */
+	union {
+		u32			sync_count;
+		u64			padding;
+	};

 	/* IOMMU core code handle */
 	struct iommu_device		iommu;
···
 	u32 cons = (Q_WRP(q, q->cons) | Q_IDX(q, q->cons)) + 1;

 	q->cons = Q_OVF(q, q->cons) | Q_WRP(q, cons) | Q_IDX(q, cons);
-	writel(q->cons, q->cons_reg);
+
+	/*
+	 * Ensure that all CPU accesses (reads and writes) to the queue
+	 * are complete before we update the cons pointer.
+	 */
+	mb();
+	writel_relaxed(q->cons, q->cons_reg);
 }

 static int queue_sync_prod(struct arm_smmu_queue *q)
···
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
-		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
+		/*
+		 * Commands are written little-endian, but we want the SMMU to
+		 * receive MSIData, and thus write it back to memory, in CPU
+		 * byte order, so big-endian needs an extra byteswap here.
+		 */
+		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA,
+				     cpu_to_le32(ent->sync.msidata));
 		cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
 		break;
 	default:
···
 static void arm_smmu_detach_dev(struct device *dev)
 {
-	struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct arm_smmu_master_data *master = fwspec->iommu_priv;

 	master->ste.assigned = false;
-	arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
+	arm_smmu_install_ste_for_dev(fwspec);
 }

 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_master_data *master;
 	struct arm_smmu_strtab_ent *ste;

-	if (!dev->iommu_fwspec)
+	if (!fwspec)
 		return -ENOENT;

-	master = dev->iommu_fwspec->iommu_priv;
+	master = fwspec->iommu_priv;
 	smmu = master->smmu;
 	ste = &master->ste;
···
 		ste->s2_cfg = &smmu_domain->s2_cfg;
 	}

-	arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
+	arm_smmu_install_ste_for_dev(fwspec);
 out_unlock:
 	mutex_unlock(&smmu_domain->init_mutex);
 	return ret;
···
 	int i, ret;
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_master_data *master;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct iommu_group *group;

 	if (!fwspec || fwspec->ops != &arm_smmu_ops)
···
 static void arm_smmu_remove_device(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_master_data *master;
 	struct arm_smmu_device *smmu;
···
 	return 0;
 }

-static int arm_smmu_device_remove(struct platform_device *pdev)
+static void arm_smmu_device_shutdown(struct platform_device *pdev)
 {
 	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);

 	arm_smmu_device_disable(smmu);
-
-	return 0;
-}
-
-static void arm_smmu_device_shutdown(struct platform_device *pdev)
-{
-	arm_smmu_device_remove(pdev);
 }

 static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,smmu-v3", },
 	{ },
 };
-MODULE_DEVICE_TABLE(of, arm_smmu_of_match);

 static struct platform_driver arm_smmu_driver = {
 	.driver	= {
 		.name			= "arm-smmu-v3",
 		.of_match_table		= of_match_ptr(arm_smmu_of_match),
+		.suppress_bind_attrs	= true,
 	},
 	.probe	= arm_smmu_device_probe,
-	.remove	= arm_smmu_device_remove,
 	.shutdown = arm_smmu_device_shutdown,
 };
-module_platform_driver(arm_smmu_driver);
-
-MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
-MODULE_AUTHOR("Will Deacon <will.deacon@arm.com>");
-MODULE_LICENSE("GPL v2");
+builtin_platform_driver(arm_smmu_driver);
drivers/iommu/arm-smmu.c (+174 -35)
···
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/iommu.h>
 #include <linux/iopoll.h>
-#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/of_iommu.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
···
 #define MSI_IOVA_LENGTH		0x100000

 static int force_stage;
+/*
+ * not really modular, but the easiest way to keep compat with existing
+ * bootargs behaviour is to continue using module_param() here.
+ */
 module_param(force_stage, int, S_IRUGO);
 MODULE_PARM_DESC(force_stage,
 	"Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation.");
···
 	GENERIC_SMMU,
 	ARM_MMU500,
 	CAVIUM_SMMUV2,
+	QCOM_SMMUV2,
 };

 struct arm_smmu_s2cr {
···
 	u32				num_global_irqs;
 	u32				num_context_irqs;
 	unsigned int			*irqs;
+	struct clk_bulk_data		*clks;
+	int				num_clks;

 	u32				cavium_id_base; /* Specific to Cavium */
···
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
 	{ 0, NULL},
 };
+
+static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+	if (pm_runtime_enabled(smmu->dev))
+		return pm_runtime_get_sync(smmu->dev);
+
+	return 0;
+}
+
+static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+	if (pm_runtime_enabled(smmu->dev))
+		pm_runtime_put(smmu->dev);
+}

 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
···
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	int irq;
+	int ret, irq;

 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
+		return;
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
 		return;

 	/*
···
 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+
+	arm_smmu_rpm_put(smmu);
 }

 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
···
 static int arm_smmu_master_alloc_smes(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_master_cfg *cfg = fwspec->iommu_priv;
 	struct arm_smmu_device *smmu = cfg->smmu;
 	struct arm_smmu_smr *smrs = smmu->smrs;
···
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
···
 		return -ENODEV;

 	smmu = fwspec_smmu(fwspec);
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return ret;
+
 	/* Ensure that the domain is finalised */
 	ret = arm_smmu_init_domain_context(domain, smmu);
 	if (ret < 0)
-		return ret;
+		goto rpm_put;

 	/*
 	 * Sanity check the domain. We don't support domains across
···
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto rpm_put;
 	}

 	/* Looks ok, so add the device to the domain */
-	return arm_smmu_domain_add_master(smmu_domain, fwspec);
+	ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
+
+rpm_put:
+	arm_smmu_rpm_put(smmu);
+	return ret;
 }

 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
 	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	int ret;

 	if (!ops)
 		return -ENODEV;

-	return ops->map(ops, iova, paddr, size, prot);
+	arm_smmu_rpm_get(smmu);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
 }

 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
 	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	size_t ret;

 	if (!ops)
 		return 0;

-	return ops->unmap(ops, iova, size);
+	arm_smmu_rpm_get(smmu);
+	ret = ops->unmap(ops, iova, size);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
 }

 static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;

-	if (smmu_domain->tlb_ops)
+	if (smmu_domain->tlb_ops) {
+		arm_smmu_rpm_get(smmu);
 		smmu_domain->tlb_ops->tlb_flush_all(smmu_domain);
+		arm_smmu_rpm_put(smmu);
+	}
 }

 static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;

-	if (smmu_domain->tlb_ops)
+	if (smmu_domain->tlb_ops) {
+		arm_smmu_rpm_get(smmu);
 		smmu_domain->tlb_ops->tlb_sync(smmu_domain);
+		arm_smmu_rpm_put(smmu);
+	}
 }

 static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
···
 	u32 tmp;
 	u64 phys;
 	unsigned long va, flags;
+	int ret;
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return 0;

 	cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
···
 		dev_err(dev, "PAR = 0x%llx\n", phys);
 		return 0;
 	}
+
+	arm_smmu_rpm_put(smmu);

 	return (phys & GENMASK_ULL(39, 12)) | (iova & 0xfff);
 }
···
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_master_cfg *cfg;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	int i, ret;

 	if (using_legacy_binding) {
···
 		 * will allocate/initialise a new one. Thus we need to update fwspec for
 		 * later use.
 		 */
-		fwspec = dev->iommu_fwspec;
+		fwspec = dev_iommu_fwspec_get(dev);
 		if (ret)
 			goto out_free;
 	} else if (fwspec && fwspec->ops == &arm_smmu_ops) {
···
 	while (i--)
 		cfg->smendx[i] = INVALID_SMENDX;

+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		goto out_cfg_free;
+
 	ret = arm_smmu_master_alloc_smes(dev);
+	arm_smmu_rpm_put(smmu);
+
 	if (ret)
 		goto out_cfg_free;

 	iommu_device_link(&smmu->iommu, dev);
+
+	device_link_add(dev, smmu->dev,
+			DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);

 	return 0;
···
 static void arm_smmu_remove_device(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_master_cfg *cfg;
 	struct arm_smmu_device *smmu;
-
+	int ret;

 	if (!fwspec || fwspec->ops != &arm_smmu_ops)
 		return;
···
 	cfg  = fwspec->iommu_priv;
 	smmu = cfg->smmu;

+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return;
+
 	iommu_device_unlink(&smmu->iommu, dev);
 	arm_smmu_master_free_smes(fwspec);
+
+	arm_smmu_rpm_put(smmu);
+
 	iommu_group_remove_device(dev);
 	kfree(fwspec->iommu_priv);
 	iommu_fwspec_free(dev);
···
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
 	struct iommu_group *group = NULL;
 	int i, idx;
···
 };

 #define ARM_SMMU_MATCH_DATA(name, ver, imp)	\
-static struct arm_smmu_match_data name = { .version = ver, .model = imp }
+static const struct arm_smmu_match_data name = { .version = ver, .model = imp }

 ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);

 static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
···
 	{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
 	{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
 	{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+	{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
 	{ },
 };
-MODULE_DEVICE_TABLE(of, arm_smmu_of_match);

 #ifdef CONFIG_ACPI
 static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
···
 		smmu->irqs[i] = irq;
 	}

+	err = devm_clk_bulk_get_all(dev, &smmu->clks);
+	if (err < 0) {
+		dev_err(dev, "failed to get clocks %d\n", err);
+		return err;
+	}
+	smmu->num_clks = err;
+
+	err = clk_bulk_prepare_enable(smmu->num_clks, smmu->clks);
+	if (err)
+		return err;
+
 	err = arm_smmu_device_cfg_probe(smmu);
 	if (err)
 		return err;
···
 	arm_smmu_test_smr_masks(smmu);

 	/*
+	 * We want to avoid touching dev->power.lock in fastpaths unless
+	 * it's really going to do something useful - pm_runtime_enabled()
+	 * can serve as an ideal proxy for that decision. So, conditionally
+	 * enable pm_runtime.
+	 */
+	if (dev->pm_domain) {
+		pm_runtime_set_active(dev);
+		pm_runtime_enable(dev);
+	}
+
+	/*
 	 * For ACPI and generic DT bindings, an SMMU will be probed before
 	 * any device which might need it, so we want the bus ops in place
 	 * ready to handle default domain setup as soon as any SMMU exists.
···
 }
 device_initcall_sync(arm_smmu_legacy_bus_init);

-static int arm_smmu_device_remove(struct platform_device *pdev)
+static void arm_smmu_device_shutdown(struct platform_device *pdev)
 {
 	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);

 	if (!smmu)
-		return -ENODEV;
+		return;

 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
 		dev_err(&pdev->dev, "removing device with active domains!\n");

+	arm_smmu_rpm_get(smmu);
 	/* Turn the thing off */
 	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	arm_smmu_rpm_put(smmu);
+
+	if (pm_runtime_enabled(smmu->dev))
+		pm_runtime_force_suspend(smmu->dev);
+	else
+		clk_bulk_disable(smmu->num_clks, smmu->clks);
+
+	clk_bulk_unprepare(smmu->num_clks, smmu->clks);
+}
+
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+	int ret;
+
+	ret = clk_bulk_enable(smmu->num_clks, smmu->clks);
+	if (ret)
+		return ret;
+
+	arm_smmu_device_reset(smmu);
+
 	return 0;
 }

-static void arm_smmu_device_shutdown(struct platform_device *pdev)
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
 {
-	arm_smmu_device_remove(pdev);
+	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+	clk_bulk_disable(smmu->num_clks, smmu->clks);
+
+	return 0;
 }

 static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
 {
-	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+	if (pm_runtime_suspended(dev))
+		return 0;

-	arm_smmu_device_reset(smmu);
-	return 0;
+	return arm_smmu_runtime_resume(dev);
 }

-static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
+static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
+{
+	if (pm_runtime_suspended(dev))
+		return 0;
+
+	return arm_smmu_runtime_suspend(dev);
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+	SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume)
+	SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+			   arm_smmu_runtime_resume, NULL)
+};

 static struct platform_driver arm_smmu_driver = {
 	.driver	= {
-		.name		= "arm-smmu",
-		.of_match_table	= of_match_ptr(arm_smmu_of_match),
-		.pm		= &arm_smmu_pm_ops,
+		.name			= "arm-smmu",
+		.of_match_table		= of_match_ptr(arm_smmu_of_match),
+		.pm			= &arm_smmu_pm_ops,
+		.suppress_bind_attrs	= true,
 	},
 	.probe	= arm_smmu_device_probe,
-	.remove	= arm_smmu_device_remove,
 	.shutdown = arm_smmu_device_shutdown,
 };
-module_platform_driver(arm_smmu_driver);
-
-MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
-MODULE_AUTHOR("Will Deacon <will.deacon@arm.com>");
-MODULE_LICENSE("GPL v2");
+builtin_platform_driver(arm_smmu_driver);
drivers/iommu/dma-iommu.c (+11 -11)
···
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {

-	if (!is_of_node(dev->iommu_fwspec->iommu_fwnode))
+	if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
 		iort_iommu_msi_get_resv_regions(dev, list);

 }
···
 	kvfree(pages);
 }

-static struct page **__iommu_dma_alloc_pages(unsigned int count,
-		unsigned long order_mask, gfp_t gfp)
+static struct page **__iommu_dma_alloc_pages(struct device *dev,
+		unsigned int count, unsigned long order_mask, gfp_t gfp)
 {
 	struct page **pages;
-	unsigned int i = 0, array_size = count * sizeof(*pages);
+	unsigned int i = 0, nid = dev_to_node(dev);

 	order_mask &= (2U << MAX_ORDER) - 1;
 	if (!order_mask)
 		return NULL;

-	if (array_size <= PAGE_SIZE)
-		pages = kzalloc(array_size, GFP_KERNEL);
-	else
-		pages = vzalloc(array_size);
+	pages = kvzalloc(count * sizeof(*pages), GFP_KERNEL);
 	if (!pages)
 		return NULL;
···
 	for (order_mask &= (2U << __fls(count)) - 1;
 	     order_mask; order_mask &= ~order_size) {
 		unsigned int order = __fls(order_mask);
+		gfp_t alloc_flags = gfp;

 		order_size = 1U << order;
-		page = alloc_pages((order_mask - order_size) ?
-				   gfp | __GFP_NORETRY : gfp, order);
+		if (order_mask > order_size)
+			alloc_flags |= __GFP_NORETRY;
+		page = alloc_pages_node(nid, alloc_flags, order);
 		if (!page)
 			continue;
 		if (!order)
···
 	alloc_sizes = min_size;

 	count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	pages = __iommu_dma_alloc_pages(count, alloc_sizes >> PAGE_SHIFT, gfp);
+	pages = __iommu_dma_alloc_pages(dev, count, alloc_sizes >> PAGE_SHIFT,
+					gfp);
 	if (!pages)
 		return NULL;
drivers/iommu/dmar.c (+61 -30)
···
 	int head, tail;
 	struct q_inval *qi = iommu->qi;
 	int wait_index = (index + 1) % QI_LENGTH;
+	int shift = qi_shift(iommu);

 	if (qi->desc_status[wait_index] == QI_ABORT)
 		return -EAGAIN;
···
 	 */
 	if (fault & DMA_FSTS_IQE) {
 		head = readl(iommu->reg + DMAR_IQH_REG);
-		if ((head >> DMAR_IQ_SHIFT) == index) {
-			pr_err("VT-d detected invalid descriptor: "
-				"low=%llx, high=%llx\n",
-				(unsigned long long)qi->desc[index].low,
-				(unsigned long long)qi->desc[index].high);
-			memcpy(&qi->desc[index], &qi->desc[wait_index],
-					sizeof(struct qi_desc));
+		if ((head >> shift) == index) {
+			struct qi_desc *desc = qi->desc + head;
+
+			/*
+			 * desc->qw2 and desc->qw3 are either reserved or
+			 * used by software as private data. We won't print
+			 * out these two qw's for security consideration.
+			 */
+			pr_err("VT-d detected invalid descriptor: qw0 = %llx, qw1 = %llx\n",
+			       (unsigned long long)desc->qw0,
+			       (unsigned long long)desc->qw1);
+			memcpy(desc, qi->desc + (wait_index << shift),
+			       1 << shift);
 			writel(DMA_FSTS_IQE, iommu->reg + DMAR_FSTS_REG);
 			return -EINVAL;
 		}
···
 	 */
 	if (fault & DMA_FSTS_ITE) {
 		head = readl(iommu->reg + DMAR_IQH_REG);
-		head = ((head >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+		head = ((head >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
 		head |= 1;
 		tail = readl(iommu->reg + DMAR_IQT_REG);
-		tail = ((tail >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+		tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;

 		writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
···
 {
 	int rc;
 	struct q_inval *qi = iommu->qi;
-	struct qi_desc *hw, wait_desc;
+	int offset, shift, length;
+	struct qi_desc wait_desc;
 	int wait_index, index;
 	unsigned long flags;

 	if (!qi)
 		return 0;
-
-	hw = qi->desc;

 restart:
 	rc = 0;
···
 	index = qi->free_head;
 	wait_index = (index + 1) % QI_LENGTH;
+	shift = qi_shift(iommu);
+	length = 1 << shift;

 	qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;

-	hw[index] = *desc;
-
-	wait_desc.low = QI_IWD_STATUS_DATA(QI_DONE) |
+	offset = index << shift;
+	memcpy(qi->desc + offset, desc, length);
+	wait_desc.qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
 			QI_IWD_STATUS_WRITE | QI_IWD_TYPE;
-	wait_desc.high = virt_to_phys(&qi->desc_status[wait_index]);
+	wait_desc.qw1 = virt_to_phys(&qi->desc_status[wait_index]);
+	wait_desc.qw2 = 0;
+	wait_desc.qw3 = 0;

-	hw[wait_index] = wait_desc;
+	offset = wait_index << shift;
+	memcpy(qi->desc + offset, &wait_desc, length);

 	qi->free_head = (qi->free_head + 2) % QI_LENGTH;
 	qi->free_cnt -= 2;
···
 	 * update the HW tail register indicating the presence of
 	 * new descriptors.
 	 */
-	writel(qi->free_head << DMAR_IQ_SHIFT, iommu->reg + DMAR_IQT_REG);
+	writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG);

 	while (qi->desc_status[wait_index] != QI_DONE) {
 		/*
···
 {
 	struct qi_desc desc;

-	desc.low = QI_IEC_TYPE;
-	desc.high = 0;
+	desc.qw0 = QI_IEC_TYPE;
+	desc.qw1 = 0;
+	desc.qw2 = 0;
+	desc.qw3 = 0;

 	/* should never fail */
 	qi_submit_sync(&desc, iommu);
···
 {
 	struct qi_desc desc;

-	desc.low = QI_CC_FM(fm) | QI_CC_SID(sid) | QI_CC_DID(did)
+	desc.qw0 = QI_CC_FM(fm) | QI_CC_SID(sid) | QI_CC_DID(did)
 			| QI_CC_GRAN(type) | QI_CC_TYPE;
-	desc.high = 0;
+	desc.qw1 = 0;
+	desc.qw2 = 0;
+	desc.qw3 = 0;

 	qi_submit_sync(&desc, iommu);
 }
···
 	if (cap_read_drain(iommu->cap))
 		dr = 1;

-	desc.low = QI_IOTLB_DID(did) | QI_IOTLB_DR(dr) | QI_IOTLB_DW(dw)
+	desc.qw0 = QI_IOTLB_DID(did) | QI_IOTLB_DR(dr) | QI_IOTLB_DW(dw)
 		| QI_IOTLB_GRAN(type) | QI_IOTLB_TYPE;
-	desc.high = QI_IOTLB_ADDR(addr) | QI_IOTLB_IH(ih)
+	desc.qw1 = QI_IOTLB_ADDR(addr) | QI_IOTLB_IH(ih)
 		| QI_IOTLB_AM(size_order);
+	desc.qw2 = 0;
+	desc.qw3 = 0;

 	qi_submit_sync(&desc, iommu);
 }
···
 	if (mask) {
 		WARN_ON_ONCE(addr & ((1ULL << (VTD_PAGE_SHIFT + mask)) - 1));
 		addr |= (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1;
-		desc.high = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
+		desc.qw1 = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
 	} else
-		desc.high = QI_DEV_IOTLB_ADDR(addr);
+		desc.qw1 = QI_DEV_IOTLB_ADDR(addr);

 	if (qdep >= QI_DEV_IOTLB_MAX_INVS)
 		qdep = 0;

-	desc.low = QI_DEV_IOTLB_SID(sid) | QI_DEV_IOTLB_QDEP(qdep) |
+	desc.qw0 = QI_DEV_IOTLB_SID(sid) | QI_DEV_IOTLB_QDEP(qdep) |
 		   QI_DIOTLB_TYPE | QI_DEV_IOTLB_PFSID(pfsid);
+	desc.qw2 = 0;
+	desc.qw3 = 0;

 	qi_submit_sync(&desc, iommu);
 }
···
 	u32 sts;
 	unsigned long flags;
 	struct q_inval *qi = iommu->qi;
+	u64 val = virt_to_phys(qi->desc);

 	qi->free_head = qi->free_tail = 0;
 	qi->free_cnt = QI_LENGTH;
+
+	/*
+	 * Set DW=1 and QS=1 in IQA_REG when Scalable Mode capability
+	 * is present.
+	 */
+	if (ecap_smts(iommu->ecap))
+		val |= (1 << 11) | 1;

 	raw_spin_lock_irqsave(&iommu->register_lock, flags);

 	/* write zero to the tail reg */
 	writel(0, iommu->reg + DMAR_IQT_REG);

-	dmar_writeq(iommu->reg + DMAR_IQA_REG, virt_to_phys(qi->desc));
+	dmar_writeq(iommu->reg + DMAR_IQA_REG, val);

 	iommu->gcmd |= DMA_GCMD_QIE;
 	writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
···
 	qi = iommu->qi;

-
-	desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0);
+	/*
+	 * Need two pages to accommodate 256 descriptors of 256 bits each
+	 * if the remapping hardware supports scalable mode translation.
+	 */
+	desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
+				     !!ecap_smts(iommu->ecap));
 	if (!desc_page) {
 		kfree(qi);
 		iommu->qi = NULL;
drivers/iommu/intel-iommu.c (+179 -172)
··· 292 292 } 293 293 294 294 /* 295 - * 0: readable 296 - * 1: writable 297 - * 2-6: reserved 298 - * 7: super page 299 - * 8-10: available 300 - * 11: snoop behavior 301 - * 12-63: Host physcial address 302 - */ 303 - struct dma_pte { 304 - u64 val; 305 - }; 306 - 307 - static inline void dma_clear_pte(struct dma_pte *pte) 308 - { 309 - pte->val = 0; 310 - } 311 - 312 - static inline u64 dma_pte_addr(struct dma_pte *pte) 313 - { 314 - #ifdef CONFIG_64BIT 315 - return pte->val & VTD_PAGE_MASK; 316 - #else 317 - /* Must have a full atomic 64-bit read */ 318 - return __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK; 319 - #endif 320 - } 321 - 322 - static inline bool dma_pte_present(struct dma_pte *pte) 323 - { 324 - return (pte->val & 3) != 0; 325 - } 326 - 327 - static inline bool dma_pte_superpage(struct dma_pte *pte) 328 - { 329 - return (pte->val & DMA_PTE_LARGE_PAGE); 330 - } 331 - 332 - static inline int first_pte_in_page(struct dma_pte *pte) 333 - { 334 - return !((unsigned long)pte & ~VTD_PAGE_MASK); 335 - } 336 - 337 - /* 338 295 * This domain is a statically identity mapping domain. 339 296 * 1. This domain creats a static 1:1 mapping to all usable memory. 340 297 * 2. It maps to each iommu if successful. ··· 363 406 static int dmar_forcedac; 364 407 static int intel_iommu_strict; 365 408 static int intel_iommu_superpage = 1; 366 - static int intel_iommu_ecs = 1; 367 - static int intel_iommu_pasid28; 409 + static int intel_iommu_sm = 1; 368 410 static int iommu_identity_mapping; 369 411 370 412 #define IDENTMAP_ALL 1 371 413 #define IDENTMAP_GFX 2 372 414 #define IDENTMAP_AZALIA 4 373 415 374 - /* Broadwell and Skylake have broken ECS support — normal so-called "second 375 - * level" translation of DMA requests-without-PASID doesn't actually happen 376 - * unless you also set the NESTE bit in an extended context-entry. 
Which of 377 - * course means that SVM doesn't work because it's trying to do nested 378 - * translation of the physical addresses it finds in the process page tables, 379 - * through the IOVA->phys mapping found in the "second level" page tables. 380 - * 381 - * The VT-d specification was retroactively changed to change the definition 382 - * of the capability bits and pretend that Broadwell/Skylake never happened... 383 - * but unfortunately the wrong bit was changed. It's ECS which is broken, but 384 - * for some reason it was the PASID capability bit which was redefined (from 385 - * bit 28 on BDW/SKL to bit 40 in future). 386 - * 387 - * So our test for ECS needs to eschew those implementations which set the old 388 - * PASID capabiity bit 28, since those are the ones on which ECS is broken. 389 - * Unless we are working around the 'pasid28' limitations, that is, by putting 390 - * the device into passthrough mode for normal DMA and thus masking the bug. 391 - */ 392 - #define ecs_enabled(iommu) (intel_iommu_ecs && ecap_ecs(iommu->ecap) && \ 393 - (intel_iommu_pasid28 || !ecap_broken_pasid(iommu->ecap))) 394 - /* PASID support is thus enabled if ECS is enabled and *either* of the old 395 - * or new capability bits are set. */ 396 - #define pasid_enabled(iommu) (ecs_enabled(iommu) && \ 397 - (ecap_pasid(iommu->ecap) || ecap_broken_pasid(iommu->ecap))) 416 + #define sm_supported(iommu) (intel_iommu_sm && ecap_smts((iommu)->ecap)) 417 + #define pasid_supported(iommu) (sm_supported(iommu) && \ 418 + ecap_pasid((iommu)->ecap)) 398 419 399 420 int intel_iommu_gfx_mapped; 400 421 EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped); ··· 383 448 384 449 /* 385 450 * Iterate over elements in device_domain_list and call the specified 386 - * callback @fn against each element. This helper should only be used 387 - * in the context where the device_domain_lock has already been holden. 451 + * callback @fn against each element. 
388 452 */ 389 453 int for_each_device_domain(int (*fn)(struct device_domain_info *info, 390 454 void *data), void *data) 391 455 { 392 456 int ret = 0; 457 + unsigned long flags; 393 458 struct device_domain_info *info; 394 459 395 - assert_spin_locked(&device_domain_lock); 460 + spin_lock_irqsave(&device_domain_lock, flags); 396 461 list_for_each_entry(info, &device_domain_list, global) { 397 462 ret = fn(info, data); 398 - if (ret) 463 + if (ret) { 464 + spin_unlock_irqrestore(&device_domain_lock, flags); 399 465 return ret; 466 + } 400 467 } 468 + spin_unlock_irqrestore(&device_domain_lock, flags); 401 469 402 470 return 0; 403 471 } ··· 456 518 } else if (!strncmp(str, "sp_off", 6)) { 457 519 pr_info("Disable supported super page\n"); 458 520 intel_iommu_superpage = 0; 459 - } else if (!strncmp(str, "ecs_off", 7)) { 460 - printk(KERN_INFO 461 - "Intel-IOMMU: disable extended context table support\n"); 462 - intel_iommu_ecs = 0; 463 - } else if (!strncmp(str, "pasid28", 7)) { 464 - printk(KERN_INFO 465 - "Intel-IOMMU: enable pre-production PASID support\n"); 466 - intel_iommu_pasid28 = 1; 467 - iommu_identity_mapping |= IDENTMAP_GFX; 521 + } else if (!strncmp(str, "sm_off", 6)) { 522 + pr_info("Intel-IOMMU: disable scalable mode support\n"); 523 + intel_iommu_sm = 0; 468 524 } else if (!strncmp(str, "tboot_noforce", 13)) { 469 525 printk(KERN_INFO 470 526 "Intel-IOMMU: not forcing on after tboot. 
This could expose security risk for tboot\n"); ··· 705 773 u64 *entry; 706 774 707 775 entry = &root->lo; 708 - if (ecs_enabled(iommu)) { 776 + if (sm_supported(iommu)) { 709 777 if (devfn >= 0x80) { 710 778 devfn -= 0x80; 711 779 entry = &root->hi; ··· 847 915 if (context) 848 916 free_pgtable_page(context); 849 917 850 - if (!ecs_enabled(iommu)) 918 + if (!sm_supported(iommu)) 851 919 continue; 852 920 853 921 context = iommu_context_addr(iommu, i, 0x80, 0); ··· 1199 1267 unsigned long flag; 1200 1268 1201 1269 addr = virt_to_phys(iommu->root_entry); 1202 - if (ecs_enabled(iommu)) 1203 - addr |= DMA_RTADDR_RTT; 1270 + if (sm_supported(iommu)) 1271 + addr |= DMA_RTADDR_SMT; 1204 1272 1205 1273 raw_spin_lock_irqsave(&iommu->register_lock, flag); 1206 1274 dmar_writeq(iommu->reg + DMAR_RTADDR_REG, addr); ··· 1214 1282 raw_spin_unlock_irqrestore(&iommu->register_lock, flag); 1215 1283 } 1216 1284 1217 - static void iommu_flush_write_buffer(struct intel_iommu *iommu) 1285 + void iommu_flush_write_buffer(struct intel_iommu *iommu) 1218 1286 { 1219 1287 u32 val; 1220 1288 unsigned long flag; ··· 1626 1694 */ 1627 1695 set_bit(0, iommu->domain_ids); 1628 1696 1697 + /* 1698 + * Vt-d spec rev3.0 (section 6.2.3.1) requires that each pasid 1699 + * entry for first-level or pass-through translation modes should 1700 + * be programmed with a domain id different from those used for 1701 + * second-level or nested translation. We reserve a domain id for 1702 + * this purpose. 
1703 + */ 1704 + if (sm_supported(iommu)) 1705 + set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); 1706 + 1629 1707 return 0; 1630 1708 } 1631 1709 ··· 1700 1758 free_context_table(iommu); 1701 1759 1702 1760 #ifdef CONFIG_INTEL_IOMMU_SVM 1703 - if (pasid_enabled(iommu)) { 1761 + if (pasid_supported(iommu)) { 1704 1762 if (ecap_prs(iommu->ecap)) 1705 1763 intel_svm_finish_prq(iommu); 1706 - intel_svm_exit(iommu); 1707 1764 } 1708 1765 #endif 1709 1766 } ··· 1922 1981 free_domain_mem(domain); 1923 1982 } 1924 1983 1984 + /* 1985 + * Get the PASID directory size for scalable mode context entry. 1986 + * Value of X in the PDTS field of a scalable mode context entry 1987 + * indicates PASID directory with 2^(X + 7) entries. 1988 + */ 1989 + static inline unsigned long context_get_sm_pds(struct pasid_table *table) 1990 + { 1991 + int pds, max_pde; 1992 + 1993 + max_pde = table->max_pasid >> PASID_PDE_SHIFT; 1994 + pds = find_first_bit((unsigned long *)&max_pde, MAX_NR_PASID_BITS); 1995 + if (pds < 7) 1996 + return 0; 1997 + 1998 + return pds - 7; 1999 + } 2000 + 2001 + /* 2002 + * Set the RID_PASID field of a scalable mode context entry. The 2003 + * IOMMU hardware will use the PASID value set in this field for 2004 + * DMA translations of DMA requests without PASID. 2005 + */ 2006 + static inline void 2007 + context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid) 2008 + { 2009 + context->hi |= pasid & ((1 << 20) - 1); 2010 + context->hi |= (1 << 20); 2011 + } 2012 + 2013 + /* 2014 + * Set the DTE(Device-TLB Enable) field of a scalable mode context 2015 + * entry. 2016 + */ 2017 + static inline void context_set_sm_dte(struct context_entry *context) 2018 + { 2019 + context->lo |= (1 << 2); 2020 + } 2021 + 2022 + /* 2023 + * Set the PRE(Page Request Enable) field of a scalable mode context 2024 + * entry. 
2025 + */ 2026 + static inline void context_set_sm_pre(struct context_entry *context) 2027 + { 2028 + context->lo |= (1 << 4); 2029 + } 2030 + 2031 + /* Convert value to context PASID directory size field coding. */ 2032 + #define context_pdts(pds) (((pds) & 0x7) << 9) 2033 + 1925 2034 static int domain_context_mapping_one(struct dmar_domain *domain, 1926 2035 struct intel_iommu *iommu, 2036 + struct pasid_table *table, 1927 2037 u8 bus, u8 devfn) 1928 2038 { 1929 2039 u16 did = domain->iommu_did[iommu->seq_id]; ··· 1982 1990 struct device_domain_info *info = NULL; 1983 1991 struct context_entry *context; 1984 1992 unsigned long flags; 1985 - struct dma_pte *pgd; 1986 - int ret, agaw; 1993 + int ret; 1987 1994 1988 1995 WARN_ON(did == 0); 1989 1996 ··· 2028 2037 } 2029 2038 } 2030 2039 2031 - pgd = domain->pgd; 2032 - 2033 2040 context_clear_entry(context); 2034 - context_set_domain_id(context, did); 2035 2041 2036 - /* 2037 - * Skip top levels of page tables for iommu which has less agaw 2038 - * than default. Unnecessary for PT mode. 
2039 - */ 2040 - if (translation != CONTEXT_TT_PASS_THROUGH) { 2041 - for (agaw = domain->agaw; agaw != iommu->agaw; agaw--) { 2042 - ret = -ENOMEM; 2043 - pgd = phys_to_virt(dma_pte_addr(pgd)); 2044 - if (!dma_pte_present(pgd)) 2045 - goto out_unlock; 2046 - } 2042 + if (sm_supported(iommu)) { 2043 + unsigned long pds; 2047 2044 2045 + WARN_ON(!table); 2046 + 2047 + /* Setup the PASID DIR pointer: */ 2048 + pds = context_get_sm_pds(table); 2049 + context->lo = (u64)virt_to_phys(table->table) | 2050 + context_pdts(pds); 2051 + 2052 + /* Setup the RID_PASID field: */ 2053 + context_set_sm_rid2pasid(context, PASID_RID2PASID); 2054 + 2055 + /* 2056 + * Setup the Device-TLB enable bit and Page request 2057 + * Enable bit: 2058 + */ 2048 2059 info = iommu_support_dev_iotlb(domain, iommu, bus, devfn); 2049 2060 if (info && info->ats_supported) 2050 - translation = CONTEXT_TT_DEV_IOTLB; 2051 - else 2052 - translation = CONTEXT_TT_MULTI_LEVEL; 2053 - 2054 - context_set_address_root(context, virt_to_phys(pgd)); 2055 - context_set_address_width(context, iommu->agaw); 2061 + context_set_sm_dte(context); 2062 + if (info && info->pri_supported) 2063 + context_set_sm_pre(context); 2056 2064 } else { 2057 - /* 2058 - * In pass through mode, AW must be programmed to 2059 - * indicate the largest AGAW value supported by 2060 - * hardware. And ASR is ignored by hardware. 2061 - */ 2062 - context_set_address_width(context, iommu->msagaw); 2065 + struct dma_pte *pgd = domain->pgd; 2066 + int agaw; 2067 + 2068 + context_set_domain_id(context, did); 2069 + context_set_translation_type(context, translation); 2070 + 2071 + if (translation != CONTEXT_TT_PASS_THROUGH) { 2072 + /* 2073 + * Skip top levels of page tables for iommu which has 2074 + * less agaw than default. Unnecessary for PT mode. 
2075 + */ 2076 + for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) { 2077 + ret = -ENOMEM; 2078 + pgd = phys_to_virt(dma_pte_addr(pgd)); 2079 + if (!dma_pte_present(pgd)) 2080 + goto out_unlock; 2081 + } 2082 + 2083 + info = iommu_support_dev_iotlb(domain, iommu, bus, devfn); 2084 + if (info && info->ats_supported) 2085 + translation = CONTEXT_TT_DEV_IOTLB; 2086 + else 2087 + translation = CONTEXT_TT_MULTI_LEVEL; 2088 + 2089 + context_set_address_root(context, virt_to_phys(pgd)); 2090 + context_set_address_width(context, agaw); 2091 + } else { 2092 + /* 2093 + * In pass through mode, AW must be programmed to 2094 + * indicate the largest AGAW value supported by 2095 + * hardware. And ASR is ignored by hardware. 2096 + */ 2097 + context_set_address_width(context, iommu->msagaw); 2098 + } 2063 2099 } 2064 2100 2065 - context_set_translation_type(context, translation); 2066 2101 context_set_fault_enable(context); 2067 2102 context_set_present(context); 2068 2103 domain_flush_cache(domain, context, sizeof(*context)); ··· 2122 2105 struct domain_context_mapping_data { 2123 2106 struct dmar_domain *domain; 2124 2107 struct intel_iommu *iommu; 2108 + struct pasid_table *table; 2125 2109 }; 2126 2110 2127 2111 static int domain_context_mapping_cb(struct pci_dev *pdev, ··· 2131 2113 struct domain_context_mapping_data *data = opaque; 2132 2114 2133 2115 return domain_context_mapping_one(data->domain, data->iommu, 2134 - PCI_BUS_NUM(alias), alias & 0xff); 2116 + data->table, PCI_BUS_NUM(alias), 2117 + alias & 0xff); 2135 2118 } 2136 2119 2137 2120 static int 2138 2121 domain_context_mapping(struct dmar_domain *domain, struct device *dev) 2139 2122 { 2123 + struct domain_context_mapping_data data; 2124 + struct pasid_table *table; 2140 2125 struct intel_iommu *iommu; 2141 2126 u8 bus, devfn; 2142 - struct domain_context_mapping_data data; 2143 2127 2144 2128 iommu = device_to_iommu(dev, &bus, &devfn); 2145 2129 if (!iommu) 2146 2130 return -ENODEV; 2147 2131 2132 + table = 
intel_pasid_get_table(dev); 2133 + 2148 2134 if (!dev_is_pci(dev)) 2149 - return domain_context_mapping_one(domain, iommu, bus, devfn); 2135 + return domain_context_mapping_one(domain, iommu, table, 2136 + bus, devfn); 2150 2137 2151 2138 data.domain = domain; 2152 2139 data.iommu = iommu; 2140 + data.table = table; 2153 2141 2154 2142 return pci_for_each_dma_alias(to_pci_dev(dev), 2155 2143 &domain_context_mapping_cb, &data); ··· 2491 2467 dmar_find_matched_atsr_unit(pdev)) 2492 2468 info->ats_supported = 1; 2493 2469 2494 - if (ecs_enabled(iommu)) { 2495 - if (pasid_enabled(iommu)) { 2470 + if (sm_supported(iommu)) { 2471 + if (pasid_supported(iommu)) { 2496 2472 int features = pci_pasid_features(pdev); 2497 2473 if (features >= 0) 2498 2474 info->pasid_supported = features | 1; ··· 2538 2514 list_add(&info->global, &device_domain_list); 2539 2515 if (dev) 2540 2516 dev->archdata.iommu = info; 2517 + spin_unlock_irqrestore(&device_domain_lock, flags); 2541 2518 2542 - if (dev && dev_is_pci(dev) && info->pasid_supported) { 2519 + /* PASID table is mandatory for a PCI device in scalable mode. 
*/ 2520 + if (dev && dev_is_pci(dev) && sm_supported(iommu)) { 2543 2521 ret = intel_pasid_alloc_table(dev); 2544 2522 if (ret) { 2545 - pr_warn("No pasid table for %s, pasid disabled\n", 2546 - dev_name(dev)); 2547 - info->pasid_supported = 0; 2523 + pr_err("PASID table allocation for %s failed\n", 2524 + dev_name(dev)); 2525 + dmar_remove_one_dev_info(domain, dev); 2526 + return NULL; 2527 + } 2528 + 2529 + /* Setup the PASID entry for requests without PASID: */ 2530 + spin_lock(&iommu->lock); 2531 + if (hw_pass_through && domain_type_is_si(domain)) 2532 + ret = intel_pasid_setup_pass_through(iommu, domain, 2533 + dev, PASID_RID2PASID); 2534 + else 2535 + ret = intel_pasid_setup_second_level(iommu, domain, 2536 + dev, PASID_RID2PASID); 2537 + spin_unlock(&iommu->lock); 2538 + if (ret) { 2539 + pr_err("Setup RID2PASID for %s failed\n", 2540 + dev_name(dev)); 2541 + dmar_remove_one_dev_info(domain, dev); 2542 + return NULL; 2548 2543 } 2549 2544 } 2550 - spin_unlock_irqrestore(&device_domain_lock, flags); 2551 2545 2552 2546 if (dev && domain_context_mapping(domain, dev)) { 2553 2547 pr_err("Domain context map for %s failed\n", dev_name(dev)); ··· 3329 3287 * We need to ensure the system pasid table is no bigger 3330 3288 * than the smallest supported. 
3331 3289 */ 3332 - if (pasid_enabled(iommu)) { 3290 + if (pasid_supported(iommu)) { 3333 3291 u32 temp = 2 << ecap_pss(iommu->ecap); 3334 3292 3335 3293 intel_pasid_max_id = min_t(u32, temp, ··· 3390 3348 if (!ecap_pass_through(iommu->ecap)) 3391 3349 hw_pass_through = 0; 3392 3350 #ifdef CONFIG_INTEL_IOMMU_SVM 3393 - if (pasid_enabled(iommu)) 3351 + if (pasid_supported(iommu)) 3394 3352 intel_svm_init(iommu); 3395 3353 #endif 3396 3354 } ··· 3494 3452 iommu_flush_write_buffer(iommu); 3495 3453 3496 3454 #ifdef CONFIG_INTEL_IOMMU_SVM 3497 - if (pasid_enabled(iommu) && ecap_prs(iommu->ecap)) { 3455 + if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) { 3498 3456 ret = intel_svm_enable_prq(iommu); 3499 3457 if (ret) 3500 3458 goto free_iommu; ··· 4377 4335 goto out; 4378 4336 4379 4337 #ifdef CONFIG_INTEL_IOMMU_SVM 4380 - if (pasid_enabled(iommu)) 4338 + if (pasid_supported(iommu)) 4381 4339 intel_svm_init(iommu); 4382 4340 #endif 4383 4341 ··· 4394 4352 iommu_flush_write_buffer(iommu); 4395 4353 4396 4354 #ifdef CONFIG_INTEL_IOMMU_SVM 4397 - if (pasid_enabled(iommu) && ecap_prs(iommu->ecap)) { 4355 + if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) { 4398 4356 ret = intel_svm_enable_prq(iommu); 4399 4357 if (ret) 4400 4358 goto disable_iommu; ··· 4969 4927 iommu = info->iommu; 4970 4928 4971 4929 if (info->dev) { 4930 + if (dev_is_pci(info->dev) && sm_supported(iommu)) 4931 + intel_pasid_tear_down_entry(iommu, info->dev, 4932 + PASID_RID2PASID); 4933 + 4972 4934 iommu_disable_dev_iotlb(info); 4973 4935 domain_context_clear(iommu, info->dev); 4974 4936 intel_pasid_free_table(info->dev); ··· 5300 5254 } 5301 5255 5302 5256 #ifdef CONFIG_INTEL_IOMMU_SVM 5303 - #define MAX_NR_PASID_BITS (20) 5304 - static inline unsigned long intel_iommu_get_pts(struct device *dev) 5305 - { 5306 - int pts, max_pasid; 5307 - 5308 - max_pasid = intel_pasid_get_dev_max_id(dev); 5309 - pts = find_first_bit((unsigned long *)&max_pasid, MAX_NR_PASID_BITS); 5310 - if (pts < 5) 5311 - 
return 0; 5312 - 5313 - return pts - 5; 5314 - } 5315 - 5316 5257 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev) 5317 5258 { 5318 5259 struct device_domain_info *info; ··· 5331 5298 sdev->sid = PCI_DEVID(info->bus, info->devfn); 5332 5299 5333 5300 if (!(ctx_lo & CONTEXT_PASIDE)) { 5334 - if (iommu->pasid_state_table) 5335 - context[1].hi = (u64)virt_to_phys(iommu->pasid_state_table); 5336 - context[1].lo = (u64)virt_to_phys(info->pasid_table->table) | 5337 - intel_iommu_get_pts(sdev->dev); 5338 - 5339 - wmb(); 5340 - /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are both 5341 - * extended to permit requests-with-PASID if the PASIDE bit 5342 - * is set. which makes sense. For CONTEXT_TT_PASS_THROUGH, 5343 - * however, the PASIDE bit is ignored and requests-with-PASID 5344 - * are unconditionally blocked. Which makes less sense. 5345 - * So convert from CONTEXT_TT_PASS_THROUGH to one of the new 5346 - * "guest mode" translation types depending on whether ATS 5347 - * is available or not. Annoyingly, we can't use the new 5348 - * modes *unless* PASIDE is set. */ 5349 - if ((ctx_lo & CONTEXT_TT_MASK) == (CONTEXT_TT_PASS_THROUGH << 2)) { 5350 - ctx_lo &= ~CONTEXT_TT_MASK; 5351 - if (info->ats_supported) 5352 - ctx_lo |= CONTEXT_TT_PT_PASID_DEV_IOTLB << 2; 5353 - else 5354 - ctx_lo |= CONTEXT_TT_PT_PASID << 2; 5355 - } 5356 5301 ctx_lo |= CONTEXT_PASIDE; 5357 - if (iommu->pasid_state_table) 5358 - ctx_lo |= CONTEXT_DINVE; 5359 - if (info->pri_supported) 5360 - ctx_lo |= CONTEXT_PRS; 5361 5302 context[0].lo = ctx_lo; 5362 5303 wmb(); 5363 5304 iommu->flush.flush_context(iommu, sdev->did, sdev->sid,
drivers/iommu/intel-pasid.c (+433 -16)
··· 9 9 10 10 #define pr_fmt(fmt) "DMAR: " fmt 11 11 12 + #include <linux/bitops.h> 13 + #include <linux/cpufeature.h> 12 14 #include <linux/dmar.h> 13 15 #include <linux/intel-iommu.h> 14 16 #include <linux/iommu.h> ··· 125 123 struct pasid_table *pasid_table; 126 124 struct pasid_table_opaque data; 127 125 struct page *pages; 128 - size_t size, count; 126 + int max_pasid = 0; 129 127 int ret, order; 128 + int size; 130 129 130 + might_sleep(); 131 131 info = dev->archdata.iommu; 132 - if (WARN_ON(!info || !dev_is_pci(dev) || 133 - !info->pasid_supported || info->pasid_table)) 132 + if (WARN_ON(!info || !dev_is_pci(dev) || info->pasid_table)) 134 133 return -EINVAL; 135 134 136 135 /* DMA alias device already has a pasid table, use it: */ ··· 141 138 if (ret) 142 139 goto attach_out; 143 140 144 - pasid_table = kzalloc(sizeof(*pasid_table), GFP_ATOMIC); 141 + pasid_table = kzalloc(sizeof(*pasid_table), GFP_KERNEL); 145 142 if (!pasid_table) 146 143 return -ENOMEM; 147 144 INIT_LIST_HEAD(&pasid_table->dev); 148 145 149 - size = sizeof(struct pasid_entry); 150 - count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id); 151 - order = get_order(size * count); 146 + if (info->pasid_supported) 147 + max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)), 148 + intel_pasid_max_id); 149 + 150 + size = max_pasid >> (PASID_PDE_SHIFT - 3); 151 + order = size ? get_order(size) : 0; 152 152 pages = alloc_pages_node(info->iommu->node, 153 - GFP_ATOMIC | __GFP_ZERO, 154 - order); 153 + GFP_KERNEL | __GFP_ZERO, order); 155 154 if (!pages) 156 155 return -ENOMEM; 157 156 158 157 pasid_table->table = page_address(pages); 159 158 pasid_table->order = order; 160 - pasid_table->max_pasid = count; 159 + pasid_table->max_pasid = 1 << (order + PAGE_SHIFT + 3); 161 160 162 161 attach_out: 163 162 device_attach_pasid_table(info, pasid_table); ··· 167 162 return 0; 168 163 } 169 164 165 + /* Get PRESENT bit of a PASID directory entry. 
*/ 166 + static inline bool 167 + pasid_pde_is_present(struct pasid_dir_entry *pde) 168 + { 169 + return READ_ONCE(pde->val) & PASID_PTE_PRESENT; 170 + } 171 + 172 + /* Get PASID table from a PASID directory entry. */ 173 + static inline struct pasid_entry * 174 + get_pasid_table_from_pde(struct pasid_dir_entry *pde) 175 + { 176 + if (!pasid_pde_is_present(pde)) 177 + return NULL; 178 + 179 + return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK); 180 + } 181 + 170 182 void intel_pasid_free_table(struct device *dev) 171 183 { 172 184 struct device_domain_info *info; 173 185 struct pasid_table *pasid_table; 186 + struct pasid_dir_entry *dir; 187 + struct pasid_entry *table; 188 + int i, max_pde; 174 189 175 190 info = dev->archdata.iommu; 176 - if (!info || !dev_is_pci(dev) || 177 - !info->pasid_supported || !info->pasid_table) 191 + if (!info || !dev_is_pci(dev) || !info->pasid_table) 178 192 return; 179 193 180 194 pasid_table = info->pasid_table; ··· 201 177 202 178 if (!list_empty(&pasid_table->dev)) 203 179 return; 180 + 181 + /* Free scalable mode PASID directory tables: */ 182 + dir = pasid_table->table; 183 + max_pde = pasid_table->max_pasid >> PASID_PDE_SHIFT; 184 + for (i = 0; i < max_pde; i++) { 185 + table = get_pasid_table_from_pde(&dir[i]); 186 + free_pgtable_page(table); 187 + } 204 188 205 189 free_pages((unsigned long)pasid_table->table, pasid_table->order); 206 190 kfree(pasid_table); ··· 238 206 239 207 struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid) 240 208 { 209 + struct device_domain_info *info; 241 210 struct pasid_table *pasid_table; 211 + struct pasid_dir_entry *dir; 242 212 struct pasid_entry *entries; 213 + int dir_index, index; 243 214 244 215 pasid_table = intel_pasid_get_table(dev); 245 216 if (WARN_ON(!pasid_table || pasid < 0 || 246 217 pasid >= intel_pasid_get_dev_max_id(dev))) 247 218 return NULL; 248 219 249 - entries = pasid_table->table; 220 + dir = pasid_table->table; 221 + info = dev->archdata.iommu; 
222 + dir_index = pasid >> PASID_PDE_SHIFT; 223 + index = pasid & PASID_PTE_MASK; 250 224 251 - return &entries[pasid]; 225 + spin_lock(&pasid_lock); 226 + entries = get_pasid_table_from_pde(&dir[dir_index]); 227 + if (!entries) { 228 + entries = alloc_pgtable_page(info->iommu->node); 229 + if (!entries) { 230 + spin_unlock(&pasid_lock); 231 + return NULL; 232 + } 233 + 234 + WRITE_ONCE(dir[dir_index].val, 235 + (u64)virt_to_phys(entries) | PASID_PTE_PRESENT); 236 + } 237 + spin_unlock(&pasid_lock); 238 + 239 + return &entries[index]; 252 240 } 253 241 254 242 /* ··· 276 224 */ 277 225 static inline void pasid_clear_entry(struct pasid_entry *pe) 278 226 { 279 - WRITE_ONCE(pe->val, 0); 227 + WRITE_ONCE(pe->val[0], 0); 228 + WRITE_ONCE(pe->val[1], 0); 229 + WRITE_ONCE(pe->val[2], 0); 230 + WRITE_ONCE(pe->val[3], 0); 231 + WRITE_ONCE(pe->val[4], 0); 232 + WRITE_ONCE(pe->val[5], 0); 233 + WRITE_ONCE(pe->val[6], 0); 234 + WRITE_ONCE(pe->val[7], 0); 280 235 } 281 236 282 - void intel_pasid_clear_entry(struct device *dev, int pasid) 237 + static void intel_pasid_clear_entry(struct device *dev, int pasid) 283 238 { 284 239 struct pasid_entry *pe; 285 240 ··· 295 236 return; 296 237 297 238 pasid_clear_entry(pe); 239 + } 240 + 241 + static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits) 242 + { 243 + u64 old; 244 + 245 + old = READ_ONCE(*ptr); 246 + WRITE_ONCE(*ptr, (old & ~mask) | bits); 247 + } 248 + 249 + /* 250 + * Setup the DID(Domain Identifier) field (Bit 64~79) of scalable mode 251 + * PASID entry. 252 + */ 253 + static inline void 254 + pasid_set_domain_id(struct pasid_entry *pe, u64 value) 255 + { 256 + pasid_set_bits(&pe->val[1], GENMASK_ULL(15, 0), value); 257 + } 258 + 259 + /* 260 + * Get domain ID value of a scalable mode PASID entry. 
261 + */ 262 + static inline u16 263 + pasid_get_domain_id(struct pasid_entry *pe) 264 + { 265 + return (u16)(READ_ONCE(pe->val[1]) & GENMASK_ULL(15, 0)); 266 + } 267 + 268 + /* 269 + * Setup the SLPTPTR(Second Level Page Table Pointer) field (Bit 12~63) 270 + * of a scalable mode PASID entry. 271 + */ 272 + static inline void 273 + pasid_set_slptr(struct pasid_entry *pe, u64 value) 274 + { 275 + pasid_set_bits(&pe->val[0], VTD_PAGE_MASK, value); 276 + } 277 + 278 + /* 279 + * Setup the AW(Address Width) field (Bit 2~4) of a scalable mode PASID 280 + * entry. 281 + */ 282 + static inline void 283 + pasid_set_address_width(struct pasid_entry *pe, u64 value) 284 + { 285 + pasid_set_bits(&pe->val[0], GENMASK_ULL(4, 2), value << 2); 286 + } 287 + 288 + /* 289 + * Setup the PGTT(PASID Granular Translation Type) field (Bit 6~8) 290 + * of a scalable mode PASID entry. 291 + */ 292 + static inline void 293 + pasid_set_translation_type(struct pasid_entry *pe, u64 value) 294 + { 295 + pasid_set_bits(&pe->val[0], GENMASK_ULL(8, 6), value << 6); 296 + } 297 + 298 + /* 299 + * Enable fault processing by clearing the FPD(Fault Processing 300 + * Disable) field (Bit 1) of a scalable mode PASID entry. 301 + */ 302 + static inline void pasid_set_fault_enable(struct pasid_entry *pe) 303 + { 304 + pasid_set_bits(&pe->val[0], 1 << 1, 0); 305 + } 306 + 307 + /* 308 + * Setup the SRE(Supervisor Request Enable) field (Bit 128) of a 309 + * scalable mode PASID entry. 310 + */ 311 + static inline void pasid_set_sre(struct pasid_entry *pe) 312 + { 313 + pasid_set_bits(&pe->val[2], 1 << 0, 1); 314 + } 315 + 316 + /* 317 + * Setup the P(Present) field (Bit 0) of a scalable mode PASID 318 + * entry. 319 + */ 320 + static inline void pasid_set_present(struct pasid_entry *pe) 321 + { 322 + pasid_set_bits(&pe->val[0], 1 << 0, 1); 323 + } 324 + 325 + /* 326 + * Setup Page Walk Snoop bit (Bit 87) of a scalable mode PASID 327 + * entry. 
328 + */ 329 + static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value) 330 + { 331 + pasid_set_bits(&pe->val[1], 1 << 23, value); 332 + } 333 + 334 + /* 335 + * Setup the First Level Page table Pointer field (Bit 140~191) 336 + * of a scalable mode PASID entry. 337 + */ 338 + static inline void 339 + pasid_set_flptr(struct pasid_entry *pe, u64 value) 340 + { 341 + pasid_set_bits(&pe->val[2], VTD_PAGE_MASK, value); 342 + } 343 + 344 + /* 345 + * Setup the First Level Paging Mode field (Bit 130~131) of a 346 + * scalable mode PASID entry. 347 + */ 348 + static inline void 349 + pasid_set_flpm(struct pasid_entry *pe, u64 value) 350 + { 351 + pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2); 352 + } 353 + 354 + static void 355 + pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu, 356 + u16 did, int pasid) 357 + { 358 + struct qi_desc desc; 359 + 360 + desc.qw0 = QI_PC_DID(did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid); 361 + desc.qw1 = 0; 362 + desc.qw2 = 0; 363 + desc.qw3 = 0; 364 + 365 + qi_submit_sync(&desc, iommu); 366 + } 367 + 368 + static void 369 + iotlb_invalidation_with_pasid(struct intel_iommu *iommu, u16 did, u32 pasid) 370 + { 371 + struct qi_desc desc; 372 + 373 + desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) | 374 + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE; 375 + desc.qw1 = 0; 376 + desc.qw2 = 0; 377 + desc.qw3 = 0; 378 + 379 + qi_submit_sync(&desc, iommu); 380 + } 381 + 382 + static void 383 + devtlb_invalidation_with_pasid(struct intel_iommu *iommu, 384 + struct device *dev, int pasid) 385 + { 386 + struct device_domain_info *info; 387 + u16 sid, qdep, pfsid; 388 + 389 + info = dev->archdata.iommu; 390 + if (!info || !info->ats_enabled) 391 + return; 392 + 393 + sid = info->bus << 8 | info->devfn; 394 + qdep = info->ats_qdep; 395 + pfsid = info->pfsid; 396 + 397 + qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); 398 + } 399 + 400 + void intel_pasid_tear_down_entry(struct 
intel_iommu *iommu, 401 + struct device *dev, int pasid) 402 + { 403 + struct pasid_entry *pte; 404 + u16 did; 405 + 406 + pte = intel_pasid_get_entry(dev, pasid); 407 + if (WARN_ON(!pte)) 408 + return; 409 + 410 + intel_pasid_clear_entry(dev, pasid); 411 + did = pasid_get_domain_id(pte); 412 + 413 + if (!ecap_coherent(iommu->ecap)) 414 + clflush_cache_range(pte, sizeof(*pte)); 415 + 416 + pasid_cache_invalidation_with_pasid(iommu, did, pasid); 417 + iotlb_invalidation_with_pasid(iommu, did, pasid); 418 + 419 + /* Device IOTLB doesn't need to be flushed in caching mode. */ 420 + if (!cap_caching_mode(iommu->cap)) 421 + devtlb_invalidation_with_pasid(iommu, dev, pasid); 422 + } 423 + 424 + /* 425 + * Set up the scalable mode pasid table entry for first only 426 + * translation type. 427 + */ 428 + int intel_pasid_setup_first_level(struct intel_iommu *iommu, 429 + struct device *dev, pgd_t *pgd, 430 + int pasid, u16 did, int flags) 431 + { 432 + struct pasid_entry *pte; 433 + 434 + if (!ecap_flts(iommu->ecap)) { 435 + pr_err("No first level translation support on %s\n", 436 + iommu->name); 437 + return -EINVAL; 438 + } 439 + 440 + pte = intel_pasid_get_entry(dev, pasid); 441 + if (WARN_ON(!pte)) 442 + return -EINVAL; 443 + 444 + pasid_clear_entry(pte); 445 + 446 + /* Setup the first level page table pointer: */ 447 + pasid_set_flptr(pte, (u64)__pa(pgd)); 448 + if (flags & PASID_FLAG_SUPERVISOR_MODE) { 449 + if (!ecap_srs(iommu->ecap)) { 450 + pr_err("No supervisor request support on %s\n", 451 + iommu->name); 452 + return -EINVAL; 453 + } 454 + pasid_set_sre(pte); 455 + } 456 + 457 + #ifdef CONFIG_X86 458 + if (cpu_feature_enabled(X86_FEATURE_LA57)) 459 + pasid_set_flpm(pte, 1); 460 + #endif /* CONFIG_X86 */ 461 + 462 + pasid_set_domain_id(pte, did); 463 + pasid_set_address_width(pte, iommu->agaw); 464 + pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); 465 + 466 + /* Setup Present and PASID Granular Transfer Type: */ 467 + pasid_set_translation_type(pte, 1); 468 
+ pasid_set_present(pte); 469 + 470 + if (!ecap_coherent(iommu->ecap)) 471 + clflush_cache_range(pte, sizeof(*pte)); 472 + 473 + if (cap_caching_mode(iommu->cap)) { 474 + pasid_cache_invalidation_with_pasid(iommu, did, pasid); 475 + iotlb_invalidation_with_pasid(iommu, did, pasid); 476 + } else { 477 + iommu_flush_write_buffer(iommu); 478 + } 479 + 480 + return 0; 481 + } 482 + 483 + /* 484 + * Set up the scalable mode pasid entry for second only translation type. 485 + */ 486 + int intel_pasid_setup_second_level(struct intel_iommu *iommu, 487 + struct dmar_domain *domain, 488 + struct device *dev, int pasid) 489 + { 490 + struct pasid_entry *pte; 491 + struct dma_pte *pgd; 492 + u64 pgd_val; 493 + int agaw; 494 + u16 did; 495 + 496 + /* 497 + * If hardware advertises no support for second level 498 + * translation, return directly. 499 + */ 500 + if (!ecap_slts(iommu->ecap)) { 501 + pr_err("No second level translation support on %s\n", 502 + iommu->name); 503 + return -EINVAL; 504 + } 505 + 506 + /* 507 + * Skip top levels of page tables for iommu which has less agaw 508 + * than default. Unnecessary for PT mode. 
509 + */ 510 + pgd = domain->pgd; 511 + for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) { 512 + pgd = phys_to_virt(dma_pte_addr(pgd)); 513 + if (!dma_pte_present(pgd)) { 514 + dev_err(dev, "Invalid domain page table\n"); 515 + return -EINVAL; 516 + } 517 + } 518 + 519 + pgd_val = virt_to_phys(pgd); 520 + did = domain->iommu_did[iommu->seq_id]; 521 + 522 + pte = intel_pasid_get_entry(dev, pasid); 523 + if (!pte) { 524 + dev_err(dev, "Failed to get pasid entry of PASID %d\n", pasid); 525 + return -ENODEV; 526 + } 527 + 528 + pasid_clear_entry(pte); 529 + pasid_set_domain_id(pte, did); 530 + pasid_set_slptr(pte, pgd_val); 531 + pasid_set_address_width(pte, agaw); 532 + pasid_set_translation_type(pte, 2); 533 + pasid_set_fault_enable(pte); 534 + pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); 535 + 536 + /* 537 + * Since it is a second level only translation setup, we should 538 + * set SRE bit as well (addresses are expected to be GPAs). 539 + */ 540 + pasid_set_sre(pte); 541 + pasid_set_present(pte); 542 + 543 + if (!ecap_coherent(iommu->ecap)) 544 + clflush_cache_range(pte, sizeof(*pte)); 545 + 546 + if (cap_caching_mode(iommu->cap)) { 547 + pasid_cache_invalidation_with_pasid(iommu, did, pasid); 548 + iotlb_invalidation_with_pasid(iommu, did, pasid); 549 + } else { 550 + iommu_flush_write_buffer(iommu); 551 + } 552 + 553 + return 0; 554 + } 555 + 556 + /* 557 + * Set up the scalable mode pasid entry for passthrough translation type. 
558 + */ 559 + int intel_pasid_setup_pass_through(struct intel_iommu *iommu, 560 + struct dmar_domain *domain, 561 + struct device *dev, int pasid) 562 + { 563 + u16 did = FLPT_DEFAULT_DID; 564 + struct pasid_entry *pte; 565 + 566 + pte = intel_pasid_get_entry(dev, pasid); 567 + if (!pte) { 568 + dev_err(dev, "Failed to get pasid entry of PASID %d\n", pasid); 569 + return -ENODEV; 570 + } 571 + 572 + pasid_clear_entry(pte); 573 + pasid_set_domain_id(pte, did); 574 + pasid_set_address_width(pte, iommu->agaw); 575 + pasid_set_translation_type(pte, 4); 576 + pasid_set_fault_enable(pte); 577 + pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); 578 + 579 + /* 580 + * We should set SRE bit as well since the addresses are expected 581 + * to be GPAs. 582 + */ 583 + pasid_set_sre(pte); 584 + pasid_set_present(pte); 585 + 586 + if (!ecap_coherent(iommu->ecap)) 587 + clflush_cache_range(pte, sizeof(*pte)); 588 + 589 + if (cap_caching_mode(iommu->cap)) { 590 + pasid_cache_invalidation_with_pasid(iommu, did, pasid); 591 + iotlb_invalidation_with_pasid(iommu, did, pasid); 592 + } else { 593 + iommu_flush_write_buffer(iommu); 594 + } 595 + 596 + return 0; 298 597 }
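The second-level setup above skips top page-table levels when the domain was built with a wider AGAW (adjusted guest address width) than this IOMMU supports. A minimal standalone model of that walk, with real `dma_pte` pages replaced by a hypothetical linked chain (the struct and function names here are illustrative, not kernel code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the "skip top levels" walk in intel_pasid_setup_second_level():
 * descend one page-table level per excess AGAW step before programming
 * the second-level pointer. A simple linked chain stands in for
 * phys_to_virt(dma_pte_addr(pgd)) and dma_pte_present(pgd).
 */
struct level {
	struct level *child;	/* next-lower page-table level */
	int present;		/* stands in for dma_pte_present() */
};

static struct level *skip_top_levels(struct level *pgd,
				     int domain_agaw, int iommu_agaw)
{
	int agaw;

	for (agaw = domain_agaw; agaw > iommu_agaw; agaw--) {
		pgd = pgd->child;
		if (!pgd || !pgd->present)
			return NULL;	/* invalid domain page table */
	}
	return pgd;
}
```

As in the kernel loop, the walk fails if a level it must traverse is absent; the caller then reports an invalid domain page table.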
+37 -3
drivers/iommu/intel-pasid.h
··· 10 10 #ifndef __INTEL_PASID_H 11 11 #define __INTEL_PASID_H 12 12 13 + #define PASID_RID2PASID 0x0 13 14 #define PASID_MIN 0x1 14 - #define PASID_MAX 0x20000 15 + #define PASID_MAX 0x100000 16 + #define PASID_PTE_MASK 0x3F 17 + #define PASID_PTE_PRESENT 1 18 + #define PDE_PFN_MASK PAGE_MASK 19 + #define PASID_PDE_SHIFT 6 20 + #define MAX_NR_PASID_BITS 20 21 + 22 + /* 23 + * Domain ID reserved for pasid entries programmed for first-level 24 + * only and pass-through transfer modes. 25 + */ 26 + #define FLPT_DEFAULT_DID 1 27 + 28 + /* 29 + * The SUPERVISOR_MODE flag indicates a first level translation which 30 + * can be used for access to kernel addresses. It is valid only for 31 + * access to the kernel's static 1:1 mapping of physical memory — not 32 + * to vmalloc or even module mappings. 33 + */ 34 + #define PASID_FLAG_SUPERVISOR_MODE BIT(0) 35 + 36 + struct pasid_dir_entry { 37 + u64 val; 38 + }; 15 39 16 40 struct pasid_entry { 17 - u64 val; 41 + u64 val[8]; 18 42 }; 19 43 20 44 /* The representative of a PASID table */ ··· 58 34 struct pasid_table *intel_pasid_get_table(struct device *dev); 59 35 int intel_pasid_get_dev_max_id(struct device *dev); 60 36 struct pasid_entry *intel_pasid_get_entry(struct device *dev, int pasid); 61 - void intel_pasid_clear_entry(struct device *dev, int pasid); 37 + int intel_pasid_setup_first_level(struct intel_iommu *iommu, 38 + struct device *dev, pgd_t *pgd, 39 + int pasid, u16 did, int flags); 40 + int intel_pasid_setup_second_level(struct intel_iommu *iommu, 41 + struct dmar_domain *domain, 42 + struct device *dev, int pasid); 43 + int intel_pasid_setup_pass_through(struct intel_iommu *iommu, 44 + struct dmar_domain *domain, 45 + struct device *dev, int pasid); 46 + void intel_pasid_tear_down_entry(struct intel_iommu *iommu, 47 + struct device *dev, int pasid); 62 48 63 49 #endif /* __INTEL_PASID_H */
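The new header constants imply a two-level table: with `PASID_MAX` raised to `0x100000` (20-bit PASIDs), a flat, physically contiguous table is no longer practical, so the upper PASID bits select a `pasid_dir_entry` and the low `PASID_PDE_SHIFT` bits select a `pasid_entry` in a leaf page. A sketch of just the index math (the kernel's `intel_pasid_get_entry()` additionally allocates leaf tables on demand, which is omitted here):

```c
#include <assert.h>
#include <stdint.h>

/* Constants as defined in the intel-pasid.h diff above. */
#define PASID_PDE_SHIFT	6
#define PASID_PTE_MASK	0x3F

/* Which pasid_dir_entry covers this PASID. */
static uint32_t pasid_dir_index(uint32_t pasid)
{
	return pasid >> PASID_PDE_SHIFT;
}

/* Which pasid_entry within the leaf table this PASID selects. */
static uint32_t pasid_table_index(uint32_t pasid)
{
	return pasid & PASID_PTE_MASK;
}
```

Each directory entry thus covers 64 PASID entries, and a full 20-bit PASID space needs at most `0x100000 >> 6` directory entries.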
+61 -110
drivers/iommu/intel-svm.c
··· 29 29 30 30 #include "intel-pasid.h" 31 31 32 - #define PASID_ENTRY_P BIT_ULL(0) 33 - #define PASID_ENTRY_FLPM_5LP BIT_ULL(9) 34 - #define PASID_ENTRY_SRE BIT_ULL(11) 35 - 36 32 static irqreturn_t prq_event_thread(int irq, void *d); 37 - 38 - struct pasid_state_entry { 39 - u64 val; 40 - }; 41 33 42 34 int intel_svm_init(struct intel_iommu *iommu) 43 35 { 44 - struct page *pages; 45 - int order; 46 - 47 36 if (cpu_feature_enabled(X86_FEATURE_GBPAGES) && 48 37 !cap_fl1gp_support(iommu->cap)) 49 38 return -EINVAL; ··· 40 51 if (cpu_feature_enabled(X86_FEATURE_LA57) && 41 52 !cap_5lp_support(iommu->cap)) 42 53 return -EINVAL; 43 - 44 - /* Start at 2 because it's defined as 2^(1+PSS) */ 45 - iommu->pasid_max = 2 << ecap_pss(iommu->ecap); 46 - 47 - /* Eventually I'm promised we will get a multi-level PASID table 48 - * and it won't have to be physically contiguous. Until then, 49 - * limit the size because 8MiB contiguous allocations can be hard 50 - * to come by. The limit of 0x20000, which is 1MiB for each of 51 - * the PASID and PASID-state tables, is somewhat arbitrary. */ 52 - if (iommu->pasid_max > 0x20000) 53 - iommu->pasid_max = 0x20000; 54 - 55 - order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max); 56 - if (ecap_dis(iommu->ecap)) { 57 - /* Just making it explicit... 
*/ 58 - BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry)); 59 - pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order); 60 - if (pages) 61 - iommu->pasid_state_table = page_address(pages); 62 - else 63 - pr_warn("IOMMU: %s: Failed to allocate PASID state table\n", 64 - iommu->name); 65 - } 66 - 67 - return 0; 68 - } 69 - 70 - int intel_svm_exit(struct intel_iommu *iommu) 71 - { 72 - int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max); 73 - 74 - if (iommu->pasid_state_table) { 75 - free_pages((unsigned long)iommu->pasid_state_table, order); 76 - iommu->pasid_state_table = NULL; 77 - } 78 54 79 55 return 0; 80 56 } ··· 117 163 * because that's the only option the hardware gives us. Despite 118 164 * the fact that they are actually only accessible through one. */ 119 165 if (gl) 120 - desc.low = QI_EIOTLB_PASID(svm->pasid) | QI_EIOTLB_DID(sdev->did) | 121 - QI_EIOTLB_GRAN(QI_GRAN_ALL_ALL) | QI_EIOTLB_TYPE; 166 + desc.qw0 = QI_EIOTLB_PASID(svm->pasid) | 167 + QI_EIOTLB_DID(sdev->did) | 168 + QI_EIOTLB_GRAN(QI_GRAN_ALL_ALL) | 169 + QI_EIOTLB_TYPE; 122 170 else 123 - desc.low = QI_EIOTLB_PASID(svm->pasid) | QI_EIOTLB_DID(sdev->did) | 124 - QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE; 125 - desc.high = 0; 171 + desc.qw0 = QI_EIOTLB_PASID(svm->pasid) | 172 + QI_EIOTLB_DID(sdev->did) | 173 + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | 174 + QI_EIOTLB_TYPE; 175 + desc.qw1 = 0; 126 176 } else { 127 177 int mask = ilog2(__roundup_pow_of_two(pages)); 128 178 129 - desc.low = QI_EIOTLB_PASID(svm->pasid) | QI_EIOTLB_DID(sdev->did) | 130 - QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) | QI_EIOTLB_TYPE; 131 - desc.high = QI_EIOTLB_ADDR(address) | QI_EIOTLB_GL(gl) | 132 - QI_EIOTLB_IH(ih) | QI_EIOTLB_AM(mask); 179 + desc.qw0 = QI_EIOTLB_PASID(svm->pasid) | 180 + QI_EIOTLB_DID(sdev->did) | 181 + QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) | 182 + QI_EIOTLB_TYPE; 183 + desc.qw1 = QI_EIOTLB_ADDR(address) | 184 + QI_EIOTLB_GL(gl) | 185 + QI_EIOTLB_IH(ih) | 186 + 
QI_EIOTLB_AM(mask); 133 187 } 188 + desc.qw2 = 0; 189 + desc.qw3 = 0; 134 190 qi_submit_sync(&desc, svm->iommu); 135 191 136 192 if (sdev->dev_iotlb) { 137 - desc.low = QI_DEV_EIOTLB_PASID(svm->pasid) | QI_DEV_EIOTLB_SID(sdev->sid) | 138 - QI_DEV_EIOTLB_QDEP(sdev->qdep) | QI_DEIOTLB_TYPE; 193 + desc.qw0 = QI_DEV_EIOTLB_PASID(svm->pasid) | 194 + QI_DEV_EIOTLB_SID(sdev->sid) | 195 + QI_DEV_EIOTLB_QDEP(sdev->qdep) | 196 + QI_DEIOTLB_TYPE; 139 197 if (pages == -1) { 140 - desc.high = QI_DEV_EIOTLB_ADDR(-1ULL >> 1) | QI_DEV_EIOTLB_SIZE; 198 + desc.qw1 = QI_DEV_EIOTLB_ADDR(-1ULL >> 1) | 199 + QI_DEV_EIOTLB_SIZE; 141 200 } else if (pages > 1) { 142 201 /* The least significant zero bit indicates the size. So, 143 202 * for example, an "address" value of 0x12345f000 will ··· 158 191 unsigned long last = address + ((unsigned long)(pages - 1) << VTD_PAGE_SHIFT); 159 192 unsigned long mask = __rounddown_pow_of_two(address ^ last); 160 193 161 - desc.high = QI_DEV_EIOTLB_ADDR((address & ~mask) | (mask - 1)) | QI_DEV_EIOTLB_SIZE; 194 + desc.qw1 = QI_DEV_EIOTLB_ADDR((address & ~mask) | 195 + (mask - 1)) | QI_DEV_EIOTLB_SIZE; 162 196 } else { 163 - desc.high = QI_DEV_EIOTLB_ADDR(address); 197 + desc.qw1 = QI_DEV_EIOTLB_ADDR(address); 164 198 } 199 + desc.qw2 = 0; 200 + desc.qw3 = 0; 165 201 qi_submit_sync(&desc, svm->iommu); 166 202 } 167 203 } ··· 173 203 unsigned long pages, int ih, int gl) 174 204 { 175 205 struct intel_svm_dev *sdev; 176 - 177 - /* Try deferred invalidate if available */ 178 - if (svm->iommu->pasid_state_table && 179 - !cmpxchg64(&svm->iommu->pasid_state_table[svm->pasid].val, 0, 1ULL << 63)) 180 - return; 181 206 182 207 rcu_read_lock(); 183 208 list_for_each_entry_rcu(sdev, &svm->devs, list) ··· 199 234 (end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0, 0); 200 235 } 201 236 202 - 203 - static void intel_flush_pasid_dev(struct intel_svm *svm, struct intel_svm_dev *sdev, int pasid) 204 - { 205 - struct qi_desc desc; 206 - 207 - desc.high = 0; 208 - desc.low 
= QI_PC_TYPE | QI_PC_DID(sdev->did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid); 209 - 210 - qi_submit_sync(&desc, svm->iommu); 211 - } 212 - 213 237 static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) 214 238 { 215 239 struct intel_svm *svm = container_of(mn, struct intel_svm, notifier); ··· 218 264 */ 219 265 rcu_read_lock(); 220 266 list_for_each_entry_rcu(sdev, &svm->devs, list) { 221 - intel_pasid_clear_entry(sdev->dev, svm->pasid); 222 - intel_flush_pasid_dev(svm, sdev, svm->pasid); 267 + intel_pasid_tear_down_entry(svm->iommu, sdev->dev, svm->pasid); 223 268 intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, !svm->mm); 224 269 } 225 270 rcu_read_unlock(); ··· 237 284 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops) 238 285 { 239 286 struct intel_iommu *iommu = intel_svm_device_to_iommu(dev); 240 - struct pasid_entry *entry; 241 287 struct intel_svm_dev *sdev; 242 288 struct intel_svm *svm = NULL; 243 289 struct mm_struct *mm = NULL; 244 - u64 pasid_entry_val; 245 290 int pasid_max; 246 291 int ret; 247 292 ··· 348 397 kfree(sdev); 349 398 goto out; 350 399 } 351 - pasid_entry_val = (u64)__pa(mm->pgd) | PASID_ENTRY_P; 352 - } else 353 - pasid_entry_val = (u64)__pa(init_mm.pgd) | 354 - PASID_ENTRY_P | PASID_ENTRY_SRE; 355 - if (cpu_feature_enabled(X86_FEATURE_LA57)) 356 - pasid_entry_val |= PASID_ENTRY_FLPM_5LP; 400 + } 357 401 358 - entry = intel_pasid_get_entry(dev, svm->pasid); 359 - entry->val = pasid_entry_val; 360 - 361 - wmb(); 362 - 363 - /* 364 - * Flush PASID cache when a PASID table entry becomes 365 - * present. 366 - */ 367 - if (cap_caching_mode(iommu->cap)) 368 - intel_flush_pasid_dev(svm, sdev, svm->pasid); 402 + spin_lock(&iommu->lock); 403 + ret = intel_pasid_setup_first_level(iommu, dev, 404 + mm ? mm->pgd : init_mm.pgd, 405 + svm->pasid, FLPT_DEFAULT_DID, 406 + mm ? 
0 : PASID_FLAG_SUPERVISOR_MODE); 407 + spin_unlock(&iommu->lock); 408 + if (ret) { 409 + if (mm) 410 + mmu_notifier_unregister(&svm->notifier, mm); 411 + intel_pasid_free_id(svm->pasid); 412 + kfree(svm); 413 + kfree(sdev); 414 + goto out; 415 + } 369 416 370 417 list_add_tail(&svm->list, &global_svm_list); 371 418 } ··· 409 460 * to use. We have a *shared* PASID table, because it's 410 461 * large and has to be physically contiguous. So it's 411 462 * hard to be as defensive as we might like. */ 412 - intel_flush_pasid_dev(svm, sdev, svm->pasid); 463 + intel_pasid_tear_down_entry(iommu, dev, svm->pasid); 413 464 intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, !svm->mm); 414 465 kfree_rcu(sdev, rcu); 415 - intel_pasid_clear_entry(dev, svm->pasid); 416 466 417 467 if (list_empty(&svm->devs)) { 418 468 intel_pasid_free_id(svm->pasid); ··· 619 671 no_pasid: 620 672 if (req->lpig) { 621 673 /* Page Group Response */ 622 - resp.low = QI_PGRP_PASID(req->pasid) | 674 + resp.qw0 = QI_PGRP_PASID(req->pasid) | 623 675 QI_PGRP_DID((req->bus << 8) | req->devfn) | 624 676 QI_PGRP_PASID_P(req->pasid_present) | 625 677 QI_PGRP_RESP_TYPE; 626 - resp.high = QI_PGRP_IDX(req->prg_index) | 627 - QI_PGRP_PRIV(req->private) | QI_PGRP_RESP_CODE(result); 628 - 629 - qi_submit_sync(&resp, iommu); 678 + resp.qw1 = QI_PGRP_IDX(req->prg_index) | 679 + QI_PGRP_PRIV(req->private) | 680 + QI_PGRP_RESP_CODE(result); 630 681 } else if (req->srr) { 631 682 /* Page Stream Response */ 632 - resp.low = QI_PSTRM_IDX(req->prg_index) | 633 - QI_PSTRM_PRIV(req->private) | QI_PSTRM_BUS(req->bus) | 634 - QI_PSTRM_PASID(req->pasid) | QI_PSTRM_RESP_TYPE; 635 - resp.high = QI_PSTRM_ADDR(address) | QI_PSTRM_DEVFN(req->devfn) | 683 + resp.qw0 = QI_PSTRM_IDX(req->prg_index) | 684 + QI_PSTRM_PRIV(req->private) | 685 + QI_PSTRM_BUS(req->bus) | 686 + QI_PSTRM_PASID(req->pasid) | 687 + QI_PSTRM_RESP_TYPE; 688 + resp.qw1 = QI_PSTRM_ADDR(address) | 689 + QI_PSTRM_DEVFN(req->devfn) | 636 690 QI_PSTRM_RESP_CODE(result); 
637 - 638 - qi_submit_sync(&resp, iommu); 639 691 } 692 + resp.qw2 = 0; 693 + resp.qw3 = 0; 694 + qi_submit_sync(&resp, iommu); 640 695 641 696 head = (head + sizeof(*req)) & PRQ_RING_MASK; 642 697 }
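A recurring mechanical change in the intel-svm.c hunks above is the widening of the invalidation-queue descriptor from two qwords (`low`/`high`, 128 bits) to four (`qw0`..`qw3`, 256 bits) for scalable mode, with every submission site now zeroing the unused upper qwords. A shape-only sketch (field encodings such as `QI_EIOTLB_*` follow the VT-d specification and are not reproduced; `qi_desc_legacy` and `qi_desc_init` are illustrative names, not kernel APIs):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old 128-bit descriptor vs. the new 256-bit scalable-mode descriptor. */
struct qi_desc_legacy { uint64_t low, high; };
struct qi_desc        { uint64_t qw0, qw1, qw2, qw3; };

/*
 * Helper matching the "desc.qw2 = 0; desc.qw3 = 0;" pattern the diff
 * adds at each submission site: unused qwords must be zero.
 */
static void qi_desc_init(struct qi_desc *desc, uint64_t qw0, uint64_t qw1)
{
	memset(desc, 0, sizeof(*desc));
	desc->qw0 = qw0;
	desc->qw1 = qw1;
}
```

Zeroing `qw2`/`qw3` explicitly matters because the descriptors are built on the stack before being handed to `qi_submit_sync()`.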
+4 -2
drivers/iommu/intel_irq_remapping.c
··· 145 145 { 146 146 struct qi_desc desc; 147 147 148 - desc.low = QI_IEC_IIDEX(index) | QI_IEC_TYPE | QI_IEC_IM(mask) 148 + desc.qw0 = QI_IEC_IIDEX(index) | QI_IEC_TYPE | QI_IEC_IM(mask) 149 149 | QI_IEC_SELECTIVE; 150 - desc.high = 0; 150 + desc.qw1 = 0; 151 + desc.qw2 = 0; 152 + desc.qw3 = 0; 151 153 152 154 return qi_submit_sync(&desc, iommu); 153 155 }
-4
drivers/iommu/io-pgtable-arm-v7s.c
··· 709 709 { 710 710 struct arm_v7s_io_pgtable *data; 711 711 712 - #ifdef PHYS_OFFSET 713 - if (upper_32_bits(PHYS_OFFSET)) 714 - return NULL; 715 - #endif 716 712 if (cfg->ias > ARM_V7S_ADDR_BITS || cfg->oas > ARM_V7S_ADDR_BITS) 717 713 return NULL; 718 714
+7 -7
drivers/iommu/iommu-sysfs.c
··· 11 11 12 12 #include <linux/device.h> 13 13 #include <linux/iommu.h> 14 - #include <linux/module.h> 14 + #include <linux/init.h> 15 15 #include <linux/slab.h> 16 16 17 17 /* ··· 22 22 NULL, 23 23 }; 24 24 25 - static const struct attribute_group iommu_devices_attr_group = { 25 + static const struct attribute_group devices_attr_group = { 26 26 .name = "devices", 27 27 .attrs = devices_attr, 28 28 }; 29 29 30 - static const struct attribute_group *iommu_dev_groups[] = { 31 - &iommu_devices_attr_group, 30 + static const struct attribute_group *dev_groups[] = { 31 + &devices_attr_group, 32 32 NULL, 33 33 }; 34 34 35 - static void iommu_release_device(struct device *dev) 35 + static void release_device(struct device *dev) 36 36 { 37 37 kfree(dev); 38 38 } 39 39 40 40 static struct class iommu_class = { 41 41 .name = "iommu", 42 - .dev_release = iommu_release_device, 43 - .dev_groups = iommu_dev_groups, 42 + .dev_release = release_device, 43 + .dev_groups = dev_groups, 44 44 }; 45 45 46 46 static int __init iommu_dev_init(void)
+59 -56
drivers/iommu/iommu.c
··· 22 22 #include <linux/kernel.h> 23 23 #include <linux/bug.h> 24 24 #include <linux/types.h> 25 - #include <linux/module.h> 25 + #include <linux/init.h> 26 + #include <linux/export.h> 26 27 #include <linux/slab.h> 27 28 #include <linux/errno.h> 28 29 #include <linux/iommu.h> ··· 109 108 spin_lock(&iommu_device_lock); 110 109 list_del(&iommu->list); 111 110 spin_unlock(&iommu_device_lock); 111 + } 112 + 113 + int iommu_probe_device(struct device *dev) 114 + { 115 + const struct iommu_ops *ops = dev->bus->iommu_ops; 116 + int ret = -EINVAL; 117 + 118 + WARN_ON(dev->iommu_group); 119 + 120 + if (ops) 121 + ret = ops->add_device(dev); 122 + 123 + return ret; 124 + } 125 + 126 + void iommu_release_device(struct device *dev) 127 + { 128 + const struct iommu_ops *ops = dev->bus->iommu_ops; 129 + 130 + if (dev->iommu_group) 131 + ops->remove_device(dev); 112 132 } 113 133 114 134 static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus, ··· 1139 1117 1140 1118 static int add_iommu_group(struct device *dev, void *data) 1141 1119 { 1142 - struct iommu_callback_data *cb = data; 1143 - const struct iommu_ops *ops = cb->ops; 1144 - int ret; 1145 - 1146 - if (!ops->add_device) 1147 - return 0; 1148 - 1149 - WARN_ON(dev->iommu_group); 1150 - 1151 - ret = ops->add_device(dev); 1120 + int ret = iommu_probe_device(dev); 1152 1121 1153 1122 /* 1154 1123 * We ignore -ENODEV errors for now, as they just mean that the ··· 1154 1141 1155 1142 static int remove_iommu_group(struct device *dev, void *data) 1156 1143 { 1157 - struct iommu_callback_data *cb = data; 1158 - const struct iommu_ops *ops = cb->ops; 1159 - 1160 - if (ops->remove_device && dev->iommu_group) 1161 - ops->remove_device(dev); 1144 + iommu_release_device(dev); 1162 1145 1163 1146 return 0; 1164 1147 } ··· 1162 1153 static int iommu_bus_notifier(struct notifier_block *nb, 1163 1154 unsigned long action, void *data) 1164 1155 { 1165 - struct device *dev = data; 1166 - const struct iommu_ops *ops = 
dev->bus->iommu_ops; 1167 - struct iommu_group *group; 1168 1156 unsigned long group_action = 0; 1157 + struct device *dev = data; 1158 + struct iommu_group *group; 1169 1159 1170 1160 /* 1171 1161 * ADD/DEL call into iommu driver ops if provided, which may 1172 1162 * result in ADD/DEL notifiers to group->notifier 1173 1163 */ 1174 1164 if (action == BUS_NOTIFY_ADD_DEVICE) { 1175 - if (ops->add_device) { 1176 - int ret; 1165 + int ret; 1177 1166 1178 - ret = ops->add_device(dev); 1179 - return (ret) ? NOTIFY_DONE : NOTIFY_OK; 1180 - } 1167 + ret = iommu_probe_device(dev); 1168 + return (ret) ? NOTIFY_DONE : NOTIFY_OK; 1181 1169 } else if (action == BUS_NOTIFY_REMOVED_DEVICE) { 1182 - if (ops->remove_device && dev->iommu_group) { 1183 - ops->remove_device(dev); 1184 - return 0; 1185 - } 1170 + iommu_release_device(dev); 1171 + return NOTIFY_OK; 1186 1172 } 1187 1173 1188 1174 /* ··· 1716 1712 size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova, 1717 1713 struct scatterlist *sg, unsigned int nents, int prot) 1718 1714 { 1719 - struct scatterlist *s; 1720 - size_t mapped = 0; 1721 - unsigned int i, min_pagesz; 1715 + size_t len = 0, mapped = 0; 1716 + phys_addr_t start; 1717 + unsigned int i = 0; 1722 1718 int ret; 1723 1719 1724 - if (unlikely(domain->pgsize_bitmap == 0UL)) 1725 - return 0; 1720 + while (i <= nents) { 1721 + phys_addr_t s_phys = sg_phys(sg); 1726 1722 1727 - min_pagesz = 1 << __ffs(domain->pgsize_bitmap); 1723 + if (len && s_phys != start + len) { 1724 + ret = iommu_map(domain, iova + mapped, start, len, prot); 1725 + if (ret) 1726 + goto out_err; 1728 1727 1729 - for_each_sg(sg, s, nents, i) { 1730 - phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset; 1728 + mapped += len; 1729 + len = 0; 1730 + } 1731 1731 1732 - /* 1733 - * We are mapping on IOMMU page boundaries, so offset within 1734 - * the page must be 0. 
However, the IOMMU may support pages 1735 - * smaller than PAGE_SIZE, so s->offset may still represent 1736 - * an offset of that boundary within the CPU page. 1737 - */ 1738 - if (!IS_ALIGNED(s->offset, min_pagesz)) 1739 - goto out_err; 1732 + if (len) { 1733 + len += sg->length; 1734 + } else { 1735 + len = sg->length; 1736 + start = s_phys; 1737 + } 1740 1738 1741 - ret = iommu_map(domain, iova + mapped, phys, s->length, prot); 1742 - if (ret) 1743 - goto out_err; 1744 - 1745 - mapped += s->length; 1739 + if (++i < nents) 1740 + sg = sg_next(sg); 1746 1741 } 1747 1742 1748 1743 return mapped; ··· 1979 1976 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, 1980 1977 const struct iommu_ops *ops) 1981 1978 { 1982 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 1979 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 1983 1980 1984 1981 if (fwspec) 1985 1982 return ops == fwspec->ops ? 0 : -EINVAL; ··· 1991 1988 of_node_get(to_of_node(iommu_fwnode)); 1992 1989 fwspec->iommu_fwnode = iommu_fwnode; 1993 1990 fwspec->ops = ops; 1994 - dev->iommu_fwspec = fwspec; 1991 + dev_iommu_fwspec_set(dev, fwspec); 1995 1992 return 0; 1996 1993 } 1997 1994 EXPORT_SYMBOL_GPL(iommu_fwspec_init); 1998 1995 1999 1996 void iommu_fwspec_free(struct device *dev) 2000 1997 { 2001 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 1998 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 2002 1999 2003 2000 if (fwspec) { 2004 2001 fwnode_handle_put(fwspec->iommu_fwnode); 2005 2002 kfree(fwspec); 2006 - dev->iommu_fwspec = NULL; 2003 + dev_iommu_fwspec_set(dev, NULL); 2007 2004 } 2008 2005 } 2009 2006 EXPORT_SYMBOL_GPL(iommu_fwspec_free); 2010 2007 2011 2008 int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids) 2012 2009 { 2013 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 2010 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 2014 2011 size_t size; 2015 2012 int i; 2016 2013 ··· 2019 2016 2020 2017 size = 
offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]); 2021 2018 if (size > sizeof(*fwspec)) { 2022 - fwspec = krealloc(dev->iommu_fwspec, size, GFP_KERNEL); 2019 + fwspec = krealloc(fwspec, size, GFP_KERNEL); 2023 2020 if (!fwspec) 2024 2021 return -ENOMEM; 2025 2022 2026 - dev->iommu_fwspec = fwspec; 2023 + dev_iommu_fwspec_set(dev, fwspec); 2027 2024 } 2028 2025 2029 2026 for (i = 0; i < num_ids; i++)
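The rewritten `iommu_map_sg()` above no longer maps each scatterlist segment individually: physically contiguous segments are merged and handed to `iommu_map()` as a single range. A standalone model of that coalescing loop, with segments reduced to plain `(phys, len)` structs and "mapping" reduced to counting how many `iommu_map()` calls the real code would issue:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a scatterlist segment. */
struct seg { uint64_t phys; size_t len; };

static size_t map_sg_coalesced(const struct seg *sg, unsigned int nents,
			       unsigned int *map_calls)
{
	size_t len = 0, mapped = 0;
	uint64_t start = 0;
	unsigned int i;

	for (i = 0; i < nents; i++) {
		/* Flush the pending range when contiguity breaks. */
		if (len && sg[i].phys != start + len) {
			(*map_calls)++;		/* one iommu_map(start, len) */
			mapped += len;
			len = 0;
		}
		if (len) {
			len += sg[i].len;	/* extend the merged range */
		} else {
			len = sg[i].len;	/* start a new range */
			start = sg[i].phys;
		}
	}
	if (len) {				/* final pending range */
		(*map_calls)++;
		mapped += len;
	}
	return mapped;
}
```

For a scatterlist whose pages happen to be physically adjacent, this collapses N `iommu_map()` calls into one, which also lets the driver use larger IOMMU page sizes for the merged range.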
+61 -27
drivers/iommu/ipmmu-vmsa.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 /* 3 - * IPMMU VMSA 3 + * IOMMU API for Renesas VMSA-compatible IPMMU 4 + * Author: Laurent Pinchart <laurent.pinchart@ideasonboard.com> 4 5 * 5 6 * Copyright (C) 2014 Renesas Electronics Corporation 6 7 */ ··· 12 11 #include <linux/dma-mapping.h> 13 12 #include <linux/err.h> 14 13 #include <linux/export.h> 14 + #include <linux/init.h> 15 15 #include <linux/interrupt.h> 16 16 #include <linux/io.h> 17 17 #include <linux/iommu.h> 18 - #include <linux/module.h> 19 18 #include <linux/of.h> 20 19 #include <linux/of_device.h> 21 20 #include <linux/of_iommu.h> ··· 82 81 83 82 static struct ipmmu_vmsa_device *to_ipmmu(struct device *dev) 84 83 { 85 - return dev->iommu_fwspec ? dev->iommu_fwspec->iommu_priv : NULL; 84 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 85 + 86 + return fwspec ? fwspec->iommu_priv : NULL; 86 87 } 87 88 88 89 #define TLB_LOOP_TIMEOUT 100 /* 100us */ ··· 646 643 static int ipmmu_attach_device(struct iommu_domain *io_domain, 647 644 struct device *dev) 648 645 { 649 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 646 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 650 647 struct ipmmu_vmsa_device *mmu = to_ipmmu(dev); 651 648 struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain); 652 649 unsigned int i; ··· 695 692 static void ipmmu_detach_device(struct iommu_domain *io_domain, 696 693 struct device *dev) 697 694 { 698 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 695 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 699 696 struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain); 700 697 unsigned int i; 701 698 ··· 747 744 static int ipmmu_init_platform_device(struct device *dev, 748 745 struct of_phandle_args *args) 749 746 { 747 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 750 748 struct platform_device *ipmmu_pdev; 751 749 752 750 ipmmu_pdev = of_find_device_by_node(args->np); 753 751 if (!ipmmu_pdev) 754 752 return -ENODEV; 755 753 
756 - dev->iommu_fwspec->iommu_priv = platform_get_drvdata(ipmmu_pdev); 754 + fwspec->iommu_priv = platform_get_drvdata(ipmmu_pdev); 755 + 757 756 return 0; 758 757 } 759 758 760 - static bool ipmmu_slave_whitelist(struct device *dev) 761 - { 762 - /* By default, do not allow use of IPMMU */ 763 - return false; 764 - } 765 - 766 759 static const struct soc_device_attribute soc_rcar_gen3[] = { 760 + { .soc_id = "r8a774a1", }, 761 + { .soc_id = "r8a774c0", }, 767 762 { .soc_id = "r8a7795", }, 768 763 { .soc_id = "r8a7796", }, 769 764 { .soc_id = "r8a77965", }, 770 765 { .soc_id = "r8a77970", }, 766 + { .soc_id = "r8a77990", }, 771 767 { .soc_id = "r8a77995", }, 772 768 { /* sentinel */ } 773 769 }; 774 770 771 + static const struct soc_device_attribute soc_rcar_gen3_whitelist[] = { 772 + { .soc_id = "r8a774c0", }, 773 + { .soc_id = "r8a7795", .revision = "ES3.*" }, 774 + { .soc_id = "r8a77965", }, 775 + { .soc_id = "r8a77990", }, 776 + { .soc_id = "r8a77995", }, 777 + { /* sentinel */ } 778 + }; 779 + 780 + static const char * const rcar_gen3_slave_whitelist[] = { 781 + }; 782 + 783 + static bool ipmmu_slave_whitelist(struct device *dev) 784 + { 785 + unsigned int i; 786 + 787 + /* 788 + * For R-Car Gen3 use a white list to opt-in slave devices. 789 + * For Other SoCs, this returns true anyway. 
790 + */ 791 + if (!soc_device_match(soc_rcar_gen3)) 792 + return true; 793 + 794 + /* Check whether this R-Car Gen3 can use the IPMMU correctly or not */ 795 + if (!soc_device_match(soc_rcar_gen3_whitelist)) 796 + return false; 797 + 798 + /* Check whether this slave device can work with the IPMMU */ 799 + for (i = 0; i < ARRAY_SIZE(rcar_gen3_slave_whitelist); i++) { 800 + if (!strcmp(dev_name(dev), rcar_gen3_slave_whitelist[i])) 801 + return true; 802 + } 803 + 804 + /* Otherwise, do not allow use of IPMMU */ 805 + return false; 806 + } 807 + 775 808 static int ipmmu_of_xlate(struct device *dev, 776 809 struct of_phandle_args *spec) 777 810 { 778 - /* For R-Car Gen3 use a white list to opt-in slave devices */ 779 - if (soc_device_match(soc_rcar_gen3) && !ipmmu_slave_whitelist(dev)) 811 + if (!ipmmu_slave_whitelist(dev)) 780 812 return -ENODEV; 781 813 782 814 iommu_fwspec_add_ids(dev, spec->args, 1); ··· 979 941 .compatible = "renesas,ipmmu-vmsa", 980 942 .data = &ipmmu_features_default, 981 943 }, { 944 + .compatible = "renesas,ipmmu-r8a774a1", 945 + .data = &ipmmu_features_rcar_gen3, 946 + }, { 947 + .compatible = "renesas,ipmmu-r8a774c0", 948 + .data = &ipmmu_features_rcar_gen3, 949 + }, { 982 950 .compatible = "renesas,ipmmu-r8a7795", 983 951 .data = &ipmmu_features_rcar_gen3, 984 952 }, { ··· 997 953 .compatible = "renesas,ipmmu-r8a77970", 998 954 .data = &ipmmu_features_rcar_gen3, 999 955 }, { 956 + .compatible = "renesas,ipmmu-r8a77990", 957 + .data = &ipmmu_features_rcar_gen3, 958 + }, { 1000 959 .compatible = "renesas,ipmmu-r8a77995", 1001 960 .data = &ipmmu_features_rcar_gen3, 1002 961 }, { 1003 962 /* Terminator */ 1004 963 }, 1005 964 }; 1006 - 1007 - MODULE_DEVICE_TABLE(of, ipmmu_of_ids); 1008 965 1009 966 static int ipmmu_probe(struct platform_device *pdev) 1010 967 { ··· 1177 1132 setup_done = true; 1178 1133 return 0; 1179 1134 } 1180 - 1181 - static void __exit ipmmu_exit(void) 1182 - { 1183 - return platform_driver_unregister(&ipmmu_driver); 
1184 - } 1185 - 1186 1135 subsys_initcall(ipmmu_init); 1187 - module_exit(ipmmu_exit); 1188 - 1189 - MODULE_DESCRIPTION("IOMMU API for Renesas VMSA-compatible IPMMU"); 1190 - MODULE_AUTHOR("Laurent Pinchart <laurent.pinchart@ideasonboard.com>"); 1191 - MODULE_LICENSE("GPL v2");
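The expanded `ipmmu_slave_whitelist()` above makes a three-stage decision: non-Gen3 SoCs always opt in, an R-Car Gen3 SoC must be on the SoC whitelist, and the slave device must additionally appear on the (currently empty upstream) device-name whitelist. A model with `soc_device_match()` reduced to two booleans; the whitelist entry and function names here are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative entry only: the upstream rcar_gen3_slave_whitelist is empty. */
static const char *const slave_whitelist[] = {
	"hypothetical-dev",
};

static bool slave_allowed(bool is_gen3, bool gen3_whitelisted,
			  const char *dev_name)
{
	size_t i;

	if (!is_gen3)
		return true;	/* other SoCs always use the IPMMU */
	if (!gen3_whitelisted)
		return false;	/* this Gen3 revision can't use it correctly */
	for (i = 0; i < sizeof(slave_whitelist) / sizeof(slave_whitelist[0]); i++)
		if (!strcmp(dev_name, slave_whitelist[i]))
			return true;
	return false;		/* slave not opted in */
}
```

Because the upstream device whitelist starts out empty, the net effect of this patch on its own is that no Gen3 slave uses the IPMMU yet; later patches can opt devices in one name at a time.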
-1
drivers/iommu/irq_remapping.c
··· 1 - #include <linux/seq_file.h> 2 1 #include <linux/cpumask.h> 3 2 #include <linux/kernel.h> 4 3 #include <linux/string.h>
+3 -10
drivers/iommu/msm_iommu.c
··· 1 1 /* Copyright (c) 2010-2011, Code Aurora Forum. All rights reserved. 2 2 * 3 + * Author: Stepan Moskovchenko <stepanm@codeaurora.org> 4 + * 3 5 * This program is free software; you can redistribute it and/or modify 4 6 * it under the terms of the GNU General Public License version 2 and 5 7 * only version 2 as published by the Free Software Foundation. ··· 19 17 20 18 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 21 19 #include <linux/kernel.h> 22 - #include <linux/module.h> 20 + #include <linux/init.h> 23 21 #include <linux/platform_device.h> 24 22 #include <linux/errno.h> 25 23 #include <linux/io.h> ··· 863 861 864 862 return ret; 865 863 } 866 - 867 - static void __exit msm_iommu_driver_exit(void) 868 - { 869 - platform_driver_unregister(&msm_iommu_driver); 870 - } 871 - 872 864 subsys_initcall(msm_iommu_driver_init); 873 - module_exit(msm_iommu_driver_exit); 874 865 875 - MODULE_LICENSE("GPL v2"); 876 - MODULE_AUTHOR("Stepan Moskovchenko <stepanm@codeaurora.org>");
+14 -11
drivers/iommu/mtk_iommu.c
··· 113 113 struct iommu_domain domain; 114 114 }; 115 115 116 - static struct iommu_ops mtk_iommu_ops; 116 + static const struct iommu_ops mtk_iommu_ops; 117 117 118 118 static LIST_HEAD(m4ulist); /* List all the M4U HWs */ 119 119 ··· 244 244 { 245 245 struct mtk_smi_larb_iommu *larb_mmu; 246 246 unsigned int larbid, portid; 247 - struct iommu_fwspec *fwspec = dev->iommu_fwspec; 247 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 248 248 int i; 249 249 250 250 for (i = 0; i < fwspec->num_ids; ++i) { ··· 336 336 struct device *dev) 337 337 { 338 338 struct mtk_iommu_domain *dom = to_mtk_domain(domain); 339 - struct mtk_iommu_data *data = dev->iommu_fwspec->iommu_priv; 339 + struct mtk_iommu_data *data = dev_iommu_fwspec_get(dev)->iommu_priv; 340 340 341 341 if (!data) 342 342 return -ENODEV; ··· 355 355 static void mtk_iommu_detach_device(struct iommu_domain *domain, 356 356 struct device *dev) 357 357 { 358 - struct mtk_iommu_data *data = dev->iommu_fwspec->iommu_priv; 358 + struct mtk_iommu_data *data = dev_iommu_fwspec_get(dev)->iommu_priv; 359 359 360 360 if (!data) 361 361 return; ··· 417 417 418 418 static int mtk_iommu_add_device(struct device *dev) 419 419 { 420 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 420 421 struct mtk_iommu_data *data; 421 422 struct iommu_group *group; 422 423 423 - if (!dev->iommu_fwspec || dev->iommu_fwspec->ops != &mtk_iommu_ops) 424 + if (!fwspec || fwspec->ops != &mtk_iommu_ops) 424 425 return -ENODEV; /* Not a iommu client device */ 425 426 426 - data = dev->iommu_fwspec->iommu_priv; 427 + data = fwspec->iommu_priv; 427 428 iommu_device_link(&data->iommu, dev); 428 429 429 430 group = iommu_group_get_for_dev(dev); ··· 437 436 438 437 static void mtk_iommu_remove_device(struct device *dev) 439 438 { 439 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 440 440 struct mtk_iommu_data *data; 441 441 442 - if (!dev->iommu_fwspec || dev->iommu_fwspec->ops != &mtk_iommu_ops) 442 + if (!fwspec || 
fwspec->ops != &mtk_iommu_ops) 443 443 return; 444 444 445 - data = dev->iommu_fwspec->iommu_priv; 445 + data = fwspec->iommu_priv; 446 446 iommu_device_unlink(&data->iommu, dev); 447 447 448 448 iommu_group_remove_device(dev); ··· 470 468 471 469 static int mtk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) 472 470 { 471 + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); 473 472 struct platform_device *m4updev; 474 473 475 474 if (args->args_count != 1) { ··· 479 476 return -EINVAL; 480 477 } 481 478 482 - if (!dev->iommu_fwspec->iommu_priv) { 479 + if (!fwspec->iommu_priv) { 483 480 /* Get the m4u device */ 484 481 m4updev = of_find_device_by_node(args->np); 485 482 if (WARN_ON(!m4updev)) 486 483 return -EINVAL; 487 484 488 - dev->iommu_fwspec->iommu_priv = platform_get_drvdata(m4updev); 485 + fwspec->iommu_priv = platform_get_drvdata(m4updev); 489 486 } 490 487 491 488 return iommu_fwspec_add_ids(dev, args->args, 1); 492 489 } 493 490 494 - static struct iommu_ops mtk_iommu_ops = { 491 + static const struct iommu_ops mtk_iommu_ops = { 495 492 .domain_alloc = mtk_iommu_domain_alloc, 496 493 .domain_free = mtk_iommu_domain_free, 497 494 .attach_dev = mtk_iommu_attach_device,
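The conversions in this and the surrounding drivers all follow the same accessor pattern: instead of dereferencing `dev->iommu_fwspec` directly, code goes through `dev_iommu_fwspec_get()`/`dev_iommu_fwspec_set()`, so the storage behind it can later be folded into the single unified iommu pointer in `struct device` that the merge log mentions. A sketch with toy struct definitions standing in for the real `<linux/device.h>` ones:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real definitions live in <linux/device.h> and <linux/iommu.h>. */
struct iommu_fwspec { const void *ops; void *iommu_priv; };
struct device { struct iommu_fwspec *iommu_fwspec; };

static struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
	return dev->iommu_fwspec;
}

static void dev_iommu_fwspec_set(struct device *dev,
				 struct iommu_fwspec *fwspec)
{
	dev->iommu_fwspec = fwspec;
}
```

Once every driver uses the accessors, changing where the fwspec actually lives becomes a two-line change in the helpers rather than a tree-wide sweep, which is why this series does the sweep first.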
+21 -26
drivers/iommu/mtk_iommu_v1.c
···
 /*
+ * IOMMU API for MTK architected m4u v1 implementations
+ *
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Honghui Zhang <honghui.zhang@mediatek.com>
  *
···
 #include <linux/spinlock.h>
 #include <asm/barrier.h>
 #include <asm/dma-iommu.h>
-#include <linux/module.h>
+#include <linux/init.h>
 #include <dt-bindings/memory/mt2701-larb-port.h>
 #include <soc/mediatek/smi.h>
 #include "mtk_iommu.h"
···
 {
 	struct mtk_smi_larb_iommu *larb_mmu;
 	unsigned int larbid, portid;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	int i;
 
 	for (i = 0; i < fwspec->num_ids; ++i) {
···
 				   struct device *dev)
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-	struct mtk_iommu_data *data = dev->iommu_fwspec->iommu_priv;
+	struct mtk_iommu_data *data = dev_iommu_fwspec_get(dev)->iommu_priv;
 	int ret;
 
 	if (!data)
···
 static void mtk_iommu_detach_device(struct iommu_domain *domain,
 				    struct device *dev)
 {
-	struct mtk_iommu_data *data = dev->iommu_fwspec->iommu_priv;
+	struct mtk_iommu_data *data = dev_iommu_fwspec_get(dev)->iommu_priv;
 
 	if (!data)
 		return;
···
 	return pa;
 }
 
-static struct iommu_ops mtk_iommu_ops;
+static const struct iommu_ops mtk_iommu_ops;
 
 /*
  * MTK generation one iommu HW only support one iommu domain, and all the client
···
 static int mtk_iommu_create_mapping(struct device *dev,
 				    struct of_phandle_args *args)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct mtk_iommu_data *data;
 	struct platform_device *m4updev;
 	struct dma_iommu_mapping *mtk_mapping;
···
 		return -EINVAL;
 	}
 
-	if (!dev->iommu_fwspec) {
+	if (!fwspec) {
 		ret = iommu_fwspec_init(dev, &args->np->fwnode, &mtk_iommu_ops);
 		if (ret)
 			return ret;
-	} else if (dev->iommu_fwspec->ops != &mtk_iommu_ops) {
+		fwspec = dev_iommu_fwspec_get(dev);
+	} else if (dev_iommu_fwspec_get(dev)->ops != &mtk_iommu_ops) {
 		return -EINVAL;
 	}
 
-	if (!dev->iommu_fwspec->iommu_priv) {
+	if (!fwspec->iommu_priv) {
 		/* Get the m4u device */
 		m4updev = of_find_device_by_node(args->np);
 		if (WARN_ON(!m4updev))
 			return -EINVAL;
 
-		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(m4updev);
+		fwspec->iommu_priv = platform_get_drvdata(m4updev);
 	}
 
 	ret = iommu_fwspec_add_ids(dev, args->args, 1);
 	if (ret)
 		return ret;
 
-	data = dev->iommu_fwspec->iommu_priv;
+	data = fwspec->iommu_priv;
 	m4udev = data->dev;
 	mtk_mapping = m4udev->archdata.iommu;
 	if (!mtk_mapping) {
···
 
 static int mtk_iommu_add_device(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct dma_iommu_mapping *mtk_mapping;
 	struct of_phandle_args iommu_spec;
 	struct of_phandle_iterator it;
···
 		of_node_put(iommu_spec.np);
 	}
 
-	if (!dev->iommu_fwspec || dev->iommu_fwspec->ops != &mtk_iommu_ops)
+	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
 		return -ENODEV; /* Not a iommu client device */
 
 	/*
···
 	if (err)
 		return err;
 
-	data = dev->iommu_fwspec->iommu_priv;
+	data = fwspec->iommu_priv;
 	mtk_mapping = data->dev->archdata.iommu;
 	err = arm_iommu_attach_device(dev, mtk_mapping);
 	if (err) {
···
 
 static void mtk_iommu_remove_device(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct mtk_iommu_data *data;
 
-	if (!dev->iommu_fwspec || dev->iommu_fwspec->ops != &mtk_iommu_ops)
+	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
 		return;
 
-	data = dev->iommu_fwspec->iommu_priv;
+	data = fwspec->iommu_priv;
 	iommu_device_unlink(&data->iommu, dev);
 
 	iommu_group_remove_device(dev);
···
 	return 0;
 }
 
-static struct iommu_ops mtk_iommu_ops = {
+static const struct iommu_ops mtk_iommu_ops = {
 	.domain_alloc	= mtk_iommu_domain_alloc,
 	.domain_free	= mtk_iommu_domain_free,
 	.attach_dev	= mtk_iommu_attach_device,
···
 {
 	return platform_driver_register(&mtk_iommu_driver);
 }
-
-static void __exit m4u_exit(void)
-{
-	return platform_driver_unregister(&mtk_iommu_driver);
-}
-
 subsys_initcall(m4u_init);
-module_exit(m4u_exit);
-
-MODULE_DESCRIPTION("IOMMU API for MTK architected m4u v1 implementations");
-MODULE_AUTHOR("Honghui Zhang <honghui.zhang@mediatek.com>");
-MODULE_LICENSE("GPL v2");
+10 -6
drivers/iommu/of_iommu.c
···
 				   struct device_node *master_np)
 {
 	const struct iommu_ops *ops = NULL;
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	int err = NO_IOMMU;
 
 	if (!master_np)
···
 		}
 	}
 
+
 	/*
 	 * Two success conditions can be represented by non-negative err here:
 	 * >0 : there is no IOMMU, or one was unavailable for non-fatal reasons
 	 *  0 : we found an IOMMU, and dev->fwspec is initialised appropriately
 	 * <0 : any actual error
 	 */
-	if (!err)
-		ops = dev->iommu_fwspec->ops;
+	if (!err) {
+		/* The fwspec pointer changed, read it again */
+		fwspec = dev_iommu_fwspec_get(dev);
+		ops = fwspec->ops;
+	}
 	/*
 	 * If we have reason to believe the IOMMU driver missed the initial
-	 * add_device callback for dev, replay it to get things in order.
+	 * probe for dev, replay it to get things in order.
 	 */
-	if (ops && ops->add_device && dev->bus && !dev->iommu_group)
-		err = ops->add_device(dev);
+	if (dev->bus && !device_iommu_mapped(dev))
+		err = iommu_probe_device(dev);
 
 	/* Ignore all other errors apart from EPROBE_DEFER */
 	if (err == -EPROBE_DEFER) {
+6 -19
drivers/iommu/omap-iommu-debug.c
···
 	return 0;
 }
 
-static int debug_read_tlb(struct seq_file *s, void *data)
+static int tlb_show(struct seq_file *s, void *data)
 {
 	struct omap_iommu *obj = s->private;
 
···
 	spin_unlock(&obj->page_table_lock);
 }
 
-static int debug_read_pagetable(struct seq_file *s, void *data)
+static int pagetable_show(struct seq_file *s, void *data)
 {
 	struct omap_iommu *obj = s->private;
 
···
 	return 0;
 }
 
-#define DEBUG_SEQ_FOPS_RO(name)						       \
-	static int debug_open_##name(struct inode *inode, struct file *file)   \
-	{								       \
-		return single_open(file, debug_read_##name, inode->i_private); \
-	}								       \
-									       \
-	static const struct file_operations debug_##name##_fops = {	       \
-		.open = debug_open_##name,				       \
-		.read = seq_read,					       \
-		.llseek = seq_lseek,					       \
-		.release = single_release,				       \
-	}
-
 #define DEBUG_FOPS_RO(name)						\
-	static const struct file_operations debug_##name##_fops = {	\
+	static const struct file_operations name##_fops = {		\
 		.open = simple_open,					\
 		.read = debug_read_##name,				\
 		.llseek = generic_file_llseek,				\
 	}
 
 DEBUG_FOPS_RO(regs);
-DEBUG_SEQ_FOPS_RO(tlb);
-DEBUG_SEQ_FOPS_RO(pagetable);
+DEFINE_SHOW_ATTRIBUTE(tlb);
+DEFINE_SHOW_ATTRIBUTE(pagetable);
 
 #define __DEBUG_ADD_FILE(attr, mode)					\
 	{								\
 		struct dentry *dent;					\
 		dent = debugfs_create_file(#attr, mode, obj->debug_dir,	\
-					   obj, &debug_##attr##_fops);	\
+					   obj, &attr##_fops);		\
 		if (!dent)						\
 			goto err;					\
 	}
+12 -22
drivers/iommu/qcom_iommu.c
···
 #include <linux/iommu.h>
 #include <linux/iopoll.h>
 #include <linux/kconfig.h>
-#include <linux/module.h>
+#include <linux/init.h>
 #include <linux/mutex.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
···
 
 static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
 	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
 	int ret;
 
···
 	/* Ensure that the domain is finalized */
 	pm_runtime_get_sync(qcom_iommu->dev);
-	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, fwspec);
 	pm_runtime_put_sync(qcom_iommu->dev);
 	if (ret < 0)
 		return ret;
···
 
 static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
 	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
 	unsigned i;
···
 
 static int qcom_iommu_add_device(struct device *dev)
 {
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev_iommu_fwspec_get(dev));
 	struct iommu_group *group;
 	struct device_link *link;
 
···
 
 static void qcom_iommu_remove_device(struct device *dev)
 {
-	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev_iommu_fwspec_get(dev));
 
 	if (!qcom_iommu)
 		return;
···
 
 static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct qcom_iommu_dev *qcom_iommu;
 	struct platform_device *iommu_pdev;
 	unsigned asid = args->args[0];
···
 	    WARN_ON(asid > qcom_iommu->num_ctxs))
 		return -EINVAL;
 
-	if (!dev->iommu_fwspec->iommu_priv) {
-		dev->iommu_fwspec->iommu_priv = qcom_iommu;
+	if (!fwspec->iommu_priv) {
+		fwspec->iommu_priv = qcom_iommu;
 	} else {
 		/* make sure devices iommus dt node isn't referring to
 		 * multiple different iommu devices. Multiple context
 		 * banks are ok, but multiple devices are not:
 		 */
-		if (WARN_ON(qcom_iommu != dev->iommu_fwspec->iommu_priv))
+		if (WARN_ON(qcom_iommu != fwspec->iommu_priv))
 			return -EINVAL;
 	}
 
···
 	{ .compatible = "qcom,msm-iommu-v1" },
 	{ /* sentinel */ }
 };
-MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
 
 static struct platform_driver qcom_iommu_driver = {
 	.driver = {
···
 
 	return ret;
 }
-
-static void __exit qcom_iommu_exit(void)
-{
-	platform_driver_unregister(&qcom_iommu_driver);
-	platform_driver_unregister(&qcom_iommu_ctx_driver);
-}
-
-module_init(qcom_iommu_init);
-module_exit(qcom_iommu_exit);
-
-MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
-MODULE_LICENSE("GPL v2");
+device_initcall(qcom_iommu_init);
+6 -7
drivers/iommu/rockchip-iommu.c
···
 /*
+ * IOMMU API for Rockchip
+ *
+ * Module Authors:	Simon Xue <xxm@rock-chips.com>
+ *			Daniel Kurtz <djkurtz@chromium.org>
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
···
 #include <linux/iopoll.h>
 #include <linux/list.h>
 #include <linux/mm.h>
-#include <linux/module.h>
+#include <linux/init.h>
 #include <linux/of.h>
 #include <linux/of_iommu.h>
 #include <linux/of_platform.h>
···
 	{ .compatible = "rockchip,iommu" },
 	{ /* sentinel */ }
 };
-MODULE_DEVICE_TABLE(of, rk_iommu_dt_ids);
 
 static struct platform_driver rk_iommu_driver = {
 	.probe = rk_iommu_probe,
···
 	return platform_driver_register(&rk_iommu_driver);
 }
 subsys_initcall(rk_iommu_init);
-
-MODULE_DESCRIPTION("IOMMU API for Rockchip");
-MODULE_AUTHOR("Simon Xue <xxm@rock-chips.com> and Daniel Kurtz <djkurtz@chromium.org>");
-MODULE_ALIAS("platform:rockchip-iommu");
-MODULE_LICENSE("GPL v2");
+7 -30
drivers/iommu/tegra-gart.c
···
  *
  * Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
  *
+ * Author: Hiroshi DOYU <hdoyu@nvidia.com>
+ *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
  * version 2, as published by the Free Software Foundation.
···
 
 #define pr_fmt(fmt)	"%s(): " fmt, __func__
 
-#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
 #include <linux/platform_device.h>
 #include <linux/spinlock.h>
 #include <linux/slab.h>
···
 	return 0;
 }
 
-static int tegra_gart_remove(struct platform_device *pdev)
-{
-	struct gart_device *gart = platform_get_drvdata(pdev);
-
-	iommu_device_unregister(&gart->iommu);
-	iommu_device_sysfs_remove(&gart->iommu);
-
-	writel(0, gart->regs + GART_CONFIG);
-	if (gart->savedata)
-		vfree(gart->savedata);
-	gart_handle = NULL;
-	return 0;
-}
-
 static const struct dev_pm_ops tegra_gart_pm_ops = {
 	.suspend	= tegra_gart_suspend,
 	.resume		= tegra_gart_resume,
···
 	{ .compatible = "nvidia,tegra20-gart", },
 	{ },
 };
-MODULE_DEVICE_TABLE(of, tegra_gart_of_match);
 
 static struct platform_driver tegra_gart_driver = {
 	.probe		= tegra_gart_probe,
-	.remove		= tegra_gart_remove,
 	.driver = {
 		.name	= "tegra-gart",
 		.pm	= &tegra_gart_pm_ops,
 		.of_match_table = tegra_gart_of_match,
+		.suppress_bind_attrs = true,
 	},
 };
 
-static int tegra_gart_init(void)
+static int __init tegra_gart_init(void)
 {
 	return platform_driver_register(&tegra_gart_driver);
 }
-
-static void __exit tegra_gart_exit(void)
-{
-	platform_driver_unregister(&tegra_gart_driver);
-}
-
 subsys_initcall(tegra_gart_init);
-module_exit(tegra_gart_exit);
-module_param(gart_debug, bool, 0644);
 
+module_param(gart_debug, bool, 0644);
 MODULE_PARM_DESC(gart_debug, "Enable GART debugging");
-MODULE_DESCRIPTION("IOMMU API for GART in Tegra20");
-MODULE_AUTHOR("Hiroshi DOYU <hdoyu@nvidia.com>");
-MODULE_ALIAS("platform:tegra-gart");
-MODULE_LICENSE("GPL v2");
+3 -23
drivers/iommu/tegra-smmu.c
···
 
 static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 {
-	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct tegra_smmu *smmu = dev->archdata.iommu;
 	struct iommu_group *group;
 
···
 	return 0;
 }
 
-static int tegra_smmu_swgroups_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, tegra_smmu_swgroups_show, inode->i_private);
-}
-
-static const struct file_operations tegra_smmu_swgroups_fops = {
-	.open = tegra_smmu_swgroups_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(tegra_smmu_swgroups);
 
 static int tegra_smmu_clients_show(struct seq_file *s, void *data)
 {
···
 	return 0;
 }
 
-static int tegra_smmu_clients_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, tegra_smmu_clients_show, inode->i_private);
-}
-
-static const struct file_operations tegra_smmu_clients_fops = {
-	.open = tegra_smmu_clients_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(tegra_smmu_clients);
 
 static void tegra_smmu_debugfs_init(struct tegra_smmu *smmu)
 {
+1 -1
drivers/misc/mic/scif/scif_rma.c
···
  * Intel SCIF driver.
  *
  */
-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>
 #include <linux/pagemap.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
+1 -1
drivers/misc/mic/scif/scif_rma.h
···
 #ifndef SCIF_RMA_H
 #define SCIF_RMA_H
 
-#include <linux/dma_remapping.h>
+#include <linux/intel-iommu.h>
 #include <linux/mmu_notifier.h>
 
 #include "../bus/scif_bus.h"
+1 -1
drivers/usb/host/xhci.c
···
 	 * an iommu. Doing anything when there is no iommu is definitely
 	 * unsafe...
 	 */
-	if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !dev->iommu_group)
+	if (!(xhci->quirks & XHCI_ZERO_64B_REGS) || !device_iommu_mapped(dev))
 		return;
 
 	xhci_info(xhci, "Zeroing 64bit base registers, expecting fault\n");
+2 -31
drivers/vfio/vfio_iommu_type1.c
···
 	return ret;
 }
 
-/*
- * Turns out AMD IOMMU has a page table bug where it won't map large pages
- * to a region that previously mapped smaller pages.  This should be fixed
- * soon, so this is just a temporary workaround to break mappings down into
- * PAGE_SIZE.  Better to map smaller pages than nothing.
- */
-static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
-			  unsigned long pfn, long npage, int prot)
-{
-	long i;
-	int ret = 0;
-
-	for (i = 0; i < npage; i++, pfn++, iova += PAGE_SIZE) {
-		ret = iommu_map(domain->domain, iova,
-				(phys_addr_t)pfn << PAGE_SHIFT,
-				PAGE_SIZE, prot | domain->prot);
-		if (ret)
-			break;
-	}
-
-	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
-		iommu_unmap(domain->domain, iova, PAGE_SIZE);
-
-	return ret;
-}
-
 static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 			  unsigned long pfn, long npage, int prot)
 {
···
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
 				npage << PAGE_SHIFT, prot | d->prot);
-		if (ret) {
-			if (ret != -EBUSY ||
-			    map_try_harder(d, iova, pfn, npage, prot))
-				goto unwind;
-		}
+		if (ret)
+			goto unwind;
 
 		cond_resched();
 	}
+10
include/linux/device.h
···
 	return container_of(kobj, struct device, kobj);
 }
 
+/**
+ * device_iommu_mapped - Returns true when the device DMA is translated
+ *			 by an IOMMU
+ * @dev: Device to perform the check on
+ */
+static inline bool device_iommu_mapped(struct device *dev)
+{
+	return (dev->iommu_group != NULL);
+}
+
 /* Get the wakeup routines, which depend on struct device */
 #include <linux/pm_wakeup.h>
-58
include/linux/dma_remapping.h
···
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _DMA_REMAPPING_H
-#define _DMA_REMAPPING_H
-
-/*
- * VT-d hardware uses 4KiB page size regardless of host page size.
- */
-#define VTD_PAGE_SHIFT		(12)
-#define VTD_PAGE_SIZE		(1UL << VTD_PAGE_SHIFT)
-#define VTD_PAGE_MASK		(((u64)-1) << VTD_PAGE_SHIFT)
-#define VTD_PAGE_ALIGN(addr)	(((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)
-
-#define VTD_STRIDE_SHIFT	(9)
-#define VTD_STRIDE_MASK		(((u64)-1) << VTD_STRIDE_SHIFT)
-
-#define DMA_PTE_READ		(1)
-#define DMA_PTE_WRITE		(2)
-#define DMA_PTE_LARGE_PAGE	(1 << 7)
-#define DMA_PTE_SNP		(1 << 11)
-
-#define CONTEXT_TT_MULTI_LEVEL	0
-#define CONTEXT_TT_DEV_IOTLB	1
-#define CONTEXT_TT_PASS_THROUGH	2
-/* Extended context entry types */
-#define CONTEXT_TT_PT_PASID	4
-#define CONTEXT_TT_PT_PASID_DEV_IOTLB	5
-#define CONTEXT_TT_MASK		(7ULL << 2)
-
-#define CONTEXT_DINVE		(1ULL << 8)
-#define CONTEXT_PRS		(1ULL << 9)
-#define CONTEXT_PASIDE		(1ULL << 11)
-
-struct intel_iommu;
-struct dmar_domain;
-struct root_entry;
-
-
-#ifdef CONFIG_INTEL_IOMMU
-extern int iommu_calculate_agaw(struct intel_iommu *iommu);
-extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
-extern int dmar_disabled;
-extern int intel_iommu_enabled;
-extern int intel_iommu_tboot_noforce;
-#else
-static inline int iommu_calculate_agaw(struct intel_iommu *iommu)
-{
-	return 0;
-}
-static inline int iommu_calculate_max_sagaw(struct intel_iommu *iommu)
-{
-	return 0;
-}
-#define dmar_disabled	(1)
-#define intel_iommu_enabled (0)
-#endif
-
-
-#endif
+96 -12
include/linux/intel-iommu.h
···
 #include <linux/iova.h>
 #include <linux/io.h>
 #include <linux/idr.h>
-#include <linux/dma_remapping.h>
 #include <linux/mmu_notifier.h>
 #include <linux/list.h>
 #include <linux/iommu.h>
···
 #include <asm/iommu.h>
 
 /*
+ * VT-d hardware uses 4KiB page size regardless of host page size.
+ */
+#define VTD_PAGE_SHIFT		(12)
+#define VTD_PAGE_SIZE		(1UL << VTD_PAGE_SHIFT)
+#define VTD_PAGE_MASK		(((u64)-1) << VTD_PAGE_SHIFT)
+#define VTD_PAGE_ALIGN(addr)	(((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)
+
+#define VTD_STRIDE_SHIFT	(9)
+#define VTD_STRIDE_MASK		(((u64)-1) << VTD_STRIDE_SHIFT)
+
+#define DMA_PTE_READ		(1)
+#define DMA_PTE_WRITE		(2)
+#define DMA_PTE_LARGE_PAGE	(1 << 7)
+#define DMA_PTE_SNP		(1 << 11)
+
+#define CONTEXT_TT_MULTI_LEVEL	0
+#define CONTEXT_TT_DEV_IOTLB	1
+#define CONTEXT_TT_PASS_THROUGH	2
+#define CONTEXT_PASIDE		BIT_ULL(3)
+
+/*
  * Intel IOMMU register specification per version 1.0 public spec.
  */
-
 #define	DMAR_VER_REG	0x0	/* Arch version supported by this IOMMU */
 #define	DMAR_CAP_REG	0x8	/* Hardware supported capabilities */
 #define	DMAR_ECAP_REG	0x10	/* Extended capabilities supported */
···
 /*
  * Extended Capability Register
  */
 
+#define ecap_smpwc(e)		(((e) >> 48) & 0x1)
+#define ecap_flts(e)		(((e) >> 47) & 0x1)
+#define ecap_slts(e)		(((e) >> 46) & 0x1)
+#define ecap_smts(e)		(((e) >> 43) & 0x1)
 #define ecap_dit(e)		((e >> 41) & 0x1)
 #define ecap_pasid(e)		((e >> 40) & 0x1)
 #define ecap_pss(e)		((e >> 35) & 0x1f)
···
 
 /* DMA_RTADDR_REG */
 #define DMA_RTADDR_RTT (((u64)1) << 11)
+#define DMA_RTADDR_SMT (((u64)1) << 10)
 
 /* CCMD_REG */
 #define DMA_CCMD_ICC (((u64)1) << 63)
···
 #define QI_GRAN_NONG_PASID	2
 #define QI_GRAN_PSI_PASID	3
 
+#define qi_shift(iommu)		(DMAR_IQ_SHIFT + !!ecap_smts((iommu)->ecap))
+
 struct qi_desc {
-	u64 low, high;
+	u64 qw0;
+	u64 qw1;
+	u64 qw2;
+	u64 qw3;
 };
 
 struct q_inval {
 	raw_spinlock_t  q_lock;
-	struct qi_desc  *desc;          /* invalidation queue */
+	void		*desc;          /* invalidation queue */
 	int             *desc_status;   /* desc status */
 	int             free_head;      /* first free entry */
 	int             free_tail;      /* last free entry */
···
 	struct iommu_flush flush;
 #endif
 #ifdef CONFIG_INTEL_IOMMU_SVM
-	/* These are large and need to be contiguous, so we allocate just
-	 * one for now. We'll maybe want to rethink that if we truly give
-	 * devices away to userspace processes (e.g. for DPDK) and don't
-	 * want to trust that userspace will use *only* the PASID it was
-	 * told to. But while it's all driver-arbitrated, we're fine. */
-	struct pasid_state_entry *pasid_state_table;
 	struct page_req_dsc *prq;
 	unsigned char prq_name[16];    /* Name for PRQ interrupt */
-	u32 pasid_max;
 #endif
 	struct q_inval  *qi;            /* Queued invalidation info */
 	u32 *iommu_state; /* Store iommu states between suspend and resume.*/
···
 	clflush_cache_range(addr, size);
 }
 
+/*
+ * 0: readable
+ * 1: writable
+ * 2-6: reserved
+ * 7: super page
+ * 8-10: available
+ * 11: snoop behavior
+ * 12-63: Host physcial address
+ */
+struct dma_pte {
+	u64 val;
+};
+
+static inline void dma_clear_pte(struct dma_pte *pte)
+{
+	pte->val = 0;
+}
+
+static inline u64 dma_pte_addr(struct dma_pte *pte)
+{
+#ifdef CONFIG_64BIT
+	return pte->val & VTD_PAGE_MASK;
+#else
+	/* Must have a full atomic 64-bit read */
+	return __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK;
+#endif
+}
+
+static inline bool dma_pte_present(struct dma_pte *pte)
+{
+	return (pte->val & 3) != 0;
+}
+
+static inline bool dma_pte_superpage(struct dma_pte *pte)
+{
+	return (pte->val & DMA_PTE_LARGE_PAGE);
+}
+
+static inline int first_pte_in_page(struct dma_pte *pte)
+{
+	return !((unsigned long)pte & ~VTD_PAGE_MASK);
+}
+
 extern struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev *dev);
 extern int dmar_find_matched_atsr_unit(struct pci_dev *dev);
···
 struct intel_iommu *domain_get_iommu(struct dmar_domain *domain);
 int for_each_device_domain(int (*fn)(struct device_domain_info *info,
 				     void *data), void *data);
+void iommu_flush_write_buffer(struct intel_iommu *iommu);
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_svm_init(struct intel_iommu *iommu);
-int intel_svm_exit(struct intel_iommu *iommu);
 extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
···
 bool context_present(struct context_entry *context);
 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
 					 u8 devfn, int alloc);
+
+#ifdef CONFIG_INTEL_IOMMU
+extern int iommu_calculate_agaw(struct intel_iommu *iommu);
+extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
+extern int dmar_disabled;
+extern int intel_iommu_enabled;
+extern int intel_iommu_tboot_noforce;
+#else
+static inline int iommu_calculate_agaw(struct intel_iommu *iommu)
+{
+	return 0;
+}
+static inline int iommu_calculate_max_sagaw(struct intel_iommu *iommu)
+{
+	return 0;
+}
+#define dmar_disabled	(1)
+#define intel_iommu_enabled (0)
+#endif
 
 #endif
+16 -2
include/linux/iommu.h
···
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain
- * @tlb_range_add: Add a given iova range to the flush queue for this domain
- * @tlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
+ * @iotlb_range_add: Add a given iova range to the flush queue for this domain
+ * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
  * @iova_to_phys: translate iova to physical address
  * @add_device: add device to iommu grouping
···
 void iommu_fwspec_free(struct device *dev);
 int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids);
 const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode);
+
+static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
+{
+	return dev->iommu_fwspec;
+}
+
+static inline void dev_iommu_fwspec_set(struct device *dev,
+					struct iommu_fwspec *fwspec)
+{
+	dev->iommu_fwspec = fwspec;
+}
+
+int iommu_probe_device(struct device *dev);
+void iommu_release_device(struct device *dev);
 
 #else /* CONFIG_IOMMU_API */