Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86, iommu/vt-d: Add an option to disable Intel IOMMU force on

The IOMMU harms performance significantly when we run very fast networking
workloads, in our case an XDP test on 40Gbit networking. The software
overhead is almost unnoticeable; it's the IOTLB misses (based on our
analysis) that kill the performance. We observed the same performance issue
even with software passthrough (identity mapping); only hardware passthrough
survives. The pps with the IOMMU (with software passthrough) is only about
30% of that without it. Based on our observation this is a hardware
limitation, so we'd like to disable the IOMMU force on, but we do want
to use TBOOT and we can sacrifice the DMA security bought by the IOMMU. I
must admit I know nothing about TBOOT, but the TBOOT guys (cc-ed) think not
enabling the IOMMU is totally ok.

So introduce a new boot option to disable the force on. It's kind of
silly that we need to run into intel_iommu_init even without force on, but
we need to disable the TBOOT PMR registers. For systems without the boot
option, nothing changes.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

Authored by Shaohua Li and committed by Joerg Roedel
bfd20f1c 161b28aa

+31
+9
Documentation/admin-guide/kernel-parameters.txt
···
 			extended tables themselves, and also PASID support. With
 			this option set, extended tables will not be used even
 			on hardware which claims to support them.
+		tboot_noforce [Default Off]
+			Do not force the Intel IOMMU enabled under tboot.
+			By default, tboot will force Intel IOMMU on, which
+			could harm performance of some high-throughput
+			devices like 40GBit network cards, even if identity
+			mapping is enabled.
+			Note that using this option lowers the security
+			provided by tboot because it makes the system
+			vulnerable to DMA attacks.
 
 	intel_idle.max_cstate=	[KNL,HW,ACPI,X86]
 			0 disables intel_idle and fall back on acpi_idle.
+3
arch/x86/kernel/tboot.c
···
 	if (!tboot_enabled())
 		return 0;
 
+	if (intel_iommu_tboot_noforce)
+		return 1;
+
 	if (no_iommu || swiotlb || dmar_disabled)
 		pr_warning("Forcing Intel-IOMMU to enabled\n");
 
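The decision made in this hunk can be modelled outside the kernel. The following is only a sketch under stated assumptions: `struct iommu_boot_state` and `model_tboot_force_iommu` are hypothetical names, the struct fields stand in for `tboot_enabled()`, `intel_iommu_tboot_noforce` and `dmar_disabled`, and the tail of the real function (not shown in the hunk) is assumed to clear `dmar_disabled` before returning 1.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state the hunk consults. */
struct iommu_boot_state {
	bool tboot_enabled;	/* models tboot_enabled() */
	bool tboot_noforce;	/* models intel_iommu_tboot_noforce */
	bool dmar_disabled;	/* models dmar_disabled */
};

/*
 * Returns 0 when the kernel was not launched through tboot, 1 otherwise.
 * With tboot_noforce set, the function still returns 1 so that the Intel
 * IOMMU init path runs (it has to disable the PMRs), but it leaves
 * dmar_disabled untouched instead of forcing the IOMMU on.
 */
int model_tboot_force_iommu(struct iommu_boot_state *s)
{
	if (!s->tboot_enabled)
		return 0;

	if (s->tboot_noforce)
		return 1;

	/* Assumed: the unpatched tail of the function forces the IOMMU on. */
	s->dmar_disabled = false;
	return 1;
}
```

In this model, a system booted without the new option behaves exactly as before: under tboot, `dmar_disabled` is cleared and the function returns 1.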
+18
drivers/iommu/intel-iommu.c
···
  * (used when kernel is launched w/ TXT)
  */
 static int force_on = 0;
+int intel_iommu_tboot_noforce;
 
 /*
  * 0: Present
···
 				"Intel-IOMMU: enable pre-production PASID support\n");
 			intel_iommu_pasid28 = 1;
 			iommu_identity_mapping |= IDENTMAP_GFX;
+		} else if (!strncmp(str, "tboot_noforce", 13)) {
+			printk(KERN_INFO
+				"Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n");
+			intel_iommu_tboot_noforce = 1;
 		}
 
 		str += strcspn(str, ",");
···
 	}
 
 	if (no_iommu || dmar_disabled) {
+		/*
+		 * We exit the function here to ensure IOMMU's remapping and
+		 * mempool aren't setup, which means that the IOMMU's PMRs
+		 * won't be disabled via the call to init_dmars(). So disable
+		 * it explicitly here. The PMRs were setup by tboot prior to
+		 * calling SENTER, but the kernel is expected to reset/tear
+		 * down the PMRs.
+		 */
+		if (intel_iommu_tboot_noforce) {
+			for_each_iommu(iommu, drhd)
+				iommu_disable_protect_mem_regions(iommu);
+		}
+
 		/*
 		 * Make sure the IOMMUs are switched off, even when we
 		 * boot into a kexec kernel and the previous kernel left
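The option parsing in the second hunk walks a comma-separated string and matches each token by prefix. A minimal userspace sketch of that pattern follows; `struct intel_iommu_opts` and `parse_intel_iommu_opts` are hypothetical names, only the `tboot_noforce` token is handled, and the loop that skips the comma after `strcspn` is an assumption, since the hunk ends before that point.

```c
#include <string.h>

/* Hypothetical container for the one flag this sketch tracks. */
struct intel_iommu_opts {
	int tboot_noforce;
};

/*
 * Scan a comma-separated option string such as "igfx_off,tboot_noforce".
 * As in the hunk, a token matches when its first 13 characters equal
 * "tboot_noforce", so a longer token with that prefix also matches.
 */
void parse_intel_iommu_opts(const char *str, struct intel_iommu_opts *opts)
{
	while (*str) {
		if (!strncmp(str, "tboot_noforce", 13))
			opts->tboot_noforce = 1;

		/* Advance to the next comma (or the end of the string). */
		str += strcspn(str, ",");
		/* Assumed: skip the separator(s) before the next token. */
		while (*str == ',')
			str++;
	}
}
```

Calling `parse_intel_iommu_opts("igfx_off,tboot_noforce", &opts)` on a zero-initialised `opts` sets `opts.tboot_noforce` to 1, while a string without the token leaves it at 0.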
+1
include/linux/dma_remapping.h
···
 extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
 extern int dmar_disabled;
 extern int intel_iommu_enabled;
+extern int intel_iommu_tboot_noforce;
 #else
 static inline int iommu_calculate_agaw(struct intel_iommu *iommu)
 {