Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

iommu/arm-smmu-v3-iommufd: Allow attaching nested domain for GBPA cases

A vDEVICE has been a hard requirement for attaching a nested domain to the
device. This makes sense when installing a guest STE, since a vSID must be
present and given to the kernel during the vDEVICE allocation.

However, when CR0.SMMUEN is disabled, the VM doesn't really need a vSID to
program the vSMMU behavior, because GBPA takes effect. In that case the vSTE
in the nested domain can carry the bypass or abort configuration from the
GBPA register. Thus, such a hard requirement doesn't work well for GBPA.

Skip vmaster allocation in arm_smmu_attach_prepare_vmaster() for an abort
or bypass vSTE. Note that a device attached this way won't report vevents.

Update the uAPI doc accordingly.

Link: https://patch.msgid.link/r/20251103172755.2026145-1-nicolinc@nvidia.com
Tested-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Tested-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Authored by Nicolin Chen and committed by Jason Gunthorpe
81c45c62 ac3fd01e

+22 -1
+12 -1
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
···
 int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
 				    struct arm_smmu_nested_domain *nested_domain)
 {
+	unsigned int cfg =
+		FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(nested_domain->ste[0]));
 	struct arm_smmu_vmaster *vmaster;
 	unsigned long vsid;
 	int ret;
···
 
 	ret = iommufd_viommu_get_vdev_id(&nested_domain->vsmmu->core,
 					 state->master->dev, &vsid);
-	if (ret)
+	/*
+	 * Attaching to a translate nested domain must allocate a vDEVICE prior,
+	 * as CD/ATS invalidations and vevents require a vSID to work properly.
+	 * A abort/bypass domain is allowed to attach w/o vmaster for GBPA case.
+	 */
+	if (ret) {
+		if (cfg == STRTAB_STE_0_CFG_ABORT ||
+		    cfg == STRTAB_STE_0_CFG_BYPASS)
+			return 0;
 		return ret;
+	}
 
 	vmaster = kzalloc(sizeof(*vmaster), GFP_KERNEL);
 	if (!vmaster)
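The attach-time rule in this hunk can be modeled outside the kernel. The following is only a minimal userspace sketch, not the driver code: `ste_cfg()` and `attach_prepare_model()` are hypothetical names, the STE word is assumed to be already in host byte order (the driver applies le64_to_cpu() first), and `lookup_ret` stands in for the return value of iommufd_viommu_get_vdev_id(). The Cfg field position and encodings are assumed to match the driver's STRTAB_STE_0_CFG definitions.

```c
#include <stdint.h>

/* Assumption: Cfg occupies bits [3:1] of STE word 0, as STRTAB_STE_0_CFG does. */
#define STE_0_CFG_SHIFT  1
#define STE_0_CFG_MASK   0x7ULL

/* Assumption: encodings mirror the driver's STRTAB_STE_0_CFG_* values. */
#define STE_CFG_ABORT    0u
#define STE_CFG_BYPASS   4u
#define STE_CFG_S1_TRANS 5u

/* Hypothetical helper: extract Cfg from a host-endian STE word 0. */
static unsigned int ste_cfg(uint64_t ste0)
{
	return (unsigned int)((ste0 >> STE_0_CFG_SHIFT) & STE_0_CFG_MASK);
}

/*
 * Hypothetical model of the patched check.
 *
 * lookup_ret: 0 if a vDEVICE (and thus a vSID) was found, negative otherwise.
 *
 * Returns 1 if the attach proceeds and a vmaster would be allocated,
 *         0 if the attach proceeds without a vmaster (abort/bypass vSTE),
 *         lookup_ret (negative) if the attach is rejected.
 */
static int attach_prepare_model(uint64_t ste0, int lookup_ret)
{
	unsigned int cfg = ste_cfg(ste0);

	if (lookup_ret) {
		/* No vDEVICE: tolerated only for an abort or bypass vSTE. */
		if (cfg == STE_CFG_ABORT || cfg == STE_CFG_BYPASS)
			return 0;
		return lookup_ret;
	}
	return 1;
}
```

For example, with a failed lookup, a word 0 of 0x9 (V=1, Cfg=bypass) is accepted without a vmaster, whereas 0xB (V=1, Cfg=S1 translate) propagates the lookup error.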
+10
include/uapi/linux/iommufd.h
···
  * nested domain will translate the same as the nesting parent. The S1 will
  * install a Context Descriptor Table pointing at userspace memory translated
  * by the nesting parent.
+ *
+ * It's suggested to allocate a vDEVICE object carrying vSID and then re-attach
+ * the nested domain, as soon as the vSID is available in the VMM level:
+ *
+ * - when Cfg=translate, a vDEVICE must be allocated prior to attaching to the
+ *   allocated nested domain, as CD/ATS invalidations and vevents need a vSID.
+ * - when Cfg=bypass/abort, a vDEVICE is not enforced during the nested domain
+ *   attachment, to support a GBPA case where VM sets CR0.SMMUEN=0. However, if
+ *   VM sets CR0.SMMUEN=1 while missing a vDEVICE object, kernel would fail to
+ *   report events to the VM. E.g. F_TRANSLATION when guest STE.Cfg=abort.
  */
 struct iommu_hwpt_arm_smmuv3 {
 	__aligned_le64 ste[2];
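From the VMM side, the uAPI note above suggests a small decision procedure. The sketch below is one possible reading of it, not code from any particular VMM: `gbpa_vste0()`, `vdevice_need()` and the `VDEV_*` values are hypothetical, and the register bit positions (CR0.SMMUEN at bit 0, GBPA.ABORT at bit 20, STE.V at bit 0, STE.Cfg at bits [3:1]) are assumptions taken from the SMMUv3 register layout as used by the driver. The returned word is in host byte order and would still need converting to little-endian before being placed in `ste[0]`.

```c
#include <stdint.h>

/* Assumed guest register bits. */
#define CR0_SMMUEN      (1u << 0)
#define GBPA_ABORT      (1u << 20)

/* Assumed STE word 0 layout and Cfg encodings. */
#define STE_0_V         (1ULL << 0)
#define STE_0_CFG_SHIFT 1
#define STE_CFG_ABORT   0u
#define STE_CFG_BYPASS  4u

/*
 * Hypothetical helper: while the guest keeps CR0.SMMUEN=0, build a vSTE
 * word 0 (host byte order) that carries the GBPA abort/bypass behavior.
 */
static uint64_t gbpa_vste0(uint32_t gbpa)
{
	uint64_t cfg = (gbpa & GBPA_ABORT) ? STE_CFG_ABORT : STE_CFG_BYPASS;

	return STE_0_V | (cfg << STE_0_CFG_SHIFT);
}

/* Hypothetical classification of how much a vDEVICE matters for an attach. */
enum vdev_need {
	VDEV_OPTIONAL,    /* SMMUEN=0: GBPA applies, no vSID needed yet */
	VDEV_RECOMMENDED, /* SMMUEN=1, abort/bypass: attach works, vevents are lost */
	VDEV_REQUIRED,    /* translate Cfg: attach fails without a vDEVICE */
};

static enum vdev_need vdevice_need(uint32_t cr0, unsigned int cfg)
{
	if (cfg != STE_CFG_ABORT && cfg != STE_CFG_BYPASS)
		return VDEV_REQUIRED;
	if (!(cr0 & CR0_SMMUEN))
		return VDEV_OPTIONAL;
	return VDEV_RECOMMENDED;
}
```

In this reading, a VMM would attach a nested domain built from `gbpa_vste0()` early on, then allocate the vDEVICE and re-attach once the vSID is known, so that events can be reported after the guest sets CR0.SMMUEN=1.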