Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

vfio: Introduce DMA logging uAPIs

DMA logging allows a device to internally record what DMAs the device is
initiating and report them back to userspace. It is part of the VFIO
migration infrastructure that allows implementing dirty page tracking
during the pre copy phase of live migration. Only DMA WRITEs are logged,
and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.

This patch introduces the uAPIs involved in DMA logging.

It uses the FEATURE ioctl with its GET/SET/PROBE options as follows.

It exposes a PROBE option to detect whether the device supports DMA logging.
It exposes a SET option to start device DMA logging in the given IOVA
ranges.
It exposes a SET option to stop device DMA logging that was previously
started.
It exposes a GET option to read back and clear the device DMA log.

Further details for each option are documented in vfio.h.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-4-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
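
The START option described above can be sketched from the userspace side as follows. This is a minimal illustration, not code from the patch: the struct names feature_hdr, dma_logging_control and dma_logging_range are local mirrors of the uAPI layouts (real code would include <linux/vfio.h> and use struct vfio_device_feature and the structs added by this patch), the helper build_logging_start() is hypothetical, and the flag bit value for VFIO_DEVICE_FEATURE_SET is assumed to match the existing vfio.h definition.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of struct vfio_device_feature (assumed layout). */
struct feature_hdr {
    uint32_t argsz;   /* total size of header plus payload */
    uint32_t flags;   /* GET/SET/PROBE bit ORed with the feature index */
    uint8_t  data[];  /* feature-specific payload */
};

/* Local mirror of struct vfio_device_feature_dma_logging_control. */
struct dma_logging_control {
    uint64_t page_size;   /* in: granularity hint, out: size the driver chose */
    uint32_t num_ranges;
    uint32_t reserved;
    uint64_t ranges;      /* user pointer to an array of dma_logging_range */
};

/* Local mirror of struct vfio_device_feature_dma_logging_range. */
struct dma_logging_range {
    uint64_t iova;
    uint64_t length;
};

#define FEATURE_SET        (1u << 17) /* assumed value of VFIO_DEVICE_FEATURE_SET */
#define DMA_LOGGING_START  6          /* VFIO_DEVICE_FEATURE_DMA_LOGGING_START */

/*
 * Hypothetical helper: build the ioctl argument for a DMA_LOGGING_START
 * request. Returns a heap buffer the caller must free, or NULL on failure.
 */
static struct feature_hdr *
build_logging_start(uint64_t page_size_hint,
                    const struct dma_logging_range *ranges,
                    uint32_t num_ranges)
{
    size_t argsz = sizeof(struct feature_hdr) +
                   sizeof(struct dma_logging_control);
    struct feature_hdr *hdr = calloc(1, argsz);
    if (!hdr)
        return NULL;

    struct dma_logging_control ctl = {
        .page_size  = page_size_hint,
        .num_ranges = num_ranges,
        .ranges     = (uint64_t)(uintptr_t)ranges,
    };

    hdr->argsz = (uint32_t)argsz;
    hdr->flags = FEATURE_SET | DMA_LOGGING_START;
    /* Copy rather than cast, so the byte payload is never aliased. */
    memcpy(hdr->data, &ctl, sizeof(ctl));
    return hdr;
}
```

With a real device file descriptor, the buffer would be passed to the VFIO_DEVICE_FEATURE ioctl, and on success page_size in the payload would hold the granularity the driver selected. ORing in VFIO_DEVICE_FEATURE_PROBE should, under the same assumptions, turn the request into a pure capability check.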

Authored by Yishai Hadas and committed by Alex Williamson
42ee53f9 71aef261

+86
include/uapi/linux/vfio.h
···
  */
 #define VFIO_DEVICE_FEATURE_LOW_POWER_EXIT 5

+/*
+ * Upon VFIO_DEVICE_FEATURE_SET start/stop device DMA logging.
+ * VFIO_DEVICE_FEATURE_PROBE can be used to detect if the device supports
+ * DMA logging.
+ *
+ * DMA logging allows a device to internally record what DMAs the device is
+ * initiating and report them back to userspace. It is part of the VFIO
+ * migration infrastructure that allows implementing dirty page tracking
+ * during the pre copy phase of live migration. Only DMA WRITEs are logged,
+ * and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
+ *
+ * When DMA logging is started a range of IOVAs to monitor is provided and the
+ * device can optimize its logging to cover only the IOVA range given. Each
+ * DMA that the device initiates inside the range will be logged by the device
+ * for later retrieval.
+ *
+ * page_size is an input that hints what tracking granularity the device
+ * should try to achieve. If the device cannot do the hinted page size then
+ * it's the driver choice which page size to pick based on its support.
+ * On output the device will return the page size it selected.
+ *
+ * ranges is a pointer to an array of
+ * struct vfio_device_feature_dma_logging_range.
+ *
+ * The core kernel code guarantees to support by minimum num_ranges that fit
+ * into a single kernel page. User space can try higher values but should give
+ * up if the above can't be achieved as of some driver limitations.
+ *
+ * A single call to start device DMA logging can be issued and a matching stop
+ * should follow at the end. Another start is not allowed in the meantime.
+ */
+struct vfio_device_feature_dma_logging_control {
+	__aligned_u64 page_size;
+	__u32 num_ranges;
+	__u32 __reserved;
+	__aligned_u64 ranges;
+};
+
+struct vfio_device_feature_dma_logging_range {
+	__aligned_u64 iova;
+	__aligned_u64 length;
+};
+
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_START 6
+
+/*
+ * Upon VFIO_DEVICE_FEATURE_SET stop device DMA logging that was started
+ * by VFIO_DEVICE_FEATURE_DMA_LOGGING_START
+ */
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP 7
+
+/*
+ * Upon VFIO_DEVICE_FEATURE_GET read back and clear the device DMA log
+ *
+ * Query the device's DMA log for written pages within the given IOVA range.
+ * During querying the log is cleared for the IOVA range.
+ *
+ * bitmap is a pointer to an array of u64s that will hold the output bitmap
+ * with 1 bit reporting a page_size unit of IOVA. The mapping of IOVA to bits
+ * is given by:
+ *  bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64))
+ *
+ * The input page_size can be any power of two value and does not have to
+ * match the value given to VFIO_DEVICE_FEATURE_DMA_LOGGING_START. The driver
+ * will format its internal logging to match the reporting page size, possibly
+ * by replicating bits if the internal page size is lower than requested.
+ *
+ * The LOGGING_REPORT will only set bits in the bitmap and never clear or
+ * perform any initialization of the user provided bitmap.
+ *
+ * If any error is returned userspace should assume that the dirty log is
+ * corrupted. Error recovery is to consider all memory dirty and try to
+ * restart the dirty tracking, or to abort/restart the whole migration.
+ *
+ * If DMA logging is not enabled, an error will be returned.
+ *
+ */
+struct vfio_device_feature_dma_logging_report {
+	__aligned_u64 iova;
+	__aligned_u64 length;
+	__aligned_u64 page_size;
+	__aligned_u64 bitmap;
+};
+
+#define VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT 8
+
 /* -------- API for Type1 VFIO IOMMU -------- */

 /**
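
To make the REPORT bitmap layout more concrete, here is a small userspace sketch for sizing the bitmap and testing whether a given address was reported dirty. It rests on an assumption that goes beyond the terse formula in the header comment: one bit per page_size unit, numbered from the report's iova and packed least-significant-bit first into consecutive u64 words. The helper names report_bitmap_words() and report_page_dirty() are hypothetical and not part of the uAPI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of u64 words needed to hold one bit per page_size unit
 * for a report covering `length` bytes (both rounded up).
 */
static size_t report_bitmap_words(uint64_t length, uint64_t page_size)
{
    uint64_t pages = (length + page_size - 1) / page_size;
    return (size_t)((pages + 63) / 64);
}

/*
 * Test whether the page containing `addr` is marked dirty.
 * Assumes iova <= addr < iova + length of the report, and the
 * LSB-first bit packing described in the lead-in.
 */
static bool report_page_dirty(const uint64_t *bitmap, uint64_t iova,
                              uint64_t page_size, uint64_t addr)
{
    uint64_t bit = (addr - iova) / page_size;  /* page index within the report */
    return (bitmap[bit / 64] >> (bit % 64)) & 1;
}
```

Because LOGGING_REPORT only sets bits and never initializes the user buffer, a caller would zero the bitmap before each query unless it deliberately wants to accumulate dirty bits across queries.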