Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

uacce: Add documents for uacce

Uacce (Unified/User-space-access-intended Accelerator Framework) is
a kernel module that aims to provide Shared Virtual Addressing (SVA)
between accelerators and processes.

This patch adds a document explaining how it works.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Authored by Kenneth Lee and committed by Herbert Xu
aa017ab9 41ccdbfd

+176
Documentation/misc-devices/uacce.rst
.. SPDX-License-Identifier: GPL-2.0

Introduction of Uacce
---------------------

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So an accelerator can access any data structure of the main CPU.
This differs from data sharing between a CPU and an I/O device, which share
only data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses when communicating.
Uacce treats the hardware accelerator as a heterogeneous processor, while
the IOMMU shares the CPU page tables and, as a result, the same translation
from va to pa.

::

         __________________________       __________________________
        |                          |     |                          |
        |  User application (CPU)  |     |   Hardware Accelerator   |
        |__________________________|     |__________________________|

                     |                                 |
                     | va                              | va
                     V                                 V
                 __________                        __________
                |          |                      |          |
                |   MMU    |                      |  IOMMU   |
                |__________|                      |__________|
                     |                                 |
                     |                                 |
                     V pa                              V pa
                   _______________________________________
                  |                                       |
                  |                Memory                 |
                  |_______________________________________|



Architecture
------------

Uacce is the kernel module that takes charge of IOMMU and address sharing.
The user drivers and libraries are called WarpDrive.

The uacce device, built around the IOMMU SVA API, can access multiple
address spaces, including the one without PASID.

A virtual concept, the queue, is used for communication. It provides a
FIFO-like interface and maintains a unified address space between the
application and all involved hardware.

::

                               ___________________                 ________________
                              |                   |   user API    |                |
                              | WarpDrive library | ------------> |  user driver   |
                              |___________________|               |________________|
                                        |                                  |
                                        |                                  |
                                        | queue fd                         |
                                        |                                  |
                                        |                                  |
                                        v                                  |
         ___________________        _________                              |
        |                   |      |         |                             | mmap memory
        | Other framework   |      |  uacce  |                             | r/w interface
        | crypto/nic/others |      |_________|                             |
        |___________________|                                              |
                  |                     |                                  |
                  | register            | register                         |
                  |                     |                                  |
                  |                     |                                  |
                  |             _________________        __________        |
                  |            |                 |      |          |       |
                   ----------- |  Device Driver  |      |  IOMMU   |       |
                               |_________________|      |__________|       |
                                        |                                  |
                                        |                                  V
                                        |                         ___________________
                                        |                        |                   |
                                         ----------------------- | Device(Hardware)  |
                                                                 |___________________|


How does it work
----------------

Uacce uses mmap and the IOMMU to do the trick.

Uacce creates a chrdev for every device registered to it. A new queue is
created when a user application opens the chrdev. The file descriptor is used
as the user handle of the queue.
The accelerator device presents itself as a Uacce object, which is exported
as a chrdev to user space. The user application communicates with the
hardware via ioctl (the control path) or shared memory (the data path).

The control path to the hardware is via file operations, while the data path
is via the mmap space of the queue fd.

The queue file address space:

::

        /**
         * enum uacce_qfrt: qfrt type
         * @UACCE_QFRT_MMIO: device mmio region
         * @UACCE_QFRT_DUS: device user share region
         */
        enum uacce_qfrt {
                UACCE_QFRT_MMIO = 0,
                UACCE_QFRT_DUS = 1,
        };

All regions are optional and differ from device type to type.
Each region can be mmapped only once, otherwise -EEXIST is returned.

The device mmio region is mapped to the hardware mmio space. It is generally
used for doorbells or other notifications to the hardware. It is not fast
enough to serve as a data channel.

The device user share region is used for sharing data buffers between the
user process and the device.


The Uacce register API
----------------------

The register API is defined in uacce.h.

::

        struct uacce_interface {
                char name[UACCE_MAX_NAME_SIZE];
                unsigned int flags;
                const struct uacce_ops *ops;
        };

According to the IOMMU capability, uacce_interface flags can be:

::

        /**
         * UACCE Device flags:
         * UACCE_DEV_SVA: Shared Virtual Addresses
         *                Support PASID
         *                Support device page faults (PCI PRI or SMMU Stall)
         */
        #define UACCE_DEV_SVA           BIT(0)

        struct uacce_device *uacce_alloc(struct device *parent,
                                         struct uacce_interface *interface);
        int uacce_register(struct uacce_device *uacce);
        void uacce_remove(struct uacce_device *uacce);

uacce_alloc results can be:

a. If the uacce module is not compiled, ERR_PTR(-ENODEV)

b. Succeed with the desired flags

c. Succeed with the negotiated flags, for example

  uacce_interface.flags = UACCE_DEV_SVA but uacce->flags = ~UACCE_DEV_SVA

So the user driver needs to check the return value as well as the negotiated
uacce->flags.


The user driver
---------------

The queue file mmap space needs a user driver to wrap the communication
protocol. Uacce provides some attributes in sysfs for the user driver to
match the right accelerator accordingly.
More details are in Documentation/ABI/testing/sysfs-driver-uacce.
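
As an illustration of the data path described above, the following is a minimal user-space sketch of how a user driver might map one of the queue regions. It rests on an assumption that the document does not state: that the region is selected through the mmap offset, expressed in pages (offset = region type * page size). The helper names ``uacce_region_offset`` and ``uacce_map_region`` are hypothetical and are not part of any uacce or WarpDrive API; only the ``enum uacce_qfrt`` values come from the document.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Region types, copied from the enum quoted in the document. */
enum uacce_qfrt {
	UACCE_QFRT_MMIO = 0,
	UACCE_QFRT_DUS = 1,
};

/*
 * ASSUMPTION: the region is selected by the page offset passed to mmap(),
 * i.e. offset = region type * page size. Verify this against the uacce
 * driver in use before relying on it.
 */
static off_t uacce_region_offset(enum uacce_qfrt type, long page_size)
{
	assert(page_size > 0);
	return (off_t)type * (off_t)page_size;
}

/*
 * Hypothetical helper: map one region of an already opened queue fd.
 * Returns MAP_FAILED on error with errno set by mmap(); per the document,
 * mapping the same region a second time is expected to fail with EEXIST.
 */
static void *uacce_map_region(int queue_fd, enum uacce_qfrt type, size_t size)
{
	long page_size = sysconf(_SC_PAGESIZE);

	if (page_size <= 0)
		return MAP_FAILED;

	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		    queue_fd, uacce_region_offset(type, page_size));
}
```

A caller would open the chrdev exported for the accelerator, which yields the queue fd, and pass that fd to ``uacce_map_region`` once per region it needs, checking for ``MAP_FAILED`` each time. The region size is device specific, so it has to come from the user driver's knowledge of the hardware or from the sysfs attributes.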