.. include:: <isonum.txt>

=====================
VFIO Mediated devices
=====================

:Copyright: |copy| 2016, NVIDIA CORPORATION. All rights reserved.
:Author: Neo Jia <cjia@nvidia.com>
:Author: Kirti Wankhede <kwankhede@nvidia.com>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.


Virtual Function I/O (VFIO) Mediated devices[1]
===============================================

The number of use cases for virtualizing DMA devices that do not have built-in
SR-IOV capability is increasing. Previously, to virtualize such devices,
developers had to create their own management interfaces and APIs, and then
integrate them with user space software. To simplify integration with user space
software, we have identified common requirements and defined a unified
management interface for such devices.

The VFIO driver framework provides unified APIs for direct device access. It is
an IOMMU/device-agnostic framework for exposing direct device access to user
space in a secure, IOMMU-protected environment. This framework is used for
multiple devices, such as GPUs, network adapters, and compute accelerators. With
direct device access, virtual machines or user space applications have direct
access to the physical device. This framework is reused for mediated devices.

The mediated core driver provides a common interface for mediated device
management that can be used by drivers of different devices. This module
provides a generic interface to perform these operations:

* Create and destroy a mediated device
* Add a mediated device to and remove it from a mediated bus driver
* Add a mediated device to and remove it from an IOMMU group

The mediated core driver also provides an interface to register a bus driver.
For example, the mediated VFIO mdev driver is designed for mediated devices and
supports VFIO APIs. The mediated bus driver adds a mediated device to and
removes it from a VFIO group.

The following high-level block diagram shows the main components and interfaces
in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and IBM
devices as examples, because they were the first devices to use this module::

     +---------------+
     |               |
     | +-----------+ |  mdev_register_driver() +--------------+
     | |           | +<------------------------+              |
     | |  mdev     | |                         |              |
     | |  bus      | +------------------------>+ vfio_mdev.ko |<-> VFIO user
     | |  driver   | |     probe()/remove()    |              |    APIs
     | |           | |                         +--------------+
     | +-----------+ |
     |               |
     |  MDEV CORE    |
     |   MODULE      |
     |   mdev.ko     |
     | +-----------+ |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         |  nvidia.ko   |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | | Physical  | |
     | |  device   | |  mdev_register_device() +--------------+
     | | interface | |<------------------------+              |
     | |           | |                         |  i915.ko     |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | |           | |
     | |           | |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         | ccw_device.ko|<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callbacks        +--------------+
     | +-----------+ |
     +---------------+


Registration Interfaces
=======================

The mediated core driver provides the following types of registration
interfaces:

* Registration interface for a mediated bus driver
* Physical device driver interface

Registration Interface for a Mediated Bus Driver
------------------------------------------------

The registration interface for a mediated bus driver provides the following
structure to represent a mediated device's driver::

     /*
      * struct mdev_driver [2] - Mediated device's driver
      * @probe: called when new device created
      * @remove: called when device removed
      * @driver: device driver structure
      */
     struct mdev_driver {
             int  (*probe)  (struct mdev_device *dev);
             void (*remove) (struct mdev_device *dev);
             struct device_driver    driver;
     };

A mediated bus driver for mdev should use this structure in the function calls
to register and unregister itself with the core driver:

* Register::

    extern int  mdev_register_driver(struct mdev_driver *drv);

* Unregister::

    extern void mdev_unregister_driver(struct mdev_driver *drv);

The mediated bus driver is responsible for adding mediated devices to the VFIO
group when devices are bound to the driver and removing mediated devices from
the VFIO group when devices are unbound from the driver.


Physical Device Driver Interface
--------------------------------

The physical device driver interface provides the mdev_parent_ops[3] structure
to define the APIs to manage work in the mediated core driver that is related
to the physical device.

The structures in the mdev_parent_ops structure are as follows:

* dev_attr_groups: attributes of the parent device
* mdev_attr_groups: attributes of the mediated device
* supported_config: attributes to define supported configurations
* device_driver: device driver to bind for mediated device instances

The mdev_parent_ops structure also still has various function pointers. These
exist for historical reasons only and shall not be used for new drivers.

When a driver wants to add the GUID creation sysfs interface to an existing
device that it has probed, it should call::

    extern int  mdev_register_device(struct device *dev,
                                     const struct mdev_parent_ops *ops);

This call provides the 'mdev_supported_types/XX/create' files, which can then be
used to trigger the creation of an mdev_device. The created mdev_device will be
attached to the specified driver.

When the driver needs to remove itself, it calls::

    extern void mdev_unregister_device(struct device *dev);

This call unbinds and destroys all the created mdevs and removes the sysfs
files.

Mediated Device Management Interface Through sysfs
==================================================

The management interface through sysfs enables user space software, such as
libvirt, to query and configure mediated devices in a hardware-agnostic fashion.
This management interface provides flexibility to the underlying physical
device's driver to support features such as:

* Mediated device hot plug
* Multiple mediated devices in a single virtual machine
* Multiple mediated devices from different physical devices

Links in the mdev_bus Class Directory
-------------------------------------

The /sys/class/mdev_bus/ directory contains links to devices that are registered
with the mdev core driver.
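
As a rough illustration, the links in this directory can be listed from a
shell. The sketch below is not part of the kernel tree: the function name
list_mdev_parents is hypothetical, and the optional directory argument is an
assumption added so that the function can be pointed at a different location,
for example a test fixture.

```shell
# List the parent devices registered with the mdev core driver.
# $1: class directory to scan (defaults to /sys/class/mdev_bus).
list_mdev_parents() {
    class_dir=${1:-/sys/class/mdev_bus}
    [ -d "$class_dir" ] || return 1
    for link in "$class_dir"/*; do
        # Skip the unexpanded glob pattern when the directory is empty.
        [ -e "$link" ] || continue
        # Print the link name and the device directory it resolves to.
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Calling list_mdev_parents without arguments scans /sys/class/mdev_bus/ and
prints one line per registered parent device. The function returns a non-zero
status if the directory does not exist, which is typically the case when the
mdev core module is not loaded.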

Directories and Files Under the sysfs for Each Physical Device
--------------------------------------------------------------

::

  |- [parent physical device]
  |--- Vendor-specific-attributes [optional]
  |--- [mdev_supported_types]
  |     |--- [<type-id>]
  |     |   |--- create
  |     |   |--- name
  |     |   |--- available_instances
  |     |   |--- device_api
  |     |   |--- description
  |     |   |--- [devices]
  |     |--- [<type-id>]
  |     |   |--- create
  |     |   |--- name
  |     |   |--- available_instances
  |     |   |--- device_api
  |     |   |--- description
  |     |   |--- [devices]
  |     |--- [<type-id>]
  |          |--- create
  |          |--- name
  |          |--- available_instances
  |          |--- device_api
  |          |--- description
  |          |--- [devices]

* [mdev_supported_types]

  The list of currently supported mediated device types and their details.

  [<type-id>], device_api, and available_instances are mandatory attributes
  that should be provided by the vendor driver.

* [<type-id>]

  The [<type-id>] name is created by adding the device driver string as a prefix
  to the string provided by the vendor driver. The format of this name is as
  follows::

        sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);

  (or using mdev_parent_dev(mdev) to arrive at the parent device outside
  of the core mdev code)

* device_api

  This attribute should show which device API is being created, for example,
  "vfio-pci" for a PCI device.

* available_instances

  This attribute should show the number of devices of type <type-id> that can be
  created.

* [devices]

  This directory contains links to the devices of type <type-id> that have been
  created.

* name

  This attribute should show a human-readable name. This attribute is optional.

* description

  This attribute should show a brief description of the type and its features.
  This attribute is optional.
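
The attributes described above can be read with ordinary file operations. The
following shell function is a minimal sketch of how user space software might
summarize the supported types of one parent device. The function name
list_mdev_types and its output format are assumptions made for this example;
only the directory layout and attribute names come from the description above.

```shell
# Print one line per supported mdev type of a parent device, in the form:
#   <type-id> <device_api> <available_instances>
# $1: sysfs directory of the parent physical device.
list_mdev_types() {
    types_dir=$1/mdev_supported_types
    [ -d "$types_dir" ] || return 1
    for type_dir in "$types_dir"/*; do
        # Skip the unexpanded glob pattern and any stray non-directory entry.
        [ -d "$type_dir" ] || continue
        # device_api and available_instances are mandatory, so a failed
        # read is treated as an error.
        api=$(cat "$type_dir/device_api") || return 1
        avail=$(cat "$type_dir/available_instances") || return 1
        printf '%s %s %s\n' "${type_dir##*/}" "$api" "$avail"
    done
}
```

The optional name and description attributes are deliberately left out, because
a vendor driver is not required to provide them.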

Directories and Files Under the sysfs for Each mdev Device
----------------------------------------------------------

::

  |- [parent phy device]
  |--- [$MDEV_UUID]
         |--- remove
         |--- mdev_type {link to its type}
         |--- vendor-specific-attributes [optional]

* remove (write only)

Writing '1' to the 'remove' file destroys the mdev device. The vendor driver can
fail the remove() callback if that device is active and the vendor driver
doesn't support hot unplug.

Example::

        # echo 1 > /sys/bus/mdev/devices/$MDEV_UUID/remove
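
Because the vendor driver can reject the request, a script should check the
result of the write. The function below is a small sketch under that
assumption. The name mdev_remove and the optional second argument, which
overrides the devices directory, are hypothetical and are not provided by the
kernel.

```shell
# Destroy a mediated device by writing 1 to its 'remove' attribute.
# $1: UUID of the mdev device.
# $2: devices directory (defaults to /sys/bus/mdev/devices).
mdev_remove() {
    remove_attr=${2:-/sys/bus/mdev/devices}/$1/remove
    if [ ! -e "$remove_attr" ]; then
        echo "mdev_remove: no such device: $1" >&2
        return 1
    fi
    # The write fails, and so does this function, if the vendor driver
    # refuses to remove the device.
    echo 1 > "$remove_attr"
}
```

A caller can then react to a failed removal, for example with
mdev_remove "$MDEV_UUID" || echo "device is still in use".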

Mediated Device Hot Plug
------------------------

Mediated devices can be created and assigned at runtime. The procedure to hot
plug a mediated device is the same as the procedure to hot plug a PCI device.

Translation APIs for Mediated Devices
=====================================

The following APIs are provided for translating user pfn to host pfn in a VFIO
driver::

        extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
                                  int npage, int prot, unsigned long *phys_pfn);

        extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
                                    int npage);

These functions call back into the back-end IOMMU module by using the pin_pages
and unpin_pages callbacks of the struct vfio_iommu_driver_ops[4]. Currently
these callbacks are supported in the TYPE1 IOMMU module. To enable them for
other IOMMU backend modules, such as the PPC64 sPAPR module, those modules need
to provide these two callback functions.

Using the Sample Code
=====================

mtty.c in the samples/vfio-mdev/ directory is a sample driver program that
demonstrates how to use the mediated device framework.

The sample driver creates an mdev device that simulates a serial port over a PCI
card.

1. Build and load the mtty.ko module.

   This step creates a dummy device, /sys/devices/virtual/mtty/mtty/

   Files in this device directory in sysfs are similar to the following::

     # tree /sys/devices/virtual/mtty/mtty/
     /sys/devices/virtual/mtty/mtty/
     |-- mdev_supported_types
     |   |-- mtty-1
     |   |   |-- available_instances
     |   |   |-- create
     |   |   |-- device_api
     |   |   |-- devices
     |   |   `-- name
     |   `-- mtty-2
     |       |-- available_instances
     |       |-- create
     |       |-- device_api
     |       |-- devices
     |       `-- name
     |-- mtty_dev
     |   `-- sample_mtty_dev
     |-- power
     |   |-- autosuspend_delay_ms
     |   |-- control
     |   |-- runtime_active_time
     |   |-- runtime_status
     |   `-- runtime_suspended_time
     |-- subsystem -> ../../../../class/mtty
     `-- uevent

2. Create a mediated device by using the dummy device that you created in the
   previous step::

     # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
              /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create

3. Add parameters to qemu-kvm::

     -device vfio-pci,\
      sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001

4. Boot the VM.

   In the Linux guest VM, with no hardware on the host, the device appears
   as follows::

     # lspci -s 00:05.0 -xxvv
     00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
             Subsystem: Device 4348:3253
             Physical Slot: 5
             Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
                      Stepping- SERR- FastB2B- DisINTx-
             Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
                     <TAbort- <MAbort- >SERR- <PERR- INTx-
             Interrupt: pin A routed to IRQ 10
             Region 0: I/O ports at c150 [size=8]
             Region 1: I/O ports at c158 [size=8]
             Kernel driver in use: serial
     00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
     10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
     20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
     30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00

   In the Linux guest VM, dmesg output for the device is as follows::

     serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
     0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
     0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A


5. In the Linux guest VM, check the serial ports::

     # setserial -g /dev/ttyS*
     /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
     /dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
     /dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10

6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
   /dev/ttyS2 with hardware flow control disabled.

7. Type data on the minicom terminal or send data to the terminal emulation
   program and read the data.

   Data is looped back by the host's mtty driver.

8. Destroy the mediated device that you created::

     # echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
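
Steps 2 and 3 above can also be scripted. The following sketch wraps them in
two shell functions. Both function names are hypothetical. Reading a random
UUID from /proc/sys/kernel/random/uuid when none is given is a convenience
assumed for this example and is not something the sample driver requires.

```shell
# Create a mediated device of the given type and print its UUID.
# $1: type directory, for example
#     /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2
# $2: UUID to use (optional; a random one is generated if omitted).
mdev_create() {
    create_attr=$1/create
    uuid=${2:-$(cat /proc/sys/kernel/random/uuid)} || return 1
    [ -e "$create_attr" ] || return 1
    # Writing the UUID to 'create' asks the mdev core to create the device.
    echo "$uuid" > "$create_attr" || return 1
    echo "$uuid"
}

# Print the QEMU option that assigns the mdev device with UUID $1 to a guest.
mdev_qemu_arg() {
    printf '%s%s\n' '-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/' "$1"
}
```

With the mtty module loaded, a script could capture the UUID printed by
mdev_create and pass it to mdev_qemu_arg to build the qemu-kvm command line.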

References
==========

1. See Documentation/driver-api/vfio.rst for more information on VFIO.
2. struct mdev_driver in include/linux/mdev.h
3. struct mdev_parent_ops in include/linux/mdev.h
4. struct vfio_iommu_driver_ops in include/linux/vfio.h