Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

docs: crypto: convert async-tx-api.txt to ReST format

- Place the txt index inside a comment;
- Use title and chapter markups;
- Adjust markups for numbered list;
- Mark literal blocks as such;
- Use tables markup;
- Adjust indentation when needed.

Acked-By: Vinod Koul <vkoul@kernel.org> # dmaengine
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/98977242130efe86d1200f7a167299d4c1c205c5.1592203650.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
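The list-marker adjustment called out above ("Adjust markups for numbered list") turns the old `1/`-style items into ReST `1.` enumerations, and the literal-block change introduces code samples with `::` plus indentation. A minimal sketch of those two rewrites is below; this is an illustrative reconstruction, not the tool actually used for the patch:

```python
import re

def txt_list_to_rst(text):
    """Rewrite "1/ item" style markers as "1. item" so Sphinx renders
    them as an enumerated list (one of the adjustments this conversion
    makes)."""
    return re.sub(r"^(\s*)(\d+)/(\s)", r"\1\2.\3", text, flags=re.M)

def literal_block(code):
    """Mark a code sample as a ReST literal block: introduce it with
    "::" and indent every line of the sample."""
    body = "\n".join("    " + line for line in code.splitlines())
    return "::\n\n" + body

print(txt_list_to_rst("1/ implicit synchronous path\n2/ cross channel chains"))
```

Running it shows the markers converted in place while indentation and text are preserved.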

Authored by Mauro Carvalho Chehab, committed by Jonathan Corbet
ddc92399 5846551b

+146 -99 total

Documentation/crypto/async-tx-api.txt → Documentation/crypto/async-tx-api.rst (+141 -96)
···
-Asynchronous Transfers/Transforms API
+.. SPDX-License-Identifier: GPL-2.0

-1 INTRODUCTION
+=====================================
+Asynchronous Transfers/Transforms API
+=====================================

-2 GENEALOGY
+.. Contents

-3 USAGE
-3.1 General format of the API
-3.2 Supported operations
-3.3 Descriptor management
-3.4 When does the operation execute?
-3.5 When does the operation complete?
-3.6 Constraints
-3.7 Example
+  1. INTRODUCTION

-4 DMAENGINE DRIVER DEVELOPER NOTES
-4.1 Conformance points
-4.2 "My application needs exclusive control of hardware channels"
+  2 GENEALOGY

-5 SOURCE
+  3 USAGE
+  3.1 General format of the API
+  3.2 Supported operations
+  3.3 Descriptor management
+  3.4 When does the operation execute?
+  3.5 When does the operation complete?
+  3.6 Constraints
+  3.7 Example

----
+  4 DMAENGINE DRIVER DEVELOPER NOTES
+  4.1 Conformance points
+  4.2 "My application needs exclusive control of hardware channels"

-1 INTRODUCTION
+  5 SOURCE
+
+1. Introduction
+===============

 The async_tx API provides methods for describing a chain of asynchronous
 bulk memory transfers/transforms with support for inter-transactional
···
 the API will fit the chain of operations to the available offload
 resources.

-2 GENEALOGY
+2.Genealogy
+===========

 The API was initially designed to offload the memory copy and
 xor-parity-calculations of the md-raid5 driver using the offload engines
···
 on the 'dmaengine' layer developed for offloading memory copies in the
 network stack using Intel(R) I/OAT engines. The following design
 features surfaced as a result:
-1/ implicit synchronous path: users of the API do not need to know if
+
+1. implicit synchronous path: users of the API do not need to know if
   the platform they are running on has offload capabilities. The
   operation will be offloaded when an engine is available and carried out
   in software otherwise.
-2/ cross channel dependency chains: the API allows a chain of dependent
+2. cross channel dependency chains: the API allows a chain of dependent
   operations to be submitted, like xor->copy->xor in the raid5 case. The
   API automatically handles cases where the transition from one operation
   to another implies a hardware channel switch.
-3/ dmaengine extensions to support multiple clients and operation types
+3. dmaengine extensions to support multiple clients and operation types
   beyond 'memcpy'

-3 USAGE
+3. Usage
+========

-3.1 General format of the API:
-struct dma_async_tx_descriptor *
-async_<operation>(<op specific parameters>, struct async_submit ctl *submit)
+3.1 General format of the API
+-----------------------------

-3.2 Supported operations:
-memcpy - memory copy between a source and a destination buffer
-memset - fill a destination buffer with a byte value
-xor - xor a series of source buffers and write the result to a
+::
+
+  struct dma_async_tx_descriptor *
+  async_<operation>(<op specific parameters>, struct async_submit ctl *submit)
+
+3.2 Supported operations
+------------------------
+
+======== ====================================================================
+memcpy   memory copy between a source and a destination buffer
+memset   fill a destination buffer with a byte value
+xor      xor a series of source buffers and write the result to a
          destination buffer
-xor_val - xor a series of source buffers and set a flag if the
+xor_val  xor a series of source buffers and set a flag if the
          result is zero. The implementation attempts to prevent
          writes to memory
-pq - generate the p+q (raid6 syndrome) from a series of source buffers
-pq_val - validate that a p and or q buffer are in sync with a given series of
+pq       generate the p+q (raid6 syndrome) from a series of source buffers
+pq_val   validate that a p and or q buffer are in sync with a given series of
          sources
-datap - (raid6_datap_recov) recover a raid6 data block and the p block
+datap    (raid6_datap_recov) recover a raid6 data block and the p block
          from the given sources
-2data - (raid6_2data_recov) recover 2 raid6 data blocks from the given
+2data    (raid6_2data_recov) recover 2 raid6 data blocks from the given
          sources
+======== ====================================================================

-3.3 Descriptor management:
+3.3 Descriptor management
+-------------------------
+
 The return value is non-NULL and points to a 'descriptor' when the operation
 has been queued to execute asynchronously. Descriptors are recycled
 resources, under control of the offload engine driver, to be reused as
···
 acknowledged by the application before the offload engine driver is allowed to
 recycle (or free) the descriptor. A descriptor can be acked by one of the
 following methods:
-1/ setting the ASYNC_TX_ACK flag if no child operations are to be submitted
-2/ submitting an unacknowledged descriptor as a dependency to another
+
+1. setting the ASYNC_TX_ACK flag if no child operations are to be submitted
+2. submitting an unacknowledged descriptor as a dependency to another
   async_tx call will implicitly set the acknowledged state.
-3/ calling async_tx_ack() on the descriptor.
+3. calling async_tx_ack() on the descriptor.

 3.4 When does the operation execute?
+------------------------------------
+
 Operations do not immediately issue after return from the
 async_<operation> call. Offload engine drivers batch operations to
 improve performance by reducing the number of mmio cycles needed to
···
 mapping.

 3.5 When does the operation complete?
+-------------------------------------
+
 There are two methods for an application to learn about the completion
 of an operation.
-1/ Call dma_wait_for_async_tx(). This call causes the CPU to spin while
+
+1. Call dma_wait_for_async_tx(). This call causes the CPU to spin while
   it polls for the completion of the operation. It handles dependency
   chains and issuing pending operations.
-2/ Specify a completion callback. The callback routine runs in tasklet
+2. Specify a completion callback. The callback routine runs in tasklet
   context if the offload engine driver supports interrupts, or it is
   called in application context if the operation is carried out
   synchronously in software. The callback can be set in the call to
···
 unknown length it can use the async_trigger_callback() routine to set a
 completion interrupt/callback at the end of the chain.

-3.6 Constraints:
-1/ Calls to async_<operation> are not permitted in IRQ context. Other
+3.6 Constraints
+---------------
+
+1. Calls to async_<operation> are not permitted in IRQ context. Other
   contexts are permitted provided constraint #2 is not violated.
-2/ Completion callback routines cannot submit new operations. This
+2. Completion callback routines cannot submit new operations. This
   results in recursion in the synchronous case and spin_locks being
   acquired twice in the asynchronous case.

-3.7 Example:
+3.7 Example
+-----------
+
 Perform a xor->copy->xor operation where each operation depends on the
-result from the previous operation:
+result from the previous operation::

-void callback(void *param)
-{
-        struct completion *cmp = param;
+    void callback(void *param)
+    {
+        struct completion *cmp = param;

-        complete(cmp);
-}
+        complete(cmp);
+    }

-void run_xor_copy_xor(struct page **xor_srcs,
-                      int xor_src_cnt,
-                      struct page *xor_dest,
-                      size_t xor_len,
-                      struct page *copy_src,
-                      struct page *copy_dest,
-                      size_t copy_len)
-{
-        struct dma_async_tx_descriptor *tx;
-        addr_conv_t addr_conv[xor_src_cnt];
-        struct async_submit_ctl submit;
-        addr_conv_t addr_conv[NDISKS];
-        struct completion cmp;
+    void run_xor_copy_xor(struct page **xor_srcs,
+                          int xor_src_cnt,
+                          struct page *xor_dest,
+                          size_t xor_len,
+                          struct page *copy_src,
+                          struct page *copy_dest,
+                          size_t copy_len)
+    {
+        struct dma_async_tx_descriptor *tx;
+        addr_conv_t addr_conv[xor_src_cnt];
+        struct async_submit_ctl submit;
+        addr_conv_t addr_conv[NDISKS];
+        struct completion cmp;

-        init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST, NULL, NULL, NULL,
-                          addr_conv);
-        tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len, &submit)
+        init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST, NULL, NULL, NULL,
+                          addr_conv);
+        tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len, &submit)

-        submit->depend_tx = tx;
-        tx = async_memcpy(copy_dest, copy_src, 0, 0, copy_len, &submit);
+        submit->depend_tx = tx;
+        tx = async_memcpy(copy_dest, copy_src, 0, 0, copy_len, &submit);

-        init_completion(&cmp);
-        init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST | ASYNC_TX_ACK, tx,
-                          callback, &cmp, addr_conv);
-        tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len, &submit);
+        init_completion(&cmp);
+        init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST | ASYNC_TX_ACK, tx,
+                          callback, &cmp, addr_conv);
+        tx = async_xor(xor_dest, xor_srcs, 0, xor_src_cnt, xor_len, &submit);

-        async_tx_issue_pending_all();
+        async_tx_issue_pending_all();

-        wait_for_completion(&cmp);
-}
+        wait_for_completion(&cmp);
+    }

 See include/linux/async_tx.h for more information on the flags. See the
 ops_run_* and ops_complete_* routines in drivers/md/raid5.c for more
 implementation examples.

-4 DRIVER DEVELOPMENT NOTES
+4. Driver Development Notes
+===========================

-4.1 Conformance points:
+4.1 Conformance points
+----------------------
+
 There are a few conformance points required in dmaengine drivers to
 accommodate assumptions made by applications using the async_tx API:
-1/ Completion callbacks are expected to happen in tasklet context
-2/ dma_async_tx_descriptor fields are never manipulated in IRQ context
-3/ Use async_tx_run_dependencies() in the descriptor clean up path to
+
+1. Completion callbacks are expected to happen in tasklet context
+2. dma_async_tx_descriptor fields are never manipulated in IRQ context
+3. Use async_tx_run_dependencies() in the descriptor clean up path to
   handle submission of dependent operations

 4.2 "My application needs exclusive control of hardware channels"
+-----------------------------------------------------------------
+
 Primarily this requirement arises from cases where a DMA engine driver
 is being used to support device-to-memory operations. A channel that is
 performing these operations cannot, for many platform specific reasons,
 be shared. For these cases the dma_request_channel() interface is
 provided.

-The interface is:
-struct dma_chan *dma_request_channel(dma_cap_mask_t mask,
-                                     dma_filter_fn filter_fn,
-                                     void *filter_param);
+The interface is::

-Where dma_filter_fn is defined as:
-typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
+  struct dma_chan *dma_request_channel(dma_cap_mask_t mask,
+                                       dma_filter_fn filter_fn,
+                                       void *filter_param);
+
+Where dma_filter_fn is defined as::
+
+  typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);

 When the optional 'filter_fn' parameter is set to NULL
 dma_request_channel simply returns the first channel that satisfies the
···
 unused "public" channel.

 A couple caveats to note when implementing a driver and consumer:
-1/ Once a channel has been privately allocated it will no longer be
+
+1. Once a channel has been privately allocated it will no longer be
   considered by the general-purpose allocator even after a call to
   dma_release_channel().
-2/ Since capabilities are specified at the device level a dma_device
+2. Since capabilities are specified at the device level a dma_device
   with multiple channels will either have all channels public, or all
   channels private.

-5 SOURCE
+5. Source
+---------

-include/linux/dmaengine.h: core header file for DMA drivers and api users
-drivers/dma/dmaengine.c: offload engine channel management routines
-drivers/dma/: location for offload engine drivers
-include/linux/async_tx.h: core header file for the async_tx api
-crypto/async_tx/async_tx.c: async_tx interface to dmaengine and common code
-crypto/async_tx/async_memcpy.c: copy offload
-crypto/async_tx/async_xor.c: xor and xor zero sum offload
+include/linux/dmaengine.h:
+    core header file for DMA drivers and api users
+drivers/dma/dmaengine.c:
+    offload engine channel management routines
+drivers/dma/:
+    location for offload engine drivers
+include/linux/async_tx.h:
+    core header file for the async_tx api
+crypto/async_tx/async_tx.c:
+    async_tx interface to dmaengine and common code
+crypto/async_tx/async_memcpy.c:
+    copy offload
+crypto/async_tx/async_xor.c:
+    xor and xor zero sum offload
Documentation/crypto/index.rst (+2)

···
    intro
    api-intro
    architecture
+
+   async-tx-api
    asymmetric-keys
    devel-algos
    userspace-if
Documentation/driver-api/dmaengine/client.rst (+1 -1)

···
 Vinod Koul <vinod dot koul at intel.com>

 .. note:: For DMA Engine usage in async_tx please see:
-          ``Documentation/crypto/async-tx-api.txt``
+          ``Documentation/crypto/async-tx-api.rst``


 Below is a guide to device driver writers on how to use the Slave-DMA API of the
Documentation/driver-api/dmaengine/provider.rst (+1 -1)

···
 ensure that it stayed compatible.

 For more information on the Async TX API, please look the relevant
-documentation file in Documentation/crypto/async-tx-api.txt.
+documentation file in Documentation/crypto/async-tx-api.rst.

 DMAEngine APIs
 ==============
MAINTAINERS (+1 -1)

···
 R:	Dan Williams <dan.j.williams@intel.com>
 S:	Odd fixes
 W:	http://sourceforge.net/projects/xscaleiop
-F:	Documentation/crypto/async-tx-api.txt
+F:	Documentation/crypto/async-tx-api.rst
 F:	crypto/async_tx/
 F:	drivers/dma/
 F:	include/linux/async_tx.h