dmaengine: doc: ReSTize pxa_dma doc · tjh.dev/kernel@fbbe0bf

-153

Documentation/dmaengine/pxa_dma.txt

···
-PXA/MMP - DMA Slave controller
-==============================
-
-Constraints
------------
-  a) Transfers hot queuing
-     A driver submitting a transfer and issuing it should be granted the transfer
-     is queued even on a running DMA channel.
-     This implies that the queuing doesn't wait for the previous transfer end,
-     and that the descriptor chaining is not only done in the irq/tasklet code
-     triggered by the end of the transfer.
-     A transfer which is submitted and issued on a phy doesn't wait for a phy to
-     stop and restart, but is submitted on a "running channel". The other
-     drivers, especially mmp_pdma waited for the phy to stop before relaunching
-     a new transfer.
-
-  b) All transfers having asked for confirmation should be signaled
-     Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
-     This implies that even if an irq/tasklet is triggered by end of tx1, but
-     at the time of irq/dma tx2 is already finished, tx1->complete() and
-     tx2->complete() should be called.
-
-  c) Channel running state
-     A driver should be able to query if a channel is running or not. For the
-     multimedia case, such as video capture, if a transfer is submitted and then
-     a check of the DMA channel reports a "stopped channel", the transfer should
-     not be issued until the next "start of frame interrupt", hence the need to
-     know if a channel is in running or stopped state.
-
-  d) Bandwidth guarantee
-     The PXA architecture has 4 levels of DMAs priorities : high, normal, low.
-     The high priorities get twice as much bandwidth as the normal, which get twice
-     as much as the low priorities.
-     A driver should be able to request a priority, especially the real-time
-     ones such as pxa_camera with (big) throughputs.
-
-Design
-------
-  a) Virtual channels
-     Same concept as in sa11x0 driver, ie. a driver was assigned a "virtual
-     channel" linked to the requestor line, and the physical DMA channel is
-     assigned on the fly when the transfer is issued.
-
-  b) Transfer anatomy for a scatter-gather transfer
-     +------------+-----+---------------+----------------+-----------------+
-     | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
-     +------------+-----+---------------+----------------+-----------------+
-
-     This structure is pointed by dma->sg_cpu.
-     The descriptors are used as follows :
-      - desc-sg[i]: i-th descriptor, transferring the i-th sg
-        element to the video buffer scatter gather
-      - status updater
-        Transfers a single u32 to a well known dma coherent memory to leave
-        a trace that this transfer is done. The "well known" is unique per
-        physical channel, meaning that a read of this value will tell which
-        is the last finished transfer at that point in time.
-      - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
-      - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0
-
-  c) Transfers hot-chaining
-     Suppose the running chain is :
-     Buffer 1              Buffer 2
-     +---------+----+---+  +----+----+----+---+
-     | d0 | .. | dN | l |  | d0 | .. | dN | f |
-     +---------+----+-|-+  ^----+----+----+---+
-                      |    |
-                      +----+
-
-     After a call to dmaengine_submit(b3), the chain will look like :
-     Buffer 1              Buffer 2              Buffer 3
-     +---------+----+---+  +----+----+----+---+  +----+----+----+---+
-     | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
-     +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
-                      |    |                |    |
-                      +----+                +----+
-                                           new_link
-
-     If while new_link was created the DMA channel stopped, it is _not_
-     restarted. Hot-chaining doesn't break the assumption that
-     dma_async_issue_pending() is to be used to ensure the transfer is actually started.
-
-     One exception to this rule :
-       - if Buffer1 and Buffer2 had all their addresses 8 bytes aligned
-       - and if Buffer3 has at least one address not 4 bytes aligned
-       - then hot-chaining cannot happen, as the channel must be stopped, the
-         "align bit" must be set, and the channel restarted As a consequence,
-         such a transfer tx_submit() will be queued on the submitted queue, and
-         this specific case if the DMA is already running in aligned mode.
-
-  d) Transfers completion updater
-     Each time a transfer is completed on a channel, an interrupt might be
-     generated or not, up to the client's request. But in each case, the last
-     descriptor of a transfer, the "status updater", will write the latest
-     transfer being completed into the physical channel's completion mark.
-
-     This will speed up residue calculation, for large transfers such as video
-     buffers which hold around 6k descriptors or more. This also allows without
-     any lock to find out what is the latest completed transfer in a running
-     DMA chain.
-
-  e) Transfers completion, irq and tasklet
-     When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
-     is raised. Upon this interrupt, a tasklet is scheduled for the physical
-     channel.
-     The tasklet is responsible for :
-      - reading the physical channel last updater mark
-      - calling all the transfer callbacks of finished transfers, based on
-        that mark, and each transfer flags.
-     If a transfer is completed while this handling is done, a dma irq will
-     be raised, and the tasklet will be scheduled once again, having a new
-     updater mark.
-
-  f) Residue
-     Residue granularity will be descriptor based. The issued but not completed
-     transfers will be scanned for all of their descriptors against the
-     currently running descriptor.
-
-  g) Most complicated case of driver's tx queues
-     The most tricky situation is when :
-       - there are not "acked" transfers (tx0)
-       - a driver submitted an aligned tx1, not chained
-       - a driver submitted an aligned tx2 => tx2 is cold chained to tx1
-       - a driver issued tx1+tx2 => channel is running in aligned mode
-       - a driver submitted an aligned tx3 => tx3 is hot-chained
-       - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
-         not chained
-       - a driver issued tx4 => tx4 is put in issued queue, not chained
-       - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
-         chained
-       - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
-         cold chained to tx5
-
-     This translates into (after tx4 is issued) :
-       - issued queue
-      +-----+ +-----+ +-----+ +-----+
-      | tx1 | | tx2 | | tx3 | | tx4 |
-      +---|-+ ^---|-+ ^-----+ +-----+
-          |   |   |   |
-          +---+   +---+
-       - submitted queue
-      +-----+ +-----+
-      | tx5 | | tx6 |
-      +---|-+ ^-----+
-          |   |
-          +---+
-       - completed queue : empty
-       - allocated queue : tx0
-
-     It should be noted that after tx3 is completed, the channel is stopped, and
-     restarted in "unaligned mode" to handle tx4.
-
-Author: Robert Jarzmik <robert.jarzmik@free.fr>

+10

Documentation/driver-api/dmaengine/index.rst

···
 
    dmatest
 
+PXA DMA documentation
+----------------------
+
+This book adds some notes about PXA DMA
+
+.. toctree::
+   :maxdepth: 1
+
+   pxa_dma
+
 .. only::  subproject
 
    Indices

+190

Documentation/driver-api/dmaengine/pxa_dma.rst

···
+==============================
+PXA/MMP - DMA Slave controller
+==============================
+
+Constraints
+===========
+
+a) Transfers hot queuing
+A driver submitting a transfer and issuing it should be granted the transfer
+is queued even on a running DMA channel.
+This implies that the queuing doesn't wait for the previous transfer end,
+and that the descriptor chaining is not only done in the irq/tasklet code
+triggered by the end of the transfer.
+A transfer which is submitted and issued on a phy doesn't wait for a phy to
+stop and restart, but is submitted on a "running channel". The other
+drivers, especially mmp_pdma waited for the phy to stop before relaunching
+a new transfer.
+
+b) All transfers having asked for confirmation should be signaled
+Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
+This implies that even if an irq/tasklet is triggered by end of tx1, but
+at the time of irq/dma tx2 is already finished, tx1->complete() and
+tx2->complete() should be called.
+
+c) Channel running state
+A driver should be able to query if a channel is running or not. For the
+multimedia case, such as video capture, if a transfer is submitted and then
+a check of the DMA channel reports a "stopped channel", the transfer should
+not be issued until the next "start of frame interrupt", hence the need to
+know if a channel is in running or stopped state.
+
+d) Bandwidth guarantee
+The PXA architecture has 4 levels of DMAs priorities : high, normal, low.
+The high priorities get twice as much bandwidth as the normal, which get twice
+as much as the low priorities.
+A driver should be able to request a priority, especially the real-time
+ones such as pxa_camera with (big) throughputs.
+
+Design
+======
+a) Virtual channels
+Same concept as in sa11x0 driver, ie. a driver was assigned a "virtual
+channel" linked to the requestor line, and the physical DMA channel is
+assigned on the fly when the transfer is issued.
+
+b) Transfer anatomy for a scatter-gather transfer
+
+::
+
+   +------------+-----+---------------+----------------+-----------------+
+   | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
+   +------------+-----+---------------+----------------+-----------------+
+
+This structure is pointed by dma->sg_cpu.
+The descriptors are used as follows :
+
+    - desc-sg[i]: i-th descriptor, transferring the i-th sg
+      element to the video buffer scatter gather
+
+    - status updater
+      Transfers a single u32 to a well known dma coherent memory to leave
+      a trace that this transfer is done. The "well known" is unique per
+      physical channel, meaning that a read of this value will tell which
+      is the last finished transfer at that point in time.
+
+    - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
+
+    - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0
+
+c) Transfers hot-chaining
+Suppose the running chain is:
+
+::
+
+   Buffer 1              Buffer 2
+   +---------+----+---+  +----+----+----+---+
+   | d0 | .. | dN | l |  | d0 | .. | dN | f |
+   +---------+----+-|-+  ^----+----+----+---+
+                    |    |
+                    +----+
+
+After a call to dmaengine_submit(b3), the chain will look like:
+
+::
+
+   Buffer 1              Buffer 2              Buffer 3
+   +---------+----+---+  +----+----+----+---+  +----+----+----+---+
+   | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
+   +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
+                    |    |                |    |
+                    +----+                +----+
+                                         new_link
+
+If while new_link was created the DMA channel stopped, it is _not_
+restarted. Hot-chaining doesn't break the assumption that
+dma_async_issue_pending() is to be used to ensure the transfer is actually started.
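The linker/finisher mechanics of hot-chaining can be illustrated with a small user-space model. This is only a sketch, not driver code: the `Transfer` and `Channel` classes and their methods are hypothetical names invented for the illustration, and the alignment exception is deliberately left out. It shows that submitting a new buffer turns the previous tail from a finisher into a linker, and that submitting never restarts a stopped channel.

```python
from dataclasses import dataclass
from typing import List, Optional

DADDR_STOP = "DADDR_STOP"  # symbolic stand-in for the stop marker named in the doc


@dataclass
class Transfer:
    """One buffer; only its last descriptor (finisher or linker) is modeled."""
    name: str
    next_tx: Optional["Transfer"] = None

    def tail(self):
        """Return (kind, ddadr, dcmd) of the last descriptor."""
        if self.next_tx is None:
            return ("finisher", DADDR_STOP, "ENDIRQEN")
        return ("linker", f"{self.next_tx.name}.desc-sg[0]", "0")


class Channel:
    """Hypothetical model of one descriptor chain on a channel."""

    def __init__(self) -> None:
        self.chain: List[Transfer] = []
        self.running = False

    def submit(self, tx: Transfer) -> None:
        # Hot-chain: the previous finisher becomes a linker to tx (new_link).
        # The running state is left untouched on purpose.
        if self.chain:
            self.chain[-1].next_tx = tx
        self.chain.append(tx)

    def issue_pending(self) -> None:
        # Only an explicit issue guarantees that the channel is started.
        if self.chain:
            self.running = True
```

In this model, calling `submit()` for a third buffer on a chain of two rewrites the tail of the second buffer into a linker, mirroring the `new_link` arrow in the diagram, while `running` only changes through `issue_pending()`.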
+
+One exception to this rule :
+
+- if Buffer1 and Buffer2 had all their addresses 8 bytes aligned
+
+- and if Buffer3 has at least one address not 4 bytes aligned
+
+- then hot-chaining cannot happen, as the channel must be stopped, the
+  "align bit" must be set, and the channel restarted As a consequence,
+  such a transfer tx_submit() will be queued on the submitted queue, and
+  this specific case if the DMA is already running in aligned mode.
+
+d) Transfers completion updater
+Each time a transfer is completed on a channel, an interrupt might be
+generated or not, up to the client's request. But in each case, the last
+descriptor of a transfer, the "status updater", will write the latest
+transfer being completed into the physical channel's completion mark.
+
+This will speed up residue calculation, for large transfers such as video
+buffers which hold around 6k descriptors or more. This also allows without
+any lock to find out what is the latest completed transfer in a running
+DMA chain.
+
+e) Transfers completion, irq and tasklet
+When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
+is raised. Upon this interrupt, a tasklet is scheduled for the physical
+channel.
+
+The tasklet is responsible for :
+
+- reading the physical channel last updater mark
+
+- calling all the transfer callbacks of finished transfers, based on
+  that mark, and each transfer flags.
+
+If a transfer is completed while this handling is done, a dma irq will
+be raised, and the tasklet will be scheduled once again, having a new
+updater mark.
+
+f) Residue
+Residue granularity will be descriptor based. The issued but not completed
+transfers will be scanned for all of their descriptors against the
+currently running descriptor.
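The completion mark, the tasklet pass and the descriptor-granular residue can be sketched together in a few lines. This is a simplified model under stated assumptions, not the driver's implementation: `Tx`, `run_tasklet` and `residue` are hypothetical names, the mark is assumed to be a monotonically increasing cookie (wrap-around is ignored), and the descriptor currently running is assumed to count as entirely remaining.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Tx:
    cookie: int                  # assumed monotonically increasing per channel
    desc_lens: List[int]         # bytes moved by each desc-sg[i]
    want_irq: bool = True        # models the DMA_PREP_INTERRUPT flag
    callback: Optional[Callable[[int], None]] = None


def run_tasklet(issued: List[Tx], mark: int) -> List[Tx]:
    """Complete every issued transfer up to the mark; return those still pending.

    A single pass signals all finished transfers, so tx1 and tx2 both get
    their callback even if only one interrupt was observed.
    """
    remaining = []
    for tx in issued:
        if tx.cookie <= mark:
            if tx.want_irq and tx.callback is not None:
                tx.callback(tx.cookie)
        else:
            remaining.append(tx)
    return remaining


def residue(tx: Tx, running_desc: Optional[int]) -> int:
    """Bytes left in tx, at descriptor granularity.

    running_desc is the index of the descriptor being executed, or None if
    the transfer has not started yet.
    """
    if running_desc is None:
        return sum(tx.desc_lens)
    return sum(tx.desc_lens[running_desc:])
```

With a mark of 2 and three issued transfers carrying cookies 1, 2 and 3, one tasklet run calls the callbacks of the first two and leaves the third pending, which is the behaviour required by constraint b).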
+
+g) Most complicated case of driver's tx queues
+The most tricky situation is when :
+
+ - there are not "acked" transfers (tx0)
+
+ - a driver submitted an aligned tx1, not chained
+
+ - a driver submitted an aligned tx2 => tx2 is cold chained to tx1
+
+ - a driver issued tx1+tx2 => channel is running in aligned mode
+
+ - a driver submitted an aligned tx3 => tx3 is hot-chained
+
+ - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
+   not chained
+
+ - a driver issued tx4 => tx4 is put in issued queue, not chained
+
+ - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
+   chained
+
+ - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
+   cold chained to tx5
+
+This translates into (after tx4 is issued) :
+
+    - issued queue
+
+ ::
+
+      +-----+ +-----+ +-----+ +-----+
+      | tx1 | | tx2 | | tx3 | | tx4 |
+      +---|-+ ^---|-+ ^-----+ +-----+
+          |   |   |   |
+          +---+   +---+
+        - submitted queue
+      +-----+ +-----+
+      | tx5 | | tx6 |
+      +---|-+ ^-----+
+          |   |
+          +---+
+
+    - completed queue : empty
+
+    - allocated queue : tx0
+
+It should be noted that after tx3 is completed, the channel is stopped, and
+restarted in "unaligned mode" to handle tx4.
+
+Author: Robert Jarzmik <robert.jarzmik@free.fr>
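The queue states of case g) can be replayed with a toy model of the submitted and issued queues. The sketch below rests on assumptions that the text does not spell out: hot-chaining is only attempted when the submitted queue is empty and the tail of the issued queue is still the tail of the running hardware chain, and cold-chaining to the last submitted transfer is unconditional. The names `Tx`, `VChan` and `hw_tail` are hypothetical, and the un-acked tx0 in the allocated queue is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tx:
    name: str
    aligned: bool = True
    chained_to: Optional[str] = None   # predecessor whose linker points to us


class VChan:
    """Hypothetical model of the submitted/issued queues of one channel."""

    def __init__(self) -> None:
        self.submitted: List[Tx] = []
        self.issued: List[Tx] = []
        self.running = False
        self.aligned_mode = True
        self.hw_tail: Optional[Tx] = None   # last tx linked into the running chain

    def _can_hot_chain(self, tx: Tx) -> bool:
        return bool(
            self.running
            and self.issued
            and self.issued[-1] is self.hw_tail
            and (tx.aligned or not self.aligned_mode)
        )

    def submit(self, tx: Tx) -> str:
        if not self.submitted and self._can_hot_chain(tx):
            tx.chained_to = self.hw_tail.name
            self.hw_tail = tx
            self.issued.append(tx)
            return "hot-chained"
        if self.submitted:
            tx.chained_to = self.submitted[-1].name
            self.submitted.append(tx)
            return "cold-chained"
        self.submitted.append(tx)
        return "not chained"

    def issue_pending(self) -> None:
        if not self.submitted:
            return
        batch, self.submitted = self.submitted, []
        self.issued.extend(batch)
        if not self.running:
            # Start the channel on the cold chain that was just issued.
            self.running = True
            self.aligned_mode = all(t.aligned for t in batch)
            self.hw_tail = batch[-1]
```

Replaying the steps of the list (submit tx1 and tx2, issue, submit tx3, submit and issue the unaligned tx4, submit tx5 and tx6) leaves tx1 to tx4 in the issued queue with tx4 unchained, and tx5 and tx6 in the submitted queue with tx6 cold-chained to tx5, matching the diagram.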