Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Documentation: dmaengine: pxa-dma design

Document the new design of the pxa dma driver.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

Authored by Robert Jarzmik and committed by Vinod Koul
16eea6b4 5ebe6afa

+153
+153
Documentation/dmaengine/pxa_dma.txt
PXA/MMP - DMA Slave controller
==============================

Constraints
-----------
  a) Transfers hot queuing
     A driver submitting a transfer and issuing it should be guaranteed that
     the transfer is queued even on a running DMA channel.
     This implies that the queuing doesn't wait for the previous transfer to
     end, and that the descriptor chaining is not only done in the irq/tasklet
     code triggered by the end of the transfer.
     A transfer which is submitted and issued on a phy doesn't wait for the
     phy to stop and restart, but is submitted on a "running channel". The
     other drivers, especially mmp_pdma, waited for the phy to stop before
     relaunching a new transfer.

  b) All transfers having asked for confirmation should be signaled
     Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
     This implies that even if an irq/tasklet is triggered by the end of tx1,
     but at the time of irq/dma tx2 is already finished, both tx1->complete()
     and tx2->complete() should be called.

  c) Channel running state
     A driver should be able to query whether a channel is running or not. For
     the multimedia case, such as video capture, if a transfer is submitted and
     then a check of the DMA channel reports a "stopped channel", the transfer
     should not be issued until the next "start of frame interrupt", hence the
     need to know if a channel is in running or stopped state.

  d) Bandwidth guarantee
     The PXA architecture has 4 levels of DMA priorities: high, normal, low.
     The high priorities get twice as much bandwidth as the normal, which get
     twice as much as the low priorities.
     A driver should be able to request a priority, especially the real-time
     ones such as pxa_camera with (big) throughputs.

Design
------
  a) Virtual channels
     Same concept as in the sa11x0 driver, i.e.
     a driver is assigned a "virtual channel" linked to the requestor line,
     and the physical DMA channel is assigned on the fly when the transfer is
     issued.

  b) Transfer anatomy for a scatter-gather transfer
     +------------+-----+---------------+----------------+-----------------+
     | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
     +------------+-----+---------------+----------------+-----------------+

     This structure is pointed to by dma->sg_cpu.
     The descriptors are used as follows:
      - desc-sg[i]: i-th descriptor, transferring the i-th sg
        element to the video buffer scatter gather
      - status updater
        Transfers a single u32 to a well known dma coherent memory to leave
        a trace that this transfer is done. The "well known" is unique per
        physical channel, meaning that a read of this value will tell which
        is the last finished transfer at that point in time.
      - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
      - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0

  c) Transfers hot-chaining
     Suppose the running chain is:
           Buffer 1              Buffer 2
     +----+----+----+---+  +----+----+----+---+
     | d0 | .. | dN | l |  | d0 | .. | dN | f |
     +----+----+----+-|-+  ^----+----+----+---+
                      |    |
                      +----+

     After a call to dmaengine_submit(b3), the chain will look like:
           Buffer 1              Buffer 2              Buffer 3
     +----+----+----+---+  +----+----+----+---+  +----+----+----+---+
     | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
     +----+----+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
                      |    |                |    |
                      +----+                +----+
                                           new_link

     If the DMA channel stopped while new_link was being created, it is _not_
     restarted. Hot-chaining doesn't break the assumption that
     dma_async_issue_pending() must be called to ensure the transfer is
     actually started.
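
As an illustration of the chaining step above, here is a minimal sketch in
plain C. It is not the driver's code: the struct layout, the sketch_* names,
the descriptor addresses and the two constant values are assumptions made up
for this example. It only shows the idea that chaining a new transfer turns
the previous transfer's finisher into a linker, while the new transfer keeps
its own finisher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for this sketch only; not taken from the driver. */
#define SKETCH_DDADR_STOP    0x1u
#define SKETCH_DCMD_ENDIRQEN 0x200000u

/* Trailing descriptor of a transfer: either a finisher or a linker. */
struct sketch_tail {
	uint32_t ddadr; /* next descriptor address, or the stop marker */
	uint32_t dcmd;  /* command word */
};

/* Hypothetical, reduced view of one transfer. */
struct sketch_tx {
	uint32_t first_desc;     /* made-up bus address of desc-sg[0] */
	struct sketch_tail tail; /* finisher or linker */
};

/* A freshly prepared transfer ends with a finisher. */
static void sketch_tx_init(struct sketch_tx *tx, uint32_t first_desc)
{
	/* Descriptor addresses must not collide with the stop marker. */
	assert((first_desc & SKETCH_DDADR_STOP) == 0);
	tx->first_desc = first_desc;
	tx->tail.ddadr = SKETCH_DDADR_STOP;
	tx->tail.dcmd = SKETCH_DCMD_ENDIRQEN;
}

/* True while the transfer is the last one of its chain. */
static bool sketch_is_finisher(const struct sketch_tx *tx)
{
	return tx->tail.ddadr == SKETCH_DDADR_STOP;
}

/* Chain next behind prev: prev's finisher becomes a linker (new_link). */
static void sketch_chain(struct sketch_tx *prev, const struct sketch_tx *next)
{
	prev->tail.dcmd = 0;
	prev->tail.ddadr = next->first_desc;
}
```

Nothing in this sketch starts or restarts a channel; in the scheme described
above that remains the job of dma_async_issue_pending().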

     One exception to this rule:
       - if Buffer1 and Buffer2 had all their addresses 8 bytes aligned
       - and if Buffer3 has at least one address not 4 bytes aligned
       - then hot-chaining cannot happen, as the channel must be stopped, the
         "align bit" must be set, and the channel restarted. As a consequence,
         in this specific case where the DMA is already running in aligned
         mode, tx_submit() will queue such a transfer on the submitted queue
         without chaining it.

  d) Transfers completion updater
     Each time a transfer is completed on a channel, an interrupt might be
     generated or not, depending on the client's request. But in each case,
     the last descriptor of a transfer, the "status updater", will write the
     latest transfer being completed into the physical channel's completion
     mark.

     This will speed up residue calculation for large transfers such as video
     buffers, which hold around 6k descriptors or more. It also makes it
     possible to find the latest completed transfer in a running DMA chain
     without taking any lock.

  e) Transfers completion, irq and tasklet
     When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
     is raised. Upon this interrupt, a tasklet is scheduled for the physical
     channel.
     The tasklet is responsible for:
      - reading the physical channel last updater mark
      - calling all the transfer callbacks of finished transfers, based on
        that mark and on each transfer's flags.
     If a transfer is completed while this handling is done, a dma irq will
     be raised, and the tasklet will be scheduled once again, having a new
     updater mark.

  f) Residue
     Residue granularity will be descriptor based. The issued but not completed
     transfers will be scanned for all of their descriptors against the
     currently running descriptor.
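
The completion sweep of e) and the descriptor-based residue of f) can be
sketched as follows. This is an illustrative model, not the driver's
implementation: the sketch_* names and struct fields are hypothetical, the
issued queue is assumed to be an array ordered oldest first with unique
marks, and the currently running descriptor is assumed to count as entirely
remaining.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of one issued transfer. */
struct sketch_tx {
	uint32_t mark;       /* value its status updater writes when it ends */
	bool wants_callback; /* models the DMA_PREP_INTERRUPT flag */
	bool done;           /* set once the sweep has handled the transfer */
	int callbacks_run;   /* how many times its callback was invoked */
};

/*
 * Tasklet-like sweep: given the channel's completion mark, handle every
 * issued transfer up to and including the marked one. Returns the number
 * of callbacks invoked by this call.
 */
static size_t sketch_sweep(struct sketch_tx *issued, size_t n, uint32_t mark)
{
	size_t last = n;
	size_t called = 0;

	/* Locate the last finished transfer from the mark. */
	for (size_t i = 0; i < n; i++)
		if (issued[i].mark == mark)
			last = i;
	if (last == n)
		return 0; /* mark not found: nothing finished yet */

	/* Everything queued before the marked transfer is finished too. */
	for (size_t i = 0; i <= last; i++) {
		if (issued[i].done)
			continue;
		issued[i].done = true;
		if (issued[i].wants_callback) {
			issued[i].callbacks_run++;
			called++;
		}
	}
	return called;
}

/*
 * Descriptor-granular residue: sum the lengths of the running descriptor
 * and of all descriptors after it. running == ndesc means "all finished".
 */
static size_t sketch_residue(const size_t *desc_len, size_t ndesc,
			     size_t running)
{
	size_t residue = 0;

	for (size_t i = running; i < ndesc; i++)
		residue += desc_len[i];
	return residue;
}
```

With this model, a single sweep run after tx2 has already finished invokes
the callbacks of both tx1 and tx2, which is the behaviour asked for in
constraint b).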

  g) Most complicated case of driver's tx queues
     The most tricky situation is when:
      - there are not-yet "acked" transfers (tx0)
      - a driver submitted an aligned tx1, not chained
      - a driver submitted an aligned tx2 => tx2 is cold chained to tx1
      - a driver issued tx1+tx2 => channel is running in aligned mode
      - a driver submitted an aligned tx3 => tx3 is hot-chained
      - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
        not chained
      - a driver issued tx4 => tx4 is put in issued queue, not chained
      - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
        chained
      - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
        cold chained to tx5

     This translates into (after tx4 is issued):
      - issued queue
        +-----+ +-----+ +-----+ +-----+
        | tx1 | | tx2 | | tx3 | | tx4 |
        +---|-+ ^---|-+ ^-----+ +-----+
            |   |   |   |
            +---+   +---+
      - submitted queue
        +-----+ +-----+
        | tx5 | | tx6 |
        +---|-+ ^-----+
            |   |
            +---+
      - completed queue: empty
      - allocated queue: tx0

     It should be noted that after tx3 is completed, the channel is stopped, and
     restarted in "unaligned mode" to handle tx4.

Author: Robert Jarzmik <robert.jarzmik@free.fr>