iwlagn: fix RX skb alignment

So I dug deeper into the DMA problems I had with iwlagn, and a kind soul
helped me by mentioning PCI-E alignment and pointing at the
iwl_rx_allocate function, suggesting I check for buffers crossing 4KB
boundaries. Since the hardware supports 8KB A-MPDUs, crossing a 4KB
boundary didn't seem like something the device would fail on, but when I
looked into the function for a minute anyway I stumbled over this little
gem:

BUG_ON(rxb->dma_addr & (~DMA_BIT_MASK(36) & 0xff));

Clearly, that is a totally bogus check: ~DMA_BIT_MASK(36) has only bits
36-63 set, so ANDing it with 0xff always yields zero, and the condition
can never trigger. One would hope the compiler removes it entirely.
(Think about it.)

After fixing the check, I obviously ran into it triggering: nothing
guarantees the 256-byte alignment the device wants, because of the way
skbs and their headroom are allocated. I won't explain that here, nor
double-check that I'm right; that goes beyond what most of the CC'ed
people care about.

So then I came up with the patch below, and so far my system has
survived minutes with 64K pages, when it would previously fail in
seconds. And I haven't seen a single instance of the TX bug either. But
when you see the patch it'll be pretty obvious to you why.

This should fix the following reported kernel bugs:

http://bugzilla.kernel.org/show_bug.cgi?id=11596
http://bugzilla.kernel.org/show_bug.cgi?id=11393
http://bugzilla.kernel.org/show_bug.cgi?id=11983

I haven't checked if there are any elsewhere, but I suppose RHBZ will
have a few instances too...

I'd like to ask anyone who is CC'ed (those are people I know ran into
the bug) to try this patch.

I am convinced that this patch is correct in spirit, but I haven't
understood why, for example, there are so many unmap calls. I'm not
entirely convinced that this is the only bug leading to the TX reply
errors.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Authored by Johannes Berg, committed by John W. Linville (4018517a, 8e3bad65)

+22 -13

+3 -3
drivers/net/wireless/iwlwifi/iwl-agn.c

@@ -1384,7 +1384,7 @@
 
 		rxq->queue[i] = NULL;
 
-		pci_dma_sync_single_for_cpu(priv->pci_dev, rxb->dma_addr,
+		pci_dma_sync_single_for_cpu(priv->pci_dev, rxb->aligned_dma_addr,
 					    priv->hw_params.rx_buf_size,
 					    PCI_DMA_FROMDEVICE);
 		pkt = (struct iwl_rx_packet *)rxb->skb->data;
@@ -1436,8 +1436,8 @@
 			rxb->skb = NULL;
 		}
 
-		pci_unmap_single(priv->pci_dev, rxb->dma_addr,
-				 priv->hw_params.rx_buf_size,
+		pci_unmap_single(priv->pci_dev, rxb->real_dma_addr,
+				 priv->hw_params.rx_buf_size + 256,
 				 PCI_DMA_FROMDEVICE);
 		spin_lock_irqsave(&rxq->lock, flags);
 		list_add_tail(&rxb->list, &priv->rxq.rx_used);
+2 -1
drivers/net/wireless/iwlwifi/iwl-dev.h

@@ -89,7 +89,8 @@
 #define DEFAULT_LONG_RETRY_LIMIT 4U
 
 struct iwl_rx_mem_buffer {
-	dma_addr_t dma_addr;
+	dma_addr_t real_dma_addr;
+	dma_addr_t aligned_dma_addr;
 	struct sk_buff *skb;
 	struct list_head list;
 };
+17 -9
drivers/net/wireless/iwlwifi/iwl-rx.c

@@ -204,7 +204,7 @@
 		list_del(element);
 
 		/* Point to Rx buffer via next RBD in circular buffer */
-		rxq->bd[rxq->write] = iwl_dma_addr2rbd_ptr(priv, rxb->dma_addr);
+		rxq->bd[rxq->write] = iwl_dma_addr2rbd_ptr(priv, rxb->aligned_dma_addr);
 		rxq->queue[rxq->write] = rxb;
 		rxq->write = (rxq->write + 1) & RX_QUEUE_MASK;
 		rxq->free_count--;
@@ -251,7 +251,7 @@
 		rxb = list_entry(element, struct iwl_rx_mem_buffer, list);
 
 		/* Alloc a new receive buffer */
-		rxb->skb = alloc_skb(priv->hw_params.rx_buf_size,
+		rxb->skb = alloc_skb(priv->hw_params.rx_buf_size + 256,
 				     __GFP_NOWARN | GFP_ATOMIC);
 		if (!rxb->skb) {
 			if (net_ratelimit())
@@ -266,9 +266,17 @@
 		list_del(element);
 
 		/* Get physical address of RB/SKB */
-		rxb->dma_addr =
-		    pci_map_single(priv->pci_dev, rxb->skb->data,
-				   priv->hw_params.rx_buf_size, PCI_DMA_FROMDEVICE);
+		rxb->real_dma_addr = pci_map_single(
+					priv->pci_dev,
+					rxb->skb->data,
+					priv->hw_params.rx_buf_size + 256,
+					PCI_DMA_FROMDEVICE);
+		/* dma address must be no more than 36 bits */
+		BUG_ON(rxb->real_dma_addr & ~DMA_BIT_MASK(36));
+		/* and also 256 byte aligned! */
+		rxb->aligned_dma_addr = ALIGN(rxb->real_dma_addr, 256);
+		skb_reserve(rxb->skb, rxb->aligned_dma_addr - rxb->real_dma_addr);
+
 		list_add_tail(&rxb->list, &rxq->rx_free);
 		rxq->free_count++;
 	}
@@ -300,8 +308,8 @@
 	for (i = 0; i < RX_QUEUE_SIZE + RX_FREE_BUFFERS; i++) {
 		if (rxq->pool[i].skb != NULL) {
 			pci_unmap_single(priv->pci_dev,
-					rxq->pool[i].dma_addr,
-					priv->hw_params.rx_buf_size,
+					rxq->pool[i].real_dma_addr,
+					priv->hw_params.rx_buf_size + 256,
 					PCI_DMA_FROMDEVICE);
 			dev_kfree_skb(rxq->pool[i].skb);
 		}
@@ -354,8 +362,8 @@
 		 * to an SKB, so we need to unmap and free potential storage */
 		if (rxq->pool[i].skb != NULL) {
 			pci_unmap_single(priv->pci_dev,
-					rxq->pool[i].dma_addr,
-					priv->hw_params.rx_buf_size,
+					rxq->pool[i].real_dma_addr,
+					priv->hw_params.rx_buf_size + 256,
 					PCI_DMA_FROMDEVICE);
 			priv->alloc_rxb_skb--;
 			dev_kfree_skb(rxq->pool[i].skb);