scsi: sd: use mempool for discard special page

When boxes are run near (or to) OOM, we have a problem with the discard
page allocation in sd. If we fail to allocate the special page, we return
busy, and the request gets retried. But since ordering is honored for
dispatch requests, we can keep retrying this same IO and failing. Behind
that IO could be requests that want to free memory, but they never get
the chance. This means you get repeated spews of traces like this:

[1201401.625972] Call Trace:
[1201401.631748] dump_stack+0x4d/0x65
[1201401.639445] warn_alloc+0xec/0x190
[1201401.647335] __alloc_pages_slowpath+0xe84/0xf30
[1201401.657722] ? get_page_from_freelist+0x11b/0xb10
[1201401.668475] ? __alloc_pages_slowpath+0x2e/0xf30
[1201401.679054] __alloc_pages_nodemask+0x1f9/0x210
[1201401.689424] alloc_pages_current+0x8c/0x110
[1201401.699025] sd_setup_write_same16_cmnd+0x51/0x150
[1201401.709987] sd_init_command+0x49c/0xb70
[1201401.719029] scsi_setup_cmnd+0x9c/0x160
[1201401.727877] scsi_queue_rq+0x4d9/0x610
[1201401.736535] blk_mq_dispatch_rq_list+0x19a/0x360
[1201401.747113] blk_mq_sched_dispatch_requests+0xff/0x190
[1201401.758844] __blk_mq_run_hw_queue+0x95/0xa0
[1201401.768653] blk_mq_run_work_fn+0x2c/0x30
[1201401.777886] process_one_work+0x14b/0x400
[1201401.787119] worker_thread+0x4b/0x470
[1201401.795586] kthread+0x110/0x150
[1201401.803089] ? rescuer_thread+0x320/0x320
[1201401.812322] ? kthread_park+0x90/0x90
[1201401.820787] ? do_syscall_64+0x53/0x150
[1201401.829635] ret_from_fork+0x29/0x40

Ensure that the discard page allocation has a mempool backing, so we
know we can make progress.
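
To illustrate why a reserve guarantees forward progress, here is a minimal
userspace sketch of a mempool-style page pool. It is an analogy, not the
kernel implementation: `struct page_pool`, `pool_init`, `pool_alloc`,
`pool_free`, `pool_destroy`, `POOL_MIN`, `PAGE_BYTES` and the `simulate_oom`
knob are all hypothetical names made up for this example. The assumed
behavior it mirrors is that the pool tries the normal allocator first, falls
back to preallocated pages when that fails, and refills the reserve on free.

```c
/* Userspace sketch of a mempool-style page reserve. Not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096 /* assumed page size for this sketch */
#define POOL_MIN   2    /* hypothetical reserve size */

struct page_pool {
	void *reserve[POOL_MIN]; /* pages held back for low-memory situations */
	size_t nr_reserved;      /* number of reserve slots currently filled */
	bool simulate_oom;       /* test knob: make the normal allocator fail */
};

/* Stand-in for the normal page allocator; fails while OOM is simulated. */
static void *backing_alloc(struct page_pool *pool)
{
	return pool->simulate_oom ? NULL : malloc(PAGE_BYTES);
}

/* Fill the reserve up front, while memory is still available. */
static bool pool_init(struct page_pool *pool)
{
	pool->nr_reserved = 0;
	pool->simulate_oom = false;
	while (pool->nr_reserved < POOL_MIN) {
		void *page = malloc(PAGE_BYTES);
		if (!page) {
			/* Undo the partial fill before reporting failure. */
			while (pool->nr_reserved > 0)
				free(pool->reserve[--pool->nr_reserved]);
			return false;
		}
		pool->reserve[pool->nr_reserved++] = page;
	}
	return true;
}

/*
 * Try the normal allocator first, then fall back to the reserve.
 * Never blocks; returns NULL only when both sources are exhausted,
 * so the caller still needs its NULL check.
 */
static void *pool_alloc(struct page_pool *pool)
{
	void *page = backing_alloc(pool);

	if (!page && pool->nr_reserved > 0)
		page = pool->reserve[--pool->nr_reserved];
	if (page)
		memset(page, 0, PAGE_BYTES); /* reserve pages may hold stale data */
	return page;
}

/* Refill the reserve first; only surplus pages go back to the system. */
static void pool_free(struct page_pool *pool, void *page)
{
	assert(page != NULL);
	if (pool->nr_reserved < POOL_MIN)
		pool->reserve[pool->nr_reserved++] = page;
	else
		free(page);
}

/* Release whatever is still held in the reserve. */
static void pool_destroy(struct page_pool *pool)
{
	while (pool->nr_reserved > 0)
		free(pool->reserve[--pool->nr_reserved]);
}
```

In this sketch, a request that completes and calls `pool_free()` hands its
page straight back to the reserve, so the next allocation can succeed even
while the normal allocator keeps failing. The explicit zeroing stands in for
the `clear_highpage()` call in the patch, on the assumption that pages coming
from a reserve are not zeroed by the allocator.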

Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

authored by Jens Axboe and committed by Martin K. Petersen 61cce6f6 9e6371d3

Changed files
drivers/scsi/sd.c (+19 -4)
···
 
 static struct kmem_cache *sd_cdb_cache;
 static mempool_t *sd_cdb_pool;
+static mempool_t *sd_page_pool;
 
 static const char *sd_cache_types[] = {
 	"write through", "none", "write back",
···
 	unsigned int data_len = 24;
 	char *buf;
 
-	rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+	rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
 	if (!rq->special_vec.bv_page)
 		return BLKPREP_DEFER;
+	clear_highpage(rq->special_vec.bv_page);
 	rq->special_vec.bv_offset = 0;
 	rq->special_vec.bv_len = data_len;
 	rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
···
 	u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
 	u32 data_len = sdp->sector_size;
 
-	rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+	rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
 	if (!rq->special_vec.bv_page)
 		return BLKPREP_DEFER;
+	clear_highpage(rq->special_vec.bv_page);
 	rq->special_vec.bv_offset = 0;
 	rq->special_vec.bv_len = data_len;
 	rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
···
 	u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
 	u32 data_len = sdp->sector_size;
 
-	rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+	rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
 	if (!rq->special_vec.bv_page)
 		return BLKPREP_DEFER;
+	clear_highpage(rq->special_vec.bv_page);
 	rq->special_vec.bv_offset = 0;
 	rq->special_vec.bv_len = data_len;
 	rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
···
 	u8 *cmnd;
 
 	if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
-		__free_page(rq->special_vec.bv_page);
+		mempool_free(rq->special_vec.bv_page, sd_page_pool);
 
 	if (SCpnt->cmnd != scsi_req(rq)->cmd) {
 		cmnd = SCpnt->cmnd;
···
 		goto err_out_cache;
 	}
 
+	sd_page_pool = mempool_create_page_pool(SD_MEMPOOL_SIZE, 0);
+	if (!sd_page_pool) {
+		printk(KERN_ERR "sd: can't init discard page pool\n");
+		err = -ENOMEM;
+		goto err_out_ppool;
+	}
+
 	err = scsi_register_driver(&sd_template.gendrv);
 	if (err)
 		goto err_out_driver;
···
 	return 0;
 
 err_out_driver:
+	mempool_destroy(sd_page_pool);
+
+err_out_ppool:
 	mempool_destroy(sd_cdb_pool);
 
 err_out_cache:
···
 
 	scsi_unregister_driver(&sd_template.gendrv);
 	mempool_destroy(sd_cdb_pool);
+	mempool_destroy(sd_page_pool);
 	kmem_cache_destroy(sd_cdb_cache);
 
 	class_unregister(&sd_disk_class);