dm mpath: add missing path switching locking

Moving path activation to a workqueue, together with the scsi_dh patches,
introduced a race. The current_pgpath field in the multipath data structure
can be modified whenever any of the paths leading to the LUN changes. If such
a change sets current_pgpath to NULL while activation work is pending,
activate_path dereferences an invalid pointer, which results in the panic below.

This patch fixes that by storing the pgpath to activate in the multipath data
structure and properly protecting it.
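
The hand-off described above can be sketched in user space as follows. This is
only an illustration under stated assumptions, not the dm-mpath code: a pthread
mutex stands in for the m->lock spinlock, and the struct and function names
(mpath, path_entry, request_activation, forget_path, activate_worker) are
hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a pgpath; not the kernel definition. */
struct path_entry {
	int id;
	int activations;	/* times the worker activated this path */
};

/* Hypothetical stand-in for struct multipath. */
struct mpath {
	pthread_mutex_t lock;		/* plays the role of m->lock */
	struct path_entry *current;	/* may change or become NULL at any time */
	struct path_entry *to_activate;	/* hand-off slot, touched only under lock */
};

/* Submitter: record which path the queued work should activate. */
static void request_activation(struct mpath *m, struct path_entry *p)
{
	pthread_mutex_lock(&m->lock);
	m->current = p;
	m->to_activate = p;
	pthread_mutex_unlock(&m->lock);
}

/* Teardown: clear the slot so the worker never sees a path being freed. */
static void forget_path(struct mpath *m, struct path_entry *p)
{
	pthread_mutex_lock(&m->lock);
	if (m->to_activate == p)
		m->to_activate = NULL;
	if (m->current == p)
		m->current = NULL;
	pthread_mutex_unlock(&m->lock);
}

/*
 * Worker: take the pointer out of the slot under the lock, then work on
 * the private copy. Returns 1 if a path was activated, 0 if the slot
 * was empty.
 */
static int activate_worker(struct mpath *m)
{
	struct path_entry *p;

	pthread_mutex_lock(&m->lock);
	p = m->to_activate;
	m->to_activate = NULL;
	pthread_mutex_unlock(&m->lock);

	if (!p)
		return 0;
	p->activations++;	/* placeholder for the real activation call */
	return 1;
}
```

In this sketch the worker never reads the shared "current" pointer, so a
concurrent change to it cannot invalidate the pointer the worker is using.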

Note that if activate_path is called twice in succession with different pgpaths,
and the second call is made before the first one is done, then activate_path
will be called twice for the second pgpath, which is fine.

Unable to handle kernel paging request for data at address 0x00000020
Faulting instruction address: 0xd000000000aa1844
cpu 0x1: Vector: 300 (Data Access) at [c00000006b987a80]
pc: d000000000aa1844: .activate_path+0x30/0x218 [dm_multipath]
lr: c000000000087a2c: .run_workqueue+0x114/0x204
sp: c00000006b987d00
msr: 8000000000009032
dar: 20
dsisr: 40000000
current = 0xc0000000676bb3f0
paca = 0xc0000000006f3680
pid = 2528, comm = kmpath_handlerd
enter ? for help
[c00000006b987da0] c000000000087a2c .run_workqueue+0x114/0x204
[c00000006b987e40] c000000000088b58 .worker_thread+0x120/0x144
[c00000006b987f00] c00000000008ca70 .kthread+0x78/0xc4
[c00000006b987f90] c000000000027cc8 .kernel_thread+0x4c/0x68

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

authored by Chandra Seetharaman and committed by Alasdair G Kergon 7253a334 b01cd5ac

+15 -1
drivers/md/dm-mpath.c
···

 	const char *hw_handler_name;
 	struct work_struct activate_path;
+	struct pgpath *pgpath_to_activate;
 	unsigned nr_priority_groups;
 	struct list_head priority_groups;
 	unsigned pg_init_required;	/* pg_init needs calling? */
···

 static void free_pgpaths(struct list_head *pgpaths, struct dm_target *ti)
 {
+	unsigned long flags;
 	struct pgpath *pgpath, *tmp;
 	struct multipath *m = ti->private;

···
 		if (m->hw_handler_name)
 			scsi_dh_detach(bdev_get_queue(pgpath->path.dev->bdev));
 		dm_put_device(ti, pgpath->path.dev);
+		spin_lock_irqsave(&m->lock, flags);
+		if (m->pgpath_to_activate == pgpath)
+			m->pgpath_to_activate = NULL;
+		spin_unlock_irqrestore(&m->lock, flags);
 		free_pgpath(pgpath);
 	}
 }
···
 		__choose_pgpath(m);

 	pgpath = m->current_pgpath;
+	m->pgpath_to_activate = m->current_pgpath;

 	if ((pgpath && !m->queue_io) ||
 	    (!pgpath && !m->queue_if_no_path))
···
 	int ret;
 	struct multipath *m =
 		container_of(work, struct multipath, activate_path);
-	struct dm_path *path = &m->current_pgpath->path;
+	struct dm_path *path;
+	unsigned long flags;

+	spin_lock_irqsave(&m->lock, flags);
+	path = &m->pgpath_to_activate->path;
+	m->pgpath_to_activate = NULL;
+	spin_unlock_irqrestore(&m->lock, flags);
+	if (!path)
+		return;
 	ret = scsi_dh_activate(bdev_get_queue(path->dev->bdev));
 	pg_init_done(path, ret);
 }