Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

RDMA/cm: Fix leaking the multicast GID table reference

If the CM ID is destroyed while the CM event for multicast creation is
still queued, cancel_work_sync() will prevent the work from running,
which also prevents destroying the ah_attr. This leaks a refcount and
triggers a WARN:

GID entry ref leak for dev syz1 index 2 ref=573
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline]
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886

Destroy the ah_attr after canceling the work. It is safe to call
rdma_destroy_ah_attr() twice.

Link: https://patch.msgid.link/r/0-v1-4285d070a6b2+20a-rdma_mc_gid_leak_syz_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: fe454dc31e84 ("RDMA/ucma: Fix use-after-free bug in ucma_create_uevent")
Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68232e7b.050a0220.f2294.09f6.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

 drivers/infiniband/core/cma.c | 3 +++
 1 file changed, 3 insertions(+)
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2009,6 +2009,7 @@
 		ib_sa_free_multicast(mc->sa_mc);
 
 	if (rdma_protocol_roce(id_priv->id.device, id_priv->id.port_num)) {
+		struct rdma_cm_event *event = &mc->iboe_join.event;
 		struct rdma_dev_addr *dev_addr =
 			&id_priv->id.route.addr.dev_addr;
 		struct net_device *ndev = NULL;
@@ -2031,6 +2032,8 @@
 		dev_put(ndev);
 
 		cancel_work_sync(&mc->iboe_join.work);
+		if (event->event == RDMA_CM_EVENT_MULTICAST_JOIN)
+			rdma_destroy_ah_attr(&event->param.ud.ah_attr);
 	}
 	kfree(mc);
 }
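
The leak described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `toy_*` name is hypothetical, the "work queue" is a single flag, and the idempotent release assumes that destroying the ah_attr clears its GID pointer, which is consistent with the message's statement that calling it twice is safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted GID table entry. */
struct toy_gid_entry {
	int refs;
};

/* Hypothetical stand-in for the ah_attr: holds at most one GID reference. */
struct toy_ah_attr {
	struct toy_gid_entry *gid;
};

/* Hypothetical stand-in for the queued multicast join event. */
struct toy_join_work {
	bool pending;		/* queued, handler has not run yet */
	bool is_join_event;	/* models event == RDMA_CM_EVENT_MULTICAST_JOIN */
	struct toy_ah_attr ah_attr;
};

/* Preparing the event takes a GID reference and queues the work. */
static void toy_queue_join(struct toy_join_work *w, struct toy_gid_entry *gid)
{
	gid->refs++;
	w->ah_attr.gid = gid;
	w->is_join_event = true;
	w->pending = true;
}

/* Drops the reference; clearing the pointer makes a second call a no-op. */
static void toy_destroy_ah_attr(struct toy_ah_attr *attr)
{
	if (attr->gid) {
		assert(attr->gid->refs > 0);
		attr->gid->refs--;
		attr->gid = NULL;
	}
}

/* The work handler delivers the event, then releases the ah_attr. */
static void toy_work_handler(struct toy_join_work *w)
{
	w->pending = false;
	toy_destroy_ah_attr(&w->ah_attr);
}

/* Models cancel_work_sync(): pending work is dropped, handler never runs. */
static void toy_cancel_work(struct toy_join_work *w)
{
	w->pending = false;
}

/*
 * Models the teardown path. with_fix selects the behaviour after the
 * patch. Returns the number of GID references still held (the leak).
 */
static int toy_destroy_mc(bool handler_ran_first, bool with_fix)
{
	struct toy_gid_entry gid = { .refs = 0 };
	struct toy_join_work w = { 0 };

	toy_queue_join(&w, &gid);
	if (handler_ran_first)
		toy_work_handler(&w);
	toy_cancel_work(&w);
	if (with_fix && w.is_join_event)
		toy_destroy_ah_attr(&w.ah_attr);
	return gid.refs;
}
```

In this model, `toy_destroy_mc(false, false)` returns 1 (the reference leaks when the work is canceled before it runs), while `toy_destroy_mc(false, true)` and `toy_destroy_mc(true, true)` both return 0, the latter because the second release is a no-op.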