Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

watchdog: Add support for dynamically allocated watchdog_device structs

If a driver's watchdog_device struct is part of a dynamically allocated
struct (which it often will be), merely locking the module is not enough.
Even with a driver's module locked, the driver can be unbound from the device.
Examples:
1) The root user can unbind it through sysfs
2) The i2c bus master driver being unloaded for an i2c watchdog

I will gladly admit that these are corner cases, but we still need to handle
them correctly.

The fix for this consists of 2 parts:
1) Add ref / unref operations, so that the driver can refcount the struct
holding the watchdog_device struct and delay freeing it until any
open filehandles referring to it are closed
2) Most driver operations will do IO on the device and the driver should not
do any IO on the device after it has been unbound. Rather than letting each
driver deal with this internally, it is better to ensure at the watchdog
core level that no operations (other than unref) will get called after
the driver has called watchdog_unregister_device(). This actually is the
bulk of this patch.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

Authored by Hans de Goede and committed by Wim Van Sebroeck
e907df32 f4e9c82f

+86 -2
+27 -1
Documentation/watchdog/watchdog-kernel-api.txt
···
 The Linux WatchDog Timer Driver Core kernel API.
 ===============================================
-Last reviewed: 21-May-2012
+Last reviewed: 22-May-2012

 Wim Van Sebroeck <wim@iguana.be>

···
 	unsigned int (*status)(struct watchdog_device *);
 	int (*set_timeout)(struct watchdog_device *, unsigned int);
 	unsigned int (*get_timeleft)(struct watchdog_device *);
+	void (*ref)(struct watchdog_device *);
+	void (*unref)(struct watchdog_device *);
 	long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long);
 };

···
 driver's operations. This module owner will be used to lock the module when
 the watchdog is active. (This to avoid a system crash when you unload the
 module and /dev/watchdog is still open).
+
+If the watchdog_device struct is dynamically allocated, just locking the module
+is not enough and a driver also needs to define the ref and unref operations to
+ensure the structure holding the watchdog_device does not go away.
+
+The simplest (and usually sufficient) implementation of this is to:
+1) Add a kref struct to the same structure which is holding the watchdog_device
+2) Define a release callback for the kref which frees the struct holding both
+3) Call kref_init on this kref *before* calling watchdog_register_device()
+4) Define a ref operation calling kref_get on this kref
+5) Define a unref operation calling kref_put on this kref
+6) When it is time to cleanup:
+ * Do not kfree() the struct holding both, the last kref_put will do this!
+ * *After* calling watchdog_unregister_device() call kref_put on the kref
+
 Some operations are mandatory and some are optional. The mandatory operations
 are:
 * start: this is a pointer to the routine that starts the watchdog timer
···
   (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
   watchdog's info structure).
 * get_timeleft: this routines returns the time that's left before a reset.
+* ref: the operation that calls kref_get on the kref of a dynamically
+  allocated watchdog_device struct.
+* unref: the operation that calls kref_put on the kref of a dynamically
+  allocated watchdog_device struct.
 * ioctl: if this routine is present then it will be called first before we do
   our own internal ioctl call handling. This routine should return -ENOIOCTLCMD
   if a command is not supported. The parameters that are passed to the ioctl
···
   (This bit should only be used by the WatchDog Timer Driver Core).
 * WDOG_NO_WAY_OUT: this bit stores the nowayout setting for the watchdog.
   If this bit is set then the watchdog timer will not be able to stop.
+* WDOG_UNREGISTERED: this bit gets set by the WatchDog Timer Driver Core
+  after calling watchdog_unregister_device, and then checked before calling
+  any watchdog_ops, so that you can be sure that no operations (other then
+  unref) will get called after unregister, even if userspace still holds a
+  reference to /dev/watchdog

 To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog
 timer device) you can either:
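
The six steps documented above can be sketched as a small user-space model. This is not kernel code: a plain integer stands in for struct kref (the real kref is atomic), the registration calls are reduced to comments, and every foo_* name is hypothetical. It only illustrates the lifetime rule that the struct holding the watchdog_device must survive an unbind while /dev/watchdog is still open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct watchdog_device;

/* Only the two new operations; the real watchdog_ops has more members. */
struct watchdog_ops {
	void (*ref)(struct watchdog_device *);
	void (*unref)(struct watchdog_device *);
};

struct watchdog_device {
	const struct watchdog_ops *ops;
};

/* Step 1: the refcount lives in the same struct as the watchdog_device. */
struct foo_wdt {
	int refcount;			/* stands in for struct kref */
	struct watchdog_device wdd;
};

static int foo_freed;			/* demo hook: counts release calls */

static struct foo_wdt *to_foo(struct watchdog_device *wdd)
{
	return (struct foo_wdt *)((char *)wdd - offsetof(struct foo_wdt, wdd));
}

/* Step 2: the release callback frees the struct holding both. */
static void foo_release(struct foo_wdt *foo)
{
	foo_freed++;
	free(foo);
}

/* Step 4: ref takes a reference (kref_get in a real driver). */
static void foo_ref(struct watchdog_device *wdd)
{
	to_foo(wdd)->refcount++;
}

/* Step 5: unref drops a reference (kref_put); the last drop frees. */
static void foo_unref(struct watchdog_device *wdd)
{
	struct foo_wdt *foo = to_foo(wdd);

	assert(foo->refcount > 0);
	if (--foo->refcount == 0)
		foo_release(foo);
}

static const struct watchdog_ops foo_ops = {
	.ref = foo_ref,
	.unref = foo_unref,
};

/* Step 3: initialise the count before the device would be registered. */
static struct foo_wdt *foo_probe(void)
{
	struct foo_wdt *foo = calloc(1, sizeof(*foo));

	if (!foo)
		return NULL;
	foo->refcount = 1;		/* kref_init in a real driver */
	foo->wdd.ops = &foo_ops;
	/* watchdog_register_device(&foo->wdd) would be called here */
	return foo;
}

/* Step 6: unregister first, then drop the driver's own reference. */
static void foo_remove(struct foo_wdt *foo)
{
	/* watchdog_unregister_device(&foo->wdd) would be called here */
	foo_unref(&foo->wdd);		/* no direct free() here */
}

/* Unbind while the device node is still open; returns 1 on the expected path. */
int foo_demo(void)
{
	struct foo_wdt *foo = foo_probe();
	struct watchdog_device *wdd;
	int freed_while_open;

	if (!foo)
		return -1;
	wdd = &foo->wdd;
	foo_freed = 0;
	wdd->ops->ref(wdd);		/* what the core does on open */
	foo_remove(foo);		/* driver unbound while still open */
	freed_while_open = foo_freed;
	wdd->ops->unref(wdd);		/* what the core does on release */
	return freed_while_open == 0 && foo_freed == 1;
}
```

In this model the struct is freed only by the final unref issued on release, which is the ordering the documentation asks drivers to preserve.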
+54 -1
drivers/watchdog/watchdog_dev.c
···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_ping;
+	}
+
 	if (!watchdog_active(wddev))
 		goto out_ping;

···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_start;
+	}
+
 	if (watchdog_active(wddev))
 		goto out_start;

···
 	int err = 0;

 	mutex_lock(&wddev->lock);
+
+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_stop;
+	}

 	if (!watchdog_active(wddev))
 		goto out_stop;
···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_status;
+	}
+
 	*status = wddev->ops->status(wddev);

+out_status:
 	mutex_unlock(&wddev->lock);
 	return err;
 }
···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_timeout;
+	}
+
 	err = wddev->ops->set_timeout(wddev, timeout);

+out_timeout:
 	mutex_unlock(&wddev->lock);
 	return err;
 }
···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_timeleft;
+	}
+
 	*timeleft = wddev->ops->get_timeleft(wddev);

+out_timeleft:
 	mutex_unlock(&wddev->lock);
 	return err;
 }
···

 	mutex_lock(&wddev->lock);

+	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+		err = -ENODEV;
+		goto out_ioctl;
+	}
+
 	err = wddev->ops->ioctl(wddev, cmd, arg);

+out_ioctl:
 	mutex_unlock(&wddev->lock);
 	return err;
 }
···

 	file->private_data = wdd;

+	if (wdd->ops->ref)
+		wdd->ops->ref(wdd);
+
 	/* dev/watchdog is a virtual (and thus non-seekable) filesystem */
 	return nonseekable_open(inode, file);

···

 	/* If the watchdog was not stopped, send a keepalive ping */
 	if (err < 0) {
-		dev_crit(wdd->dev, "watchdog did not stop!\n");
+		mutex_lock(&wdd->lock);
+		if (!test_bit(WDOG_UNREGISTERED, &wdd->status))
+			dev_crit(wdd->dev, "watchdog did not stop!\n");
+		mutex_unlock(&wdd->lock);
 		watchdog_ping(wdd);
 	}

···

 	/* make sure that /dev/watchdog can be re-opened */
 	clear_bit(WDOG_DEV_OPEN, &wdd->status);
+
+	/* Note wdd may be gone after this, do not use after this! */
+	if (wdd->ops->unref)
+		wdd->ops->unref(wdd);

 	return 0;
 }
···

 int watchdog_dev_unregister(struct watchdog_device *watchdog)
 {
+	mutex_lock(&watchdog->lock);
+	set_bit(WDOG_UNREGISTERED, &watchdog->status);
+	mutex_unlock(&watchdog->lock);
+
 	cdev_del(&watchdog->cdev);
 	if (watchdog->id == 0) {
 		misc_deregister(&watchdog_miscdev);
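
The guard repeated in each hunk above follows one pattern: take the device lock, test WDOG_UNREGISTERED, and return -ENODEV before any driver operation can run. The sketch below models that pattern in user space. It is an illustration only: a pthread mutex stands in for the kernel mutex, the model_* helpers and struct names are hypothetical stand-ins for test_bit/set_bit and the real structs, and the watchdog_active() check of the real ping wrapper is left out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define WDOG_UNREGISTERED 4	/* same bit number as in the patch */

struct wdd_model;

struct wdd_ops_model {
	int (*ping)(struct wdd_model *);
};

struct wdd_model {
	pthread_mutex_t lock;		/* stands in for wddev->lock */
	unsigned long status;		/* status bit field */
	const struct wdd_ops_model *ops;
	int hw_pings;			/* calls that reached the "hardware" */
};

/* Non-atomic stand-ins for the kernel's test_bit() and set_bit(). */
static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

/* Driver operation: would do IO on the device in a real driver. */
static int model_hw_ping(struct wdd_model *w)
{
	w->hw_pings++;
	return 0;
}

/* Core wrapper: refuses to call into the driver once unregistered. */
int model_watchdog_ping(struct wdd_model *w)
{
	int err;

	assert(w && w->ops && w->ops->ping);
	pthread_mutex_lock(&w->lock);

	if (model_test_bit(WDOG_UNREGISTERED, &w->status)) {
		err = -ENODEV;
		goto out_ping;
	}

	err = w->ops->ping(w);

out_ping:
	pthread_mutex_unlock(&w->lock);
	return err;
}

/* Mirrors the flag-setting part of watchdog_dev_unregister(). */
void model_unregister(struct wdd_model *w)
{
	pthread_mutex_lock(&w->lock);
	model_set_bit(WDOG_UNREGISTERED, &w->status);
	pthread_mutex_unlock(&w->lock);
}

/* Returns 1 if a ping works before unregister and is rejected afterwards. */
int model_demo(void)
{
	static const struct wdd_ops_model ops = { .ping = model_hw_ping };
	struct wdd_model w = { .lock = PTHREAD_MUTEX_INITIALIZER, .ops = &ops };
	int before, after;

	before = model_watchdog_ping(&w);
	model_unregister(&w);
	after = model_watchdog_ping(&w);

	return before == 0 && after == -ENODEV && w.hw_pings == 1;
}
```

Setting the bit under the same lock that the wrappers hold means an operation already in flight finishes before unregister proceeds, and no later operation reaches the driver.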
+5
include/linux/watchdog.h
···
  * @status:	The routine that shows the status of the watchdog device.
  * @set_timeout:The routine for setting the watchdog devices timeout value.
  * @get_timeleft:The routine that get's the time that's left before a reset.
+ * @ref:	The ref operation for dyn. allocated watchdog_device structs
+ * @unref:	The unref operation for dyn. allocated watchdog_device structs
  * @ioctl:	The routines that handles extra ioctl calls.
  *
  * The watchdog_ops structure contains a list of low-level operations
···
 	unsigned int (*status)(struct watchdog_device *);
 	int (*set_timeout)(struct watchdog_device *, unsigned int);
 	unsigned int (*get_timeleft)(struct watchdog_device *);
+	void (*ref)(struct watchdog_device *);
+	void (*unref)(struct watchdog_device *);
 	long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long);
 };

···
 #define WDOG_DEV_OPEN		1	/* Opened via /dev/watchdog ? */
 #define WDOG_ALLOW_RELEASE	2	/* Did we receive the magic char ? */
 #define WDOG_NO_WAY_OUT		3	/* Is 'nowayout' feature set ? */
+#define WDOG_UNREGISTERED	4	/* Has the device been unregistered */
 };

 #ifdef CONFIG_WATCHDOG_NOWAYOUT