Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

PM, libnvdimm: Add runtime firmware activation support

Abstract platform specific mechanics for nvdimm firmware activation
behind a handful of generic ops. At the bus level ->activate_state()
indicates the unified state (idle, busy, armed) of all DIMMs on the bus,
and ->capability() indicates the system state expectations for activate.
At the DIMM level ->activate_state() indicates the per-DIMM state,
->activate_result() indicates the outcome of the last activation
attempt, and ->arm() attempts to transition the DIMM from 'idle' to
'armed'.
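
As a rough illustration of the op split described above, the following userspace sketch models a per-DIMM ->arm() and a bus-level ->activate_state() that unifies the per-DIMM states. The enum values are copied from this patch; `struct mock_dimm`, `mock_arm()`, `mock_bus_state()`, and the "busy wins over armed" policy are hypothetical stand-ins, not the kernel or any platform driver implementation.

```c
#include <stddef.h>

/* Enum values mirror the ones this patch adds to include/linux/libnvdimm.h. */
enum nvdimm_fwa_state {
	NVDIMM_FWA_INVALID,
	NVDIMM_FWA_IDLE,
	NVDIMM_FWA_ARMED,
	NVDIMM_FWA_BUSY,
	NVDIMM_FWA_ARM_OVERFLOW,
};

enum nvdimm_fwa_trigger {
	NVDIMM_FWA_ARM,
	NVDIMM_FWA_DISARM,
};

/* Hypothetical stand-in for a DIMM; the real struct nvdimm is private. */
struct mock_dimm {
	enum nvdimm_fwa_state state;
};

/* Hypothetical ->arm(): refuse while busy, otherwise flip idle <-> armed. */
static int mock_arm(struct mock_dimm *dimm, enum nvdimm_fwa_trigger arg)
{
	if (dimm->state == NVDIMM_FWA_BUSY)
		return -1;
	dimm->state = (arg == NVDIMM_FWA_ARM) ? NVDIMM_FWA_ARMED
					      : NVDIMM_FWA_IDLE;
	return 0;
}

/*
 * Hypothetical bus-level ->activate_state(): assumed policy is that any
 * busy DIMM makes the bus busy, else any armed DIMM makes it armed.
 */
static enum nvdimm_fwa_state mock_bus_state(const struct mock_dimm *dimms,
					    size_t n)
{
	enum nvdimm_fwa_state state = NVDIMM_FWA_IDLE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (dimms[i].state == NVDIMM_FWA_BUSY)
			return NVDIMM_FWA_BUSY;
		if (dimms[i].state == NVDIMM_FWA_ARMED)
			state = NVDIMM_FWA_ARMED;
	}
	return state;
}
```

A real provider would derive these states from platform firmware queries instead of a cached field.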

A new hibernate_quiet_exec() facility is added to support firmware
activation in an OS defined system quiesce state. It leverages the fact
that the hibernate-freeze state wants to assert that a memory
hibernation snapshot can be taken. This is in contrast to a platform
firmware defined quiesce state that may forcefully quiet the memory
controller independent of whether an individual device-driver properly
supports hibernate-freeze.

The libnvdimm sysfs interface is extended to support detection of a
firmware activate capability. The mechanism supports enumeration and
triggering of firmware activate, optionally in the
hibernate_quiet_exec() context.

[rafael: hibernate_quiet_exec() proposal]
[vishal: fix up sparse warning, grammar in Documentation/]

Cc: Pavel Machek <pavel@ucw.cz>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>

Authored by Dan Williams and committed by Vishal Verma
48001ea5 5cf81ce1

+500
+2
Documentation/ABI/testing/sysfs-bus-nvdimm
···
+The libnvdimm sub-system implements a common sysfs interface for
+platform nvdimm resources. See Documentation/driver-api/nvdimm/.
+86
Documentation/driver-api/nvdimm/firmware-activate.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+NVDIMM Runtime Firmware Activation
+==================================
+
+Some persistent memory devices run a firmware locally on the device /
+"DIMM" to perform tasks like media management, capacity provisioning,
+and health monitoring. The process of updating that firmware typically
+involves a reboot because it has implications for in-flight memory
+transactions. However, reboots are disruptive and at least the Intel
+persistent memory platform implementation, described by the Intel ACPI
+DSM specification [1], has added support for activating firmware at
+runtime.
+
+A native sysfs interface is implemented in libnvdimm to allow platform
+to advertise and control their local runtime firmware activation
+capability.
+
+The libnvdimm bus object, ndbusX, implements an ndbusX/firmware/activate
+attribute that shows the state of the firmware activation as one of 'idle',
+'armed', 'overflow', and 'busy'.
+
+- idle:
+  No devices are set / armed to activate firmware
+
+- armed:
+  At least one device is armed
+
+- busy:
+  In the busy state armed devices are in the process of transitioning
+  back to idle and completing an activation cycle.
+
+- overflow:
+  If the platform has a concept of incremental work needed to perform
+  the activation it could be the case that too many DIMMs are armed for
+  activation. In that scenario the potential for firmware activation to
+  timeout is indicated by the 'overflow' state.
+
+The 'ndbusX/firmware/activate' property can be written with a value of
+either 'live', or 'quiesce'. A value of 'quiesce' triggers the kernel to
+run firmware activation from within the equivalent of the hibernation
+'freeze' state where drivers and applications are notified to stop their
+modifications of system memory. A value of 'live' attempts
+firmware activation without this hibernation cycle. The
+'ndbusX/firmware/activate' property will be elided completely if no
+firmware activation capability is detected.
+
+Another property 'ndbusX/firmware/capability' indicates a value of
+'live' or 'quiesce', where 'live' indicates that the firmware
+does not require or inflict any quiesce period on the system to update
+firmware. A capability value of 'quiesce' indicates that firmware does
+expect and injects a quiet period for the memory controller, but 'live'
+may still be written to 'ndbusX/firmware/activate' as an override to
+assume the risk of racing firmware update with in-flight device and
+application activity. The 'ndbusX/firmware/capability' property will be
+elided completely if no firmware activation capability is detected.
+
+The libnvdimm memory-device / DIMM object, nmemX, implements
+'nmemX/firmware/activate' and 'nmemX/firmware/result' attributes to
+communicate the per-device firmware activation state. Similar to the
+'ndbusX/firmware/activate' attribute, the 'nmemX/firmware/activate'
+attribute indicates 'idle', 'armed', or 'busy'. The state transitions
+from 'armed' to 'idle' when the system is prepared to activate firmware,
+firmware staged + state set to armed, and 'ndbusX/firmware/activate' is
+triggered. After that activation event the nmemX/firmware/result
+attribute reflects the state of the last activation as one of:
+
+- none:
+  No runtime activation triggered since the last time the device was reset
+
+- success:
+  The last runtime activation completed successfully.
+
+- fail:
+  The last runtime activation failed for device-specific reasons.
+
+- not_staged:
+  The last runtime activation failed due to a sequencing error of the
+  firmware image not being staged.
+
+- need_reset:
+  Runtime firmware activation failed, but the firmware can still be
+  activated via the legacy method of power-cycling the system.
+
+[1]: https://docs.pmem.io/persistent-memory/
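
A management tool consuming the attributes documented above might first map the text it reads from 'ndbusX/firmware/activate' to a state, then decide what to write back. The sketch below is a userspace illustration only: `enum fwa_state`, `fwa_state_from_sysfs()`, and `fwa_pick_trigger()` are hypothetical names, and following the advertised capability (rather than forcing 'live') is an assumed policy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace view of the documented bus states. */
enum fwa_state { FWA_UNKNOWN, FWA_IDLE, FWA_ARMED, FWA_BUSY, FWA_OVERFLOW };

/* Compare the first len bytes of buf against a whole keyword. */
static int tok_eq(const char *buf, size_t len, const char *word)
{
	return strlen(word) == len && strncmp(buf, word, len) == 0;
}

/* Map the text read from ndbusX/firmware/activate to a state. */
static enum fwa_state fwa_state_from_sysfs(const char *buf)
{
	size_t len = strcspn(buf, "\n"); /* ignore the trailing newline */

	if (tok_eq(buf, len, "idle"))
		return FWA_IDLE;
	if (tok_eq(buf, len, "armed"))
		return FWA_ARMED;
	if (tok_eq(buf, len, "busy"))
		return FWA_BUSY;
	if (tok_eq(buf, len, "overflow"))
		return FWA_OVERFLOW;
	return FWA_UNKNOWN;
}

/*
 * Choose the value to write to ndbusX/firmware/activate, or NULL when
 * no DIMM is armed. Assumed policy: write "live" only when the
 * platform advertises the 'live' capability, else "quiesce".
 */
static const char *fwa_pick_trigger(enum fwa_state state,
				    const char *capability)
{
	size_t len = strcspn(capability, "\n");

	if (state != FWA_ARMED && state != FWA_OVERFLOW)
		return NULL;
	return tok_eq(capability, len, "live") ? "live" : "quiesce";
}
```

Checking the state before writing avoids a write that the bus-level activate_store() in this patch would reject with -ENXIO (idle) or -EBUSY (busy).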
+149
drivers/nvdimm/core.c
···
  */
 #include <linux/libnvdimm.h>
 #include <linux/badblocks.h>
+#include <linux/suspend.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/blkdev.h>
···
 	.attrs = nvdimm_bus_attributes,
 };
 
+static ssize_t capability_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
+	enum nvdimm_fwa_capability cap;
+
+	if (!nd_desc->fw_ops)
+		return -EOPNOTSUPP;
+
+	nvdimm_bus_lock(dev);
+	cap = nd_desc->fw_ops->capability(nd_desc);
+	nvdimm_bus_unlock(dev);
+
+	switch (cap) {
+	case NVDIMM_FWA_CAP_QUIESCE:
+		return sprintf(buf, "quiesce\n");
+	case NVDIMM_FWA_CAP_LIVE:
+		return sprintf(buf, "live\n");
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static DEVICE_ATTR_RO(capability);
+
+static ssize_t activate_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
+	enum nvdimm_fwa_capability cap;
+	enum nvdimm_fwa_state state;
+
+	if (!nd_desc->fw_ops)
+		return -EOPNOTSUPP;
+
+	nvdimm_bus_lock(dev);
+	cap = nd_desc->fw_ops->capability(nd_desc);
+	state = nd_desc->fw_ops->activate_state(nd_desc);
+	nvdimm_bus_unlock(dev);
+
+	if (cap < NVDIMM_FWA_CAP_QUIESCE)
+		return -EOPNOTSUPP;
+
+	switch (state) {
+	case NVDIMM_FWA_IDLE:
+		return sprintf(buf, "idle\n");
+	case NVDIMM_FWA_BUSY:
+		return sprintf(buf, "busy\n");
+	case NVDIMM_FWA_ARMED:
+		return sprintf(buf, "armed\n");
+	case NVDIMM_FWA_ARM_OVERFLOW:
+		return sprintf(buf, "overflow\n");
+	default:
+		return -ENXIO;
+	}
+}
+
+static int exec_firmware_activate(void *data)
+{
+	struct nvdimm_bus_descriptor *nd_desc = data;
+
+	return nd_desc->fw_ops->activate(nd_desc);
+}
+
+static ssize_t activate_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
+	enum nvdimm_fwa_state state;
+	bool quiesce;
+	ssize_t rc;
+
+	if (!nd_desc->fw_ops)
+		return -EOPNOTSUPP;
+
+	if (sysfs_streq(buf, "live"))
+		quiesce = false;
+	else if (sysfs_streq(buf, "quiesce"))
+		quiesce = true;
+	else
+		return -EINVAL;
+
+	nvdimm_bus_lock(dev);
+	state = nd_desc->fw_ops->activate_state(nd_desc);
+
+	switch (state) {
+	case NVDIMM_FWA_BUSY:
+		rc = -EBUSY;
+		break;
+	case NVDIMM_FWA_ARMED:
+	case NVDIMM_FWA_ARM_OVERFLOW:
+		if (quiesce)
+			rc = hibernate_quiet_exec(exec_firmware_activate, nd_desc);
+		else
+			rc = nd_desc->fw_ops->activate(nd_desc);
+		break;
+	case NVDIMM_FWA_IDLE:
+	default:
+		rc = -ENXIO;
+	}
+	nvdimm_bus_unlock(dev);
+
+	if (rc == 0)
+		rc = len;
+	return rc;
+}
+
+static DEVICE_ATTR_ADMIN_RW(activate);
+
+static umode_t nvdimm_bus_firmware_visible(struct kobject *kobj, struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, typeof(*dev), kobj);
+	struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
+	enum nvdimm_fwa_capability cap;
+
+	/*
+	 * Both 'activate' and 'capability' disappear when no ops
+	 * detected, or a negative capability is indicated.
+	 */
+	if (!nd_desc->fw_ops)
+		return 0;
+
+	nvdimm_bus_lock(dev);
+	cap = nd_desc->fw_ops->capability(nd_desc);
+	nvdimm_bus_unlock(dev);
+
+	if (cap < NVDIMM_FWA_CAP_QUIESCE)
+		return 0;
+
+	return a->mode;
+}
+static struct attribute *nvdimm_bus_firmware_attributes[] = {
+	&dev_attr_activate.attr,
+	&dev_attr_capability.attr,
+	NULL,
+};
+
+static const struct attribute_group nvdimm_bus_firmware_attribute_group = {
+	.name = "firmware",
+	.attrs = nvdimm_bus_firmware_attributes,
+	.is_visible = nvdimm_bus_firmware_visible,
+};
+
 const struct attribute_group *nvdimm_bus_attribute_groups[] = {
 	&nvdimm_bus_attribute_group,
+	&nvdimm_bus_firmware_attribute_group,
 	NULL,
 };
 
+115
drivers/nvdimm/dimm_devs.c
···
 	.is_visible = nvdimm_visible,
 };
 
+static ssize_t result_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct nvdimm *nvdimm = to_nvdimm(dev);
+	enum nvdimm_fwa_result result;
+
+	if (!nvdimm->fw_ops)
+		return -EOPNOTSUPP;
+
+	nvdimm_bus_lock(dev);
+	result = nvdimm->fw_ops->activate_result(nvdimm);
+	nvdimm_bus_unlock(dev);
+
+	switch (result) {
+	case NVDIMM_FWA_RESULT_NONE:
+		return sprintf(buf, "none\n");
+	case NVDIMM_FWA_RESULT_SUCCESS:
+		return sprintf(buf, "success\n");
+	case NVDIMM_FWA_RESULT_FAIL:
+		return sprintf(buf, "fail\n");
+	case NVDIMM_FWA_RESULT_NOTSTAGED:
+		return sprintf(buf, "not_staged\n");
+	case NVDIMM_FWA_RESULT_NEEDRESET:
+		return sprintf(buf, "need_reset\n");
+	default:
+		return -ENXIO;
+	}
+}
+static DEVICE_ATTR_ADMIN_RO(result);
+
+static ssize_t activate_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct nvdimm *nvdimm = to_nvdimm(dev);
+	enum nvdimm_fwa_state state;
+
+	if (!nvdimm->fw_ops)
+		return -EOPNOTSUPP;
+
+	nvdimm_bus_lock(dev);
+	state = nvdimm->fw_ops->activate_state(nvdimm);
+	nvdimm_bus_unlock(dev);
+
+	switch (state) {
+	case NVDIMM_FWA_IDLE:
+		return sprintf(buf, "idle\n");
+	case NVDIMM_FWA_BUSY:
+		return sprintf(buf, "busy\n");
+	case NVDIMM_FWA_ARMED:
+		return sprintf(buf, "armed\n");
+	default:
+		return -ENXIO;
+	}
+}
+
+static ssize_t activate_store(struct device *dev, struct device_attribute *attr,
+		const char *buf, size_t len)
+{
+	struct nvdimm *nvdimm = to_nvdimm(dev);
+	enum nvdimm_fwa_trigger arg;
+	int rc;
+
+	if (!nvdimm->fw_ops)
+		return -EOPNOTSUPP;
+
+	if (sysfs_streq(buf, "arm"))
+		arg = NVDIMM_FWA_ARM;
+	else if (sysfs_streq(buf, "disarm"))
+		arg = NVDIMM_FWA_DISARM;
+	else
+		return -EINVAL;
+
+	nvdimm_bus_lock(dev);
+	rc = nvdimm->fw_ops->arm(nvdimm, arg);
+	nvdimm_bus_unlock(dev);
+
+	if (rc < 0)
+		return rc;
+	return len;
+}
+static DEVICE_ATTR_ADMIN_RW(activate);
+
+static struct attribute *nvdimm_firmware_attributes[] = {
+	&dev_attr_activate.attr,
+	&dev_attr_result.attr,
+};
+
+static umode_t nvdimm_firmware_visible(struct kobject *kobj, struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, typeof(*dev), kobj);
+	struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
+	struct nvdimm *nvdimm = to_nvdimm(dev);
+	enum nvdimm_fwa_capability cap;
+
+	if (!nd_desc->fw_ops)
+		return 0;
+	if (!nvdimm->fw_ops)
+		return 0;
+
+	nvdimm_bus_lock(dev);
+	cap = nd_desc->fw_ops->capability(nd_desc);
+	nvdimm_bus_unlock(dev);
+
+	if (cap < NVDIMM_FWA_CAP_QUIESCE)
+		return 0;
+
+	return a->mode;
+}
+
+static const struct attribute_group nvdimm_firmware_attribute_group = {
+	.name = "firmware",
+	.attrs = nvdimm_firmware_attributes,
+	.is_visible = nvdimm_firmware_visible,
+};
+
 static const struct attribute_group *nvdimm_attribute_groups[] = {
 	&nd_device_attribute_group,
 	&nvdimm_attribute_group,
+	&nvdimm_firmware_attribute_group,
 	NULL,
 };
 
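
The per-DIMM activate_store() above accepts "arm" or "disarm" through sysfs_streq(), which treats a single trailing newline as equivalent to end-of-string so that `echo arm > .../activate` works. The following userspace sketch reproduces that matching behavior for illustration; `streq_nl()` and `parse_trigger()` are hypothetical names, and the kernel's own helper should be consulted for the authoritative semantics.

```c
#include <stdbool.h>

/* Mirrors the enum this patch adds to include/linux/libnvdimm.h. */
enum nvdimm_fwa_trigger {
	NVDIMM_FWA_ARM,
	NVDIMM_FWA_DISARM,
};

/*
 * Userspace approximation of sysfs_streq(): the strings match if they
 * are identical, or differ only by one trailing newline on either side.
 */
static bool streq_nl(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/* Returns 0 and sets *arg on a recognized keyword, -1 otherwise. */
static int parse_trigger(const char *buf, enum nvdimm_fwa_trigger *arg)
{
	if (streq_nl(buf, "arm"))
		*arg = NVDIMM_FWA_ARM;
	else if (streq_nl(buf, "disarm"))
		*arg = NVDIMM_FWA_DISARM;
	else
		return -1;
	return 0;
}
```

In the kernel code the unrecognized case maps to -EINVAL; the sketch uses -1 to stay free of errno dependencies.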
+1
drivers/nvdimm/nd-core.h
···
 		struct kernfs_node *overwrite_state;
 	} sec;
 	struct delayed_work dwork;
+	const struct nvdimm_fw_ops *fw_ops;
 };
 
 static inline unsigned long nvdimm_security_flags(
+44
include/linux/libnvdimm.h
···
 	int (*flush_probe)(struct nvdimm_bus_descriptor *nd_desc);
 	int (*clear_to_send)(struct nvdimm_bus_descriptor *nd_desc,
 			struct nvdimm *nvdimm, unsigned int cmd, void *data);
+	const struct nvdimm_bus_fw_ops *fw_ops;
 };
 
 struct nd_cmd_desc {
···
 	int (*overwrite)(struct nvdimm *nvdimm,
 			const struct nvdimm_key_data *key_data);
 	int (*query_overwrite)(struct nvdimm *nvdimm);
+};
+
+enum nvdimm_fwa_state {
+	NVDIMM_FWA_INVALID,
+	NVDIMM_FWA_IDLE,
+	NVDIMM_FWA_ARMED,
+	NVDIMM_FWA_BUSY,
+	NVDIMM_FWA_ARM_OVERFLOW,
+};
+
+enum nvdimm_fwa_trigger {
+	NVDIMM_FWA_ARM,
+	NVDIMM_FWA_DISARM,
+};
+
+enum nvdimm_fwa_capability {
+	NVDIMM_FWA_CAP_INVALID,
+	NVDIMM_FWA_CAP_NONE,
+	NVDIMM_FWA_CAP_QUIESCE,
+	NVDIMM_FWA_CAP_LIVE,
+};
+
+enum nvdimm_fwa_result {
+	NVDIMM_FWA_RESULT_INVALID,
+	NVDIMM_FWA_RESULT_NONE,
+	NVDIMM_FWA_RESULT_SUCCESS,
+	NVDIMM_FWA_RESULT_NOTSTAGED,
+	NVDIMM_FWA_RESULT_NEEDRESET,
+	NVDIMM_FWA_RESULT_FAIL,
+};
+
+struct nvdimm_bus_fw_ops {
+	enum nvdimm_fwa_state (*activate_state)
+		(struct nvdimm_bus_descriptor *nd_desc);
+	enum nvdimm_fwa_capability (*capability)
+		(struct nvdimm_bus_descriptor *nd_desc);
+	int (*activate)(struct nvdimm_bus_descriptor *nd_desc);
+};
+
+struct nvdimm_fw_ops {
+	enum nvdimm_fwa_state (*activate_state)(struct nvdimm *nvdimm);
+	enum nvdimm_fwa_result (*activate_result)(struct nvdimm *nvdimm);
+	int (*arm)(struct nvdimm *nvdimm, enum nvdimm_fwa_trigger arg);
 };
 
 void badrange_init(struct badrange *badrange);
+6
include/linux/suspend.h
···
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
 int pfn_is_nosave(unsigned long pfn);
+
+int hibernate_quiet_exec(int (*func)(void *data), void *data);
 #else /* CONFIG_HIBERNATION */
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
···
 static inline int hibernate(void) { return -ENOSYS; }
 static inline bool system_entering_hibernation(void) { return false; }
 static inline bool hibernation_available(void) { return false; }
+
+static inline int hibernate_quiet_exec(int (*func)(void *data), void *data) {
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV
+97
kernel/power/hibernate.c
···
 	return error;
 }
 
+/**
+ * hibernate_quiet_exec - Execute a function with all devices frozen.
+ * @func: Function to execute.
+ * @data: Data pointer to pass to @func.
+ *
+ * Return the @func return value or an error code if it cannot be executed.
+ */
+int hibernate_quiet_exec(int (*func)(void *data), void *data)
+{
+	int error, nr_calls = 0;
+
+	lock_system_sleep();
+
+	if (!hibernate_acquire()) {
+		error = -EBUSY;
+		goto unlock;
+	}
+
+	pm_prepare_console();
+
+	error = __pm_notifier_call_chain(PM_HIBERNATION_PREPARE, -1, &nr_calls);
+	if (error) {
+		nr_calls--;
+		goto exit;
+	}
+
+	error = freeze_processes();
+	if (error)
+		goto exit;
+
+	lock_device_hotplug();
+
+	pm_suspend_clear_flags();
+
+	error = platform_begin(true);
+	if (error)
+		goto thaw;
+
+	error = freeze_kernel_threads();
+	if (error)
+		goto thaw;
+
+	error = dpm_prepare(PMSG_FREEZE);
+	if (error)
+		goto dpm_complete;
+
+	suspend_console();
+
+	error = dpm_suspend(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = dpm_suspend_end(PMSG_FREEZE);
+	if (error)
+		goto dpm_resume;
+
+	error = platform_pre_snapshot(true);
+	if (error)
+		goto skip;
+
+	error = func(data);
+
+skip:
+	platform_finish(true);
+
+	dpm_resume_start(PMSG_THAW);
+
+dpm_resume:
+	dpm_resume(PMSG_THAW);
+
+	resume_console();
+
+dpm_complete:
+	dpm_complete(PMSG_THAW);
+
+	thaw_kernel_threads();
+
+thaw:
+	platform_end(true);
+
+	unlock_device_hotplug();
+
+	thaw_processes();
+
+exit:
+	__pm_notifier_call_chain(PM_POST_HIBERNATION, nr_calls, NULL);
+
+	pm_restore_console();
+
+	hibernate_release();
+
+unlock:
+	unlock_system_sleep();
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(hibernate_quiet_exec);
 
 /**
  * software_resume - Resume from a saved hibernation image.
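
The control flow of hibernate_quiet_exec() is a staged setup in which each failure jumps to a label that undoes only the stages already completed, and the callback runs only once every stage has succeeded. The userspace sketch below reduces that pattern to two stages so the ordering can be observed; `quiet_exec_sketch()`, `struct step_log`, `sample_activate()`, and the `fail_at` knob are all hypothetical and do not correspond to kernel interfaces.

```c
#include <stddef.h>

#define MAX_LOG 16

/* Records the order in which the hypothetical stages ran. */
struct step_log {
	const char *names[MAX_LOG];
	size_t count;
};

static void log_step(struct step_log *log, const char *name)
{
	if (log->count < MAX_LOG)
		log->names[log->count++] = name;
}

/* Example callback: counts invocations through the data pointer. */
static int sample_activate(void *data)
{
	int *calls = data;

	(*calls)++;
	return 0;
}

/*
 * fail_at: 0 = every stage succeeds, 1 = "freeze" fails,
 * 2 = "suspend" fails. Mirrors the shape of the kernel function: a
 * failed freeze skips straight to exit, a failed suspend still runs
 * the resume and thaw steps.
 */
static int quiet_exec_sketch(int (*func)(void *data), void *data,
			     int fail_at, struct step_log *log)
{
	int error;

	log_step(log, "freeze");
	error = (fail_at == 1) ? -1 : 0;
	if (error)
		goto exit;

	log_step(log, "suspend");
	error = (fail_at == 2) ? -1 : 0;
	if (error)
		goto resume;

	log_step(log, "func");
	error = func(data);

resume:
	log_step(log, "resume");
	log_step(log, "thaw");
exit:
	log_step(log, "exit");
	return error;
}
```

As in the kernel function, the return value is either the callback's result or the error from the stage that prevented it from running.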