Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'netconsole-add-taskname-sysdata-support'

Breno Leitao says:

====================
netconsole: Add taskname sysdata support

This patchset introduces a new feature to the netconsole extradata
subsystem that enables the inclusion of the current task's name in the
sysdata output of netconsole messages.

This enhancement is particularly valuable for large-scale deployments,
such as Meta's, where netconsole collects messages from millions of
servers and stores them in a data warehouse for analysis. Engineers
often rely on these messages to investigate issues and assess kernel
health.

One common challenge we face is determining the context in which
a particular message was generated. By including the task name
(task->comm) with each message, this feature provides a direct answer to
the frequently asked question: "What was running when this message was
generated?"

This added context will significantly improve our ability to diagnose
and troubleshoot issues, making it easier to interpret the output of
netconsole.

The patchset implements the following changes:

* Refactor CPU number formatting into a separate function
* Prefix CPU_NR sysdata feature with SYSDATA_
* Convert a bitwise operation into a boolean
* Add configfs controls for taskname sysdata feature
* Add taskname to extradata entry count
* Add support for including task name in netconsole's extra data output
* Document the task name feature in Documentation/networking/netconsole.rst
* Add test coverage for the task name feature to the existing sysdata selftest script

These changes allow users to enable or disable the task name feature via
configfs and provide additional context for kernel messages by showing
which task generated each console message.

I have tested these patches on some servers and they seem to work as
expected.

v1: https://lore.kernel.org/r/20250221-netcons_current-v1-0-21c86ae8fc0d@debian.org

Signed-off-by: Breno Leitao <leitao@debian.org>
====================

Link: https://patch.msgid.link/20250228-netcons_current-v2-0-f53ff79a0db2@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

+153 -21
+28
Documentation/networking/netconsole.rst
···

 It is recommended to not write user data values with newlines.

+Task name auto population in userdata
+-------------------------------------
+
+Inside the netconsole configfs hierarchy, there is a file called
+`taskname_enabled` under the `userdata` directory. This file is used to enable
+or disable the automatic task name population feature. This feature
+automatically populates the current task name that is scheduled in the CPU
+sending the message.
+
+To enable task name auto-population::
+
+  echo 1 > /sys/kernel/config/netconsole/target1/userdata/taskname_enabled
+
+When this option is enabled, the netconsole messages will include an additional
+line in the userdata field with the format `taskname=<task name>`. This allows
+the receiver of the netconsole messages to easily find which application was
+currently scheduled when that message was generated, providing extra context
+for kernel messages and helping to categorize them.
+
+Example::
+
+  echo "This is a message" > /dev/kmsg
+  12,607,22085407756,-;This is a message
+   taskname=echo
+
+In this example, the message was generated while "echo" was the current
+scheduled process.
+
 CPU number auto population in userdata
 --------------------------------------

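On the receiving side, the `taskname=` line can be pulled out of a captured message with standard tools. The following is a minimal sketch, assuming the capture has the layout shown in the documentation example (a header line followed by whitespace-indented `key=value` lines). `extract_sysdata` is a hypothetical helper name and is not part of netconsole or its selftests.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key=value extradata line.
# Arguments: <key> <file holding one captured netconsole message>
# Assumption: extradata lines are indented with whitespace, as in the
# " taskname=%s\n" format string used by the driver.
extract_sysdata() {
	# Strip the optional indentation and "key=" prefix, print only
	# matching lines, and keep the first match.
	sed -n "s/^[[:space:]]*${1}=//p" "${2}" | head -n 1
}

# Usage (the file name is illustrative):
#   extract_sysdata taskname /tmp/netconsole_capture.txt
```

Keying on the `key=` prefix rather than on line position keeps the helper working when other userdata or sysdata lines are present in the same message.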
+81 -14
drivers/net/netconsole.c
···
  */
 enum sysdata_feature {
 	/* Populate the CPU that sends the message */
-	CPU_NR = BIT(0),
+	SYSDATA_CPU_NR = BIT(0),
+	/* Populate the task name (as in current->comm) in sysdata */
+	SYSDATA_TASKNAME = BIT(1),
 };

 /**
···
 	bool cpu_nr_enabled;

 	mutex_lock(&dynamic_netconsole_mutex);
-	cpu_nr_enabled = !!(nt->sysdata_fields & CPU_NR);
+	cpu_nr_enabled = !!(nt->sysdata_fields & SYSDATA_CPU_NR);
 	mutex_unlock(&dynamic_netconsole_mutex);

 	return sysfs_emit(buf, "%d\n", cpu_nr_enabled);
+}
+
+/* configfs helper to display if taskname sysdata feature is enabled */
+static ssize_t sysdata_taskname_enabled_show(struct config_item *item,
+					     char *buf)
+{
+	struct netconsole_target *nt = to_target(item->ci_parent);
+	bool taskname_enabled;
+
+	mutex_lock(&dynamic_netconsole_mutex);
+	taskname_enabled = !!(nt->sysdata_fields & SYSDATA_TASKNAME);
+	mutex_unlock(&dynamic_netconsole_mutex);
+
+	return sysfs_emit(buf, "%d\n", taskname_enabled);
 }

 /*
···
 	/* Userdata entries */
 	entries = list_count_nodes(&nt->userdata_group.cg_children);
 	/* Plus sysdata entries */
-	if (nt->sysdata_fields & CPU_NR)
+	if (nt->sysdata_fields & SYSDATA_CPU_NR)
+		entries += 1;
+	if (nt->sysdata_fields & SYSDATA_TASKNAME)
 		entries += 1;

 	return entries;
···
 	nt->extradata_complete[nt->userdata_length] = 0;
 }

+static ssize_t sysdata_taskname_enabled_store(struct config_item *item,
+					      const char *buf, size_t count)
+{
+	struct netconsole_target *nt = to_target(item->ci_parent);
+	bool taskname_enabled, curr;
+	ssize_t ret;
+
+	ret = kstrtobool(buf, &taskname_enabled);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dynamic_netconsole_mutex);
+	curr = !!(nt->sysdata_fields & SYSDATA_TASKNAME);
+	if (taskname_enabled == curr)
+		goto unlock_ok;
+
+	if (taskname_enabled &&
+	    count_extradata_entries(nt) >= MAX_EXTRADATA_ITEMS) {
+		ret = -ENOSPC;
+		goto unlock;
+	}
+
+	if (taskname_enabled)
+		nt->sysdata_fields |= SYSDATA_TASKNAME;
+	else
+		disable_sysdata_feature(nt, SYSDATA_TASKNAME);
+
+unlock_ok:
+	ret = strnlen(buf, count);
+unlock:
+	mutex_unlock(&dynamic_netconsole_mutex);
+	return ret;
+}
+
 /* configfs helper to sysdata cpu_nr feature */
 static ssize_t sysdata_cpu_nr_enabled_store(struct config_item *item,
 					    const char *buf, size_t count)
···
 		return ret;

 	mutex_lock(&dynamic_netconsole_mutex);
-	curr = nt->sysdata_fields & CPU_NR;
+	curr = !!(nt->sysdata_fields & SYSDATA_CPU_NR);
 	if (cpu_nr_enabled == curr)
 		/* no change requested */
 		goto unlock_ok;
···
 	}

 	if (cpu_nr_enabled)
-		nt->sysdata_fields |= CPU_NR;
+		nt->sysdata_fields |= SYSDATA_CPU_NR;
 	else
 		/* This is special because extradata_complete might have
 		 * remaining data from previous sysdata, and it needs to be
 		 * cleaned.
 		 */
-		disable_sysdata_feature(nt, CPU_NR);
+		disable_sysdata_feature(nt, SYSDATA_CPU_NR);

 unlock_ok:
 	ret = strnlen(buf, count);
···

 CONFIGFS_ATTR(userdatum_, value);
 CONFIGFS_ATTR(sysdata_, cpu_nr_enabled);
+CONFIGFS_ATTR(sysdata_, taskname_enabled);

 static struct configfs_attribute *userdatum_attrs[] = {
 	&userdatum_attr_value,
···

 static struct configfs_attribute *userdata_attrs[] = {
 	&sysdata_attr_cpu_nr_enabled,
+	&sysdata_attr_taskname_enabled,
 	NULL,
 };

···
 	init_target_config_group(nt, target_name);
 }

+static int append_cpu_nr(struct netconsole_target *nt, int offset)
+{
+	/* Append cpu=%d at extradata_complete after userdata str */
+	return scnprintf(&nt->extradata_complete[offset],
+			 MAX_EXTRADATA_ENTRY_LEN, " cpu=%u\n",
+			 raw_smp_processor_id());
+}
+
+static int append_taskname(struct netconsole_target *nt, int offset)
+{
+	return scnprintf(&nt->extradata_complete[offset],
+			 MAX_EXTRADATA_ENTRY_LEN, " taskname=%s\n",
+			 current->comm);
+}
 /*
  * prepare_extradata - append sysdata at extradata_complete in runtime
  * @nt: target to send message to
  */
 static int prepare_extradata(struct netconsole_target *nt)
 {
-	int sysdata_len, extradata_len;
+	u32 fields = SYSDATA_CPU_NR | SYSDATA_TASKNAME;
+	int extradata_len;

 	/* userdata was appended when configfs write helper was called
 	 * by update_userdata().
 	 */
 	extradata_len = nt->userdata_length;

-	if (!(nt->sysdata_fields & CPU_NR))
+	if (!(nt->sysdata_fields & fields))
 		goto out;

-	/* Append cpu=%d at extradata_complete after userdata str */
-	sysdata_len = scnprintf(&nt->extradata_complete[nt->userdata_length],
-				MAX_EXTRADATA_ENTRY_LEN, " cpu=%u\n",
-				raw_smp_processor_id());
-
-	extradata_len += sysdata_len;
+	if (nt->sysdata_fields & SYSDATA_CPU_NR)
+		extradata_len += append_cpu_nr(nt, extradata_len);
+	if (nt->sysdata_fields & SYSDATA_TASKNAME)
+		extradata_len += append_taskname(nt, extradata_len);

 	WARN_ON_ONCE(extradata_len >
 		     MAX_EXTRADATA_ENTRY_LEN * MAX_EXTRADATA_ITEMS);
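The reworked prepare_extradata() appends one line per enabled feature bit, in a fixed order. The selection logic can be mimicked in user space as a rough illustration. This is only a sketch of how the `sysdata_fields` mask maps to output lines, not the kernel code. `build_sysdata` and its arguments are hypothetical, while the bit values and format strings are taken from the diff.

```shell
#!/bin/sh
# Feature bits, mirroring enum sysdata_feature in netconsole.c.
SYSDATA_CPU_NR=$((1 << 0))
SYSDATA_TASKNAME=$((1 << 1))

# Illustrative only: print the sysdata lines selected by a feature mask.
# Arguments: <fields mask> <cpu number> <task name>
build_sysdata() {
	# Each enabled bit contributes exactly one " key=value" line.
	if [ $(( $1 & SYSDATA_CPU_NR )) -ne 0 ]; then
		printf ' cpu=%u\n' "$2"
	fi
	if [ $(( $1 & SYSDATA_TASKNAME )) -ne 0 ]; then
		printf ' taskname=%s\n' "$3"
	fi
}

# Usage: both features enabled, message sent from CPU 5 by "echo"
#   build_sysdata $((SYSDATA_CPU_NR | SYSDATA_TASKNAME)) 5 echo
```

Because every bit maps to one line, the entry count checked against MAX_EXTRADATA_ITEMS grows by one for each feature that is switched on.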
+44 -7
tools/testing/selftests/drivers/net/netcons_sysdata.sh
···
 	echo 1 > "${NETCONS_PATH}/userdata/cpu_nr_enabled"
 }

+# Enable the taskname to be appended to sysdata
+function set_taskname() {
+	if [[ ! -f "${NETCONS_PATH}/userdata/taskname_enabled" ]]
+	then
+		echo "Not able to enable taskname sysdata append. Configfs not available in ${NETCONS_PATH}/userdata/taskname_enabled" >&2
+		exit "${ksft_skip}"
+	fi
+
+	echo 1 > "${NETCONS_PATH}/userdata/taskname_enabled"
+}
+
 # Disable the sysdata cpu_nr feature
 function unset_cpu_nr() {
 	echo 0 > "${NETCONS_PATH}/userdata/cpu_nr_enabled"
 }

-# Test if MSG content and `cpu=${CPU}` exists in OUTPUT_FILE
-function validate_sysdata_cpu_exists() {
+# Once called, taskname=<..> will not be appended anymore
+function unset_taskname() {
+	echo 0 > "${NETCONS_PATH}/userdata/taskname_enabled"
+}
+
+# Test if MSG contains sysdata
+function validate_sysdata() {
 	# OUTPUT_FILE will contain something like:
 	# 6.11.1-0_fbk0_rc13_509_g30d75cea12f7,13,1822,115075213798,-;netconsole selftest: netcons_gtJHM
 	#  userdatakey=userdatavalue
 	#  cpu=X
+	#  taskname=<taskname>
+
+	# Echo is what this test uses to create the message. See runtest()
+	# function
+	SENDER="echo"

 	if [ ! -f "$OUTPUT_FILE" ]; then
 		echo "FAIL: File was not generated." >&2
···
 		exit "${ksft_fail}"
 	fi

+	if ! grep -q "taskname=${SENDER}" "${OUTPUT_FILE}"; then
+		echo "FAIL: 'taskname=echo' not found in ${OUTPUT_FILE}" >&2
+		cat "${OUTPUT_FILE}" >&2
+		exit "${ksft_fail}"
+	fi
+
 	rm "${OUTPUT_FILE}"
 	pkill_socat
 }

-# Test if MSG content exists in OUTPUT_FILE but no `cpu=` string
-function validate_sysdata_no_cpu() {
+# Test if MSG content exists in OUTPUT_FILE but no `cpu=` and `taskname=`
+# strings
+function validate_no_sysdata() {
 	if [ ! -f "$OUTPUT_FILE" ]; then
 		echo "FAIL: File was not generated." >&2
 		exit "${ksft_fail}"
···

 	if grep -q "cpu=" "${OUTPUT_FILE}"; then
 		echo "FAIL: 'cpu= found in ${OUTPUT_FILE}" >&2
+		cat "${OUTPUT_FILE}" >&2
+		exit "${ksft_fail}"
+	fi
+
+	if grep -q "taskname=" "${OUTPUT_FILE}"; then
+		echo "FAIL: 'taskname= found in ${OUTPUT_FILE}" >&2
 		cat "${OUTPUT_FILE}" >&2
 		exit "${ksft_fail}"
 	fi
···
 MSG="Test #1 from CPU${CPU}"
 # Enable the auto population of cpu_nr
 set_cpu_nr
+# Enable taskname to be appended to sysdata
+set_taskname
 runtest
 # Make sure the message was received in the dst part
 # and exit
-validate_sysdata_cpu_exists
+validate_sysdata

 #====================================================
 # TEST #2
···
 MSG="Test #2 from CPU${CPU}"
 set_user_data
 runtest
-validate_sysdata_cpu_exists
+validate_sysdata

 # ===================================================
 # TEST #3
···
 MSG="Test #3 from CPU${CPU}"
 # Enable the auto population of cpu_nr
 unset_cpu_nr
+unset_taskname
 runtest
 # At this time, cpu= shouldn't be present in the msg
-validate_sysdata_no_cpu
+validate_no_sysdata

 exit "${ksft_pass}"
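The set/unset helpers in the selftest all follow the same pattern: check that the configfs attribute exists, then write 0 or 1 to it. One possible generalization is sketched below, assuming the `<target>/userdata/<feature>_enabled` layout used by the script. `set_sysdata_feature` is a hypothetical name that does not appear in the selftest, and the return value 4 mirrors the kselftest skip code.

```shell
#!/bin/sh
# Hypothetical helper: enable (1) or disable (0) one sysdata feature.
# Arguments: <target configfs dir> <feature, e.g. taskname or cpu_nr> <0|1>
# Returns 4 (the kselftest skip code) when the attribute does not exist,
# e.g. on a kernel without the feature.
set_sysdata_feature() {
	if [ ! -f "${1}/userdata/${2}_enabled" ]; then
		echo "SKIP: ${1}/userdata/${2}_enabled not available" >&2
		return 4
	fi

	echo "${3}" > "${1}/userdata/${2}_enabled"
}

# Usage (requires root and an existing netconsole target):
#   set_sysdata_feature /sys/kernel/config/netconsole/target1 taskname 1
```

Taking the target directory as an argument also lets the helper be exercised against a scratch directory without touching configfs.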