Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

EDAC, mce_amd: Print IPID and Syndrome on a separate line

Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
, Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
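
To see why this example has a Syndrome but no Address, the status value from the log above can be checked against the two valid bits. The following is a minimal userspace sketch, not kernel code: the bit positions mirror the kernel's MCI_STATUS_ADDRV and MCI_STATUS_SYNDV definitions, while the helper names addr_valid() and synd_valid() are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Valid bits in MCA_STATUS, mirroring the kernel's definitions. */
#define MCI_STATUS_ADDRV (1ULL << 58) /* MCA_ADDR holds a valid address */
#define MCI_STATUS_SYNDV (1ULL << 53) /* MCA_SYND holds a valid syndrome */

/* Hypothetical helper: returns 1 if the address register is valid. */
static int addr_valid(uint64_t status)
{
	return (status & MCI_STATUS_ADDRV) != 0;
}

/* Hypothetical helper: returns 1 if the syndrome register is valid. */
static int synd_valid(uint64_t status)
{
	return (status & MCI_STATUS_SYNDV) != 0;
}
```

For the logged status 0xd82000000002080b, addr_valid() returns 0 and synd_valid() returns 1, which matches the "SyndV" flag in the decoded MC22_STATUS field list.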

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
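
The new ordering can be sketched in userspace as follows. This is only an approximation under stated assumptions: struct mce_sketch and format_smca_lines() are hypothetical names, snprintf() stands in for pr_emerg()/pr_cont(), and log levels are not modeled. It shows the Address on its own line when valid, then the IPID always, then the Syndrome appended to the IPID line when valid.

```c
#include <stdint.h>
#include <stdio.h>

#define MCI_STATUS_ADDRV (1ULL << 58) /* address register valid */
#define MCI_STATUS_SYNDV (1ULL << 53) /* syndrome register valid */
#define HW_ERR "[Hardware Error]: "

/* Hypothetical stand-in for struct mce, reduced to the fields used here. */
struct mce_sketch {
	uint64_t status;
	uint64_t addr;
	uint64_t ipid;
	uint64_t synd;
};

/*
 * Formats the Address and IPID/Syndrome lines for an SMCA system into buf.
 * Returns the snprintf()-style length of the formatted text.
 */
static int format_smca_lines(const struct mce_sketch *m, char *buf, size_t len)
{
	char addr_line[64] = "";
	char synd_part[64] = "";

	/* The Address line is now terminated on its own. */
	if (m->status & MCI_STATUS_ADDRV)
		snprintf(addr_line, sizeof(addr_line),
			 HW_ERR "Error Addr: 0x%016llx\n",
			 (unsigned long long)m->addr);

	/* The Syndrome is optional and follows the IPID on the same line. */
	if (m->status & MCI_STATUS_SYNDV)
		snprintf(synd_part, sizeof(synd_part),
			 ", Syndrome: 0x%016llx",
			 (unsigned long long)m->synd);

	/* The IPID starts a new line and is always present. */
	return snprintf(buf, len, "%s" HW_ERR "IPID: 0x%016llx%s\n",
			addr_line, (unsigned long long)m->ipid, synd_part);
}
```

Filling the struct with the status, IPID and Syndrome values from the example above yields the single "IPID: ..., Syndrome: ..." line shown, with no "Error Addr" line because the AddrV bit is clear.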

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>

Authored by Yazen Ghannam and committed by Borislav Petkov
75bf2f64 e62d2ca9

+4 -5
drivers/edac/mce_amd.c
@@ -991,20 +991,19 @@
 	pr_cont("]: 0x%016llx\n", m->status);
 
 	if (m->status & MCI_STATUS_ADDRV)
-		pr_emerg(HW_ERR "Error Addr: 0x%016llx", m->addr);
+		pr_emerg(HW_ERR "Error Addr: 0x%016llx\n", m->addr);
 
 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
+		pr_emerg(HW_ERR "IPID: 0x%016llx", m->ipid);
+
 		if (m->status & MCI_STATUS_SYNDV)
 			pr_cont(", Syndrome: 0x%016llx", m->synd);
-
-		pr_cont(", IPID: 0x%016llx", m->ipid);
 
 		pr_cont("\n");
 
 		decode_smca_errors(m);
 		goto err_code;
-	} else
-		pr_cont("\n");
+	}
 
 	if (!fam_ops)
 		goto err_code;