Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'edac_for_4.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:

- New APM X-Gene SoC EDAC driver (Loc Ho)

- AMD error injection module improvements (Aravind Gopalakrishnan)

- Altera Arria 10 support (Thor Thayer)

- misc fixes and cleanups all over the place

* tag 'edac_for_4.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (28 commits)
EDAC: Update Documentation/edac.txt
EDAC: Fix typos in Documentation/edac.txt
EDAC, mce_amd_inj: Set MISCV on injection
EDAC, mce_amd_inj: Move bit preparations before the injection
EDAC, mce_amd_inj: Cleanup and simplify README
EDAC, altera: Do not allow suspend when EDAC is enabled
EDAC, mce_amd_inj: Make inj_type static
arm: socfpga: dts: Add Arria10 SDRAM EDAC DTS support
EDAC, altera: Add Arria10 EDAC support
EDAC, altera: Refactor for Altera CycloneV SoC
EDAC, altera: Generalize driver to use DT Memory size
EDAC, mce_amd_inj: Add README file
EDAC, mce_amd_inj: Add individual permissions field to dfs_node
EDAC, mce_amd_inj: Modify flags attribute to use string arguments
EDAC, mce_amd_inj: Read out number of MCE banks from the hardware
EDAC, mce_amd_inj: Use MCE_INJECT_GET macro for bank node too
EDAC, xgene: Fix cpuid abuse
EDAC, mpc85xx: Extend error address to 64 bit
EDAC, mpc8xxx: Adapt for FSL SoC
EDAC, edac_stub: Drop arch-specific include
...

+2174 -376
+1 -1
Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
···
  The EDAC accesses a range of registers in the SDRAM controller.

  Required properties:
- - compatible : should contain "altr,sdram-edac";
+ - compatible : should contain "altr,sdram-edac" or "altr,sdram-edac-a10"
  - altr,sdr-syscon : phandle of the sdr module
  - interrupts : Should contain the SDRAM ECC IRQ in the
        appropriate format for the IRQ controller.
+79
Documentation/devicetree/bindings/edac/apm-xgene-edac.txt
···
+ * APM X-Gene SoC EDAC node
+
+ EDAC node is defined to describe on-chip error detection and correction.
+ The following error types are supported:
+
+   memory controller	- Memory controller
+   PMD (L1/L2)		- Processor module unit (PMD) L1/L2 cache
+
+ The following section describes the EDAC DT node binding.
+
+ Required properties:
+ - compatible		: Shall be "apm,xgene-edac".
+ - regmap-csw		: Regmap of the CPU switch fabric (CSW) resource.
+ - regmap-mcba		: Regmap of the MCB-A (memory bridge) resource.
+ - regmap-mcbb		: Regmap of the MCB-B (memory bridge) resource.
+ - regmap-efuse	: Regmap of the PMD efuse resource.
+ - reg			: First resource shall be the CPU bus (PCP) resource.
+ - interrupts		: Interrupt-specifier for MCU, PMD, L3, or SoC error
+ 			  IRQ(s).
+
+ Required properties for memory controller subnode:
+ - compatible		: Shall be "apm,xgene-edac-mc".
+ - reg			: First resource shall be the memory controller unit
+ 			  (MCU) resource.
+ - memory-controller	: Instance number of the memory controller.
+
+ Required properties for PMD subnode:
+ - compatible		: Shall be "apm,xgene-edac-pmd" or
+ 			  "apm,xgene-edac-pmd-v2".
+ - reg			: First resource shall be the PMD resource.
+ - pmd-controller	: Instance number of the PMD controller.
+
+ Example:
+ 	csw: csw@7e200000 {
+ 		compatible = "apm,xgene-csw", "syscon";
+ 		reg = <0x0 0x7e200000 0x0 0x1000>;
+ 	};
+
+ 	mcba: mcba@7e700000 {
+ 		compatible = "apm,xgene-mcb", "syscon";
+ 		reg = <0x0 0x7e700000 0x0 0x1000>;
+ 	};
+
+ 	mcbb: mcbb@7e720000 {
+ 		compatible = "apm,xgene-mcb", "syscon";
+ 		reg = <0x0 0x7e720000 0x0 0x1000>;
+ 	};
+
+ 	efuse: efuse@1054a000 {
+ 		compatible = "apm,xgene-efuse", "syscon";
+ 		reg = <0x0 0x1054a000 0x0 0x20>;
+ 	};
+
+ 	edac@78800000 {
+ 		compatible = "apm,xgene-edac";
+ 		#address-cells = <2>;
+ 		#size-cells = <2>;
+ 		ranges;
+ 		regmap-csw = <&csw>;
+ 		regmap-mcba = <&mcba>;
+ 		regmap-mcbb = <&mcbb>;
+ 		regmap-efuse = <&efuse>;
+ 		reg = <0x0 0x78800000 0x0 0x100>;
+ 		interrupts = <0x0 0x20 0x4>,
+ 			     <0x0 0x21 0x4>,
+ 			     <0x0 0x27 0x4>;
+
+ 		edacmc@7e800000 {
+ 			compatible = "apm,xgene-edac-mc";
+ 			reg = <0x0 0x7e800000 0x0 0x1000>;
+ 			memory-controller = <0>;
+ 		};
+
+ 		edacpmd@7c000000 {
+ 			compatible = "apm,xgene-edac-pmd";
+ 			reg = <0x0 0x7c000000 0x0 0x200000>;
+ 			pmd-controller = <0>;
+ 		};
+ 	};
+139 -152
Documentation/edac.txt
···
-
-
  EDAC - Error Detection And Correction
-
- Written by Doug Thompson <dougthompson@xmission.com>
- 	7 Dec 2005
- 	17 Jul 2007	Updated
-
- (c) Mauro Carvalho Chehab
- 	05 Aug 2009	Nehalem interface
-
- EDAC is maintained and written by:
-
- 	Doug Thompson, Dave Jiang, Dave Peterson et al,
- 	original author: Thayne Harbaugh,
-
- Contact:
- 	website: bluesmoke.sourceforge.net
- 	mailing list: bluesmoke-devel@lists.sourceforge.net
+ =====================================

  "bluesmoke" was the name for this device driver when it was "out-of-tree"
  and maintained at sourceforge.net. When it was pushed into 2.6.16 for the
  first time, it was renamed to 'EDAC'.

- The bluesmoke project at sourceforge.net is now utilized as a 'staging area'
- for EDAC development, before it is sent upstream to kernel.org
+ PURPOSE
+ -------

- At the bluesmoke/EDAC project site is a series of quilt patches against
- recent kernels, stored in a SVN repository. For easier downloading, there
- is also a tarball snapshot available.
-
- ============================================================================
- EDAC PURPOSE
-
- The 'edac' kernel module goal is to detect and report errors that occur
- within the computer system running under linux.
+ The 'edac' kernel module's goal is to detect and report hardware errors
+ that occur within the computer system running under linux.

  MEMORY
+ ------

- In the initial release, memory Correctable Errors (CE) and Uncorrectable
- Errors (UE) are the primary errors being harvested. These types of errors
- are harvested by the 'edac_mc' class of device.
+ Memory Correctable Errors (CE) and Uncorrectable Errors (UE) are the
+ primary errors being harvested. These types of errors are harvested by
+ the 'edac_mc' device.

  Detecting CE events, then harvesting those events and reporting them,
- CAN be a predictor of future UE events. With CE events, the system can
- continue to operate, but with less safety. Preventive maintenance and
- proactive part replacement of memory DIMMs exhibiting CEs can reduce
- the likelihood of the dreaded UE events and system 'panics'.
+ *can* but must not necessarily be a predictor of future UE events. With
+ CE events only, the system can and will continue to operate as no data
+ has been damaged yet.

- NON-MEMORY
+ However, preventive maintenance and proactive part replacement of memory
+ DIMMs exhibiting CEs can reduce the likelihood of the dreaded UE events
+ and system panics.
+
+ OTHER HARDWARE ELEMENTS
+ -----------------------

  A new feature for EDAC, the edac_device class of device, was added in
  the 2.6.23 version of the kernel.
···
  to have their states harvested and presented to userspace via the sysfs
  interface.

- Some architectures have ECC detectors for L1, L2 and L3 caches, along with DMA
- engines, fabric switches, main data path switches, interconnections,
- and various other hardware data paths. If the hardware reports it, then
- a edac_device device probably can be constructed to harvest and present
- that to userspace.
+ Some architectures have ECC detectors for L1, L2 and L3 caches,
+ along with DMA engines, fabric switches, main data path switches,
+ interconnections, and various other hardware data paths. If the hardware
+ reports it, then a edac_device device probably can be constructed to
+ harvest and present that to userspace.


  PCI BUS SCANNING
+ ----------------

- In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices
- in order to determine if errors are occurring on data transfers.
+ In addition, PCI devices are scanned for PCI Bus Parity and SERR Errors
+ in order to determine if errors are occurring during data transfers.

  The presence of PCI Parity errors must be examined with a grain of salt.
- There are several add-in adapters that do NOT follow the PCI specification
+ There are several add-in adapters that do *not* follow the PCI specification
  with regards to Parity generation and reporting. The specification says
  the vendor should tie the parity status bits to 0 if they do not intend
  to generate parity. Some vendors do not do this, and thus the parity bit
  can "float" giving false positives.

- In the kernel there is a PCI device attribute located in sysfs that is
- checked by the EDAC PCI scanning code. If that attribute is set,
- PCI parity/error scanning is skipped for that device. The attribute
- is:
+ There is a PCI device attribute located in sysfs that is checked by
+ the EDAC PCI scanning code. If that attribute is set, PCI parity/error
+ scanning is skipped for that device. The attribute is:

  	broken_parity_status

- as is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directories for
+ and is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directories for
  PCI devices.

- FUTURE HARDWARE SCANNING
-
- EDAC will have future error detectors that will be integrated with
- EDAC or added to it, in the following list:
-
- 	MCE	Machine Check Exception
- 	MCA	Machine Check Architecture
- 	NMI	NMI notification of ECC errors
- 	MSRs	Machine Specific Register error cases
- 	and other mechanisms.
-
- These errors are usually bus errors, ECC errors, thermal throttling
- and the like.
-
-
- ============================================================================
- EDAC VERSIONING
+ VERSIONING
+ ----------

  EDAC is composed of a "core" module (edac_core.ko) and several Memory
- Controller (MC) driver modules. On a given system, the CORE
- is loaded and one MC driver will be loaded. Both the CORE and
- the MC driver (or edac_device driver) have individual versions that reflect
- current release level of their respective modules.
+ Controller (MC) driver modules. On a given system, the CORE is loaded
+ and one MC driver will be loaded. Both the CORE and the MC driver (or
+ edac_device driver) have individual versions that reflect current
+ release level of their respective modules.

- Thus, to "report" on what version a system is running, one must report both
- the CORE's and the MC driver's versions.
+ Thus, to "report" on what version a system is running, one must report
+ both the CORE's and the MC driver's versions.


  LOADING
+ -------

- If 'edac' was statically linked with the kernel then no loading is
- necessary. If 'edac' was built as modules then simply modprobe the
- 'edac' pieces that you need. You should be able to modprobe
- hardware-specific modules and have the dependencies load the necessary core
- modules.
+ If 'edac' was statically linked with the kernel then no loading
+ is necessary. If 'edac' was built as modules then simply modprobe
+ the 'edac' pieces that you need. You should be able to modprobe
+ hardware-specific modules and have the dependencies load the necessary
+ core modules.

  Example:
···
  core module.


- ============================================================================
- EDAC sysfs INTERFACE
+ SYSFS INTERFACE
+ ---------------

- EDAC presents a 'sysfs' interface for control, reporting and attribute
- reporting purposes.
+ EDAC presents a 'sysfs' interface for control and reporting purposes. It
+ lives in the /sys/devices/system/edac directory.

- EDAC lives in the /sys/devices/system/edac directory.
-
- Within this directory there currently reside 2 'edac' components:
+ Within this directory there currently reside 2 components:

  	mc	memory controller(s) system
  	pci	PCI control and status system


- ============================================================================
+
  Memory Controller (mc) Model
+ ----------------------------

- First a background on the memory controller's model abstracted in EDAC.
- Each 'mc' device controls a set of DIMM memory modules. These modules are
- laid out in a Chip-Select Row (csrowX) and Channel table (chX). There can
- be multiple csrows and multiple channels.
+ Each 'mc' device controls a set of DIMM memory modules. These modules
+ are laid out in a Chip-Select Row (csrowX) and Channel table (chX).
+ There can be multiple csrows and multiple channels.

- Memory controllers allow for several csrows, with 8 csrows being a typical value.
- Yet, the actual number of csrows depends on the electrical "loading"
- of a given motherboard, memory controller and DIMM characteristics.
+ Memory controllers allow for several csrows, with 8 csrows being a
+ typical value. Yet, the actual number of csrows depends on the layout of
+ a given motherboard, memory controller and DIMM characteristics.

- Dual channels allows for 128 bit data transfers to the CPU from memory.
- Some newer chipsets allow for more than 2 channels, like Fully Buffered DIMMs
- (FB-DIMMs). The following example will assume 2 channels:
+ Dual channels allows for 128 bit data transfers to/from the CPU from/to
+ memory. Some newer chipsets allow for more than 2 channels, like Fully
+ Buffered DIMMs (FB-DIMMs). The following example will assume 2 channels:

  	Channel 0	Channel 1
···
  	DIMM_A1
  	DIMM_B1

- Labels for these slots are usually silk screened on the motherboard. Slots
- labeled 'A' are channel 0 in this example. Slots labeled 'B'
- are channel 1. Notice that there are two csrows possible on a
- physical DIMM. These csrows are allocated their csrow assignment
- based on the slot into which the memory DIMM is placed. Thus, when 1 DIMM
- is placed in each Channel, the csrows cross both DIMMs.
+ Labels for these slots are usually silk-screened on the motherboard.
+ Slots labeled 'A' are channel 0 in this example. Slots labeled 'B' are
+ channel 1. Notice that there are two csrows possible on a physical DIMM.
+ These csrows are allocated their csrow assignment based on the slot into
+ which the memory DIMM is placed. Thus, when 1 DIMM is placed in each
+ Channel, the csrows cross both DIMMs.

  Memory DIMMs come single or dual "ranked". A rank is a populated csrow.
  Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above
···
  csrow1 will be populated. The pattern repeats itself for csrow2 and
  csrow3.

- The representation of the above is reflected in the directory tree
- in EDAC's sysfs interface. Starting in directory
+ The representation of the above is reflected in the directory
+ tree in EDAC's sysfs interface. Starting in directory
  /sys/devices/system/edac/mc each memory controller will be represented
  by its own 'mcX' directory, where 'X' is the index of the MC.
···
  	|->csrow3
  	....

- Notice that there is no csrow1, which indicates that csrow0 is
- composed of a single ranked DIMMs. This should also apply in both
- Channels, in order to have dual-channel mode be operational. Since
- both csrow2 and csrow3 are populated, this indicates a dual ranked
- set of DIMMs for channels 0 and 1.
+ Notice that there is no csrow1, which indicates that csrow0 is composed
+ of a single ranked DIMMs. This should also apply in both Channels, in
+ order to have dual-channel mode be operational. Since both csrow2 and
+ csrow3 are populated, this indicates a dual ranked set of DIMMs for
+ channels 0 and 1.


- Within each of the 'mcX' and 'csrowX' directories are several
- EDAC control and attribute files.
+ Within each of the 'mcX' and 'csrowX' directories are several EDAC
+ control and attribute files.

- ============================================================================
- 'mcX' DIRECTORIES

+ 'mcX' directories
+ -----------------

  In 'mcX' directories are EDAC control and attribute files for
  this 'X' instance of the memory controllers.

  For a description of the sysfs API, please see:
- 	Documentation/ABI/testing/sysfs/devices-edac
+ 	Documentation/ABI/testing/sysfs-devices-edac


- ============================================================================
- 'csrowX' DIRECTORIES

- When CONFIG_EDAC_LEGACY_SYSFS is enabled, the sysfs will contain the
- csrowX directories. As this API doesn't work properly for Rambus, FB-DIMMs
- and modern Intel Memory Controllers, this is being deprecated in favor
- of dimmX directories.
+ 'csrowX' directories
+ --------------------
+
+ When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the csrowX
+ directories. As this API doesn't work properly for Rambus, FB-DIMMs and
+ modern Intel Memory Controllers, this is being deprecated in favor of
+ dimmX directories.

  In the 'csrowX' directories are EDAC control and attribute files for
  this 'X' instance of csrow:
···
  'ce_count'

- 	This attribute file displays the total count of correctable
- 	errors that have occurred on this csrow. This
- 	count is very important to examine. CEs provide early
- 	indications that a DIMM is beginning to fail. This count
- 	field should be monitored for non-zero values and report
- 	such information to the system administrator.
+ 	This attribute file displays the total count of correctable
+ 	errors that have occurred on this csrow. This count is very
+ 	important to examine. CEs provide early indications that a
+ 	DIMM is beginning to fail. This count field should be
+ 	monitored for non-zero values and report such information
+ 	to the system administrator.


  Total memory managed by this csrow attribute file:

  'size_mb'

- 	This attribute file displays, in count of megabytes, of memory
+ 	This attribute file displays, in count of megabytes, the memory
  	that this csrow contains.

···
  	motherboard specific and determination of this information
  	must occur in userland at this time.

- ============================================================================
- SYSTEM LOGGING
-
- If logging for UEs and CEs are enabled then system logs will have
- error notices indicating errors that have been detected:
+
+ SYSTEM LOGGING
+ --------------
+
+ If logging for UEs and CEs is enabled, then system logs will contain
+ information indicating that errors have been detected:

  EDAC MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
  channel 1 "DIMM_B1": amd76x_edac
···
  and then an optional, driver-specific message that may
  have additional information.

- Both UEs and CEs with no info will lack all but memory controller,
- error type, a notice of "no info" and then an optional,
- driver-specific error message.
+ Both UEs and CEs with no info will lack all but memory controller, error
+ type, a notice of "no info" and then an optional, driver-specific error
+ message.


- ============================================================================
  PCI Bus Parity Detection
+ ------------------------

-
- On Header Type 00 devices the primary status is looked at
- for any parity error regardless of whether Parity is enabled on the
- device. (The spec indicates parity is generated in some cases).
- On Header Type 01 bridges, the secondary status register is also
- looked at to see if parity occurred on the bus on the other side of
- the bridge.
+ On Header Type 00 devices, the primary status is looked at for any
+ parity error regardless of whether parity is enabled on the device or
+ not. (The spec indicates parity is generated in some cases). On Header
+ Type 01 bridges, the secondary status register is also looked at to see
+ if parity occurred on the bus on the other side of the bridge.


  SYSFS CONFIGURATION
+ -------------------

  Under /sys/devices/system/edac/pci are control and attribute files as follows:
···
  have been detected.


- ============================================================================
+
  MODULE PARAMETERS
+ -----------------

  Panic on UE control file:
···
  'panic_on_pci_parity'


- 	This control files enables or disables panicking when a parity
+ 	This control file enables or disables panicking when a parity
  	error has been detected.

···


- =======================================================================
-
-
- EDAC_DEVICE type of device
+ EDAC device type
+ ----------------

  In the header file, edac_core.h, there is a series of edac_device structures
  and APIs for the EDAC_DEVICE.
···
  The symlink points to the 'struct dev' that is registered for this edac_device.

  INSTANCES
+ ---------

  One or more instance directories are present. For the 'test_device_edac' case:
···
  	ue_count	total of UE events of subdirectories

  BLOCKS
+ ------

  At the lowest directory level is the 'block' directory. There can be 0, 1
  or more blocks specified in each instance.
···
  reset all the above counters.


- Use of the 'test_device_edac' driver should any others to create their own
+ Use of the 'test_device_edac' driver should enable any others to create their own
  unique drivers for their hardware systems.

  The 'test_device_edac' sample driver is located at the
  bluesmoke.sourceforge.net project site for EDAC.

- =======================================================================
+
  NEHALEM USAGE OF EDAC APIs
+ --------------------------

  This chapter documents some EXPERIMENTAL mappings for EDAC API to handle
  Nehalem EDAC driver. They will likely be changed on future versions
···
  Due to the way Nehalem exports Memory Controller data, some adjustments
  were done at i7core_edac driver. This chapter will cover those differences

- 1) On Nehalem, there are one Memory Controller per Quick Patch Interconnect
+ 1) On Nehalem, there is one Memory Controller per Quick Patch Interconnect
     (QPI). At the driver, the term "socket" means one QPI. This is
     associated with a physical CPU socket.
···
     Each channel can have up to 3 DIMMs.

     The minimum known unity is DIMMs. There are no information about csrows.
-    As EDAC API maps the minimum unity is csrows, the driver sequencially
+    As EDAC API maps the minimum unity is csrows, the driver sequentially
     maps channel/dimm into different csrows.

     For example, supposing the following layout:
···

     Each QPI is exported as a different memory controller.

- 2) Nehalem MC has the hability to generate errors. The driver implements this
+ 2) Nehalem MC has the ability to generate errors. The driver implements this
     functionality via some error injection nodes:

     For injecting a memory error, there are some sysfs nodes, under
···

     The standard error counters are generated when an mcelog error is received
     by the driver. Since, with udimm, this is counted by software, it is
-    possible that some errors could be lost. With rdimm's, they displays the
+    possible that some errors could be lost. With rdimm's, they display the
     contents of the registers
+
+ CREDITS:
+ ========
+
+ Written by Doug Thompson <dougthompson@xmission.com>
+ 	7 Dec 2005
+ 	17 Jul 2007	Updated
+
+ (c) Mauro Carvalho Chehab
+ 	05 Aug 2009	Nehalem interface
+
+ EDAC authors/maintainers:
+
+ 	Doug Thompson, Dave Jiang, Dave Peterson et al,
+ 	Mauro Carvalho Chehab
+ 	Borislav Petkov
+ 	original author: Thayne Harbaugh
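The sysfs layout documented above (one mcX directory per controller, each with ce_count/ue_count attribute files) is straightforward to poll from userspace. The following C sketch is not part of the merged patches: it sums a named counter across all mcX subdirectories of a caller-supplied root, so it can be pointed at /sys/devices/system/edac/mc on a real system or at a mock directory tree for testing.

```c
/* Sketch: sum a per-controller counter (e.g. "ce_count") across all
 * mcX directories under an EDAC sysfs root. The root path is a
 * parameter so the walk can be exercised against a mock tree. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Sum <root>/mcX/<attr> over every mcX directory; -1 if root is absent. */
static long edac_sum_attr(const char *root, const char *attr)
{
	DIR *d = opendir(root);
	struct dirent *de;
	long total = 0;

	if (!d)
		return -1;

	while ((de = readdir(d)) != NULL) {
		char path[512];
		FILE *f;
		long val;

		/* Only descend into mc0, mc1, ... entries. */
		if (strncmp(de->d_name, "mc", 2) != 0 || de->d_name[2] == '\0')
			continue;

		snprintf(path, sizeof(path), "%s/%s/%s", root, de->d_name, attr);
		f = fopen(path, "r");
		if (!f)
			continue;
		if (fscanf(f, "%ld", &val) == 1)
			total += val;
		fclose(f);
	}
	closedir(d);
	return total;
}
```

On a live system, `edac_sum_attr("/sys/devices/system/edac/mc", "ce_count")` would report the total correctable-error count across all memory controllers; a non-zero, growing value is the "monitor for non-zero values" condition the documentation above asks administrators to watch for.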
+12 -5
MAINTAINERS
···
  F:	drivers/edac/ie31200_edac.c

  EDAC-MPC85XX
- M:	Johannes Thumshirn <johannes.thumshirn@men.de>
+ M:	Johannes Thumshirn <morbidrsa@gmail.com>
  L:	linux-edac@vger.kernel.org
  W:	bluesmoke.sourceforge.net
  S:	Maintained
···
  W:	bluesmoke.sourceforge.net
  S:	Maintained
  F:	drivers/edac/sb_edac.c
+
+ EDAC-XGENE
+ APPLIED MICRO (APM) X-GENE SOC EDAC
+ M:	Loc Ho <lho@apm.com>
+ S:	Supported
+ F:	drivers/edac/xgene_edac.c
+ F:	Documentation/devicetree/bindings/edac/apm-xgene-edac.txt

  EDIROL UA-101/UA-1000 DRIVER
  M:	Clemens Ladisch <clemens@ladisch.de>
···
  F:	include/uapi/mtd/

  MEN A21 WATCHDOG DRIVER
- M:	Johannes Thumshirn <johannes.thumshirn@men.de>
+ M:	Johannes Thumshirn <morbidrsa@gmail.com>
  L:	linux-watchdog@vger.kernel.org
- S:	Supported
+ S:	Maintained
  F:	drivers/watchdog/mena21_wdt.c

  MEN CHAMELEON BUS (mcb)
- M:	Johannes Thumshirn <johannes.thumshirn@men.de>
- S:	Supported
+ M:	Johannes Thumshirn <morbidrsa@gmail.com>
+ S:	Maintained
  F:	drivers/mcb/
  F:	include/linux/mcb.h
+2
arch/arm/Kconfig
···
  	select CLONE_BACKWARDS
  	select CPU_PM if (SUSPEND || CPU_IDLE)
  	select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
+ 	select EDAC_SUPPORT
+ 	select EDAC_ATOMIC_SCRUB
  	select GENERIC_ALLOCATOR
  	select GENERIC_ATOMIC64 if (CPU_V7M || CPU_V6 || !CPU_32v6K || !AEABI)
  	select GENERIC_CLOCKEVENTS_BROADCAST if SMP
+11
arch/arm/boot/dts/socfpga_arria10.dtsi
···
  		status = "disabled";
  	};

+ 	sdr: sdr@ffc25000 {
+ 		compatible = "syscon";
+ 		reg = <0xffcfb100 0x80>;
+ 	};
+
+ 	sdramedac {
+ 		compatible = "altr,sdram-edac-a10";
+ 		altr,sdr-syscon = <&sdr>;
+ 		interrupts = <0 2 4>, <0 0 4>;
+ 	};
+
  	L2: l2-cache@fffff000 {
  		compatible = "arm,pl310-cache";
  		reg = <0xfffff000 0x1000>;
+3 -2
arch/arm/include/asm/edac.h
···
  #define ASM_EDAC_H
  /*
   * ECC atomic, DMA, SMP and interrupt safe scrub function.
-  * Implements the per arch atomic_scrub() that EDAC use for software
+  * Implements the per arch edac_atomic_scrub() that EDAC use for software
   * ECC scrubbing. It reads memory and then writes back the original
   * value, allowing the hardware to detect and correct memory errors.
   */
- static inline void atomic_scrub(void *va, u32 size)
+
+ static inline void edac_atomic_scrub(void *va, u32 size)
  {
  #if __LINUX_ARM_ARCH__ >= 6
  	unsigned int *virt_addr = va;
+1
arch/arm64/Kconfig
···
  	select BUILDTIME_EXTABLE_SORT
  	select CLONE_BACKWARDS
  	select COMMON_CLK
+ 	select EDAC_SUPPORT
  	select CPU_PM if (SUSPEND || CPU_IDLE)
  	select DCACHE_WORD_ACCESS
  	select GENERIC_ALLOCATOR
+83
arch/arm64/boot/dts/apm/apm-storm.dtsi
···
  			     0x0 0x1f 0x4>;
  		};

+ 		csw: csw@7e200000 {
+ 			compatible = "apm,xgene-csw", "syscon";
+ 			reg = <0x0 0x7e200000 0x0 0x1000>;
+ 		};
+
+ 		mcba: mcba@7e700000 {
+ 			compatible = "apm,xgene-mcb", "syscon";
+ 			reg = <0x0 0x7e700000 0x0 0x1000>;
+ 		};
+
+ 		mcbb: mcbb@7e720000 {
+ 			compatible = "apm,xgene-mcb", "syscon";
+ 			reg = <0x0 0x7e720000 0x0 0x1000>;
+ 		};
+
+ 		efuse: efuse@1054a000 {
+ 			compatible = "apm,xgene-efuse", "syscon";
+ 			reg = <0x0 0x1054a000 0x0 0x20>;
+ 		};
+
+ 		edac@78800000 {
+ 			compatible = "apm,xgene-edac";
+ 			#address-cells = <2>;
+ 			#size-cells = <2>;
+ 			ranges;
+ 			regmap-csw = <&csw>;
+ 			regmap-mcba = <&mcba>;
+ 			regmap-mcbb = <&mcbb>;
+ 			regmap-efuse = <&efuse>;
+ 			reg = <0x0 0x78800000 0x0 0x100>;
+ 			interrupts = <0x0 0x20 0x4>,
+ 				     <0x0 0x21 0x4>,
+ 				     <0x0 0x27 0x4>;
+
+ 			edacmc@7e800000 {
+ 				compatible = "apm,xgene-edac-mc";
+ 				reg = <0x0 0x7e800000 0x0 0x1000>;
+ 				memory-controller = <0>;
+ 			};
+
+ 			edacmc@7e840000 {
+ 				compatible = "apm,xgene-edac-mc";
+ 				reg = <0x0 0x7e840000 0x0 0x1000>;
+ 				memory-controller = <1>;
+ 			};
+
+ 			edacmc@7e880000 {
+ 				compatible = "apm,xgene-edac-mc";
+ 				reg = <0x0 0x7e880000 0x0 0x1000>;
+ 				memory-controller = <2>;
+ 			};
+
+ 			edacmc@7e8c0000 {
+ 				compatible = "apm,xgene-edac-mc";
+ 				reg = <0x0 0x7e8c0000 0x0 0x1000>;
+ 				memory-controller = <3>;
+ 			};
+
+ 			edacpmd@7c000000 {
+ 				compatible = "apm,xgene-edac-pmd";
+ 				reg = <0x0 0x7c000000 0x0 0x200000>;
+ 				pmd-controller = <0>;
+ 			};
+
+ 			edacpmd@7c200000 {
+ 				compatible = "apm,xgene-edac-pmd";
+ 				reg = <0x0 0x7c200000 0x0 0x200000>;
+ 				pmd-controller = <1>;
+ 			};
+
+ 			edacpmd@7c400000 {
+ 				compatible = "apm,xgene-edac-pmd";
+ 				reg = <0x0 0x7c400000 0x0 0x200000>;
+ 				pmd-controller = <2>;
+ 			};
+
+ 			edacpmd@7c600000 {
+ 				compatible = "apm,xgene-edac-pmd";
+ 				reg = <0x0 0x7c600000 0x0 0x200000>;
+ 				pmd-controller = <3>;
+ 			};
+ 		};
+
  		pcie0: pcie@1f2b0000 {
  			status = "disabled";
  			device_type = "pci";
+1
arch/mips/Kconfig
···
  	select SYS_SUPPORTS_64BIT_KERNEL
  	select SYS_SUPPORTS_BIG_ENDIAN
  	select EDAC_SUPPORT
+ 	select EDAC_ATOMIC_SCRUB
  	select SYS_SUPPORTS_LITTLE_ENDIAN
  	select SYS_SUPPORTS_HOTPLUG_CPU if CPU_BIG_ENDIAN
  	select SYS_HAS_EARLY_PRINTK
+2 -2
arch/mips/include/asm/edac.h
···

  /* ECC atomic, DMA, SMP and interrupt safe scrub function */

- static inline void atomic_scrub(void *va, u32 size)
+ static inline void edac_atomic_scrub(void *va, u32 size)
  {
  	unsigned long *virt_addr = va;
  	unsigned long temp;
···

  	__asm__ __volatile__ (
  	"	.set	mips2				\n"
- 	"1:	ll	%0, %1	# atomic_scrub		\n"
+ 	"1:	ll	%0, %1	# edac_atomic_scrub	\n"
  	"	addu	%0, $0				\n"
  	"	sc	%0, %1				\n"
  	"	beqz	%0, 1b				\n"
+2
arch/powerpc/Kconfig
···
  	select NO_BOOTMEM
  	select HAVE_GENERIC_RCU_GUP
  	select HAVE_PERF_EVENTS_NMI if PPC64
+ 	select EDAC_SUPPORT
+ 	select EDAC_ATOMIC_SCRUB

  config GENERIC_CSUM
  	def_bool CPU_LITTLE_ENDIAN
+2 -2
arch/powerpc/include/asm/edac.h
···
  #define ASM_EDAC_H
  /*
   * ECC atomic, DMA, SMP and interrupt safe scrub function.
-  * Implements the per arch atomic_scrub() that EDAC use for software
+  * Implements the per arch edac_atomic_scrub() that EDAC use for software
   * ECC scrubbing. It reads memory and then writes back the original
   * value, allowing the hardware to detect and correct memory errors.
   */
- static __inline__ void atomic_scrub(void *va, u32 size)
+ static __inline__ void edac_atomic_scrub(void *va, u32 size)
  {
  	unsigned int *virt_addr = va;
  	unsigned int temp;
+1
arch/tile/Kconfig
··· 28 28 select HAVE_DEBUG_STACKOVERFLOW 29 29 select ARCH_WANT_FRAME_POINTERS 30 30 select HAVE_CONTEXT_TRACKING 31 + select EDAC_SUPPORT 31 32 32 33 # FIXME: investigate whether we need/want these options. 33 34 # select HAVE_IOREMAP_PROT
-29
arch/tile/include/asm/edac.h
··· 1 - /* 2 - * Copyright 2011 Tilera Corporation. All Rights Reserved. 3 - * 4 - * This program is free software; you can redistribute it and/or 5 - * modify it under the terms of the GNU General Public License 6 - * as published by the Free Software Foundation, version 2. 7 - * 8 - * This program is distributed in the hope that it will be useful, but 9 - * WITHOUT ANY WARRANTY; without even the implied warranty of 10 - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or 11 - * NON INFRINGEMENT. See the GNU General Public License for 12 - * more details. 13 - */ 14 - 15 - #ifndef _ASM_TILE_EDAC_H 16 - #define _ASM_TILE_EDAC_H 17 - 18 - /* ECC atomic, DMA, SMP and interrupt safe scrub function */ 19 - 20 - static inline void atomic_scrub(void *va, u32 size) 21 - { 22 - /* 23 - * These is nothing to be done here because CE is 24 - * corrected by the mshim. 25 - */ 26 - return; 27 - } 28 - 29 - #endif /* _ASM_TILE_EDAC_H */
+2
arch/x86/Kconfig
··· 50 50 select CLONE_BACKWARDS if X86_32 51 51 select COMPAT_OLD_SIGACTION if IA32_EMULATION 52 52 select DCACHE_WORD_ACCESS 53 + select EDAC_ATOMIC_SCRUB 54 + select EDAC_SUPPORT 53 55 select GENERIC_CLOCKEVENTS 54 56 select GENERIC_CLOCKEVENTS_BROADCAST if X86_64 || (X86_32 && X86_LOCAL_APIC) 55 57 select GENERIC_CLOCKEVENTS_MIN_ADJUST
+1 -1
arch/x86/include/asm/edac.h
··· 3 3 4 4 /* ECC atomic, DMA, SMP and interrupt safe scrub function */ 5 5 6 - static inline void atomic_scrub(void *va, u32 size) 6 + static inline void edac_atomic_scrub(void *va, u32 size) 7 7 { 8 8 u32 i, *virt_addr = va; 9 9
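The per-arch `edac_atomic_scrub()` implementations renamed above all follow the same read-and-write-back pattern: MIPS uses an `ll`/`sc` loop adding zero, x86 uses a locked add of zero. A minimal portable sketch of that idea, using C11 atomics instead of arch-specific instructions (this is an illustration of the concept, not kernel code):

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Sketch of the edac_atomic_scrub() idea: atomically read each word
 * and write the same value back (an add of zero), so the memory
 * controller recomputes and rewrites the ECC bits.  The kernel's
 * versions use arch primitives (ll/sc, "lock addl $0"); C11
 * atomic_fetch_add(..., 0) is the closest portable analogue. */
static void scrub_region(void *va, uint32_t size)
{
	_Atomic uint32_t *p = va;
	size_t i;

	for (i = 0; i < size / sizeof(*p); i++)
		atomic_fetch_add(&p[i], 0);	/* read-modify-write of +0 */
}
```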
+15 -7
drivers/edac/Kconfig
··· 2 2 # EDAC Kconfig 3 3 # Copyright (c) 2008 Doug Thompson www.softwarebitmaker.com 4 4 # Licensed and distributed under the GPL 5 - # 5 + 6 + config EDAC_ATOMIC_SCRUB 7 + bool 6 8 7 9 config EDAC_SUPPORT 8 10 bool 9 11 10 12 menuconfig EDAC 11 13 bool "EDAC (Error Detection And Correction) reporting" 12 - depends on HAS_IOMEM 13 - depends on X86 || PPC || TILE || ARM || EDAC_SUPPORT 14 + depends on HAS_IOMEM && EDAC_SUPPORT 14 15 help 15 16 EDAC is designed to report errors in the core system. 16 17 These are low-level errors that are reported in the CPU or ··· 263 262 264 263 config EDAC_MPC85XX 265 264 tristate "Freescale MPC83xx / MPC85xx" 266 - depends on EDAC_MM_EDAC && FSL_SOC && (PPC_83xx || PPC_85xx) 265 + depends on EDAC_MM_EDAC && FSL_SOC 267 266 help 268 267 Support for error detection and correction on the Freescale 269 - MPC8349, MPC8560, MPC8540, MPC8548 268 + MPC8349, MPC8560, MPC8540, MPC8548, T4240 270 269 271 270 config EDAC_MV64X60 272 271 tristate "Marvell MV64x60" ··· 378 377 Cavium Octeon family of SOCs. 379 378 380 379 config EDAC_ALTERA_MC 381 - tristate "Altera SDRAM Memory Controller EDAC" 382 - depends on EDAC_MM_EDAC && ARCH_SOCFPGA 380 + bool "Altera SDRAM Memory Controller EDAC" 381 + depends on EDAC_MM_EDAC=y && ARCH_SOCFPGA 383 382 help 384 383 Support for error detection and correction on the 385 384 Altera SDRAM memory controller. Note that the ··· 392 391 help 393 392 Support for error detection and correction on the Synopsys DDR 394 393 memory controller. 394 + 395 + config EDAC_XGENE 396 + tristate "APM X-Gene SoC" 397 + depends on EDAC_MM_EDAC && (ARM64 || COMPILE_TEST) 398 + help 399 + Support for error detection and correction on the 400 + APM X-Gene family of SOCs. 395 401 396 402 endif # EDAC
+1
drivers/edac/Makefile
··· 68 68 69 69 obj-$(CONFIG_EDAC_ALTERA_MC) += altera_edac.o 70 70 obj-$(CONFIG_EDAC_SYNOPSYS) += synopsys_edac.o 71 + obj-$(CONFIG_EDAC_XGENE) += xgene_edac.o
+238 -135
drivers/edac/altera_edac.c
··· 1 1 /* 2 - * Copyright Altera Corporation (C) 2014. All rights reserved. 2 + * Copyright Altera Corporation (C) 2014-2015. All rights reserved. 3 3 * Copyright 2011-2012 Calxeda, Inc. 4 4 * 5 5 * This program is free software; you can redistribute it and/or modify it ··· 28 28 #include <linux/types.h> 29 29 #include <linux/uaccess.h> 30 30 31 + #include "altera_edac.h" 31 32 #include "edac_core.h" 32 33 #include "edac_module.h" 33 34 34 35 #define EDAC_MOD_STR "altera_edac" 35 36 #define EDAC_VERSION "1" 36 37 37 - /* SDRAM Controller CtrlCfg Register */ 38 - #define CTLCFG_OFST 0x00 38 + static const struct altr_sdram_prv_data c5_data = { 39 + .ecc_ctrl_offset = CV_CTLCFG_OFST, 40 + .ecc_ctl_en_mask = CV_CTLCFG_ECC_AUTO_EN, 41 + .ecc_stat_offset = CV_DRAMSTS_OFST, 42 + .ecc_stat_ce_mask = CV_DRAMSTS_SBEERR, 43 + .ecc_stat_ue_mask = CV_DRAMSTS_DBEERR, 44 + .ecc_saddr_offset = CV_ERRADDR_OFST, 45 + .ecc_daddr_offset = CV_ERRADDR_OFST, 46 + .ecc_cecnt_offset = CV_SBECOUNT_OFST, 47 + .ecc_uecnt_offset = CV_DBECOUNT_OFST, 48 + .ecc_irq_en_offset = CV_DRAMINTR_OFST, 49 + .ecc_irq_en_mask = CV_DRAMINTR_INTREN, 50 + .ecc_irq_clr_offset = CV_DRAMINTR_OFST, 51 + .ecc_irq_clr_mask = (CV_DRAMINTR_INTRCLR | CV_DRAMINTR_INTREN), 52 + .ecc_cnt_rst_offset = CV_DRAMINTR_OFST, 53 + .ecc_cnt_rst_mask = CV_DRAMINTR_INTRCLR, 54 + #ifdef CONFIG_EDAC_DEBUG 55 + .ce_ue_trgr_offset = CV_CTLCFG_OFST, 56 + .ce_set_mask = CV_CTLCFG_GEN_SB_ERR, 57 + .ue_set_mask = CV_CTLCFG_GEN_DB_ERR, 58 + #endif 59 + }; 39 60 40 - /* SDRAM Controller CtrlCfg Register Bit Masks */ 41 - #define CTLCFG_ECC_EN 0x400 42 - #define CTLCFG_ECC_CORR_EN 0x800 43 - #define CTLCFG_GEN_SB_ERR 0x2000 44 - #define CTLCFG_GEN_DB_ERR 0x4000 45 - 46 - #define CTLCFG_ECC_AUTO_EN (CTLCFG_ECC_EN | \ 47 - CTLCFG_ECC_CORR_EN) 48 - 49 - /* SDRAM Controller Address Width Register */ 50 - #define DRAMADDRW_OFST 0x2C 51 - 52 - /* SDRAM Controller Address Widths Field Register */ 53 - #define DRAMADDRW_COLBIT_MASK 0x001F 54 - 
#define DRAMADDRW_COLBIT_SHIFT 0 55 - #define DRAMADDRW_ROWBIT_MASK 0x03E0 56 - #define DRAMADDRW_ROWBIT_SHIFT 5 57 - #define DRAMADDRW_BANKBIT_MASK 0x1C00 58 - #define DRAMADDRW_BANKBIT_SHIFT 10 59 - #define DRAMADDRW_CSBIT_MASK 0xE000 60 - #define DRAMADDRW_CSBIT_SHIFT 13 61 - 62 - /* SDRAM Controller Interface Data Width Register */ 63 - #define DRAMIFWIDTH_OFST 0x30 64 - 65 - /* SDRAM Controller Interface Data Width Defines */ 66 - #define DRAMIFWIDTH_16B_ECC 24 67 - #define DRAMIFWIDTH_32B_ECC 40 68 - 69 - /* SDRAM Controller DRAM Status Register */ 70 - #define DRAMSTS_OFST 0x38 71 - 72 - /* SDRAM Controller DRAM Status Register Bit Masks */ 73 - #define DRAMSTS_SBEERR 0x04 74 - #define DRAMSTS_DBEERR 0x08 75 - #define DRAMSTS_CORR_DROP 0x10 76 - 77 - /* SDRAM Controller DRAM IRQ Register */ 78 - #define DRAMINTR_OFST 0x3C 79 - 80 - /* SDRAM Controller DRAM IRQ Register Bit Masks */ 81 - #define DRAMINTR_INTREN 0x01 82 - #define DRAMINTR_SBEMASK 0x02 83 - #define DRAMINTR_DBEMASK 0x04 84 - #define DRAMINTR_CORRDROPMASK 0x08 85 - #define DRAMINTR_INTRCLR 0x10 86 - 87 - /* SDRAM Controller Single Bit Error Count Register */ 88 - #define SBECOUNT_OFST 0x40 89 - 90 - /* SDRAM Controller Single Bit Error Count Register Bit Masks */ 91 - #define SBECOUNT_MASK 0x0F 92 - 93 - /* SDRAM Controller Double Bit Error Count Register */ 94 - #define DBECOUNT_OFST 0x44 95 - 96 - /* SDRAM Controller Double Bit Error Count Register Bit Masks */ 97 - #define DBECOUNT_MASK 0x0F 98 - 99 - /* SDRAM Controller ECC Error Address Register */ 100 - #define ERRADDR_OFST 0x48 101 - 102 - /* SDRAM Controller ECC Error Address Register Bit Masks */ 103 - #define ERRADDR_MASK 0xFFFFFFFF 104 - 105 - /* Altera SDRAM Memory Controller data */ 106 - struct altr_sdram_mc_data { 107 - struct regmap *mc_vbase; 61 + static const struct altr_sdram_prv_data a10_data = { 62 + .ecc_ctrl_offset = A10_ECCCTRL1_OFST, 63 + .ecc_ctl_en_mask = A10_ECCCTRL1_ECC_EN, 64 + .ecc_stat_offset = A10_INTSTAT_OFST, 
65 + .ecc_stat_ce_mask = A10_INTSTAT_SBEERR, 66 + .ecc_stat_ue_mask = A10_INTSTAT_DBEERR, 67 + .ecc_saddr_offset = A10_SERRADDR_OFST, 68 + .ecc_daddr_offset = A10_DERRADDR_OFST, 69 + .ecc_irq_en_offset = A10_ERRINTEN_OFST, 70 + .ecc_irq_en_mask = A10_ECC_IRQ_EN_MASK, 71 + .ecc_irq_clr_offset = A10_INTSTAT_OFST, 72 + .ecc_irq_clr_mask = (A10_INTSTAT_SBEERR | A10_INTSTAT_DBEERR), 73 + .ecc_cnt_rst_offset = A10_ECCCTRL1_OFST, 74 + .ecc_cnt_rst_mask = A10_ECC_CNT_RESET_MASK, 75 + #ifdef CONFIG_EDAC_DEBUG 76 + .ce_ue_trgr_offset = A10_DIAGINTTEST_OFST, 77 + .ce_set_mask = A10_DIAGINT_TSERRA_MASK, 78 + .ue_set_mask = A10_DIAGINT_TDERRA_MASK, 79 + #endif 108 80 }; 109 81 110 82 static irqreturn_t altr_sdram_mc_err_handler(int irq, void *dev_id) 111 83 { 112 84 struct mem_ctl_info *mci = dev_id; 113 85 struct altr_sdram_mc_data *drvdata = mci->pvt_info; 114 - u32 status, err_count, err_addr; 86 + const struct altr_sdram_prv_data *priv = drvdata->data; 87 + u32 status, err_count = 1, err_addr; 115 88 116 - /* Error Address is shared by both SBE & DBE */ 117 - regmap_read(drvdata->mc_vbase, ERRADDR_OFST, &err_addr); 89 + regmap_read(drvdata->mc_vbase, priv->ecc_stat_offset, &status); 118 90 119 - regmap_read(drvdata->mc_vbase, DRAMSTS_OFST, &status); 120 - 121 - if (status & DRAMSTS_DBEERR) { 122 - regmap_read(drvdata->mc_vbase, DBECOUNT_OFST, &err_count); 91 + if (status & priv->ecc_stat_ue_mask) { 92 + regmap_read(drvdata->mc_vbase, priv->ecc_daddr_offset, 93 + &err_addr); 94 + if (priv->ecc_uecnt_offset) 95 + regmap_read(drvdata->mc_vbase, priv->ecc_uecnt_offset, 96 + &err_count); 123 97 panic("\nEDAC: [%d Uncorrectable errors @ 0x%08X]\n", 124 98 err_count, err_addr); 125 99 } 126 - if (status & DRAMSTS_SBEERR) { 127 - regmap_read(drvdata->mc_vbase, SBECOUNT_OFST, &err_count); 100 + if (status & priv->ecc_stat_ce_mask) { 101 + regmap_read(drvdata->mc_vbase, priv->ecc_saddr_offset, 102 + &err_addr); 103 + if (priv->ecc_uecnt_offset) 104 + regmap_read(drvdata->mc_vbase, 
priv->ecc_cecnt_offset, 105 + &err_count); 128 106 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, err_count, 129 107 err_addr >> PAGE_SHIFT, 130 108 err_addr & ~PAGE_MASK, 0, 131 109 0, 0, -1, mci->ctl_name, ""); 110 + /* Clear IRQ to resume */ 111 + regmap_write(drvdata->mc_vbase, priv->ecc_irq_clr_offset, 112 + priv->ecc_irq_clr_mask); 113 + 114 + return IRQ_HANDLED; 132 115 } 133 - 134 - regmap_write(drvdata->mc_vbase, DRAMINTR_OFST, 135 - (DRAMINTR_INTRCLR | DRAMINTR_INTREN)); 136 - 137 - return IRQ_HANDLED; 116 + return IRQ_NONE; 138 117 } 139 118 140 119 #ifdef CONFIG_EDAC_DEBUG ··· 123 144 { 124 145 struct mem_ctl_info *mci = file->private_data; 125 146 struct altr_sdram_mc_data *drvdata = mci->pvt_info; 147 + const struct altr_sdram_prv_data *priv = drvdata->data; 126 148 u32 *ptemp; 127 149 dma_addr_t dma_handle; 128 150 u32 reg, read_reg; ··· 136 156 return -ENOMEM; 137 157 } 138 158 139 - regmap_read(drvdata->mc_vbase, CTLCFG_OFST, &read_reg); 140 - read_reg &= ~(CTLCFG_GEN_SB_ERR | CTLCFG_GEN_DB_ERR); 159 + regmap_read(drvdata->mc_vbase, priv->ce_ue_trgr_offset, 160 + &read_reg); 161 + read_reg &= ~(priv->ce_set_mask | priv->ue_set_mask); 141 162 142 163 /* Error are injected by writing a word while the SBE or DBE 143 164 * bit in the CTLCFG register is set. 
Reading the word will ··· 147 166 if (count == 3) { 148 167 edac_printk(KERN_ALERT, EDAC_MC, 149 168 "Inject Double bit error\n"); 150 - regmap_write(drvdata->mc_vbase, CTLCFG_OFST, 151 - (read_reg | CTLCFG_GEN_DB_ERR)); 169 + regmap_write(drvdata->mc_vbase, priv->ce_ue_trgr_offset, 170 + (read_reg | priv->ue_set_mask)); 152 171 } else { 153 172 edac_printk(KERN_ALERT, EDAC_MC, 154 173 "Inject Single bit error\n"); 155 - regmap_write(drvdata->mc_vbase, CTLCFG_OFST, 156 - (read_reg | CTLCFG_GEN_SB_ERR)); 174 + regmap_write(drvdata->mc_vbase, priv->ce_ue_trgr_offset, 175 + (read_reg | priv->ce_set_mask)); 157 176 } 158 177 159 178 ptemp[0] = 0x5A5A5A5A; 160 179 ptemp[1] = 0xA5A5A5A5; 161 180 162 181 /* Clear the error injection bits */ 163 - regmap_write(drvdata->mc_vbase, CTLCFG_OFST, read_reg); 182 + regmap_write(drvdata->mc_vbase, priv->ce_ue_trgr_offset, read_reg); 164 183 /* Ensure it has been written out */ 165 184 wmb(); 166 185 ··· 200 219 {} 201 220 #endif 202 221 203 - /* Get total memory size in bytes */ 204 - static u32 altr_sdram_get_total_mem_size(struct regmap *mc_vbase) 222 + /* Get total memory size from Open Firmware DTB */ 223 + static unsigned long get_total_mem(void) 205 224 { 206 - u32 size, read_reg, row, bank, col, cs, width; 225 + struct device_node *np = NULL; 226 + const unsigned int *reg, *reg_end; 227 + int len, sw, aw; 228 + unsigned long start, size, total_mem = 0; 207 229 208 - if (regmap_read(mc_vbase, DRAMADDRW_OFST, &read_reg) < 0) 209 - return 0; 230 + for_each_node_by_type(np, "memory") { 231 + aw = of_n_addr_cells(np); 232 + sw = of_n_size_cells(np); 233 + reg = (const unsigned int *)of_get_property(np, "reg", &len); 234 + reg_end = reg + (len / sizeof(u32)); 210 235 211 - if (regmap_read(mc_vbase, DRAMIFWIDTH_OFST, &width) < 0) 212 - return 0; 236 + total_mem = 0; 237 + do { 238 + start = of_read_number(reg, aw); 239 + reg += aw; 240 + size = of_read_number(reg, sw); 241 + reg += sw; 242 + total_mem += size; 243 + } while (reg < 
reg_end); 244 + } 245 + edac_dbg(0, "total_mem 0x%lx\n", total_mem); 246 + return total_mem; 247 + } 213 248 214 - col = (read_reg & DRAMADDRW_COLBIT_MASK) >> 215 - DRAMADDRW_COLBIT_SHIFT; 216 - row = (read_reg & DRAMADDRW_ROWBIT_MASK) >> 217 - DRAMADDRW_ROWBIT_SHIFT; 218 - bank = (read_reg & DRAMADDRW_BANKBIT_MASK) >> 219 - DRAMADDRW_BANKBIT_SHIFT; 220 - cs = (read_reg & DRAMADDRW_CSBIT_MASK) >> 221 - DRAMADDRW_CSBIT_SHIFT; 249 + static const struct of_device_id altr_sdram_ctrl_of_match[] = { 250 + { .compatible = "altr,sdram-edac", .data = (void *)&c5_data}, 251 + { .compatible = "altr,sdram-edac-a10", .data = (void *)&a10_data}, 252 + {}, 253 + }; 254 + MODULE_DEVICE_TABLE(of, altr_sdram_ctrl_of_match); 222 255 223 - /* Correct for ECC as its not addressible */ 224 - if (width == DRAMIFWIDTH_32B_ECC) 225 - width = 32; 226 - if (width == DRAMIFWIDTH_16B_ECC) 227 - width = 16; 256 + static int a10_init(struct regmap *mc_vbase) 257 + { 258 + if (regmap_update_bits(mc_vbase, A10_INTMODE_OFST, 259 + A10_INTMODE_SB_INT, A10_INTMODE_SB_INT)) { 260 + edac_printk(KERN_ERR, EDAC_MC, 261 + "Error setting SB IRQ mode\n"); 262 + return -ENODEV; 263 + } 228 264 229 - /* calculate the SDRAM size base on this info */ 230 - size = 1 << (row + bank + col); 231 - size = size * cs * (width / 8); 232 - return size; 265 + if (regmap_write(mc_vbase, A10_SERRCNTREG_OFST, 1)) { 266 + edac_printk(KERN_ERR, EDAC_MC, 267 + "Error setting trigger count\n"); 268 + return -ENODEV; 269 + } 270 + 271 + return 0; 272 + } 273 + 274 + static int a10_unmask_irq(struct platform_device *pdev, u32 mask) 275 + { 276 + void __iomem *sm_base; 277 + int ret = 0; 278 + 279 + if (!request_mem_region(A10_SYMAN_INTMASK_CLR, sizeof(u32), 280 + dev_name(&pdev->dev))) { 281 + edac_printk(KERN_ERR, EDAC_MC, 282 + "Unable to request mem region\n"); 283 + return -EBUSY; 284 + } 285 + 286 + sm_base = ioremap(A10_SYMAN_INTMASK_CLR, sizeof(u32)); 287 + if (!sm_base) { 288 + edac_printk(KERN_ERR, EDAC_MC, 289 + "Unable 
to ioremap device\n"); 290 + 291 + ret = -ENOMEM; 292 + goto release; 293 + } 294 + 295 + iowrite32(mask, sm_base); 296 + 297 + iounmap(sm_base); 298 + 299 + release: 300 + release_mem_region(A10_SYMAN_INTMASK_CLR, sizeof(u32)); 301 + 302 + return ret; 233 303 } 234 304 235 305 static int altr_sdram_probe(struct platform_device *pdev) 236 306 { 307 + const struct of_device_id *id; 237 308 struct edac_mc_layer layers[2]; 238 309 struct mem_ctl_info *mci; 239 310 struct altr_sdram_mc_data *drvdata; 311 + const struct altr_sdram_prv_data *priv; 240 312 struct regmap *mc_vbase; 241 313 struct dimm_info *dimm; 242 - u32 read_reg, mem_size; 243 - int irq; 244 - int res = 0; 314 + u32 read_reg; 315 + int irq, irq2, res = 0; 316 + unsigned long mem_size, irqflags = 0; 245 317 246 - /* Validate the SDRAM controller has ECC enabled */ 318 + id = of_match_device(altr_sdram_ctrl_of_match, &pdev->dev); 319 + if (!id) 320 + return -ENODEV; 321 + 247 322 /* Grab the register range from the sdr controller in device tree */ 248 323 mc_vbase = syscon_regmap_lookup_by_phandle(pdev->dev.of_node, 249 324 "altr,sdr-syscon"); ··· 309 272 return -ENODEV; 310 273 } 311 274 312 - if (regmap_read(mc_vbase, CTLCFG_OFST, &read_reg) || 313 - ((read_reg & CTLCFG_ECC_AUTO_EN) != CTLCFG_ECC_AUTO_EN)) { 275 + /* Check specific dependencies for the module */ 276 + priv = of_match_node(altr_sdram_ctrl_of_match, 277 + pdev->dev.of_node)->data; 278 + 279 + /* Validate the SDRAM controller has ECC enabled */ 280 + if (regmap_read(mc_vbase, priv->ecc_ctrl_offset, &read_reg) || 281 + ((read_reg & priv->ecc_ctl_en_mask) != priv->ecc_ctl_en_mask)) { 314 282 edac_printk(KERN_ERR, EDAC_MC, 315 283 "No ECC/ECC disabled [0x%08X]\n", read_reg); 316 284 return -ENODEV; 317 285 } 318 286 319 287 /* Grab memory size from device tree. 
*/ 320 - mem_size = altr_sdram_get_total_mem_size(mc_vbase); 288 + mem_size = get_total_mem(); 321 289 if (!mem_size) { 322 - edac_printk(KERN_ERR, EDAC_MC, 323 - "Unable to calculate memory size\n"); 290 + edac_printk(KERN_ERR, EDAC_MC, "Unable to calculate memory size\n"); 324 291 return -ENODEV; 325 292 } 326 293 327 - /* Ensure the SDRAM Interrupt is disabled and cleared */ 328 - if (regmap_write(mc_vbase, DRAMINTR_OFST, DRAMINTR_INTRCLR)) { 294 + /* Ensure the SDRAM Interrupt is disabled */ 295 + if (regmap_update_bits(mc_vbase, priv->ecc_irq_en_offset, 296 + priv->ecc_irq_en_mask, 0)) { 329 297 edac_printk(KERN_ERR, EDAC_MC, 330 - "Error clearing SDRAM ECC IRQ\n"); 298 + "Error disabling SDRAM ECC IRQ\n"); 299 + return -ENODEV; 300 + } 301 + 302 + /* Toggle to clear the SDRAM Error count */ 303 + if (regmap_update_bits(mc_vbase, priv->ecc_cnt_rst_offset, 304 + priv->ecc_cnt_rst_mask, 305 + priv->ecc_cnt_rst_mask)) { 306 + edac_printk(KERN_ERR, EDAC_MC, 307 + "Error clearing SDRAM ECC count\n"); 308 + return -ENODEV; 309 + } 310 + 311 + if (regmap_update_bits(mc_vbase, priv->ecc_cnt_rst_offset, 312 + priv->ecc_cnt_rst_mask, 0)) { 313 + edac_printk(KERN_ERR, EDAC_MC, 314 + "Error clearing SDRAM ECC count\n"); 331 315 return -ENODEV; 332 316 } 333 317 ··· 358 300 "No irq %d in DT\n", irq); 359 301 return -ENODEV; 360 302 } 303 + 304 + /* Arria10 has a 2nd IRQ */ 305 + irq2 = platform_get_irq(pdev, 1); 361 306 362 307 layers[0].type = EDAC_MC_LAYER_CHIP_SELECT; 363 308 layers[0].size = 1; ··· 376 315 mci->pdev = &pdev->dev; 377 316 drvdata = mci->pvt_info; 378 317 drvdata->mc_vbase = mc_vbase; 318 + drvdata->data = priv; 379 319 platform_set_drvdata(pdev, mci); 380 320 381 321 if (!devres_open_group(&pdev->dev, NULL, GFP_KERNEL)) { 322 + edac_printk(KERN_ERR, EDAC_MC, 323 + "Unable to get managed device resource\n"); 382 324 res = -ENOMEM; 383 325 goto free; 384 326 } ··· 406 342 if (res < 0) 407 343 goto err; 408 344 345 + /* Only the Arria10 has separate IRQs 
*/ 346 + if (irq2 > 0) { 347 + /* Arria10 specific initialization */ 348 + res = a10_init(mc_vbase); 349 + if (res < 0) 350 + goto err2; 351 + 352 + res = devm_request_irq(&pdev->dev, irq2, 353 + altr_sdram_mc_err_handler, 354 + IRQF_SHARED, dev_name(&pdev->dev), mci); 355 + if (res < 0) { 356 + edac_mc_printk(mci, KERN_ERR, 357 + "Unable to request irq %d\n", irq2); 358 + res = -ENODEV; 359 + goto err2; 360 + } 361 + 362 + res = a10_unmask_irq(pdev, A10_DDR0_IRQ_MASK); 363 + if (res < 0) 364 + goto err2; 365 + 366 + irqflags = IRQF_SHARED; 367 + } 368 + 409 369 res = devm_request_irq(&pdev->dev, irq, altr_sdram_mc_err_handler, 410 - 0, dev_name(&pdev->dev), mci); 370 + irqflags, dev_name(&pdev->dev), mci); 411 371 if (res < 0) { 412 372 edac_mc_printk(mci, KERN_ERR, 413 373 "Unable to request irq %d\n", irq); ··· 439 351 goto err2; 440 352 } 441 353 442 - if (regmap_write(drvdata->mc_vbase, DRAMINTR_OFST, 443 - (DRAMINTR_INTRCLR | DRAMINTR_INTREN))) { 354 + /* Infrastructure ready - enable the IRQ */ 355 + if (regmap_update_bits(drvdata->mc_vbase, priv->ecc_irq_en_offset, 356 + priv->ecc_irq_en_mask, priv->ecc_irq_en_mask)) { 444 357 edac_mc_printk(mci, KERN_ERR, 445 358 "Error enabling SDRAM ECC IRQ\n"); 446 359 res = -ENODEV; ··· 477 388 return 0; 478 389 } 479 390 480 - static const struct of_device_id altr_sdram_ctrl_of_match[] = { 481 - { .compatible = "altr,sdram-edac", }, 482 - {}, 391 + /* 392 + * If you want to suspend, need to disable EDAC by removing it 393 + * from the device tree or defconfig. 
394 + */ 395 + #ifdef CONFIG_PM 396 + static int altr_sdram_prepare(struct device *dev) 397 + { 398 + pr_err("Suspend not allowed when EDAC is enabled.\n"); 399 + 400 + return -EPERM; 401 + } 402 + 403 + static const struct dev_pm_ops altr_sdram_pm_ops = { 404 + .prepare = altr_sdram_prepare, 483 405 }; 484 - MODULE_DEVICE_TABLE(of, altr_sdram_ctrl_of_match); 406 + #endif 485 407 486 408 static struct platform_driver altr_sdram_edac_driver = { 487 409 .probe = altr_sdram_probe, 488 410 .remove = altr_sdram_remove, 489 411 .driver = { 490 412 .name = "altr_sdram_edac", 413 + #ifdef CONFIG_PM 414 + .pm = &altr_sdram_pm_ops, 415 + #endif 491 416 .of_match_table = altr_sdram_ctrl_of_match, 492 417 }, 493 418 };
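The new `get_total_mem()` above walks every DT `memory` node and sums the size fields of its `reg` property, where each entry is an (address, size) tuple whose fields are `aw`/`sw` cells wide. A standalone sketch of that cell-walking arithmetic on host-order cells (`read_number` stands in for the kernel's `of_read_number()`):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Fold 'n' 32-bit cells into one integer, most significant first,
 * the same way of_read_number() does. */
static uint64_t read_number(const uint32_t *cells, int n)
{
	uint64_t r = 0;

	while (n--)
		r = (r << 32) | *cells++;
	return r;
}

/* Sketch of the reg-walking loop in get_total_mem(): skip each
 * address field, accumulate each size field.  'len' is the property
 * length in bytes, as returned by of_get_property(). */
static uint64_t total_mem(const uint32_t *reg, int len, int aw, int sw)
{
	const uint32_t *end = reg + len / sizeof(uint32_t);
	uint64_t total = 0;

	while (reg < end) {
		reg += aw;			/* skip the address field */
		total += read_number(reg, sw);	/* accumulate the size */
		reg += sw;
	}
	return total;
}
```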
+201
drivers/edac/altera_edac.h
··· 1 + /* 2 + * 3 + * Copyright (C) 2015 Altera Corporation 4 + * 5 + * This program is free software; you can redistribute it and/or modify it 6 + * under the terms and conditions of the GNU General Public License, 7 + * version 2, as published by the Free Software Foundation. 8 + * 9 + * This program is distributed in the hope it will be useful, but WITHOUT 10 + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 11 + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 12 + * more details. 13 + * 14 + * You should have received a copy of the GNU General Public License along with 15 + * this program. If not, see <http://www.gnu.org/licenses/>. 16 + */ 17 + 18 + #ifndef _ALTERA_EDAC_H 19 + #define _ALTERA_EDAC_H 20 + 21 + #include <linux/edac.h> 22 + #include <linux/types.h> 23 + 24 + /* SDRAM Controller CtrlCfg Register */ 25 + #define CV_CTLCFG_OFST 0x00 26 + 27 + /* SDRAM Controller CtrlCfg Register Bit Masks */ 28 + #define CV_CTLCFG_ECC_EN 0x400 29 + #define CV_CTLCFG_ECC_CORR_EN 0x800 30 + #define CV_CTLCFG_GEN_SB_ERR 0x2000 31 + #define CV_CTLCFG_GEN_DB_ERR 0x4000 32 + 33 + #define CV_CTLCFG_ECC_AUTO_EN (CV_CTLCFG_ECC_EN | \ 34 + CV_CTLCFG_ECC_CORR_EN) 35 + 36 + /* SDRAM Controller Address Width Register */ 37 + #define CV_DRAMADDRW_OFST 0x2C 38 + 39 + /* SDRAM Controller Address Widths Field Register */ 40 + #define DRAMADDRW_COLBIT_MASK 0x001F 41 + #define DRAMADDRW_COLBIT_SHIFT 0 42 + #define DRAMADDRW_ROWBIT_MASK 0x03E0 43 + #define DRAMADDRW_ROWBIT_SHIFT 5 44 + #define CV_DRAMADDRW_BANKBIT_MASK 0x1C00 45 + #define CV_DRAMADDRW_BANKBIT_SHIFT 10 46 + #define CV_DRAMADDRW_CSBIT_MASK 0xE000 47 + #define CV_DRAMADDRW_CSBIT_SHIFT 13 48 + 49 + /* SDRAM Controller Interface Data Width Register */ 50 + #define CV_DRAMIFWIDTH_OFST 0x30 51 + 52 + /* SDRAM Controller Interface Data Width Defines */ 53 + #define CV_DRAMIFWIDTH_16B_ECC 24 54 + #define CV_DRAMIFWIDTH_32B_ECC 40 55 + 56 + /* SDRAM Controller DRAM Status 
Register */ 57 + #define CV_DRAMSTS_OFST 0x38 58 + 59 + /* SDRAM Controller DRAM Status Register Bit Masks */ 60 + #define CV_DRAMSTS_SBEERR 0x04 61 + #define CV_DRAMSTS_DBEERR 0x08 62 + #define CV_DRAMSTS_CORR_DROP 0x10 63 + 64 + /* SDRAM Controller DRAM IRQ Register */ 65 + #define CV_DRAMINTR_OFST 0x3C 66 + 67 + /* SDRAM Controller DRAM IRQ Register Bit Masks */ 68 + #define CV_DRAMINTR_INTREN 0x01 69 + #define CV_DRAMINTR_SBEMASK 0x02 70 + #define CV_DRAMINTR_DBEMASK 0x04 71 + #define CV_DRAMINTR_CORRDROPMASK 0x08 72 + #define CV_DRAMINTR_INTRCLR 0x10 73 + 74 + /* SDRAM Controller Single Bit Error Count Register */ 75 + #define CV_SBECOUNT_OFST 0x40 76 + 77 + /* SDRAM Controller Double Bit Error Count Register */ 78 + #define CV_DBECOUNT_OFST 0x44 79 + 80 + /* SDRAM Controller ECC Error Address Register */ 81 + #define CV_ERRADDR_OFST 0x48 82 + 83 + /*-----------------------------------------*/ 84 + 85 + /* SDRAM Controller EccCtrl Register */ 86 + #define A10_ECCCTRL1_OFST 0x00 87 + 88 + /* SDRAM Controller EccCtrl Register Bit Masks */ 89 + #define A10_ECCCTRL1_ECC_EN 0x001 90 + #define A10_ECCCTRL1_CNT_RST 0x010 91 + #define A10_ECCCTRL1_AWB_CNT_RST 0x100 92 + #define A10_ECC_CNT_RESET_MASK (A10_ECCCTRL1_CNT_RST | \ 93 + A10_ECCCTRL1_AWB_CNT_RST) 94 + 95 + /* SDRAM Controller Address Width Register */ 96 + #define CV_DRAMADDRW 0xFFC2502C 97 + #define A10_DRAMADDRW 0xFFCFA0A8 98 + 99 + /* SDRAM Controller Address Widths Field Register */ 100 + #define DRAMADDRW_COLBIT_MASK 0x001F 101 + #define DRAMADDRW_COLBIT_SHIFT 0 102 + #define DRAMADDRW_ROWBIT_MASK 0x03E0 103 + #define DRAMADDRW_ROWBIT_SHIFT 5 104 + #define CV_DRAMADDRW_BANKBIT_MASK 0x1C00 105 + #define CV_DRAMADDRW_BANKBIT_SHIFT 10 106 + #define CV_DRAMADDRW_CSBIT_MASK 0xE000 107 + #define CV_DRAMADDRW_CSBIT_SHIFT 13 108 + 109 + #define A10_DRAMADDRW_BANKBIT_MASK 0x3C00 110 + #define A10_DRAMADDRW_BANKBIT_SHIFT 10 111 + #define A10_DRAMADDRW_GRPBIT_MASK 0xC000 112 + #define A10_DRAMADDRW_GRPBIT_SHIFT 14 
113 + #define A10_DRAMADDRW_CSBIT_MASK 0x70000 114 + #define A10_DRAMADDRW_CSBIT_SHIFT 16 115 + 116 + /* SDRAM Controller Interface Data Width Register */ 117 + #define CV_DRAMIFWIDTH 0xFFC25030 118 + #define A10_DRAMIFWIDTH 0xFFCFB008 119 + 120 + /* SDRAM Controller Interface Data Width Defines */ 121 + #define CV_DRAMIFWIDTH_16B_ECC 24 122 + #define CV_DRAMIFWIDTH_32B_ECC 40 123 + 124 + #define A10_DRAMIFWIDTH_16B 0x0 125 + #define A10_DRAMIFWIDTH_32B 0x1 126 + #define A10_DRAMIFWIDTH_64B 0x2 127 + 128 + /* SDRAM Controller DRAM IRQ Register */ 129 + #define A10_ERRINTEN_OFST 0x10 130 + 131 + /* SDRAM Controller DRAM IRQ Register Bit Masks */ 132 + #define A10_ERRINTEN_SERRINTEN 0x01 133 + #define A10_ERRINTEN_DERRINTEN 0x02 134 + #define A10_ECC_IRQ_EN_MASK (A10_ERRINTEN_SERRINTEN | \ 135 + A10_ERRINTEN_DERRINTEN) 136 + 137 + /* SDRAM Interrupt Mode Register */ 138 + #define A10_INTMODE_OFST 0x1C 139 + #define A10_INTMODE_SB_INT 1 140 + 141 + /* SDRAM Controller Error Status Register */ 142 + #define A10_INTSTAT_OFST 0x20 143 + 144 + /* SDRAM Controller Error Status Register Bit Masks */ 145 + #define A10_INTSTAT_SBEERR 0x01 146 + #define A10_INTSTAT_DBEERR 0x02 147 + 148 + /* SDRAM Controller ECC Error Address Register */ 149 + #define A10_DERRADDR_OFST 0x2C 150 + #define A10_SERRADDR_OFST 0x30 151 + 152 + /* SDRAM Controller ECC Diagnostic Register */ 153 + #define A10_DIAGINTTEST_OFST 0x24 154 + 155 + #define A10_DIAGINT_TSERRA_MASK 0x0001 156 + #define A10_DIAGINT_TDERRA_MASK 0x0100 157 + 158 + #define A10_SBERR_IRQ 34 159 + #define A10_DBERR_IRQ 32 160 + 161 + /* SDRAM Single Bit Error Count Compare Set Register */ 162 + #define A10_SERRCNTREG_OFST 0x3C 163 + 164 + #define A10_SYMAN_INTMASK_CLR 0xFFD06098 165 + #define A10_INTMASK_CLR_OFST 0x10 166 + #define A10_DDR0_IRQ_MASK BIT(17) 167 + 168 + struct altr_sdram_prv_data { 169 + int ecc_ctrl_offset; 170 + int ecc_ctl_en_mask; 171 + int ecc_cecnt_offset; 172 + int ecc_uecnt_offset; 173 + int 
ecc_stat_offset; 174 + int ecc_stat_ce_mask; 175 + int ecc_stat_ue_mask; 176 + int ecc_saddr_offset; 177 + int ecc_daddr_offset; 178 + int ecc_irq_en_offset; 179 + int ecc_irq_en_mask; 180 + int ecc_irq_clr_offset; 181 + int ecc_irq_clr_mask; 182 + int ecc_cnt_rst_offset; 183 + int ecc_cnt_rst_mask; 184 + #ifdef CONFIG_EDAC_DEBUG 185 + struct edac_dev_sysfs_attribute *eccmgr_sysfs_attr; 186 + int ecc_enable_mask; 187 + int ce_set_mask; 188 + int ue_set_mask; 189 + int ce_ue_trgr_offset; 190 + #endif 191 + }; 192 + 193 + /* Altera SDRAM Memory Controller data */ 194 + struct altr_sdram_mc_data { 195 + struct regmap *mc_vbase; 196 + int sb_irq; 197 + int db_irq; 198 + const struct altr_sdram_prv_data *data; 199 + }; 200 + 201 + #endif /* #ifndef _ALTERA_EDAC_H */
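The `altr_sdram_prv_data` tables defined above are what let one probe path and one IRQ handler serve both CycloneV and Arria10: every register offset and bit mask is looked up through the per-variant descriptor instead of being hard-coded. A reduced sketch of the pattern (the offsets and masks below are illustrative only, not the real register maps):

```c
#include <stdint.h>
#include <assert.h>

/* Per-variant descriptor, same idea as altr_sdram_prv_data but cut
 * down to the status fields.  Values are made up for illustration. */
struct variant_desc {
	int stat_offset;
	uint32_t ce_mask;	/* correctable error bit */
	uint32_t ue_mask;	/* uncorrectable error bit */
};

static const struct variant_desc c5  = { .stat_offset = 0x38, .ce_mask = 0x04, .ue_mask = 0x08 };
static const struct variant_desc a10 = { .stat_offset = 0x20, .ce_mask = 0x01, .ue_mask = 0x02 };

/* Generic handler: reads status through the descriptor rather than
 * one SoC's register map.  'regs' stands in for the regmap. */
static int classify(const struct variant_desc *d, const uint8_t *regs)
{
	uint32_t status = regs[d->stat_offset];

	if (status & d->ue_mask)
		return 2;		/* uncorrectable */
	if (status & d->ce_mask)
		return 1;		/* corrected */
	return 0;			/* no error pending */
}
```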
+7 -2
drivers/edac/edac_mc.c
··· 30 30 #include <linux/bitops.h> 31 31 #include <asm/uaccess.h> 32 32 #include <asm/page.h> 33 - #include <asm/edac.h> 34 33 #include "edac_core.h" 35 34 #include "edac_module.h" 36 35 #include <ras/ras_event.h> 36 + 37 + #ifdef CONFIG_EDAC_ATOMIC_SCRUB 38 + #include <asm/edac.h> 39 + #else 40 + #define edac_atomic_scrub(va, size) do { } while (0) 41 + #endif 37 42 38 43 /* lock to memory controller's control array */ 39 44 static DEFINE_MUTEX(mem_ctls_mutex); ··· 879 874 virt_addr = kmap_atomic(pg); 880 875 881 876 /* Perform architecture specific atomic scrub operation */ 882 - atomic_scrub(virt_addr + offset, size); 877 + edac_atomic_scrub(virt_addr + offset, size); 883 878 884 879 /* Unmap and complete */ 885 880 kunmap_atomic(virt_addr);
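With the hunk above, edac_mc.c builds on architectures that provide no atomic-scrub primitive: when CONFIG_EDAC_ATOMIC_SCRUB is unset, the call compiles away to a `do { } while (0)` stub, the standard way to make a no-op macro behave like a single statement. A minimal sketch of the same pattern outside the kernel (`HAVE_SCRUB` stands in for the Kconfig symbol):

```c
#include <assert.h>

/* If the feature is configured, declare the real implementation;
 * otherwise define a statement-safe no-op macro, as edac_mc.c does
 * for edac_atomic_scrub(). */
#ifdef HAVE_SCRUB
void scrub(void *va, unsigned size);
#else
#define scrub(va, size) do { } while (0)
#endif

static int handled;

static void handle_ce(void *page)
{
	/* The do/while(0) form keeps 'scrub(...);' legal even as the
	 * sole body of an if, whether or not the stub is in effect. */
	if (page)
		scrub(page, 4096);
	handled = 1;
}
```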
-1
drivers/edac/edac_stub.c
··· 16 16 #include <linux/edac.h> 17 17 #include <linux/atomic.h> 18 18 #include <linux/device.h> 19 - #include <asm/edac.h> 20 19 21 20 int edac_op_state = EDAC_OPSTATE_INVAL; 22 21 EXPORT_SYMBOL_GPL(edac_op_state);
+147 -34
drivers/edac/mce_amd_inj.c
··· 15 15 #include <linux/device.h> 16 16 #include <linux/module.h> 17 17 #include <linux/cpu.h> 18 + #include <linux/string.h> 19 + #include <linux/uaccess.h> 18 20 #include <asm/mce.h> 19 21 20 22 #include "mce_amd.h" ··· 26 24 */ 27 25 static struct mce i_mce; 28 26 static struct dentry *dfs_inj; 27 + 28 + static u8 n_banks; 29 + 30 + #define MAX_FLAG_OPT_SIZE 3 31 + 32 + enum injection_type { 33 + SW_INJ = 0, /* SW injection, simply decode the error */ 34 + HW_INJ, /* Trigger a #MC */ 35 + N_INJ_TYPES, 36 + }; 37 + 38 + static const char * const flags_options[] = { 39 + [SW_INJ] = "sw", 40 + [HW_INJ] = "hw", 41 + NULL 42 + }; 43 + 44 + /* Set default injection to SW_INJ */ 45 + static enum injection_type inj_type = SW_INJ; 29 46 30 47 #define MCE_INJECT_SET(reg) \ 31 48 static int inj_##reg##_set(void *data, u64 val) \ ··· 100 79 return err; 101 80 } 102 81 103 - static int flags_get(void *data, u64 *val) 82 + static int __set_inj(const char *buf) 104 83 { 105 - struct mce *m = (struct mce *)data; 84 + int i; 106 85 107 - *val = m->inject_flags; 108 - 109 - return 0; 86 + for (i = 0; i < N_INJ_TYPES; i++) { 87 + if (!strncmp(flags_options[i], buf, strlen(flags_options[i]))) { 88 + inj_type = i; 89 + return 0; 90 + } 91 + } 92 + return -EINVAL; 110 93 } 111 94 112 - static int flags_set(void *data, u64 val) 95 + static ssize_t flags_read(struct file *filp, char __user *ubuf, 96 + size_t cnt, loff_t *ppos) 113 97 { 114 - struct mce *m = (struct mce *)data; 98 + char buf[MAX_FLAG_OPT_SIZE]; 99 + int n; 115 100 116 - m->inject_flags = (u8)val; 117 - return 0; 101 + n = sprintf(buf, "%s\n", flags_options[inj_type]); 102 + 103 + return simple_read_from_buffer(ubuf, cnt, ppos, buf, n); 118 104 } 119 105 120 - DEFINE_SIMPLE_ATTRIBUTE(flags_fops, flags_get, flags_set, "%llu\n"); 106 + static ssize_t flags_write(struct file *filp, const char __user *ubuf, 107 + size_t cnt, loff_t *ppos) 108 + { 109 + char buf[MAX_FLAG_OPT_SIZE], *__buf; 110 + int err; 111 + size_t ret; 
112 + 113 + if (cnt > MAX_FLAG_OPT_SIZE) 114 + cnt = MAX_FLAG_OPT_SIZE; 115 + 116 + ret = cnt; 117 + 118 + if (copy_from_user(&buf, ubuf, cnt)) 119 + return -EFAULT; 120 + 121 + buf[cnt - 1] = 0; 122 + 123 + /* strip whitespace */ 124 + __buf = strstrip(buf); 125 + 126 + err = __set_inj(__buf); 127 + if (err) { 128 + pr_err("%s: Invalid flags value: %s\n", __func__, __buf); 129 + return err; 130 + } 131 + 132 + *ppos += ret; 133 + 134 + return ret; 135 + } 136 + 137 + static const struct file_operations flags_fops = { 138 + .read = flags_read, 139 + .write = flags_write, 140 + .llseek = generic_file_llseek, 141 + }; 121 142 122 143 /* 123 144 * On which CPU to inject? ··· 191 128 unsigned int cpu = i_mce.extcpu; 192 129 u8 b = i_mce.bank; 193 130 194 - if (!(i_mce.inject_flags & MCJ_EXCEPTION)) { 131 + if (i_mce.misc) 132 + i_mce.status |= MCI_STATUS_MISCV; 133 + 134 + if (inj_type == SW_INJ) { 195 135 amd_decode_mce(NULL, 0, &i_mce); 196 136 return; 197 137 } 198 - 199 - get_online_cpus(); 200 - if (!cpu_online(cpu)) 201 - goto err; 202 138 203 139 /* prep MCE global settings for the injection */ 204 140 mcg_status = MCG_STATUS_MCIP | MCG_STATUS_EIPV; 205 141 206 142 if (!(i_mce.status & MCI_STATUS_PCC)) 207 143 mcg_status |= MCG_STATUS_RIPV; 144 + 145 + get_online_cpus(); 146 + if (!cpu_online(cpu)) 147 + goto err; 208 148 209 149 toggle_hw_mce_inject(cpu, true); 210 150 ··· 240 174 { 241 175 struct mce *m = (struct mce *)data; 242 176 243 - if (val > 5) { 244 - if (boot_cpu_data.x86 != 0x15 || val > 6) { 245 - pr_err("Non-existent MCE bank: %llu\n", val); 246 - return -EINVAL; 247 - } 177 + if (val >= n_banks) { 178 + pr_err("Non-existent MCE bank: %llu\n", val); 179 + return -EINVAL; 248 180 } 249 181 250 182 m->bank = val; ··· 251 187 return 0; 252 188 } 253 189 254 - static int inj_bank_get(void *data, u64 *val) 255 - { 256 - struct mce *m = (struct mce *)data; 257 - 258 - *val = m->bank; 259 - return 0; 260 - } 190 + MCE_INJECT_GET(bank); 261 191 262 192 
DEFINE_SIMPLE_ATTRIBUTE(bank_fops, inj_bank_get, inj_bank_set, "%llu\n"); 193 + 194 + static const char readme_msg[] = 195 + "Description of the files and their usages:\n" 196 + "\n" 197 + "Note1: i refers to the bank number below.\n" 198 + "Note2: See respective BKDGs for the exact bit definitions of the files below\n" 199 + "as they mirror the hardware registers.\n" 200 + "\n" 201 + "status:\t Set MCi_STATUS: the bits in that MSR control the error type and\n" 202 + "\t attributes of the error which caused the MCE.\n" 203 + "\n" 204 + "misc:\t Set MCi_MISC: provide auxiliary info about the error. It is mostly\n" 205 + "\t used for error thresholding purposes and its validity is indicated by\n" 206 + "\t MCi_STATUS[MiscV].\n" 207 + "\n" 208 + "addr:\t Error address value to be written to MCi_ADDR. Log address information\n" 209 + "\t associated with the error.\n" 210 + "\n" 211 + "cpu:\t The CPU to inject the error on.\n" 212 + "\n" 213 + "bank:\t Specify the bank you want to inject the error into: the number of\n" 214 + "\t banks in a processor varies and is family/model-specific, therefore, the\n" 215 + "\t supplied value is sanity-checked. Setting the bank value also triggers the\n" 216 + "\t injection.\n" 217 + "\n" 218 + "flags:\t Injection type to be performed. Writing to this file will trigger a\n" 219 + "\t real machine check, an APIC interrupt or invoke the error decoder routines\n" 220 + "\t for AMD processors.\n" 221 + "\n" 222 + "\t Allowed error injection types:\n" 223 + "\t - \"sw\": Software error injection. Decode error to a human-readable \n" 224 + "\t format only. Safe to use.\n" 225 + "\t - \"hw\": Hardware error injection. Causes the #MC exception handler to \n" 226 + "\t handle the error. Be warned: might cause system panic if MCi_STATUS[PCC] \n" 227 + "\t is set. 
Therefore, consider setting (debugfs_mountpoint)/mce/fake_panic \n" 228 + "\t before injecting.\n" 229 + "\n"; 230 + 231 + static ssize_t 232 + inj_readme_read(struct file *filp, char __user *ubuf, 233 + size_t cnt, loff_t *ppos) 234 + { 235 + return simple_read_from_buffer(ubuf, cnt, ppos, 236 + readme_msg, strlen(readme_msg)); 237 + } 238 + 239 + static const struct file_operations readme_fops = { 240 + .read = inj_readme_read, 241 + }; 263 242 264 243 static struct dfs_node { 265 244 char *name; 266 245 struct dentry *d; 267 246 const struct file_operations *fops; 247 + umode_t perm; 268 248 } dfs_fls[] = { 269 - { .name = "status", .fops = &status_fops }, 270 - { .name = "misc", .fops = &misc_fops }, 271 - { .name = "addr", .fops = &addr_fops }, 272 - { .name = "bank", .fops = &bank_fops }, 273 - { .name = "flags", .fops = &flags_fops }, 274 - { .name = "cpu", .fops = &extcpu_fops }, 249 + { .name = "status", .fops = &status_fops, .perm = S_IRUSR | S_IWUSR }, 250 + { .name = "misc", .fops = &misc_fops, .perm = S_IRUSR | S_IWUSR }, 251 + { .name = "addr", .fops = &addr_fops, .perm = S_IRUSR | S_IWUSR }, 252 + { .name = "bank", .fops = &bank_fops, .perm = S_IRUSR | S_IWUSR }, 253 + { .name = "flags", .fops = &flags_fops, .perm = S_IRUSR | S_IWUSR }, 254 + { .name = "cpu", .fops = &extcpu_fops, .perm = S_IRUSR | S_IWUSR }, 255 + { .name = "README", .fops = &readme_fops, .perm = S_IRUSR | S_IRGRP | S_IROTH }, 275 256 }; 276 257 277 258 static int __init init_mce_inject(void) 278 259 { 279 260 int i; 261 + u64 cap; 262 + 263 + rdmsrl(MSR_IA32_MCG_CAP, cap); 264 + n_banks = cap & MCG_BANKCNT_MASK; 280 265 281 266 dfs_inj = debugfs_create_dir("mce-inject", NULL); 282 267 if (!dfs_inj) ··· 333 220 334 221 for (i = 0; i < ARRAY_SIZE(dfs_fls); i++) { 335 222 dfs_fls[i].d = debugfs_create_file(dfs_fls[i].name, 336 - S_IRUSR | S_IWUSR, 223 + dfs_fls[i].perm, 337 224 dfs_inj, 338 225 &i_mce, 339 226 dfs_fls[i].fops);
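With this series, init_mce_inject() reads the bank count from MCG_CAP at load time and inj_bank_set() rejects any bank at or above it, replacing the old hard-coded family-0x15 special case. A small sketch of that bounds check, assuming only that the Count field occupies bits 7:0 of IA32_MCG_CAP (which is what the 0xff MCG_BANKCNT_MASK encodes):

```c
#include <assert.h>
#include <stdint.h>

#define MCG_BANKCNT_MASK	0xff	/* Count field, IA32_MCG_CAP[7:0] */

/* Valid bank numbers are 0 .. n_banks - 1. */
static int bank_is_valid(uint64_t mcg_cap, uint64_t bank)
{
	uint8_t n_banks = mcg_cap & MCG_BANKCNT_MASK;

	return bank < n_banks;
}
```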
+7 -3
drivers/edac/mpc85xx_edac.c
··· 811 811 } 812 812 } 813 813 814 + #define make64(high, low) (((u64)(high) << 32) | (low)) 815 + 814 816 static void mpc85xx_mc_check(struct mem_ctl_info *mci) 815 817 { 816 818 struct mpc85xx_mc_pdata *pdata = mci->pvt_info; ··· 820 818 u32 bus_width; 821 819 u32 err_detect; 822 820 u32 syndrome; 823 - u32 err_addr; 821 + u64 err_addr; 824 822 u32 pfn; 825 823 int row_index; 826 824 u32 cap_high; ··· 851 849 else 852 850 syndrome &= 0xffff; 853 851 854 - err_addr = in_be32(pdata->mc_vbase + MPC85XX_MC_CAPTURE_ADDRESS); 852 + err_addr = make64( 853 + in_be32(pdata->mc_vbase + MPC85XX_MC_CAPTURE_EXT_ADDRESS), 854 + in_be32(pdata->mc_vbase + MPC85XX_MC_CAPTURE_ADDRESS)); 855 855 pfn = err_addr >> PAGE_SHIFT; 856 856 857 857 for (row_index = 0; row_index < mci->nr_csrows; row_index++) { ··· 890 886 mpc85xx_mc_printk(mci, KERN_ERR, 891 887 "Captured Data / ECC:\t%#8.8x_%08x / %#2.2x\n", 892 888 cap_high, cap_low, syndrome); 893 - mpc85xx_mc_printk(mci, KERN_ERR, "Err addr: %#8.8x\n", err_addr); 889 + mpc85xx_mc_printk(mci, KERN_ERR, "Err addr: %#8.8llx\n", err_addr); 894 890 mpc85xx_mc_printk(mci, KERN_ERR, "PFN: %#8.8x\n", pfn); 895 891 896 892 /* we are out of range */
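The widened error address above pairs the new extended (upper) capture register with the existing low word. A userspace mirror of the make64() combination follows; note that the cast on high must happen before the shift, and low is expected to be an unsigned 32-bit value so the OR cannot sign-extend. The page_shift parameter of the helper stands in for the kernel's PAGE_SHIFT (typically 12) and is an assumption of this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Combine the upper and lower 32-bit capture registers into one
 * 64-bit error address, as the diff's make64() does. */
#define make64(high, low) (((uint64_t)(high) << 32) | (low))

/* Derive the page frame number from the combined address. */
static uint64_t addr_to_pfn(uint64_t err_addr, unsigned int page_shift)
{
	return err_addr >> page_shift;
}
```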
+1
drivers/edac/mpc85xx_edac.h
··· 43 43 #define MPC85XX_MC_ERR_INT_EN 0x0e48 44 44 #define MPC85XX_MC_CAPTURE_ATRIBUTES 0x0e4c 45 45 #define MPC85XX_MC_CAPTURE_ADDRESS 0x0e50 46 + #define MPC85XX_MC_CAPTURE_EXT_ADDRESS 0x0e54 46 47 #define MPC85XX_MC_ERR_SBE 0x0e58 47 48 48 49 #define DSC_MEM_EN 0x80000000
+1215
drivers/edac/xgene_edac.c
··· 1 + /* 2 + * APM X-Gene SoC EDAC (error detection and correction) 3 + * 4 + * Copyright (c) 2015, Applied Micro Circuits Corporation 5 + * Author: Feng Kan <fkan@apm.com> 6 + * Loc Ho <lho@apm.com> 7 + * 8 + * This program is free software; you can redistribute it and/or modify it 9 + * under the terms of the GNU General Public License as published by the 10 + * Free Software Foundation; either version 2 of the License, or (at your 11 + * option) any later version. 12 + * 13 + * This program is distributed in the hope that it will be useful, 14 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 15 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 16 + * GNU General Public License for more details. 17 + * 18 + * You should have received a copy of the GNU General Public License 19 + * along with this program. If not, see <http://www.gnu.org/licenses/>. 20 + */ 21 + 22 + #include <linux/ctype.h> 23 + #include <linux/edac.h> 24 + #include <linux/interrupt.h> 25 + #include <linux/mfd/syscon.h> 26 + #include <linux/module.h> 27 + #include <linux/of.h> 28 + #include <linux/of_address.h> 29 + #include <linux/regmap.h> 30 + 31 + #include "edac_core.h" 32 + 33 + #define EDAC_MOD_STR "xgene_edac" 34 + 35 + /* Global error configuration status registers (CSR) */ 36 + #define PCPHPERRINTSTS 0x0000 37 + #define PCPHPERRINTMSK 0x0004 38 + #define MCU_CTL_ERR_MASK BIT(12) 39 + #define IOB_PA_ERR_MASK BIT(11) 40 + #define IOB_BA_ERR_MASK BIT(10) 41 + #define IOB_XGIC_ERR_MASK BIT(9) 42 + #define IOB_RB_ERR_MASK BIT(8) 43 + #define L3C_UNCORR_ERR_MASK BIT(5) 44 + #define MCU_UNCORR_ERR_MASK BIT(4) 45 + #define PMD3_MERR_MASK BIT(3) 46 + #define PMD2_MERR_MASK BIT(2) 47 + #define PMD1_MERR_MASK BIT(1) 48 + #define PMD0_MERR_MASK BIT(0) 49 + #define PCPLPERRINTSTS 0x0008 50 + #define PCPLPERRINTMSK 0x000C 51 + #define CSW_SWITCH_TRACE_ERR_MASK BIT(2) 52 + #define L3C_CORR_ERR_MASK BIT(1) 53 + #define MCU_CORR_ERR_MASK BIT(0) 54 + #define MEMERRINTSTS 
0x0010 55 + #define MEMERRINTMSK 0x0014 56 + 57 + struct xgene_edac { 58 + struct device *dev; 59 + struct regmap *csw_map; 60 + struct regmap *mcba_map; 61 + struct regmap *mcbb_map; 62 + struct regmap *efuse_map; 63 + void __iomem *pcp_csr; 64 + spinlock_t lock; 65 + struct dentry *dfs; 66 + 67 + struct list_head mcus; 68 + struct list_head pmds; 69 + 70 + struct mutex mc_lock; 71 + int mc_active_mask; 72 + int mc_registered_mask; 73 + }; 74 + 75 + static void xgene_edac_pcp_rd(struct xgene_edac *edac, u32 reg, u32 *val) 76 + { 77 + *val = readl(edac->pcp_csr + reg); 78 + } 79 + 80 + static void xgene_edac_pcp_clrbits(struct xgene_edac *edac, u32 reg, 81 + u32 bits_mask) 82 + { 83 + u32 val; 84 + 85 + spin_lock(&edac->lock); 86 + val = readl(edac->pcp_csr + reg); 87 + val &= ~bits_mask; 88 + writel(val, edac->pcp_csr + reg); 89 + spin_unlock(&edac->lock); 90 + } 91 + 92 + static void xgene_edac_pcp_setbits(struct xgene_edac *edac, u32 reg, 93 + u32 bits_mask) 94 + { 95 + u32 val; 96 + 97 + spin_lock(&edac->lock); 98 + val = readl(edac->pcp_csr + reg); 99 + val |= bits_mask; 100 + writel(val, edac->pcp_csr + reg); 101 + spin_unlock(&edac->lock); 102 + } 103 + 104 + /* Memory controller error CSR */ 105 + #define MCU_MAX_RANK 8 106 + #define MCU_RANK_STRIDE 0x40 107 + 108 + #define MCUGECR 0x0110 109 + #define MCU_GECR_DEMANDUCINTREN_MASK BIT(0) 110 + #define MCU_GECR_BACKUCINTREN_MASK BIT(1) 111 + #define MCU_GECR_CINTREN_MASK BIT(2) 112 + #define MUC_GECR_MCUADDRERREN_MASK BIT(9) 113 + #define MCUGESR 0x0114 114 + #define MCU_GESR_ADDRNOMATCH_ERR_MASK BIT(7) 115 + #define MCU_GESR_ADDRMULTIMATCH_ERR_MASK BIT(6) 116 + #define MCU_GESR_PHYP_ERR_MASK BIT(3) 117 + #define MCUESRR0 0x0314 118 + #define MCU_ESRR_MULTUCERR_MASK BIT(3) 119 + #define MCU_ESRR_BACKUCERR_MASK BIT(2) 120 + #define MCU_ESRR_DEMANDUCERR_MASK BIT(1) 121 + #define MCU_ESRR_CERR_MASK BIT(0) 122 + #define MCUESRRA0 0x0318 123 + #define MCUEBLRR0 0x031c 124 + #define MCU_EBLRR_ERRBANK_RD(src) 
(((src) & 0x00000007) >> 0) 125 + #define MCUERCRR0 0x0320 126 + #define MCU_ERCRR_ERRROW_RD(src) (((src) & 0xFFFF0000) >> 16) 127 + #define MCU_ERCRR_ERRCOL_RD(src) ((src) & 0x00000FFF) 128 + #define MCUSBECNT0 0x0324 129 + #define MCU_SBECNT_COUNT(src) ((src) & 0xFFFF) 130 + 131 + #define CSW_CSWCR 0x0000 132 + #define CSW_CSWCR_DUALMCB_MASK BIT(0) 133 + 134 + #define MCBADDRMR 0x0000 135 + #define MCBADDRMR_MCU_INTLV_MODE_MASK BIT(3) 136 + #define MCBADDRMR_DUALMCU_MODE_MASK BIT(2) 137 + #define MCBADDRMR_MCB_INTLV_MODE_MASK BIT(1) 138 + #define MCBADDRMR_ADDRESS_MODE_MASK BIT(0) 139 + 140 + struct xgene_edac_mc_ctx { 141 + struct list_head next; 142 + char *name; 143 + struct mem_ctl_info *mci; 144 + struct xgene_edac *edac; 145 + void __iomem *mcu_csr; 146 + u32 mcu_id; 147 + }; 148 + 149 + static ssize_t xgene_edac_mc_err_inject_write(struct file *file, 150 + const char __user *data, 151 + size_t count, loff_t *ppos) 152 + { 153 + struct mem_ctl_info *mci = file->private_data; 154 + struct xgene_edac_mc_ctx *ctx = mci->pvt_info; 155 + int i; 156 + 157 + for (i = 0; i < MCU_MAX_RANK; i++) { 158 + writel(MCU_ESRR_MULTUCERR_MASK | MCU_ESRR_BACKUCERR_MASK | 159 + MCU_ESRR_DEMANDUCERR_MASK | MCU_ESRR_CERR_MASK, 160 + ctx->mcu_csr + MCUESRRA0 + i * MCU_RANK_STRIDE); 161 + } 162 + return count; 163 + } 164 + 165 + static const struct file_operations xgene_edac_mc_debug_inject_fops = { 166 + .open = simple_open, 167 + .write = xgene_edac_mc_err_inject_write, 168 + .llseek = generic_file_llseek, 169 + }; 170 + 171 + static void xgene_edac_mc_create_debugfs_node(struct mem_ctl_info *mci) 172 + { 173 + if (!IS_ENABLED(CONFIG_EDAC_DEBUG)) 174 + return; 175 + #ifdef CONFIG_EDAC_DEBUG 176 + if (!mci->debugfs) 177 + return; 178 + debugfs_create_file("inject_ctrl", S_IWUSR, mci->debugfs, mci, 179 + &xgene_edac_mc_debug_inject_fops); 180 + #endif 181 + } 182 + 183 + static void xgene_edac_mc_check(struct mem_ctl_info *mci) 184 + { 185 + struct xgene_edac_mc_ctx *ctx = 
mci->pvt_info; 186 + unsigned int pcp_hp_stat; 187 + unsigned int pcp_lp_stat; 188 + u32 reg; 189 + u32 rank; 190 + u32 bank; 191 + u32 count; 192 + u32 col_row; 193 + 194 + xgene_edac_pcp_rd(ctx->edac, PCPHPERRINTSTS, &pcp_hp_stat); 195 + xgene_edac_pcp_rd(ctx->edac, PCPLPERRINTSTS, &pcp_lp_stat); 196 + if (!((MCU_UNCORR_ERR_MASK & pcp_hp_stat) || 197 + (MCU_CTL_ERR_MASK & pcp_hp_stat) || 198 + (MCU_CORR_ERR_MASK & pcp_lp_stat))) 199 + return; 200 + 201 + for (rank = 0; rank < MCU_MAX_RANK; rank++) { 202 + reg = readl(ctx->mcu_csr + MCUESRR0 + rank * MCU_RANK_STRIDE); 203 + 204 + /* Detect uncorrectable memory error */ 205 + if (reg & (MCU_ESRR_DEMANDUCERR_MASK | 206 + MCU_ESRR_BACKUCERR_MASK)) { 207 + /* Detected uncorrectable memory error */ 208 + edac_mc_chipset_printk(mci, KERN_ERR, "X-Gene", 209 + "MCU uncorrectable error at rank %d\n", rank); 210 + 211 + edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 212 + 1, 0, 0, 0, 0, 0, -1, mci->ctl_name, ""); 213 + } 214 + 215 + /* Detect correctable memory error */ 216 + if (reg & MCU_ESRR_CERR_MASK) { 217 + bank = readl(ctx->mcu_csr + MCUEBLRR0 + 218 + rank * MCU_RANK_STRIDE); 219 + col_row = readl(ctx->mcu_csr + MCUERCRR0 + 220 + rank * MCU_RANK_STRIDE); 221 + count = readl(ctx->mcu_csr + MCUSBECNT0 + 222 + rank * MCU_RANK_STRIDE); 223 + edac_mc_chipset_printk(mci, KERN_WARNING, "X-Gene", 224 + "MCU correctable error at rank %d bank %d column %d row %d count %d\n", 225 + rank, MCU_EBLRR_ERRBANK_RD(bank), 226 + MCU_ERCRR_ERRCOL_RD(col_row), 227 + MCU_ERCRR_ERRROW_RD(col_row), 228 + MCU_SBECNT_COUNT(count)); 229 + 230 + edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 231 + 1, 0, 0, 0, 0, 0, -1, mci->ctl_name, ""); 232 + } 233 + 234 + /* Clear all error registers */ 235 + writel(0x0, ctx->mcu_csr + MCUEBLRR0 + rank * MCU_RANK_STRIDE); 236 + writel(0x0, ctx->mcu_csr + MCUERCRR0 + rank * MCU_RANK_STRIDE); 237 + writel(0x0, ctx->mcu_csr + MCUSBECNT0 + 238 + rank * MCU_RANK_STRIDE); 239 + writel(reg, ctx->mcu_csr + 
MCUESRR0 + rank * MCU_RANK_STRIDE);
240 + }
241 +
242 + /* Detect memory controller error */
243 + reg = readl(ctx->mcu_csr + MCUGESR);
244 + if (reg) {
245 + if (reg & MCU_GESR_ADDRNOMATCH_ERR_MASK)
246 + edac_mc_chipset_printk(mci, KERN_WARNING, "X-Gene",
247 + "MCU address mismatch error\n");
248 + if (reg & MCU_GESR_ADDRMULTIMATCH_ERR_MASK)
249 + edac_mc_chipset_printk(mci, KERN_WARNING, "X-Gene",
250 + "MCU address multi-match error\n");
251 +
252 + writel(reg, ctx->mcu_csr + MCUGESR);
253 + }
254 + }
255 +
256 + static void xgene_edac_mc_irq_ctl(struct mem_ctl_info *mci, bool enable)
257 + {
258 + struct xgene_edac_mc_ctx *ctx = mci->pvt_info;
259 + unsigned int val;
260 +
261 + if (edac_op_state != EDAC_OPSTATE_INT)
262 + return;
263 +
264 + mutex_lock(&ctx->edac->mc_lock);
265 +
266 + /*
267 + * As there is only a single bit for the error enable and interrupt
268 + * mask, we must only enable the top level interrupt after all MCUs
269 + * are registered. Otherwise, if there is an error and the
270 + * corresponding MCU has not registered, the interrupt will never get
271 + * cleared. To determine that all MCUs have registered, we keep track
272 + * of active MCUs and registered MCUs.
273 + */ 274 + if (enable) { 275 + /* Set registered MCU bit */ 276 + ctx->edac->mc_registered_mask |= 1 << ctx->mcu_id; 277 + 278 + /* Enable interrupt after all active MCU registered */ 279 + if (ctx->edac->mc_registered_mask == 280 + ctx->edac->mc_active_mask) { 281 + /* Enable memory controller top level interrupt */ 282 + xgene_edac_pcp_clrbits(ctx->edac, PCPHPERRINTMSK, 283 + MCU_UNCORR_ERR_MASK | 284 + MCU_CTL_ERR_MASK); 285 + xgene_edac_pcp_clrbits(ctx->edac, PCPLPERRINTMSK, 286 + MCU_CORR_ERR_MASK); 287 + } 288 + 289 + /* Enable MCU interrupt and error reporting */ 290 + val = readl(ctx->mcu_csr + MCUGECR); 291 + val |= MCU_GECR_DEMANDUCINTREN_MASK | 292 + MCU_GECR_BACKUCINTREN_MASK | 293 + MCU_GECR_CINTREN_MASK | 294 + MUC_GECR_MCUADDRERREN_MASK; 295 + writel(val, ctx->mcu_csr + MCUGECR); 296 + } else { 297 + /* Disable MCU interrupt */ 298 + val = readl(ctx->mcu_csr + MCUGECR); 299 + val &= ~(MCU_GECR_DEMANDUCINTREN_MASK | 300 + MCU_GECR_BACKUCINTREN_MASK | 301 + MCU_GECR_CINTREN_MASK | 302 + MUC_GECR_MCUADDRERREN_MASK); 303 + writel(val, ctx->mcu_csr + MCUGECR); 304 + 305 + /* Disable memory controller top level interrupt */ 306 + xgene_edac_pcp_setbits(ctx->edac, PCPHPERRINTMSK, 307 + MCU_UNCORR_ERR_MASK | MCU_CTL_ERR_MASK); 308 + xgene_edac_pcp_setbits(ctx->edac, PCPLPERRINTMSK, 309 + MCU_CORR_ERR_MASK); 310 + 311 + /* Clear registered MCU bit */ 312 + ctx->edac->mc_registered_mask &= ~(1 << ctx->mcu_id); 313 + } 314 + 315 + mutex_unlock(&ctx->edac->mc_lock); 316 + } 317 + 318 + static int xgene_edac_mc_is_active(struct xgene_edac_mc_ctx *ctx, int mc_idx) 319 + { 320 + unsigned int reg; 321 + u32 mcu_mask; 322 + 323 + if (regmap_read(ctx->edac->csw_map, CSW_CSWCR, &reg)) 324 + return 0; 325 + 326 + if (reg & CSW_CSWCR_DUALMCB_MASK) { 327 + /* 328 + * Dual MCB active - Determine if all 4 active or just MCU0 329 + * and MCU2 active 330 + */ 331 + if (regmap_read(ctx->edac->mcbb_map, MCBADDRMR, &reg)) 332 + return 0; 333 + mcu_mask = (reg & 
MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5; 334 + } else { 335 + /* 336 + * Single MCB active - Determine if MCU0/MCU1 or just MCU0 337 + * active 338 + */ 339 + if (regmap_read(ctx->edac->mcba_map, MCBADDRMR, &reg)) 340 + return 0; 341 + mcu_mask = (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1; 342 + } 343 + 344 + /* Save active MC mask if hasn't set already */ 345 + if (!ctx->edac->mc_active_mask) 346 + ctx->edac->mc_active_mask = mcu_mask; 347 + 348 + return (mcu_mask & (1 << mc_idx)) ? 1 : 0; 349 + } 350 + 351 + static int xgene_edac_mc_add(struct xgene_edac *edac, struct device_node *np) 352 + { 353 + struct mem_ctl_info *mci; 354 + struct edac_mc_layer layers[2]; 355 + struct xgene_edac_mc_ctx tmp_ctx; 356 + struct xgene_edac_mc_ctx *ctx; 357 + struct resource res; 358 + int rc; 359 + 360 + memset(&tmp_ctx, 0, sizeof(tmp_ctx)); 361 + tmp_ctx.edac = edac; 362 + 363 + if (!devres_open_group(edac->dev, xgene_edac_mc_add, GFP_KERNEL)) 364 + return -ENOMEM; 365 + 366 + rc = of_address_to_resource(np, 0, &res); 367 + if (rc < 0) { 368 + dev_err(edac->dev, "no MCU resource address\n"); 369 + goto err_group; 370 + } 371 + tmp_ctx.mcu_csr = devm_ioremap_resource(edac->dev, &res); 372 + if (IS_ERR(tmp_ctx.mcu_csr)) { 373 + dev_err(edac->dev, "unable to map MCU resource\n"); 374 + rc = PTR_ERR(tmp_ctx.mcu_csr); 375 + goto err_group; 376 + } 377 + 378 + /* Ignore non-active MCU */ 379 + if (of_property_read_u32(np, "memory-controller", &tmp_ctx.mcu_id)) { 380 + dev_err(edac->dev, "no memory-controller property\n"); 381 + rc = -ENODEV; 382 + goto err_group; 383 + } 384 + if (!xgene_edac_mc_is_active(&tmp_ctx, tmp_ctx.mcu_id)) { 385 + rc = -ENODEV; 386 + goto err_group; 387 + } 388 + 389 + layers[0].type = EDAC_MC_LAYER_CHIP_SELECT; 390 + layers[0].size = 4; 391 + layers[0].is_virt_csrow = true; 392 + layers[1].type = EDAC_MC_LAYER_CHANNEL; 393 + layers[1].size = 2; 394 + layers[1].is_virt_csrow = false; 395 + mci = edac_mc_alloc(tmp_ctx.mcu_id, ARRAY_SIZE(layers), layers, 
396 + sizeof(*ctx)); 397 + if (!mci) { 398 + rc = -ENOMEM; 399 + goto err_group; 400 + } 401 + 402 + ctx = mci->pvt_info; 403 + *ctx = tmp_ctx; /* Copy over resource value */ 404 + ctx->name = "xgene_edac_mc_err"; 405 + ctx->mci = mci; 406 + mci->pdev = &mci->dev; 407 + mci->ctl_name = ctx->name; 408 + mci->dev_name = ctx->name; 409 + 410 + mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 | MEM_FLAG_RDDR3 | 411 + MEM_FLAG_DDR | MEM_FLAG_DDR2 | MEM_FLAG_DDR3; 412 + mci->edac_ctl_cap = EDAC_FLAG_SECDED; 413 + mci->edac_cap = EDAC_FLAG_SECDED; 414 + mci->mod_name = EDAC_MOD_STR; 415 + mci->mod_ver = "0.1"; 416 + mci->ctl_page_to_phys = NULL; 417 + mci->scrub_cap = SCRUB_FLAG_HW_SRC; 418 + mci->scrub_mode = SCRUB_HW_SRC; 419 + 420 + if (edac_op_state == EDAC_OPSTATE_POLL) 421 + mci->edac_check = xgene_edac_mc_check; 422 + 423 + if (edac_mc_add_mc(mci)) { 424 + dev_err(edac->dev, "edac_mc_add_mc failed\n"); 425 + rc = -EINVAL; 426 + goto err_free; 427 + } 428 + 429 + xgene_edac_mc_create_debugfs_node(mci); 430 + 431 + list_add(&ctx->next, &edac->mcus); 432 + 433 + xgene_edac_mc_irq_ctl(mci, true); 434 + 435 + devres_remove_group(edac->dev, xgene_edac_mc_add); 436 + 437 + dev_info(edac->dev, "X-Gene EDAC MC registered\n"); 438 + return 0; 439 + 440 + err_free: 441 + edac_mc_free(mci); 442 + err_group: 443 + devres_release_group(edac->dev, xgene_edac_mc_add); 444 + return rc; 445 + } 446 + 447 + static int xgene_edac_mc_remove(struct xgene_edac_mc_ctx *mcu) 448 + { 449 + xgene_edac_mc_irq_ctl(mcu->mci, false); 450 + edac_mc_del_mc(&mcu->mci->dev); 451 + edac_mc_free(mcu->mci); 452 + return 0; 453 + } 454 + 455 + /* CPU L1/L2 error CSR */ 456 + #define MAX_CPU_PER_PMD 2 457 + #define CPU_CSR_STRIDE 0x00100000 458 + #define CPU_L2C_PAGE 0x000D0000 459 + #define CPU_MEMERR_L2C_PAGE 0x000E0000 460 + #define CPU_MEMERR_CPU_PAGE 0x000F0000 461 + 462 + #define MEMERR_CPU_ICFECR_PAGE_OFFSET 0x0000 463 + #define MEMERR_CPU_ICFESR_PAGE_OFFSET 0x0004 464 + #define 
MEMERR_CPU_ICFESR_ERRWAY_RD(src) (((src) & 0xFF000000) >> 24) 465 + #define MEMERR_CPU_ICFESR_ERRINDEX_RD(src) (((src) & 0x003F0000) >> 16) 466 + #define MEMERR_CPU_ICFESR_ERRINFO_RD(src) (((src) & 0x0000FF00) >> 8) 467 + #define MEMERR_CPU_ICFESR_ERRTYPE_RD(src) (((src) & 0x00000070) >> 4) 468 + #define MEMERR_CPU_ICFESR_MULTCERR_MASK BIT(2) 469 + #define MEMERR_CPU_ICFESR_CERR_MASK BIT(0) 470 + #define MEMERR_CPU_LSUESR_PAGE_OFFSET 0x000c 471 + #define MEMERR_CPU_LSUESR_ERRWAY_RD(src) (((src) & 0xFF000000) >> 24) 472 + #define MEMERR_CPU_LSUESR_ERRINDEX_RD(src) (((src) & 0x003F0000) >> 16) 473 + #define MEMERR_CPU_LSUESR_ERRINFO_RD(src) (((src) & 0x0000FF00) >> 8) 474 + #define MEMERR_CPU_LSUESR_ERRTYPE_RD(src) (((src) & 0x00000070) >> 4) 475 + #define MEMERR_CPU_LSUESR_MULTCERR_MASK BIT(2) 476 + #define MEMERR_CPU_LSUESR_CERR_MASK BIT(0) 477 + #define MEMERR_CPU_LSUECR_PAGE_OFFSET 0x0008 478 + #define MEMERR_CPU_MMUECR_PAGE_OFFSET 0x0010 479 + #define MEMERR_CPU_MMUESR_PAGE_OFFSET 0x0014 480 + #define MEMERR_CPU_MMUESR_ERRWAY_RD(src) (((src) & 0xFF000000) >> 24) 481 + #define MEMERR_CPU_MMUESR_ERRINDEX_RD(src) (((src) & 0x007F0000) >> 16) 482 + #define MEMERR_CPU_MMUESR_ERRINFO_RD(src) (((src) & 0x0000FF00) >> 8) 483 + #define MEMERR_CPU_MMUESR_ERRREQSTR_LSU_MASK BIT(7) 484 + #define MEMERR_CPU_MMUESR_ERRTYPE_RD(src) (((src) & 0x00000070) >> 4) 485 + #define MEMERR_CPU_MMUESR_MULTCERR_MASK BIT(2) 486 + #define MEMERR_CPU_MMUESR_CERR_MASK BIT(0) 487 + #define MEMERR_CPU_ICFESRA_PAGE_OFFSET 0x0804 488 + #define MEMERR_CPU_LSUESRA_PAGE_OFFSET 0x080c 489 + #define MEMERR_CPU_MMUESRA_PAGE_OFFSET 0x0814 490 + 491 + #define MEMERR_L2C_L2ECR_PAGE_OFFSET 0x0000 492 + #define MEMERR_L2C_L2ESR_PAGE_OFFSET 0x0004 493 + #define MEMERR_L2C_L2ESR_ERRSYN_RD(src) (((src) & 0xFF000000) >> 24) 494 + #define MEMERR_L2C_L2ESR_ERRWAY_RD(src) (((src) & 0x00FC0000) >> 18) 495 + #define MEMERR_L2C_L2ESR_ERRCPU_RD(src) (((src) & 0x00020000) >> 17) 496 + #define 
MEMERR_L2C_L2ESR_ERRGROUP_RD(src) (((src) & 0x0000E000) >> 13)
497 + #define MEMERR_L2C_L2ESR_ERRACTION_RD(src) (((src) & 0x00001C00) >> 10)
498 + #define MEMERR_L2C_L2ESR_ERRTYPE_RD(src) (((src) & 0x00000300) >> 8)
499 + #define MEMERR_L2C_L2ESR_MULTUCERR_MASK BIT(3)
500 + #define MEMERR_L2C_L2ESR_MULTICERR_MASK BIT(2)
501 + #define MEMERR_L2C_L2ESR_UCERR_MASK BIT(1)
502 + #define MEMERR_L2C_L2ESR_ERR_MASK BIT(0)
503 + #define MEMERR_L2C_L2EALR_PAGE_OFFSET 0x0008
504 + #define CPUX_L2C_L2RTOCR_PAGE_OFFSET 0x0010
505 + #define MEMERR_L2C_L2EAHR_PAGE_OFFSET 0x000c
506 + #define CPUX_L2C_L2RTOSR_PAGE_OFFSET 0x0014
507 + #define MEMERR_L2C_L2RTOSR_MULTERR_MASK BIT(1)
508 + #define MEMERR_L2C_L2RTOSR_ERR_MASK BIT(0)
509 + #define CPUX_L2C_L2RTOALR_PAGE_OFFSET 0x0018
510 + #define CPUX_L2C_L2RTOAHR_PAGE_OFFSET 0x001c
511 + #define MEMERR_L2C_L2ESRA_PAGE_OFFSET 0x0804
512 +
513 + /*
514 + * Processor Module Domain (PMD) context - Context for a pair of processors.
515 + * Each PMD consists of 2 CPUs and a shared L2 cache. Each CPU has
516 + * its own L1 cache.
517 + */ 518 + struct xgene_edac_pmd_ctx { 519 + struct list_head next; 520 + struct device ddev; 521 + char *name; 522 + struct xgene_edac *edac; 523 + struct edac_device_ctl_info *edac_dev; 524 + void __iomem *pmd_csr; 525 + u32 pmd; 526 + int version; 527 + }; 528 + 529 + static void xgene_edac_pmd_l1_check(struct edac_device_ctl_info *edac_dev, 530 + int cpu_idx) 531 + { 532 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 533 + void __iomem *pg_f; 534 + u32 val; 535 + 536 + pg_f = ctx->pmd_csr + cpu_idx * CPU_CSR_STRIDE + CPU_MEMERR_CPU_PAGE; 537 + 538 + val = readl(pg_f + MEMERR_CPU_ICFESR_PAGE_OFFSET); 539 + if (val) { 540 + dev_err(edac_dev->dev, 541 + "CPU%d L1 memory error ICF 0x%08X Way 0x%02X Index 0x%02X Info 0x%02X\n", 542 + ctx->pmd * MAX_CPU_PER_PMD + cpu_idx, val, 543 + MEMERR_CPU_ICFESR_ERRWAY_RD(val), 544 + MEMERR_CPU_ICFESR_ERRINDEX_RD(val), 545 + MEMERR_CPU_ICFESR_ERRINFO_RD(val)); 546 + if (val & MEMERR_CPU_ICFESR_CERR_MASK) 547 + dev_err(edac_dev->dev, 548 + "One or more correctable error\n"); 549 + if (val & MEMERR_CPU_ICFESR_MULTCERR_MASK) 550 + dev_err(edac_dev->dev, "Multiple correctable error\n"); 551 + switch (MEMERR_CPU_ICFESR_ERRTYPE_RD(val)) { 552 + case 1: 553 + dev_err(edac_dev->dev, "L1 TLB multiple hit\n"); 554 + break; 555 + case 2: 556 + dev_err(edac_dev->dev, "Way select multiple hit\n"); 557 + break; 558 + case 3: 559 + dev_err(edac_dev->dev, "Physical tag parity error\n"); 560 + break; 561 + case 4: 562 + case 5: 563 + dev_err(edac_dev->dev, "L1 data parity error\n"); 564 + break; 565 + case 6: 566 + dev_err(edac_dev->dev, "L1 pre-decode parity error\n"); 567 + break; 568 + } 569 + 570 + /* Clear any HW errors */ 571 + writel(val, pg_f + MEMERR_CPU_ICFESR_PAGE_OFFSET); 572 + 573 + if (val & (MEMERR_CPU_ICFESR_CERR_MASK | 574 + MEMERR_CPU_ICFESR_MULTCERR_MASK)) 575 + edac_device_handle_ce(edac_dev, 0, 0, 576 + edac_dev->ctl_name); 577 + } 578 + 579 + val = readl(pg_f + MEMERR_CPU_LSUESR_PAGE_OFFSET); 580 + if (val) { 
581 + dev_err(edac_dev->dev, 582 + "CPU%d memory error LSU 0x%08X Way 0x%02X Index 0x%02X Info 0x%02X\n", 583 + ctx->pmd * MAX_CPU_PER_PMD + cpu_idx, val, 584 + MEMERR_CPU_LSUESR_ERRWAY_RD(val), 585 + MEMERR_CPU_LSUESR_ERRINDEX_RD(val), 586 + MEMERR_CPU_LSUESR_ERRINFO_RD(val)); 587 + if (val & MEMERR_CPU_LSUESR_CERR_MASK) 588 + dev_err(edac_dev->dev, 589 + "One or more correctable error\n"); 590 + if (val & MEMERR_CPU_LSUESR_MULTCERR_MASK) 591 + dev_err(edac_dev->dev, "Multiple correctable error\n"); 592 + switch (MEMERR_CPU_LSUESR_ERRTYPE_RD(val)) { 593 + case 0: 594 + dev_err(edac_dev->dev, "Load tag error\n"); 595 + break; 596 + case 1: 597 + dev_err(edac_dev->dev, "Load data error\n"); 598 + break; 599 + case 2: 600 + dev_err(edac_dev->dev, "WSL multihit error\n"); 601 + break; 602 + case 3: 603 + dev_err(edac_dev->dev, "Store tag error\n"); 604 + break; 605 + case 4: 606 + dev_err(edac_dev->dev, 607 + "DTB multihit from load pipeline error\n"); 608 + break; 609 + case 5: 610 + dev_err(edac_dev->dev, 611 + "DTB multihit from store pipeline error\n"); 612 + break; 613 + } 614 + 615 + /* Clear any HW errors */ 616 + writel(val, pg_f + MEMERR_CPU_LSUESR_PAGE_OFFSET); 617 + 618 + if (val & (MEMERR_CPU_LSUESR_CERR_MASK | 619 + MEMERR_CPU_LSUESR_MULTCERR_MASK)) 620 + edac_device_handle_ce(edac_dev, 0, 0, 621 + edac_dev->ctl_name); 622 + } 623 + 624 + val = readl(pg_f + MEMERR_CPU_MMUESR_PAGE_OFFSET); 625 + if (val) { 626 + dev_err(edac_dev->dev, 627 + "CPU%d memory error MMU 0x%08X Way 0x%02X Index 0x%02X Info 0x%02X %s\n", 628 + ctx->pmd * MAX_CPU_PER_PMD + cpu_idx, val, 629 + MEMERR_CPU_MMUESR_ERRWAY_RD(val), 630 + MEMERR_CPU_MMUESR_ERRINDEX_RD(val), 631 + MEMERR_CPU_MMUESR_ERRINFO_RD(val), 632 + val & MEMERR_CPU_MMUESR_ERRREQSTR_LSU_MASK ? 
"LSU" : 633 + "ICF"); 634 + if (val & MEMERR_CPU_MMUESR_CERR_MASK) 635 + dev_err(edac_dev->dev, 636 + "One or more correctable error\n"); 637 + if (val & MEMERR_CPU_MMUESR_MULTCERR_MASK) 638 + dev_err(edac_dev->dev, "Multiple correctable error\n"); 639 + switch (MEMERR_CPU_MMUESR_ERRTYPE_RD(val)) { 640 + case 0: 641 + dev_err(edac_dev->dev, "Stage 1 UTB hit error\n"); 642 + break; 643 + case 1: 644 + dev_err(edac_dev->dev, "Stage 1 UTB miss error\n"); 645 + break; 646 + case 2: 647 + dev_err(edac_dev->dev, "Stage 1 UTB allocate error\n"); 648 + break; 649 + case 3: 650 + dev_err(edac_dev->dev, 651 + "TMO operation single bank error\n"); 652 + break; 653 + case 4: 654 + dev_err(edac_dev->dev, "Stage 2 UTB error\n"); 655 + break; 656 + case 5: 657 + dev_err(edac_dev->dev, "Stage 2 UTB miss error\n"); 658 + break; 659 + case 6: 660 + dev_err(edac_dev->dev, "Stage 2 UTB allocate error\n"); 661 + break; 662 + case 7: 663 + dev_err(edac_dev->dev, 664 + "TMO operation multiple bank error\n"); 665 + break; 666 + } 667 + 668 + /* Clear any HW errors */ 669 + writel(val, pg_f + MEMERR_CPU_MMUESR_PAGE_OFFSET); 670 + 671 + edac_device_handle_ce(edac_dev, 0, 0, edac_dev->ctl_name); 672 + } 673 + } 674 + 675 + static void xgene_edac_pmd_l2_check(struct edac_device_ctl_info *edac_dev) 676 + { 677 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 678 + void __iomem *pg_d; 679 + void __iomem *pg_e; 680 + u32 val_hi; 681 + u32 val_lo; 682 + u32 val; 683 + 684 + /* Check L2 */ 685 + pg_e = ctx->pmd_csr + CPU_MEMERR_L2C_PAGE; 686 + val = readl(pg_e + MEMERR_L2C_L2ESR_PAGE_OFFSET); 687 + if (val) { 688 + val_lo = readl(pg_e + MEMERR_L2C_L2EALR_PAGE_OFFSET); 689 + val_hi = readl(pg_e + MEMERR_L2C_L2EAHR_PAGE_OFFSET); 690 + dev_err(edac_dev->dev, 691 + "PMD%d memory error L2C L2ESR 0x%08X @ 0x%08X.%08X\n", 692 + ctx->pmd, val, val_hi, val_lo); 693 + dev_err(edac_dev->dev, 694 + "ErrSyndrome 0x%02X ErrWay 0x%02X ErrCpu %d ErrGroup 0x%02X ErrAction 0x%02X\n", 695 + 
MEMERR_L2C_L2ESR_ERRSYN_RD(val), 696 + MEMERR_L2C_L2ESR_ERRWAY_RD(val), 697 + MEMERR_L2C_L2ESR_ERRCPU_RD(val), 698 + MEMERR_L2C_L2ESR_ERRGROUP_RD(val), 699 + MEMERR_L2C_L2ESR_ERRACTION_RD(val)); 700 + 701 + if (val & MEMERR_L2C_L2ESR_ERR_MASK) 702 + dev_err(edac_dev->dev, 703 + "One or more correctable error\n"); 704 + if (val & MEMERR_L2C_L2ESR_MULTICERR_MASK) 705 + dev_err(edac_dev->dev, "Multiple correctable error\n"); 706 + if (val & MEMERR_L2C_L2ESR_UCERR_MASK) 707 + dev_err(edac_dev->dev, 708 + "One or more uncorrectable error\n"); 709 + if (val & MEMERR_L2C_L2ESR_MULTUCERR_MASK) 710 + dev_err(edac_dev->dev, 711 + "Multiple uncorrectable error\n"); 712 + 713 + switch (MEMERR_L2C_L2ESR_ERRTYPE_RD(val)) { 714 + case 0: 715 + dev_err(edac_dev->dev, "Outbound SDB parity error\n"); 716 + break; 717 + case 1: 718 + dev_err(edac_dev->dev, "Inbound SDB parity error\n"); 719 + break; 720 + case 2: 721 + dev_err(edac_dev->dev, "Tag ECC error\n"); 722 + break; 723 + case 3: 724 + dev_err(edac_dev->dev, "Data ECC error\n"); 725 + break; 726 + } 727 + 728 + /* Clear any HW errors */ 729 + writel(val, pg_e + MEMERR_L2C_L2ESR_PAGE_OFFSET); 730 + 731 + if (val & (MEMERR_L2C_L2ESR_ERR_MASK | 732 + MEMERR_L2C_L2ESR_MULTICERR_MASK)) 733 + edac_device_handle_ce(edac_dev, 0, 0, 734 + edac_dev->ctl_name); 735 + if (val & (MEMERR_L2C_L2ESR_UCERR_MASK | 736 + MEMERR_L2C_L2ESR_MULTUCERR_MASK)) 737 + edac_device_handle_ue(edac_dev, 0, 0, 738 + edac_dev->ctl_name); 739 + } 740 + 741 + /* Check if any memory request timed out on L2 cache */ 742 + pg_d = ctx->pmd_csr + CPU_L2C_PAGE; 743 + val = readl(pg_d + CPUX_L2C_L2RTOSR_PAGE_OFFSET); 744 + if (val) { 745 + val_lo = readl(pg_d + CPUX_L2C_L2RTOALR_PAGE_OFFSET); 746 + val_hi = readl(pg_d + CPUX_L2C_L2RTOAHR_PAGE_OFFSET); 747 + dev_err(edac_dev->dev, 748 + "PMD%d L2C error L2C RTOSR 0x%08X @ 0x%08X.%08X\n", 749 + ctx->pmd, val, val_hi, val_lo); 750 + writel(val, pg_d + CPUX_L2C_L2RTOSR_PAGE_OFFSET); 751 + } 752 + } 753 + 754 + static 
void xgene_edac_pmd_check(struct edac_device_ctl_info *edac_dev) 755 + { 756 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 757 + unsigned int pcp_hp_stat; 758 + int i; 759 + 760 + xgene_edac_pcp_rd(ctx->edac, PCPHPERRINTSTS, &pcp_hp_stat); 761 + if (!((PMD0_MERR_MASK << ctx->pmd) & pcp_hp_stat)) 762 + return; 763 + 764 + /* Check CPU L1 error */ 765 + for (i = 0; i < MAX_CPU_PER_PMD; i++) 766 + xgene_edac_pmd_l1_check(edac_dev, i); 767 + 768 + /* Check CPU L2 error */ 769 + xgene_edac_pmd_l2_check(edac_dev); 770 + } 771 + 772 + static void xgene_edac_pmd_cpu_hw_cfg(struct edac_device_ctl_info *edac_dev, 773 + int cpu) 774 + { 775 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 776 + void __iomem *pg_f = ctx->pmd_csr + cpu * CPU_CSR_STRIDE + 777 + CPU_MEMERR_CPU_PAGE; 778 + 779 + /* 780 + * Enable CPU memory error: 781 + * MEMERR_CPU_ICFESRA, MEMERR_CPU_LSUESRA, and MEMERR_CPU_MMUESRA 782 + */ 783 + writel(0x00000301, pg_f + MEMERR_CPU_ICFECR_PAGE_OFFSET); 784 + writel(0x00000301, pg_f + MEMERR_CPU_LSUECR_PAGE_OFFSET); 785 + writel(0x00000101, pg_f + MEMERR_CPU_MMUECR_PAGE_OFFSET); 786 + } 787 + 788 + static void xgene_edac_pmd_hw_cfg(struct edac_device_ctl_info *edac_dev) 789 + { 790 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 791 + void __iomem *pg_d = ctx->pmd_csr + CPU_L2C_PAGE; 792 + void __iomem *pg_e = ctx->pmd_csr + CPU_MEMERR_L2C_PAGE; 793 + 794 + /* Enable PMD memory error - MEMERR_L2C_L2ECR and L2C_L2RTOCR */ 795 + writel(0x00000703, pg_e + MEMERR_L2C_L2ECR_PAGE_OFFSET); 796 + /* Configure L2C HW request time out feature if supported */ 797 + if (ctx->version > 1) 798 + writel(0x00000119, pg_d + CPUX_L2C_L2RTOCR_PAGE_OFFSET); 799 + } 800 + 801 + static void xgene_edac_pmd_hw_ctl(struct edac_device_ctl_info *edac_dev, 802 + bool enable) 803 + { 804 + struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info; 805 + int i; 806 + 807 + /* Enable PMD error interrupt */ 808 + if (edac_dev->op_state == OP_RUNNING_INTERRUPT) { 809 + if 
 (enable)
			xgene_edac_pcp_clrbits(ctx->edac, PCPHPERRINTMSK,
					       PMD0_MERR_MASK << ctx->pmd);
		else
			xgene_edac_pcp_setbits(ctx->edac, PCPHPERRINTMSK,
					       PMD0_MERR_MASK << ctx->pmd);
	}

	if (enable) {
		xgene_edac_pmd_hw_cfg(edac_dev);

		/* Two CPUs per PMD */
		for (i = 0; i < MAX_CPU_PER_PMD; i++)
			xgene_edac_pmd_cpu_hw_cfg(edac_dev, i);
	}
}

static ssize_t xgene_edac_pmd_l1_inject_ctrl_write(struct file *file,
						   const char __user *data,
						   size_t count, loff_t *ppos)
{
	struct edac_device_ctl_info *edac_dev = file->private_data;
	struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info;
	void __iomem *cpux_pg_f;
	int i;

	for (i = 0; i < MAX_CPU_PER_PMD; i++) {
		cpux_pg_f = ctx->pmd_csr + i * CPU_CSR_STRIDE +
			    CPU_MEMERR_CPU_PAGE;

		writel(MEMERR_CPU_ICFESR_MULTCERR_MASK |
		       MEMERR_CPU_ICFESR_CERR_MASK,
		       cpux_pg_f + MEMERR_CPU_ICFESRA_PAGE_OFFSET);
		writel(MEMERR_CPU_LSUESR_MULTCERR_MASK |
		       MEMERR_CPU_LSUESR_CERR_MASK,
		       cpux_pg_f + MEMERR_CPU_LSUESRA_PAGE_OFFSET);
		writel(MEMERR_CPU_MMUESR_MULTCERR_MASK |
		       MEMERR_CPU_MMUESR_CERR_MASK,
		       cpux_pg_f + MEMERR_CPU_MMUESRA_PAGE_OFFSET);
	}
	return count;
}

static ssize_t xgene_edac_pmd_l2_inject_ctrl_write(struct file *file,
						   const char __user *data,
						   size_t count, loff_t *ppos)
{
	struct edac_device_ctl_info *edac_dev = file->private_data;
	struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info;
	void __iomem *pg_e = ctx->pmd_csr + CPU_MEMERR_L2C_PAGE;

	writel(MEMERR_L2C_L2ESR_MULTUCERR_MASK |
	       MEMERR_L2C_L2ESR_MULTICERR_MASK |
	       MEMERR_L2C_L2ESR_UCERR_MASK |
	       MEMERR_L2C_L2ESR_ERR_MASK,
	       pg_e + MEMERR_L2C_L2ESRA_PAGE_OFFSET);
	return count;
}

static const struct file_operations xgene_edac_pmd_debug_inject_fops[] = {
	{
	  .open = simple_open,
	  .write = xgene_edac_pmd_l1_inject_ctrl_write,
	  .llseek = generic_file_llseek, },
	{
	  .open = simple_open,
	  .write = xgene_edac_pmd_l2_inject_ctrl_write,
	  .llseek = generic_file_llseek, },
	{ }
};

static void xgene_edac_pmd_create_debugfs_nodes(
	struct edac_device_ctl_info *edac_dev)
{
	struct xgene_edac_pmd_ctx *ctx = edac_dev->pvt_info;
	struct dentry *edac_debugfs;
	char name[30];

	if (!IS_ENABLED(CONFIG_EDAC_DEBUG))
		return;

	/*
	 * Todo: Switch to common EDAC debug file system for edac device
	 *       when available.
	 */
	if (!ctx->edac->dfs) {
		ctx->edac->dfs = debugfs_create_dir(edac_dev->dev->kobj.name,
						    NULL);
		if (!ctx->edac->dfs)
			return;
	}
	sprintf(name, "PMD%d", ctx->pmd);
	edac_debugfs = debugfs_create_dir(name, ctx->edac->dfs);
	if (!edac_debugfs)
		return;

	debugfs_create_file("l1_inject_ctrl", S_IWUSR, edac_debugfs, edac_dev,
			    &xgene_edac_pmd_debug_inject_fops[0]);
	debugfs_create_file("l2_inject_ctrl", S_IWUSR, edac_debugfs, edac_dev,
			    &xgene_edac_pmd_debug_inject_fops[1]);
}

static int xgene_edac_pmd_available(u32 efuse, int pmd)
{
	return (efuse & (1 << pmd)) ?
 0 : 1;
}

static int xgene_edac_pmd_add(struct xgene_edac *edac, struct device_node *np,
			      int version)
{
	struct edac_device_ctl_info *edac_dev;
	struct xgene_edac_pmd_ctx *ctx;
	struct resource res;
	char edac_name[10];
	u32 pmd;
	int rc;
	u32 val;

	if (!devres_open_group(edac->dev, xgene_edac_pmd_add, GFP_KERNEL))
		return -ENOMEM;

	/* Determine if this PMD is disabled */
	if (of_property_read_u32(np, "pmd-controller", &pmd)) {
		dev_err(edac->dev, "no pmd-controller property\n");
		rc = -ENODEV;
		goto err_group;
	}
	rc = regmap_read(edac->efuse_map, 0, &val);
	if (rc)
		goto err_group;
	if (!xgene_edac_pmd_available(val, pmd)) {
		rc = -ENODEV;
		goto err_group;
	}

	sprintf(edac_name, "l2c%d", pmd);
	edac_dev = edac_device_alloc_ctl_info(sizeof(*ctx),
					      edac_name, 1, "l2c", 1, 2, NULL,
					      0, edac_device_alloc_index());
	if (!edac_dev) {
		rc = -ENOMEM;
		goto err_group;
	}

	ctx = edac_dev->pvt_info;
	ctx->name = "xgene_pmd_err";
	ctx->pmd = pmd;
	ctx->edac = edac;
	ctx->edac_dev = edac_dev;
	ctx->ddev = *edac->dev;
	ctx->version = version;
	edac_dev->dev = &ctx->ddev;
	edac_dev->ctl_name = ctx->name;
	edac_dev->dev_name = ctx->name;
	edac_dev->mod_name = EDAC_MOD_STR;

	rc = of_address_to_resource(np, 0, &res);
	if (rc < 0) {
		dev_err(edac->dev, "no PMD resource address\n");
		goto err_free;
	}
	ctx->pmd_csr = devm_ioremap_resource(edac->dev, &res);
	if (IS_ERR(ctx->pmd_csr)) {
		dev_err(edac->dev,
			"devm_ioremap_resource failed for PMD resource address\n");
		rc = PTR_ERR(ctx->pmd_csr);
		goto err_free;
	}

	if (edac_op_state == EDAC_OPSTATE_POLL)
		edac_dev->edac_check = xgene_edac_pmd_check;

	xgene_edac_pmd_create_debugfs_nodes(edac_dev);

	rc = edac_device_add_device(edac_dev);
	if (rc > 0) {
		dev_err(edac->dev, "edac_device_add_device failed\n");
		rc = -ENOMEM;
		goto err_free;
	}

	if (edac_op_state == EDAC_OPSTATE_INT)
		edac_dev->op_state = OP_RUNNING_INTERRUPT;

	list_add(&ctx->next, &edac->pmds);

	xgene_edac_pmd_hw_ctl(edac_dev, 1);

	devres_remove_group(edac->dev, xgene_edac_pmd_add);

	dev_info(edac->dev, "X-Gene EDAC PMD%d registered\n", ctx->pmd);
	return 0;

err_free:
	edac_device_free_ctl_info(edac_dev);
err_group:
	devres_release_group(edac->dev, xgene_edac_pmd_add);
	return rc;
}

static int xgene_edac_pmd_remove(struct xgene_edac_pmd_ctx *pmd)
{
	struct edac_device_ctl_info *edac_dev = pmd->edac_dev;

	xgene_edac_pmd_hw_ctl(edac_dev, 0);
	edac_device_del_device(edac_dev->dev);
	edac_device_free_ctl_info(edac_dev);
	return 0;
}

static irqreturn_t xgene_edac_isr(int irq, void *dev_id)
{
	struct xgene_edac *ctx = dev_id;
	struct xgene_edac_pmd_ctx *pmd;
	unsigned int pcp_hp_stat;
	unsigned int pcp_lp_stat;

	xgene_edac_pcp_rd(ctx, PCPHPERRINTSTS, &pcp_hp_stat);
	xgene_edac_pcp_rd(ctx, PCPLPERRINTSTS, &pcp_lp_stat);
	if ((MCU_UNCORR_ERR_MASK & pcp_hp_stat) ||
	    (MCU_CTL_ERR_MASK & pcp_hp_stat) ||
	    (MCU_CORR_ERR_MASK & pcp_lp_stat)) {
		struct xgene_edac_mc_ctx *mcu;

		list_for_each_entry(mcu, &ctx->mcus, next) {
			xgene_edac_mc_check(mcu->mci);
		}
	}

	list_for_each_entry(pmd, &ctx->pmds, next) {
		if ((PMD0_MERR_MASK << pmd->pmd) & pcp_hp_stat)
			xgene_edac_pmd_check(pmd->edac_dev);
	}

	return IRQ_HANDLED;
}

static int xgene_edac_probe(struct platform_device *pdev)
{
	struct xgene_edac *edac;
	struct device_node *child;
	struct resource *res;
	int rc;

	edac = devm_kzalloc(&pdev->dev, sizeof(*edac), GFP_KERNEL);
	if (!edac)
		return -ENOMEM;

	edac->dev = &pdev->dev;
	platform_set_drvdata(pdev, edac);
	INIT_LIST_HEAD(&edac->mcus);
	INIT_LIST_HEAD(&edac->pmds);
	spin_lock_init(&edac->lock);
	mutex_init(&edac->mc_lock);

	edac->csw_map = syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
							"regmap-csw");
	if (IS_ERR(edac->csw_map)) {
		dev_err(edac->dev, "unable to get syscon regmap csw\n");
		rc = PTR_ERR(edac->csw_map);
		goto out_err;
	}

	edac->mcba_map = syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
							 "regmap-mcba");
	if (IS_ERR(edac->mcba_map)) {
		dev_err(edac->dev, "unable to get syscon regmap mcba\n");
		rc = PTR_ERR(edac->mcba_map);
		goto out_err;
	}

	edac->mcbb_map = syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
							 "regmap-mcbb");
	if (IS_ERR(edac->mcbb_map)) {
		dev_err(edac->dev, "unable to get syscon regmap mcbb\n");
		rc = PTR_ERR(edac->mcbb_map);
		goto out_err;
	}
	edac->efuse_map = syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
							  "regmap-efuse");
	if (IS_ERR(edac->efuse_map)) {
		dev_err(edac->dev, "unable to get syscon regmap efuse\n");
		rc = PTR_ERR(edac->efuse_map);
		goto out_err;
	}

	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	edac->pcp_csr = devm_ioremap_resource(&pdev->dev, res);
	if (IS_ERR(edac->pcp_csr)) {
		dev_err(&pdev->dev, "no PCP resource address\n");
		rc = PTR_ERR(edac->pcp_csr);
		goto out_err;
	}

	if (edac_op_state == EDAC_OPSTATE_INT) {
		int irq;
		int i;

		for (i = 0; i < 3; i++) {
			irq = platform_get_irq(pdev, i);
			if (irq < 0) {
				dev_err(&pdev->dev, "No IRQ resource\n");
				rc = -EINVAL;
				goto out_err;
			}
			rc = devm_request_irq(&pdev->dev, irq,
					      xgene_edac_isr, IRQF_SHARED,
					      dev_name(&pdev->dev), edac);
			if (rc) {
				dev_err(&pdev->dev,
					"Could not request IRQ %d\n", irq);
				goto out_err;
			}
		}
	}

	for_each_child_of_node(pdev->dev.of_node, child) {
		if (!of_device_is_available(child))
			continue;
		if (of_device_is_compatible(child, "apm,xgene-edac-mc"))
			xgene_edac_mc_add(edac, child);
		if (of_device_is_compatible(child, "apm,xgene-edac-pmd"))
			xgene_edac_pmd_add(edac, child, 1);
		if (of_device_is_compatible(child, "apm,xgene-edac-pmd-v2"))
			xgene_edac_pmd_add(edac, child, 2);
	}

	return 0;

out_err:
	return rc;
}

static int xgene_edac_remove(struct platform_device *pdev)
{
	struct xgene_edac *edac = dev_get_drvdata(&pdev->dev);
	struct xgene_edac_mc_ctx *mcu;
	struct xgene_edac_mc_ctx *temp_mcu;
	struct xgene_edac_pmd_ctx *pmd;
	struct xgene_edac_pmd_ctx *temp_pmd;

	list_for_each_entry_safe(mcu, temp_mcu, &edac->mcus, next) {
		xgene_edac_mc_remove(mcu);
	}

	list_for_each_entry_safe(pmd, temp_pmd, &edac->pmds, next) {
		xgene_edac_pmd_remove(pmd);
	}
	return 0;
}

static const struct of_device_id xgene_edac_of_match[] = {
	{ .compatible = "apm,xgene-edac" },
	{},
};
MODULE_DEVICE_TABLE(of, xgene_edac_of_match);

static struct platform_driver xgene_edac_driver = {
	.probe = xgene_edac_probe,
	.remove = xgene_edac_remove,
	.driver = {
		.name = "xgene-edac",
		.owner = THIS_MODULE,
		.of_match_table = xgene_edac_of_match,
	},
};

static int __init xgene_edac_init(void)
{
	int rc;

	/* Make sure error reporting method is sane */
	switch (edac_op_state) {
	case EDAC_OPSTATE_POLL:
	case EDAC_OPSTATE_INT:
		break;
	default:
		edac_op_state = EDAC_OPSTATE_INT;
		break;
	}

	rc = platform_driver_register(&xgene_edac_driver);
	if (rc) {
		edac_printk(KERN_ERR, EDAC_MOD_STR,
			    "EDAC fails to register\n");
		goto reg_failed;
	}

	return 0;

reg_failed:
	return rc;
}
module_init(xgene_edac_init);

static void __exit xgene_edac_exit(void)
{
	platform_driver_unregister(&xgene_edac_driver);
}
module_exit(xgene_edac_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Feng Kan <fkan@apm.com>");
MODULE_DESCRIPTION("APM X-Gene EDAC driver");
module_param(edac_op_state, int, 0444);
MODULE_PARM_DESC(edac_op_state,
		 "EDAC error reporting state: 0=Poll, 2=Interrupt");