Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

[PATCH] EDAC: Add memory scrubbing controls API to core

This is an attempt to provide an interface for memory scrubbing control in
EDAC.

This patch modifies the EDAC Core to provide the interface for memory
controller modules to implement.

The following things are still outstanding:

- K8 is the first implementation,

The patch provides a method of configuring the K8 hardware memory scrubber
via the 'mcX' sysfs directory. There should be some fallback to a generic
scrubber implemented in software if the hardware does not support
scrubbing.

Or .. the scrubbing sysfs entry should not be visible at all.

- Only works with SDRAM, not cache,

The K8 can scrub cache and l2cache also - but I think this is not so
useful as the cache is busy all the time (one hopes).

One would also expect that cache scrubbing requires hardware support.

- Error Handling,

I would like errors to be returned to the user in "terms of file
system".

- Presentation,

I chose Bandwidth in Bytes/Second as a representation of the scrubbing
rate for the following reasons:

I like that the sysfs entries are sort-of textual, related to something
that makes sense instead of magical values that must be looked up.

"My People" want "% main memory scrubbed per hour"; others prefer "%
memory bandwidth used" as representation. "Bandwidth used" makes it easy to
calculate both versions in one-liner scripts.

If one later wants to scrub cache, the scaling becomes weird for K8,
changing from "blocks of 64 byte memory" to "blocks of 64 cache lines" to
"blocks of 64 bit". Using "bandwidth used" makes sense in all three cases
(I.M.O. anyway ;-).

- Discovery,

There is no way to discover the possible settings and what they do
without reading the code and the documentation.

*I* do not know how to make that work in a practical way.

- Bugs(??),

Other tools can set invalid values in the memory scrub control register;
those will read back as '-1', requiring the user to reset the scrub rate.
This is how *I* think it should be.

- Afflicting other areas of code,

I made changes to edac_mc.c and edac_mc.h which will show up globally -
this is not nice; it would be better if the memory scrubbing functionality
and interface could be entirely contained within the memory controller it
applies to.
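
As a rough illustration of the point made under "Presentation": with the
rate expressed in bytes/second, both alternative representations are one
multiplication and one division away. The sketch below is not part of the
patch; the function names and the memory-size and bus-bandwidth inputs are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Percentage of main memory scrubbed in one hour.
 * rate_bps:  scrub bandwidth in bytes/second (the unit used by the patch).
 * mem_bytes: installed memory in bytes (hypothetical input; the patch does
 *            not expose this value).
 */
double scrub_pct_memory_per_hour(uint64_t rate_bps, uint64_t mem_bytes)
{
	assert(mem_bytes > 0);
	return 100.0 * (double)rate_bps * 3600.0 / (double)mem_bytes;
}

/*
 * Percentage of memory bandwidth consumed by scrubbing.
 * bus_bps: peak memory bandwidth in bytes/second (hypothetical input).
 */
double scrub_pct_bandwidth(uint64_t rate_bps, uint64_t bus_bps)
{
	assert(bus_bps > 0);
	return 100.0 * (double)rate_bps / (double)bus_bps;
}
```

For instance, a rate of 1 MiB/s on a machine with 4 GiB of memory works out
to roughly 87.9% of main memory scrubbed per hour.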

Frithiof Jensen

edac_mc.c and its .h file are a CORE helper module for EDAC
driver modules. This provides the abstraction for device-specific
drivers. It is fine to modify this CORE to provide help for
new features of the drivers.

doug thompson

Signed-off-by: Frithiof Jensen <frithiof.jensen@ericson.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Frithiof Jensen and committed by Linus Torvalds
4f423ddf 84db003f

+82 -1
+15 -1
Documentation/drivers/edac/edac.txt
@@ -339,7 +339,21 @@
 
 	'device'
 
-		Symlink to the memory controller device
+		Symlink to the memory controller device.
+
+Sdram memory scrubbing rate:
+
+	'sdram_scrub_rate'
+
+		Read/Write attribute file that controls memory scrubbing. The scrubbing
+		rate is set by writing a minimum bandwith in bytes/sec to the attribute
+		file. The rate will be translated to an internal value that gives at
+		least the specified rate.
+
+		Reading the file will return the actual scrubbing rate employed.
+
+		If configuration fails or memory scrubbing is not implemented, the value
+		of the attribute file will be -1.
 
 
 
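
The documentation hunk above says the attribute reads back the actual rate,
or -1 when configuration failed or scrubbing is not implemented. A user-space
reader could interpret the file contents along these lines. This is only a
sketch: the function name is made up for the example, and treating every
negative value as "unavailable" is an assumption rather than something the
patch specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the text read from an 'sdram_scrub_rate' attribute file.
 * Returns 0 and stores the rate (bytes/sec) in *rate_bps on success.
 * Returns -1 if the text is not a number or holds the documented
 * failure value -1 (assumption: any negative value means "unavailable").
 */
int parse_scrub_rate(const char *text, unsigned long *rate_bps)
{
	char *end;
	long long value;

	assert(text != NULL && rate_bps != NULL);

	errno = 0;
	value = strtoll(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;	/* not a number */
	if (*end != '\n' && *end != '\0')
		return -1;	/* trailing garbage after the number */
	if (value < 0)
		return -1;	/* failed or not implemented */

	*rate_bps = (unsigned long)value;
	return 0;
}
```

The text would typically come from the attribute file in the 'mcX' sysfs
directory of the memory controller in question; the exact path depends on
the system and is not shown here.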
+55
drivers/edac/edac_mc.c
@@ -927,6 +927,57 @@
 	return count;
 }
 
+/* memory scrubbing */
+static ssize_t mci_sdram_scrub_rate_store(struct mem_ctl_info *mci,
+					const char *data, size_t count)
+{
+	u32 bandwidth = -1;
+
+	if (mci->set_sdram_scrub_rate) {
+
+		memctrl_int_store(&bandwidth, data, count);
+
+		if (!(*mci->set_sdram_scrub_rate)(mci, &bandwidth)) {
+			edac_printk(KERN_DEBUG, EDAC_MC,
+				"Scrub rate set successfully, applied: %d\n",
+				bandwidth);
+		} else {
+			/* FIXME: error codes maybe? */
+			edac_printk(KERN_DEBUG, EDAC_MC,
+				"Scrub rate set FAILED, could not apply: %d\n",
+				bandwidth);
+		}
+	} else {
+		/* FIXME: produce "not implemented" ERROR for user-side. */
+		edac_printk(KERN_WARNING, EDAC_MC,
+			"Memory scrubbing 'set'control is not implemented!\n");
+	}
+	return count;
+}
+
+static ssize_t mci_sdram_scrub_rate_show(struct mem_ctl_info *mci, char *data)
+{
+	u32 bandwidth = -1;
+
+	if (mci->get_sdram_scrub_rate) {
+		if (!(*mci->get_sdram_scrub_rate)(mci, &bandwidth)) {
+			edac_printk(KERN_DEBUG, EDAC_MC,
+				"Scrub rate successfully, fetched: %d\n",
+				bandwidth);
+		} else {
+			/* FIXME: error codes maybe? */
+			edac_printk(KERN_DEBUG, EDAC_MC,
+				"Scrub rate fetch FAILED, got: %d\n",
+				bandwidth);
+		}
+	} else {
+		/* FIXME: produce "not implemented" ERROR for user-side. */
+		edac_printk(KERN_WARNING, EDAC_MC,
+			"Memory scrubbing 'get' control is not implemented!\n");
+	}
+	return sprintf(data, "%d\n", bandwidth);
+}
+
 /* default attribute files for the MCI object */
 static ssize_t mci_ue_count_show(struct mem_ctl_info *mci, char *data)
 {
@@ -1033,6 +1084,9 @@
 MCIDEV_ATTR(ue_count,S_IRUGO,mci_ue_count_show,NULL);
 MCIDEV_ATTR(ce_count,S_IRUGO,mci_ce_count_show,NULL);
 
+/* memory scrubber attribute file */
+MCIDEV_ATTR(sdram_scrub_rate,S_IRUGO|S_IWUSR,mci_sdram_scrub_rate_show,mci_sdram_scrub_rate_store);
+
 static struct mcidev_attribute *mci_attr[] = {
 	&mci_attr_reset_counters,
 	&mci_attr_mc_name,
@@ -1042,6 +1096,7 @@
 	&mci_attr_ce_noinfo_count,
 	&mci_attr_ue_count,
 	&mci_attr_ce_count,
+	&mci_attr_sdram_scrub_rate,
 	NULL
 };
 
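
The store routine above always returns `count`, and its FIXME comments ask
whether error codes should be produced instead. One possible shape for that,
modelled in plain user-space C so it can be compiled outside the kernel, is
sketched below. Everything here is hypothetical: `struct mci_model`, the
function names and the choice of `ENOSYS`/`EINVAL` are assumptions made for
illustration, not what the patch or the EDAC core does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mem_ctl_info, reduced to one callback. */
struct mci_model {
	int (*set_sdram_scrub_rate)(struct mci_model *mci, uint32_t *bw);
};

/*
 * Model of a store routine that reports failures as negative errno values
 * instead of silently returning count.
 */
long model_scrub_rate_store(struct mci_model *mci, const char *data,
			    size_t count)
{
	char *end;
	unsigned long value;
	uint32_t bandwidth;

	assert(mci != NULL && data != NULL);

	if (!mci->set_sdram_scrub_rate)
		return -ENOSYS;		/* no driver support (assumed code) */

	errno = 0;
	value = strtoul(data, &end, 10);
	if (errno != 0 || end == data || value > UINT32_MAX)
		return -EINVAL;		/* unparsable or out of range */

	bandwidth = (uint32_t)value;
	if (mci->set_sdram_scrub_rate(mci, &bandwidth) != 0)
		return -EINVAL;		/* driver rejected the rate */

	return (long)count;
}

/* Trivial callbacks used only to exercise the model. */
int accept_any_rate(struct mci_model *mci, uint32_t *bw)
{
	(void)mci;
	(void)bw;
	return 0;
}

int reject_any_rate(struct mci_model *mci, uint32_t *bw)
{
	(void)mci;
	(void)bw;
	return -1;
}
```

With this shape, a failed write would surface to the user as an ordinary
errno from write(2), which seems close to the "terms of file system" wish
stated in the commit message.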
+12
drivers/edac/edac_mc.h
@@ -315,6 +315,18 @@
 	unsigned long scrub_cap;	/* chipset scrub capabilities */
 	enum scrub_type scrub_mode;	/* current scrub mode */
 
+	/* Translates sdram memory scrub rate given in bytes/sec to the
+	   internal representation and configures whatever else needs
+	   to be configured.
+	*/
+	int (*set_sdram_scrub_rate) (struct mem_ctl_info *mci, u32 *bw);
+
+	/* Get the current sdram memory scrub rate from the internal
+	   representation and converts it to the closest matching
+	   bandwith in bytes/sec.
+	*/
+	int (*get_sdram_scrub_rate) (struct mem_ctl_info *mci, u32 *bw);
+
 	/* pointer to edac checking routine */
 	void (*edac_check) (struct mem_ctl_info * mci);
 	/*
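
The two callbacks above leave the translation between bytes/sec and the
hardware's internal representation to each driver. A minimal user-space
sketch of one way a driver might do this is a lookup table that picks the
slowest setting still meeting the requested minimum. The table contents,
register codes and all names below are invented for illustration and do not
come from the K8 or any other datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rate table, sorted by ascending bandwidth. */
struct scrub_rate_entry {
	uint32_t reg_code;	/* value a driver would program into hardware */
	uint32_t bandwidth;	/* resulting scrub rate in bytes/sec */
};

static const struct scrub_rate_entry scrub_rates[] = {
	{ 0x03,   100000 },
	{ 0x02,  1000000 },
	{ 0x01, 10000000 },
};

#define NUM_SCRUB_RATES (sizeof(scrub_rates) / sizeof(scrub_rates[0]))

/* Stand-in for the hardware scrub control register. */
uint32_t fake_scrub_reg;

/*
 * Same contract as set_sdram_scrub_rate(): *bw holds the requested minimum
 * on entry and the rate actually applied on return.
 * Returns 0 on success, -1 if no setting is fast enough.
 */
int model_set_sdram_scrub_rate(uint32_t *bw)
{
	size_t i;

	assert(bw != NULL);

	for (i = 0; i < NUM_SCRUB_RATES; i++) {
		if (scrub_rates[i].bandwidth >= *bw) {
			fake_scrub_reg = scrub_rates[i].reg_code;
			*bw = scrub_rates[i].bandwidth;
			return 0;
		}
	}
	return -1;
}

/*
 * Same contract as get_sdram_scrub_rate(): converts the register contents
 * back to bytes/sec. Returns -1 for a code that is not in the table, which
 * is how an invalid value written by another tool would show up.
 */
int model_get_sdram_scrub_rate(uint32_t *bw)
{
	size_t i;

	assert(bw != NULL);

	for (i = 0; i < NUM_SCRUB_RATES; i++) {
		if (scrub_rates[i].reg_code == fake_scrub_reg) {
			*bw = scrub_rates[i].bandwidth;
			return 0;
		}
	}
	return -1;
}
```

Requesting 500000 bytes/sec against this table would apply the 1000000
bytes/sec setting, matching the documented behaviour of giving at least the
specified rate.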