Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: Check if section present during memory block (un)registering

Tony found that, on his setup, a memory block size of 512M causes a crash
during boot.

BUG: unable to handle kernel paging request at ffffea0074000020
IP: [<ffffffff81670527>] get_nid_for_pfn+0x17/0x40
PGD 128ffcb067 PUD 128ffc9067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
...
Call Trace:
[<ffffffff81453b56>] ? register_mem_sect_under_node+0x66/0xe0
[<ffffffff81453eeb>] register_one_node+0x17b/0x240
[<ffffffff81b1f1ed>] ? pci_iommu_alloc+0x6e/0x6e
[<ffffffff81b1f229>] topology_init+0x3c/0x95
[<ffffffff8100213d>] do_one_initcall+0xcd/0x1f0

The system has non-contiguous RAM address ranges:
BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable

So the leading sections of a memory block may not be present.
For example:
memory block: [0x2c00000000, 0x2c20000000), 512M
The first three 128M sections are not present; only the last one,
[0x2c18000000, 0x2c20000000), is.
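
The arithmetic behind this example can be checked with a small user-space sketch. It is not kernel code: it assumes the x86-64 SPARSEMEM section size of 128M, treats a section as present when any usable e820 range overlaps it, and uses illustrative names (section_has_ram, leading_absent_sections) that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry: 128M sections, 512M memory blocks. */
#define SECTION_SIZE (1ULL << 27)
#define BLOCK_SIZE   (512ULL << 20)

struct ram_range {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/* The usable e820 ranges quoted above, with exclusive end addresses. */
static const struct ram_range e820_usable[] = {
	{ 0x1300000000ULL, 0x1d00000000ULL },
	{ 0x1d70000000ULL, 0x1ec7fff000ULL },
	{ 0x1f00000000ULL, 0x2c00000000ULL },
	{ 0x2c18000000ULL, 0x2d6ffff000ULL },
	{ 0x2e00000000ULL, 0x3a00000000ULL },
};

/* Illustrative rule: a section is present if any usable RAM overlaps it. */
bool section_has_ram(uint64_t sec_start)
{
	uint64_t sec_end = sec_start + SECTION_SIZE;
	size_t i;

	for (i = 0; i < sizeof(e820_usable) / sizeof(e820_usable[0]); i++)
		if (e820_usable[i].start < sec_end &&
		    e820_usable[i].end > sec_start)
			return true;
	return false;
}

/* Count the absent sections at the start of the block containing addr. */
unsigned int leading_absent_sections(uint64_t addr)
{
	uint64_t block_start = addr & ~(BLOCK_SIZE - 1);
	unsigned int n = 0;

	assert(BLOCK_SIZE % SECTION_SIZE == 0);
	while (n < BLOCK_SIZE / SECTION_SIZE &&
	       !section_has_ram(block_start + n * SECTION_SIZE))
		n++;
	return n;
}
```

For the 512M block that contains 0x2c18000000, leading_absent_sections() counts three absent sections before the first one backed by RAM.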

The current register_mem_sect_under_node() assumes that the first section
is present, but the memory block's section number range
[start_section_nr, end_section_nr] can include sections that are not present.

On architectures that support vmemmap, the memmap (the struct page area) is
not set up for sections that are not present, so reading struct page for a
pfn in such a section faults, as in the oops above.

So skip the pfn ranges that belong to absent sections.
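
A minimal user-space sketch of that skip, under stated assumptions: 4K pages and 128M sections (32768 pages per section), a hypothetical callback standing in for present_section_nr(), and a ROUND_DOWN macro that mirrors the kernel's round_down() for power-of-two alignments. It only counts the pfns that would reach get_nid_for_pfn().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed geometry: 4K pages, 128M sections. */
#define PAGES_PER_SECTION 32768UL

/* Power-of-two round down, same result as the kernel's round_down(). */
#define ROUND_DOWN(x, y) ((x) & ~((y) - 1))

/* Hypothetical stand-in for present_section_nr(). */
typedef bool (*present_fn)(unsigned long section_nr);

/*
 * Walk [start_pfn, end_pfn] and count the pfns that are actually visited.
 * Pfns in absent sections are skipped one whole section at a time.
 */
unsigned long count_visited_pfns(unsigned long start_pfn,
				 unsigned long end_pfn, present_fn present)
{
	unsigned long pfn, visited = 0;

	assert(start_pfn <= end_pfn);
	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		if (!present(pfn / PAGES_PER_SECTION)) {
			/* Land on the section's last pfn; pfn++ moves on. */
			pfn = ROUND_DOWN(pfn + PAGES_PER_SECTION,
					 PAGES_PER_SECTION) - 1;
			continue;
		}
		visited++;
	}
	return visited;
}

/* Example layout: only the fourth section (index 3) is present. */
bool only_section_three(unsigned long section_nr)
{
	return section_nr == 3;
}
```

Setting pfn to the last pfn of the absent section, rather than the first pfn of the next one, lets the loop's own pfn++ do the final step, so the next section still gets its presence check.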

Also fix unregister_mem_sect_under_nodes(), which assumes one section per
memory block.
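
The difference between the two pfn ranges can be sketched as follows. This is an illustration only: struct mem_blk_sketch and the *_sketch helper are hypothetical stand-ins for struct memory_block and section_nr_to_pfn(), and the page and section sizes are the assumed x86-64 values (4K pages, 128M sections).

```c
#include <assert.h>

/* Assumed geometry: 4K pages, 128M sections. */
#define PAGES_PER_SECTION 32768UL

/* Inclusive pfn range. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for the section fields of struct memory_block. */
struct mem_blk_sketch {
	unsigned long start_section_nr;
	unsigned long end_section_nr;
};

/* Stand-in for section_nr_to_pfn(). */
unsigned long section_nr_to_pfn_sketch(unsigned long nr)
{
	return nr * PAGES_PER_SECTION;
}

/* Old behaviour: the range covers exactly one section. */
struct pfn_range one_section_range(unsigned long phys_index)
{
	struct pfn_range r;

	r.start = section_nr_to_pfn_sketch(phys_index);
	r.end = r.start + PAGES_PER_SECTION - 1;
	return r;
}

/* New behaviour: the range covers every section of the block. */
struct pfn_range whole_block_range(const struct mem_blk_sketch *blk)
{
	struct pfn_range r;

	assert(blk->start_section_nr <= blk->end_section_nr);
	r.start = section_nr_to_pfn_sketch(blk->start_section_nr);
	r.end = section_nr_to_pfn_sketch(blk->end_section_nr);
	r.end += PAGES_PER_SECTION - 1;
	return r;
}
```

With a four-section block (sections 0x580 to 0x583 under the assumed geometry), the one-section range covers a quarter of the pfns that the whole-block range does.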

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@vger.kernel.org #v3.15
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Authored by Yinghai Lu and committed by Greg Kroah-Hartman
7568fb63 1f35d04a

+27 -4
drivers/base/node.c
···
 	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
 	sect_end_pfn += PAGES_PER_SECTION - 1;
 	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
-		int page_nid;
+		int page_nid, scn_nr;
+
+		/*
+		 * memory block could have several absent sections from start.
+		 * skip pfn range from absent section
+		 */
+		scn_nr = pfn_to_section_nr(pfn);
+		if (!present_section_nr(scn_nr)) {
+			pfn = round_down(pfn + PAGES_PER_SECTION,
+					 PAGES_PER_SECTION) - 1;
+			continue;
+		}
 
 		/*
 		 * memory block could have several absent sections from start.
···
 		return -ENOMEM;
 	nodes_clear(*unlinked_nodes);
 
-	sect_start_pfn = section_nr_to_pfn(phys_index);
-	sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
+	sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
+	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
+	sect_end_pfn += PAGES_PER_SECTION - 1;
 	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
-		int nid;
+		int nid, scn_nr;
+
+		/*
+		 * memory block could have several absent sections from start.
+		 * skip pfn range from absent section
+		 */
+		scn_nr = pfn_to_section_nr(pfn);
+		if (!present_section_nr(scn_nr)) {
+			pfn = round_down(pfn + PAGES_PER_SECTION,
+					 PAGES_PER_SECTION) - 1;
+			continue;
+		}
 
 		nid = get_nid_for_pfn(pfn);
 		if (nid < 0)