[IA64] Cache error recovery

Similar to memory error recovery, when a cache error is consumed
by a user process, terminate that process instead of crashing the system.

Signed-off-by: Russ Anderson (rja@sgi.com)
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

authored by Russ Anderson and committed by Tony Luck 396e8e76 618b206f

+11 -21
arch/ia64/kernel/mca_drv.c
···
 		default:
 			break;
 		}
+	} else if (psp->cc && !psp->bc) {	/* Cache error */
+		status = recover_from_read_error(slidx, peidx, pbci, sos);
 	}
 
 	return status;
···
  * Return value:
  *  1 on Success / 0 on Failure
  */
-/*
- * Later we try to recover when below all conditions are satisfied.
- *  1. Only one processor error section is exist.
- *  2. BUS_CHECK is exist and the others are not exist.(Except TLB_CHECK)
- *  3. The entry of BUS_CHECK_INFO is 1.
- *  4. "External bus error" flag is set and the others are not set.
- */
 
 static int
 recover_from_processor_error(int platform, slidx_table_t *slidx,
···
 	/*
 	 * The cache check and bus check bits have four possible states
 	 *   cc bc
-	 *    0  0	Weird record, not recovered
-	 *    1  0	Cache error, not recovered
-	 *    0  1	I/O error, attempt recovery
 	 *    1  1	Memory error, attempt recovery
+	 *    1  0	Cache error, attempt recovery
+	 *    0  1	I/O error, attempt recovery
+	 *    0  0	Other error type, not recovered
 	 */
-	if (psp->bc == 0 || pbci == NULL)
-		return fatal_mca("No bus check");
+	if (psp->cc == 0 && (psp->bc == 0 || pbci == NULL))
+		return fatal_mca("No cache or bus check");
 
 	/*
-	 * Sorry, we cannot handle so many.
+	 * Cannot handle more than one bus check.
 	 */
 	if (peidx_bus_check_num(peidx) > 1)
 		return fatal_mca("Too many bus checks");
-	/*
-	 * Well, here is only one bus error.
-	 */
+
 	if (pbci->ib)
 		return fatal_mca("Internal Bus error");
-	if (pbci->cc)
-		return fatal_mca("Cache-cache error");
 	if (pbci->eb && pbci->bsi > 0)
 		return fatal_mca("External bus check fatal status");
 
 	/*
-	 * This is a local MCA and estimated as recoverble external bus error.
-	 * (e.g. a load from poisoned memory)
-	 * This means "there are some platform errors".
+	 * This is a local MCA and estimated as a recoverble error.
 	 */
 	if (platform)
 		return recover_from_platform_error(slidx, peidx, pbci, sos);
+
 	/*
 	 * On account of strange SAL error record, we cannot recover.
 	 */
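
The four-state cc/bc table in the updated comment can be read as a small classification function. The following is a user-space sketch only, not kernel code: struct psp_bits, enum mca_class, classify() and attempts_recovery() are hypothetical names introduced for illustration, and the sketch ignores the additional pbci == NULL test that the patched kernel code applies when cc is clear.

```c
/* Hypothetical stand-in for the two processor-state bits the patch tests.
 * The real kernel structure carries many more fields. */
struct psp_bits {
	unsigned int cc : 1;	/* cache check */
	unsigned int bc : 1;	/* bus check */
};

/* Hypothetical labels for the four rows of the comment table. */
enum mca_class {
	MCA_OTHER,	/* cc=0 bc=0: other error type, not recovered */
	MCA_IO,		/* cc=0 bc=1: I/O error, attempt recovery */
	MCA_CACHE,	/* cc=1 bc=0: cache error, attempt recovery */
	MCA_MEMORY	/* cc=1 bc=1: memory error, attempt recovery */
};

/* Map the (cc, bc) pair to the row of the table it selects. */
static enum mca_class classify(struct psp_bits psp)
{
	if (psp.cc && psp.bc)
		return MCA_MEMORY;
	if (psp.cc)
		return MCA_CACHE;
	if (psp.bc)
		return MCA_IO;
	return MCA_OTHER;
}

/* After this patch, only the cc=0 bc=0 row is rejected outright. */
static int attempts_recovery(enum mca_class c)
{
	return c != MCA_OTHER;
}
```

Under these assumptions, classify() applied to cc=1, bc=0 yields MCA_CACHE and attempts_recovery() returns nonzero for it, which is the row whose outcome this patch changes from "not recovered" to "attempt recovery".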