commit eb92f5a745014532b83abfba04602fce87ca8393
Author: Chuang-Yu Cheng <cycheng@multicorewareinc.com>
Date:   Fri Apr 8 12:04:32 2016 +0000

    CXX_FAST_TLS calling convention: performance improvement for PPC64

    This is the same change on PPC64 as r255821 on AArch64. I have even borrowed
    his commit message.

    The access function has a short entry and a short exit; the initialization
    block is only run the first time. To improve the performance, we want to
    have a short frame at the entry and exit.

    We explicitly handle most of the CSRs via copies. Only the CSRs that are not
    handled via copies will be in CSR_SaveList.

    Frame lowering and prologue/epilogue insertion will generate a short frame
    in the entry and exit according to CSR_SaveList. The majority of the CSRs will
    be handled by the register allocator, which will try to spill and
    reload them in the initialization block.

    We add CSRsViaCopy, which will be explicitly handled during lowering.

    1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
       supports it for the given machine function and the function has only return
       exits). We also call TLI->initializeSplitCSR to perform initialization.
    2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
       virtual registers at the beginning of the entry block and copies from virtual
       registers to CSRsViaCopy at the beginning of the exit blocks.
    3> we also need to make sure the explicit copies will not be eliminated.

    Author: Tom Jablin (tjablin)
    Reviewers: hfinkel kbarton cycheng

    http://reviews.llvm.org/D17533

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265781 91177308-0d34-0410-b5e6-96231b3b80d8

diff --git a/lib/CodeGen/TargetFrameLoweringImpl.cpp b/lib/CodeGen/TargetFrameLoweringImpl.cpp
index 679ade1..0a0e079 100644
--- a/lib/CodeGen/TargetFrameLoweringImpl.cpp
+++ b/lib/CodeGen/TargetFrameLoweringImpl.cpp
@@ -63,12 +63,15 @@ void TargetFrameLowering::determineCalleeSaves(MachineFunction &MF,
   const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   const MCPhysReg *CSRegs = TRI.getCalleeSavedRegs(&MF);
 
+  // Resize before the early returns. Some backends expect that
+  // SavedRegs.size() == TRI.getNumRegs() after this call even if there are no
+  // saved registers.
+  SavedRegs.resize(TRI.getNumRegs());
+
   // Early exit if there are no callee saved registers.
   if (!CSRegs || CSRegs[0] == 0)
     return;
 
-  SavedRegs.resize(TRI.getNumRegs());
-
   // In Naked functions we aren't going to save any registers.
   if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
     return;