Custom lowering DYNAMIC_STACKALLOC

Hi!

I'm a GSoC student this year, working on implementing split stacks on LLVM.

TL;DR: I'm facing some problems trying to get LLVM to generate the code
I want; please help me out if you can spare some time. It involves the
SelectionDAG, MachineInstr and liveness analysis portions.

I'm currently trying to implement alloca correctly. It essentially boils
down to checking if the current stack block has enough space to hold the
alloca'ed block of memory. If it does, we go the conventional way
(bumping RSP); otherwise we call into a function that allocates the
memory from the heap [1]. The stack pointer is not modified in the
second case.
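
A minimal sketch of that check, written as a standalone C++ model rather
than as LLVM code. StackBlock, AllocaResult, segAlloca and
runtimeHeapAlloca are hypothetical names used only for illustration;
alignment and the bookkeeping mentioned in [1] are ignored, and the
stack is assumed to grow downwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical model of one stack segment. The stack grows downwards,
// so `sp` moves towards `limit` as memory is allocated.
struct StackBlock {
  std::uintptr_t sp;     // current stack pointer (models RSP)
  std::uintptr_t limit;  // lowest usable address in this block
};

struct AllocaResult {
  void *ptr;      // pointer to the allocated memory (what ends up in RAX)
  bool fromHeap;  // true if the slow path (runtime call) was taken
};

// Hypothetical stand-in for the runtime function that allocates from the heap.
inline void *runtimeHeapAlloca(std::size_t size) { return std::malloc(size); }

inline AllocaResult segAlloca(StackBlock &block, std::size_t size) {
  assert(block.sp >= block.limit && "stack pointer below block limit");
  std::size_t avail = block.sp - block.limit;
  if (size <= avail) {
    // Fast path: enough room in the current block, so bump the stack pointer.
    block.sp -= size;
    return {reinterpret_cast<void *>(block.sp), false};
  }
  // Slow path: the stack pointer is left untouched.
  return {runtimeHeapAlloca(size), true};
}
```

With a block of 0x1000 bytes, a request for 0x800 bytes takes the fast
path and moves sp; a following request for 0x1000 bytes no longer fits
and is served from the heap while sp stays where it was.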

I am trying to implement this by:

a. Custom lowering DYNAMIC_STACKALLOC in case segmented stacks are enabled.

b. Creating an X86ISD::SEG_ALLOCA node in LowerDYNAMIC_STACKALLOC if
segmented stacks are enabled. (Right now all LowerDYNAMIC_STACKALLOC on
x86 does is check for Windows and lower the call to X86ISD::WIN_ALLOCA).

c. Having EmitLoweredSegAlloca do the checks (calling the external
function if needed) and, in both cases, write the pointer to the
allocated memory to RAX. If the function is called, nothing extra needs
to be done, since the return value stays in RAX. If the stack pointer
was changed, I do a (X86::MOV64rr, X86::RAX).addReg(X86::RSP).

d. Setting the value of the node to RAX in LowerDYNAMIC_STACKALLOC,
making the last part of the function effectively look like this:

  // Reg is RAX or EAX, based on the subtarget
  Chain = DAG.getNode(X86ISD::SEG_ALLOCA, dl, NodeTys, Chain, Flag);
  Flag = Chain.getValue(1);
  Chain = DAG.getCopyFromReg(Chain, dl, Reg, SPTy).getValue(1);

  SDValue Ops1[2] = { Chain.getValue(0), Chain };
  return DAG.getMergeValues(Ops1, 2, dl);

Firstly, I would like some feedback on this implementation in general.

Secondly, the problem I'm facing: in the final assembly generated, the
move instruction to RAX from (c) is absent. I suspect this has
something to do with the liveness analysis pass. With
-debug-only=liveintervals, I see this:

304L %vreg5<def> = COPY %RAX<kill>; GR64:%vreg5

in the basic block I jump to after allocating the memory (either after
bumping the SP or after calling the runtime). Perhaps this causes the
pass to think the assignment to RAX is not needed, and can be removed?
My guess is that it has something to do with the DAG.getCopyFromReg. How
can I fix this?

Thirdly, the comments say DYNAMIC_STACKALLOC is supposed to evaluate to
the new stack pointer. However, seeing that the actual update is done
inside the lowering, I am setting its value to a pointer to the new
block of memory. Now that I'm violating this assumption, what should I
change? Could this be the reason for the above MOV missing in my
original problem?

[1] There is some bookkeeping done to make sure we don't leak memory,
but I'll skip that detail to keep my email short.

Is SEG_ALLOCA marked as writing to RAX?

Is this code in github? It has been a long time since I looked at selection dags, but I could take a look.

btw, have you got -view-isel-dags (and the other view dags options) working? They are really handy for debugging this stuff.

Cheers,
Rafael

Hi!

> Is SEG_ALLOCA marked as writing to RAX?

It has RAX in its Defs list.

> Is this code in github? It has been a long time since I looked at
> selection dags, but I could take a look.

It is up at https://github.com/sanjoy/llvm/tree/segmented-stacks

> btw, have you got -view-isel-dags (and the other view dags options)
> working? They are really handy for debugging this stuff.

I have, but perhaps not as extensively as they can be used. I'll give
them a try again.

Hi!

I resolved this issue - the liveness analysis did not consider physical
registers live across basic blocks. This was a FIXME - I've attached a
patch in a different mail.
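
To illustrate why a block-local view of liveness can make such a def
look dead, here is a toy model. It is not LLVM's LiveIntervals code:
Instr, Block and removeDeadDefs are made up for this sketch, and the
only assumption is that a def is dropped when nothing reads it before
the end of the block and it is not in the block's live-out set.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy instruction: at most one register written, any number read.
struct Instr {
  std::string def;                // register written ("" if none)
  std::vector<std::string> uses;  // registers read
};

using Block = std::vector<Instr>;

// Backward scan over one block. `live` starts as the set of registers
// assumed live on exit from the block; a def that is not live at that
// point is treated as dead and dropped.
inline Block removeDeadDefs(const Block &block, std::set<std::string> live) {
  Block kept;
  for (auto it = block.rbegin(); it != block.rend(); ++it) {
    bool dead = !it->def.empty() && live.count(it->def) == 0;
    if (dead)
      continue;  // nobody reads this value, so the instruction goes away
    if (!it->def.empty())
      live.erase(it->def);  // the value is produced here, not live above
    for (const auto &u : it->uses)
      live.insert(u);  // operands are live above this instruction
    kept.insert(kept.begin(), *it);
  }
  return kept;
}
```

For a block containing only "RAX = MOV RSP", an empty live-out set
(physical registers not tracked across blocks) removes the move, while a
live-out set containing RAX (because the successor does
"%vreg5 = COPY %RAX") keeps it.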

Try not to use physical registers before register allocation unless you absolutely have to.

If you call a function, immediately copy the return value to a virtual register. Same thing for the stack pointer; copy it to a virtual register (using COPY, not MOV64rr).

You’ll probably need a PHI as well.

/jakob