LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 49600 - probe-stack=inline-asm will produce invalid uwtables
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86
Version: trunk
Hardware: PC All
Importance: P normal
Assignee: Unassigned LLVM Bugs
 
Reported: 2021-03-15 17:53 PDT by simonas+llvm.org
Modified: 2021-03-15 17:56 PDT

Description simonas+llvm.org 2021-03-15 17:53:21 PDT
Given a function such as this:

; RUN: llc < %s
define void @big_stack() "probe-stack"="inline-asm" uwtable {
start:
  %_two_page_stack = alloca [8192 x i8], align 1
  ret void
}

the following assembly will be generated:

big_stack:
	.cfi_startproc
	subq	$4096, %rsp
	movq	$0, (%rsp)
	subq	$3968, %rsp
	.cfi_def_cfa_offset 8072
	addq	$8064, %rsp
	.cfi_def_cfa_offset 8
	retq


Here the unwind tables are inaccurate while stack probing is in progress: `rsp` is adjusted, but the CFA offset is not updated to match. As a result, attempts to obtain a stack trace will fail if the current instruction lies between the instructions that implement the stack probing.
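
To make the mismatch concrete, here is the same listing annotated by hand. Each comment describes the state when execution is stopped at that instruction, before it runs: the real distance from `rsp` to the CFA, and what the unwind table claims. The numbers are my own arithmetic, assuming the usual offset of 8 at function entry for the return address; they are not compiler output, and `.cfi_endproc` is added only so that the snippet assembles.

```asm
big_stack:
	.cfi_startproc
	subq	$4096, %rsp		# actual rsp+8,    table rsp+8     (ok)
	movq	$0, (%rsp)		# actual rsp+4104, table rsp+8     (stale)
	subq	$3968, %rsp		# actual rsp+4104, table rsp+8     (stale)
	.cfi_def_cfa_offset 8072
	addq	$8064, %rsp		# actual rsp+8072, table rsp+8072  (ok)
	.cfi_def_cfa_offset 8
	retq				# actual rsp+8,    table rsp+8     (ok)
	.cfi_endproc
```

An unwinder stopped at either of the two stale instructions would read the return address from `rsp`, which at that point holds stack garbage or the probe's zero.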

This also occurs with the non-unrolled implementation of the stack probing:

; RUN: llc < %s
define void @big_stack() "probe-stack"="inline-asm" uwtable {
start:
  %_two_page_stack = alloca [64000 x i8], align 1
  ret void
}

--->

big_stack:
	.cfi_startproc
	movq	%rsp, %r11
	subq	$61440, %r11
.LBB0_1:
	subq	$4096, %rsp
	movq	$0, (%rsp)
	cmpq	%r11, %rsp
	jne	.LBB0_1
	subq	$2432, %rsp
	.cfi_def_cfa_offset 63880
	addq	$63872, %rsp
	.cfi_def_cfa_offset 8
	retq

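In the loop case, however, the solution needs to involve allocating a separate register for the CFA: inserting `.cfi` directives inside the loop won't help, because each directive applies to a fixed instruction address while the distance from `rsp` to the CFA changes on every iteration.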
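
One possible shape for the loop case, as a sketch only: since `%r11` already holds the loop bound and stays constant while the loop runs, the CFA could be expressed relative to `%r11` for the duration of the loop and switched back to `rsp` afterwards. The choice of `%r11` and the placement of the directives are my assumptions, not the output of any LLVM version, and `.cfi_endproc` is added only so that the snippet assembles.

```asm
big_stack:
	.cfi_startproc
	movq	%rsp, %r11
	subq	$61440, %r11
	.cfi_def_cfa_register %r11	# CFA is now r11-relative, offset still 8
	.cfi_adjust_cfa_offset 61440	# CFA = r11 + 61448, constant in the loop
.LBB0_1:
	subq	$4096, %rsp
	movq	$0, (%rsp)
	cmpq	%r11, %rsp
	jne	.LBB0_1
	.cfi_def_cfa_register %rsp	# rsp == r11 here, so CFA = rsp + 61448
	subq	$2432, %rsp
	.cfi_def_cfa_offset 63880	# 61448 + 2432
	addq	$63872, %rsp
	.cfi_def_cfa_offset 8
	retq
	.cfi_endproc
```

With this layout the table is correct at every instruction inside the loop regardless of how many iterations have run, because nothing in the loop modifies `%r11`.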
Comment 1 simonas+llvm.org 2021-03-15 17:56:11 PDT
The correct assembly for the unrolled case would probably look a lot like this:

big_stack:
	.cfi_startproc
	subq	$4096, %rsp
	.cfi_def_cfa_offset 4104
	movq	$0, (%rsp)
	subq	$3968, %rsp
	.cfi_def_cfa_offset 8072
	addq	$8064, %rsp
	.cfi_def_cfa_offset 8
	retq

or an equivalent using `.cfi_adjust_cfa_offset` directives.
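
A sketch of that relative form, for reference. Each `.cfi_adjust_cfa_offset` adds its operand to the current CFA offset, which starts at 8 on function entry because of the pushed return address, so the intermediate offset works out to 4104. The running totals in the comments are my own arithmetic, and `.cfi_endproc` is added only so that the snippet assembles.

```asm
big_stack:
	.cfi_startproc
	subq	$4096, %rsp
	.cfi_adjust_cfa_offset 4096	# 8 + 4096 = 4104
	movq	$0, (%rsp)
	subq	$3968, %rsp
	.cfi_adjust_cfa_offset 3968	# 4104 + 3968 = 8072
	addq	$8064, %rsp
	.cfi_adjust_cfa_offset -8064	# 8072 - 8064 = 8
	retq
	.cfi_endproc
```

The relative form has the advantage that each directive mirrors the adjacent `rsp` adjustment one-to-one, so the backend would not need to track the absolute offset while emitting the probes.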