In upgrading rust-lang/rust to LLVM 5.0 we've discovered that the compiler itself will segfault when compiled with LLVM 5.0. This looks like a misoptimization during the `-jump-threading -correlated-propagation` passes. We used `bugpoint` to shrink the program from ~1 million lines of IR to just a few:

; ModuleID = 'small.ll'
source_filename = "bugpoint-output-834ffe1.bc"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare i8* @foo()

declare i32 @rust_eh_personality() unnamed_addr

; Function Attrs: nounwind
declare void @llvm.assume(i1) #0

define void @bar() personality i32 ()* @rust_eh_personality {
bb9:
  %t9 = invoke i8* @foo()
          to label %good unwind label %bad

bad:                                              ; preds = %bb9
  %t10 = landingpad { i8*, i32 }
          cleanup
  resume { i8*, i32 } %t10

good:                                             ; preds = %bb9
  %t11 = icmp ne i8* %t9, null
  %t12 = zext i1 %t11 to i64
  %cond = icmp eq i64 %t12, 1
  br i1 %cond, label %if_true, label %done

if_true:                                          ; preds = %good
  call void @llvm.assume(i1 %t11)
  br label %done

done:                                             ; preds = %if_true, %good
  ret void
}

attributes #0 = { nounwind }

This was tested locally via `opt -S -jump-threading -correlated-propagation foo.ll`, and the results from before/after were diffed, yielding:

--- before.ll	2017-07-24 15:52:48.364787248 -0700
+++ after.ll	2017-07-24 15:53:34.701537275 -0700
@@ -21,13 +21,12 @@
   resume { i8*, i32 } %t10
 
 good:                                             ; preds = %bb9
-  %t11 = icmp ne i8* %t9, null
-  %t12 = zext i1 %t11 to i64
+  %t12 = zext i1 true to i64
   %cond = icmp eq i64 %t12, 1
   br i1 %cond, label %if_true, label %done
 
 if_true:                                          ; preds = %good
-  call void @llvm.assume(i1 %t11)
+  call void @llvm.assume(i1 true)
   br label %done
 
 done:                                             ; preds = %if_true, %good

I think the transformation here may be invalid: `%t11` is inferred to be always `true`, but that's not guaranteed for the value coming out of `@foo`.
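To make the miscompile concrete, here is a small Python model of the `good`/`if_true` blocks (a sketch, not the actual IR semantics): the `assert` stands in for `@llvm.assume`, which only constrains executions that actually reach it, and `None` stands in for a null pointer. Folding `%t11` to `true` throughout the block changes observable behavior whenever `@foo` returns null.

```python
def original(t9):
    """Models the 'good' block before the transformation."""
    t11 = t9 is not None          # %t11 = icmp ne i8* %t9, null
    t12 = int(t11)                # %t12 = zext i1 %t11 to i64
    cond = t12 == 1               # %cond = icmp eq i64 %t12, 1
    if cond:                      # br i1 %cond, label %if_true, label %done
        assert t11                # call void @llvm.assume(i1 %t11)
    return cond

def after_bad_fold(t9):
    """Models the block after %t11 was (incorrectly) folded to true."""
    t12 = int(True)               # %t12 = zext i1 true to i64
    cond = t12 == 1
    if cond:
        assert True               # @llvm.assume(i1 true) is a no-op
    return cond

# When @foo returns a non-null pointer, both versions agree...
assert original(object()) == after_bad_fold(object()) == True
# ...but when it returns null, the fold changes observable behavior.
assert original(None) is False
assert after_bad_fold(None) is True
```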
Oh I should mention that this is against the current `release_50` branch of LLVM as well!
The transformation is being performed because %t11 is passed to @llvm.assume, which according to the semantics [1] allows the optimizer to assume the conditional is _always_ true.

It seems your code expected that @llvm.assume, when applied to a conditional, indicates that the conditional may be assumed to be true only at the point of the intrinsic.

[1] https://llvm.org/docs/LangRef.html#llvm-assume-intrinsic
(In reply to Kavon Farvardin from comment #2)
> The transformation is being performed because %t11 is passed to
> @llvm.assume, which according to the semantics [1], allows the optimizer to
> assume the conditional is _always_ true.
>
> It seems your code expected that @llvm.assume, when applied to a
> conditional, indicates that the conditional may be assumed to be true only
> at the point of the intrinsic.
>
> [1] https://llvm.org/docs/LangRef.html#llvm-assume-intrinsic

I haven't looked at this closely, just bisected to https://reviews.llvm.org/D30352, but I'm afraid there's a bug hidden somewhere. In fact, if you run

./opt -jump-threading | ./opt -correlated-propagation

you get a different result (the code is not optimized at all), which feels wrong. I think we're somehow getting a wrong value from LVI, and as JumpThreading caches LVI, this only happens when you run the two passes in sequence.
@Kavon: Mh, are you sure? That's not how I read the semantics section of assume. To quote:

> The intrinsic allows the optimizer to assume that the provided condition is always true **whenever the control flow reaches the intrinsic call**.

Note the **highlighted** part.
(In reply to Tim Neumann from comment #4)
> @Kavon: Mh, are you sure? That's not how I read the semantics section of
> assume, to quote:
>
> > The intrinsic allows the optimizer to assume that the provided condition is always true **whenever the control flow reaches the intrinsic call**.
>
> Note the **highlighted** part.

Oops, I read the wrong subsection... the transformation is wrong.
Again, I suspect it's a bug in LVI. Please note that if you remove the caching of LVI from JT (i.e. you comment out the addPreserved<JumpThreading...> line) you get the correct result.

When we ask LVI what it thinks about the value produced by the invoke, it answers that it's never null (notconstant in LVI lingo means that the constant *can't* have that value):

LVI Getting value   %t9 = invoke i8* @foo()
          to label %good unwind label %bad at 't11'
  Result = notconstant<i8* null>

while it should be

  Result = overdefined

as we can't necessarily conclude anything about `%t9`, for a couple of reasons:
1) This is an intraprocedural analysis.
2) We don't know what `@foo` will do, so inferring a return value out of thin air seems just wrong.
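As a toy illustration of this reasoning (the function name here is hypothetical, not LLVM's actual LVI implementation): for the result of an opaque external call, the only sound answer an intraprocedural analysis can give is overdefined, unless an assumption is valid at the exact query point. The bug is that LVI acted as if the assume applied where it did not.

```python
def lvi_answer_for_call_result(assume_applies_at_query_point):
    """Toy model of what LVI may conclude about %t9 = invoke i8* @foo().

    An intraprocedural analysis knows nothing about an external call's
    return value, so only an assumption that is valid *at the query
    point* may refine the answer beyond overdefined."""
    if assume_applies_at_query_point:
        # Refined by 'assume(icmp ne %t9, null)': %t9 provably isn't null.
        return "notconstant<i8* null>"
    return "overdefined"

# At 't11' in %good, the assume (down in %if_true) does not apply:
print(lvi_answer_for_call_result(False))  # overdefined  <- the sound answer
# The stale cached state made LVI behave as if it applied:
print(lvi_answer_for_call_result(True))   # notconstant<i8* null>  <- the bug
```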
This is a hilarious bug, and I think I got to the bottom of it.

So, the reason why LVI gives us that answer is that it thinks the assumption is valid for the context we're in (which, of course, it is not).

If we run only correlated-value-propagation (or disable the caching of LVI), we don't have a DomTree available, so `isValidAssumeForContext()` falls back to comparing the parent blocks of the two instructions:

llvm::isValidAssumeForContext (Inv=0x5b77490, CxtI=0x5b76ac0, DT=0x0) at ../lib/Analysis/ValueTracking.cpp:475
475       if (DT) {
(gdb) n
485       if (Inv->getParent() != CxtI->getParent())
(gdb) n
486         return false;

If we cache LVI, we *think* we have a DomTree, so here's how the invocation of `llvm::isValidAssumeForContext()` looks:

(gdb) s
llvm::isValidAssumeForContext (Inv=0x5b77490, CxtI=0x5b76ac0, DT=0x5b977e0) at ../lib/Analysis/ValueTracking.cpp:475
475       if (DT) {
(gdb) n
476         if (DT->dominates(Inv, CxtI))
(gdb) n
477           return true;

I was very confused because I didn't really believe in this dominance relation, so I ran verify() on the DT and boom, I got a crash. So we happen to get an answer, but the underlying dominator tree is completely bogus.

Running with some more debugging reveals that:

[2017-07-25 11:22:56.638141270] 0x76c8d00 Executing Pass 'Function Pass Manager' on Module 'x.ll'...
[2017-07-25 11:22:56.638220852] 0x76e8860 Executing Pass 'Dominator Tree Construction' on Function 'bar'...
[2017-07-25 11:22:56.638297844] 0x76e8860 Executing Pass 'Basic Alias Analysis (stateless AA impl)' on Function 'bar'...
0x76e8710 Required Analyses: Assumption Cache Tracker, Dominator Tree Construction, Target Library Information
[2017-07-25 11:22:56.638354993] 0x76e8860 Executing Pass 'Function Alias Analysis Results' on Function 'bar'...
0x76e8660 Required Analyses: Basic Alias Analysis (stateless AA impl), Target Library Information
0x76e8660 Used Analyses: Scoped NoAlias Alias Analysis, Type-Based Alias Analysis, ObjC-ARC-Based Alias Analysis, Globals Alias Analysis, ScalarEvolution-based Alias Analysis, Inclusion-Based CFL Alias Analysis, Unification-Based CFL Alias Analysis
[2017-07-25 11:22:56.638427811] 0x76e8860 Executing Pass 'Lazy Value Information Analysis' on Function 'bar'...
0x76e96d0 Required Analyses: Assumption Cache Tracker, Target Library Information
[2017-07-25 11:22:56.638462642] 0x76e8860 Executing Pass 'Jump Threading' on Function 'bar'...
0x76e8410 Required Analyses: Function Alias Analysis Results, Lazy Value Information Analysis, Target Library Information
0x76e8410 Preserved Analyses: Lazy Value Information Analysis, Globals Alias Analysis
-- 'Jump Threading' is not preserving 'Function Alias Analysis Results'
-- 'Jump Threading' is not preserving 'Basic Alias Analysis (stateless AA impl)'
-- 'Jump Threading' is not preserving 'Dominator Tree Construction'
-*- 'Jump Threading' is the last user of following pass instances. Free these instances
[2017-07-25 11:22:56.638719401] 0x76e8860 Freeing Pass 'Jump Threading' on Function 'bar'...
[2017-07-25 11:22:56.638737039] 0x76e8860 Freeing Pass 'Function Alias Analysis Results' on Function 'bar'...
[2017-07-25 11:22:56.638751239] 0x76e8860 Freeing Pass 'Basic Alias Analysis (stateless AA impl)' on Function 'bar'...
[2017-07-25 11:22:56.638765276] 0x76e8860 Freeing Pass 'Dominator Tree Construction' on Function 'bar'...
[2017-07-25 11:22:56.638782278] 0x76e8860 Executing Pass 'Value Propagation' on Function 'bar'...
0x76e9780 Required Analyses: Lazy Value Information Analysis
[2017-07-25 11:22:56.638881102] 0x76e8860 Made Modification 'Value Propagation' on Function 'bar'...
0x76e9780 Preserved Analyses: Globals Alias Analysis
-- 'Value Propagation' is not preserving 'Lazy Value Information Analysis'

So JumpThreading claims it preserves LVI but not DomTree; we free DomTree, yet the cached LVI still references it, so it's essentially accessing freed memory. Trying to work on a fix now.
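To see why the dominance query should have said "no", here is a toy model (hypothetical helper names, not LLVM's API) of `isValidAssumeForContext` over the CFG from the reduced test case: with a correct dominator tree, the assume in `if_true` does not dominate the query point in `good`, so the no-DT fallback and the DT path agree that the assume is not valid there.

```python
# CFG from the reduced test case: bb9 -> {good, bad}, good -> {if_true, done},
# if_true -> done.  Immediate dominators, computed by hand for this tiny CFG:
IDOM = {"bad": "bb9", "good": "bb9", "if_true": "good", "done": "good"}

def dominates(a, b):
    """True if block a dominates block b (walk up the idom chain)."""
    while b is not None:
        if a == b:
            return True
        b = IDOM.get(b)
    return False

def is_valid_assume_for_context(assume_bb, ctx_bb, have_domtree):
    """Toy model of llvm::isValidAssumeForContext's two paths."""
    if have_domtree:
        return dominates(assume_bb, ctx_bb)
    # No DT: conservative fallback, same-block only (LLVM additionally
    # scans the instructions in between; elided here).
    return assume_bb == ctx_bb

# Query: is assume(%t11), sitting in %if_true, valid at %t11's
# definition in %good?
print(is_valid_assume_for_context("if_true", "good", have_domtree=False))  # False
print(is_valid_assume_for_context("if_true", "good", have_domtree=True))   # False
# Both paths agree given a *correct* DomTree; only the freed, bogus one
# could have answered True here.
```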
Thanks so much for the investigation and quick diagnosis!
Anna, this looks like it may interfere a bit with your "print LVI cache after transforms" work, so I'd love your opinion on it.

After the analysis, it's actually clear that LVI isn't really preserved by JT, because all the analyses it depends on (and holds references to) get freed if JT is their last user (as happens here).

So, I see several options:

1) Drop the preservation of LVI from JT (which, as far as I can tell, is the only in-tree pass preserving that analysis). This may have a compile-time impact, but I'd rather live with that than query broken analyses. FWIW, as far as I can tell Chandler implemented in the new pass manager a mechanism to invalidate LVI if one of its dependencies gets invalidated, and in the new pass manager JT currently doesn't preserve LVI, so this is a legacy-PM-only problem.

2) Teach JT to preserve all the dependencies of LVI (i.e. DomTree, AA & AssumptionCache). I haven't estimated how much work that is, but my gut feeling says it won't be trivial. This also has a maintainability drawback: every new pass that wants to preserve LVI would have to be explicitly taught to preserve all its dependencies as well, otherwise the preservation is useless.

Ideally, longer term we might consider moving away from analyses holding references to other analyses, as they're quite tricky to handle correctly (as this PR shows :)

What do you think? CC:ing Chandler as well, as he might have thoughts.
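The underlying lifetime problem can be sketched with a toy model of the legacy pass manager's bookkeeping (all class and field names here are illustrative, not LLVM's actual types): an analysis that holds references to other analyses becomes unsafe to query as soon as any of its dependencies is freed, even if the analysis itself was marked preserved.

```python
class Analysis:
    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)   # analyses this one holds references to
        self.freed = False

    def is_safe(self):
        """A cached analysis is only safe to query if everything it
        holds references to is still alive."""
        return not self.freed and all(d.is_safe() for d in self.deps)

# LVI holds references to DomTree (among others).
domtree = Analysis("Dominator Tree Construction")
lvi = Analysis("Lazy Value Information Analysis", deps=[domtree])

# JumpThreading declares: preserved = {LVI}, so the PM keeps LVI cached,
# but frees DomTree because JT was its last user...
domtree.freed = True

# ...leaving the "preserved" LVI cache referencing freed state. Querying
# it (as Value Propagation does next) is a use-after-free:
print(lvi.is_safe())   # False
```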
(In reply to Davide Italiano from comment #9)
> Anna, this looks like may interfere a bit with yours "print LVI cache after
> transforms" work, therefore I'd love your opinion on it.
>
> After the analysis, it's actually clear that LVI isn't really preserved by
> JT, because all the analyses which depend on it (and LVI holds as
> references) gets freed if JT is the last user (as it happens here).
>
> So, I see several options:
> 1) Drop the preservation of LVI from JT (which, as far as I can tell, is the
> only in-tree pass preserving that analysis). This may have a compile-time
> impact, but I'd rather leave with it than querying broken analyses.
> FWIW, as far as I can tell Chandler implemented in the new pass manager a
> mechanism to invalidate LVI if one of the dependencies gets invalidated & in
> the new pass manager JT currently doesn't preserve LVI, so that is a legacy
> PM only problem.

Hi Davide,

Thanks for the analysis on this bug; I agree that dropping preservation of LVI in JT is probably the safest (and quickest) way forward. This would also fix another class of bugs: the LVI cache from JT being used by CVP.

You're right that it would interfere with the "printing of LVI after transforms" work. I had initially added it precisely for LVI caching bugs in JT, which are themselves tricky :)

However, the printing also enables us to view the LVI analysis results (for example: -lazy-value-info -print-lvi), apart from what happens after a transform. That functionality would remain intact, and useful, even when we move away from the caching model for LVI or no longer preserve LVI after JT.
(In reply to Anna from comment #10)
> Hi Davide,
> Thanks for the analysis on this bug and I agree that dropping preservation
> of LVI in JT is probably the safest (and quickest) way forward. This would
> also fix another class of bugs: LVI cache from JT that's used by CVP.
>
> You're right that it would interfere with the "printing of LVI after
> transforms" work. I had initially added it precisely for LVI caching bugs in
> JT, which are itself tricky :)
>
> However, the printing also enables us to view the LVI analysis results (for
> example: -lazy-value-info -print-lvi), apart from what happens after a
> transform. That functionality would remain intact, and useful even when we
> move away from the caching model for LVI, or when we no longer preserve LVI
> after JT.

Thanks. I agree this functionality is very useful and should stay. After the change, one of the tests fails (lvi-after-jumpthreading.ll) because the LVI cache has already been purged by the time we call printLVI. Can you think of an alternative way to test this (as there would be no coverage of the feature without that test)?

Thanks!

--
Davide
Yes, that test will fail. I thought about the printer pass a bit more. There are 3 kinds of bugs we can catch with the LVI printer:
1. missed cache invalidation entries of LVI between transforms
2. missed cache invalidation entries of LVI during a transform
3. incorrectly calculated LVI results during a transform

#1 will no longer occur once we avoid preserving LVI in JT (since there are no other in-tree passes that preserve LVI). I had written and used the pass precisely for catching bugs of kind #1.

To catch #2 and #3, we will need to print the LVI *before* purging the cache in the JT (and/or CVP) passes. Can we require the printer pass as part of the finalization of the JT pass? I'm not sure how else we can test #2 and #3 (without which we'll have no coverage for, or purpose of, the printer pass in tree).
(In reply to Anna from comment #12)
> Yes, that test will fail. I thought about the printer pass a bit more.
> There are 3 kinds of bugs we can catch with the LVI printer:
> 1. missed cache invalidation entries of LVI between transforms
> 2. missed cache invalidation entries of LVI during a transform.
> 3. incorrectly calculated LVI results during a transform
>
> #1 will no longer occur once we avoid preservation of LVI in JT (since
> there's no in-tree passes that preserve LVI). I had written and used the
> pass for catching bugs of kind #1.
> To get #2 and #3, we will need to print the LVI *before* purging the cache
> in JT (and/or CVP) passes. Can we require the printer pass as part of
> finalization of the JT pass? I'm not sure how else we can test #2 and #3
> (without which we'll have no coverage or purpose of the printer pass in
> tree).

What about adding a cl::opt that calls the printer pass after JT runs (but before freeing the state)?
(In reply to Davide Italiano from comment #13)
> What about adding a cl::opt that calls the printer pass after JT runs (but
> before freeing the state?)

Ah, that should work as well, and it's definitely much simpler :)
Anna, the first patch is here:
https://bugs.llvm.org/show_bug.cgi?id=33917

I'd love to get this included in 5.0 (otherwise the Rust folks will have to carry local patches around).

Thanks!

--
Davide
(In reply to Davide Italiano from comment #15)
> Anna, the first patch is here
> https://bugs.llvm.org/show_bug.cgi?id=33917
>
> I'd love to get this included in 5.0 (otherwise the Rust folks will have to
> carry local patches around).

Done. Thanks!
r309353/r309355. Hans, we might consider backporting this to 5.0 as it's blocking the Rust people.
Awesome, thanks so much again for the prompt fix and for helping to track this down!
(In reply to Davide Italiano from comment #17)
> r309353/r309355.
>
> Hans, we might consider backporting this to 5.0 as it's blocking the Rust
> people.

Merged in r309439.