Use LTO to optimize Rust tools (cargo, miri, rustfmt, clippy, Rust Analyzer) #139588

Merged
merged 1 commit into rust-lang:master from rust-analyzer-opt
Apr 11, 2025

Conversation

Kobzol
Contributor

@Kobzol Kobzol commented Apr 9, 2025

Testing whether LTO/PGO can improve RA's performance, and by how much. As @Noratrieb suggested, we could actually apply LTO to all the important tools.

CC @Veykril I realized that we don't even do LTO for Rust Analyzer; that could be very low-hanging fruit for improving its performance 😅

try-job: dist-x86_64-linux

@rustbot
Collaborator

rustbot commented Apr 9, 2025

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Apr 9, 2025
@Kobzol
Contributor Author

Kobzol commented Apr 9, 2025

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 9, 2025
Apply LTO when building rust-analyzer

Testing whether LTO/PGO can improve RA's performance, and by how much.

CC `@Veykril` I realized that we don't even do LTO for Rust Analyzer; that could be very low-hanging fruit for improving its performance 😅

try-job: dist-x86_64-linux
@bors
Collaborator

bors commented Apr 9, 2025

⌛ Trying commit 3501da3 with merge 52f2f8f...

@bors
Collaborator

bors commented Apr 9, 2025

☀️ Try build successful - checks-actions
Build commit: 52f2f8f (52f2f8f4e28f4c1680651944c8e1e4ef7115713f)

@Kobzol
Contributor Author

Kobzol commented Apr 9, 2025

Tried the LTO-optimized version on the bootstrap crate and on https://github.com/It4innovations/hyperqueue:

hyperfine --warmup 1 '/home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things' '/home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things' --runs 2
Benchmark 1: /home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things
  Time (mean ± σ):     19.806 s ±  0.032 s    [User: 20.432 s, System: 1.171 s]
  Range (min … max):   19.783 s … 19.828 s    2 runs
 
Benchmark 2: /home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things
  Time (mean ± σ):     19.399 s ±  0.022 s    [User: 20.165 s, System: 1.197 s]
  Range (min … max):   19.384 s … 19.415 s    2 runs
 
Summary
  /home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things ran
    1.02 ± 0.00 times faster than /home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats bootstrap --run-all-ide-things

hyperfine --warmup 1 '/home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things' '/home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things' --runs 1
Benchmark 1: /home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things
  Time (abs ≡):        57.777 s               [User: 56.585 s, System: 1.310 s]
 
Benchmark 2: /home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things
  Time (abs ≡):        56.439 s               [User: 55.349 s, System: 1.200 s]
 
Summary
  /home/kobzol/.rustup/toolchains/52f2f8f4e28f4c1680651944c8e1e4ef7115713f/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things ran
    1.02 times faster than /home/kobzol/.rustup/toolchains/f06e5c1e35bc5bc6131c6f8a0eb782097e3f28c3/bin/rust-analyzer analysis-stats hyperqueue --run-all-ide-things

Looks like a solid 2% win. Not bad!
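
For readers who want to check the arithmetic behind the "1.02 times faster" summaries, here is a minimal sketch that recomputes the ratios from the mean times reported above. The function name `speedup` is made up for this illustration and is not part of hyperfine or of the PR.

```rust
/// Speedup of a candidate over a baseline.
/// Values above 1.0 mean the candidate needed less time.
/// (Hypothetical helper, used only for this illustration.)
fn speedup(baseline_secs: f64, candidate_secs: f64) -> f64 {
    baseline_secs / candidate_secs
}

fn main() {
    // (benchmark, mean time without LTO, mean time with LTO), in seconds,
    // copied from the hyperfine output in this comment.
    let runs = [
        ("bootstrap", 19.806, 19.399),
        ("hyperqueue", 57.777, 56.439),
    ];

    for (name, baseline, lto) in runs {
        let ratio = speedup(baseline, lto);
        // Share of the baseline wall-clock time that was saved.
        let saved_percent = (1.0 - lto / baseline) * 100.0;
        println!(
            "{}: {:.2}x faster, {:.1}% less wall-clock time",
            name, ratio, saved_percent
        );
    }
}
```

Both ratios round to 1.02, matching the hyperfine summaries. Note that the hyperqueue comparison rests on a single run per binary, so its noise level is unknown.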

@Kobzol Kobzol force-pushed the rust-analyzer-opt branch from 3501da3 to 257d63f Compare April 9, 2025 20:29
@Kobzol
Contributor Author

Kobzol commented Apr 9, 2025

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 9, 2025
@bors
Collaborator

bors commented Apr 9, 2025

⌛ Trying commit 257d63f with merge 1c4fbb0...

@bors
Collaborator

bors commented Apr 9, 2025

☀️ Try build successful - checks-actions
Build commit: 1c4fbb0 (1c4fbb0c73905c0a461379d4974f294f1873821f)

@rust-timer
Collaborator

Finished benchmarking commit (1c4fbb0): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.3% | [-0.5%, -0.2%] | 10 |
| Improvements ✅ (secondary) | -0.3% | [-0.6%, -0.2%] | 33 |
| All ❌✅ (primary) | -0.2% | [-0.5%, 0.2%] | 11 |

Max RSS (memory usage)

Results (primary -2.6%, secondary 3.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.2% | [2.0%, 4.7%] | 6 |
| Improvements ✅ (primary) | -2.6% | [-2.6%, -2.6%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -2.6% | [-2.6%, -2.6%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 780.203s -> 780.244s (0.01%)
Artifact size: 366.14 MiB -> 365.75 MiB (-0.11%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 10, 2025
@Kobzol Kobzol marked this pull request as ready for review April 10, 2025 06:55
@Kobzol Kobzol changed the title Apply LTO when building rust-analyzer Use LTO to optimize Rust tools (cargo, miri, rustfmt, clippy, Rust Analyzer) Apr 10, 2025
@Kobzol
Contributor Author

Kobzol commented Apr 10, 2025

CI duration seems fine. The small win in the perf results is probably due to Cargo now being built with LTO.

@rustbot ready

@Kobzol Kobzol force-pushed the rust-analyzer-opt branch from 257d63f to 9a26863 Compare April 10, 2025 06:57
@jieyouxu
Member

Ah -- do note that we're not running the main r-a test suite yet (only proc-macro-srv), but I believe other tool tests are run.

Member

@jieyouxu jieyouxu left a comment

Other tools seem fine. Waiting to hear back regarding rust-analyzer, because we don't run the main r-a test suite in r-l/r CI yet.

@jieyouxu
Member

FYI @rust-lang/rust-analyzer: are you okay with building r-a with LTO?

@davidbarsky
Contributor

Yeah, LTO seems reasonable to us!

@Veykril
Member

Veykril commented Apr 10, 2025

On a related note, does this also affect the proc-macro server? I imagine not, as I don't think it is considered a tool by bootstrap?

@Kobzol
Contributor Author

Kobzol commented Apr 10, 2025

It is also compiled with Mode::ToolRustc, so yes.
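
To illustrate why the proc-macro server is covered, here is a simplified sketch of a mode-based check. It is not bootstrap's actual code: the `Mode` enum is reduced to two variants, and both the `wants_lto` helper and its `lto_enabled_for_stage` parameter are assumptions made for this example. Only the `Mode::ToolRustc` comparison comes from the PR.

```rust
/// Simplified stand-in for bootstrap's build modes.
/// Only `ToolRustc` is named in this thread; `Other` is a placeholder.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    ToolRustc,
    Other,
}

/// Hypothetical helper: apply LTO when the tool is built in `ToolRustc` mode
/// and LTO is enabled for the current stage (the second condition is an
/// assumption of this sketch).
fn wants_lto(mode: Mode, lto_enabled_for_stage: bool) -> bool {
    mode == Mode::ToolRustc && lto_enabled_for_stage
}

fn main() {
    // Because the check keys on the build mode rather than on the tool's
    // path, anything compiled with `Mode::ToolRustc` is covered.
    let tools = [
        ("rust-analyzer", Mode::ToolRustc),
        ("proc-macro server", Mode::ToolRustc),
        ("some other artifact", Mode::Other),
    ];

    for (name, mode) in tools {
        println!("{} ({:?}): LTO = {}", name, mode, wants_lto(mode, true));
    }
}
```

The design point is that a mode comparison covers every tool built that way, whereas the earlier path-based check matched only rustdoc.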

@jieyouxu
Member

Since r-a is on board with LTO, let's give it a try. Thanks!

@bors r+ rollup=never

@bors
Collaborator

bors commented Apr 10, 2025

📌 Commit 9a26863 has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 10, 2025
@bors
Collaborator

bors commented Apr 11, 2025

⌛ Testing commit 9a26863 with merge ed3a4aa...

@bors
Collaborator

bors commented Apr 11, 2025

☀️ Test successful - checks-actions
Approved by: jieyouxu
Pushing ed3a4aa to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 11, 2025
@bors bors merged commit ed3a4aa into rust-lang:master Apr 11, 2025
7 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 11, 2025
@Kobzol Kobzol deleted the rust-analyzer-opt branch April 11, 2025 20:06

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing e1b06f7 (parent) -> ed3a4aa (this PR)

Test differences

No test diffs found

Job duration changes

  1. aarch64-apple: 3548.4s -> 4315.9s (21.6%)
  2. x86_64-apple-2: 5632.5s -> 5038.7s (-10.5%)
  3. dist-x86_64-apple: 9702.3s -> 10456.6s (7.8%)
  4. dist-aarch64-apple: 4639.4s -> 4916.3s (6.0%)
  5. x86_64-apple-1: 8083.0s -> 7607.2s (-5.9%)
  6. dist-x86_64-linux-alt: 6967.3s -> 7343.4s (5.4%)
  7. test-various: 4639.4s -> 4392.8s (-5.3%)
  8. dist-loongarch64-linux: 6344.2s -> 6645.2s (4.7%)
  9. dist-x86_64-linux: 5392.0s -> 5140.2s (-4.7%)
  10. dist-x86_64-mingw: 7765.7s -> 8116.6s (4.5%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
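
The percentages in the job duration list are relative changes against the parent commit. A minimal sketch of that calculation follows; the function name `percent_change` is chosen for this example and is not taken from the CI tooling.

```rust
/// Relative change from `before` to `after`, in percent.
/// Positive values mean the job got slower.
/// (Hypothetical helper, used only for this illustration.)
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // (job, duration on parent, duration on this PR), in seconds,
    // copied from the first two entries of the list above.
    let jobs = [
        ("aarch64-apple", 3548.4, 4315.9),
        ("x86_64-apple-2", 5632.5, 5038.7),
    ];

    for (name, before, after) in jobs {
        println!("{}: {:+.1}%", name, percent_change(before, after));
    }
}
```

As the report itself notes, swings of this size in either direction are common runner noise and do not by themselves indicate a slowdown caused by the PR.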

@rust-timer
Collaborator

Finished benchmarking commit (ed3a4aa): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.6% | [-0.6%, -0.5%] | 3 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.4% | [3.7%, 5.1%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.0% | [-2.0%, -2.0%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 779.161s -> 779.659s (0.06%)
Artifact size: 365.95 MiB -> 365.50 MiB (-0.12%)

- if path.ends_with("/rustdoc") &&
+ // Rustc tools (miri, clippy, cargo, rustfmt, rust-analyzer)
+ // could use the additional optimizations.
+ if self.mode == Mode::ToolRustc &&
  // rustdoc is performance sensitive, so apply LTO to it.
Member

I have no objection to the change, but the new comment should probably be merged with this one.

Contributor Author

Oops, good catch! #139707

ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 12, 2025
Fix comment in bootstrap

Didn't notice it in rust-lang#139588.

r? `@jieyouxu`
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 13, 2025
Rollup merge of rust-lang#139707 - Kobzol:fix-comment, r=jieyouxu

Fix comment in bootstrap

Didn't notice it in rust-lang#139588.

r? `@jieyouxu`
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Apr 14, 2025
Fix comment in bootstrap

Didn't notice it in rust-lang/rust#139588.

r? `@jieyouxu`
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
9 participants