Add more diagnostics for compiler performance analysis #5760

Closed
wants to merge 3 commits into from

retronym
Member

@retronym retronym commented Mar 7, 2017

  • -Yprofile outputs the difference between snapshots of GC and CPU
    data, sourced from platform MBeans.
  • This output can be sent to a file with -Yprofile-destination <filename>.
    By default, output goes to the console.
  • -Yprofile-external-tool generates calls to the static methods `before`
    and `after` around each of the given phases. This can be used to
    communicate with an external profiler, such as YourKit, to generate a
    profile for a subset of the compiler.
  • -Yprofile-run-gc runs the GC after each phase to help attribute
    retained heap to a given phase more accurately.
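
To illustrate the snapshot-difference idea in the first bullet, here is a minimal sketch. It is not the code from this PR: `Snapshot`, `PhaseSnapshots`, `take` and `measure` are hypothetical names, and it assumes only the standard `java.lang.management` platform MBeans.

```scala
import java.lang.management.ManagementFactory

// Hypothetical names for illustration; these are not the classes added by this PR.
final case class Snapshot(wallNanos: Long, cpuNanos: Long, gcCount: Long, gcMillis: Long) {
  // Difference between this (later) snapshot and an earlier one.
  def -(earlier: Snapshot): Snapshot =
    Snapshot(
      wallNanos - earlier.wallNanos,
      cpuNanos - earlier.cpuNanos,
      gcCount - earlier.gcCount,
      gcMillis - earlier.gcMillis
    )
}

object PhaseSnapshots {
  private val threads = ManagementFactory.getThreadMXBean
  private val gcBeans = ManagementFactory.getGarbageCollectorMXBeans

  // Read wall clock, current-thread CPU time and cumulative GC counters.
  def take(): Snapshot = {
    // CPU time may be unsupported or disabled (reported as -1); treat both as 0.
    val cpu =
      if (threads.isCurrentThreadCpuTimeSupported) math.max(threads.getCurrentThreadCpuTime, 0L)
      else 0L
    var count = 0L
    var millis = 0L
    var i = 0
    while (i < gcBeans.size) {
      val bean = gcBeans.get(i)
      // Collectors report -1 when a counter is undefined; skip those values.
      count += math.max(bean.getCollectionCount, 0L)
      millis += math.max(bean.getCollectionTime, 0L)
      i += 1
    }
    Snapshot(System.nanoTime(), cpu, count, millis)
  }

  // Run `body` and return its result together with the snapshot delta.
  def measure[A](body: => A): (A, Snapshot) = {
    val before = take()
    val result = body
    (result, take() - before)
  }
}
```

A caller could wrap a unit of work as `val (result, delta) = PhaseSnapshots.measure { ... }` and print `delta`. The sketch only reads CPU time for the calling thread, so work done on background threads would not be attributed.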

@scala-jenkins scala-jenkins added this to the 2.12.2 milestone Mar 7, 2017
mkeskells and others added 3 commits March 7, 2017 13:19
  - `-Yprofile` outputs the difference between snapshots of GC and CPU
    data, sourced from platform MBeans.
  - This output can be sent to a file with `-Yprofile-destination <filename>`.
    By default, output goes to the console.
  - `-Yprofile-external-tool` generates calls to the static methods
    `before` and `after` around each of the given phases. This can be
    used to communicate with an external profiler, such as YourKit, to
    generate a profile for a subset of the compiler.
  - `-Yprofile-run-gc` runs the GC after each phase to help attribute
    retained heap to a given phase more accurately.

Co-Authored by: Jason Zaugg <[email protected]>
This is handy when collecting samples in YourKit. The actual
result of this class is just an approximation; we rely on JMH
in scala/compiler-benchmark for more rigorous statistics.

```
./build/quick/bin/scala -J-Dscala.benchmark.iterations=2000 scala.tools.nsc.MainBench sandbox/test.scala
```
@retronym
Member Author

retronym commented Mar 7, 2017

Rebase of #5758

Member

@lrytz lrytz left a comment

I started reviewing this today and got sucked into getting per-phase profiling with JFR. This is my WIP branch: lrytz@9d16bb3

I haven't figured out how to do that from the command line, unfortunately.

withPostSetHook( _ => YprofileEnabled.value = true )
val YprofileExternalTool = PhasesSetting("-Yprofile-external-tool", "Enable profiling for a phase using an external tool hook. Generally only useful for a single phase", "typer").
withPostSetHook( _ => YprofileEnabled.value = true )
val YprofileRunGcBetweenPhases = PhasesSetting("-Yprofile-run-gc", "Run a GC between phases - this allows heap size to be accurate at the expense of more time. Specify a list of phases, or *", "_").
Member

Neither `_` nor `*` is valid here; this should use `all` instead.

s2 - s1
}
private def doGC(): Unit = {
System.gc()
Member

Javadoc says: "When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects", I guess that's still true for concurrent GC? Anyway, we just have to be aware that System.gc is probably not the most reliable tool.

Contributor

The GC before/after was intended to provide some indication of the ratio of allocated to retained sizes. Generally, the information this tool provides is indicative and not 100% reproducible, but with sufficient iterations and post-processing of the data it can provide high confidence that a particular PR affected a certain metric.

Contributor

For the record, it also has inaccuracies related to finalization, to object graphs that require multiple GC/finalization cycles to reclaim, and to soft references.
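
As a rough illustration of the caveats in this thread, the sketch below requests several GC cycles before reading used heap. It is an assumption-laden example, not the PR's `doGC`: `RetainedHeap`, `usedAfterGc` and `retainedBy` are hypothetical names, `System.gc()` remains a best-effort request, and finalizers and soft references can still skew the figure.

```scala
import java.lang.management.ManagementFactory

// Hypothetical helper for illustration; not part of this PR.
object RetainedHeap {
  private val memory = ManagementFactory.getMemoryMXBean

  // Request `cycles` collections, then read used heap in bytes.
  // Repeating the request gives object graphs that need more than one
  // cycle a chance to be reclaimed, but nothing here is guaranteed.
  def usedAfterGc(cycles: Int = 3): Long = {
    var i = 0
    while (i < cycles) {
      System.gc()
      i += 1
    }
    memory.getHeapMemoryUsage.getUsed
  }

  // Approximate heap retained by the result of `body`, in bytes.
  // The value is indicative only and may even be negative.
  def retainedBy[A](body: => A): (A, Long) = {
    val before = usedAfterGc()
    val result = body
    val after = usedAfterGc()
    (result, after - before)
  }
}
```

For example, `RetainedHeap.retainedBy { new Array[Byte](16 * 1024 * 1024) }` would typically report a delta near 16 MB, though the exact number depends on the collector and on JVM flags such as those that disable explicit GC. Averaging over many iterations, as suggested above, is what makes the numbers usable.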

@SethTisue SethTisue modified the milestones: 2.12.3, 2.12.2 Mar 21, 2017
@SethTisue
Member

Tentatively retargeted for 2.12.3; change it back if that's wrong.

@adriaanm adriaanm mentioned this pull request Mar 27, 2017
7 tasks
@mkeskells
Contributor

Hi @retronym, we have updated the base profiler to support capturing stats from background threads (needed for #5815), such as thread CPU time and allocations.

This also includes better output control and formatting in https://github.com/rorygraves/scalac_perf/tree/2.12.x_profile2

Is it best to move these additional commits to a new PR, or to adjust the base of this one?

@adriaanm adriaanm added the performance the need for speed. usually compiler performance, sometimes runtime performance. label May 25, 2017
@retronym
Member Author

retronym commented Jul 3, 2017

I merged the original PR instead. I'll salvage any useful changes in this PR in a new one.

@retronym retronym closed this Jul 3, 2017