First of all, I’m not accustomed to Discourse’s categories, so I’m sorry if this is not the right channel to discuss this.
Since version 18, clang supports MC/DC coverage analysis.
The issue I’m facing here is that for a given executable, if the list of MC/DC coverage mappings is empty, I can’t know whether it is because
- -fcoverage-mcdc was not provided, and thus the code was not instrumented for MC/DC analysis;
- or the flag was provided and the code was instrumented with MC/DC in mind, but there were no relevant regions to instrument.
This is problematic, because for software qualification purposes (which is one of the motivations behind MCDC analysis), I need to make sure to raise an error at analysis time if we don’t have the requested data (e.g. calling llvm-cov with --show-mcdc when the program was only instrumented for basic coverage).
The most straightforward way to go, in my opinion, would be to add a field to the CoverageMapping format, perhaps a bitset representing “extra” coverage features.
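To make the idea concrete, here is a rough sketch of what such a feature bitset could look like. The names and encoding are purely illustrative; nothing like this exists in the coverage mapping format today:

```cpp
#include <cstdint>

// Hypothetical bitset of "extra" coverage features recorded alongside the
// coverage mapping. These enumerators are illustrative, not existing LLVM API.
enum CoverageFeature : uint32_t {
  CF_None   = 0,
  CF_Branch = 1u << 0, // instrumented with branch coverage
  CF_MCDC   = 1u << 1, // instrumented with -fcoverage-mcdc
};

// With such a field, a reader could distinguish "instrumented for MC/DC but
// no decisions to instrument" from "not instrumented for MC/DC at all",
// even when the list of MC/DC records is empty.
bool wasInstrumentedForMCDC(uint32_t Features) {
  return (Features & CF_MCDC) != 0;
}
```

llvm-cov could then warn (or error) when --show-mcdc is requested but the bit is absent.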
I’m willing to make the contribution if needed, though I’m not familiar with the LLVM codebase.
@evodius96 since you implemented MC/DC support in the first place, may I ask your opinion?
This is problematic, because for software qualification purposes (which is one of the motivations behind MCDC analysis), I need to make sure to raise an error at analysis if we don’t have the requested data
Can you elaborate on this requirement for software qualification – is there a particular standard where this is referenced? At least in the embedded use cases I’m most familiar with, I’m not aware of this.
If you need to show that you’ve instrumented for MC/DC, I’m not sure I follow why it isn’t enough to show that you’ve used the option -fcoverage-mcdc to enable it. Using the tools for qualification software relies on confidence that the documented options do what they are supposed to do, and this is backed up by testing.
Throwing an error during visualization if the instrumentation level doesn’t match the expectation is reasonable, but I want to make sure I understand why it’s necessary.
This feature is partly meant for user friendliness.
Without such a feature, users are expected to ensure by themselves that the MC/DC flags are used correctly along the whole pipeline, from the build to the report generation. This may not be straightforward in qualification environments, where the pipeline may be split across machines and handled by different people.
Moreover, this feature would prevent false negatives, which are dangerous in this context. Currently, if the code was not instrumented with MCDC support, llvm-cov --show-mcdc won’t report any violation, even though there might be some if the code had been correctly instrumented.
I see this as a way to make the coverage tools more robust and more likely to be used in critical environments.
On a side note, should you agree on the usefulness of this feature (which I hope you do ^^), we should discuss the expected behaviour of merging profraw/profdata files that have different coverage levels.
I understand. I’m trying to weigh the benefits vs. potential complications of this feature. Conceptually, it makes sense – I think it would be better as a warning diagnostic in llvm-cov that could potentially be made an error. If --show-mcdc or --show-mcdc-summary are provided with a profile that isn’t so instrumented, then you get the warning.
So this is where it could get complicated. Today, the profiles can be merged seamlessly. But with the above, we’ve now introduced profile variants that need to be managed appropriately. Just thinking out loud about possible ways forward when merging an MCDC instrumented profile with a non-MCDC instrumented profile:
1. Disallow merging between the two profiles (llvm-profdata issues an error).
2. Allow merging, but defer to the non-MCDC instrumented profile format and exclude MCDC data from the merge (llvm-profdata issues a warning).
3. Allow merging, but defer to the MCDC instrumented profile format. However, because the data is now inconsistent, I don’t think this option makes sense.
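Option #2 could be sketched roughly as follows. The Level values are hypothetical (profiles carry no such field today), and this is only an illustration of the merge policy, not an actual llvm-profdata implementation:

```cpp
#include <algorithm>
#include <cstdio>

// Hypothetical instrumentation levels, ordered from least to most detailed.
enum Level { L_Basic = 0, L_Branch = 1, L_MCDC = 2 };

// Option #2: the merged profile defers to the lower level of the two inputs,
// and the user is warned that the extra data from the richer profile is
// dropped rather than silently losing it.
Level mergeLevels(Level A, Level B) {
  if (A != B)
    std::fprintf(stderr,
                 "warning: input profiles have different coverage levels; "
                 "data above the lower level will be dropped\n");
  return std::min(A, B);
}
```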
In gnatcoverage, which I’m working on, we assume that “what can do more can do less”.
When merging trace files, we can specify an expected coverage level for the output, and if one of the input files has a lower level of instrumentation than the expected one, the merge will fail.
So that would be option #2.
Regarding implementation, I think this behavior should not be hidden from the user:
- Either raise a warning in llvm-profdata when there is a coverage-level mismatch, to let the user know that some data will be dropped;
- or add a CLI argument --require-level=[branch|mcdc|...] to llvm-profdata to raise an error if one of the input files does not meet the minimum level.
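The second variant, the --require-level check, could amount to something like this sketch. Again, the level field and the function name are hypothetical; llvm-profdata has no such option today:

```cpp
#include <vector>

// Hypothetical instrumentation levels, ordered from least to most detailed.
enum Level { L_Basic = 0, L_Branch = 1, L_MCDC = 2 };

// Behind a hypothetical --require-level=[branch|mcdc|...] option:
// the merge fails if any input profile is below the requested minimum level.
bool meetsRequiredLevel(const std::vector<Level> &Inputs, Level Required) {
  for (Level L : Inputs)
    if (L < Required)
      return false; // llvm-profdata would emit an error here
  return true;
}
```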