Generating LLVM source-based code coverage for C++ files that have no unit tests

We have a C++ code base and compile it with the LLVM compiler.

To generate source-based coverage reports, we follow the instructions in: Source-based Code Coverage — Clang 16.0.0git documentation

I am able to generate coverage reports for source files that have unit tests. However, for source files that don’t have any unit tests, llvm-cov does not generate any report.

I want to monitor the directory-level line coverage percentage. llvm-cov does a good job of generating both file-level and directory-level line coverage percentages.

The coverage percentage will be incorrect if llvm-cov generates coverage information only for unit-tested files. For example, when someone adds a new C++ src file without any unit tests to cover it, my expectation is the directory level coverage percentage should reduce. However with llvm-cov, the coverage percentage remains the same.
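To make the expectation concrete, here is a minimal sketch of how a directory-level line percentage behaves when an untested file is counted versus left out. The function name and the line counts are made up for illustration; they are not taken from a real report.

```python
def directory_line_coverage(files):
    """Return the line coverage percentage for a list of (covered, total) pairs."""
    covered = sum(c for c, _ in files)
    total = sum(t for _, t in files)
    return 100.0 * covered / total if total else 0.0


# Two unit-tested files (hypothetical numbers): 125 of 150 lines covered.
tested = [(80, 100), (45, 50)]

# The same directory once a new, untested 50-line file is counted as well.
with_untested = tested + [(0, 50)]

print(f"{directory_line_coverage(tested):.2f}%")         # untested file missing from the report
print(f"{directory_line_coverage(with_untested):.2f}%")  # untested file counted with 0 covered lines
```

If the new file never reaches the report, the first number is what gets shown, which is why the percentage appears not to move.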

I looked through the llvm-cov man page but could not find a flag that enables this.

More details on what the LLVM coverage information for a file looks like: LLVM Code Coverage Mapping Format — LLVM 16.0.0git documentation

Hi,

I can see that there are files in the report that have 0% coverage: https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/libcxx/src/ryu/d2fixed.cpp.html.

After building with the coverage flags (or with the CMake option for coverage, as the case may be), you need to run the program to produce the raw profiles and then index them. Unless you have some kind of workload or test case, the counters will not be updated and hence the coverage percentage will not increase.
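For reference, the build, run, index, and report sequence from the Clang source-based coverage documentation can be sketched as a small Python helper. The helper names and the `foo.*` file names are placeholders; the flags and tools are the documented ones (`-fprofile-instr-generate -fcoverage-mapping`, `LLVM_PROFILE_FILE`, `llvm-profdata merge`, `llvm-cov report`).

```python
import os
import subprocess


def coverage_steps(source, binary, profraw, profdata):
    """Return (argv, extra_env) pairs for one instrumented build-and-report cycle."""
    return [
        # 1. Build with profiling instrumentation and coverage mapping enabled.
        (["clang++", "-fprofile-instr-generate", "-fcoverage-mapping",
          source, "-o", binary], {}),
        # 2. Run the program; the raw profile is written when it exits.
        ([binary], {"LLVM_PROFILE_FILE": profraw}),
        # 3. Index the raw profile.
        (["llvm-profdata", "merge", "-sparse", profraw, "-o", profdata], {}),
        # 4. Print the summary report.
        (["llvm-cov", "report", binary, f"-instr-profile={profdata}"], {}),
    ]


def run_steps(steps):
    """Execute each step in order, stopping at the first failure."""
    for argv, extra_env in steps:
        subprocess.run(argv, env={**os.environ, **extra_env}, check=True)


steps = coverage_steps("foo.cc", "./foo", "foo.profraw", "foo.profdata")
# run_steps(steps)  # uncomment on a machine with the LLVM tools installed
```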

“For example, when someone adds a new C++ src file without any unit tests to cover it, my expectation is the directory level coverage percentage should reduce. However with llvm-cov, the coverage percentage remains the same.”

Are you sure that it is remaining the same? The percentage is calculated only up to 2 decimal places, and it’s possible that, because of the large numbers involved, it’s rounding to the same percentage. I’d suggest adding coverage flags to only a few files, generating the report, and checking whether the coverage percentage decreases after a new file is included. If it doesn’t, it may be a bug.
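A quick worked example of the rounding effect, with invented line counts: in a code base of one million lines at exactly 60% coverage, a small uncovered file does not change the two-decimal figure, while a somewhat larger one does.

```python
def percent_2dp(covered, total):
    """Format a line coverage percentage with two decimal places."""
    return f"{100.0 * covered / total:.2f}"


covered, total = 600_000, 1_000_000          # hypothetical large code base

before = percent_2dp(covered, total)         # baseline
small = percent_2dp(covered, total + 40)     # plus an uncovered 40-line file
large = percent_2dp(covered, total + 200)    # plus an uncovered 200-line file

print(before, small, large)
```

Here `before` and `small` print the same value even though the underlying ratio dropped, which is the case described above.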

Are you sure the files in question are actually getting linked into a binary? Object files in static libraries that aren’t used are normally discarded. And if the file isn’t in your binary, llvm-cov doesn’t know it exists.


As you mentioned, llvm-cov does generate 0% coverage for files that are not unit tested.

I had not included some files in the executable, so, as you said, no coverage reports were generated for them.

Once I included the files, it started generating zero coverage for them.

Thank you for the quick response.

One follow-up question:

Due to an internal build tool limitation, we don’t compile some source files. However, we do want to generate zero-coverage reports for these files as well.

Is there any command you know that would generate zero coverage reports for a given file, without compilation and execution? Or any idea how we can handle it?

Did you find a solution for your problem?

This is a huge limitation, IMHO. I have multiple test executables and a static library that is tested by these executables. Of course not all files are included, only the files for which a test currently exists, so all the files that are not included do not count at all, which messes up the coverage numbers. This is not “source-based code coverage”; it is based on whatever is linked into the binary.
LLVM should be able to add 0% to the report for every source file that is not tested; that would be real “source-based code coverage”.

@0verEngineer

Sorry for the late reply. I did not see your message.

Yes, we sort of came up with a workaround. We identified all the source files that are not included in the output coverage.json, and then generated mock zero coverage for them.

By mock I mean a simple rule that marks all comment and blank lines as uninstrumented and the remaining lines as instrumented but not covered. So in the HTML report, you would see them as all orange.

A more accurate way to identify the instrumented lines in a file is to use the segments and region information in llvm-cov’s coverage.json. That approach is more complex. Let me know if you need additional information on any of the above.
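Roughly, the workaround could be sketched like this in Python. This is an illustration, not our exact tooling: the function names are hypothetical, the JSON layout assumed is the `data[].files[].filename` structure written by `llvm-cov export`, and the line rule used here treats blank and comment-only lines as uninstrumented and every other line as instrumented with a count of zero.

```python
def missing_sources(export, source_files):
    """Return the source files that have no entry in an llvm-cov export dict.

    Paths are compared as plain strings, so both sides are assumed to use
    the same form (for example, both absolute).
    """
    reported = {f["filename"]
                for entry in export.get("data", [])
                for f in entry.get("files", [])}
    return sorted(path for path in source_files if path not in reported)


def mock_zero_coverage(source_text):
    """Map each line number to None (uninstrumented) or 0 (never executed).

    Heuristic only: code that shares a line with the start of a /* ... */
    comment is treated as a comment line.
    """
    status = {}
    in_block_comment = False
    for number, raw in enumerate(source_text.splitlines(), start=1):
        line = raw.strip()
        if in_block_comment:
            status[number] = None
            if "*/" in line:
                in_block_comment = False
        elif not line or line.startswith("//"):
            status[number] = None
        elif line.startswith("/*"):
            status[number] = None
            in_block_comment = "*/" not in line
        else:
            status[number] = 0
    return status
```

The result of `mock_zero_coverage` can then be turned into whatever report format is in use, with the `0` lines counted as instrumented but uncovered.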
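A simplified sketch of the segment-based idea follows. It assumes each entry in a file’s `segments` array has the shape `[line, col, count, hasCount, isRegionEntry, isGapRegion]`, and it only approximates how llvm-cov decides that a line is mapped; the function name is hypothetical and edge cases are not handled.

```python
from collections import defaultdict


def instrumented_lines(segments):
    """Return the set of line numbers that the segments mark as instrumented."""
    by_line = defaultdict(list)
    for seg in sorted(segments, key=lambda s: (s[0], s[1])):
        by_line[seg[0]].append(seg)
    if not by_line:
        return set()

    mapped = set()
    wrapped = None  # last segment that started before the current line
    for line in range(min(by_line), max(by_line) + 1):
        line_segs = by_line.get(line, [])
        # A counted, non-gap region starts somewhere on this line.
        starts_region = any(
            s[3] and s[4] and not (len(s) > 5 and s[5]) for s in line_segs
        )
        # The line opens with a skipped region (region entry without a count).
        starts_skipped = bool(line_segs) and not line_segs[0][3] and line_segs[0][4]
        # A counted region from an earlier line is still open here.
        wrapped_has_count = wrapped is not None and wrapped[3]
        if not starts_skipped and (wrapped_has_count or starts_region):
            mapped.add(line)
        if line_segs:
            wrapped = line_segs[-1]
    return mapped
```

For a file with two functions on lines 10 to 12 and 20 to 22, the lines in between have no counted segment wrapping them and are left out of the result.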

@kshitij

Would love to see a writeup on this topic. Are you somehow generating empty .profraw files directly from the source code which you mark as not executed such that they would be included in your llvm-cov report?