In this final article on software testing, we look at test coverage. We first discuss what it is and what good test coverage looks like. We then jump straight into an example and learn how to use gcov and lcov, two popular tools for reporting test coverage, and see how we can compute test coverage and visualise it with interactive HTML reports. We also touch upon the issue of inlined functions and explore what these are, why they are good, but also why they may interfere with coverage reports. We’ll explore fixes and implement one of them.
By the end of this article, you will know pretty much all there is to know about test coverage metrics and how you can use gcov and lcov to get an idea of your code’s test coverage. You should then be able to take what you have learned in this article and apply it to your own projects as well. If you already have a test suite in place, adding test coverage is as simple as adding a few additional commands to your build script, and there is no good excuse not to make it part of your build process.
Download Resources
All developed code and resources in this article are available for download. If you are encountering issues running any of the scripts, please refer to the instructions for running scripts downloaded from this website.
- Download: Software-Testing-Part-6.zip
In this series
- Part 1: How to get started with software testing for CFD codes
- Part 2: The test-driven development for writing bug-free CFD code
- Part 3: How to get started with gtest in C++ for CFD development
- Part 4: How To Test A CGNS-based Mesh Reading Library Using gtest
- Part 5: How To Test A Linear Algebra Solver Library Using gtest
- Part 6: How to use mocking in CFD test code using gtest and gmock
- Part 7: What is test coverage and how to use LCOV/GCOV for testing
In this article
- What is test coverage
- Your new best friends: lcov and gcov
- Adding test coverage to our complex number class example
- Modifying the UNIX build script
- Inspecting the generated HTML output (what's wrong?)
- Intermission: what are inline functions?
- Why are inline functions interfering with gcov's code instrumentation?
- Solution: Separate the header file into an interface (*.hpp) and implementation (*.cpp) file
- Modifying the build script yet again
- At last, a realistic HTML coverage report
- Summary
What is test coverage
At this point, we have discussed software testing at length. We first discussed the need for software testing: to ensure quality, but also to allow for code refactoring, i.e. rewriting existing code as our code base grows. We discussed the different types of tests, focused on unit, integration, and system tests, and then moved on to discuss how we would write these tests in a test-driven development approach.
We looked at Google’s popular testing framework for C++, called Google test, or simply gtest, and then applied that to our previously developed linear algebra solver and mesh reading library. Finally, we looked at how we can mock dependencies with gtest (using its own mocking framework called gmock) and why we would like to do that for our CFD console applications.
So we have written a lot of code, and we have written additional test code, but we can’t stop there. Writing the tests is one thing, but we want to get a feel for how well we are doing with our tests. This is where test coverage comes in.
Test coverage allows us to check how much of our codebase we are actually testing. Coverage tools instrument our code during compilation, which allows them to count all functions and lines of code that are executed during the tests. Once all tests have run, a coverage report is generated that lets us inspect which functions and lines were tested. Some tools will also tell you how often a certain line or function was executed.
The coverage tool will then typically condense that information into a single metric: the code coverage in percent. This percentage shows how many of the lines in your codebase you exercise during the tests. If all lines of code are executed, you get 100%. Given this, it may seem logical that we want as much coverage as possible and should always strive for 100% test coverage, right? There is a great comparison in Roelof Jan Elsinga’s blog entry on test coverage and why you may or may not want to achieve 100% test coverage.
Should we strive for 100% test coverage?
Striving for 100% coverage is great because you are testing the entire code base, so any changes that break existing code should be found easily. Furthermore, if you think that you are testing all code but your coverage is less than 100%, you can find dead and unused code. On the other hand, 100% may be misleading. Tests simply check that, for a given input, a certain output is produced. That’s all. We use that as a surrogate for code correctness. If you write a test with wrong inputs and assert wrong outputs, you exercise all code, but you do not test its correctness.
What do I mean by that? Let’s look at an example. If we write the simplest of all libraries in the world, a calculator (oh yes, it is time for the famous and bad example of a C++ calculator again), then we will have functions for addition, subtraction, multiplication, and division. Then we go on to write a unit test for the addition example and, because we are high-paid software engineers, we muster the courage to write a test which asserts 1 + 1 = 2 (finally, all those years spent at university pay off!).
This is an overly simplified example, but it shows that for certain tests we can very easily reason what the output ought to be for a given input. But let’s make this more real and return to our mesh reading library: when we were testing the mesh reading, say the coordinate reading, for example, how do we know we are getting the right coordinates for a given input file, especially if we get it from a mesh generator where we have no idea how the mesh file was generated?
I did not tell you how I came up with the assumed true coordinates, mesh interface information, or boundary information, so here is how I did it: during the tests, I simply printed all of this mesh information to the console and then inspected the output. I did some sense checking (drawing a grid with all the information) and then made a judgement call on whether I believed the data returned to me was correct. Then I turned the print statements into assertions and, after the tests passed, I repeated the process for the other unit tests.
But what if I made a wrong assumption? What if the mesh generator messed up one part and now I am starting to assert some non-sensical mesh data? Ultimately, this mesh will result in our CFD solver failing, as we need to have the correct mesh to begin with, and we will not spot this issue during unit testing. But we will get test coverage for it and might be lulled into believing that since the code is covered by tests, it is working correctly. So there is not always a correlation between code correctness and test coverage.
This is why black-box testing is so important. Black-box testing doesn’t care which part of the code is exercised; it cares only about the final outcome. If you can show that the code is doing what it is supposed to, then you have the strongest protection. The opposite is to check only individual functions (unit testing), which we classified as white-box testing. Unit tests offer a good amount of protection against regressions, but even with all unit tests passing and all of them being correct, the units still need to work together to achieve correct behaviour at the solver level.
It would be nonsense to write a structured mesh-based solver and then use our unstructured mesh reading library. We can write unit tests for both of them and both of these could pass, but we can never combine a structured-mesh-based data structure with an unstructured grid, so a system test here (black-box testing) would uncover this incompatibility.
But there is one more reason I personally believe should actively discourage you from striving for 100% test coverage: you start writing tests for code that does not need to be tested, as it does not contribute to the business logic of your code. The business logic is the primary outcome your code should achieve. A CFD solver should solve the Navier-Stokes equations, so you should make sure this is achieved, and any function that will help you to do this should be tested.
We will come across an example later where we will see that not all code needs to be tested. The reason why you should avoid testing code that does not contribute to your primary outcome is that these tests quickly turn into brittle tests. Remember, brittle tests are those that may result in false positive test outcomes, i.e. your test all of a sudden fails, even if the primary outcome of your code is still correct. You start spending time fixing tests that don’t contribute to your objective, and thus waste time on code that should not be tested in the first place.
So having discussed code coverage, we need to look at some tools to do that for us. There are quite a few online solutions available, where you can upload your code and then get a test result back. This may seem inconvenient, but when we look at continuous integration and delivery (CI/CD) later, then this makes sense, as we can automate the process and generate a new coverage report every time we upload a new code change to our online code repository.
For the moment, though, we will look at tools that can help us achieve coverage reports locally, and for that, we will look at lcov and gcov. We will discuss them in the next section.
Your new best friends: lcov and gcov
As alluded to in the previous section, lcov and gcov are tools that we can use locally (offline) to instrument our code and provide us with code coverage metrics. There is a caveat here: they only work on UNIX. Windows does have some options available, but they seem to be either expensive or available only if you are working with Microsoft’s flagship Visual Studio IDE (integrated development environment).
While we will only look at UNIX in this article, this is not necessarily a restriction for us. In reality, we don’t run test coverage locally on our PC but rather in the cloud after pushing changes to our online code repository. These are all based on UNIX systems, so if we wanted to integrate test coverage, we would have to interact with a UNIX system anyway.
If you are on Ubuntu, you can install lcov and gcov through your package manager with
sudo apt install -y lcov
This will install both lcov and gcov, but you could also install gcov separately with the same command if you need to. On macOS, you can get lcov through Homebrew with
brew install lcov
So what do these two tools do? gcov is the part that instruments your code: it counts the lines and functions that get executed and summarises that in a non-human-friendly report. This is where lcov comes in; it is a human-friendly interface to the coverage reports generated by gcov. Thus, we first have to run all tests, now instrumented by gcov, and then afterwards run lcov on the outputs generated by gcov.
lcov does come with a utility called genhtml, which, as the name suggests, will generate a coverage report in HTML format for us to inspect in our browser. It allows us to go interactively through our code base and see graphically which lines of code were and were not executed. We can use this to quickly see which parts of the code we may want to cover further.
The good news is that we don’t have to change anything in our code to add test coverage to our project. You only have to modify your build scripts, and this is what we are going to do in the next section.
Adding test coverage to our complex number class example
In a previous article, we developed a little toy example that we used to learn how to use gtest. This toy example implemented a complex number class, which we used for testing. We will use the same project here as a starting point and then modify only the UNIX build script. In fact, if you want to reuse your existing project, make a copy of that example and remove the Windows build script; you won’t need it (but you will need a UNIX environment, so if you are on Windows, you will need to install WSL first, welcome to the future).
Modifying the UNIX build script
Our UNIX build script itself remains pretty much the same. The only difference here is the additional compiler flag --coverage on line 10, which tells our compiler to liaise with gcov, which we include on line 13 as an additional library/dependency. Since we installed lcov and gcov through the package manager, i.e. apt on Ubuntu and Homebrew on macOS, we don’t have to provide a location to look for this library with the -L flag; the package manager puts it in a location where the compiler can see it by default.
The additional code coverage magic happens on line 19 and onwards. We instruct lcov to check the build/ directory for any code coverage report and write out a human-friendly version of that into the build/ folder, which we call coverage.info. There may be a lot of noise in this file; for example, in my case, gcov also instrumented all code within the gtest library, as well as the C++ standard library, because I am using elements from these libraries. So we want to filter out only the parts that we are interested in.
This is what we do on line 22: we first extract only directories that contain src/, allowing any directories before or after it (indicated by the asterisks, i.e. *), and then we write a modified coverage report into the build/ directory. Then, on line 23, I also want to remove all directories containing my test code, and so remove any files within a tests/ folder, using the same asterisk-based pattern seen before.
Finally, we create a coverage report in HTML format on line 26 using genhtml, instructing it to write all files into the coverage/ folder, which we created on line 6. After we run this script, we should get a file called index.html within the coverage/ folder, which we can open with our favourite browser and inspect.
#!/bin/bash
# clean up before building
rm -rf build coverage
mkdir -p build
mkdir -p coverage
# compile source files into object files
g++ -c -g --coverage -std=c++20 -I. -I ~/libs/include/ ./tests/unit/testComplexNumbers.cpp -o ./build/testComplexNumbers.o
# create test executable
g++ -std=c++20 ./build/testComplexNumbers.o -o ./build/testComplexNumbers -L ~/libs/lib -lgtest -lgcov
# run tests
./build/testComplexNumbers
# capture coverage
lcov --capture --directory ./build --output-file ./build/coverage.info
# extract coverage from the source (src) directory
lcov --extract ./build/coverage.info '*/src/*' --output-file ./build/coverage-src.info
lcov --remove ./build/coverage-src.info '*/tests/*' --output-file ./build/coverage-final.info
# generate HTML report
genhtml ./build/coverage-final.info --output-directory coverage
To understand what lines 22-23 are doing, try a simplified approach as well. For testing, replace lines 18-26 with the following lines and see what happens:
# capture coverage
lcov --capture --directory ./build --output-file ./build/coverage.info
# generate HTML report
genhtml ./build/coverage.info --output-directory coverage
You may get a lot of noise as well, similar to my output, though it may differ based on your system. If you see additional folders you want to exclude in your case, check which folders you want to extract and which you want to remove, and use the syntax seen in the build script on lines 22-23 to tailor your coverage report.
Inspecting the generated HTML output (what’s wrong?)
OK, with the coverage created, let’s open the HTML output file (again, for me it is in coverage/index.html, and if you followed the same build script, this is where your file is located as well) and see what we are getting. My output is shown below:
The first thing you want to check is the top right corner. It states that we have a total of 43 lines in our project, of which we are exercising 43, thus we get 100% test coverage. Similarly, we can see that we have a total of 11 functions in our complex number class, and we are using all of them, so again, 100% coverage here as well. Excellent, isn’t it?
Well, let’s see. Below the test coverage metrics and some additional metadata that primarily reveals that I like to start my days early, we have a summary of all folders that we currently test in our coverage. If you see any folders that shouldn’t be here, then you will need to go back to the build script and remove these folders, similar to what we have done on line 23 in our build script. We can click on the src/ folder, which will then list all files within the folder, of which we only have one (the complex number class). Click on that file, and you will see the following:
All code highlighted in blue is covered by our tests. On the left, we have our line numbers in the yellow column, and next to it the line data. This data tells us how often our tests visited a certain line. For example, the constructor was visited 19 times, and we called the isNan() function 19 times as well, but we only ever saw the end of the constructor 18 times. Why? Because in one test the isNan() function throws an error and we never make it to the end of the constructor. The Re() and Im() functions are called a total of 5 times, and some other lines only once.
Did you notice that the destructor was never called on line 14? Or what about the setIm() function on line 19? This doesn’t seem to be called either. Strange, didn’t we get 100% coverage? Well, let’s look further down the document and we will see the following code:
We see that the operator<<() function is not tested, yet we are still getting 100% test coverage, so what’s going wrong here? Did we not set all compiler flags properly? Well, let’s try to find a solution for this.
Intermission: what are inline functions?
To increase computational performance, C++ compilers can make use of a little trick called inlining. This allows the compiler to replace a function call with the content of the function itself. Let’s look at a simple example:
#include <vector>
double centralScheme(double leftCellValue, double rightCellValue) {
return 0.5 * (leftCellValue + rightCellValue);
}
int main() {
int maxIteration = 10000;
int numberOfCells = 1000000;
std::vector<double> velocity(numberOfCells, 0.0);
for (int iteration = 0; iteration < maxIteration; ++iteration) {
for (int cell = 0; cell < numberOfCells - 1; ++cell) {
double faceValue = centralScheme(velocity[cell], velocity[cell + 1]);
// ... rest of the solver goes here ...
}
}
return 0;
}
We are creating here a basic structure for a CFD solver, where we loop iteratively over our solution, i.e. on line 12 we loop over 10 000 iterations (or time steps), and for each iteration (time step), we have to loop over our 1 000 000 cells. For each cell, we want to find the interpolated velocity at the face that connects two cells. For that, we use the centralScheme() function (arguably not a good choice, but sufficient for this simple example), which we define on lines 3-5.
Since we have 10 000 iterations (time steps), in each of which we call the centralScheme() function 1 000 000 times, this results in a total of 10 000 * 1 000 000 = 10 000 000 000 (10 billion) function calls. Functions are a bit like pointers, in that every time we call a function, we first have to look up where in memory it is stored before executing the code within it. Once the function is finished, we have to look up where to write the return value in memory, which requires another look-up.
These look-ups are not critical and execute pretty fast, but if you have a function which consists of 1 line, and you add a bit of overhead, that overhead can quickly be as time-consuming as the function code itself. If you call this function now 10 billion times, then you are quickly adding a lot of overhead that will slow down your code considerably.
If this reminds you of pointers, where we have to check first where the pointer is pointing to in memory before we can read its value, then you are spot on, it is pretty much the same. This memory look-up (computer scientists like to use the fancy term indirection, so let’s be cool and use it as well), er, pardon me, I mean, indirection, causes overhead for pointers as it does for functions. We looked at this issue in some more detail for pointers when we discussed memory management in C++. Give it a read if you need to remind yourself how memory management works in C++.
With pointers, if we want to get rid of this issue, we replace them with variables allocated on the stack, rather than the heap (again, we go through that in the above-mentioned article), and we can increase performance. And for functions? What can we do here? Well, we use inlining. So let’s look at the example again. If we instead defined the centralScheme() function like the following
inline double centralScheme(double leftCellValue, double rightCellValue) {
return 0.5 * (leftCellValue + rightCellValue);
}
where we are now using the keyword inline at the beginning of the function signature, we are telling the compiler to replace any call to centralScheme() in the code with the function body, i.e. line 2. So, if we look at the iteration and space loop again in isolation, we have the following:
for (int iteration = 0; iteration < maxIteration; ++iteration) {
for (int cell = 0; cell < numberOfCells - 1; ++cell) {
double faceValue = centralScheme(velocity[cell], velocity[cell + 1]);
}
}
Instead of calling the function now, the compiler will inline the function, which will result in the compiler rewriting our code to:
for (int iteration = 0; iteration < maxIteration; ++iteration) {
for (int cell = 0; cell < numberOfCells - 1; ++cell) {
double faceValue = 0.5 * (velocity[cell] + velocity[cell + 1]);
}
}
On line 3, we have now copied the function body and replaced the parameters leftCellValue and rightCellValue with the actual values we passed to the function, i.e. velocity[cell] and velocity[cell + 1]. This further removes a copy that we previously made.
The good news is that we don’t have to start putting the inline keyword everywhere in our code; using the inline keyword is pretty much useless these days. The compiler will automatically decide whether or not to inline functions, and this may depend on the optimisation level we allow the compiler to use, i.e. the -O and /O flags for the g++ and cl compilers, respectively. Even if we put the inline keyword in front of a function, the compiler may decide to ignore it. And if we don’t put it there, the compiler may still inline the function, so it is best to leave this decision to the compiler anyway.
Why are inline functions interfering with gcov’s code instrumentation?
Inlining is a good thing, and we should be thankful to our compiler if it is done correctly. But why does it mess up our code instrumentation and suggest that we have 100% code coverage? Inlined functions share some resemblance to templates: with templates, we provide the blueprint, but the compiler generates the code, one function for each template argument. If we call the same function with 50 different template types, then the compiler writes 50 different overloaded function definitions.
With inlined functions, the compiler will only compile a function and include it in the final executable if the function actually gets called. If it doesn’t, it is simply thrown out. The final piece of the puzzle is that functions defined inside a class body are implicitly inline, so if we have functions that are inlined within a class but never called, they don’t make it into the final executable, and gcov has no way of knowing that these functions existed in the original source file. We can see them, but gcov can’t.
So how can we get around this? There are three solutions, listed in order of increasing ridiculousness:
- Separate the interface from the implementation. So far, we have provided all code for our complex number class within the header file. If we create a source file instead, where we implement the interface that we have defined in the header file, then the compiler is forced to compile all code and it will show up again in our coverage report.
- Create a dummy function that simply calls all functions that are never executed. We don’t have to call this function anywhere, by just providing the function with a reference to the functions that are never executed, the compiler will be forced to include the function in the compilation.
- Pay for a better coverage tool. Yes, I have found this solution online, this was a serious suggestion, no thank you.
Why do I say that these are in increasing order of ridiculousness? They all require us to make fundamental changes to our code, and we now have a coverage tool dictating how we ought to write (or modify) our code. In return, we get a single number, a test coverage percentage, which raises the question of whether it is worth doing.
You might argue that the first suggestion is a reasonable one (and I would agree to some extent); it is a good idea to provide separate header (interface) and source (implementation) files. But once we start using templates, this becomes a bad idea. Think about header-only libraries: for me, the main reason to write a header-only library, apart from the fact that it is easy to use, is that it allows us to go crazy with templates, and libraries like the linear algebra library Eigen3 make very liberal use of them.
A tool should not limit how we write code, but sometimes a compromise is required. In this case, there is no real reason not to provide separate interface and implementation files, so that is the solution we will take for now. If we wanted to start using templates, we might have to use option three (granted, there are other tools which are absolutely free, and we will come across a few of them later, but someone actually suggested that you should buy expensive code coverage tools to get around this issue; I just like to be a bit dramatic).
Solution: Separate the header file into an interface (*.hpp) and implementation (*.cpp) file
So then, let us quickly dissect our complex number class and separate the interface from its implementation. We first concentrate on the interface, which remains defined in the src/complexNumbers.hpp file. The content is given below.
#pragma once
#include <iostream>
#include <cmath>
#include <stdexcept>
#include <limits>
class ComplexNumber {
public:
ComplexNumber(double real, double imaginary);
~ComplexNumber() = default;
double Re() const;
double Im() const;
void setRe(double real);
void setIm(double imaginary);
void conjugate();
double magnitude() const;
ComplexNumber operator+(const ComplexNumber &other);
ComplexNumber operator-(const ComplexNumber &other);
ComplexNumber operator*(const ComplexNumber &other);
ComplexNumber operator/(const ComplexNumber &other);
friend std::ostream &operator<<(std::ostream &os, const ComplexNumber &c);
private:
void isNan(double real, double imaginary) const;
private:
double _real;
double _imaginary;
};
We have essentially just left all function declarations in the class but removed all the function bodies (implementations). These have now moved into the new src/complexNumbers.cpp file, which is given below:
#include "src/complexNumbers.hpp"
ComplexNumber::ComplexNumber(double real, double imaginary) : _real(real), _imaginary(imaginary) {
isNan(real, imaginary);
}
double ComplexNumber::Re() const { return _real; }
double ComplexNumber::Im() const { return _imaginary; }
void ComplexNumber::setRe(double real) { _real = real; }
void ComplexNumber::setIm(double imaginary) { _imaginary = imaginary; }
void ComplexNumber::conjugate() { _imaginary *= -1.0; }
double ComplexNumber::magnitude() const { return std::sqrt(std::pow(_real, 2) + std::pow(_imaginary, 2)); }
ComplexNumber ComplexNumber::operator+(const ComplexNumber &other) {
isNan(_real, _imaginary);
isNan(other._real, other._imaginary);
ComplexNumber c(0, 0);
c._real = _real + other._real;
c._imaginary = _imaginary + other._imaginary;
return c;
}
ComplexNumber ComplexNumber::operator-(const ComplexNumber &other) {
isNan(_real, _imaginary);
isNan(other._real, other._imaginary);
ComplexNumber c(0, 0);
c._real = _real - other._real;
c._imaginary = _imaginary - other._imaginary;
return c;
}
ComplexNumber ComplexNumber::operator*(const ComplexNumber &other) {
isNan(_real, _imaginary);
isNan(other._real, other._imaginary);
ComplexNumber c(0, 0);
c._real = _real * other._real - _imaginary * other._imaginary;
c._imaginary = _real * other._imaginary + _imaginary * other._real;
return c;
}
ComplexNumber ComplexNumber::operator/(const ComplexNumber &other) {
isNan(_real, _imaginary);
isNan(other._real, other._imaginary);
double denominator = other._real * other._real + other._imaginary * other._imaginary;
if (std::abs(denominator) < std::numeric_limits<double>::epsilon())
throw std::runtime_error("Complex number division by zero");
ComplexNumber c(0, 0);
c._real = (_real * other._real + _imaginary * other._imaginary) / denominator;
c._imaginary = (_imaginary * other._real - _real * other._imaginary) / denominator;
return c;
}
std::ostream &operator<<(std::ostream &os, const ComplexNumber &c) {
os << "(" << c._real << ", " << c._imaginary << ")";
return os;
}
void ComplexNumber::isNan(double real, double imaginary) const {
if (std::isnan(real) || std::isnan(imaginary))
throw std::runtime_error("Complex number is NaN");
}
We haven’t changed any code, we have simply separated the interface from the implementation. If you need a refresher on what the class is doing, it was discussed in more depth when we looked at test-driven development, where we used this class as a case study. Have a look at this article if you need some additional guidance.
Modifying the build script yet again
Modifying is perhaps a strong word here; within our build script, we only have to account for the new source file we created above, i.e. the src/complexNumbers.cpp file. This is now compiled in the build script below on line 10 and subsequently included in the linking stage on line 14.
#!/bin/bash
# clean up before building
rm -rf build coverage
mkdir -p build
mkdir -p coverage
# compile source files into object files
g++ -c -g --coverage -std=c++20 -I. ./src/complexNumbers.cpp -o ./build/complexNumbers.o
g++ -c -g --coverage -std=c++20 -I. -I ~/libs/include/ ./tests/unit/testComplexNumbers.cpp -o ./build/testComplexNumbers.o
# create test executable
g++ -std=c++20 ./build/complexNumbers.o ./build/testComplexNumbers.o -o ./build/testComplexNumbers -L ~/libs/lib -lgtest -lgcov
# run tests
./build/testComplexNumbers
# capture coverage
lcov --capture --directory ./build --output-file ./build/coverage.info
# extract coverage from the source (src) directory
lcov --extract ./build/coverage.info '*/src/*' --output-file ./build/coverage-src.info
lcov --remove ./build/coverage-src.info '*/tests/*' --output-file ./build/coverage-final.info
# generate HTML report
genhtml ./build/coverage-final.info --output-directory coverage
At last, a realistic HTML coverage report
OK, so we have now found a solution to our coverage metric deficiency, what is the HTML report saying? Well, let’s have a look.
Now the top right corner does depict a more realistic scenario; we are still hitting the same number of lines and functions as we saw before (and this makes sense), but since all functions are now compiled and visible to gcov, we end up with a higher total number of lines and functions, which reduces our coverage metrics to 91.5% (line coverage) and 54.6% (function coverage), respectively.
If we click through to the src/ folder and the complex number class file, we now get a different report when inspecting the source code, as shown below.
At the top of the report, we see that the setIm()
function is never called, while we do use the setRe()
somewhere. Is this a critical issue? Do we now need to quickly write a test for this function before anyone else notices that we forgot to test it? If we do that, then we are writing a test which is, in spirit, equivalent to 1 + 1 = 2
. Some functions are so simple they don’t need to be explicitly tested. If we use these functions somewhere in our test to set up our test, then this is fine, but we don’t need to write separate tests for these.
In fact, I would challenge you to come up with a different implementation of the setIm()
function that introduces a bug but retains the same complexity (i.e. a single line of code). If you can come up with an implementation that compiles but is wrong, while still being only a single line, then yes, you need a test, but given the simplicity of the code, we can assure ourselves that it is working correctly.
Further down in the HTML report, we now see a new function popping up as not being tested, as shown in the image below.
This is the operator<<()
function and we don’t need a test for it either. But let’s entertain the idea that we do want to test this function for a second. How would we do that? Well, let’s look at a possible implementation.
#include <sstream>
TEST(ComplexNumberTest, TestConsoleOutput) {
// Arrange
ComplexNumber a(1.2, -0.3);
std::stringstream stream;
// Act
stream << a;
// Assert
ASSERT_EQ(stream.str(), "(1.2, -0.3)");
}
We create a complex number and a std::stringstream
variable that will be able to capture any output emitted to a std::ostream
. We use the stream in the act section, where we call the operator<<()
of the complex number class and write the output into the stream
variable, which we can then use in the assert section by checking that the string we received is equivalent to (1.2, -0.3)
. If it is, the assertion will pass and we will be done.
But let’s say we change the operator<<()
implementation so that any whitespaces are removed. So, instead of printing (1.2, -0.3)
, we are now printing (1.2,-0.3)
instead. Our test fails, but does this failure indicate anything about our complex number class performing correctly or incorrectly? No, it doesn’t. Behaviour that says nothing about whether the class works correctly should not end up being tested.
This decision is hard; most people strive for 100% test coverage because it looks like we care about our code and try to test every possible line. To me, a test coverage of less than 100% is more honest and conveys a picture of developers trying to test scenarios rather than lines of code. If all the scenarios deemed important for the code to work as intended pass, then that is a much stronger indication that your code is bug-free, even though this type of testing may not cover every line of code.
For example, the operator<<()
implementation is something I like to provide for my classes. Not necessarily because I want to print content from it, but because during development it is cheap, quick, and easy to print the content of the class instead of stepping through it with a debugger. It is a convenience I leave in for myself (or others), and removing that convenience just to achieve 100% test coverage is wrong, at least for me.
I’ll let you decide where you stand on this issue; many disagree with me and still strive for 100% test coverage, and that’s OK too. In the end, it is your project, your code, and only you can decide what is best to ensure the code works correctly.
Summary
This concludes our little exploration of software testing and how we can incorporate it into our CFD projects. This article provided us with an insight into how we can add test coverage to our tests, which allows us to get an idea of how much of our code base is exercised during testing.
Using gcov and lcov, we were able to instrument our code and then generate a human-readable HTML report that allowed us to visually inspect which lines of code were covered and which weren’t. This may help us decide if we have tested every line of code that we wanted to test. This is particularly useful if we have if/else statements in our code base and we want to ensure that we have tested each branch of the if/else tree.
Software testing is not necessarily a popular topic and you can probably find many opponents, or at least disgruntled developers who accept that software testing is important but don’t enjoy writing lots of unit, integration, and system tests. However, software testing provides us with the best possible protection against software regressions (bugs) and we should make every effort to detect these automatically.
There is one more thing, and this is where software testing becomes absolutely vital: refactoring. I have mentioned this term a few times before and explained that refactoring is the process of rewriting your code, either to extend or to simplify it. Without software testing, you have no idea whether changes to old code will affect its correctness, but having tests in place will allow you to catch any bugs that may creep in. If you don’t refactor, your code will become a mess over time; it’s called entropy and you can’t avoid it. Don’t fight it, embrace software testing instead!
Tom-Robin Teschner is a senior lecturer in computational fluid dynamics and course director for the MSc in computational fluid dynamics and the MSc in aerospace computational engineering at Cranfield University.