cfd.university

Another cornerstone of object-orientated programming is inheritance, which is the ability to define parent-child relations between classes. Inheritance allows child classes to derive functions and variables from the parent class and so we are able to group together classes that naturally fit together under a common base class. For example, the base class RANS may have several derived classes such as Spalart-Allmaras, kEpsilon, kOmega, and so on. In this article, we explore how to achieve this feature, why it allows us to clean up and write DRY (Don't Repeat Yourself) code, and also what some limitations are.

In this series

In this article

A motivating example

If you are already a programming expert, you may have looked at the last article's class examples given at the end for mesh reading, gradient calculation, and linear system of equation solvers and thought to yourself, this article was rubbish, the presented class designs have so many flaws, and you are right. This was on purpose, to naturally lead into this section on inheritance.

If you have found no issues with the previous class designs, then let's dissect them together and see why they need improvements. Let's start with the last design on linear systems, and for reference, I have copied the definition given in the previous article below.

We only really looked at two functions within this class; CG() and GMRES(); however, if we look at other linear algebra libraries such as PETSc, we see that there are many more possible solvers available. PETSc, in particular, has over 100 different solvers to compute the solution to the linear system [katex]Ax=b[/katex]. So, should we add another 100 functions to our class, to create this sort of mega-class that will probably have 10 000+ lines of code? Well, I hope you have answered this question with no. No one wants to read or extend a class which has 10000+ lines of code, classes should be small and precise and only ever do one thing. A linear algebra package that tries to solve too many problems at once will just become a maintenance nightmare over time.

So, what is the solution to this problem? Well, let's look back at our motivation for classes. We said that when we have shared states, i.e. we have variables that are used by several functions, it would make sense to bundle these variables and functions together into a class, make them private, and then let functions within the class use that data and manipulate it to achieve a desired outcome. This works well if there is only one way to achieve this, but in reality and in most cases, there are often several ways to achieve the same outcome. We have already seen, and discussed, that solving Ax=b can be done with several solvers, but here are a few more examples:

Writing a computational cell class

The above is not an exhaustive list and you can try to come up with additional examples as an exercise, CFD solvers are full of potential to use classes which have more than a single possible implementation logic. Let's pick the last example on computational cells and let's construct the class for a prism and triangle class. We'll only look at pseudocode, i.e. no implementation, we are only interested in the class structure.

We define two classes, Triangle and Prism on lines 3 and 19, respectively. In both classes, we define a VectorType on lines 5 and 21, and both classes provide a function to calculate the shape's centroid (lines 12 and 28), as well as the shape's area (2D) / volume (3D) on line 13 and 29, respectively. The caller of the class is able to get the centroid and area/volume on lines 8-9 for the Triangle class and on lines 24-25 for the Prism class.

A note on the constructor: You may have seen that we are using a default constructor here, i.e. we set the constructor equal to default. This is technically not required, we can construct a class without any constructors or destructors, in which case the default constructor will be automatically added by the compiler for us (i.e. if you remove lines 6 and 22, the compiler will add them back for us, as we need to have some mechanism to construct a class, even if the constructor is not doing anything). As soon as we do provide a constructor, this line is not added automatically, and only our provided constructor will be used.

We can add as many constructors as we want, to construct an object on different inputs. As an example, a GeometricLine class may have two constructors (or more). The first one constructs a line based on two points, and the second one is based on an origin, a direction, and a distance. Suffice it to say that I only added the constructor here for clarity; we don't provide an implementation but probably want to set up the state here of the class, e.g. call the private functions on lines 12-13 and 28-29 in a real working example.

Do you notice how code duplication is creeping in? For example, both classes have a VectorType and a function called calculateCentroid. We don't show here how this function is actually called, but when a user wants to get the centroid information for either a triangle or a prism element, the code would be the same, i.e.

On lines 5-6, we call the getCentroid() function for both the triangle and prism element, and we can see that the caller to the function doesn't really see a difference in the way that they interact with the class. Similar arguments can be had about the area and volume function, though in order to have a uniform naming scheme, it would be better to name them both getVolume (and then say that two-dimensional elements are per unit length) or to use a different name such as getSize, though that may not be the best choice as size is typically related to the length of a data structure, at least in the standard template library, which we will look at later.

Using inheritance to clean up our code

When we use similar classes to represent the same overarching type (here a computational cell type in the previous example), where each class implements a specific version of that overarching type (here both the triangle and prism class), we want to use inheritance. Inheritance allows us to define a base type, such as a computational cell class in our example, and then derive different classes from it that have the same functions and variables but use different internal implementations (e.g. we calculate a centroid differently for a prism element compared to a triangle element).

The importance of inheritance is that it allows us to define an interface. An interface is just an empty class with all functions and variables declared but where the functions are not yet implemented. C++ allows us to provide a default implementation, but that can be later overwritten. In this case, you wouldn't call it technically an interface anymore, but that is a different discussion. When we now derive a subclass from this interface (or base class), it inherits all functions and variables and it can provide its own implementation for each function.

Sometimes it is necessary to allow for multiple levels of inheritance, i.e. where you have a hierarchy of classes deriving from another class. For example, when we write a linear system of equation solver library and we want to expose a Matrix class, then we may make this Matrix class the overarching base class, from which we derive two subclasses; DenseMatrix and SparseMatrix, i.e. one which stores all elements (dense) and one which only stores non-zero elements (sparse). Each version of the dense and sparse matrix then may have further sub-divisions into DynamicMatrix and FixedMatrix classes, where we either allow for small matrices to be allocated on the stack (fixed) or for larger matrices to be allocated on the heap (dynamic). See also Memory Management in C++ for a refresher on the stack and heap.

As a real-world example, Eigen3 is a popular header-only C++ library for anything matrix-vector multiplication-related (including solving linear systems). The linked page shows that their inheritance hierarchy for the matrix class contains 6 levels. If you look at other libraries, having several levels of inheritance hierarchy is very common.

A base class implementation for the Computational Cell class

Let's return to our example again and create an interface for our triangle and prism class, which is shown in the following:

This class now contains all type declarations, i.e. in this case, only the VectorType, as well as all other functions we saw before. Notice also that we have to provide a (virtual) default destructor on line 8, otherwise, we run into issues with memory leaks and, worst case, undefined behaviour when deleting a derived class under certain circumstances (which we don't really have to worry about if we provide this destructor call here). We will discuss the virtual keyword in just a second. Notice on line 13 that we have changed the access modifier from private to protected. Protected access means we are granting any derived class access to this area, while all other non-derived classes are not able to access this. In other words, protected simply switches between public for derived classes who want to access this area and private for anyone else.

On lines 14-15, we defined our two functions calculateCentroid() and calculateVolume(), which are marked as virtual. This is all that we have to do so that derived classes can modify these functions. At the moment, we don't provide an implementation, indicated by the empty brackets, i.e. { }, but we could have provided some initial implementation which may then be overridden by a derived class. This may be useful if we wanted to have a fall-back implementation, or default implementation, which we always want to use if nothing else is specified. For example, we may implement a FileIO class, which provides capabilities to read and write data. By default, we may want to open files in binary mode to save storage requirements, but we may also provide a derived class FileAsciiIO, which provides file reading and writing in Ascii format, or, in other words, plain text files. A post-processing class may then derive from either class and write out solutions in either binary or ASCII format.

A strict interface definition: Using pure virtual functions

In most cases, though, having a fall-back option is clumsy and doesn't really promote clean code design. Also, in the example given above, we could create an object of type ComputationalCell and try to call the getCentroid() function, which would work, but we can see from the code above that the function for calculating the centroid is empty, and the compiler is not stopping us from calling a function within a class for which there is no implementation. In most cases, we don't want to create a variable of type ComputationalCell but rather of any of its derived types. If this is what we want to achieve, we can promote our current virtual functions to pure virtual functions. To do so, we simply add = 0 at the end of the function call, i.e. we would replace lines 14-15 in the above code example with:

This indicates to the compiler that there are no implementations allowed for either function in the base class. This also means that the two calls on lines 5-6 in the base class now have to move into the derived classes, otherwise we would get a compiler error. You should always ask yourself if there is a logical fallback option you can implement, if you can't, then use pure virtual functions. They offer the additional advantage that if you forget to implement any function which is defined as pure virtual, i.e which is marked at the end with = 0, you get a compiler error stating that you need to have an implementation available, otherwise the code can't be compiled. So this forces us to implement the interface that our base class has defined. With this in mind, let's have a look at our Triangle and Prism class again. They have now become:

Notice how on line 1 and 11 we declare our classes as usual, but then follow this with : public ComputationalCell, indicating that we want to inherit everything from the ComputationalCell class. The leading public keyword is required, but you can safely ignore its meaning for now; you can specify either public, protected, or private here which will have an influence on how the access modifiers will be treated, using public here just indicates that we don't want to change them in the base class from which we are deriving (here the ComputationalCell class, which is virtually always the case).

The VectorType information now is also not duplicated, but instead, we use the one defined in the ComputationalCell class itself, i.e. see lines 3 and 13. The constructor calls on line 4 and 14 provide an additional : ComputationalCell() suffix, which means that the constructor to the ComputationalCell class will be called before the constructor for either the Triangle or Prism class is executed. This is useful if we want to set up memory (i.e. allocate some memory), this can all be done in the base class, and we only need to implement that once in the base class constructor and then just call it from the derived class. The constructors of the derived classes no longer call the virtual functions. This may seem a bit convoluted. Inside the Triangle and Prism class, we do provide the implementation for the calculateCentroid() and calculateVolume() class, but we do not call it in these classes explicitly. Instead, by calling the constructor of the base class, i.e. ComputationalCell, which in turn is calling the aforementioned functions, the C++ runtime environment will check which type the computational cell is, i.e. either it is a triangle or prism cell, and then call to corresponding implementations of these functions within either the Triangle or Prism class.

On lines 7-8 and 16-17, we provide specific implementations for how to calculate the centroid and volume for a triangle and prism element, respectively. Well, I don't show any implementation here, but this is where we would provide it for each element. With this new class hierarchy, we haven't actually changed anything in the code where we called the classes, i.e. the following code:

will still compile and work, and now correctly call the respective implementation/function for the triangle's or prism's centroid calculation. If you want to verify that, put a print statement into each function, and you will see that the function we call on line 5 and 6 are calling the respective implementations in the Triangle and Prism class.

Performance considerations

Are there any downsides to inheritance? Well, if you look into any object-orientated program, you'll find that inheritance is the bread and butter of modern software design, and you'll see this being used everywhere, so it must be good, right? Well, there is a performance penalty to pay. The reason is that virtual function calls are resolved at runtime, meaning that when you are executing your code, it needs to be checked which function you actually want to call (i.e. from which specific derived class, as there are many possible classes available, potentially). This will incur a slight overhead, but it is not something you really need to worry about too much in most cases. However, there is a situation where this can have a noticeable performance overhead. Consider the following pseudocode:

I haven't provided the class definition for FirstOrderUpwind here, but suffice it to say, in a real CFD solver, you would likely have an overarching Interpolation class, and then derived classes such as FirstOrderUpwind, MUSCL, WENO, and so on. In the above code, we are using the interpolate() function, which we assume in this case to be a virtual function, i.e. it is declared either virtual or pure virtual in the Interpolation class, and then the FirstOrderUpwind class implements the specific logic. Now we are using this function within a loop, which loops over all the (internal) faces in the mesh and then calculates the interpolated value, of say, the velocity, at the face. We do this one million times, as the number of faces is set to 1e6 = 1 000 000, and likely have to repeat this step for each time step or iteration. Now the aforementioned small runtime overhead of determining which function has to be called becomes significant and, since the function calls are determined at runtime, this type of inheritance is called dynamic polymorphism. Luckily for us, there is also static polymorphism and this is something we will explore in the next section, among other things.

Summary

Let's recap. We use inheritance to define interfaces, predominantly, which describe a certain set of functions that each derived class has to implement. A base class can contain all the variables required, and the derived classes can access these variables, so long as the access modifier is set to protected. Whenever there is a logical way to group several classes together under a base class, we should use inheritance. If, however, we call these functions frequently, which are exposed by the interface, then we should consider static polymorphism, described in the next section.

Tom-Robin Teschner is a senior lecturer in computational fluid dynamics and course director for the MSc in computational fluid dynamics and the MSc in aerospace computational engineering at Cranfield University.

How to handle inheritance and class hierarchies in C++