This article looks into smart pointers and why we should use them over raw pointers in C++. Smart pointers are fundamental building blocks in C++, and after 2014, every C++ programmer should be using smart pointers instead of raw pointers; there are no excuses for raw pointers anymore. Throughout this article, we explore how we would use smart pointers in real CFD applications, where we look at creating a solver class based on either the SIMPLE or PISO algorithm. We explore how smart pointers can be used in conjunction with the factory design pattern to create clean code and also touch upon two important C++ concepts, namely copy elision and return value optimisation, to help us with memory management.
In this series
- Part 1: Choosing the right programming language for CFD development
- Part 2: Why you should use C++ for CFD development
- Part 3: The complete guide to memory management in C++ for CFD
- Part 4: Object-orientated programming in CFD
- Part 5: How to handle inheritance and class hierarchies in C++
- Part 6: Templates in C++: Boost your CFD solver performance
- Part 7: Enhance readability with operator overloading in C++
- Part 8: The power of the standard template library (STL) in C++
- Part 9: Understanding Lambda expressions and how to use them in C++
- Part 10: Reduce memory bugs with smart pointers in C++
Life before smart pointers: Problems with raw pointers
To understand smart pointers, we first need to revisit raw pointers, or classical C-style pointers. We looked at them briefly in the Understanding memory in C++ article, but we are going to extend that discussion. Let us start with a simple example:
#include <iostream>

int main()
{
  int *pointer = new int[1];
  pointer[0] = 42;
  std::cout << pointer[0] << std::endl;

  delete [] pointer;
  return 0;
}
Nothing interesting is happening here, really. We first create a pointer to exactly one int, so somewhere in memory, typically 4 bytes are allocated for us. We then set the first entry (we only have one) to 42 and verify that it has worked by printing it. Finally, we deallocate the memory; we always need to call a matching delete for every new we call.
In C, we typically use pointers to help us with one- or multi-dimensional arrays, e.g.:
#include <iostream>

int main()
{
  int nx = 10; int ny = 10;

  // velocity component in the x direction
  double **velocityX = new double*[nx];
  for (int i = 0; i < nx; ++i) {
    velocityX[i] = new double[ny];
    for (int j = 0; j < ny; ++j) {
      velocityX[i][j] = 0.0;
    }
  }

  // done with work on velocityX
  for (int i = 0; i < nx; ++i) {
    delete [] velocityX[i];
  }
  delete [] velocityX;

  return 0;
}
Here, we define a two-dimensional array velocityX, which represents the velocity component in the x-direction. We can see that it is two-dimensional by the two leading asterisks, i.e. **. We then loop over all entries in the x-direction, allocating memory for each row on the way, and, within that loop, loop over the y-direction to assign a zero velocity field everywhere. Once we are done, we have to go through the same outer loop again to deallocate memory, calling delete exactly as many times as we called new.
Moving to C++, hopefully, by now you look at this and think that this looks ugly and should be better handled with data structures, for example a std::vector, and you would be right. C++ comes with the standard template library, which provides us, out of the box, with a comprehensive list of useful data structures and algorithms, which we looked at in our earlier article on the standard template library. If we were to use a std::vector here, we would have no more calls to new and delete; these are hidden within its implementation for us. The reason we have to use the above construct in C (with malloc and free taking the place of new and delete) is that we do not have anything better available. We can’t write our own custom classes to hide memory allocation in the constructor and destructor, as classes don’t exist in C. In C++, we should never use raw pointers like this.
But sometimes, we still need to use pointers, and the reason is subtle, but worth exploring. All the way back in the Understanding memory in C++ article, we looked at how to allocate memory on the stack and heap. We said that variables declared as non-pointers live on the stack, whereas objects created with the keyword new are allocated on the heap. There is a difference in behaviour between the two. Variables on the stack are automatically destroyed when they go out of scope, whereas heap-allocated objects are only destroyed when we call the delete operator.
Setting limits for our variables: Scoping
What is a scope then? A scope is bound by the {} braces. So, for example, everything inside a function is enclosed by {} braces. If we define a variable within this function, once we return from it, the variable will no longer be available to us if it was allocated on the stack. But we can also randomly throw around braces in code to limit the scope, such as:
#include <iostream>

int main()
{
  int a = 42;
  {
    int b = 27;
  }

  std::cout << a << std::endl;
  std::cout << b << std::endl; // error: b is not declared in this scope

  return 0;
}
This code will produce an error because we defined b within a block of {} braces, so when we try to print b after the closing brace, the compiler says that it doesn’t know what we mean; there is no variable b defined. Let’s have a quick look at OpenFOAM, where braces and scopes are thrown around quite frequently, and it bamboozles me why it is done. The following example is taken from the SIMPLE algorithm implementation of OpenFOAM (file: simpleFoam.C, just an excerpt):
while (simple.loop())
{
    Info<< "Time = " << runTime.timeName() << nl << endl;

    // --- Pressure-velocity SIMPLE corrector
    {
        #include "UEqn.H"
        #include "pEqn.H"
    }

    laminarTransport.correct();
    turbulence->correct();

    runTime.write();

    runTime.printExecutionTime(Info);
}
We can see a scope being generated around the two #include statements. Worse still, if you go into the pEqn.H file, you’ll see that it is also scoped, i.e. its first and last lines are { and }, but the UEqn.H file isn’t. I have looked at this code several times now and I still cannot see a programming reason why this is necessary. I assume this is done for style or looks, not for practical reasons, and this is one of those instances where I really don’t understand why OpenFOAM has taken the decision to write their code in this particular way. There are other examples, and I’m sure I’ll unearth some more of them in the future.
After looking at so many bad examples of scoping, let’s finally look at something more realistic and discuss why scoping can become a problem. A more realistic scenario that you may encounter in a CFD solver is shown below:
#include <iostream>
#include <string>

class Solver {
public:
  // virtual destructor, so derived classes are destroyed correctly through a Solver pointer
  virtual ~Solver() = default;
  virtual void solve() = 0;
};

class SIMPLE : public Solver {
public:
  virtual void solve() final { /* ...*/ }
};

class PISO : public Solver {
public:
  virtual void solve() final { /* ...*/ }
};

int main()
{
  std::string name("SIMPLE");

  if (name == "SIMPLE") {
    SIMPLE solver;
  } else if (name == "PISO") {
    PISO solver;
  }
  solver.solve(); // error: solver is not declared in this scope

  return 0;
}
Here, we declare an abstract base class Solver, which defines the pure virtual function solve(). We then go on to derive two classes from this, the SIMPLE and PISO class, which would each implement a way to solve for velocity and pressure within their solve() function.
If we now want to create a solver based on a name that we, for example, read from a parameter file (simulated here by the hard-coded name variable), then we are in trouble if we want to do this with the if/else construct shown in main(). You may have noticed that we are trying to define a variable called solver to be either of type SIMPLE or PISO based on the name variable, but we are trying to allocate the solver variable on the stack. Hence, once we reach the closing } brace of each branch, the solver variable declared inside it is destroyed. Consequently, the code above will not compile; the compiler will complain about the call to solver.solve(), as solver is no longer defined at that point.
What do we do? Of course, pointers to the rescue; we allocate on the heap, where the lifetime of an object is not tied to a scope. Remember that objects allocated on the heap are not destroyed when a scope ends, and so we have to explicitly call delete when we want to get rid of them. Focusing just on the main function, the new implementation would look something like:
int main()
{
  std::string name("SIMPLE");

  Solver *solver = nullptr;
  if (name == "SIMPLE") {
    solver = new SIMPLE();
  } else if (name == "PISO") {
    solver = new PISO();
  }

  solver->solve();

  // clean up after yourself; a single object created with new is released
  // with delete (delete [] is only for arrays created with new [])
  delete solver;
  solver = nullptr;

  return 0;
}
Now we are declaring a pointer of type Solver*, which we initially set to a null pointer (good practice, always initialise your variables with a default value). Then, within the if/else construct, we allocate memory for an object of our custom type SIMPLE or PISO (remember, a class is nothing else than a type in C++, and classes allow us to extend C++ with our own types). Since we allocate them on the heap through the new operator, they will persist beyond the if/else scope and the call to the solve() method will now work, though we had to change from the dot (.) to the arrow (->) operator in order to call the function, which is just the way we have to interact with pointers.
As we have discussed further above, we never want to handle calls to new and delete ourselves, unless we know what we are doing. This implies that you understand how to write custom memory allocators and how to pass them to data structures such as std::vector as template arguments. If this feels complicated, then you are probably not ready to deal with new and delete just yet, but the good news is you don’t have to. If you want to explore what it takes to write your own memory allocator for your own data structure, then I would recommend the talk by Bob Steagall on this.
Why is calling new and delete bad?
We are now in a dilemma; we can see the usefulness of pointers in the above example (and we will return to it, as this is an important design pattern), but we do not want to use new and delete ourselves. You may ask yourself at this point why new and delete are so bad. I’d answer that avoiding them is simply good practice: every new needs exactly one matching delete on every path through the code, and forgetting one (or calling it twice) leads to memory leaks or undefined behaviour. If you look around, a lot of companies where bugs could lead to catastrophic events specifically discourage (or forbid) the use of new and delete. Think of airborne applications, such as space shuttles and aircraft. A bug here, with potentially undefined behaviour, would lead to a catastrophic outcome (and has, in the past).
If you are unaware, the Ariane 5 rocket exploded just after launch due to a conversion issue from a 64-bit floating-point number to a 16-bit signed integer. The issue was that the conversion happened with a value that was too large to be represented by a 16-bit signed integer, and so an overflow occurred, bringing down the rocket. Aerospace and astronautical companies are known for their distrust of the programmers’ abilities to handle memory correctly (probably for good reasons!) and usually put guidelines in place that prohibit code that could easily lead to memory bugs. Take a look at the Joint Strike Fighter Air Vehicle C++ Coding Standards by Lockheed Martin, for example, which is accessible to the public. On page 59, they state:
4.26 Memory Allocation
AV Rule 206 (MISRA Rule 118, Revised)
Allocation/deallocation from/to the free store (heap) shall not occur after initialization. Note that the “placement” operator new(), although not technically dynamic memory, may only be used in low-level memory management routines. See AV Rule 70.1 for object lifetime issues associated with placement operator new().
Rationale: repeated allocation (new/malloc) and deallocation (delete/free) from the free store/heap can result in free store/heap fragmentation and hence non-deterministic delays in free store/heap access. See Alloc.doc for alternatives.
Joint Strike Fighter Air Vehicle C++ Coding Standards by Lockheed Martin
I’d say their language is rather permissive: they say that allocation shall not occur after initialisation, so you can still allocate memory up front, and placement new remains available for low-level memory management routines, such as the custom memory allocators discussed above. You still can’t throw around new and delete statements anywhere in the code. It is interesting to read that they state memory fragmentation as the reason, i.e. fragmented memory can lead to non-deterministic delays when accessing the heap, which probably makes sense. So in their case, you are not allowed to use new and delete for performance reasons, not for security reasons. However, there are other examples where guidelines restrict the usage of certain features to ensure memory bugs are kept to a minimum.
Okay, so we don’t want to deal with memory allocation and deallocation ourselves. Using a data structure (container) from the standard template library will cover most use cases, but what about cases like the one we encountered above with our SIMPLE and PISO class example? Well, we are finally primed and ready for smart pointers in C++!
Smart pointers to the rescue
Smart pointers are used just like raw pointers, but they are implemented as classes and so have a few more things in store for us. And, to complicate things further, we have three different types, of which you only really need to understand and use two (I have never used the third one, nor seen it anywhere in code). The three different types are unique, shared, and weak pointers, i.e. std::unique_ptr, std::shared_ptr, and std::weak_ptr.
The difference between these smart pointers is ownership. A unique pointer can only be owned by a single variable, i.e. you can’t create a second variable that points to the same memory address by assigning the unique pointer to the second variable (ownership can only be transferred, never duplicated). A shared pointer, then, is the opposite; you can have many variables (pointers) pointing to the same memory, which makes it very easy to share resources (e.g. keep a copy of that pointer in several classes). A shared pointer counts how many variables are currently pointing to the same memory address, and, once the last of them is destroyed, the memory it points to is released.
Whenever you deal with multiple owners, you may run into issues with circular references, e.g. pointer A is pointing to B, B is pointing to C, and C is pointing back at A again. In this case, the reference count will never drop to zero, and the memory will subsequently never be deallocated, causing memory leaks. For this reason, we have weak pointers, which can observe the memory managed by shared pointers but do not increase the reference count. That also means that if we want to use a weak pointer, we need to check that the memory it refers to still exists, but as I have mentioned above, weak pointers are rather weakly used (couldn’t resist the pun).
This all sounds wonderfully abstract, so a few examples are in order to make things a bit clearer.
#include <iostream>
#include <memory>

int main()
{
  std::unique_ptr<double> unique_ptr1 = std::make_unique<double>(3.14);
  // std::unique_ptr<double> unique_ptr2 = unique_ptr1;
  std::cout << *unique_ptr1 << std::endl; // prints 3.14

  std::shared_ptr<double> shared_ptr1 = std::make_shared<double>(2.71);
  std::cout << *shared_ptr1 << std::endl; // prints 2.71
  std::cout << shared_ptr1.use_count() << std::endl; // prints 1

  std::shared_ptr<double> shared_ptr2 = shared_ptr1;
  std::cout << shared_ptr1.use_count() << std::endl; // prints 2

  std::weak_ptr<double> weak_ptr1 = shared_ptr1;
  std::cout << shared_ptr1.use_count() << std::endl; // prints 2

  if (auto lockedShared = weak_ptr1.lock()) {
    std::cout << *lockedShared << std::endl; // prints 2.71
  }

  return 0;
}
In order to use smart pointers, we need to include the <memory> header. The first three statements in main() deal with unique pointers. We first declare a unique pointer to a double, and instead of calling new on the right-hand side of the assignment, we call std::make_unique, which handles memory allocation for us. The second statement is commented out; if you were to remove the comment, the code would not compile anymore, as we can’t copy a unique pointer into another one.
The next block deals with shared pointers and illustrates the concept of ownership mentioned previously. We declare a shared pointer in a similar manner to unique pointers, in that we call std::make_shared, which handles memory allocation for us. When we dereference the pointer, we get the value printed to the screen that we passed to std::make_shared. We also call the member function use_count(), which returns the number of owners of that pointer. At the point of creation, there is only one, so this will print 1 to the screen. We then create a second shared pointer, shared_ptr2, and set it equal to shared_ptr1; when we now call use_count() again, we get 2 instead of 1 printed, as we just added another owner. We are not printing the value of shared_ptr2 here, but dereferencing it would return the same value as before, i.e. 2.71.
Finally, we deal with weak pointers. We initialise weak_ptr1 from shared_ptr1. It refers to the same value as shared_ptr1 and shared_ptr2, but it will not increase the reference count; i.e. calling use_count() on shared_ptr1 will still return 2. Since a weak pointer does not own the memory, the memory could be released before we want to use it, so we need to check that it still exists first. This is what the if statement does: lock() returns a shared pointer, which is empty (and evaluates to false) if the memory has already been released. Otherwise, the if block is executed, and we access the content through the newly created variable lockedShared, which is a fully fledged shared pointer and can be dereferenced to get the value stored by the original shared pointer.
Creating a SIMPLE or PISO solver the right way with smart pointers
With that out of the way, let us return to our SIMPLE and PISO class example, but this time use a unique pointer instead of a raw pointer:
#include <iostream>
#include <string>
#include <memory>

class Solver {
public:
  // virtual destructor, so derived classes are destroyed correctly through a Solver pointer
  virtual ~Solver() = default;
  virtual void solve() = 0;
};

class SIMPLE : public Solver {
public:
  virtual void solve() final { /* ...*/ }
};

class PISO : public Solver {
public:
  virtual void solve() final { /* ...*/ }
};

int main()
{
  std::string name("SIMPLE");

  std::unique_ptr<Solver> solver = nullptr;
  if (name == "SIMPLE") {
    solver = std::make_unique<SIMPLE>();
  } else if (name == "PISO") {
    solver = std::make_unique<PISO>();
  }

  solver.get()->solve();

  return 0;
}
Remember to #include <memory>, as done at the top of the file, so we can access smart pointers. We are now declaring a unique pointer to Solver in main(), and initially, we set it to nullptr. Then we check again which solver we want to use, i.e., SIMPLE or PISO, and we make calls to std::make_unique accordingly. We pass the class we want to use, e.g., SIMPLE or PISO, as the template parameter and call a constructor without any arguments. If our SIMPLE or PISO class expected any constructor arguments, we would need to pass them within the () brackets of the corresponding std::make_unique call.
We can then use our smart pointer as shown just before the return statement, i.e., we first call the get() function on the smart pointer, which will return the raw pointer for us. (Smart pointers also overload the -> operator, so solver->solve() would work just as well and is the more common way to write this.) Then, we can call functions that are defined in either the SIMPLE or PISO class. Since we declared the solve() function as a pure virtual function in the abstract base class Solver, we are guaranteed that any class deriving from this class will have a corresponding implementation of the solve() function, so this call will succeed regardless of which class we are calling, provided the name matched one of the branches and the pointer is not still a nullptr.
Introducing the factory pattern with smart pointers
Let’s tidy up this example by putting the creation of the solver into its own function. We’ll skip the header include statements, as well as the class definitions for the Solver, SIMPLE, and PISO class, but these would still need to be present if you want to compile the following example:
std::unique_ptr<Solver> makeSolver(std::string name) {
  std::unique_ptr<Solver> solver = nullptr;
  if (name == "SIMPLE") {
    solver = std::make_unique<SIMPLE>();
  } else if (name == "PISO") {
    solver = std::make_unique<PISO>();
  }
  return solver;
}

int main()
{
  std::string name("SIMPLE");

  auto solver = makeSolver(name);
  solver.get()->solve();

  return 0;
}
There are a few interesting things happening here, which are worth exploring. We define a function makeSolver(), which expects one argument of type std::string that tells us which solver to create. We then go through the same logic as before. However, at the end of the function, we return the unique pointer, and we declared the return type as std::unique_ptr<Solver>, i.e. we return by value, not by reference. You may not see the importance of this, but we stated before that unique pointers can’t be copied, yet returning by value looks like a copy, and the worst (or best?) part about it is that it works. The compiler doesn’t complain. The reason is that no copy is actually made: whenever we return a local variable from a function, the compiler treats it as a temporary that may be moved rather than copied, and a unique pointer can be moved (its ownership is transferred), just not copied.
In many cases, the compiler goes one step further and constructs the returned object directly in the memory of the variable that receives it in the calling function, so that not even a move is required. This is known as copy elision. Something similar happens when you use a data structure like std::vector. You may allocate a bunch of memory on the heap, and instead of copying that data around when you just want to return from a function, your compiler will construct the vector in place (or, failing that, move it) and leave the heap memory where it is. When applied to return values, copy elision is known as return value optimisation, or RVO for short, so the two are closely connected. You may come across either term in the literature; suffice it to say that these are mechanisms your compiler employs to make your code more performant.
We can prevent the compiler from being clever by declaring the variable as volatile. Both copy elision and the implicit move on return only apply to non-volatile local variables. So, if you change the declaration inside makeSolver() to read volatile std::unique_ptr<Solver> solver = nullptr;, then your compiler falls back to a genuine copy for the return value, and you will be hit with a myriad of error messages, many of them essentially saying that you can’t copy a unique pointer (the assignments inside the if/else construct will be rejected as well, since a volatile unique pointer can’t be assigned to). Give it a try and enjoy the onslaught of error messages; this is the best way to get intimately familiar with your compiler of choice!
To finish off the example, in main(), we call the function makeSolver() and pass the std::string as the argument. Now we can add as many different solvers to the makeSolver() function as we like, but our code in the main() function would not change. This is an important programming design pattern and is known as the factory pattern. Here, the makeSolver() function acts as our factory, which produces different versions of Solver instances based on some input (here, simply the name of the solver). In real code, you may see function names containing the word factory itself, e.g. we could have also called our function solverFactory().
Factories are very useful for adding functionality to your code dynamically. They are part of a much wider range of design patterns, which eventually we have to learn if we really want to write clean code. We’ll leave that to another day, as there are over 20 classic design patterns to choose from. We don’t necessarily use all of them at the same time, but knowing them and then knowing when to use them will improve your code structure significantly.
We already saw how we can simply create different instances of pressure-velocity coupling algorithms using a factory pattern; other places where you may want to use it are listed below to give you some ideas:
- Turbulence models
- Transport models for non-Newtonian fluids
- Discretisation schemes in space
- Time-integration schemes
You get the idea; for each of these types, you may want to have several options available and then use a factory pattern in conjunction with a smart pointer to generate the variables.
Not-so-smart smart pointers in OpenFOAM
We don’t have to look far for an example in real life; OpenFOAM makes use of exactly this principle, albeit in a slightly more convoluted way. Whenever you create a new solver, there is a file called createFields.H, which essentially constructs all of your vector and scalar fields, such as velocity and pressure. It also handles the turbulence model setup if the particular solver supports turbulence modelling. Many solvers support turbulence modelling, like the simpleFoam solver, and if we look at its createFields.H file, we’ll see code like the following at the bottom of the file:
autoPtr<incompressible::turbulenceModel> turbulence
(
    incompressible::turbulenceModel::New(U, phi, laminarTransport)
);
Here, we use an autoPtr, which is OpenFOAM’s own smart pointer class. It is similar in spirit to std::auto_ptr, which was the first attempt of C++ to introduce smart pointers, and to what we now have as a unique pointer. There were some issues with std::auto_ptr, and so it was deprecated with the C++11 standard, which was released in 2011, and later removed in C++17. So there is no need to understand auto pointers in detail; the reason OpenFOAM uses its own class is that it was created way before 2011, when std::unique_ptr and std::shared_ptr were not yet available. I guess replacing autoPtr with its standard counterpart has not made it to the top of the priority list yet.
Returning to the code shown above, we essentially create a smart pointer and pass the result of the New function to its constructor. This function will figure out which turbulence model should be used (similar to the above example with if/else statements, but this is the part where things get a bit more complicated, so we won’t dig much deeper into the code here). But similar to the factory pattern that we introduced above, we can now add as many turbulence models as we want and then use them later in the code. The code shown above would not change in terms of creating a turbulence model.
Summary
This concludes our discussion on smart pointers. They are essentially raw pointers wrapped in a class, with additional syntactic sugar to make our life a bit easier. They also handle memory allocation and deallocation for us, so we should be able to reduce memory bugs simply by switching from raw to smart pointers. Hopefully, you will agree that smart pointers are incredibly useful and that you will make an effort to use them from now on if you are not doing so already.
In this article, we also touched upon the concepts of copy elision and return value optimisation (RVO), which are not only relevant when using smart pointers, but in general when we are dealing with memory and want to avoid unnecessary copies of large amounts of data. Furthermore, we looked at the factory design pattern and how it can be used, together with smart pointers, to promote clean code design.
If you are already using raw pointers (are you a C programmer, pretending to code in C++ by using a C++ compiler on C code?), then switching to smart pointers should be straightforward. The syntax to create them is slightly different, but then you get all the benefits of having someone else looking after your memory allocation and deallocation. Use smart pointers, and thank me later!