Part of the Design Patterns series:


Introduction

Performance matters so much in C++ programming that the overhead of virtual functions is sometimes too high to accept. But without virtual functions, we lose much of the flexibility and generic implementation that polymorphic programming gives us. The curiously recurring template pattern (CRTP) offers a solution: it allows polymorphic implementations in our design patterns without the runtime cost of the indirection that virtual functions incur.

CRTP

The CRTP still uses a base class as an abstraction, but instead of establishing a runtime relationship between the base class and its derived class, it establishes this relationship at compile time using templates.

C++
#include <iostream>

// Base class template
template<typename Derived>
class Car {
public:
    void start() {
        // Call the startEngine method of the Derived class
        static_cast<Derived*>(this)->startEngine();
    }
    int size() const {
        // Cast to const Derived* because this member function is const,
        // and return the value reported by the Derived class
        return static_cast<const Derived*>(this)->size();
    }
};

// Derived class
class SportsCar : public Car<SportsCar> {
public:
    void startEngine() {
        std::cout << "SportsCar engine started with a roar!" << std::endl;
    }
    int size() const {
        return 100;
    }
};

// Another derived class
class FamilyCar : public Car<FamilyCar> {
public:
    void startEngine() {
        std::cout << "FamilyCar engine started quietly." << std::endl;
    }
    int size() const {
        return 50;
    }
};

int main() {
    SportsCar mySportsCar;
    FamilyCar myFamilyCar;

    std::cout << "Starting SportsCar:" << std::endl;
    mySportsCar.start();

    std::cout << "Starting FamilyCar:" << std::endl;
    myFamilyCar.start();

    return 0;
}

It took some time for me to wrap my head around this. Our derived class inherits from the base class while passing itself as the template argument. Then, within the base class interface, we static_cast the this pointer to a pointer to the template argument and call the function on the derived class. In the example above, even though SportsCar does not define a start() member function, it supports the call as part of its interface because it inherits from the Car base class. And because the Car base class knows the derived type through its template argument, the startEngine function of the SportsCar class gets called.

The CRTP allows us to build a common interface via the base class, as well as a default implementation that reduces duplicated code.

C++
#include <iostream>

template <class Derived>
class Base {
public:
    void interface() {
        // Static polymorphism used here
        static_cast<Derived*>(this)->implementation();
    }

    void implementation() {
        // Default implementation (if any)
    }
};

class Derived : public Base<Derived> {
public:
    void implementation() {
        // Custom implementation for Derived
        std::cout << "Derived implementation" << std::endl;
    }
};

// Does not define implementation(), so the base class default is used
class Derived2: public Base<Derived2> {
};

int main() {
    Derived d;
    d.interface();  // Calls Derived::implementation()

    Derived2 d2;
    d2.interface(); // Calls Base::implementation()
}

Here, both d and d2 have a common interface, which is nice. Additionally, if we have a bunch of derived classes that all share the same implementation details for the implementation() function, we can default to the base class’s version without duplicating code. Only when a derived class needs a unique implementation do we have to write one.

You can generally expect a CRTP implementation to be at least as fast as a similar implementation using dynamic polymorphism, and often noticeably faster in hot paths. This is because:

  1. Function Call Overhead: A virtual call normally goes through a virtual table (vtable) lookup, unless the compiler can prove the dynamic type and devirtualize the call. This adds a small runtime overhead and can hinder certain compiler optimizations. With CRTP, function calls are resolved at compile time, eliminating the vtable lookup and potentially enabling function inlining, which can significantly improve performance.

  2. Memory Layout and Access: Objects using dynamic polymorphism typically have an extra pointer per object (to the vtable), which increases the size of each object. This extra memory can affect cache performance, particularly if many objects are instantiated. CRTP does not require this additional pointer, which can lead to better data locality and cache usage.

  3. Compiler Optimizations: Since CRTP enables more behavior to be resolved at compile time, it opens up more opportunities for compiler optimizations like inlining, loop unrolling, and constant folding. These optimizations are often not possible with virtual functions because the exact function to be called is usually not known until runtime.

  4. Predictability and Branch Prediction: A virtual call is an indirect branch whose target the CPU has to predict. A statically resolved call is a direct call or, once inlined, no call at all, so there are fewer branches to mispredict. This can lead to smoother and faster execution paths through your code.
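The memory layout point can be checked directly at compile time. The following is a minimal sketch, not a benchmark; the names VirtualCounter, CounterBase, and CrtpCounter are hypothetical and exist only for this illustration. It assumes a typical implementation that stores one vtable pointer in each polymorphic object.

```cpp
#include <type_traits>

// Dynamic polymorphism: the virtual functions make the class polymorphic,
// so typical implementations store a hidden vtable pointer in every object.
struct VirtualCounter {
    virtual ~VirtualCounter() = default;
    virtual int next() { return ++value; }
    int value = 0;
};

// Static polymorphism: the CRTP base has no data members and no virtual functions.
template <typename Derived>
struct CounterBase {
    int advance() { return static_cast<Derived*>(this)->next(); }
};

struct CrtpCounter : CounterBase<CrtpCounter> {
    int next() { return ++value; }
    int value = 0;
};

// Only the virtual version is a polymorphic type.
static_assert(std::is_polymorphic_v<VirtualCounter>);
static_assert(!std::is_polymorphic_v<CrtpCounter>);

// The CRTP object carries only its own data; the virtual one is larger.
static_assert(sizeof(CrtpCounter) == sizeof(int));
static_assert(sizeof(VirtualCounter) > sizeof(CrtpCounter));
```

Calling advance() on a CrtpCounter reaches CrtpCounter::next() through a call the compiler can see through and inline, while VirtualCounter::next() is normally dispatched through the vtable.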

The major benefit of any type of polymorphism is that we can work with objects even though we do not know their specific type. This is possible with static polymorphism via templates:

C++
template<typename Derived>
float fairMarketPrice(const Car<Derived>& car) {
    // Take the car by reference: copying a Car<Derived> by value would slice off the derived part
    int car_size = car.size();
    /// the rest of the implementation
}

Here our function is written without knowing which specific car will be passed to it; the compiler deduces the Derived type at each call site. This gives us the polymorphism that we want.
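To make the idea concrete, here is a small self-contained sketch of such a function. The original leaves the pricing logic unspecified, so the linear rule and the kPricePerSizeUnit constant below are assumptions made up for this example, and the Car classes are trimmed down to the size() member only.

```cpp
// Trimmed-down CRTP base: only the size() interface is kept for this sketch.
template <typename Derived>
class Car {
public:
    int size() const {
        return static_cast<const Derived*>(this)->size();
    }
};

class SportsCar : public Car<SportsCar> {
public:
    int size() const { return 100; }
};

class FamilyCar : public Car<FamilyCar> {
public:
    int size() const { return 50; }
};

// Works with any Car<Derived>; Derived is deduced from the argument.
// Hypothetical pricing rule (an assumption): price grows linearly with size.
template <typename Derived>
float fairMarketPrice(const Car<Derived>& car) {
    constexpr float kPricePerSizeUnit = 250.0f;  // assumed value, for illustration only
    return static_cast<float>(car.size()) * kPricePerSizeUnit;
}
```

Calling fairMarketPrice(SportsCar{}) instantiates the function for Car<SportsCar>, and calling it with a FamilyCar instantiates a second, separate function. No virtual dispatch is involved in either call.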

CRTP and Abstract Classes

When a derived class does not implement a function that it should, the call falls back to the base class function of the same name. If the base class provides a default, that is fine. But if the base class function only forwards to the derived class (because we want the derived class to be required to implement it), the call resolves back to the base class function itself, which then calls itself without end. The program compiles but misbehaves at runtime. To match the pure virtual functions of dynamic polymorphism, we give the function that the derived class must implement a different name from the base class interface function, so that a missing implementation causes a compilation error.

C++
template <typename D>
class B {
    public:
        void f(int i) {
            static_cast<D*>(this)->f_impl(i);
        }
};
class D : public B<D> {
    public:
        void f_impl(int i) {
            // implementation
        }
};
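If the raw compiler error for a missing f_impl is hard to read, one option is to add a static_assert inside the interface function. The sketch below is one possible way to do this, not part of the pattern itself: the has_f_impl trait and the Good and Incomplete classes are hypothetical names, and f returns int here (instead of void as above) only so the result can be observed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper trait: true when T has a member f_impl callable with an int.
template <typename T, typename = void>
struct has_f_impl : std::false_type {};

template <typename T>
struct has_f_impl<T, std::void_t<decltype(std::declval<T&>().f_impl(0))>>
    : std::true_type {};

template <typename D>
class B {
public:
    int f(int i) {
        // Checked when f() is instantiated, at which point D is a complete type.
        static_assert(has_f_impl<D>::value, "D must implement f_impl(int)");
        return static_cast<D*>(this)->f_impl(i);
    }
};

// Provides the required implementation.
class Good : public B<Good> {
public:
    int f_impl(int i) { return i * 2; }
};

// Forgets f_impl: defining the class compiles, but calling f() on it would not.
class Incomplete : public B<Incomplete> {};
```

Because member functions of a class template are only instantiated when used, Incomplete is rejected at the first call to f(), not at its definition.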

CRTP and Access Control

How do we allow our base class to access private functions and members of the derived class that we do not want accessible to the client? The simplest way is for the derived class to declare its base class specialization as a friend.

C++
template <typename D>
class B {
    public:
        void f(int i) {
            static_cast<D*>(this)->f_impl(i);
        }
};
class D : public B<D> {
    // Grants B<D> access to the private members of D
    friend class B<D>;
    private:
        void f_impl(int i) {
            // implementation
        }
};

As a general rule of thumb, a friend declaration is a code smell. But in some cases, such as this one, its use makes sense.

Downside

There are some major shortcomings to the CRTP that you should know about before changing code everywhere. The biggest is that we lose a major benefit of dynamic polymorphism: the ability to use a common base class as an abstraction. With dynamic polymorphism, a collection of different types of cars is often stored in a container as pointers to their base class. This allows different places in the code to work with the Car interface without needing to worry about which exact derived type they are dealing with. This is a powerful way to program generically and introduce abstractions into your system, and it is lost with CRTP.

We cannot add our SportsCar and FamilyCar instances to a single vector and then pass that around to other functions, because the two classes have different base classes (Car<SportsCar> and Car<FamilyCar>). The compiler instantiates a separate class for each template argument, with no way to store them as a common Car class.
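The following sketch illustrates the difference. The CRTP classes are trimmed-down versions of the Car example, while VirtualCar, VirtualSportsCar, VirtualFamilyCar, and totalSize are hypothetical names introduced only to show the dynamic counterpart.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// CRTP version: every derived class gets its own, unrelated base class.
template <typename Derived>
class Car {
public:
    int size() const { return static_cast<const Derived*>(this)->size(); }
};

class SportsCar : public Car<SportsCar> {
public:
    int size() const { return 100; }
};

class FamilyCar : public Car<FamilyCar> {
public:
    int size() const { return 50; }
};

// The two bases are distinct types, and neither is a base of the other car.
static_assert(!std::is_same_v<Car<SportsCar>, Car<FamilyCar>>);
static_assert(!std::is_base_of_v<Car<SportsCar>, FamilyCar>);

// Dynamic version (hypothetical): one shared base class for all cars.
class VirtualCar {
public:
    virtual ~VirtualCar() = default;
    virtual int size() const = 0;
};

class VirtualSportsCar : public VirtualCar {
public:
    int size() const override { return 100; }
};

class VirtualFamilyCar : public VirtualCar {
public:
    int size() const override { return 50; }
};

// A heterogeneous container is straightforward with the shared base class.
int totalSize(const std::vector<std::unique_ptr<VirtualCar>>& garage) {
    int total = 0;
    for (const auto& car : garage) {
        total += car->size();
    }
    return total;
}
```

There is no element type that would let a std::vector hold both Car<SportsCar> and Car<FamilyCar> objects in the same way, which is exactly the abstraction the CRTP gives up.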

Using Concepts Instead

C++20 adds concepts, a modern way to constrain our templates. In many cases, we can use concepts instead of the CRTP.

C++
#include <concepts>
#include <iostream>

// Concept that defines requirements for a Car
template<typename T>
concept Car = requires(T a) {
    { a.startEngine() } -> std::same_as<void>;  // Requires a startEngine() function that returns void
};

// Class that uses the concept
template<Car T>
class Vehicle {
public:
    void start() {
        T car;
        car.startEngine();  // Guaranteed to exist and be valid due to the Car concept
    }
};

// A class that satisfies the Car concept
class SportsCar {
public:
    void startEngine() {
        std::cout << "SportsCar engine started with a roar!" << std::endl;
    }
};

int main() {
    Vehicle<SportsCar> mySportsCar;
    mySportsCar.start();

    return 0;
}

As long as our class satisfies the requirements of the Car concept, we can define our Vehicle class with a start() interface similar to what we did with the CRTP above. The difference is that C++ concepts provide a cleaner, less perplexing syntax along with more comprehensible compile-time errors when something is wrong. Along with these benefits, we also get the same runtime performance as CRTP.

Let’s look at one more example, where we want a function that takes two instances of a generic point type and computes the Euclidean distance between them. With dynamic polymorphism, we would create an abstract base class Point with virtual functions x() and y() that each derived class overrides to return its values. We can do something similar with concepts by first defining our limiting concept:

C++
#include <type_traits>

template <typename T>
concept Point = requires(T p) {
    requires std::is_same_v<decltype(p.x()), decltype(p.y())>; // Requires T has x() & y() and they return the same types
    requires std::is_arithmetic_v<decltype(p.x())>; // ensures the return value can be operated on by C++ math operators
};

Instead of defining a base class Point, we have now defined a Point concept that other classes can mimic as long as they meet all the criteria defined in the concept.

C++
#include <cmath>

template <typename T>
class Point2D {
    public:
        Point2D(T x, T y) : x_{x}, y_{y} {}
        auto x() { return x_; }
        auto y() { return y_; }
    private:
        T x_{};
        T y_{};
};

// Accepts any two types that satisfy the Point concept defined above
auto dist(Point auto p1, Point auto p2) {
    auto a = p1.x() - p2.x();
    auto b = p1.y() - p2.y();
    return std::sqrt(a*a + b*b);
}

int main() {
    Point2D p1{12.0, 14.0};
    Point2D p2{5.0, 11.0};
    auto e_distance = dist(p1, p2);
}

Point2D does not inherit from Point, yet it can be passed to a function that accepts Point because it satisfies the requirements of the Point concept. We get the convenience of generically using the Point abstraction and the performance of no virtual function indirection!

Concepts are huge, and something I plan to explore more in the future. I recommend using C++ concepts over the CRTP whenever possible.

Conclusion

For the C++ engineer working on high-performance applications such as high-frequency trading or game engines, static polymorphism in the form of the CRTP or C++20 Concepts is an excellent way to get some of the benefits of runtime polymorphism without the performance hit. It is not a silver bullet, particularly because we lose the ability to abstract away via a common base class. But when a common base class is not necessary, static polymorphism should be the default approach.