C++ Language Evolution

By Mark Roulo

Last Updated: 19-Feb-2021


C++ as a language began as a simple extension to C to support classes (bundling code with data). Over time it has acquired more and more complexity. There was a sizable list of important concepts before C++11, and C++11 and C++17 added more. The language keeps getting more complicated, and the language specification is now 2,000+ pages long.

Why Is This Happening?

C++ Philosophy

The C++ language and the evolution of the language seem to be driven by a handful of principles:
  1. Don't break existing code
  2. Don't pay for unused features
  3. Terseness is good
  4. Provide access to maximum run-time performance

1) Don't break existing code

This is fairly self-explanatory: the semantics of old code cannot change. One side effect is that all the old concepts must continue to be supported. Most programming languages do this, so it is normal and not unique to C++.

This makes adoption of newer versions relatively inexpensive, but it also means that over time the language accrues the equivalent of "technical debt" because the obsoleted pieces cannot be removed (or, realistically, can only be removed rarely and with effort; C++17 finally removed the 'register' storage class specifier, although the keyword itself remains reserved).

2) Don't pay for features you don't use

Fundamental to C++ philosophy is that programs that do not use a given language feature (e.g. inheritance) should pay no runtime cost for the unused feature.
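As a rough illustration of this principle, the sketch below compares the size of a class with no virtual functions against one that has them. The types `Plain` and `WithVirtual` are hypothetical and exist only for this example. The exact sizes are implementation-defined, so the code only checks that the virtual version is larger, which is what mainstream compilers produce because they add a hidden pointer to each object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: no virtual functions, so no per-object bookkeeping.
struct Plain {
    int value;
    int get() const { return value; }
};

// Hypothetical type: same data, but virtual functions. Mainstream
// implementations store a hidden vtable pointer in every object.
struct WithVirtual {
    int value;
    virtual int get() const { return value; }
    virtual ~WithVirtual() = default;
};

// Extra bytes per object that the virtual version costs on this
// implementation (the exact number is implementation-defined).
std::size_t virtual_overhead_bytes() {
    return sizeof(WithVirtual) - sizeof(Plain);
}

// Small self-check; call it from main() to try the example.
void demo_object_size() {
    assert(sizeof(WithVirtual) > sizeof(Plain));
}
```

A program that only ever uses `Plain` never carries the extra per-object storage.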

A logical consequence of this is that C++ has refused to build either (true) garbage collection or reference counting into the language. If a given program wants to track pointer 'liveness' manually, then the overhead of reference counting is unwanted.

C++ then tends to put features such as reference counting (via smart pointers) into libraries rather than build them directly into the language (as is done with Swift, Python, etc.).
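A minimal sketch of what library-level reference counting looks like, using `std::shared_ptr` from the standard `<memory>` header. The function names are made up for this example. A raw pointer is shown alongside to make the point that code which manages lifetime manually does no counting at all.

```cpp
#include <cassert>
#include <memory>

// Reference count observed while two shared_ptr objects own the same int.
long count_with_two_owners() {
    std::shared_ptr<int> first = std::make_shared<int>(42);
    std::shared_ptr<int> second = first;  // copying adds an owner
    return first.use_count();
}

// Reference count after the second owner has gone out of scope.
long count_after_owner_leaves() {
    std::shared_ptr<int> first = std::make_shared<int>(42);
    {
        std::shared_ptr<int> second = first;  // two owners inside this scope
    }                                         // second is destroyed here
    return first.use_count();
}

// Manual lifetime management: no counter exists, so none is paid for.
int read_through_raw_pointer() {
    int* raw = new int(7);
    int result = *raw;
    delete raw;  // the programmer, not the library, decides when to free
    return result;
}

// Small self-check; call it from main() to try the example.
void demo_reference_counting() {
    assert(count_with_two_owners() == 2);
    assert(count_after_owner_leaves() == 1);
    assert(read_through_raw_pointer() == 7);
}
```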

3) Terseness is good

As an example of idiomatic looping over all the elements in a collection, C++ has moved from this:
    std::vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    for(std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        ...
    }
to this:
    std::vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    for(auto it = v.begin(); it != v.end(); ++it) {
        ...
    }
The 'auto' keyword hides the type declaration of the iterator (and thus the type of the elements in the collection). C++ has since moved on to prefer this:
    std::vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    for (auto& elem : v) {
        ...
    }
Here we hide both the type of the iterator and the looping details: ': v' basically means 'go over the entire range of elements in v'. Note that the loop variable is now a reference to an element of v rather than an iterator.
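To make the comparison concrete, here is a small sketch that sums a vector with each of the three loop styles. The function names are invented for this example, and the loop bodies (elided above) are filled in with a simple running total purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Pre-C++11 style: the iterator type is spelled out in full.
int sum_explicit_iterator(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;  // the iterator must be dereferenced
    }
    return total;
}

// C++11 'auto': the compiler deduces the iterator type.
int sum_auto_iterator(const std::vector<int>& v) {
    int total = 0;
    for (auto it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}

// C++11 range-based for: the loop variable is the element itself.
int sum_range_for(const std::vector<int>& v) {
    int total = 0;
    for (const auto& elem : v) {
        total += elem;  // no dereference needed
    }
    return total;
}

// Small self-check; call it from main() to try the example.
void demo_loop_styles() {
    std::vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    assert(sum_explicit_iterator(v) == 45);
    assert(sum_auto_iterator(v) == 45);
    assert(sum_range_for(v) == 45);
}
```

All three functions compute the same result; only the amount of detail the programmer has to write differs.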

4) Provide access to maximum run-time performance

C++ very much wants to allow C++ programmers to achieve maximum run-time performance. This is one of the reasons that custom allocators are supported. This is why C++ provides both virtual and non-virtual member functions (as well as static member functions).
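As a sketch of what custom allocator support looks like in practice, the hypothetical `CountingAllocator` below satisfies the minimal C++11 allocator requirements and simply counts how many times a container asks it for memory. A real performance-oriented allocator would replace the calls to `::operator new` and `::operator delete` with something like a memory pool; the counter is only there to show that the container really does go through the user-supplied type.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global counter of allocate() calls, for illustration only.
std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal C++11-style allocator that counts allocations.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// All CountingAllocator instances are interchangeable.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Number of allocations a vector makes when capacity is reserved up front.
// Typically one, though the exact number depends on the library.
std::size_t allocations_for_reserved_vector() {
    std::size_t before = allocation_count();
    std::vector<int, CountingAllocator<int>> v;
    v.reserve(100);
    for (int i = 0; i < 100; ++i) {
        v.push_back(i);
    }
    return allocation_count() - before;
}

// Small self-check; call it from main() to try the example.
void demo_custom_allocator() {
    assert(allocations_for_reserved_vector() >= 1);
}
```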

Because C++ isn't going to require a runtime with overhead, a number of these optimizations must be done by the developer at compile time. Java can determine at runtime whether a given member function has more than one implementation in an inheritance hierarchy, and can optimize calls accordingly. Java does not have a way to specify non-virtual member functions (final doesn't quite count!). C++ does, which leaves the choice, and the responsibility for getting it right, with the developer.
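The difference between the two kinds of member function can be sketched as follows. The classes `Base` and `Derived` and their member names are hypothetical. The non-virtual call is resolved at compile time from the static type of the reference, so no runtime lookup is needed; the virtual call is resolved at runtime from the actual type of the object.

```cpp
#include <cassert>
#include <string>

struct Base {
    // Non-virtual: the function to call is fixed at compile time.
    std::string static_name() const { return "Base"; }
    // Virtual: the function to call is looked up at runtime.
    virtual std::string dynamic_name() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    // Hides Base::static_name but does not override it.
    std::string static_name() const { return "Derived"; }
    // Overrides Base::dynamic_name.
    std::string dynamic_name() const override { return "Derived"; }
};

// Both helpers see the object only through a reference to Base.
std::string call_static(const Base& b) { return b.static_name(); }
std::string call_dynamic(const Base& b) { return b.dynamic_name(); }

// Small self-check; call it from main() to try the example.
void demo_dispatch() {
    Derived d;
    assert(call_static(d) == "Base");      // static type decides
    assert(call_dynamic(d) == "Derived");  // actual object type decides
}
```

The developer has to make this choice for each member function when writing the class, and a wrong choice changes the program's behavior rather than just its speed.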

Putting This All Together

The intersection of all of these decisions means that C++ application developers take on a number of tasks that, in other languages, are handled by the compiler or the language runtime. The "catch" to this is that getting a number of these things correct can be very difficult. Compilers and language runtimes tend to be developed by dedicated developers (most likely, on average, more talented than the typical application developer). More to the point, programming language compilers and runtimes tend to come with very extensive unit/regression tests to catch bugs.

The result is that C++ exposes to application developers details that other programming languages often hide from them. Combined with the unwillingness to break existing code, this produces a programming language that is already more complicated for application developers than maybe any other programming language in current use, and one that will only get more complicated over time as more features get added and no features get removed.