This post has been republished via RSS; it originally appeared at: Microsoft Developer Blogs - Feed.
The `final` specifier in C++ marks a class or virtual member function as one which cannot be derived from or overridden. For example, consider the following code:
    struct base {
        virtual void f() const = 0;
    };

    struct derived final : base {
        void f() const override {}
    };

If we attempt to write a new class which derives from `derived` then we get a compiler error:

    struct oh_no : derived { };

    <source>(9): error C3246: 'oh_no': cannot inherit from 'derived' as it has been declared as 'final'
    <source>(5): note: see declaration of 'derived'

The `final` specifier is useful for expressing to readers of the code that a class is not to be derived from and having the compiler enforce this, but it can also improve performance through aiding devirtualization.
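The example above marks a whole class as `final`, but the specifier can also be applied to a single virtual member function. Here is a minimal sketch of that case. The `animal`, `cat`, and `kitten` types and the `describe` helper are hypothetical names made up for illustration; they are not part of the original example.

```cpp
#include <string>

struct animal {
    virtual ~animal() = default;
    virtual std::string name() const { return "animal"; }
    virtual std::string sound() const { return "..."; }
};

// `cat` itself is not final, so it can still be derived from,
// but `sound` is marked final and cannot be overridden any further.
struct cat : animal {
    std::string name() const override { return "cat"; }
    std::string sound() const final { return "meow"; }
};

struct kitten : cat {
    // Overriding a non-final virtual function is still allowed.
    std::string name() const override { return "kitten"; }

    // Uncommenting the next line is a compile error, because `cat::sound` is final.
    // std::string sound() const override { return "mew"; }
};

// Hypothetical helper: calls both virtual functions through the base interface.
std::string describe(animal const& a) {
    return a.name() + " says " + a.sound();
}
```

Calling `describe(kitten{})` uses `kitten::name` but `cat::sound`, since `kitten` has no way to replace the latter.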
Devirtualization
Virtual functions require an indirect call through the vtable. This is more expensive than a direct call because of its interactions with branch prediction and the instruction cache, and because it prevents the further optimizations which could be carried out after inlining the call. Devirtualization is a compiler optimization which attempts to resolve virtual function calls at compile time rather than at runtime. This eliminates all the issues noted above, so it can greatly improve the performance of code which uses many virtual calls¹. Here is a minimal example of devirtualization:

    #include <iostream>

    struct dog {
        virtual void speak() { std::cout << "woof"; }
    };

    int main() {
        dog fido;
        fido.speak();
    }

In this code, even though `dog::speak` is a virtual function, the only possible result of `main` is to output “woof”. If you look at the compiler output you’ll see that MSVC, GCC, and Clang all recognize this and inline the definition of `dog::speak` into `main`, avoiding the need for an indirect call.
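For contrast, here is a sketch of a call which is harder to devirtualize. The `puppy` class and the two free functions are hypothetical additions for illustration, and `speak` returns a string here (rather than printing) so the result is easy to check. Looking only at `speak_through_reference`, the dynamic type of `d` is unknown, so a compiler generally has to emit an indirect call unless it can see every caller, for example after inlining or with whole-program optimization.

```cpp
#include <string>

struct dog {
    virtual ~dog() = default;
    virtual std::string speak() const { return "woof"; }
};

// Hypothetical derived class, added to show that the reference below
// may refer to something other than a plain `dog`.
struct puppy : dog {
    std::string speak() const override { return "yip"; }
};

// The dynamic type of `d` is not known inside this function:
// it may be a `dog` or any class derived from it.
std::string speak_through_reference(dog const& d) {
    return d.speak();
}

// The object is created locally, so its dynamic type is exactly `dog`
// and the call can be resolved at compile time.
std::string speak_directly() {
    dog fido;
    return fido.speak();
}
```

Passing a `puppy` to `speak_through_reference` yields "yip", which is why the call in that function cannot simply be replaced by `dog::speak`.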
The Benefit of final
The `final` specifier can provide the compiler with more opportunities for devirtualization by helping it identify more cases where virtual calls can be resolved at compile time. Coming back to our original example:
    struct base {
        virtual void f() const = 0;
    };

    struct derived final : base {
        void f() const override {}
    };

Consider this function:

    void call_f(derived const& d) {
        d.f();
    }

Since `derived` is marked `final`, the compiler knows it cannot be derived from further. This means that the call to `f` will only ever call `derived::f`, so the call can be resolved at compile time. As proof, here is the compiler output for `call_f` on MSVC when `derived` or `derived::f` is marked as `final`:
    ret 0

You can see that `derived::f` has been inlined into the definition of `call_f`. If we were to take the `final` specifier off the definition, the assembly would look like this:
    mov rax, QWORD PTR [rcx]
    rex_jmp QWORD PTR [rax]

This code loads the vtable pointer from `d`, then makes an indirect call to `derived::f` through the function pointer stored at the relevant location.
The cost of a pointer load and jump may not look like much since it’s just two instructions, but remember that this may involve a branch misprediction and/or instruction cache miss, which would result in a pipeline stall. Furthermore, if there were more code in `call_f` or in functions which call it, the compiler might be able to optimize it much more aggressively given the full visibility of the code which will be executed and the additional analysis which this enables.
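To make the point about follow-on optimizations more concrete, here is a small sketch of a function with more code around the virtual call. The `shape` and `triangle` types and the `total_sides` function are hypothetical and not from the original example. Because `triangle` is `final`, every call to `sides` in the loop can only reach `triangle::sides`, so an optimizer may inline it and then simplify the loop. Whether a particular compiler actually does so is an assumption which should be checked in the generated assembly.

```cpp
#include <cstddef>

struct shape {
    virtual ~shape() = default;
    virtual int sides() const = 0;
};

struct triangle final : shape {
    int sides() const override { return 3; }
};

// Sums the side counts of `count` triangles starting at `first`.
// Since `triangle` is final, each `sides()` call has exactly one possible target.
int total_sides(triangle const* first, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += first[i].sides();
    }
    return total;
}
```

Without `final` on `triangle`, each iteration would in principle need its own vtable lookup, and the loop body would be opaque to the optimizer.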
Conclusion
Marking your classes or member functions as `final` can improve the performance of your code by giving the compiler more opportunities to resolve virtual calls at compile time.
Consider if there are any places in your codebases which would benefit from this and measure the impact!
1 http://assemblyrequired.crashworks.org/how-slow-are-virtual-functions-really/
https://sites.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf
https://stackoverflow.com/questions/449827/virtual-functions-and-performance-c