New safety rules in C++ Core Check

This post has been republished via RSS; it originally appeared at: Microsoft Developer Blogs.

Rust and C++ are two popular systems programming languages. For years, the focus of C++ has been on performance. We are increasingly hearing calls from customers and security researchers that C++ should have stronger safety guarantees in the language. C++ often falls behind Rust when it comes to programming safety. Visual Studio 2019 version 16.7 contains four new rules in C++ Core Check to incorporate some safety features from Rust into C++.   For more detailed information on C++ Core Check, please see the C++ Core Guidelines Checker Reference documentation. If you’re just getting started with native code analysis tools, take a look at our introductory quick start for Code Analysis for C/C++. 

Missing default label in switch statements 

Rust's pattern matching constructs can be used similarly to the C++ switch statement. One way in which they differ, however, is that the Rust equivalent requires the programmer to cover all possible patterns being matched. This can be achieved either through writing an explicit handler for each pattern or appending a default handler for cases not explicitly covered.   For example, the following Rust code would not compile if the default handler were missing. 
// i32 == 32-bit signed integer 
fn printDieRoll(roll: i32) { 
    match roll { 
        1 => println!("one!"), 
        2 => println!("two!"), 
        3 => println!("three!"), 
        4 => println!("four!"), 
        5 => println!("five!"), 
        6 => println!("six!"), 
        _ => println!("what kind of die are you using?") // default handler 
    } 
}
This is a neat little safety feature because it guards against this extremely easy to make, but not so easy to catch, programming error.  Visual Studio does warn whenever all cases of an enum type are not covered in a C++ switch statement <C4061, C4062>. However, such a warning is not present for other types, such as integers, as in the Rust example above.  This release introduces a new check to warn whenever switch statements over non-enum types (i.e., char, int, ...) are missing a default label. You can find detailed documentation on this check hereTo enable this rule in Visual Studio, you will have to select the ruleset “C++ Core Check Style Rules”, “C++ Core Check Rules”, or “Microsoft All Rules” for your project and then run code analysis.  Re-writing the Rust example from above in C++, we would get something like below. 
void printDieRoll(int roll) { 
    switch (roll) { 
        case 1: 
            std::cout << "one\n"; 
            break; 
        case 2: 
            std::cout << "two\n"; 
            break; 
        case 3: 
            std::cout << "three\n"; 
            break; 
        case 4: 
            std::cout << "four\n"; 
            break; 
        case 5: 
            std::cout << "five\n"; 
            break; 
        case 6: 
            std::cout << "six\n"; 
            break; 
        default: 
            std::cout << "what kind of die are you using?\n"; 
            break; 
    } 
}
Removing the default handler now results in a warning. 
void printDieRoll(int roll) { 
    switch (roll) { // warning C26818: Switch statement does not cover all cases. Consider adding a 'default' label (es.79) 
        case 1: 
            std::cout << "one\n"; 
            break; 
        case 2: 
            std::cout << "two\n"; 
            break; 
        case 3:
            std::cout << "three\n"; 
            break; 
        case 4: 
            std::cout << "four\n"; 
            break; 
        case 5: 
            std::cout << "five\n"; 
            break; 
        case 6: 
            std::cout << "six\n"; 
            break; 
    } 
}

Unannotated fallthrough in switch statements 

Another restriction of Rust's match statement is that it does not support the notion of fallthrough between cases. In C++, on the other hand, the following code is perfectly valid. 
enum class Food { 
    BANANA, ORANGE, PIZZA, CAKE, KALE, CELERY 
}; 

void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: 
        case Food::ORANGE: 
            peel(food);   // implicit fallthrough 
        case Food::PIZZA: 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE: 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}
While this example is perfectly sound, implicit fallthrough between cases can very easily be a bug. Say, for instance, that the programmer of the above function had forgotten the break statement after the call to eat(food). The code would run, but the behavior would be completely incorrect. With a larger and more complex codebase, tracking this type of bug can be difficult.  Fortunately, with C++17 comes the addition of the [[fallthrough]] annotation, whose purpose is to mark fallthrough between case labels, such as in the example above, so that maintainers of the code can be confident thfallthrough behavior is intended.   With Visual Studio 2019 version 16.7, warning C26819 is raised whenever a non-empty switch case falls through into a following case without marking the fallthrough using the [[fallthrough]] annotation. Detailed documentation can be found here. This rule is enabled by default in Visual Studio when you run code analysis. 
void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: // empty case, fallthrough annotation not needed 
        case Food::ORANGE: 
            peel(food);    // warning C26819: Unannotated fallthrough between switch labels (es.78) 
        case Food::PIZZA:  // empty case, fallthrough annotation not needed 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE:  // empty case, fallthrough annotation not needed 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}
To fix this warning, insert a [[fallthrough]] statement.
void takeFromFridge(Food food) { 
    switch (food) { 
        case Food::BANANA: 
        case Food::ORANGE: 
            peel(food); 
            [[fallthrough]]; // the fallthrough is intended 
        case Food::PIZZA: 
        case Food::CAKE: 
            eat(food); 
            break; 
        case Food::KALE: 
        case Food::CELERY: 
            throwOut(food);         
            break; 
    } 
}

Expensive range-for copy 

A major difference between Rust and C++ is that Rust is move-by-default rather than copy-by-default.  Some Rust code:
struct MySequence {
    sequence: Vec<i32>
}

fn main() {
    let mut a = MySequence { sequence: vec![0, 1, 1, 2, 3, 5, 8, 13] };
    let mut _b = a;
    a.sequence.push(21); // error! `a` was moved into `b` and can no longer be used
}
This means that explicit copy semantics must be used in most cases whenever a copy is intended. 
#[derive(Clone)]
struct MySequence {
    sequence: Vec<i32>
}

fn main() {
    let mut a = MySequence { sequence: vec![0, 1, 1, 2, 3, 5, 8, 13] };
    let mut _b = a.clone();
    a.sequence.push(21); // much better
}
C++, on the other hand, is copy-by-default. This is not a problem in general but can sometimes be a source of bugs. One case where this commonly occurs is within range-for statements. Take for example the following piece of code. 
struct Person { 
    std::string first_name; 
    std::string last_name; 
    std::string email_address; 
}; 

void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (Person p: employees) { // copy of type `Person` occurs on each iteration 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}
In the above snippet, each element of the vector is copied into p on each iteration of the loop. This is not obvious and can be a significant source of inefficiency if the copy is expensive. To remedy this unnecessary copy, we added a new C++ Core Check rule, suggesting a way to remove the copy. 
void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (Person p: employees) { // Warning C26817: Potentially expensive copy of variable 'p' in range-for loop. Consider making it a const reference (es.71) 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}
By using the suggestion from the warning and changing the type of the variable p in the loop from a Person to a const Person&, the variable no longer receives an expensive copy of the data on each iteration. 
void emailEveryoneInCompany(const std::vector<Person>& employees) { 
    Email email; 
    for (const Person& p: employees) { // expensive copy no longer occurs 
        email.addRecipient(p.email_address); 
    } 
    email.addBody("Sorry for the spam"); 
    email.send(); 
}
To decide what constitutes an "expensive" copy, the following heuristic is used by the check:  If the size of the type is greater than twice the platform-dependent pointer size and the type is not a smart pointer or one of gsl::span, gsl::string_span, or std::string_view, then the copy is considered expensive. This means that for small datatypes such as numeric scalars, the warning will not trigger. For larger types, such as the Person type in the example above, the copy is considered expensive and a warning will be raised.  One final point to note is that the check will not fire if the variable is mutated in the loop body. 
struct Person { 
    std::string first_name; 
    std::string last_name; 
    int hourlyrate; // dollars per hour 
}; 

void giveEveryoneARaise(const std::vector<Person>& employees) { 
    for (Person p: employees) { 
        p.hourlyrate += 10; // `p` can no longer be marked `const Person&`, so the copy is unavoidable 
    } 
}
If instead the container was not const-qualified, then the copy could be avoided by changing the type Person to Person&. 
void giveEveryoneARaise() { 
    std::vector<Person> employees = getEmployees(); 
    for (Person& p: employees) { // no more expensive copying, but any subsequent mutation will change the container! 
        p.hourlyrate += 10; 
    } 
}
But this change comes with the introduction of new side-effects to the code. Therefore, the range-for copy warning only ever suggests marking the loop variable as const T&, and will not fire if the loop-variable cannot legally be marked const.  Full documentation of the check can be found hereThis rule is enabled by default in Visual Studio when you run code analysis. 

Expensive copy with the auto keyword 

The last new check in this release concerns itself with expensive copies occurring with the use of the auto type.  Consider the following Rust example in which type resolution occurs for a variable being assigned a reference. 
struct PasswordManager { 
    password: String 
} 

impl PasswordManager { 
    // member-access function returning an immutable reference to a member 
    fn getPassword(&self) -> &String { &self.password } 
    // Note: for the sake of an example dealing with expensive types, a &String is being returned. 
    // More realistically though, a string slice would be returned instead (similar to a string view in C++) 
} 

fn stealPassword(pm: &PasswordManager) { 
    let password = pm.getPassword(); // the type of `a` resolves to `&String`. No copy occurs. 
}
Because of Rust's requirement that in the majority of cases copying must be explicit, the type of password in the example automatically resolves to an immutable reference when being assigned an immutable reference, and no expensive copying is performed.  On the other hand, consider the equivalent C++ code. 
class PasswordManager { 
    std::string password; 
public: 
    const std::string& getPassword() const { return password; }  
}; 

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // the type of `password` resolves to `std::string`. Copy occurs.
}
Here, the type of password resolves to std::string, even though the return type of getPassword() is a const-reference to a string. The resulting behavior is that the contents of PasswordManager::password get copied into the local variable password. Compare this with a function returning a pointer: 
class PasswordManager {
    std::string password;
public:
    const std::string* getPassword() const { return &password; }
};

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // the type of `password` resolves to `const std::string*`. No copy occurs.
}
This difference in behavior between assigning a reference and a pointer to a variable marked auto is non-obvious, resulting in potentially unwanted and unexpected copying.  To guard against bugs arising from this behavior, the checker examines all instances of initialization from a reference to a variable marked auto. If the resulting copy is deemed expensive using the same heuristics as in the range-for check, the checker warns to mark the variable const auto& instead.
class PasswordManager {
    std::string password;
public:
    const std::string& getPassword() const { return password; }
};

void stealPassword(const PasswordManager& pm) {
    auto password = pm.getPassword();  // Warning C26820: Assigning by value when a const-reference would suffice, use const auto&amp; instead (p.9)
}
And as with the range-for check, this warning does not get raised whenever the variable cannot be legally marked const. 
std::string hashPassword(const PasswordManager& pm) {
    auto password = pm.getPassword(); // warning no longer gets raised because `password` is modified below
    password += "salt";
    return std::hash(password);
}
Another instance in which the warning does not get raised is whenever the reference is being derived from a temporary. In such cases, using const auto& would result in a dangling reference once the temporary gets destroyed. 
class PasswordManager {
    std::string password;
public:
    PasswordManager(const std::string& password): password(password) {}
    const std::string& getPassword() const { return password; }
};

void stealPassword() {
    const auto& password = PasswordManager("123").getPassword(); // using `const auto&` instead of just `auto`
    use_password(password); // `password` is now referencing invalid memory!
}
Full documentation of the check can be found hereThis rule is enabled by default in Visual Studio when you run code analysis.

Give us your feedback

Check out these newly added rules and the recently released GSL 3.0 library and let us know if they help you write safer C++. Stay tuned as we add more safety rules in future releases of Visual Studio. Download Visual Studio 2019 version 16.7 today and give it a try. We would love to hear from you to help us prioritize and build the right features for you. We can be reached via the comments below, Developer Community, and Twitter (@VisualC). The best way to file a bug or suggest a feature is via Developer Community. 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.