Source: LearnCpp.com by Alex
Strings, Pointers, and References
String
One important point to note is that C-style strings follow all the same rules as arrays. This means you can initialize the string upon creation, but you cannot assign values to it using the assignment operator after that!
1 | char myString[] = "string"; // ok |
Bad practice:
char name[255]; // declare array large enough to hold 255 characters
std::cin >> name; // nothing limits how many characters are written into name
In the above program, we’ve allocated an array of 255 characters to name, guessing that the user will not enter this many characters. Although this is commonly seen in C/C++ programming, it is poor programming practice, because nothing is stopping the user from entering more than 254 characters (either unintentionally or maliciously) and overflowing the array.
The recommended way of reading strings using cin is as follows:
char name[255]; // declare array large enough to hold 255 characters
std::cin.getline(name, 255); // store at most 254 characters plus the null terminator
This call to cin.getline() will read up to 254 characters into name (leaving room for the null terminator '\0'!). Any excess characters are not stored in name; they stay in the input stream, and std::cin is put into a failed state. In this way, we guarantee that we will not overflow the array!
Note the difference between strlen() and std::size(). strlen() returns the number of characters before the null terminator, whereas std::size (or the sizeof() trick) returns the size of the entire array, regardless of what’s in it.
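A small sketch of that distinction. It uses the sizeof() trick so that it also compiles before C++17 (std::size gives the same array length in C++17 and later). The helper names usedLength and capacity are made up for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Number of characters stored before the null terminator.
std::size_t usedLength(const char* str)
{
    return std::strlen(str);
}

// Length of the whole array, regardless of its contents. Taking the array
// by reference keeps its length as part of the type.
template <std::size_t N>
std::size_t capacity(const char (&buffer)[N])
{
    return sizeof(buffer) / sizeof(buffer[0]);
}
```

For char name[20] = "Alex";, usedLength(name) is 4 while capacity(name) is 20.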
Don’t use C-style strings
It is important to know about C-style strings because they are used in a lot of code. However, now that we’ve explained how they work, we’re going to recommend that you avoid them altogether whenever possible! Unless you have a specific, compelling reason to use C-style strings, use std::string (defined in the <string> header) instead. std::string is easier, safer, and more flexible. In the rare case that you do need to work with fixed buffer sizes and C-style strings (e.g. for memory-limited devices), we’d recommend using a well-tested 3rd party string library designed for the purpose instead.
Pointers
What good are pointers?
At this point, pointers may seem a little silly, academic, or obtuse. Why use a pointer if we can just use the original variable?
It turns out that pointers are useful in many different cases:
- Arrays are implemented using pointers. Pointers can be used to iterate through an array (as an alternative to array indices).
- They are the only way you can dynamically allocate memory in C++. This is by far the most common use case for pointers.
- They can be used to pass a large amount of data to a function in a way that doesn’t involve copying the data, which is inefficient.
- They can be used to pass a function as a parameter to another function.
- They can be used to achieve polymorphism when dealing with inheritance.
- They can be used to have one struct/class point at another struct/class, to form a chain. This is useful in some more advanced data structures, such as linked lists and trees.
So there are actually a surprising number of uses for pointers. But don’t worry if you don’t understand what most of these are yet. Now that you understand what pointers are at a basic level, we can start taking an in-depth look at the various cases in which they’re useful, which we’ll do in subsequent lessons.
Pointers convert to boolean false if they are null, and boolean true if they are non-null. Therefore, we can use a conditional to test whether a pointer is null or not:
1 | double *ptr = 0; |
Best practice: Initialize your pointers to a null value if you’re not giving them another value.
In C++, there is a special preprocessor macro called NULL (defined in the <cstddef> header, among others). The value of NULL is implementation defined, but is usually defined as the integer constant 0. Note: as of C++11, NULL can be defined as nullptr instead (which we’ll discuss in a bit).
Best practice: Because NULL is a preprocessor macro with an implementation-defined value, avoid using NULL.
Note that the value of 0 isn’t a pointer type, so assigning 0 (or NULL, pre-C++11) to a pointer to denote that the pointer is a null pointer is a little inconsistent. In rare cases, when used as a literal argument, it can even cause problems because the compiler can’t tell whether we mean a null pointer or the integer 0.
To address the above issues, C++11 introduces a new keyword called nullptr. nullptr is both a keyword and an rvalue constant, much like the boolean keywords true and false are.
1 | int *ptr = nullptr; |
C++ will implicitly convert nullptr to any pointer type. So in the above example, nullptr is implicitly converted to an integer pointer, and then that value is assigned to ptr. This has the effect of making integer pointer ptr a null pointer.
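A minimal sketch of the literal-argument problem and how nullptr avoids it. The two describe overloads are hypothetical, invented only to make the chosen overload visible.

```cpp
#include <string>

// Two hypothetical overloads: one takes an integer, the other a pointer.
std::string describe(int)  { return "int"; }
std::string describe(int*) { return "pointer"; }
```

With these overloads, describe(0) selects the int version because the literal 0 is an int, while describe(nullptr) can only select the pointer version, since nullptr converts to pointer types but not to int.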
C++11 also introduces a new type called std::nullptr_t (in header <cstddef>), which is the type of nullptr.
In all but two cases (which we’ll cover below), when a fixed array is used in an expression, the fixed array will decay (be implicitly converted) into a pointer that points to the first element of the array. (A pointer and an array are still not the same thing, though.)
Arrays in structs and classes don’t decay
Finally, it is worth noting that arrays that are part of structs or classes do not decay when the whole struct or class is passed to a function. This yields a useful way to prevent decay if desired, and will be valuable later when we write classes that utilize arrays.
For optimization purposes, multiple string literals may be consolidated into a single value. For example:
const char *name1{ "Alex" };
const char *name2{ "Alex" };
These are two different string literals with the same value. The compiler may opt to combine these into a single shared string literal, with both name1 and name2 pointing at the same address. Thus, if name1 were not const, a change made through name1 could also affect name2 (which might not be expected). In practice, a string literal cannot safely be changed even without the const modifier: modifying one is undefined behavior.
Rule: Feel free to use C-style string symbolic constants if you need read-only strings in your program, but always make them const!
When given a char * or const char *, std::cout assumes you intend to print a string rather than an address (which is what it prints for, say, an int *). While this is great 99% of the time, it can lead to unexpected results. Consider the following case:
1 | int in = 5; |
Why did it do this? Well, it assumed &ch (which has type char *) was a string. So it printed the ‘Q’, and then kept going. Next in memory was a bunch of garbage. Eventually, it ran into some memory holding a 0 value, which it interpreted as a null terminator, so it stopped. What you see may be different depending on what’s in memory after variable ch.
C++ supports three basic types of memory allocation, of which you’ve already seen two.
- Static memory allocation happens for static and global variables. Memory for these types of variables is allocated once when your program is run and persists throughout the life of your program.
- Automatic memory allocation happens for function parameters and local variables. Memory for these types of variables is allocated when the relevant block is entered, and freed when the block is exited, as many times as necessary.
- Dynamic memory allocation is the topic of this article.
To allocate a single variable dynamically, we use the scalar (non-array) form of the new operator:
1 | new int; |
If it wasn’t before, it should now be clear at least one case in which pointers are useful. Without a pointer to hold the address of the memory that was just allocated, we’d have no way to access the memory that was just allocated for us!
When we are done with a dynamically allocated variable, we need to explicitly tell C++ to free the memory for reuse. For single variables, this is done via the scalar (non-array) form of the delete operator:
1 | // assume ptr has previously been allocated with operator new |
The delete operator does not actually delete anything. It simply returns the memory being pointed to back to the operating system. The operating system is then free to reassign that memory to another application (or to this application again later).
Note: deleting a pointer that is not pointing to dynamically allocated memory may cause bad things to happen. A pointer that is pointing to deallocated memory is called a dangling pointer. Dereferencing or deleting a dangling pointer will lead to undefined behavior.
Rule: Set deleted pointers to 0 (or nullptr in C++11) unless they are going out of scope immediately afterward.
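A minimal sketch of the allocate, use, delete, reset sequence described above. The function name roundTrip is an assumption made for this example.

```cpp
// Allocates an int, uses it, frees it, and returns a copy of the stored value.
int roundTrip(int input)
{
    int* ptr = new int;  // scalar form of new: allocate one int
    *ptr = input;        // write through the pointer
    int copy = *ptr;     // read the value back while the memory is still ours
    delete ptr;          // scalar form of delete: give the memory back
    ptr = nullptr;       // no longer dangling (strictly optional here, since
                         // ptr goes out of scope right after; use 0 before C++11)
    return copy;
}
```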
Operator new can fail
When requesting memory from the operating system, in rare circumstances, the operating system may not have any memory to grant the request with.
By default, if new fails, a bad_alloc exception is thrown. If this exception isn’t properly handled (and it won’t be, since we haven’t covered exceptions or exception handling yet), the program will simply terminate (crash) with an unhandled exception error.
In many cases, having new throw an exception (or having your program crash) is undesirable, so there’s an alternate form of new that can be used instead to tell new to return a null pointer if memory can’t be allocated. This is done by adding the constant std::nothrow between the new keyword and the allocation type:
1 | int *value = new (std::nothrow) int; // value will be set to a null pointer if the integer allocation fails |
Note that if the allocation fails and you then attempt to dereference the resulting null pointer, undefined behavior will result (most likely, your program will crash). Consequently, the best practice is to check all memory requests to ensure they actually succeeded before using the allocated memory.
1 | int *value = new (std::nothrow) int; |
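A minimal sketch of checking a std::nothrow allocation before using it. The function name storeAndRead and its bool return convention are assumptions for this example.

```cpp
#include <new> // std::nothrow

// Returns true and writes the stored value to `out` on success,
// or returns false if the allocation failed.
bool storeAndRead(int input, int& out)
{
    int* value = new (std::nothrow) int; // null pointer on failure, no exception
    if (!value)
        return false; // allocation failed: do not dereference

    *value = input;
    out = *value;
    delete value;
    return true;
}
```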
Deleting a null pointer has no effect. Thus, there is no need for the following:
1 | if (ptr) |
Memory leaks
Memory leaks happen when your program loses the address of some bit of dynamically allocated memory before giving it back to the operating system. When this happens, your program can’t delete the dynamically allocated memory, because it no longer knows where it is. The operating system also can’t use this memory, because that memory is considered to be still in use by your program.
Dynamically allocated memory effectively has no scope. That is, it stays allocated until it is explicitly deallocated or until the program ends (and the operating system cleans it up, assuming your operating system does that). However, the pointers used to hold dynamically allocated memory addresses follow the scoping rules of normal variables. This mismatch can create interesting problems.
1 | void doSomething() { |
Dynamically allocating arrays
In addition to dynamically allocating single values, we can also dynamically allocate arrays of variables. Unlike a fixed array, where the array size must be fixed at compile time, dynamically allocating an array allows us to choose an array length at runtime.
To allocate an array dynamically, we use the array form of new and delete (often called new[] and delete[]):
1 | int *array = new int[length]; |
One often-asked question about array delete[] is, “How does array delete know how much memory to delete?” The answer is that array new[] keeps track of how much memory was allocated to a variable, so that array delete[] can delete the proper amount. Unfortunately, this size/length isn’t accessible to the programmer (which means we need to keep track of the length ourselves if we want to know the size of a dynamically allocated array).
Dynamic arrays are used almost identically to fixed arrays, but remember to free them with delete[].
If you want to initialize a dynamically allocated array to 0, the syntax is quite simple:
1 | int *array = new int[length](); // initialized to 0 |
Prior to C++11, there was no easy way to initialize a dynamic array to a non-zero value (initializer lists only worked for fixed arrays). This means you had to loop through the array and assign element values explicitly.
1 | int *array = new int[5]; |
However, starting with C++11, it’s now possible to initialize dynamic arrays using initializer lists!
1 | int fixedArray[5] = { 9, 7, 5, 3, 1 }; |
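A short sketch of both initialization forms for dynamic arrays, each paired with the matching delete[]. The function names are assumptions; the element values mirror the fixed-array example.

```cpp
#include <cstddef>

// Sums a dynamic array built with a C++11 initializer list.
int sumInitialized()
{
    int* array = new int[5]{ 9, 7, 5, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < 5; ++i)
        sum += array[i];
    delete[] array; // array form of delete to match new[]
    return sum;
}

// Sums a zero-initialized dynamic array whose length is chosen at runtime.
int sumZeroed(std::size_t length)
{
    int* array = new int[length](); // every element starts at 0
    int sum = 0;
    for (std::size_t i = 0; i < length; ++i)
        sum += array[i];
    delete[] array;
    return sum;
}
```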
Resizing arrays (not okay)
Dynamically allocating an array allows you to set the array length at the time of allocation. However, C++ does not provide a built-in way to resize an array that has already been allocated. It is possible to work around this limitation by dynamically allocating a new array, copying the elements over, and deleting the old array. However, this is error prone, especially when the element type is a class (which have special rules governing how they are created).
Consequently, we recommend avoiding doing this yourself.
Fortunately, if you need this capability, C++ provides a resizable array as part of the standard library called std::vector. We’ll introduce std::vector shortly.
Pointers and const
Pointing to const value:
1 | int value = 5; |
Const pointers:
1 | int value = 5; |
Const pointer to a const value:
1 | int value = 5; |
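The three combinations can be compared side by side. This is a sketch with made-up variable names; the commented-out lines are the ones a compiler would reject.

```cpp
int pointerConstDemo()
{
    int value = 5;
    int other = 6;

    const int* ptrToConst = &value;  // pointer to a const value
    // *ptrToConst = 1;              // error: cannot modify the value through it
    ptrToConst = &other;             // okay: the pointer itself can be repointed

    int* const constPtr = &value;    // const pointer
    *constPtr = 7;                   // okay: the value can be modified
    // constPtr = &other;            // error: the pointer cannot be repointed

    const int* const constPtrToConst = &value; // neither may change

    return *ptrToConst + *constPtr + *constPtrToConst; // 6 + 7 + 7
}
```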
References
l-values and r-values
In C++, variables are a type of l-value (pronounced ell-value). An l-value is a value that has an address (in memory). Since all variables have addresses, all variables are l-values. The name l-value came about because l-values are the only values that can be on the left side of an assignment statement. When we do an assignment, the left hand side of the assignment operator must be an l-value. Consequently, a statement like 5 = 6; will cause a compile error, because 5 is not an l-value. The value of 5 has no memory, and thus nothing can be assigned to it. 5 means 5, and its value cannot be reassigned. When an l-value has a value assigned to it, the current value at that memory address is overwritten.
The opposite of l-values are r-values (pronounced arr-values). An r-value refers to any value that can be assigned to an l-value.
r-values are always evaluated to produce a single value. Examples of r-values are literals (such as 5, which evaluates to 5), variables (such as x, which evaluates to whatever value was last assigned to it), or expressions (such as 2 + x, which evaluates to the value of x plus 2).
x = 7;
x = x + 1;
In this statement, the variable x is being used in two different contexts. On the left side of the assignment operator, “x” is being used as an l-value (variable with an address). On the right side of the assignment operator, x is being used as an r-value, and will be evaluated to produce a value (in this case, 7). When C++ evaluates the above statement, it evaluates as: x = 7 + 1;
The key takeaway is that on the left side of the assignment, you must have something that represents a memory address (such as a variable). Everything on the right side of the assignment will be evaluated to produce a value.
Note: const variables are considered non-modifiable l-values.
Three basic variable types:
- Normal variables
- Pointers
- Reference variables
A reference is a type of C++ variable that acts as an alias to another object or value.
C++ supports three kinds of references:
- References to non-const values (typically just called “references”, or “non-const references”), which we’ll discuss in this lesson.
- References to const values (often called “const references”), which we’ll discuss in the next lesson.
- R-value references (added in C++11), which we cover in detail in the chapter on move semantics.
int value = 5; // normal integer
int &ref = value; // reference to variable value

int x = 5; // normal integer
int &y = x; // y is a reference to x
int &z = y; // z is also a reference to x
Using the address-of & operator on a reference returns the address of the value being referenced:
cout << &value; // prints 0012FF7C
cout << &ref; // prints 0012FF7C
References must be initialized.
References to non-const values can only be initialized with non-const l-values. They cannot be initialized with const l-values or r-values.
References cannot be reassigned
Once initialized, a reference cannot be changed to reference another variable. Consider the following snippet:
1 | int value1 = 5; |
Note that the second statement may not do what you might expect! Instead of reassigning ref to reference variable value2, it assigns the value from value2 to value1 (which ref refers to).
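A small sketch that makes this observable: after the assignment, ref still has the address of value1, and value1 holds the copied value. The function name is an assumption for this example.

```cpp
bool referenceStaysBound()
{
    int value1 = 5;
    int value2 = 6;

    int& ref = value1; // ref is an alias for value1
    ref = value2;      // copies 6 into value1; ref still refers to value1

    value2 = 10;       // has no effect on ref or value1
    return &ref == &value1 && value1 == 6 && ref == 6;
}
```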
References as function parameters
References are most often used as function parameters. In this context, the reference parameter acts as an alias for the argument, and no copy of the argument is made into the parameter. This can lead to better performance if the argument is large or expensive to copy.
In lesson 6.8 – Pointers and arrays we talked about how passing a pointer argument to a function allows the function to dereference the pointer to modify the argument’s value directly.
References work similarly in this regard. Because the reference parameter acts as an alias for the argument, a function that uses a reference parameter is able to modify the argument passed in:
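A minimal sketch of a function that modifies its argument through a reference parameter. The name addOne is chosen for illustration.

```cpp
// ref is an alias for the caller's variable, so the change is visible
// to the caller after the function returns.
void addOne(int& ref)
{
    ref = ref + 1;
}
```

If value holds 5 before the call addOne(value), it holds 6 afterward.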
Best practice: Pass arguments by non-const reference when the argument needs to be modified by the function.
The primary downside of using non-const references as function parameters is that the argument must be a non-const l-value. This can be restrictive.
Using references to pass C-style arrays to functions
One of the most annoying issues with C-style arrays is that in most cases they decay to pointers when evaluated. However, if a C-style array is passed by reference, this decaying does not happen.
1 | // Note: You need to specify the array size in the function declaration |
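A sketch of passing a C-style array by reference so that it does not decay. The function name elementCount and the length 4 are assumptions for this example; the length has to appear in the parameter type.

```cpp
#include <cstddef>

// The parameter is a reference to an array of exactly 4 ints, so sizeof
// sees the whole array rather than a pointer.
std::size_t elementCount(const int (&array)[4])
{
    return sizeof(array) / sizeof(array[0]);
}
```

Passing an array of any other length to this function is a compile error, which is the price of keeping the type information.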
References as shortcuts
A secondary (much less used) use of references is to provide easier access to nested data. Consider the following structs:
1 | struct Something { |
References vs pointers
References and pointers have an interesting relationship – a reference acts like a pointer that is implicitly dereferenced when accessed (references are usually implemented internally by the compiler using pointers). Thus given the following:
1 | int value = 5; |
*ptr and ref evaluate identically. As a result, the following two statements produce the same effect:
1 | *ptr = 5; |
Because references must be initialized to valid objects (cannot be null) and cannot be changed once set, references are generally much safer to use than pointers (since there’s no risk of dereferencing a null pointer). However, they are also a bit more limited in functionality accordingly.
Note: If a given task can be solved with either a reference or a pointer, the reference should generally be preferred. Pointers should only be used in situations where references are not sufficient (such as dynamically allocating memory).
References to r-values extend the lifetime of the referenced value
Normally r-values have expression scope, meaning the values are destroyed at the end of the expression in which they are created.
1 | std::cout << 2 + 3; // 2 + 3 evaluates to r-value 5, which is destroyed at the end of this statement |
However, when a reference to a const value is initialized with an r-value, the lifetime of the r-value is extended to match the lifetime of the reference.
1 | int somefcn() { |
1 | int somefcn() { |
Const references as function parameters
References used as function parameters can also be const. This allows us to access the argument without making a copy of it, while guaranteeing that the function will not change the value being referenced.
1 | // ref is a const reference to the argument passed in, not a copy |
References to const values are particularly useful as function parameters because of their versatility. A const reference parameter allows you to pass in a non-const l-value argument, a const l-value argument, a literal, or the result of an expression:
1 | int a = 1; |
To avoid making unnecessary and potentially expensive copies, variables that are not pointers or fundamental data types (int, double, etc…) should generally be passed by (const) reference.
Fundamental data types should be passed by value, unless the function needs to change them.
Rule: Pass non-pointer, non-fundamental data type variables (such as structs) by (const) reference.
1 | struct Person { |
Rule: When using a pointer to access the value of a member, use operator -> instead of operator . (the dot operator).
For-each loops
C++11 introduces a new type of loop called a for-each loop (also called a range-based for loop) that provides a simpler and safer method for cases where we want to iterate through every element in an array (or other list-type structure).
1 | for (element_declaration : array) |
Because element_declaration should have the same type as the array elements, this is an ideal case in which to use the auto keyword, and let C++ deduce the type of the array elements for us.
Copying array elements can be expensive, and most of the time we really just want to refer to the original element. Fortunately, we can use references for this:
1 | int array[5] = { 9, 7, 5, 3, 1 }; |
And, of course, it’s a good idea to make your element const if you’re intending to use it in a read-only fashion.
Rule: In for-each loop element declarations, if your elements are non-fundamental types, use references or const references for performance reasons.
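A sketch of a for-each loop over non-fundamental elements using const auto&, so that no element is copied. The array contents and the function name are made up for this example.

```cpp
#include <cstddef>
#include <string>

std::size_t totalLength()
{
    const std::string names[3] = { "Alex", "Betty", "Caroline" };

    std::size_t total = 0;
    for (const auto& name : names) // name refers to each element in turn
        total += name.length();
    return total; // 4 + 5 + 8
}
```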
For-each doesn’t work with pointers to an array
In order to iterate through the array, for-each needs to know how big the array is, which means knowing the array size. Because arrays that have decayed into a pointer do not know their size, for-each loops will not work with them!
1 | int sumArray(int array[]) { // array is a pointer |
Similarly, dynamic arrays won’t work with for-each loops for the same reason.
Multidimensional arrays
1 | int **array = new int[10][5]; // won't work |
Note: Unfortunately, this relatively simple solution doesn’t work if the right-most array dimension isn’t a compile-time constant.
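The listing above only shows the form that fails. One form that does compile, and presumably the kind of solution the note refers to, declares a pointer to an array of 5 ints; the right-most dimension (5) must be a compile-time constant, while the left-most may be chosen at runtime. The function name lastCell is an assumption for this sketch.

```cpp
#include <cstddef>

// Allocates a rows x 5 array, writes to its last cell, and reads it back.
// Assumes rows >= 1.
int lastCell(std::size_t rows)
{
    int (*array)[5] = new int[rows][5](); // zero-initialized rows x 5 array
    array[rows - 1][4] = 42;
    int result = array[rows - 1][4] + array[0][0]; // 42 + 0
    delete[] array;
    return result;
}
```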
Intro to std::array
Introduced in C++11, std::array provides fixed array functionality that won’t decay when passed into a function. std::array is defined in the <array> header, inside the std namespace.
Just like the native implementation of fixed arrays, the length of a std::array must be set at compile time.
std::array supports a second form of array element access (the at() function) that does bounds checking:
1 | std::array<int, 5> myArray { 1, 2, 3, 4, 5 }; |
In the above example, the call to myArray.at(1) checks to ensure array element 1 is valid, and because it is, it returns a reference to array element 1. We then assign the value of 6 to this. However, the call to myArray.at(9) fails because array element 9 is out of bounds for the array. Instead of returning a reference, the at() function throws an error that terminates the program (note: It’s actually throwing an exception of type std::out_of_range – we cover exceptions in chapter 15). Because it does bounds checking, at() is slower (but safer) than operator[].
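A sketch of what the bounds check looks like when the exception is caught instead of terminating the program. The helper name trySet and its bool return convention are assumptions; exception handling itself is covered later.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Writes `value` at `index` if the index is valid; reports failure otherwise.
bool trySet(std::array<int, 5>& myArray, std::size_t index, int value)
{
    try
    {
        myArray.at(index) = value; // at() checks the index first
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false; // index was out of bounds; the array is unchanged
    }
}
```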
std::array will clean up after itself when it goes out of scope, so there’s no need to do any kind of cleanup.
1 | void printLength(const std::array<double, 5> &myArray) { |
Also note that we passed std::array by (const) reference. This is to prevent the compiler from making a copy of the std::array when the std::array was passed to the function (for performance reasons).
Rule: Always pass std::array by reference or const reference
Manually indexing std::array via size_type
1 | std::array<int, 5> myArray { 7, 3, 1, 9, 5 }; |
The answer is that there’s a likely signed/unsigned mismatch in this code! Due to a curious decision, the size() function and array index parameter to operator[] use a type called size_type, which is defined by the C++ standard as an unsigned integral type. Our loop counter/index (variable i) is a signed int. Therefore both the comparison i < myArray.size() and the array index myArray[i] have type mismatches.
Interestingly enough, size_type isn’t a global type (like int or std::size_t). Rather, it’s defined inside the definition of std::array (C++ allows nested types). This means when we want to use size_type, we have to prefix it with the full array type (think of std::array acting as a namespace in this regard). In our above example, the fully-prefixed type of “size_type” is std::array<int, 5>::size_type!
Therefore, the correct way to write the above code is as follows:
1 | for (std::array<int, 5>::size_type i = 0; i < myArray.size(); ++i) { |
For std::array, size_type is a typedef for std::size_t (the standard’s definition of std::array declares it that way). So it’s common to see developers use std::size_t instead, and this will work:
1 | for (std::size_t i = 0; i < myArray.size(); ++i) |
A better solution is to avoid manual indexing of std::array in the first place. Instead, use range-based for loops (or iterators) if possible.
Intro to std::vector
Part of the standard library since C++98, std::vector provides dynamic array functionality that handles its own memory management. This means you can create arrays that have their length set at runtime, without having to explicitly allocate and deallocate memory using new and delete. std::vector lives in the <vector> header.
Just like with std::array, size() returns a value of nested type size_type (full type in the above example would be std::vector::size_type), which is an unsigned integer.
1 | // resize an array |
There are two things to note here. First, when we resized the array, the existing element values were preserved! Second, new elements are initialized to the default value for the type (which is 0 for integers).
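A minimal sketch of both points, using made-up starting values: the existing elements survive the resize, and the added elements are zero.

```cpp
#include <vector>

std::vector<int> resizedCopy()
{
    std::vector<int> array{ 0, 1, 2 };
    array.resize(5); // grow from 3 to 5 elements
    return array;    // contents are now 0 1 2 0 0
}
```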
Resizing a vector is computationally expensive, so you should strive to minimize the number of times you do so.
Compacting bools
std::vector has another cool trick up its sleeve. There is a special implementation for std::vector of type bool that will compact 8 booleans into a byte! This happens behind the scenes, and doesn’t change how you use the std::vector.
1 | std::vector<bool> array { true, false, false, true, true }; |
Functions
Parameters vs Arguments
In common usage, the terms parameter and argument are often interchanged. However, for the purposes of further discussion, we will make a distinction between the two:
A parameter (sometimes called a formal parameter) is a variable declared in the function declaration: void foo(int x).
An argument (sometimes called an actual parameter) is the value that is passed to the function by the caller: foo(5).
Rule: When passing an argument by reference, always use a const reference unless you need to change the value of the argument
References to pointers
It’s possible to pass a pointer by reference, and have the function change the address of the pointer entirely:
1 | void foo(int *&ptr) { // pass pointer by reference |
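A minimal sketch of a pointer passed by reference. The function name setToNull is hypothetical; the point is that the caller’s pointer itself changes, not just what it points at.

```cpp
// ptr is a reference to the caller's pointer, so assigning to it
// changes the address held by the caller's variable.
void setToNull(int*& ptr)
{
    ptr = nullptr;
}
```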
When to use pass by reference:
- When passing structs or classes (use const if read-only).
- When you need the function to modify an argument.
- When you need access to the type information of a fixed array.
When not to use pass by reference:
- When passing fundamental types that don’t need to be modified (use pass by value).
When to use pass by address/pointer (strictly speaking, the pointer itself is passed by value):
- When passing built-in arrays (if you’re okay with the fact that they’ll decay into a pointer).
- When passing a pointer and nullptr is a valid argument logically.
Note: Return by address is often used to return dynamically allocated memory to the caller.
1 | int* allocateArray(int size) { |
Just like return by address, you should not return local variables by reference. Consider the following example:
1 | int& doubleValue(int x) { |
Returning by reference is typically used to return arguments passed by reference to the function back to the caller. In the following example, we return (by reference) an element of an array that was passed to our function by reference:
1 | // Returns a reference to the index element of array |
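A sketch of returning an element by reference from an array that was itself passed by reference. The array length 25 and the name getElement are assumptions for this example.

```cpp
#include <array>
#include <cstddef>

// Returns a reference to the element at `index`. This is safe because the
// array outlives the function call (it belongs to the caller).
int& getElement(std::array<int, 25>& array, std::size_t index)
{
    return array[index];
}
```

Because the call yields a reference, it can appear on the left side of an assignment: getElement(array, 10) = 5; sets element 10 of the caller’s array.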
Lifetime extension doesn’t save dangling references
1 | const int& returnByReference() { |
In the above program, returnByReference() is returning a const reference to a value that will go out of scope when the function ends. This is normally a no-no, as it will result in a dangling reference. However, we also know that assigning a value to a const reference can extend the lifetime of that value. So which takes precedence here? Does 5 go out of scope first, or does ref extend the lifetime of 5?
The answer is that 5 goes out of scope first, so ref ends up bound to a reference that is already dangling. Lifetime extension only works when the reference is bound directly to the temporary in the same block (e.g. a temporary that would otherwise have expression scope). It does not work across function boundaries.
Use tuple to return multiple values
1 | std::tuple<int, double> returnTuple() { // return a tuple that contains an int and a double |
As of C++17, a structured binding declaration can be used to simplify splitting multiple returned values into separate variables:
1 | int main() { |
Using a struct is a better option than a tuple if you’re using the struct in multiple places. However, for cases where you’re just packaging up these values to return and there would be no reuse from defining a new struct, a tuple is a bit cleaner since it doesn’t introduce a new user-defined data type.
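A sketch of returning two values in a tuple and unpacking them with std::tie, which works from C++11 onward (a C++17 structured binding such as auto [a, b] = returnTuple(); does the same job more compactly). The values 5 and 6.7 and the helper sumParts are assumptions for this example.

```cpp
#include <tuple>

// Returns a tuple that contains an int and a double.
std::tuple<int, double> returnTuple()
{
    return std::make_tuple(5, 6.7);
}

// Unpacks the tuple into two existing variables and combines them.
double sumParts()
{
    int a = 0;
    double b = 0.0;
    std::tie(a, b) = returnTuple(); // a receives the int, b the double
    return a + b;
}
```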
Inline functions
Rule: Be aware of inline functions. Modern compilers should implicitly add inline functions for you as appropriate, so there isn’t a need to use the keyword.
Inline functions are exempt from the one-definition per program rule
In previous chapters, we’ve noted that you should not implement functions (with external linkage) in header files, because when those headers are included into multiple .cpp files, the function definition will be copied into multiple .cpp files. These files will then be compiled, and the linker will throw an error because it will note that you’ve defined the same function more than once.
However, inline functions are exempt from the rule that you can only have one definition per program: an inline function may be defined in multiple files, provided every definition is identical, and the linker treats those definitions as a single function rather than reporting a conflict.
This may seem like an uninteresting bit of trivia at this point, but next chapter we’ll introduce a new type of function (a member function) that makes significant use of this point.
Even with inline functions, you generally should not define global functions in header files.
Function overloading
Function return types are not considered for uniqueness
A function’s return type is NOT considered when overloading functions. (Note for advanced readers: This was an intentional choice, as it ensures the behavior of a function call or subexpression can be determined independently from the rest of the expression, making understanding complex expressions much simpler. Put another way, we can always determine which version of a function will be called based solely on the arguments. If return values were included, then we wouldn’t have an easy syntactic way to tell which version of a function was being called – we’d also have to understand how the return value was being used, which requires a lot more analysis).
1 | int getRandomValue(); |
Types generated by typedef are not distinct, since they don’t introduce new types. The following two declarations of Print() are considered identical:
1 | typedef char *string; |
How function calls are matched with overloaded functions
Making a call to an overloaded function results in one of three possible outcomes:
- A match is found. The call is resolved to a particular overloaded function.
- No match is found. The arguments cannot be matched to any overloaded function.
- An ambiguous match is found. The arguments matched more than one overloaded function.
When an overloaded function is called, C++ goes through the following process to determine which version of the function will be called:
First, C++ tries to find an exact match. This is the case where the actual argument exactly matches the parameter type of one of the overloaded functions. For example:
1 | void print(char *value); |
Although 0 could technically match print(char*) (as a null pointer), it exactly matches print(int) (matching char* would require an implicit conversion). Thus print(int) is the best match available.
Secondly, if no exact match is found, C++ tries to find a match through promotion. To summarize:
- char, unsigned char, and short are promoted to int.
- unsigned short is promoted to int or unsigned int, depending on the size of int.
- float is promoted to double.
- Enum is promoted to int.
void print(char *value);
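As a rough illustration of the promotion step, the sketch below reuses a hypothetical pair of overloads (whichPrint() standing in for print(char*) and print(int)) and passes arguments that have no exact match.

```cpp
#include <string>

// Hypothetical overload set that reports which version was selected.
std::string whichPrint(char*) { return "print(char*)"; }
std::string whichPrint(int)   { return "print(int)"; }

std::string matchChar()
{
    // No overload takes char, so there is no exact match.
    // 'a' is promoted to int, which matches the int overload.
    return whichPrint('a');
}

std::string matchShort()
{
    short value = 5;
    // short is likewise promoted to int.
    return whichPrint(value);
}
```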
Thirdly, if no promotion is possible, C++ tries to find a match through standard conversion. Standard conversions include:
- Any numeric type will match any other numeric type, including unsigned (e.g. int to float)
- Enum will match the formal type of a numeric type (e.g. enum to float)
- Zero will match a pointer type and numeric type (e.g. 0 to char*, or 0 to float)
- A pointer will match a void pointer
struct Employee; // defined somewhere else
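A minimal sketch of the standard-conversion step, assuming an overload set of print(float) and print(Employee). The Employee definition below is a placeholder (the text only says it is defined elsewhere), and whichPrint() is a hypothetical stand-in that returns a label instead of printing.

```cpp
#include <string>

struct Employee { int id; }; // placeholder definition, assumed for this sketch

// Hypothetical overload set that reports which version was selected.
std::string whichPrint(float)    { return "print(float)"; }
std::string whichPrint(Employee) { return "print(Employee)"; }

std::string matchChar()
{
    // There is no exact match, and promoting 'a' to int matches nothing either.
    // The standard conversion char -> float makes the float overload the match.
    return whichPrint('a');
}
```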
Finally, C++ tries to find a match through user-defined conversion. Although we have not covered classes yet, classes (which are similar to structs) can define conversions to other types that can be implicitly applied to objects of that class. For example, we might define a class X and a user-defined conversion to int.
class X; // with user-defined conversion to int
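Here is one way such a class might look. This is only a sketch: the members of X and the whichPrint() overloads are assumptions made for illustration, since the text does not show how X is defined.

```cpp
#include <string>

// Hypothetical class with a user-defined conversion to int.
class X
{
public:
    explicit X(int value) : m_value(value) {}
    operator int() const { return m_value; } // implicit conversion X -> int

private:
    int m_value;
};

// Hypothetical overload set that reports which version was selected.
std::string whichPrint(float) { return "print(float)"; }
std::string whichPrint(int)   { return "print(int)"; }

std::string matchX()
{
    X value(7);
    // X is neither float nor int, and no promotion or standard conversion
    // applies to a class type. The user-defined conversion X -> int is used,
    // and the int overload needs no further conversion after it.
    return whichPrint(value);
}
```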
Default parameters
void printValues(int x, int y=10) {
A function can have multiple default parameters:
void printValues(int x=10, int y=20, int z=30) {
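To see how the defaults are filled in from right to left, consider this sketch. formatValues() is a hypothetical variant of printValues() that returns the text instead of printing it, which makes each call's result easy to inspect.

```cpp
#include <string>

// Hypothetical variant of printValues(): same parameters and defaults,
// but the values are returned as text rather than printed.
std::string formatValues(int x = 10, int y = 20, int z = 30)
{
    return std::to_string(x) + " " + std::to_string(y) + " " + std::to_string(z);
}
```

Calling formatValues(1, 2) supplies x and y and lets z default to 30, while formatValues() uses all three defaults. Arguments are always matched to parameters from the left, so the defaults that remain are the rightmost ones.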
Note that it is impossible to supply an argument for parameter z without also supplying arguments for parameters x and y. This is because C++ does not support a function call syntax such as printValues(,,3). This has two major consequences:
- All default parameters must be the rightmost parameters. The following is not allowed:
void printValue(int x=10, int y); // not allowed
- If more than one default parameter exists, the leftmost default parameter should be the one most likely to be explicitly set by the user.
Default parameters can only be declared once
Once declared, a default parameter cannot be redeclared. That means for a function with a forward declaration and a function definition, the default parameter can be declared in either the forward declaration or the function definition, but not both.
void printValues(int x, int y=10);
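A short sketch of the rule, using a hypothetical addValues() function so the effect of the default is visible in the return value:

```cpp
// The forward declaration carries the default argument...
int addValues(int x, int y = 10);

// ...so the definition must not repeat it. Writing "int y = 10" here as well
// would be rejected by the compiler as a redefinition of the default argument.
int addValues(int x, int y)
{
    return x + y;
}
```

Here addValues(5) is compiled as addValues(5, 10), because the default is visible from the forward declaration.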
Default parameters and function overloading
Functions with default parameters may be overloaded. For example, the following is allowed:
void print(std::string str);
If the user were to call print(), it would resolve to print(' '), which would print a space.
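The overload pair can be sketched like this, assuming the second overload is print(char ch=' ') as the sentence above implies. describePrint() is a hypothetical stand-in that returns a label instead of printing.

```cpp
#include <string>

// Hypothetical stand-ins for print(std::string) and print(char ch=' ').
std::string describePrint(std::string str) { return "string[" + str + "]"; }
std::string describePrint(char ch = ' ')   { return std::string("char[") + ch + "]"; }
```

A call with no arguments can only match the char overload (the std::string overload has no default), so describePrint() behaves like describePrint(' ').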
However, it is important to note that default parameters do NOT count towards the parameters that make the function unique. Consequently, the following overloads cannot be told apart when the defaulted argument is omitted:
void printValues(int x);
If the caller were to call printValues(10), the compiler would not be able to disambiguate whether the user wanted printValues(int) or printValues(int, int) with the second argument defaulted to 20, so the call is rejected as ambiguous.
Function Pointers
Note that the type (parameters and return type) of the function pointer must match the type of the function. Here are some examples of this:
// function prototypes
Unlike fundamental types, C++ will implicitly convert a function into a function pointer if needed (so you don’t need to use the address-of operator (&) to get the function’s address). However, it will not implicitly convert function pointers to void pointers, or vice-versa.
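A small sketch of these matching rules follows. The functions foo(), goo(), and hoo() and their bodies are placeholders invented for the example; only the pointer declarations matter.

```cpp
// Placeholder functions with three different types.
int foo() { return 5; }
double goo() { return 6.0; }
int hoo(int x) { return x; }

int callThroughPointers()
{
    int (*fcnPtr1)() = foo;     // ok: foo is implicitly converted to a pointer to foo
    int (*fcnPtr2)() = &foo;    // ok: the explicit address-of form means the same thing
    double (*fcnPtr3)() = goo;  // ok: return type double matches
    int (*fcnPtr4)(int) = hoo;  // ok: parameter list (int) matches

    // int (*bad1)() = goo;     // error: return types don't match
    // int (*bad2)() = hoo;     // error: parameter lists don't match
    // void* bad3 = foo;        // error: no implicit conversion to void*

    return fcnPtr1() + fcnPtr2() + static_cast<int>(fcnPtr3()) + fcnPtr4(1);
}
```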
One interesting note: Default parameters won’t work for functions called through function pointers. Default parameters are resolved at compile-time (that is, if you don’t supply an argument for a default parameter, the compiler substitutes one in for you when the code is compiled). However, function pointers are resolved at run-time. Consequently, default parameters cannot be resolved when making a function call with a function pointer. You’ll explicitly have to pass in values for any defaulted parameters in this case.
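For instance, with a hypothetical scale() function that has one defaulted parameter, the direct call may omit the argument but the call through the pointer may not:

```cpp
// Hypothetical function with a default argument for its second parameter.
int scale(int value, int factor = 2) { return value * factor; }

int callDirectly()
{
    return scale(5); // the compiler rewrites this as scale(5, 2)
}

int callThroughPointer()
{
    int (*fcnPtr)(int, int) = scale; // the pointer's type has no notion of defaults
    // return fcnPtr(5);             // error: too few arguments
    return fcnPtr(5, 2);             // every argument must be passed explicitly
}
```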
Providing default functions
If you’re going to allow the caller to pass in a function as a parameter, it can often be useful to provide some standard functions for the caller to use for their convenience. For example, in the selection sort example above, providing the ascending() and descending() functions along with the selectionSort() function would make the caller’s life easier, as they wouldn’t have to rewrite ascending() or descending() every time they want to use them.
You can even set one of these as a default parameter:
// Default the sort to ascending sort
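The selection sort listing itself is not reproduced in this section, so the following is a reconstruction under assumptions rather than the original code: the signatures of selectionSort(), ascending(), and descending() are guesses based on how the text describes them.

```cpp
#include <utility> // std::swap

// Comparison functions: return true if x should be placed after y.
bool ascending(int x, int y)  { return x > y; }
bool descending(int x, int y) { return x < y; }

// Default the sort to ascending sort
void selectionSort(int* array, int size, bool (*comparisonFcn)(int, int) = ascending)
{
    for (int startIndex = 0; startIndex < size - 1; ++startIndex)
    {
        // bestIndex tracks the element that belongs at startIndex
        int bestIndex = startIndex;
        for (int currentIndex = startIndex + 1; currentIndex < size; ++currentIndex)
        {
            // If the current best should come after this element, this element is the new best
            if (comparisonFcn(array[bestIndex], array[currentIndex]))
                bestIndex = currentIndex;
        }
        std::swap(array[startIndex], array[bestIndex]);
    }
}
```

With this declaration, selectionSort(array, size) sorts ascending, and the caller can still pass descending (or their own comparison) as the third argument.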
Making function pointers prettier with typedef or type aliases
typedef bool (*validateFcn)(int, int);
In C++11, you can instead use type aliases to create aliases for function pointer types:
using validateFcn = bool(*)(int, int); // type alias
This reads more naturally than the equivalent typedef, since the name of the alias and the alias definition are placed on opposite sides of the equals sign.
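As a sketch of how the alias tidies up a function signature, here is validateFcn used as a parameter type. The functions validate() and isSumEven() are hypothetical and exist only to exercise the alias.

```cpp
using validateFcn = bool(*)(int, int); // type alias from the text

// Hypothetical validation function whose type matches validateFcn.
bool isSumEven(int x, int y) { return (x + y) % 2 == 0; }

// Without the alias this would read:
//   bool validate(int x, int y, bool (*fcnPtr)(int, int))
bool validate(int x, int y, validateFcn pfcn)
{
    return pfcn(x, y);
}
```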
Using std::function in C++11
Introduced in C++11, an alternate method of defining and storing function pointers is to use std::function, which is part of the standard library <functional> header.
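A minimal sketch of std::function in use, with placeholder functions foo() and goo() invented for the example:

```cpp
#include <functional>

// Placeholder functions that share the signature int().
int foo() { return 5; }
int goo() { return 6; }

int callBoth()
{
    // A function wrapper for anything callable that takes no parameters and returns int
    std::function<int()> fcnPtr = foo;
    int total = fcnPtr(); // calls foo()

    fcnPtr = goo;         // the wrapper can be re-pointed at another matching function
    total += fcnPtr();    // calls goo()

    return total;
}
```

The return type and parameter list go inside the angle brackets, which some readers find easier to parse than raw function pointer syntax.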
The stack and the heap
Stack overflow
The stack has a limited size, and consequently can only hold a limited amount of information. On Windows, the default stack size is 1 MB. On some Unix machines, it can be as large as 8 MB. If the program tries to put too much information on the stack, stack overflow will result. Stack overflow happens when all the memory in the stack has been allocated; in that case, further allocations begin overflowing into other sections of memory.
Here is an example program that will likely cause a stack overflow. You can run it on your system and watch it crash:
int stack[100000000];
Another example:
void foo() {
std::vector capacity and stack behavior
Although its use as a resizable dynamic array is the most useful and commonly used part of std::vector, std::vector has some additional attributes and capabilities that make it useful in other capacities as well.
Length vs capacity
int *array = new int[10] { 1, 2, 3, 4, 5 };
We would say that this array has a length of 10, even though we’re only using 5 of the elements that we allocated.
However, what if we only wanted to iterate over the elements we’ve initialized, reserving the unused ones for future expansion? In that case, we’d need to separately track how many elements were “used” from how many elements were allocated. Unlike a built-in array or a std::array, which only remembers its length, std::vector contains two separate attributes: length and capacity. In the context of a std::vector, length is how many elements are being used in the array, whereas capacity is how many elements were allocated in memory.
Taking a look at an example from the previous lesson on std::vector:
std::vector<int> array { 0, 1, 2 };
In this case, the resize() function caused the std::vector to change both its length and capacity. Note that the capacity is guaranteed to be at least as large as the array length (but could be larger), otherwise accessing the elements at the end of the array would be outside of the allocated memory!
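The relationship between the two attributes can be checked with size() and capacity(). In this sketch, the VectorStats struct and the resizeExample() function are hypothetical helpers; the exact capacity reported depends on the standard library implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type for reporting both attributes at once.
struct VectorStats
{
    std::size_t length;
    std::size_t capacity;
};

VectorStats resizeExample()
{
    std::vector<int> array { 0, 1, 2 };
    array.resize(5); // set length to 5; capacity grows to at least 5

    // size() reports the length, capacity() reports the allocated element count
    return { array.size(), array.capacity() };
}
```

The only guarantee here is that the capacity is at least 5. Whether it is exactly 5 is up to the implementation.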
Why differentiate between length and capacity? std::vector will reallocate its memory if needed, but like Melville’s Bartleby, it would prefer not to, because resizing an array is computationally expensive. Consider the following:
std::vector<int> array;
Array subscripts and at() are based on length, not capacity
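One way to observe this is to give a vector spare capacity and then ask at() for an index that lies within the capacity but beyond the length. The function name below is hypothetical.

```cpp
#include <stdexcept>
#include <vector>

bool atRejectsIndexBeyondLength()
{
    std::vector<int> array { 0, 1, 2 };
    array.reserve(10); // capacity is now at least 10, but the length is still 3

    try
    {
        // Index 5 is inside the allocated capacity but outside the length,
        // so at() throws instead of returning an element.
        int value = array.at(5);
        static_cast<void>(value); // not reached; silences unused-variable warnings
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
    return false;
}
```

Using array[5] here instead would not throw. It would be undefined behavior, because operator[] performs no bounds checking.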
Vectors may allocate extra capacity
When a vector is resized, the vector may allocate more capacity than is needed. This is done to provide some “breathing room” for additional elements, to minimize the number of resize operations needed.
Handling errors, cerr and exit
Problem: When a function is called, the caller may have passed the function parameters that are semantically meaningless.
void printString(const char *cstring) {
Can you identify the assumption that may be violated? The answer is that the caller might pass in a null pointer instead of a valid C-style string. If that happens, the program will crash. Here’s the function again with code that checks to make sure the function parameter is non-null:
void printString(const char *cstring) {
cerr is a mechanism that is meant specifically for printing error messages. cerr is an output stream (just like cout) that is defined in <iostream>.
Assert and static_assert
An assert statement is a preprocessor macro that evaluates a conditional expression at runtime.
Making your assert statements more descriptive
Fortunately, there’s a little trick you can use to make your assert statements more descriptive. Simply add a C-style string description joined with a logical AND:
assert(found && "Car could not be found in database");
Here’s why this works: A C-style string always evaluates to boolean true. So if found is false, false && true = false. If found is true, true && true = true. Thus, logical AND-ing a string doesn’t impact the evaluation of the assert.
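Here is a small sketch of an assert with a description attached. The function getArrayValue() and the fixed length of 10 are assumptions chosen for the example.

```cpp
#include <array>
#include <cassert> // for assert()
#include <cstddef>

int getArrayValue(const std::array<int, 10>& array, int index)
{
    // We're asserting that index is between 0 and 9. The string literal is
    // always true, so it only adds text to the failure message.
    assert(index >= 0 && index <= 9 && "index must be between 0 and 9");

    return array[static_cast<std::size_t>(index)];
}
```

If the assertion fails in a debug build, the program terminates and the diagnostic includes the failed expression, description string and all.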
NDEBUG and other considerations
The assert macro comes with a small performance cost that is incurred each time the assert condition is checked. Furthermore, asserts should (ideally) never be encountered in production code (because your code should already be thoroughly tested). Consequently, many developers prefer that asserts are only active in debug builds. C++ comes with a way to turn off asserts in production code: if the macro NDEBUG is defined before <cassert> is included, assert is disabled.
static_assert
C++11 adds another type of assert called static_assert. Unlike assert, which operates at runtime, static_assert is designed to operate at compile time, causing the compiler to error if the condition is not true. If the condition is false, the diagnostic message is printed.
Here’s an example of using static_assert to ensure types have a certain size:
static_assert(sizeof(long) == 8, "long must be 8 bytes");
A few notes. Because static_assert is evaluated by the compiler, the conditional part of a static_assert must be able to be evaluated at compile time. Because static_assert is not evaluated at runtime, static_assert statements can also be placed anywhere in the code file (even in global space).
In C++11, a diagnostic message must be supplied as the second parameter. In C++17, providing a diagnostic message is optional.
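The sketch below shows static_assert both in global space and inside a function. It deliberately checks conditions that hold on every conforming compiler, since a check like sizeof(long) == 8 fails on platforms where long is 4 bytes (64-bit Windows, for example). The square() function is a hypothetical helper.

```cpp
// static_assert can appear in global space...
static_assert(sizeof(char) == 1, "char must be 1 byte");
static_assert(sizeof(long) >= sizeof(int), "long must be at least as large as int");

// A constexpr function can be evaluated at compile time, so its result
// is usable as a static_assert condition.
constexpr int square(int value) { return value * value; }

int squareOfSeven()
{
    // ...and inside a function body.
    static_assert(square(7) == 49, "square(7) must be 49");
    return square(7);
}
```

Every message is kept as the second argument so that the sketch also compiles as C++11.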
Ellipsis (and why to avoid them)
The best way to learn about ellipsis is by example. So let’s write a simple program that uses ellipsis. Let’s say we want to write a function that calculates the average of a bunch of integers. We’d do it like this:
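The original listing is not reproduced in this section, so what follows is a sketch of how such a function is commonly written with the macros from <cstdarg>. The name findAverage() and the convention of passing the argument count first are assumptions.

```cpp
#include <cstdarg> // va_list, va_start, va_arg, va_end

// The ellipsis must be the last parameter. count says how many
// additional int arguments the caller is passing, and must be greater than 0.
double findAverage(int count, ...)
{
    int sum = 0;

    // We access the ellipsis through a va_list
    va_list list;

    // Initialize the va_list: the second argument is the last non-ellipsis parameter
    va_start(list, count);

    for (int arg = 0; arg < count; ++arg)
    {
        // va_arg fetches the next argument; we must tell it the type to read
        sum += va_arg(list, int);
    }

    // Clean up the va_list when we're done
    va_end(list);

    return static_cast<double>(sum) / count;
}
```

For example, findAverage(5, 1, 2, 3, 4, 5) averages five values. Nothing checks that count matches the number of arguments actually passed, or that they really are ints, which hints at why ellipsis is best avoided.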