Knowledge Dump

Functions

This article covers the use of functions in C++ (version 17), as well as some keywords related to them.

Syntax
A basic function is defined by writing the return type, followed by the function name, the parameter list in parentheses and a pair of curly braces containing the function body, e.g. int addition(int a, int b) {...}. If a function does not return a value, the return type is void. A function can also be declared early in the code without a body (and without parameter names), with the actual definition following later. For this, the curly braces are omitted and a semicolon is added at the end. Such a forward declaration makes the function known to the compiler before its definition, which is necessary when functions call each other (mutual recursion), and can also improve readability. Example:
int foo1(int);				//Declare without implementation (and input variable names).
int foo2(int);				//This is necessary, since both functions potentially call each other.

int foo1(int a) {			//Doubles positive integers. Otherwise, pass to foo2.
	if (a > 0) { return 2 * a; }
	else { return foo2(a); }	//foo2 needs to be declared at this point.
}

int foo2(int a) {			//Doubles non-positive integers and changes sign. If positive, pass to foo1.
	if (a <= 0) { return (-2)*a; }
	else { return foo1(a); }	//foo1 already declared (and implemented).
}
Parameters can be given default values by adding an equal sign and the desired value to parameters at the end of the parameter list. Default values do not have to be constants (global variables and other expressions are allowed), and a default can be redeclared, but only in a different scope.
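A minimal sketch of default arguments, assuming a function sum and two global variables var1 and var2. The names follow the discussion below; the parameter types and the values 1 and 2 are assumptions made for illustration:

```cpp
int var1 = 1;	//Assumed value, chosen for illustration.
int var2 = 2;	//Assumed value, chosen for illustration.

//Both parameters have defaults. Defaults need not be constants;
//they are evaluated anew at every call that relies on them.
int sum(int a = var1, int b = var2) {
	return a + b;
}

//sum()      uses both defaults: var1 + var2.
//sum(11)    replaces only the first argument: 11 + var2.
//sum(11, 5) replaces both arguments.
```

Because the defaults are evaluated at call time, changing var2 later also changes the result of a subsequent sum(11).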

Note that arguments are always matched to parameters from left to right, so when several default arguments are declared, only trailing arguments can be omitted in a call. For a function sum whose two parameters default to var1 and var2, this means that we can call sum(11), which translates to sum(11, var2). However, there is no shortcut for replacing var2 while keeping var1 as default, i.e. sum(var1, 11) is not possible without explicitly typing both arguments.
Call by value vs. call by reference
There are two ways of passing arguments to functions: by copying the value of the variable (call by value) or by referring to the original variable (call by reference). The latter is selected by adding an ampersand (&) after the parameter type in the declaration. Call by reference is similar to using pointers, but differs in some respects. Most notably, a reference must be bound to an existing object when it is created and cannot be null or rebound afterwards (taking the address of a reference yields the address of the original), while a pointer can hold any address, including null or invalid ones, and dereferencing such a pointer causes undefined behavior. Thus, references are usually preferred over pointers as function parameters.

Whether to pass by value or by reference depends on the individual case. If the function body modifies the parameter but the original variable must stay unchanged, pass by value, since the function then only works on a copy. Conversely, if the function is meant to modify its argument, pass by reference.
Also note that passing by value is not suitable for all types: an object may be too large to copy efficiently, or its type may not have an accessible copy constructor at all. In these cases, passing by reference is preferred (a const reference, if the function must not modify the object). Example:
int sum(int a, int b) {			//Small sized input and original variables are to remain unchanged => Call by value.
	return a + b;
}

void triple(int &a) {			//Function is to triple input => Call by reference.
	a *= 3;
}

void triple_p(int *a) {			//Alternative: Call by pointer.
	*a *= 3;
}
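The two remaining cases, large objects and types without a copy constructor, might look as follows. This is a minimal sketch; the function names total and reset_to_zero are hypothetical, and std::vector and std::unique_ptr merely stand in for "large" and "non-copyable" types:

```cpp
#include <memory>
#include <numeric>
#include <vector>

//Large input that must not be modified => Call by const reference (no copy is made).
long total(const std::vector<int> &values) {
	return std::accumulate(values.begin(), values.end(), 0L);
}

//std::unique_ptr has no copy constructor, so it cannot be copied into the function.
//A reference lets the function work on the caller's object in place.
void reset_to_zero(std::unique_ptr<int> &p) {
	if (p) { *p = 0; }
}
```

The const qualifier on the vector parameter lets the compiler reject accidental modifications inside total, which keeps the safety of call by value without the cost of the copy.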
Overloading
It is possible to define several functions with the same name, as long as their parameter lists differ in the number or types of parameters (a different return type alone is not enough). This is called overloading and is typically used for functions that do essentially the same thing, but have to work differently depending on the type or number of arguments. If there are many different types and no real difference in implementation, templates might be preferred. Example:
int sum(int a, int b) {
	return a + b;
}

int sum(int a, int b, int c) {
	return a + b + c;
}
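The example above overloads on the number of parameters; overloading on the parameter type works the same way. A minimal sketch, with describe as a hypothetical function name:

```cpp
#include <string>

//The compiler selects the overload whose parameter type matches the argument best.
std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }
std::string describe(const std::string &) { return "string"; }
```

A call such as describe(1) selects the int version, while describe(1.5) selects the double version.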
Templates
Instead of overloading a function for several input types with the same function body, a function template can be used. A template replaces one or more concrete types with placeholders (template parameters) that are fixed at compile time for each use. When such a function is called, the compiler creates a separate instance for each combination of types used (generic programming).
A template is declared by adding, for example, template <class identifier1, class identifier2> (for two template type parameters) before a function declaration. Instead of the keyword class, typename can be used with the same meaning. Within the function, the identifiers can then be used like any other data type.
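A minimal sketch of both variants, one and two template type parameters. The names sum and sum_mixed are chosen for illustration only:

```cpp
//T is a placeholder for any type that supports operator+.
template <class T>
T sum(T a, T b) {
	return a + b;
}

//Two placeholders: the types of the two inputs may differ.
//The return type is deduced from the expression a + b (C++14 and later).
template <typename T1, typename T2>
auto sum_mixed(T1 a, T2 b) {
	return a + b;
}
```

Calling sum(2, 3) lets the compiler deduce T as int, while sum<double>(1.5, 2) sets T explicitly, so that the second argument is converted to double.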


Function templates offer a mechanism similar to overloading, called template specialization. After a template has been defined, a specialization provides a different function body for one specific type replacing the generic placeholder, either to support types that would not work properly with the generic function body, or to increase efficiency through a custom implementation.
A specialized function template is defined almost like a normal function, except that template <> is written before the declaration and the template argument is specified after the function name, e.g. int sum<int>(int a, int b). The specialization has to be declared after the original template, in a scope where the original template itself could be defined. For member function templates, placing the specialization at namespace scope after the class definition is the portable choice, since some compilers have historically rejected specializations written inside the class/struct body.
We speak of explicit (full) template specialization when all template parameters are specified, and of partial template specialization when some remain generic. Both kinds work for class/struct templates, but function templates cannot be partially specialized: all template parameters have to be assigned a specific type. If the template parameters appear in the function's parameter list, this restriction is easily overcome by overloading the function instead. Otherwise, more extensive workarounds are needed.
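A minimal sketch of an explicit specialization and of the overloading workaround, assuming a generic sum template. The choice of std::string and the pointer overload are illustrative assumptions:

```cpp
#include <string>

//Primary template: works for any type that supports operator+.
template <class T>
T sum(T a, T b) {
	return a + b;
}

//Explicit specialization for std::string: inserts a space between the inputs.
template <>
std::string sum<std::string>(std::string a, std::string b) {
	return a + " " + b;
}

//Function templates cannot be partially specialized. An overload taking
//pointers covers "any pointer type" instead: it adds the pointed-to values.
template <class T>
T sum(T *a, T *b) {
	return *a + *b;
}
```

For pointer arguments, the compiler prefers the second template because it is the more specialized match, so the generic body, which would try to add two pointers, is never instantiated for them.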


Lastly, function templates can take not only type parameters, but also one or more non-type parameters of a specific type (e.g. int). Such a template is instantiated with a constant expression of that type. Since every distinct value produces a separate instance, instantiating the function with many different values can bloat the executable. Example:
#include <iostream>

template <class T, int N>		//One generic and one fixed data type.
int sum(T a) {
	return a + N;			//There may be problems with type conversion if T not int.
}

int main() {
	std::cout << sum<int, 5>(7);	//Instantiates a sum function that adds 5 to input.
					//Output: 12.
	return 0;
}
Inline functions
Each function call incurs a small overhead, which is insignificant for large functions but can be noticeable for small ones that are called frequently. Declaring a function inline, by adding the keyword inline before the function declaration, hints to the compiler that it should insert the function body at the call site instead of emitting a regular call (inlining). However, the compiler is not bound by this hint and inlines functions, with or without the keyword, according to what it deems optimal. Using inline for this purpose has therefore become mostly obsolete and is seen, if at all, on small, frequently called functions.

The inline keyword has another property, and hence another use case: an inline function may be defined in multiple translation units without producing a linker error. Hence, an inline function can, for example, be defined in a header file that is included in two .cpp files which are compiled and linked together. For this to work properly, the definition has to be identical in all translation units; otherwise the behavior is undefined, and the outcome may depend, for example, on the order in which the files are compiled and linked. Ensuring this is the programmer's task, since the compiler and linker are not required to detect the mismatch and usually issue neither an error nor a warning.
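A minimal sketch of such a header. The file name math_utils.h and the function square are hypothetical:

```cpp
//math_utils.h (hypothetical header, included by several .cpp files)
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

//'inline' allows this definition to appear in every translation unit that
//includes the header without causing a multiple-definition linker error.
inline int square(int x) {
	return x * x;
}

#endif //MATH_UTILS_H
```

Without the inline keyword, including this header in two .cpp files of the same program would define square twice and typically fail at link time. Keeping the definition in a single header also ensures that all translation units see the same function body.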