Chapter 14 - Notes

Preprocessor and the Compiler

The preprocessor, as the name suggests, runs before the compiler proper starts. It therefore plays a part in deciding what gets compiled, based on the directives you give it.

Preprocessor directives all start with the # sign:

  • Instructing the preprocessor to insert the contents of utils.h into the file:

#include "utils.h"
  • Defining a macro constant:

#define VALUE 50
  • Defining a macro function:

#define SQUARE(x) ((x) * (x))

Macros are about text substitution: once a macro is defined, the preprocessor replaces every instance of the macro name with its definition; nothing is computed or evaluated at this stage.

You can check the output of the preprocessor by invoking your compiler with the -E flag.

gcc -E main.cpp  > main.preprocessed

This will create a main.preprocessed file with the output of the preprocessor.

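If you include any standard library headers in your program, you may find the preprocessed output too verbose. One way to cut it down is the -nostdinc flag, which stops the preprocessor from searching the standard system include directories. The standard headers are then not found, so expect errors for those #include lines: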

gcc -E -nostdinc main.cpp  > main.preprocessed

Macro Constants

The syntax for composing a macro constant is as follows:

#define identifier value

The preprocessor will then go on to replace every instance of identifier with value.

Drawbacks

  • The preprocessor only makes "dumb" text substitutions, and does not check for the correctness of the substitution (the compiler does this).

  • You do not get to control the data type of values of macro constants; this can be circumvented by using const variables.
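The second drawback can be sketched as follows. The names MAX_USERS_MACRO, kMaxUsers and seatsRemaining are hypothetical and chosen only for illustration; the point is that the macro has no type of its own, while the const variable keeps the type it was declared with.

```cpp
#include <type_traits>

// Macro constant: plain text with no type attached. There is no trailing
// semicolon, since it would be pasted into every use of the macro.
#define MAX_USERS_MACRO 50

// Typed alternative (hypothetical name): the compiler knows this is an
// unsigned int and applies its usual type and scope rules to it.
const unsigned int kMaxUsers = 50;

// After substitution the macro is just the literal 50, which is an int.
static_assert(std::is_same<decltype(MAX_USERS_MACRO), int>::value,
              "the macro expands to an int literal");

// The const variable keeps the type chosen in its declaration.
static_assert(std::is_same<decltype(kMaxUsers), const unsigned int>::value,
              "the const variable keeps its declared type");

// The macro can still be used wherever a constant is needed.
int seatsRemaining(int taken) {
    return MAX_USERS_MACRO - taken;
}
```

With the macro, the only way to change the type is to change the literal itself (for example 50u), which is easy to overlook.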

Macros and Multiple Inclusion

Multiple Inclusion

Classes and functions are typically declared in header files (*.h) and the implementations are defined in source files (*.cpp).

Suppose a header file foo.h defines class Foo, and Foo uses class Bar, which is declared in another header file, bar.h. Then foo.h needs to include bar.h.

If the design is more complicated and Bar references Foo as well, then bar.h also needs to include foo.h.

For the preprocessor, two files that include each other form a recursive inclusion: left unchecked, the expansion would never terminate.

Solution

To avoid this problem, macros can be used with preprocessor directives #ifndef and #endif to prevent multiple inclusion.

  • foo.h

#ifndef FOO_H
#define FOO_H

#include "bar.h"
// ...

#endif
  • bar.h

#ifndef BAR_H
#define BAR_H

#include "foo.h"
// ...

#endif

If the preprocessor enters foo.h first and reaches #ifndef FOO_H, it finds that FOO_H has not been defined yet and proceeds. The very next line defines FOO_H, so the next time this file is included the #ifndef test fails and everything up to #endif is skipped.
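The effect of the guard can be sketched in a single file by writing out what the compiler sees when a guarded header is pasted in twice. The header name point.h, the struct Point and the function sumCoordinates are hypothetical and used only for this illustration.

```cpp
// --- first copy of the hypothetical point.h, as pasted in by #include ---
#ifndef POINT_H
#define POINT_H

struct Point {
    int x;
    int y;
};

#endif

// --- second copy of point.h, reached again through another header ---
#ifndef POINT_H   // POINT_H is already defined, so this block is skipped
#define POINT_H

struct Point {    // without the guard this would be a redefinition error
    int x;
    int y;
};

#endif

// Point is defined exactly once, so it can be used as normal.
int sumCoordinates(const Point& p) {
    return p.x + p.y;
}
```

Removing the two #ifndef/#endif pairs makes the second struct Point visible to the compiler, which then rejects the file.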

Macro Functions

The preprocessor can also replace text that matches a pattern with parameters, which allows macros to be used to write simple functions:

 #define SQUARE(x) ((x) * (x))
 #define PI 3.1416
 #define AREA_CIRCLE(r) (PI*(r)*(r))
 #define MAX(a, b) (((a) > (b)) ? (a) : (b))
 #define MIN(a, b) (((a) < (b)) ? (a) : (b))
  • Used for very simple calculations

  • As macro functions are expanded inline, they may help with code performance in certain cases

Drawbacks

Macros are not type sensitive. For example, SQUARE(x) yields an int when given an int and a double when given a double, and the preprocessor never checks whether the argument is of a sensible type at all.

Benefits

This type insensitivity can be beneficial, for example in the case of the MAX(a, b) and MIN(a, b) macro functions. Had they been normal functions, you would need two overloads of each, one for ints and one for doubles; and to compare an int with a double you would need four.

This reduction in lines of code is a slight advantage of using macros to define simple utility functions.

Parentheses?

It is curious that the AREA_CIRCLE macro is defined with so many brackets, when its code equivalent could be much simpler:

double areaCircle(double radius) {
    return PI * radius * radius;  // no brackets
}

Consider a "simplified" version of the macro:

#define AREA_CIRCLE(radius) (PI * radius * radius)

Invoking the macro like this:

AREA_CIRCLE(2 + 5);

Would result in the following expansion:

(PI * 2 + 5 * 2 + 5);  // Not the same as PI * 7 * 7 due to operator precedence
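The difference can be checked with a small sketch. AREA_CIRCLE_UNSAFE, unsafeArea and safeArea are hypothetical names introduced here so that both versions of the macro can live side by side.

```cpp
#define PI 3.1416

// "Simplified" version from above, renamed so both can be compared.
#define AREA_CIRCLE_UNSAFE(radius) (PI * radius * radius)

// Fully parenthesised version.
#define AREA_CIRCLE(r) (PI * (r) * (r))

// Expands to (3.1416 * 2 + 5 * 2 + 5), which is 6.2832 + 10 + 5.
double unsafeArea() {
    return AREA_CIRCLE_UNSAFE(2 + 5);
}

// Expands to (3.1416 * (2 + 5) * (2 + 5)), which is 3.1416 * 49.
double safeArea() {
    return AREA_CIRCLE(2 + 5);
}
```

The unparenthesised macro gives roughly 21.28, while the parenthesised one gives roughly 153.94, the intended area for a radius of 7.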

Summary

  • Plain text substitution can affect the result of the macro due to operator precedence

  • Parentheses help by ensuring that each argument is evaluated as a unit, making the macro independent of operator precedence at the call site

  • Due to the nuances of macro functions, often it is better to just write inline functions instead.

  • Alternatively, a template function (explored later) is a better way of defining type-independent generic functions without giving up type safety.
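One further nuance, not covered above, is that a macro argument is evaluated every time it appears in the expansion. The sketch below compares MAX with a hypothetical template function maxOf; nextValue and callCount are likewise hypothetical helpers that only exist to count evaluations.

```cpp
// MAX as defined earlier in these notes.
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical type-safe replacement; both arguments must share one type.
template <typename T>
inline T maxOf(T a, T b) {
    return (a > b) ? a : b;
}

// Hypothetical helper that records how often it is called.
int callCount = 0;
int nextValue() {
    ++callCount;
    return 10;
}

int callsMadeByMacro() {
    callCount = 0;
    // Expands to (((nextValue()) > (5)) ? (nextValue()) : (5)), so
    // nextValue() runs once for the comparison and once for the result.
    int result = MAX(nextValue(), 5);
    (void)result;
    return callCount;
}

int callsMadeByFunction() {
    callCount = 0;
    // A real function evaluates each argument exactly once.
    int result = maxOf(nextValue(), 5);
    (void)result;
    return callCount;
}
```

Here the macro version calls nextValue() twice and the function version calls it once, which matters whenever the argument has side effects.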

Macros to Validate Expressions

The assert macro in the assert.h header (cassert in C++) lets you check that an expression or variable value is what you expect.

When an assertion fails, the macro reports the failing expression together with the file and line number where it was encountered, and then aborts the program, making it a handy debugging feature.
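A minimal sketch of assert in use follows. The function elementAt and its parameters are hypothetical; the assertions document what the function expects from its caller.

```cpp
#include <cassert>

// Hypothetical helper: returns values[index] after checking the inputs.
int elementAt(const int* values, int size, int index) {
    // Each assert aborts the program with a diagnostic if its
    // expression evaluates to false.
    assert(values != nullptr);
    assert(index >= 0 && index < size);
    return values[index];
}
```

Note that defining NDEBUG before including the header (as release builds commonly do) turns every assert into a no-op, so assertions should not be relied on for checks that must always run.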

Template Classes

In Lesson 12 Exercise 1 we created a VectorInt class that replicates the behaviour of std::vector<int>. However, that class can only store integers.

If you were required to store a vector of floats then this would not work, unless we replicated most of the code into another class, perhaps called VectorFloat.

This is where template classes would be useful.

Purpose

Template classes, like template functions, are templatised versions of C++ classes.

When using a template class, you specify the type argument for the class you are instantiating (e.g. int for a vector class that stores a collection of integers, or float for a collection of floating-point numbers).

Example

An example of a simple templatised vector class may look like the following:

template <typename T>
class MyVector {
private:
    T* values;
    int size;
    int capacity;

public:
    MyVector(int capacity = 10) {
        this->size = 0;
        this->capacity = capacity;
        this->values = new T[capacity];
    }
    
    void push_back(T value) {
        if (size < capacity) {
            values[size] = value;
            size++;
        }
    }
    
    T& get(int index) {
        return values[index];
    }
    
    ~MyVector() {
        delete[] values;
    }
};

In our template class, the type T is used in multiple places:

  • The dynamic array type of the member variable values

  • The type of the parameter in the push_back function

  • The type of the value returned from the get function

T is a placeholder for the actual type, so the compiler creates one class for each distinct type that MyVector is instantiated with.

MyVector<int> vectorInt; 
vectorInt.push_back(1);

MyVector<float> vectorFloat;
vectorFloat.push_back(3.142);

MyVector<std::string> vectorString;
vectorString.push_back("Hello World"); 

In the above example we have created 3 instantiations of our MyVector template class: one for int, one for float and another for std::string.

Summary: Template classes define a pattern for classes, allowing one to implement said patterns for different data types that the template may be instantiated with.

Templates with Multiple Type Parameters

The template parameter list can also accommodate multiple type parameters separated by a comma.

template <typename T1, typename T2>
class Pair {
private:
    T1 first;
    T2 second;
public:
    Pair(const T1& first, const T2& second) {
        this->first = first;
        this->second = second;
    }
    
    T1& getFirst() {
        return first;
    }
    
    T2& getSecond() {
        return second;
    }
};

In this example, the Pair class accepts two template parameters named T1 and T2, and holds one value of each type.

The types for T1 and T2 do not have to be different.

Pair<int, double> pairIntDouble(10, 4.2);
Pair<float, float> pairFloatFloat(3.1, 3.2);

Templates with Default Type Parameters

Like function parameters, we can also provide default type parameters for templates:

template <typename T1=int, typename T2=int>
class Pair {
    // ...
};

Therefore, the construction of Pair objects containing a pair of integers can be simplified:

Pair<> pairIntInt(5, 120);
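Putting the pieces together, a complete version of Pair with default type parameters might look like the sketch below. The member-initialiser list in the constructor and the helper sumOfDefaultPair are choices made for this example, not something taken from the notes above.

```cpp
#include <type_traits>

template <typename T1 = int, typename T2 = int>
class Pair {
private:
    T1 first;
    T2 second;

public:
    // Initialise the members directly from the arguments.
    Pair(const T1& first, const T2& second) : first(first), second(second) {}

    T1& getFirst() { return first; }
    T2& getSecond() { return second; }
};

// Pair<> and Pair<int, int> name the same type.
static_assert(std::is_same<Pair<>, Pair<int, int>>::value,
              "both defaults are int");

// Supplying only the first argument leaves T2 at its default of int.
static_assert(std::is_same<Pair<double>, Pair<double, int>>::value,
              "T2 falls back to its default");

// Hypothetical helper showing the simplified construction.
int sumOfDefaultPair() {
    Pair<> pairIntInt(5, 120);
    return pairIntInt.getFirst() + pairIntInt.getSecond();
}
```

As with default function arguments, defaults are filled in from the right, so it is not possible to default T1 while specifying T2.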

Template Instantiation and Specialisation

Instantiation

All templates (functions and classes) are simply blueprints, and do not truly exist for the compiler until they have been used in one form or another.

To the compiler, a template that you define but don't use generates no code.

Once you instantiate a template, you are instructing the compiler to create a class/function for you using the template you have defined, substituting the types that you have provided as the template arguments.

In the context of templates, instantiation of a template is the creation of a specific function or class using template arguments.
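A small sketch of this behaviour, using two hypothetical function templates, twice and idOf. The first is instantiated by being called; the second is never used, so the compiler never generates a function from it.

```cpp
// Instantiated on demand: twice(21) creates twice<int>,
// and twice(1.5) creates twice<double>.
template <typename T>
T twice(T value) {
    return value + value;
}

// Never instantiated in this file. The call to id() depends on T, so
// the compiler only checks whether it exists when the template is
// instantiated with a concrete type.
template <typename T>
int idOf(T value) {
    return value.id();
}
```

Calling idOf(42) would trigger an instantiation for int and fail to compile, because int has no member function id().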

Specialisation

There may be situations that require you to explicitly define a different behaviour for a template of a specific type. This is where you specialise a template for that type.

Unspecialised templates are also known as base templates (e.g. base class template, or base function template); the C++ standard calls them primary templates.

Example

A specialisation of class Pair when instantiated with type parameters of both std::string would look like this:

template<>
class Pair<std::string, std::string> {
    // ...
};

A template specialisation must appear after the declaration of the template it specialises.
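A fuller sketch of a specialisation with different behaviour is shown below. The member function describe() is hypothetical and was added only so that the two versions of Pair can be told apart.

```cpp
#include <string>

// Base template: used for every combination of types except the
// one that is specialised below.
template <typename T1, typename T2>
class Pair {
private:
    T1 first;
    T2 second;

public:
    Pair(const T1& first, const T2& second) : first(first), second(second) {}

    std::string describe() const { return "generic pair"; }
};

// Full specialisation for two strings. It is an independent class and
// shares no members with the base template unless they are written again.
template <>
class Pair<std::string, std::string> {
private:
    std::string first;
    std::string second;

public:
    Pair(const std::string& first, const std::string& second)
        : first(first), second(second) {}

    // Different behaviour: join the two strings with a space.
    std::string describe() const { return first + " " + second; }
};
```

The compiler picks the specialisation automatically whenever both template arguments are std::string; all other instantiations use the base template.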

Template Classes and Static Members

Recall that static member variables are shared across all instances of a class; moreover, they do not require an instance to access.

For template classes, static member variables are shared across all objects of a template class with the same template instantiation.

template <typename T>
struct Foo {
    static int count;
};

template <typename T>
int Foo<T>::count = 0;  
// All instantiations of template class `Foo`
// will have the static member variable `count`
// start from 0 

This illustrates the idea that the compiler creates a distinct class for each instantiation of Foo. Static member variables are therefore tied to a particular instantiation of a template class, not shared across all of them.
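The separation can be observed with a small sketch. Counted is a hypothetical variant of Foo whose constructor increments the static counter, which is an addition made for this example.

```cpp
// Hypothetical variant of Foo that counts how many objects are created.
template <typename T>
struct Counted {
    static int count;

    Counted() { ++count; }
};

// One definition in the source, but one separate variable
// per instantiation of Counted.
template <typename T>
int Counted<T>::count = 0;

// Creates two Counted<int> objects and one Counted<double> object.
void createInstances() {
    Counted<int> a;
    Counted<int> b;
    Counted<double> c;
}
```

After one call to createInstances(), Counted<int>::count is 2 and Counted<double>::count is 1, since each instantiation owns its own counter.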

Performing Compile-Time Checks

Just as static_cast checks at compile time that a cast is valid, static_assert is a compile-time assert that checks a condition while the program is being compiled and stops compilation if the condition is false.

This is especially useful for templates, for example if there is a template that you do not want instantiated for integers:

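#include <type_traits>  // std::is_same

template <typename T>
class AllButInt {
public:
    AllButInt() {
        // Compare the types themselves: a sizeof comparison would also
        // reject unrelated types that happen to be the same size as int.
        static_assert(
            !std::is_same<T, int>::value,
            "AllButInt cannot be instantiated for an integer"
        );
    }
};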

int main() {
    AllButInt<int> test;  // Will not compile!
}

The above program fails to compile, with an error message that includes the text: AllButInt cannot be instantiated for an integer.
