Level 0 C++ Tutorial

As a note, this tutorial assumes that you are using a Unix-based machine. If you are using a Windows machine, you will need to install a Linux virtual machine or use the Windows Subsystem for Linux (WSL). Instructions for this are provided here.

A Note on C++

C++ is a complicated language, and probably the most technically demanding of the four we use at TUAS. Pointers and manual memory management can be incredibly difficult to understand, and even more difficult to debug. This is probably why more modern languages, and even modern C++ itself, have introduced features like garbage collection (e.g. Java, Python, Go...) and smart pointers (e.g. C++, Rust somewhat) to get away from manual memory management. Even with smart pointers and more modern features, however, C++ is still a very complicated language, and possibly even more so because of them. However, it is also incredibly powerful in that it provides great performance and low-level control. It is for this reason that we use it on our onboard computer, where hardware interoperability and performance matter more than they do for our software that runs on the ground.

If this previous paragraph meant nothing to you, then don't worry! You will learn all of this in time. For now, just focus on the very basics by following this tutorial. And like all good tutorials, we'll start with a Hello World example.

1. "Hello World!" Program

1.1 Overview

This first section will deal with getting a working C++ program up and running. Once this is up and running, you may wish to either continue following the tutorial or go off on your own. Another option would be to try each section on your own by reading the overview at the beginning, and then comparing your result with the code provided before moving on. Either way, by the end of this section you will have a working C++ skeleton program to build off of.

1.2 Installing C++

Most Linux distributions should come installed with g++, a C++ compiler. I am unsure if Macs come with it pre-installed. To check if you have it installed, you can run the following command in your terminal:

g++ --version
If you have it installed, then you should see something like this:
g++ (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
If you already have it installed, that's great! If not, then you will need to install it.

Linux

To install g++ on Ubuntu or any Debian-based system, you can run the following commands:

sudo apt update
sudo apt install g++

If you are running a non-Debian system, then you should Google the instructions for your respective system.

Mac

To install g++ on a Mac, you can run the following Homebrew command: brew install gcc. (Note: this hasn't been verified on a Mac system yet, so if someone could check to see if this works that would be great). You can also use another compiler, as I believe gcc/g++ is less commonly used on Mac, where clang is the usual default. If so, just know that the tutorial will contain commands for g++, so you'll have to find the equivalent command for whatever compiler you choose.

1.3 Setting up a Development Environment

Now that you have g++ installed, you can start to create a working development environment. To do this, you will need to create a directory for your project. I would recommend creating a directory called tritonuas or tuas in your home directory, and then putting your project somewhere inside of it. For my running example, I will be doing it inside of ~/tuas/onboarding/level_0/cpp. Since you probably won't be implementing this in four different languages, something like ~/tuas/level_0_cpp would probably be fine. But it literally does not matter, so do whatever you want.

To set up your directory, you can run the following commands:

cd # Go to your home directory (Equivalent to cd ~)
mkdir tuas # if it does not already exist
cd tuas # Go to your tuas directory. 
mkdir level_0_cpp # make directory for this project
cd level_0_cpp # go into the level_0_cpp directory
You could have done this all in two commands, regardless of whether the directories already existed, by entering the following:
mkdir -p ~/tuas/level_0_cpp
cd ~/tuas/level_0_cpp

  • Side Note: I will not be explaining every detail of the commands I run. If you are confused about what a command does or what a command's argument does, like for example the -p, a great resource is the man command. To use it on mkdir, enter the following command
    man mkdir
    
    This will give output like this:
        mkdir - make directories
    
    SYNOPSIS
        mkdir [OPTION]... DIRECTORY...
    
    DESCRIPTION
       Create the DIRECTORY(ies), if they do not already exist.
    
       Mandatory arguments to long options are mandatory for short options too.
    
       -m, --mode=MODE
              set file mode (as in chmod), not a=rwx - umask
    
       -p, --parents
              no error if existing, make parent directories as needed, with their file modes unaffected by any -m option.
    
       -v, --verbose
              print a message for each created directory
    
       -Z     set SELinux security context of each created directory to the default type
    
       --context[=CTX]
              like -Z, or if CTX is specified then set the SELinux or SMACK security context to CTX
    
       --help display this help and exit
    
       --version
              output version information and exit
    AUTHOR
        Written by David MacKenzie.
    
    Which lets us know that -p creates any missing parent directories and does not report an error if a directory already exists.

Now that you are in your directory, it would be a good idea to set up a git repository. This is not necessary since you are likely working alone, but it is still good practice so we recommend doing it. To do this, you can run the following command:

git init
Then go to GitHub and create a new repository. You can name it whatever you want. You should be able to follow the instructions on the website for how to connect it with the local repository you just created.

With Git now in place, you're almost ready to start writing code. First, enter the following command:

touch main.cpp

This will create an empty file called main.cpp. Now, to open the current directory in VSCode, enter the following command:

code .

With VSCode open, you can open the main.cpp file you created.

Note: You probably already knew how to create a file and open VSCode on your own without this tutorial telling you how, but I wanted to do it entirely with the command line to start normalizing the experience. If you were really hardcore, you could do this entire project in something like vim and never even use an editor with a GUI. But learning to use VSCode is a skill of its own, and is greatly beneficial in larger code bases. So, I would recommend learning how to use it, or at least something with LSP (Language Server Protocol) support. If you don't know what LSP is, it is essentially what most modern IDEs and editors use to provide things like syntax highlighting, code completion, and other useful features for a variety of languages.

With the mention of LSP, before starting to code you will probably want to install the C++ VSCode extension. Just search C++ in the extensions tab and install the most popular one provided by Microsoft. This will provide you with syntax highlighting and other useful features.

1.4 Writing and Running Hello World

Now with that we are actually ready to start coding. I'm just going to provide an entire Hello World example here, because it makes more sense to just provide it and then analyze it.

#include <iostream>

int main() {
    std::cout << "Hello World!" << std::endl;
    return 0;
}

Before we analyze it, however, let's make sure we can run it. To compile the program, you can run the following command:

g++ main.cpp
This will create an executable called a.out. To run it, you can run the following command:
./a.out
If you wanted to give the executable a different name, like hello_world.out, you could have done this instead:
g++ main.cpp -o hello_world.out
Which then would have been followed by:
./hello_world.out

If all is correct, "Hello World!" should have been outputted to your terminal. If not, then there are some possible things that could have gone wrong.

  1. Make sure your code is exactly the same as the code provided above. If you are missing a semicolon or something, then it will not compile.
  2. Make sure you are in the same directory as your main.cpp file. If you are not, then you should move into that directory, or provide the path to the file (either the full path or the path relative to where you are).
  3. Make sure you have g++ installed. If you don't, then you will need to install it. See the section above on how to do this.
  4. Make sure that your executable file is marked as executable. To do this, enter chmod +x a.out, or whatever name you gave it.

If none of these solves your problem, then I would try Googling the error message you are getting. If you are still stuck, then feel free to ask for help on Slack or Discord.

1.5 Analyzing Hello World

By this point, you should have a working "Hello World!" program. Now, let's analyze it.

The first line looks like this:

#include <iostream>
This includes iostream, a part of the standard library, into your code. Iostream is short for "input and output stream," which means it allows your program to read from its input stream and write to its output stream. If this talk of streams doesn't make sense, don't worry. For now, just know that this line allows you to use std::cout, which we'll talk about in a bit.

The standard library is a collection of libraries that come with every C++ compiler. This means that you can use them without having to install anything. This is in contrast to third party libraries, which you will need to install. The standard library is incredibly useful, and you will likely use it in every C++ program you write. You can view the documentation for the standard library here. For now, don't worry too much about everything it provides. Just know that it provides a lot of useful stuff.

The next line looks like this:

int main() {
This defines a function called main which serves as the entry point into the program. If you have programmed in other languages, especially Java, this probably looks familiar. The int means that the function returns an integer, the () means that it takes no parameters, and the { begins the function body. The } ends the function body. We'll go more in depth about functions later, so for now it's fine to just know that everything inside main is where your code goes.

The next line looks like this:

std::cout << "Hello World!" << std::endl;
This is the line that actually outputs the text "Hello World!" to the terminal. The first part, std::cout, means that we are using something from the standard library called cout (strictly speaking it is a stream object rather than a function, but that distinction is not important yet). Any name you use from the standard library has to be prefaced by std:: to specify that you are referring to the version defined in the standard library. However, for most names you probably won't have any namespace collisions (i.e. two things with the same name, like if you defined your own cout), so you might think that this is extremely annoying to have to type out every time. And for such a small program like this, you are right. However, for larger programs, it is very useful to know exactly where a name is coming from, and it is also useful to know that you are using the standard library version of it. And if you were writing your own third party library for others to use, it gets a lot more complicated.

For this tutorial, I will be explicitly typing out std:: every time so it is very clear what functions are from the standard library. This is also the rule we follow for all of our C++ projects at TUAS. However, for this tutorial and other small programs you write, it is totally fine to do what I am about to show. If you want to use cout without having to type std:: every time, you can add the following line to the top of your file:

using namespace std;
With this, your Hello World would look like this:
#include <iostream>

using namespace std;

int main() {
    cout << "Hello World!" << endl;
    return 0;
}
Notice that now you do not need the std:: before cout and endl. This is because you are now telling the compiler that you want to use the std namespace for everything in this file. So, if the compiler finds a function that does not exist in your file, it will look in the std namespace for it.

Now that we have the std:: prefix out of the way, we can talk about what the line actually is doing. The << is called the stream insertion operator. It is used to insert things into the output stream. In contrast, the >> is called the stream extraction operator. It is used to extract things from the input stream. We will talk more about streams later, but for now just know that << is used to output things to the terminal, and >> is used to get input from the terminal. We'll see the >> operator in a little bit.

So in total, the line is putting the string "Hello World" into the output stream (outputting it to your terminal), and then sending the std::endl to the output stream. The std::endl is also from the standard library, and it is used to insert a newline character into the output stream. This is why "Hello World" is on its own line. If you wanted to output "Hello World" without a newline character, you could have omitted the std::endl and just done std::cout << "Hello World!"; However, in this case that doesn't really make sense because if you didn't output a newline, the next terminal prompt would appear right after the Hello World, like this:

<tyler cpp> $ ./a.out 
Hello World!<tyler cpp> $ 

There are situations where you would want to do this, like if you were planning to output text to the terminal later on that you wanted to be on the same line.

If you have used other languages, you probably have seen the \n character before. This is the general newline character, and you could have also used it instead of std::endl, and it would have done basically the same thing. However, there is one main difference. When using std::endl, it flushes the output stream after inserting the newline. To fully understand what this means you would need to have a knowledge of streams, but at a high level when you put text to an output stream with std::cout it doesn't necessarily output it to the terminal immediately. It might wait until it has a lot of text to output, or it might wait until it can squeeze it between other things going on. However, when you use std::endl, it will output everything in the output stream immediately: it "flushes" the output stream.

The main takeaway from this is that as a new C++ programmer you don't really need to care right now about the difference between std::endl and \n, but you should know that if you are about to output many newlines then you probably don't want to use std::endl until the end because it is more efficient to flush the output stream once at the end than to flush it every time you output a newline.

Lastly, the final line is this:

return 0;

If you remember when we talked about the main function earlier, we said that the int meant the function returns an integer. This is what return 0 is doing: it is returning the value 0. When you write functions in your own code, you yourself will usually use the return value in some way. However, the main function is called by your system when you run the executable, or possibly by another program, so main's return value tells your system or that other program something about the execution of your program: whether or not an error occurred. If the return value is 0, then it means that the program ran successfully. If it is anything else, then it means that an error occurred. This is why you will often see programs that return 1 when an error occurs. Any program can define whatever encoding scheme it wants for error values, as long as 0 means that it was successful.

And one last thing: you probably noticed that every statement inside main ended with a semicolon. Unlike in Go, Python, or JavaScript, this is required in C++, and forgetting one is probably one of the most common errors in the language. So while it isn't that complicated, it's still important to point out.

So... that was an awful lot of words to explain a very simple program. But, I think it is important to not skip over the small details because as you learn more and more of the language you will start to see how everything fits together. For now, however, it's enough to just know that the more complex things mentioned here, like streams and functions, are something that exist in C++ and that you will learn about them in time.

1.6 Hello World Finished

With that we have finished the Hello World program. At this point you will probably want to make a git commit. To do this, you can run the following commands:

git add main.cpp
git commit -m "Add Hello World program"

Note: g++ will have created the a.out executable. You should add this to a .gitignore file so that it is not tracked by git. To do this, you can run the following command:

echo "*.out" >> .gitignore
Essentially this will automatically ignore any file that ends in .out. You can also just create a .gitignore file manually and add it all through VSCode, but the above command is how you would do it all in the command line.

This will allow you to run commands like

git add .

without also adding the executable files. Generally, you should not push build files to a repository, as they can be generated from the source code and there may be differences between executables on different operating systems.

2. Getting User Input

2.1 Overview

Now that we have a working skeleton program, we can start to incrementally add new functionality. Generally when you are working on a project or task like this, you want to break it up into pieces like we are doing here. Right now these steps are very small because this tutorial is assuming this is your first time using C++. But if you were an experienced C++ programmer, your steps would likely be much wider in scope since you already have all the base knowledge you need. But for now, we will continue to break it down into small steps.

Instead of trying to implement all of the user input exactly described in the project writeup, we will even break that down into smaller steps, incrementally building up knowledge as we go.

2.2 Echoing User Input

To start, we will just echo back the user input we receive. We will need to use std::cin, which is the input counterpart to std::cout.

Here is a sample code snippet that gets the user input and then outputs it back to the terminal:

#include <iostream>
#include <string>

int main() {
    std::string input;    
    std::cout << "Enter a string: ";
    std::cin >> input;
    std::cout << "You entered: " << input << std::endl;
    return 0;
}

To start, we define a variable of type std::string called input. This is where we will store the user input. It is important to note that the string type in C++ is defined in the standard library, so we need to #include <string> at the top of the file. This lets us use the std::string type, and like other languages we use it to store sequences of characters.

If you are familiar with C you will know that in C you use the char* type to store strings. In C++ you should basically never do this, with some rare exceptions, as the std::string type is much more powerful and easier to use than the C string type. If you don't know what char* means, don't worry.

It is also important to point out the difference between the int type and the std::string type. The int type is a primitive because it is defined by the language itself. The std::string type, like we mentioned, is not a primitive because it is defined in the standard library. There are an annoying number of primitive types in C++, and they're all basically just variations on the ones you would expect, so I won't list them all here, but the most relevant are these:

  • int: integer (typically 32 bits)
    • e.g. 5
  • float: floating point number (32 bits)
    • e.g. 5.1f
  • double: floating point number (64 bits)
    • e.g. 5.1
    • You should basically always use double unless you're trying to optimize memory usage, probably on some sort of embedded system.
  • char: character
    • e.g. 'A'
  • bool: boolean
    • e.g. true and false

You'll notice that under char I wrote 'A' and not "A". This is intentional. In other languages like Python and JavaScript, there is no difference between using single quotes and double quotes unless you are trying to include some kind of quotation inside of another. However, in C++ this is not the case. If you use single quotes, then it is a character. If you use double quotes, then it is a string. So "A" is valid, because it is a string containing one character 'A', but 'ABC' is not a valid char because a char holds exactly one character (compilers will typically warn you about it).

Also, you should know that there are pointer versions of all of these types, denoted by an *. For example, int* is a pointer to an integer. We won't even use pointers in this tutorial, but you should know that they are something that exists and that they are important to learn later on.

And one last thing which is especially important for those who are familiar with Java: in Java there is a distinction between objects and primitives, where objects are stored on the heap and primitives are stored on the stack. If you're unfamiliar with what the stack and heap are, in general the stack is where local variables are stored while the heap is where dynamically allocated objects reside in memory. This is not the case in C++. A local int and a local std::string variable both live on the stack (although an std::string may internally keep its characters on the heap), and if you pass an std::string into a function then the function receives its own copy: it is passed by value, not by reference. The way you pass by reference in C++ is by using pointers or references, which are beyond the scope of this tutorial.

Now that we have input defined as a variable of type std::string, which is currently set to the empty string "", we can store user input inside of it. First, we output a message "Enter a string: " without a newline to the user to make sure they know that the program is expecting input. Then, we have the following line:

std::cin >> input;

If you remember from earlier, the >> operator is the stream extraction operator. It is used to extract things from the input stream. In this case, we are extracting the user input from the input stream and storing it in the input variable. The program waits here until the user types something and presses Enter. Note that when reading into an std::string, >> skips any leading whitespace and stops at the next whitespace character, so only the first word the user typed is stored in input. To read an entire line, spaces included, the standard library provides std::getline.

Lastly, we output back the user input to the terminal with the following line:

std::cout << "You entered: " << input << std::endl;

You'll notice that we use the << operator multiple times to chain together different values. This is very common when using std::cout and std::cin. And again, we end with an std::endl to make sure we flush the output stream.

Running this program will produce something like the following:

<tyler cpp> $ g++ main.cpp
<tyler cpp> $ ./a.out 
Enter a string: C++    
You entered: C++
<tyler cpp> $ 

2.3 Doing More With User Input

Now that we can get user input, let's try to do some operations on it like we will need to do in our final program. If you remember, in the project writeup we said that the user will input a file that contains a list of valid words, and then the program will randomly select one of those words. However, this would involve learning many new things: most notably parsing command line arguments, reading from a file, and generating random numbers. We'll get to that all later, but for now we'll just pretend that we have already done all of that and picked out a random word from the file. This is actually a really good thing to do when writing programs: try to simplify it down so that you're only figuring out one thing at a time. We could have done this by doing it all in the order that the actual program will do it (i.e. parsing command line arguments -> reading from a file -> generating a random number -> THEN getting user input), but instead I am showing it this way because some of those tasks are more difficult, and it makes sense to go from easier to harder.

To simulate everything that happens before user input, we'll save the word the user has to guess in an std::string variable directly.

#include <iostream>
#include <string>

int main() {
    std::string word_to_guess = "apple";
    // ...
}

Now we can write code that deals with this hardcoded value, and we can deal with the actual logic of the hangman game itself.

Towards this, the first thing we want to do is get user input. We can plug in similar code to what we wrote in the previous section:

#include <iostream>
#include <string>

int main() {
    std::string word_to_guess = "apple";
    std::string input;

    std::cout << "Welcome to Hangman!\n";
    std::cout << "_ _ _ _ _" << std::endl;

    std::cout << "Guess a letter: ";
    std::cin >> input;

    // ...
}

Right now, we're ignoring the number of guesses, and the line showing how many unknown letters are in the word is hardcoded. We'll get to that later. For now, we'll just focus on getting user input and checking if it is correct.

But with that, we immediately reach a roadblock. How do we check if the user inputted the correct letter? We have to look through the string variable somehow and see if it contains the letter. This is a little complicated, so let's see if we can first simplify it down to something easier. We'll simplify down word_to_guess to just be a letter:

#include <iostream>
#include <string>

int main() {
    std::string letter_to_guess = "A";
    std::string input;

    std::cout << "Welcome to Hangman!\n";
    std::cout << "_" << std::endl;

    std::cout << "Guess a letter: ";
    std::cin >> input;

    if (input == letter_to_guess) {
        std::cout << "You guessed correctly!" << std::endl;
    } else {
        std::cout << "You guessed incorrectly! The letter was " << letter_to_guess << "." << std::endl;
    }
    // ...
}

You'll notice that this is the first place in which we are using an if statement. This program basically says "If the value stored in the variable input is equal to the value stored in the variable letter_to_guess, then print out You guessed correctly!. Otherwise, print out You guessed incorrectly! The letter was A."

You might be wondering why we didn't define letter_to_guess as a char when it is only one character. The reason we did this is that you cannot compare char variables with std::string variables. Some types that aren't the same can still be compared, but that is on a type-by-type basis.

Note for Java programmers: If you are familiar with Java, seeing == being used to compare strings might trigger alarm bells in your head, as in Java this can return false even if the text inside of the string objects is the same, because in Java the == operator compares the addresses of objects. In C++, the == operator will only have this behavior if the variables are pointers, so the type would be std::string* (with an asterisk) instead of std::string. We won't really be dealing with pointers in this program, but in the level 1 project we will.

Running this program a couple of times:

<tyler cpp> $ g++ main.cpp 
<tyler cpp> $ ./a.out 
Welcome to Hangman!
_
Guess a letter: f
You guessed incorrectly! The letter was A.
<tyler cpp> $ ./a.out 
Welcome to Hangman!
_
Guess a letter: a
You guessed incorrectly! The letter was A.
<tyler cpp> $ ./a.out 
Welcome to Hangman!
_
Guess a letter: A
You guessed correctly!
<tyler cpp> $ 

Immediately, you'll notice that the characters are case sensitive. This is definitely something we want to fix. You might think we can fix this by just changing letter_to_guess to lowercase "a", but then if the user typed in an upper case "A" there would still be a problem.

Luckily, the standard library gives us a function to convert characters to their lower case equivalent. We can use it like this:

#include <iostream>
#include <string>
#include <cctype> // new

int main() {
    char letter_to_guess = 'a'; // modified (single quotes: a char, not a string)
    std::string input;

    std::cout << "Welcome to Hangman!\n";
    std::cout << "_" << std::endl;

    std::cout << "Guess a letter: ";
    std::cin >> input;

    char letter_guessed = tolower(input[0]); // new

    if (letter_guessed == letter_to_guess) { // modified
        std::cout << "You guessed correctly!" << std::endl;
    } else {
        std::cout << "You guessed incorrectly! The letter was " << letter_to_guess << "." << std::endl;
    }
}

Running this:

<tyler cpp> $ ./a.out 
Welcome to Hangman!
_
Guess a letter: A
You guessed correctly!

Let's take a look at the modifications we made.

  • First, we included the <cctype> library. This lets us use the tolower function, which we use later.
  • Then, we changed letter_to_guess to be a char instead of a std::string. This will make it easier to compare to the user input, because of the next change we made.
  • Next, we use the tolower function, and pass in input[0] as an argument. This introduces two new kinds of syntax, so let's go over them one by one.

When we say input[0], we are accessing the 0th character in the string stored in input. If you are familiar with lists or arrays in other languages, you are probably familiar with the concept of 0-based indexing. If not, then just know that for any "container" of elements in C++ and most programming languages, the first item is actually stored at the 0th index. This might seem odd at first, but it makes more sense with the most common operations you tend to do. The square brackets [] are what you use to index containers that support arbitrary indexing (i.e. get any value you want).

For this purpose, you can consider an std::string as a container of characters. If we had a string defined like this: std::string word = "Hello" then word[0] will be 'H', word[1] will be 'e', ..., until you get to word[4] which will be 'o'. If you tried to index beyond the end of the string, or any container, it is undefined behavior, which means that basically anything could happen. Your program crashing is actually one of the more preferable outcomes, as it is better to know there is a problem than to be unaware of it. With that in mind, our above program is not very safe, as before you index into a container you should probably make sure it has at least that many elements. For example, if we wanted to make sure the string had at least one element, we could do

    std::cout << "Guess a letter: ";
    std::cin >> input;

    if (input.size() > 0) {
        // input[0] guaranteed to have a value: i.e. it is defined behavior
    }

In this case, the size function (we'll talk about functions in the next paragraph) is being called to check that the string has at least 1 character inside of it. If you want to see what kinds of functions are available on all sorts of different values, including std::string, you can visit this website, and specifically this page if you want to read more about std::string.

The second new kind of syntax we introduced is function calling. This is what we are doing when we say char letter_guessed = tolower(input[0]), and when we say input.size(). Between these two examples, there are some slight differences that demonstrate something important. We'll start with tolower.

If you visit the cppreference page on tolower, you will see the following information:

cppreference documentation on tolower

It's important to be able to pick out relevant information from official documentation, since it often can contain more information than is needed for someone new to the language. I've already chopped off a lot of unimportant (for now) information by only including the top of the page, but even in here it's important to be able to parse the information.

Probably the most important piece of information is near the top where it says

int tolower( int ch )

This says that tolower is a function which takes one parameter of type int, called ch, and returns an int value. This means that you give an int value to the function, and it gives back another int value. You might be confused why this function is taking integers and not character values, but for now just squint and pretend that instead of int it is actually char. The reason for this is beyond the scope of this tutorial, but since ints and chars are very similar you can pretend they are the same for the purposes of this hangman game.

To conceptualize return values, you can pretend that the entire expression tolower(input[0]) gets replaced with the returned value. So if input[0] was 'A', then the entire expression gets replaced with 'a', which means the entire line ends up becoming char letter_guessed = 'a'.
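If you would rather not squint, one possible way to hide the int/char detail is to wrap the call in a small function of your own. This is only a sketch: toLowerChar is a made-up name, and static_cast is simply C++'s way of explicitly converting a value from one type to another.

```cpp
#include <cctype>

// Hypothetical wrapper around std::tolower that takes and returns a char.
char toLowerChar(char c) {
    // std::tolower expects a value that fits in an unsigned char,
    // so convert first, then convert the int result back to char.
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```

With this, toLowerChar('A') evaluates to 'a', while characters that have no lowercase form, like '7', come back unchanged.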

The second example of a function call is when we said input.size(). This function call is different from the previous in two main ways:

  1. It does not take any parameters. This is why the open and close parentheses () do not contain any values inside of them. Depending on how many parameters a function takes, you will have that many comma-separated values between the parentheses.
  2. It is preceded by input.. This is because it is a function that is defined on std::string itself. While tolower is just a function that doesn't have any connection to any value other than the parameters passed in, size depends on the std::string that it was called from. In Java everything is a class, so if you are coming from that background this is probably more familiar, as Java doesn't have any functions that are not a part of classes. We'll talk more about classes in the level 1 tutorial, so for now it is enough to understand that the size function gives the number of characters in the string it is called on with the dot . operator.

Now we should have enough knowledge to understand the program, up to this point, in its entirety.

  1. We import all necessary libraries
  2. We hardcode the character the user needs to guess (currently 'a') in the char variable letter_to_guess
  3. We create an empty std::string called input in which we will store the user input in its entirety
  4. We output a prompt to the user using cout
  5. We get user input using cin and store it in input
  6. We convert the first character of input into its lowercase equivalent, and then store it in the char variable letter_guessed
  7. We use an if statement to see if the user guessed the correct letter, outputting the relevant victory/loss message.

The obvious extension right now would be to change the code logic to let the user guess an entire word instead of just one character, but before we do that let's create a fully functioning character guessing game with limited guesses.

3. Making a Character Guessing Game

Right now, the code looks like this

#include <iostream>
#include <string>
#include <cctype>

int main() {
    char letter_to_guess = 'a';
    std::string input;

    std::cout << "Welcome to Hangman!\n";
    std::cout << "_" << std::endl;

    std::cout << "Guess a letter: ";
    std::cin >> input;

    char letter_guessed = tolower(input[0]);

    if (letter_guessed == letter_to_guess) {
        std::cout << "You guessed correctly!" << std::endl;
    } else {
        std::cout << "You guessed incorrectly! The letter was " << letter_to_guess << "." << std::endl;
    }
}

By the end of this section, the game will let the user guess a randomly generated character, with a limited number of guesses.

3.1 Before we Start: Control Structures

If you think about the general program flow of this character guessing game, you can break it down into a flowchart.

This diagram is a natural way to represent how the program logic works. But because there is an edge that goes backwards up the diagram to a previous point in time, if we were to write this without loops, we would have to manually express the logic using deeply nested if statements, creating a new branch for every possibility. This obviously is not sustainable for any program of more than basic length, which is why programming languages provide loops to express this kind of repetition directly.

We already talked about if statements, but didn't really go into depth. And we haven't talked about loops at all, or any other control structures that will make this program much easier to write. To make sure everyone following this tutorial is on the same page, in this section we'll just go over the different types of loops and control structures available in C++. If you're familiar with other languages, then this will seem very familiar because C++ was heavily based on C, and nearly all languages are inspired by C in some way.

When we say control structures, this refers to all of the different ways of structuring the flow of control in a program. In other words, they are instructions in code that let you choose how or when to run other lines of code. Probably the most basic is the if statement, which allows you to conditionally run a piece of code according to the status of variables. Without control structures, programs would be very uninteresting.

If you already know all of this, feel free to skim or skip this section.

3.1.1 If Statement

If statements have the following basic form:

if (<boolean_expression>) {
    // code that only runs if <boolean_expression> is true
}

A boolean expression is something that evaluates to true or false. Here are some examples of boolean expressions

  • x > y
  • x >= y
  • x < y
  • x <= y
  • x == y

You can chain together multiple boolean expressions with the boolean and (&&) and or (||) operators. And requires that both expressions be true for the entire expression to be true, while or requires that at least one of the expressions be true for the entire expression to be true. So x == y && a == b requires that x equals y and a equals b.

Also, you can negate a boolean expression by using the negation operator (!). So !(x == y) is only true if x and y are not equal. For this simple case, it is equivalent to x != y.
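To see these operators working together, here is a small sketch with two made-up helper functions (isInRange and isOutOfRange are illustrative names, not anything from the standard library). The second one is the negation of the first, written out with || instead of !.

```cpp
// Hypothetical helpers for illustration only.

// True when low <= value <= high: both comparisons must hold.
bool isInRange(int value, int low, int high) {
    return value >= low && value <= high;
}

// True when value is below low or above high: one comparison is enough.
bool isOutOfRange(int value, int low, int high) {
    return value < low || value > high;
}
```

For any inputs, isOutOfRange(v, lo, hi) gives the same answer as !isInRange(v, lo, hi).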

3.1.2 If, Else Statements

An if, else statement has the following basic form

if (<boolean_expression>) {
    // code if the expression is true
} else {
    // code if the expression is false
}

Even with simple if, else statements, we already start to see some interesting scoping rules. By scope, we mean the context in which a variable is "active." So if a variable is out of scope, you cannot access it.

A simple demonstration of this can be shown with one if, else statement:

#include <iostream>
#include <string>

int main() {
    std::string input;
    std::cin >> input;

    if (input == "hello") {
        int x = 5;
    } else {
        int x = 6;
    }

    std::cout << x << std::endl;
}

While a similar program would give the expected output in Python (printing 5 if the user enters "hello" or 6 if the user enters anything else), in C++ if you try to compile this it will give the following error

test.cpp: In function ‘int main()’:
test.cpp:14:16: error: ‘x’ was not declared in this scope
   14 |   std::cout << x << std::endl;
      |                ^

The error says 'x' was not declared in this scope. This is because of how scope works. When you enter a section of code surrounded by curly braces, you enter a new block of code. Any variables that were defined outside of that block can still be used, but when you exit that block of code any variables that were defined inside it are no longer in scope, and therefore cannot be used anymore. Because the variable x is defined within the curly braces, x has no meaning once you exit that block. To get the expected behavior, you would have to alter the program to be like this:

    // ...
    int x; 
    if (input == "hello") {
        x = 5;
    } else {
        x = 6;
    }

    std::cout << x << std::endl;

Taking this another level, when you enter the main function you are entering a new level of scope. So, any variables that were defined outside and before the main function can also be used and modified inside main. These are called global variables and they are generally discouraged since they can be mutated (modified) in many different places, which can lead to code that is harder to understand and maintain. This doesn't really matter right now because we just have one main function, but you can imagine if you had multiple functions in a file then it would be hard to keep track of what variables were used by which functions. If you ever feel like you need a global variable, there is probably a way to structure your program such that you don't need it.

3.1.3 If, Else If, Else Statements

Else if statements are very similar to if and else statements.

if (<boolean_expression_1>) {
    // if boolean expression 1 is true
} else if (<boolean_expression_2>) {
    // if boolean expression 1 is false and 2 is true
} else if (<boolean_expression_3>) {
    // if boolean expressions 1 and 2 are false, and 3 is true
} else {
    // if all of the previous expressions are false
}

You can use these for more complicated logic chains, and you can chain together as many as you want.
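As a concrete sketch, the made-up function below turns a numeric score into a letter grade. The name letterGrade and the cutoffs are assumptions chosen just for this example. Notice that each branch only needs to check its lower bound, because it is only reached when all the earlier conditions were false.

```cpp
// Hypothetical example: the function name and grade cutoffs are illustrative.
char letterGrade(int score) {
    if (score >= 90) {
        return 'A';
    } else if (score >= 80) {
        return 'B'; // only reached when score < 90
    } else if (score >= 70) {
        return 'C'; // only reached when score < 80
    } else {
        return 'F'; // none of the earlier conditions were true
    }
}
```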

However, if you use too many it can become hard to track what is going on. The next control structure can help with that.

3.1.4 Switch Statement

You can think of switch statements as similar to else if chains, but slightly easier to follow, and slightly less powerful.

Their general syntax looks like this:

switch (<expression>) {
    case <possible_value_1>:
        // if <expression> == <possible_value_1>
        break;

    // ...

    case <possible_value_n>:
        // if <expression> == <possible_value_n>
        break;

    default:
        // if <expression> does not equal any of the previous possible values
        break;
}

You'll notice that at the end of each case statement there is a break statement. This is not required, but if you did not have it then you would "fall through" to the next case. This might be what you want, but most of the time it isn't.

Before, I said that switch statements were slightly less powerful than if else chains. This is because each case value must be a constant known at compile time, and you can't put an arbitrary boolean expression inside the case statement. In addition, switch only works on integer-like types such as int, char, and enums, so you cannot use switch statements on std::strings or floating point values.
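Here is a small sketch of a switch on a char that uses fall-through on purpose. The function isVowel is a made-up example and assumes the letter has already been converted to lowercase. Because every branch ends in a return, which leaves the function entirely, no break statements are needed here.

```cpp
// Hypothetical example; assumes letter is already lowercase.
bool isVowel(char letter) {
    switch (letter) {
        // These cases have no body, so they all fall through
        // to the first statement that follows them.
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':
            return true;

        default:
            return false;
    }
}
```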

3.1.5 While Loop

Everything up until this point has had to deal with basic decision making. Now, we enter the realm of loops, which will help write repeated code. Loops are one of the most powerful basic constructs in programming, and almost every program will have at least one. We'll start with one type of loop: while loops.

While loops have the following form:

while (<boolean-expression>) {
    // loop body
}

Before entering the loop, the boolean expression will be evaluated. If true, then the flow of execution will enter the loop body. If false, the entire loop body will be skipped. At the end of the block, the expression will be evaluated again. If it is true, then the loop repeats itself, and if it is false then execution continues after the loop.
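A minimal sketch of a while loop in action: the made-up function below counts how many times a number can be halved (using integer division) before it reaches 1. If n is already 1 or less, the condition is false the first time it is checked and the body never runs.

```cpp
// Hypothetical example: counts integer halvings needed to bring n down to 1.
int countHalvings(int n) {
    int steps = 0;
    while (n > 1) {
        n = n / 2; // integer division: 5 / 2 is 2
        steps++;
    }
    return steps;
}
```

For example, countHalvings(8) goes 8, 4, 2, 1 and returns 3.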

3.1.6 Do-While Loops

Do-while loops are very similar to while loops, with one exception. They look like this:

do {
    // loop body
} while (<boolean-expression>);

As you may be able to guess from the structure of the code, the boolean expression in a do-while loop does not evaluate when you enter the loop for the first time. Instead, it only gets evaluated after the body is done. Therefore, a do-while loop will always execute at least once, whereas a while loop may never execute.
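One case where running the body at least once is exactly what you want is counting the digits of a number, since even 0 has one digit. The function below is a made-up sketch and assumes n is not negative.

```cpp
// Hypothetical example; assumes n >= 0.
int countDigits(int n) {
    int digits = 0;
    do {
        digits++;   // every number has at least one digit, even 0
        n = n / 10; // drop the last digit
    } while (n != 0);
    return digits;
}
```

With a plain while (n != 0) loop, countDigits(0) would skip the body and return 0 instead of 1.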

3.1.7 For Loops

For loops are another type of loop, and they are incredibly common. In C++ there are two main types of for loops: C-style for loops and foreach loops. We will start with C-style for loops because they are more traditional and are generally more flexible. Most people consider these "normal" for loops, and they are called C-style because they originate from C.

A C-style for loop has the following syntax:

for (<initialization-statement>; <loop-condition>; <update-statement>) {
    // loop body
}

As you can see, there are 3 main parts to a for loop (4 if you include the loop body): the initialization statement, the loop condition, and the update statement. The precise names are not important, but knowing what each part does is. It becomes a little more clear with the canonical example:

for (int i = 0; i < 5; i++) {
    std::cout << i << "\n";
}

In this example, the program will output the numbers 0-4 all on separate lines. The initialization statement is int i = 0; This defines a variable named i for the duration of the for loop and sets it to 0. This is the first thing that is run. Then, the loop condition i < 5 is checked. Because 0 < 5 evaluates to true, the loop body is entered, where the number 0 is outputted to standard output alongside a newline. Lastly, the update statement is run, increasing the variable i by 1. Once again the loop condition is checked, and because 1 < 5 is still true, the loop body is run again. This continues until i becomes 5, and since i is only printed out if i < 5, the number 5 is never printed out.
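The same counting pattern can do more than print. As a small sketch, the made-up function below adds up every value that i takes on, so it uses the three parts of the for loop in exactly the same way as the example above.

```cpp
// Hypothetical example: adds 0 + 1 + ... + (limit - 1).
int sumBelow(int limit) {
    int total = 0;
    for (int i = 0; i < limit; i++) {
        total += i; // runs once for each i from 0 up to limit - 1
    }
    return total;
}
```

So sumBelow(5) adds 0 + 1 + 2 + 3 + 4 and returns 10.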

This format of creating a variable i and using it to count to a specific number is a very common programming idiom in C++ and other C-based languages. Using i is arbitrary, and is generally popular because often times the brevity you get from having a one letter variable outweighs any clarity you get from using a longer, more descriptive name (because usually it is just used to signify an arbitrary iteration count). However, sometimes it makes more sense to use a more descriptive name, such as row or column if you are doing something more specific. The "canonical" variable names that follow i if you are using nested for loops are j and k, such as in the following example:

for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10; j++) {
        for (int k = 0; k < 10; k++) {
            // do something
        }
    }
}

While this C-style of for loop is very flexible and allows you to do a lot, the most common use case historically has been to iterate through a container of elements. We haven't covered containers of elements yet in this tutorial, so just know that later on in this tutorial this is something we will talk about, and when that comes we will introduce the other kind of for loop: a foreach loop. This is useful because it removes all the boilerplate a C-style for loop forces you to write out.

3.1.8 Break and Continue

In all of the above loop constructions, there are two additional keywords that allow you more flexibility with your flow of control. They are the break statement and the continue statement.

The break statement is slightly simpler, so we will start with it. Essentially, if a break statement is executed, the inner-most loop is immediately exited, no matter what the loop's boolean condition is. This is very commonly used in conjunction with an infinite loop, like in the below example, but it does not have to be.

while (true) {
    // normally, an infinite loop since true always evaluates to true... however,
    bool flag = false;

    // do computations, setting the flag value accordingly

    if (flag) {
        break;
    }

    // do more computations, setting the flag value accordingly

    if (flag) {
        break;
    }
}

As you can see, there are two possible ways to leave this while loop: either through the first if statement or through the second one. A similar effect could be achieved by moving the flag variable to the boolean condition and morphing the code like so:

bool flag = false;
while (!flag) {
    // do computations, setting flag
    if (!flag) {
        // do more computations, setting flag as appropriate
    }
}

This achieves the same behavior as the above example, but it is arguably more complicated to understand. One measurable way the previous example with breaks is better than this example is that the majority of the logic is only at one level of indentation, while in this example much of the code is at two levels of indentation.

Similar to the break statement is the continue statement. The continue statement works much like a break statement, except instead of exiting the loop entirely it skips the rest of the loop body and moves on to the next iteration. In a while loop that means jumping straight to the loop condition; in a for loop the update statement runs first. If the condition is true, then it goes into the body like normal, and if false it exits the loop like normal.
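Here is a short sketch of continue inside a for loop. The function name is made up for illustration. Whenever i is even, the rest of the body is skipped, but the i++ update still runs before the condition is checked again, so the loop keeps making progress.

```cpp
// Hypothetical example: adds up the odd numbers smaller than limit.
int sumOfOddsBelow(int limit) {
    int total = 0;
    for (int i = 0; i < limit; i++) {
        if (i % 2 == 0) {
            continue; // skip even numbers; i++ still happens next
        }
        total += i;
    }
    return total;
}
```

For example, sumOfOddsBelow(10) adds 1 + 3 + 5 + 7 + 9 and returns 25.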

3.1.9 Try-Catch & Exceptions

Try-catch blocks are used in conjunction with exceptions to provide error handling. We won't really go in much depth here, but if you have time you may want to look into it on your own. The general syntax is as follows:

try {
    // block of code you think might error
} catch (const ExceptionType& exception) {
    // error handling
}

You can also catch other types of values instead of exceptions, but generally it is recommended to reserve try-catch for error handling with exceptions, and not to throw around actual values your program needs to run.
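For a concrete sketch, std::string has an at function that works like the square bracket indexing from earlier, except that it throws a std::out_of_range exception when the index is not valid. The helper below (charAtOr is a made-up name) catches that exception and returns a fallback character instead. The & in the catch line means the exception is caught by reference, which is the usual way to write it even though we have not covered references yet.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns text[index], or fallback if index is invalid.
char charAtOr(std::string text, std::size_t index, char fallback) {
    try {
        return text.at(index); // at() throws std::out_of_range on a bad index
    } catch (const std::out_of_range&) {
        return fallback; // error handling: recover with a default value
    }
}
```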

3.1.10 Goto

Goto is a keyword which exists in C and C++, alongside some other lower level languages, which has been greatly disparaged in modern times. It essentially allows you to jump to any other point in a program at will. This is widely frowned upon because it can make your code harder to understand. Codebases which relied upon them heavily in the past are actually where the term "spaghetti code" originates from, because the flow of execution was all over the place like the noodles in spaghetti. So while you should almost never use a goto statement in your code, there are some places where it actually can be helpful.

For example, since break statements only allow you to break out of the innermost loop, you can use a goto to break out of nested loops like so:

int secret_number = /* user input */;
while (true) {
    for (int i = 0; i < 10; i++) {
        if (secret_number + i == 10) {
            goto loop_end;
        }
    }

    // update secret number based on user input
}
loop_end:
// code continues

Syntactically, note that goto depends on labels, in this case loop_end:. Note that it ends in a colon. You should also note that the label could have been anywhere in the same function, even before the goto.

This specific example makes it so you don't have to have multiple break statements, and include a separate condition variable to detect if you need to break multiple times. While it looks silly here, in more complicated nested loops this can make it a lot simpler to understand.
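For comparison, here is a sketch of what leaving two nested loops looks like without goto, using a separate condition variable and two break statements. The function and its purpose (searching for a pair of numbers with a given product) are made up purely for illustration.

```cpp
// Hypothetical example: is there a pair (i, j), both from 0 to limit - 1,
// whose product equals target?
bool hasProductPair(int limit, int target) {
    bool found = false;
    for (int i = 0; i < limit; i++) {
        for (int j = 0; j < limit; j++) {
            if (i * j == target) {
                found = true;
                break; // leaves only the inner loop
            }
        }
        if (found) {
            break; // a second break is needed to leave the outer loop
        }
    }
    return found;
}
```

Whether this or the goto version is easier to read is a judgment call, but it shows the extra bookkeeping that the goto avoids.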

3.2 Keeping Track of Guesses Remaining

With this knowledge on control structures, we can make the character guessing game based off of the flow chart from earlier.

#include <iostream>
#include <string>
#include <cctype>

int main() {
    int guesses_remaining = 5;
    char letter_to_guess = 'a';

    std::cout << "Welcome to Hangman!\n";

    while (guesses_remaining > 0) {
        std::cout << "\nYou have " << guesses_remaining << " guesses remaining.\n";
        std::cout << "_" << std::endl;
        std::cout << "Guess a letter: ";

        std::string input;
        std::cin >> input;

        if (input.size() != 1) {
            std::cout << "Please enter a single letter." << std::endl;
            continue;
        }

        char letter_guessed = tolower(input[0]);

        if (letter_guessed == letter_to_guess) {
            std::cout << "You guessed correctly! Congratulations" << std::endl;
            break;
        } 

        std::cout << "You guessed incorrectly." << std::endl;
        guesses_remaining--;
    }

    if (guesses_remaining == 0) {
        std::cout << "You are out of guesses! The letter was " << letter_to_guess << "." << std::endl;
    }

    return 0;
}

Note that the while loop's condition is (guesses_remaining > 0), and that there is one break statement inside of the loop, so there are two ways to leave the loop: either by making the loop's condition false (having no guesses left) or by guessing the correct letter.

Note that there are many ways you could structure this loop, and this is just one possibility. Other ways might be cleaner or preferable for some people (e.g. while (true) vs. while (<condition>)), but these are generally a matter of personal preference.

3.3 Randomly Selecting a Character

Now, we'll take another small step and randomly select a character to guess. In doing this, we will talk about how to write our own functions.

3.3.1 How to Define a Function

Up until this point, we have only been using functions that were defined in the standard library (e.g. std::tolower). This is very convenient for C++ programmers because they do not need to rewrite these lower level functions that are very commonly used (that is why they are in the standard library). However, not every possible function you may want to use is defined in the standard library. For example, and where we are going with this, there is no standard library function which gives a random lowercase character. This is an example of a case when you would want to write your own function.

At a high level, functions are an example of an abstraction. In this case, we say that the implementation level details of std::tolower have been abstracted away from the C++ programmer. You need not know the ASCII representation of chars in C++ and how to manipulate them. You only need to know the interface of the function to use it. That is, you only need to know the parameters to the function and the return type, and how they are related, to use the function.

When you are planning out a program, it can be useful to think at a high level and delegate out functionality to functions. Currently in our program, we are hardcoding the letter_to_guess to be 'a'. But, if we had a function called getRandomLetter which took no parameters and returned a character, then we could write a line that looks like this:

char letter_to_guess = getRandomLetter();

Now, as long as we had a function with this declaration

char getRandomLetter();

which returned a random lowercase ASCII character, then our program would work. So, let's create this function.

To start, we'll assume we want to keep all of our code inside of our main.cpp file. Later on, we'll talk about how to separate code into different files, if only to make sure we talk about how to do it, as this program probably isn't complicated enough to necessitate it.

To define a function, you follow this general format:

<return-type> <function-name>(<param_1-type> <param_1-name>, ..., <param_n-type> <param_n-name>) {
    // code here
}
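As a small sketch of this format filled in, here is a made-up function with two parameters. Its return type is int, its name is largerOf, and it takes two int parameters called first and second.

```cpp
// Hypothetical example of the general format:
// <return-type> <function-name>(<param-type> <param-name>, ...)
int largerOf(int first, int second) {
    if (first > second) {
        return first;
    }
    return second;
}
```

You would call it with two comma-separated values, for example largerOf(3, 7), and the whole call expression is replaced with the returned value 7.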

With this, if you are familiar with more modern languages like Python, you might expect this to work:

// #includes for standard library

int main() {
    // ...
    char letter_to_guess = getRandomLetter();
    // ...
}

char getRandomLetter() {
    // definition here
}

However, if you do this you will get the following compile time error:

main.cpp: In function ‘int main()’:
main.cpp:7:28: error: ‘getRandomLetter’ was not declared in this scope
    7 |     char letter_to_guess = getRandomLetter();
      |                            ^~~~~~~~~~~~~~~

The reason for this error is that the function getRandomLetter hasn't been declared by the time it is called inside main. This reveals something important: when you call a function in C++, the compiler must have already seen it (either a declaration or the full definition) earlier in the file.

Your first instinct to solve this might be to put the function definition before main. While this would work, it can lead to complications when you are calling functions from other functions, since you will need to make sure a function is defined before you use it.

To solve this, you can write a forward declaration. The best way to illustrate this is by an example:

// #includes for standard library

char getRandomLetter(); // NEW: forward declaration

int main() {
    // ...
    char letter_to_guess = getRandomLetter();
    // ...
}

char getRandomLetter() {
    // definition here
}

This works because before we enter the main function, we are telling the compiler that there is a function getRandomLetter which takes no parameters and returns a char. When it reaches the call to getRandomLetter in main, it knows that there is a function by that name, and it is able to confirm that the return type of the function matches the variable we are assigning it into. When the function is actually called, it correctly will go to the definition at the end of the file to execute that code.

If you had multiple functions, you would put all of the forward declarations at the beginning of the file—not having to worry about their order—and then all of their definitions at the end—also not having to worry about the order.
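One situation where reordering the definitions cannot help is when two functions call each other. The sketch below uses two made-up functions, isEven and isOdd, that are deliberately written in a roundabout way just to show this. Whichever one you define first needs the other to already be declared.

```cpp
// Hypothetical example of two functions that call each other.
bool isOdd(unsigned int n); // forward declaration, needed by isEven

bool isEven(unsigned int n) {
    if (n == 0) {
        return true;
    }
    return isOdd(n - 1); // only compiles because of the declaration above
}

bool isOdd(unsigned int n) {
    if (n == 0) {
        return false;
    }
    return isEven(n - 1); // isEven is already defined at this point
}
```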

Also to make sure this is clear, there is no rule that the main function has to go after the forward declarations but before the definitions. You can basically do anything that compiles, but I think this organization makes sense if you are defining functions like this inside of a one-file program.

Now that we've talked about how to define a general function, we can delve into the specific implementation for getRandomLetter.

3.3.2 Writing the getRandomLetter Function

The easiest way to write this function is to understand how ASCII values work.

Under the hood, chars are actually just small integers, with two main differences from an int:

  1. They are interpreted as characters using the ASCII encoding scheme
  2. A char is only 1 byte, unlike an int which is 4 bytes. (Technically on some compilers/systems an int might not be 4 bytes, but on most modern desktop systems it is, not that this matters right now).

So, if we understand how ASCII works, we can selectively generate random integers with the correct ASCII encodings, and then convert them to chars.

The ASCII table looks like this:

ASCII Table

From this, we can see that all of the lowercase ASCII letters 'a' through 'z' are in the range 97 through 122. Therefore, if we generate a random number between those two bounds inclusive, then we will have a random lowercase character.
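You can check these numbers yourself by converting between char and int. The two helpers below are made-up names for illustration, and the values in the comments assume your system uses ASCII, which is the case on essentially any machine you are likely to be using.

```cpp
// Hypothetical helpers for converting between characters and their codes.

// Gives the integer code stored inside a char, e.g. 97 for 'a' in ASCII.
int asciiCodeOf(char letter) {
    return static_cast<int>(letter);
}

// Gives the char with the given code, e.g. 'b' for 98 in ASCII.
char letterFromCode(int code) {
    return static_cast<char>(code);
}
```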

To generate random integers, first include <random> from the standard library like so

#include <random>

Now, this library is actually fairly complicated and provides a surprising amount of configurability for pseudorandom number generation, but for now we'll just focus on a simplified view of it:

// ...
char getRandomLetter() {
    const char ASCII_LOWER_A = 97;
    const char ASCII_LOWER_Z = 122;

    // Only create these objects once: they are static
    static std::random_device rd;
    static std::mt19937 mt(rd());
    static std::uniform_int_distribution<int> dist(ASCII_LOWER_A, ASCII_LOWER_Z);

    return dist(mt);
}

Let's break this down.

  • First, I create two constants for the ASCII values of 'a' and 'z'. This makes the code more readable, and generally if it makes sense to label constant numbers like this, you should.
  • Then, I create an std::random_device. If you have heard of pseudorandom number generation, then you know that when you generate pseudorandom numbers you have to provide a "seed" for the algorithm. Commonly, this is the current UNIX timestamp (seconds since January 1st, 1970), but the std::random_device class cleverly requests some information from the operating system to formulate this seed relatively randomly. This means you don't have to manually create the seed, unless you want to for testing purposes.
  • Next, I create an std::mt19937, which is a pseudorandom number generator which uses the Mersenne Twister algorithm to generate random numbers. We call the random device (rd()) to get a number to use as the seed.
  • To finish the static variables we need, I create an std::uniform_int_distribution which represents a uniform distribution from 97 to 122, or all the lowercase ASCII letters.
  • Lastly, to actually use these values, we use the std::uniform_int_distribution as if it were a function and pass in the pseudorandom generator to generate pseudorandom numbers within the distribution and using the given generator.
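If you do want to provide the seed yourself for testing, a sketch of what that could look like is below. The function randomLetterFromSeed is a made-up variant and is not used in the rest of this tutorial. Each call builds a fresh generator from the given seed, so within one build of the program the same seed always produces the same letter. Which letter that is can differ between compilers and standard libraries, so don't rely on a specific value.

```cpp
#include <random>

// Hypothetical testing variant of getRandomLetter with a caller-chosen seed.
char randomLetterFromSeed(unsigned int seed) {
    std::mt19937 generator(seed); // seeded manually instead of by random_device
    std::uniform_int_distribution<int> dist('a', 'z'); // 97 to 122 in ASCII
    return static_cast<char>(dist(generator));
}
```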

As for what the static in front of the type of these variables means, this is something unrelated to the <random> standard library features, and is a part of C++ itself (and other languages like Java).

Basically, a static variable is like a global variable, but one that can only be accessed within the scope it is defined in. It is like a global because it persists across function calls for the entire run of the program. A static variable inside a function is initialized only once, the first time the function runs. Therefore, in this context the static variables we define are created once, and every time we enter the getRandomLetter function after that these variables refer to the same exact objects as before.

The prototypical example of a static variable would be like follows:

void testFunction() {
    static int count = 0;
    count++;
    std::cout << "testFunction() has been called " << count << " times" << std::endl;
}

This is a simple function that keeps track of how many times it has been called and prints out this number. This works because the count variable is declared as static: it is set to 0 only once, and every time the function is called the value is the same as it was when the previous call to the function exited.

We use static in our random example because we don't want to create a new seed and random number generator every time we call the getRandomLetter function. We could achieve the same thing by making these static variables global variables instead, but in this case it is a little better to have them as static variables because they only need to be accessed inside of the getRandomLetter function, and generally it is better to limit the scope of your variables as much as possible while still being usable.

Now, if we put everything all together the program looks like this:

#include <iostream>
#include <string>
#include <cctype>
#include <random>

char getRandomLetter();

int main() {
    int guesses_remaining = 5;
    char letter_to_guess = getRandomLetter();

    std::cout << "Welcome to Hangman!\n";

    while (guesses_remaining > 0) {
        std::cout << "\nYou have " << guesses_remaining << " guesses remaining.\n";
        std::cout << "_" << std::endl;
        std::cout << "Guess a letter: ";

        std::string input;
        std::cin >> input;

        if (input.size() != 1) {
            std::cout << "Please enter a single letter." << std::endl;
            continue;
        }

        char letter_guessed = tolower(input[0]);

        if (letter_guessed == letter_to_guess) {
            std::cout << "You guessed correctly! Congratulations" << std::endl;
            break;
        }

        std::cout << "You guessed incorrectly." << std::endl;
        guesses_remaining--;
    }

    if (guesses_remaining == 0) {
        std::cout << "You are out of guesses! The letter was " << letter_to_guess << "." << std::endl;
    }

    return 0;
}

char getRandomLetter() {
    const char ASCII_LOWER_A = 97;
    const char ASCII_LOWER_Z = 122;

    // Only create these objects once: they are static
    static std::random_device rd;
    static std::mt19937 mt(rd());
    static std::uniform_int_distribution<int> dist(ASCII_LOWER_A, ASCII_LOWER_Z);

    return dist(mt);
}

This program now accurately creates a letter guessing game where the letter you have to guess is randomized. Now, if we were actually creating a letter guessing game, we might consider adding in hints that tell you if the actual letter is further forward or back in the alphabet, but since our end product is a word guessing game we will not worry about making this intermediate program fun/fair.

3.4 Refactor to Use Header Files

Often when writing larger programs, it becomes imperative to organize separate functions (and classes) in different files so that it is easier to navigate the codebase. This is hardly necessary for this file, but it is still important to know how to do, so we'll talk about how to do this here.

When writing C++ programs, there are two different kinds of files: header files and source files. Source files are what we have been dealing with so far, and they end in the .cpp extension. Header files are another kind of file, and they end in the .h extension. Sometimes you might see people use the .hpp extension to distinguish between C and C++ header files. You can use either extension but you should try to be consistent.

Earlier, we talked about the difference between forward declarations and actual function definitions. You can envision a similar distinction for header and source files. If we wanted to move our random letter helper function into a separate file that holds all of our utility functions, then we could create two files, utilities.hpp and utilities.cpp, and then put the forward declaration into the header file and the function definition into the source file.

However, you don't actually have to split up the function definition from the forward declaration, just like we didn't really have to do that in the single file version. You are also able to just put the entire function definition inside of the header file. We'll start with this first because it is easier to understand.

3.4.1 Only Using Header Files

To do this, first you would put the entire function definition inside of the header file

// utilities.hpp
#pragma once

#include <random>

char getRandomLetter() {
    const char ASCII_LOWER_A = 97;
    const char ASCII_LOWER_Z = 122;

    // Only create these objects once: they are static
    static std::random_device rd;
    static std::mt19937 mt(rd());
    static std::uniform_int_distribution<int> dist(ASCII_LOWER_A, ASCII_LOWER_Z);

    return dist(mt);
}

Then, we would include the utilities.hpp file in main.cpp by writing the following line near the top

// main.cpp
// ...
#include "utilities.hpp"
// ...

Essentially, this include line directly copies all of the contents of the utilities.hpp file into the main.cpp file, right where the include line is. It is exactly as if we had written the function definition on that line.

You might have noticed that at the top of the header file we included a line that says #pragma once. This is called a header guard (also known as an include guard). Essentially, a header guard prevents the same header file from being pasted into a single source file more than once, which can easily happen when headers include other headers. This prevents functions from being defined twice within that file. You basically will always want to do this, so any header file you write should have this.

Note that the #pragma once line is technically not a part of standard C++. However, at this point it is supported by every major and modern compiler, so for our purposes it should be fine to use. The older, more "standard" way of doing it would look like this:

// utilities.hpp
#ifndef UTILITIES_HPP
#define UTILITIES_HPP

// Contents here

#endif

The way this works is that the first time the compiler enters this file, the term UTILITIES_HPP has not been defined as a macro, so it enters the region of code enclosed by the #ifndef and #endif lines. The first thing it does is define UTILITIES_HPP as a macro (one that doesn't really do anything), and then it compiles everything as normal. If this header is included again later while compiling the same source file (for example, through another header), the #ifndef will block it from recompiling the actual definitions in the header file since UTILITIES_HPP has already been defined.

Note that the actual term you define (UTILITIES_HPP in the above example) does not matter as long as it doesn't collide with any term used in another header file. It is fairly common, however, to just use the file name.

To summarize this, here are the pros/cons of each approach:

Approach     | Pros                                      | Cons
#pragma once | easy and less error prone                 | not standard C++
#ifndef      | standard C++, supported by every compiler | more error prone if you mistype something

Therefore, for all TUAS things we will just use #pragma once.

Now, that is everything you would need to know to just extract out function definitions to header files. You would compile the main.cpp exactly the same since in the process of compiling main.cpp it will follow the #include line to the appropriate file and copy everything over.

3.4.2 Using Header and Source Files

Now, we'll go over another approach which preserves the distinction between forward declarations and actual definitions.

To start, the header file will be much simpler as it essentially just contains the forward declaration.

// utilities.hpp
#pragma once

char getRandomLetter();

Then, the source file will contain the actual definition.

// utilities.cpp
#include <random>

char getRandomLetter() {
    const char ASCII_LOWER_A = 97;
    const char ASCII_LOWER_Z = 122;

    // Only create these objects once: they are static
    static std::random_device rd;
    static std::mt19937 mt(rd());
    static std::uniform_int_distribution<int> dist(ASCII_LOWER_A, ASCII_LOWER_Z);

    return dist(mt);
}

And that is it. The main.cpp file will look exactly the same as we still include the utilities.hpp file like normal. Note that there is no header guard in the source file because it is not a header file and shouldn't be included in other files.

The main complexity, however, comes when you are trying to compile this program. This complexity is definitely overkill for this simple program, but when you start including class definitions and many more functions it becomes worth it.

Before, compiling was just one command: g++ main.cpp. Now, however, there are two main steps we need to do.

First, we have to compile the source files separately into object files (that have the .o extension). Object files are intermediate files that contain the compiled information from a given cpp file and any hpp files it may have included. In this case we have two cpp files: main.cpp and utilities.cpp, so we create two object files with this command:

g++ -c main.cpp utilities.cpp

This will create main.o and utilities.o. Note that because we give the -c flag, it will take all of the cpp files afterwards and compile their object file representation.

Second, we combine the object files into one final executable with this command:

g++ -o char_guess main.o utilities.o

This creates an executable called char_guess based on the information from main.o and utilities.o. Note that the -o flag is immediately followed by the name of the executable, and then all of the object files you want to use.

For more complicated programs, this is definitely a process you would want to automate. A simple tool to do this is a Makefile, or if you wanted something more sophisticated you could use a build tool like CMake. However, explaining how these tools work is beyond the scope of this tutorial, so we won't go into detail on either here.

For the purposes of this tutorial, you can use this Makefile to compile the basic program. To use it, you can type make build in your terminal to compile the program to the executable hangman.out. In addition, the command make clean will delete the executable and all intermediate object files.

GCC = g++ -std=c++20

build: main.o utilities.o
    ${GCC} -o hangman.out main.o utilities.o

main.o: main.cpp utilities.hpp
    ${GCC} -c main.cpp

utilities.o: utilities.cpp utilities.hpp
    ${GCC} -c utilities.cpp

.PHONY: clean
clean:
    rm -rf *.o
    rm -rf *.out

Note: this Makefile adds the extra parameter -std=c++20. This will be needed later on if you are using an older version of g++ that defaults to C++17. In this tutorial we use some language features that were not introduced until C++20, so you will need to enable them in order to compile the program.

4. From Characters to Words

For the rest of the tutorial I'll keep this separation of files we set up from the previous section, but if you skipped it then everything will still work fine if you put everything in main.cpp.

By this point, we have a working character guessing game. Now, we'll take the big step and convert it to actual hangman.

4.1 Pseudocode

Writing out pseudocode is a good strategy when planning out a program, and we'll use it here to fill in the gaps of what we will need to implement to make our word guessing game. If you are unaware, pseudocode is a human-readable description of the processes a program or algorithm must take.

Whenever you write pseudocode it is up to you how close you want it to look to actual code, and in this case since we already have some code that we're adding to, my pseudocode is a mix of actual C++ and comment descriptions of what needs to happen. In the following snippet, I've rewritten the main.cpp file but included pseudocode for everything we will need to change.

    int guesses_remaining = 5;

    std::string word_to_guess = // generate random word

    std::cout << "Welcome to Hangman!\n";

    while (guesses_remaining > 0) {
        std::cout << "\nYou have " << guesses_remaining << " guesses remaining.\n";
        // Output current status of guessed word in the following 
        // format: _ _ _ c _ _ _
        //        if the word was glucose and the player had 
        //        only guessed 'c'
        std::cout << "Guess a letter: ";

        std::string input;
        std::cin >> input;

        if (input.size() != 1) {
            std::cout << "Please enter a single letter." << std::endl;
            continue;
        }

        char letter_guessed = tolower(input[0]);

        // Check to make sure the current letter the user just typed
        // hasn't already been guessed.

        if (/* Check if letter is in word */) {
            // Output how many occurrences of the letter there are

            // if entire word guessed, break
            // if not, continue
        } else {
            std::cout << "Incorrect! There are no "<< letter_guessed <<"'s in the word." << std::endl;
            guesses_remaining--;
        }
    }

    if (guesses_remaining == 0) {
        std::cout << "You are out of guesses! The word was " << word_to_guess << "." << std::endl;
    } else {
        std::cout << "Congratulations! You guessed the word correctly!\nThe word was: " << word_to_guess;
    }

    return 0;

From this, we have to figure out how to do all of the following things:

  • Output current status of guessed word
  • Check to make sure the inputted letter hasn't been guessed
  • Check if guessed letter is in word
  • Output how many occurrences of a letter there are in a word
  • Check if the entire word has been guessed

This is a good start, but we can take this and get a little more specific. In the next step, we'll convert the pseudocode into actual code by replacing all of the comment descriptions with function calls that perform the described behavior.

    int guesses_remaining = 5;

    std::string word_to_guess = generateRandomWord(); // NEW

    std::cout << "Welcome to Hangman!\n";

    while (guesses_remaining > 0) {

        outputCurrentStatus(???); // NEW

        char letter_guessed = getLetterInput(???); // NEW

        int num_occurrences = getNumOccurrences(letter_guessed, word_to_guess); // NEW

        if (num_occurrences > 0) {
            if (num_occurrences == 1) {
                std::cout << "Correct! There is 1 " << letter_guessed << " in the word." << std::endl;
            } else {
                std::cout << "Correct! There are " << num_occurrences << " " << letter_guessed << "'s in the word." << std::endl;
            }

            if (isEntireWordGuessed(???)) { // NEW
                break;
            }
        } else {
            std::cout << "Incorrect! There are no "<< letter_guessed <<"'s in the word." << std::endl;
            guesses_remaining--;
        }
    }

    if (guesses_remaining == 0) {
        std::cout << "You are out of guesses! The word was " << word_to_guess << "." << std::endl;
    } else {
        std::cout << "Congratulations! You guessed the word correctly!\nThe word was: " << word_to_guess;
    }

    return 0;

In this code snippet there are 5 new functions that we need to write, and once we write them the program should theoretically just work with little to no changes in the main function. These functions are:

std::string generateRandomWord() // 1
void outputCurrentStatus(???) // 2
char getLetterInput(???) // 3
int getNumOccurrences(letter_guessed, word_to_guess) // 4
bool isEntireWordGuessed(???) // 5

In this list, functions 1 and 4 have their entire parameter list written out, while functions 2, 3, and 5 just have question marks. I prefer to do this for the more complicated functions because it helps me look at each function individually before deciding which variables from the main function it will need. With that in mind, let's go down the list and implement each function.

4.2 std::string generateRandomWord()

In the original program description, it said that there should be a file that the program reads from which provides all of the valid hidden words. Since this isn't exactly integral to the functioning of the program itself, it makes sense to defer implementing this until we make sure that everything else works. Therefore, we'll implement a "dummy" or "mock" implementation of this function until everything else around it is in place.

// utilities.hpp
std::string generateRandomWord();

// utilities.cpp
std::string generateRandomWord() {
    return "hello";
}

4.3 void outputCurrentStatus(???)

We want this function to output text in the following format, assuming the word is "hello". Note that now we are adding the letters guessed to the output.

You have [X] guesses left.
You have already guessed:
_ _ _ _ _

From this, we can tell there are 3 pieces of information from the main function that we will need to implement this function:

  1. the number of remaining guesses
  2. the word to guess
  3. the letters that have already been guessed

The 1st and 2nd things are trivial to pass in because we are already tracking them, but the 3rd is something that we are not even keeping track of yet. This means that we will need to make some changes to our main function so we are keeping track of all of the letters that have been guessed. Before we do this, however, we need to talk about some more advanced data structures.

4.3.1 The Problem

Only using things we have talked about up until this point, if you tried to keep track of all of the letters that have been guessed you might try something incredibly suspicious. One such idea might be to create 26 different boolean variables, one for each letter, which are true if that letter has been guessed and false otherwise. Another idea would be to create 26 different char variables, which are named things like first_guess, second_guess, and so on. Both of these approaches would lead to completely unmaintainable code, especially for other kinds of situations which aren't limited by the length of the alphabet, and you would be right in thinking that there is a better way.

One "good" approach that you could actually do without learning anything else would be to create a variable of type std::string that starts as an empty string. Then, as the player guessed letters you could append those characters to the string. Therefore, you could just check if a character was in the string to see if it had been guessed. While this would work and it isn't bad by any means, there are other approaches which lead to more understandable code, and which abstract to other kinds of data types instead of just chars.

With this in mind, we'll talk about a bunch of the most common data structures that could possibly be used for this problem. Then, we'll talk about how any of them could be used in many different ways to solve the problem. Lastly, we'll pick one and use it.

4.3.2 C-style Arrays

The most "simple" way to solve this problem is to use a C-style array. In this case, simple doesn't necessarily mean that it is the easiest for a new programmer to understand or use, but that it doesn't have much going on behind the scenes besides what you see.

At a high level, a C-style array is a "container" of elements with a specified size. For example, if you wanted to keep track of a student's grade in a class, you could create an array of size 3 which contains their 2 midterm scores and their final score, like shown below:

int test_scores[] {50, 75, 80};

In this example, we are creating an array variable called test_scores which contains ints. Then, we set this variable to the value of {50, 75, 80}. This is called an initializer list, and is a way to explicitly set the value of each index in the array. There are some important notes on this syntax:

  1. there is no equals sign between the variable name and the initializer list (one is allowed, but it is optional)
  2. the square brackets which make it an array come after the variable name, not the type

To access each of the values in the array, you use the following syntax:

// Display the first midterm score
std::cout << test_scores[0] << std::endl;

// Display the second midterm score
std::cout << test_scores[1] << std::endl;

// Display the final score
std::cout << test_scores[2] << std::endl;

From this, you can see that arrays are 0 indexed. This means that the first value stored in the array is at index 0, and the final value stored in the array is at index size - 1.

It is undefined behavior to access any index outside of this range, so don't do it.

If you are coming from a newer language, then you might expect the following line of code to output all of the values inside of the array:

std::cout << test_scores << std::endl;

However, if you do this you will receive output which looks something like this:

0x7ffca1c106d4

So what is going on here?

Essentially, C-style arrays behave like pointers in most situations. What this means is that when you use test_scores in an expression like the one above, it is converted to a memory address. In other words, the value that gets printed represents the place inside of your computer's memory at which the array begins. We could go really in depth and explain the nuances of what this means, but it is beyond the scope of this tutorial.

Another thing this means, however, is that when an array is passed to a function, nothing travels along with it to keep track of its size. If you want the function to know the size, you have to pass it in separately. This is why when functions expect C-style arrays as parameters, there is almost always one parameter which is the array itself, and another which is the size, like in the following example:

void exampleFunction(int array[], int size) {
    for (int i = 0; i < size; i++) {
        // do something at every index...
    }
}

// Since an array parameter is really a pointer, the following is equivalent

void exampleFunction(int* array, int size) {
    for (int i = 0; i < size; i++) {
        // do something at every index...
    }
}

If the talk of pointers is confusing you, don't worry about it too much. The later sections don't refer to them.

In addition, it is important to note that the size of a C-style array MUST be determinable at compile time. This means that even if an array will never change size, if the compiler can't determine what the size will be at compile time then the program is not valid standard C++. (Some compilers, including g++, accept this as a non-standard extension unless you ask them to be strict, but you should not rely on that.) The most common scenario in which new programmers get confused by this is demonstrated by the following sample program:

int size;
std::cin >> size;
int my_array[size];
// ERROR in standard C++ because you can't determine the value of size
// at compile time, as it depends on user input

If you want to do something like this, you would either need to do some more advanced manual memory management (the C approach), or use one of the data structures we talk about later (the C++ approach).

Before we move on to the more C++ relevant data structures, it is important to talk about the other ways you can initialize C-style arrays. Before, we used an initializer list to set the value of each index. However, you can also do the following:

int array[100];
array[0] = 5;
array[1] = 10;

In this example, we create an array of size 100 and then individually set the values of each index. An important thing to note is that the values of the indices that haven't been set to anything are undefined. Essentially, what this means is that array[2] through array[99] could be anything. In practice, they will be whatever memory was left over in that region of your computer from an earlier program, or an earlier part of the current program.

In most C++ programs you probably shouldn't use C-style arrays, but they are still important to understand because you'll likely find them in other people's code. When writing your own code, however, I'd recommend using one of the following data structures we talk about.

4.3.3 std::array

std::array is a standard library container which essentially replaces C-style arrays. In other words, it is a fixed-size container where the size must be determinable at compile time.

The following sample shows the equivalent code from the test_scores example we had in the previous section:

#include <array>

// ...

std::array<int, 3> test_scores = {50, 75, 80};

std::cout << test_scores[0] << "\n";
std::cout << test_scores[1] << "\n";
std::cout << test_scores.at(2) << std::endl;

There are 4 things of note to talk about in this example:

  1. Inside the angle brackets is where you define the type of variables the array can store, and the size of the array.
  2. Here the initializer list is written with an equals sign. Both forms, with and without the equals sign, are accepted.
  3. Square bracket notation works, but in addition you can use the at function. The main difference between this and the square bracket notation is that at provides bounds checking, so it will throw an exception if you attempt to index out of the bounds of the array.
  4. Make sure to remember to include the <array> standard library definitions with #include <array>.

In addition, the array itself also keeps track of its size, which means you can do something like this:

std::array<int, 3> arr = {1,2,3};
for (int i = 0; i < arr.size(); i++) {
    // do something at every index...
}

And you can also use this syntax as well:

std::array<int, 3> arr = {1,2,3};
for (int i : arr) {
    // do something with every element...
}

To read more about all of the extra functionality std::array provides, you can check out this page.

4.3.4 std::vector

Up until this point, we have only talked about data structures which have a fixed size. These are nice, but it doesn't take too much imagination to think of a scenario where you would want an array that might need to change size. Of course, you could achieve this behavior by creating an extremely large array and then using as much space as you need, but that is extremely inefficient spacewise. std::vector will let us have a dynamically sized data structure so we don't need to use a ton of extra space and manually keep track of how much we are actually using.

If you are familiar with Java, std::vector is C++'s equivalent of ArrayList, and if you are familiar with Python, it is the equivalent of a list. Therefore, it is probably the most commonly used standard library data structure in C++.

The following example shows the basic functionality of a vector.

// Make sure to remember to include the standard library
#include <iostream>
#include <vector>

int main() {
    // Create an empty vector
    std::vector<int> vec;

    // Use push_back to add items to the end
    vec.push_back(1);
    vec.push_back(2);

    // Index into the vector like normal

    std::cout << vec[0] << "\n";
    std::cout << vec.at(1) << "\n"; // Can also use at function, just like std::array

    // Can also get the size easily
    std::cout << vec.size() << "\n";

    // Can also use foreach loop to iterate
    for (auto num : vec) {
        std::cout << num << "\n";
    }

    std::cout << std::endl;
}

Without delving too much into the specifics, vectors are highly efficient at accessing items at arbitrary indices and at appending to the back. Inserting items into the middle of the vector or removing specific items, however, is not as fast. The next container is generally the reverse.

To read more about all of the functions provided for std::vector, you can click here.

4.3.5 std::list

std::list is C++'s implementation of a linked list. It is similar to a vector, but comes with the efficiencies and deficiencies of a linked list. If you aren't familiar with the differences between linked lists and arrays, in short linked lists are much more efficient when you are making many changes in the middle of the list, but are worse when it comes to accessing any arbitrary item that isn't at the front or back.

You can read more about std::list here.

4.3.6 std::map and std::unordered_map

std::map and std::unordered_map are C++'s implementations of a Map abstract data type: that is, a kind of data structure where you have key and value pairs. Generally, you should use std::unordered_map instead of std::map when you do not care about the ordering of items, because std::unordered_map will generally be faster. (If you know a bit about the theory behind this, then it'll make sense that generally std::unordered_map is implemented as a HashMap and std::map is implemented as some sort of balancing tree, like a Red-Black Tree).

From now on, I'll only be talking about std::unordered_map, but know that almost everything applies to both.

If you are familiar with Python, you can think of maps like dictionaries. At a high level, you associate "keys" with "values". In the analogy of a real dictionary, you can think of the "key" as a word, and the "value" as a definition. Once this information has been inserted into a map, you can look up a "key" and find the associated "value" incredibly quickly. If you tried to naively implement this type of behavior using a normal array, you would have to scan through the array until you found the corresponding value. However, with a hash map you can do this in constant time on average.

Alongside vectors (arrays), maps are the bread and butter of programming. They can be applied to a lot of problems in clever ways, and are very worth knowing. To read more about how to use maps, you can click here.

4.3.7 Sets vs. Maps

When using a map, you always have a key and value. However, sometimes you just want keys. This is essentially what std::set and std::unordered_set are for.

If you ever want a collection where you want to quickly check for the existence of a "key", then you should use a set. You can read more about the C++ implementation here.

4.3.8 What Data Structure to Use?

If you remember, the original problem was that we wanted to keep track of all the guesses that the player has made. Here are 3 (of many) possible approaches we could take:

  1. Use an std::array<bool, 26>. Spot 0 corresponds to a, 1 to b, and so on. False means that letter has not been guessed, and true means that it has been guessed.
  2. Use an std::vector<char> and add a character to the vector once it has been guessed.
  3. Use an std::unordered_set<char> and add the character to the set once it has been guessed.

There are definitely other methods of doing this, some perhaps better than what's been discussed here, but for such a simple program these will all suffice and would be good solutions. However, I would argue that number 3 would be preferable to the other two.

While number 1 would allow constant time lookup to see if a letter has been guessed, it is a little more complicated to use than the other two. And while std::vector is more commonly used than std::unordered_set, number 2 would be inferior to number 3 because it would take longer to figure out if a letter had been guessed. The set allows constant time lookup for letters ("keys"), while the vector would require you to iterate through the entire container to see if a letter had been guessed.

Therefore, in this tutorial we will use an std::unordered_set.

4.3.9 Keeping Track of the Letters Guessed

In the main function, we'll create an std::unordered_set<char> to keep track of the letters that have been guessed.

#include <iostream>
#include <string>
#include <cctype>
#include <unordered_set> // NEW

#include "utilities.hpp"

int main() {
    int guesses_remaining = 5;

    std::string word_to_guess = generateRandomWord();
    std::unordered_set<char> letters_guessed; // NEW

    std::cout << "Welcome to Hangman!\n";
    // ...
}

Now, we need to decide on the parameters for the outputCurrentStatus function. We decided earlier that we needed to pass in the word we're guessing, the number of guesses remaining, and what letters have already been guessed. Since we now have all of these things, we can write the forward declaration.

// utilities.hpp

#pragma once

#include <unordered_set>
#include <string>

// ...

void outputCurrentStatus(std::string word_to_guess,
                         int guesses_remaining,
                         const std::unordered_set<char>& letters_guessed);

It is important to note the type of the letters_guessed variable. On your first attempt, you might just put std::unordered_set<char> for the type. This would actually work, and for a program of this size it would be perfectly fine. However, to understand why this might be a bad idea in certain scenarios, and to understand the syntax used in the above example, it is important to know how passing values into functions works in C++.

If you are familiar with Java, then you may know about the differences between pass by value and pass by reference. In Java, all primitive types are passed into functions by value, while all object types effectively behave as if they are passed by reference (strictly speaking, Java passes a copy of the reference). Pass by value means that the variable inside of the function, if changed, does not affect the value that was actually passed in, while pass by reference means that modifying the object inside of the function actually modifies the original variable as well.

In C++, everything is pass by value unless specified otherwise. There are two different ways to perform pass by reference, and they work slightly differently. You can either pass in a pointer or a reference.

Explaining the nuances behind pointers and references is beyond the scope of this tutorial, but it is enough to know that in the context of types, including an ampersand (&) next to the type signifies that it is a reference. This means that when you pass in the value, any changes to it will be reflected outside of the function.

This example shows the difference between passing by value vs. passing by reference:

// Pass by value
void foo(int x) {
    x += 5; // this only affects the copy of x that exists solely within this function
}

int main() {
    int x = 0;
    foo(x);
    // x is still 0
}

// Pass by reference
void foo(int& x) {
    x += 5;
}

int main() {
    int x = 0;
    foo(x);
    // x is now 5
}

Therefore, you can see that when a parameter in a function has an ampersand attached to the type, it means that the value is being passed as a reference. You can also achieve this with pointers, but don't worry about it for this program.

However, this only explains half of the type signature used above: const std::unordered_set<char>&. We talked about the ampersand, but this leads to two questions:

  1. Why are we using pass by reference when we aren't actually modifying the set inside of the function?
  2. What does the const mean?

In short, you can prepend any type with const to make it so that value is constant: i.e. it cannot be changed. Therefore, when we make a parameter to a function a constant reference it means that we are passing a value in as a reference that cannot be modified. You might think that this completely negates the purpose of passing by reference, since the reason to pass by reference is to allow modifications made inside the function to persist, and the const completely negates this, but there is actually another benefit to pass by reference over pass by value.

When you do pass by value, you are literally copying the entire object and creating this copy for use inside of the function. This is completely fine for primitive values because they are small, but for more complex objects this means that it can take a long time to copy over the value. For example, say you had a set of 10,000 items. If you used pass by value, there would be a significant time loss from copying all of them. However, with pass by reference, since you are just passing in a reference to the original object, there is no copying that must be done since you are still dealing with the original object. But, since we don't want the function to be able to actually modify the original object, we attach the const to the reference to ensure that it does not change.

In short, we use a constant reference when we do not want to incur the cost of copying a large object, and when we still do not want the original object to be modified.
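
If you want to convince yourself that the copy really happens, here is a rough sketch. BigObject and the g_copies_made counter are hypothetical, invented only to make copies visible. They are not something the hangman program needs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counter, only here to make copies visible.
int g_copies_made = 0;

// Hypothetical type that records every time it is copied.
struct BigObject {
    std::vector<int> data;

    explicit BigObject(std::size_t n) : data(n, 0) {}
    BigObject(const BigObject& other) : data(other.data) { ++g_copies_made; }
};

// Pass by value: calling this copies the whole object first.
std::size_t sizeByValue(BigObject obj) {
    return obj.data.size();
}

// Pass by constant reference: no copy, and no modification allowed.
std::size_t sizeByConstRef(const BigObject& obj) {
    // obj.data.push_back(1);  // would not compile because obj is const
    return obj.data.size();
}
```

Calling sizeByValue on an existing BigObject increases the counter by one, while calling sizeByConstRef leaves it unchanged.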

There are two other things to note before moving on:

  1. In this case, the set will never actually be large enough for the copying to take a noticeable amount of time, but I still did it here to demonstrate a pattern that you should generally follow.
  2. You might be thinking that the const isn't really necessary because you, the programmer, already know that the set shouldn't be modified. The code would, of course, work exactly the same without the const in the type signature, but the const provides some significant benefits. First, it lets other programmers who may be working in your code understand your intentions. By making a parameter a constant reference, everyone (even your future self that has forgotten your original intentions) knows that it should not (and cannot!) be modified. Second, it provides a compile time error if you ever accidentally try to modify it. Getting compile time errors is a good thing because it means that potential errors are caught sooner rather than later. If at a later time you decide that the value should be modifiable, then you can just remove the const.

With this, we can write the function:

/*
    Example output:
    ---
    You have 5 guesses left.
    You have already guessed: h, e, l, p
    h e l l _
*/
void outputCurrentStatus(std::string word_to_guess, 
                         int guesses_remaining, 
                         const std::unordered_set<char>& letters_guessed) {
    std::cout << "You have " << guesses_remaining << " guesses left.\n";

    std::cout << "You have already guessed: ";
    for (char c : letters_guessed) {
        std::cout << c << ", ";
    }
    std::cout << "\n";

    for (char c : word_to_guess) {
        if (letters_guessed.contains(c)) {
            std::cout << c << " ";
        } else {
            std::cout << "_ ";
        }
    }
    std::cout << "\n";
}

Going through this function, first we output the number of guesses remaining. Then, we go through the set of all characters that have already been guessed and display them to the user. Lastly, we go through every character in the word in order and check if that character is in the set of characters that have already been guessed. If that letter has been guessed, then it is displayed to the player. Otherwise, a placeholder underscore character is displayed instead.

4.4 getLetterInput(???)

Now, we need to write a function called getLetterInput which returns a char that the player has entered. The only piece of information needed in this function is the letters already guessed, so that if a letter is guessed twice it doesn't count against the number of guesses. In other words, this function will repeatedly ask for a character to guess until it gets one that hasn't been guessed before. That is the value that will be returned from the function.

First, we'll write the forward declaration:

// utilities.hpp
#include <unordered_set>
// ...
char getLetterInput(const std::unordered_set<char>& letters_guessed);

(Note: this is going to change later, so stay tuned).

Then, we can write the actual function based on the character input code we wrote earlier. It might look something like this:

// utilities.cpp
char getLetterInput(const std::unordered_set<char>& letters_guessed) {
    static const char ASCII_LOWER_A = 97;
    static const char ASCII_LOWER_Z = 122;
    std::cout << "Guess a letter: ";

    // Repeat until valid input
    while (true) {
        std::string input;
        std::cin >> input;

        if (input.size() == 1 &&
            input[0] <= ASCII_LOWER_Z &&
            input[0] >= ASCII_LOWER_A &&
            !letters_guessed.contains(input[0])) {
            return input[0];
        }
    }
}

This should work, but this function gets a little confusing inside of the while loop because there are so many conditions in the if statement. In fact, when we consider the fact that we also might want to include error messages that contain information about what is wrong with the player's input, this becomes enough functionality that we might want to consider splitting this up into a different function. For example, wouldn't it be nice if we could just write something like

    while (true) {
        std::string input;
        std::cin >> input;

        if (isValidLetter(input)) {
            return input[0];
        } else {
            // display error message somehow?
        }
    }

Ignoring the error messages for now, let's focus on writing this isValidLetter function. The first draft might look something like this:

// utilities.hpp
bool isValidLetter(std::string input, const std::unordered_set<char>& letters_guessed);
// utilities.cpp
bool isValidLetter(std::string input, const std::unordered_set<char>& letters_guessed) {
    static const char ASCII_LOWER_A = 97;
    static const char ASCII_LOWER_Z = 122;

    return input.size() == 1 &&
           tolower(input[0]) <= ASCII_LOWER_Z &&
           tolower(input[0]) >= ASCII_LOWER_A &&
           !letters_guessed.contains(input[0]);
}

This totally works, and could be easily slotted in with the most recently shown version of getLetterInput, but it still doesn't display error message information. To do this, we have to restructure the code a bit. There are two main directions we could go with this:

  1. Let the isValidLetter function display the error message directly to standard output, leaving the function signature the same.
  2. Change the function signature of isValidLetter to return information relating to the error message, and have a higher level function deal with standard output.

Generally when I think about decisions like this, I like my lower level functions to not have too many side effects because it lets them be usable in more situations. For example, if we decided to have isValidLetter output directly to standard output, then we would never be able to use the function when we didn't want that behavior. However, if we let the function just return information about why the input wasn't a valid letter, then the function is more general and can also be used when we don't care about the error message. Of course, we are only using this function in one place in this program, so approach 1 would be totally fine, but for the reasons I mentioned I will go with approach 2. Not only would this be better for a larger program, but it also gives me an excuse to talk about some helpful modern C++ language features and syntax.

Essentially, there are two pieces of information that this isValidLetter function needs to return to the getLetterInput function:

  1. Whether or not the input is a valid character to have been guessed
  2. If it is not a valid character, a string containing the reason why it is not valid.

While before we were using a bool to return true/false on whether or not the letter was valid, at this point it makes more sense to instead return the lowercase char representation of the valid guess, since that is what is needed later on in the program. In this regard, the function becomes less of an isValidLetter and more of a parseInput, so we'll change the name of the function to match the new behavior. In addition, we need to change the return type to allow us to return two different values. The easiest way to do this is to use the std::pair type. std::pair packages two values together into one object, which is helpful because a function can only return one value. Therefore, the new function signature, now named parseInput, will look like:

// utilities.hpp
#include <utility> // needed for std::pair
// ...
std::pair<char, std::string> parseInput(std::string input, const std::unordered_set<char>& letters_guessed);

Reading this function signature, we know that it takes an input string and the set of letters_guessed, and returns two values: a char and an std::string. To use a pair, you put the two types that you want to use inside of the angle brackets, like with std::array or std::vector or any other object that accepts template types.

Now, for the implementation:

// utilities.cpp

// Requirements:
// 1. only one character in the string
// 2. lowercase ascii letter between a and z
// 3. letter has not already been guessed
// Returns <input, error message> as a pair.
// Input will be the null character if there is an error.
// If no error, the string will be empty
std::pair<char, std::string> parseInput(std::string input, const std::unordered_set<char>& letters_guessed) {
    static const char ASCII_LOWER_A = 97;
    static const char ASCII_LOWER_Z = 122;

    if (input.size() != 1) {
        return {'\0', "Input must be one letter"};
    }

    char letter = tolower(input[0]);

    if (letter > ASCII_LOWER_Z || letter < ASCII_LOWER_A) {
        return {'\0', "Input must be a letter from a-z"};
    }

    if (letters_guessed.contains(letter)) {
        return {'\0', "That letter has already been guessed"};
    }

    return {letter, ""};
}

There are several things we should note about this code:

  1. To construct a pair, we put the two values separated by a comma inside of curly braces. There are other ways to construct pairs, but this is probably the most syntactically concise. Another common method is the std::make_pair function. (You can read about std::pair here)
  2. To specify that the input was invalid, we set the char return value to the null character \0, which is just the integer value 0. We technically could have used any arbitrary non-valid character, as long as we documented it.
  3. It is not relevant here, but may be relevant elsewhere when you are coding: if you want an object like std::pair that allows an arbitrary number of values, you can use std::tuple (read here).
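
As a small sketch of notes 1 and 3, here is what std::make_pair and std::tuple look like in practice. The helper names (makeExamplePair, makeExampleTuple, tupleLabelLength) are hypothetical and only exist for this illustration. The structured binding in the last function needs C++17 or newer.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

// Builds a pair with std::make_pair instead of curly braces.
std::pair<char, std::string> makeExamplePair() {
    return std::make_pair('a', std::string("ok"));
}

// std::tuple works like std::pair but can hold more than two values.
std::tuple<char, int, std::string> makeExampleTuple() {
    return std::make_tuple('b', 2, std::string("two"));
}

// Unpacks all three tuple values into named variables and uses one of them.
std::size_t tupleLabelLength() {
    auto [letter, count, label] = makeExampleTuple();
    (void)letter;  // unused here, named only to show the unpacking
    (void)count;
    return label.size();
}
```

Without unpacking, tuple values are read by position with std::get, for example std::get<1>(makeExampleTuple()) for the int in the middle.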

With this, we can rewrite the getLetterInput function to utilize this new parseInput function:

char getLetterInput(const std::unordered_set<char>& letters_guessed) {

    std::cout << "Guess a letter: ";

    // Repeat until valid input
    while (true) {
        std::string input;
        std::cin >> input;

        auto [parsed_input, error_msg] = parseInput(input, letters_guessed);

        if (parsed_input != '\0') {
            return parsed_input;
        }

        std::cout << error_msg << ". Try again: ";
    }
}

Note the syntax used to extract the values from the pair. This is called a structured binding and allows us to place the two values inside of distinct variables. We could have also written

auto parsed_pair = parseInput(input, letters_guessed);

if (parsed_pair.first != '\0') {
    return parsed_pair.first;
}

std::cout << parsed_pair.second << ". Try again: ";

using the first and second member variables to access the two values in the pair, but the structured binding is generally easier to read because you can give each value a descriptive name. It also extends more naturally to std::tuple with more than two values, since tuples do not have first and second members.

Note that in both examples we used the auto keyword. This is a way to avoid explicitly stating the type of a variable when declaring it, as long as the compiler can figure out the type on its own. This works here because the compiler knows the return type of parseInput, so it can give parsed_pair the (annoyingly long) type std::pair<char, std::string>. auto is also commonly used in range-based for loops, but I would recommend only using it when the type is unwieldy and you gain little clarity from writing it out explicitly. Also, as in the first example, it is required when using structured bindings.
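
Here is a hedged sketch of auto in a range-based for loop, where writing out the element type would add more noise than clarity. The functions countLetters and mostCommonLetter are hypothetical and not part of the hangman program. The loop also uses a structured binding, so it needs C++17 or newer.

```cpp
#include <map>
#include <string>

// Counts how many times each letter appears in a word.
std::map<char, int> countLetters(const std::string& word) {
    std::map<char, int> counts;
    for (char c : word) {
        counts[c]++;
    }
    return counts;
}

// Returns the most frequent letter, or '\0' for an empty word.
// On a tie, the alphabetically first letter wins because std::map is sorted.
char mostCommonLetter(const std::string& word) {
    const auto counts = countLetters(word);  // deduced as std::map<char, int>
    char best = '\0';
    int best_count = 0;
    // Each element is a std::pair<const char, int>; auto spares us writing that.
    for (const auto& [letter, count] : counts) {
        if (count > best_count) {
            best = letter;
            best_count = count;
        }
    }
    return best;
}
```

For the simple loops in this tutorial, writing char c explicitly is still clearer than auto c.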

There is actually still a bug in this code. You might have noticed that we never add the guessed letter to the set of guessed letters. To fix this, we could add the letter to the set once the value has been returned to the main function. However, it would be a little cleaner to do this inside of the getLetterInput function, since that function should handle everything to do with the player inputting a letter. If you were to try this, however, you would get a compile error because we are passing the set as a constant reference. This tells us that we chose the wrong parameter type earlier, since we have now found a reason to mutate the letters_guessed set inside of the function. This means that the final function signature for getLetterInput should be

// utilities.hpp
char getLetterInput(std::unordered_set<char>& letters_guessed);

Make sure to update this in both utilities.hpp and utilities.cpp.

4.5 getNumOccurrences(char letter, std::string word)

Now we need to write the function that finds how many times a letter is inside of a word. This is a relatively simple function, so I would definitely recommend trying to write it yourself before looking at the tutorial's solution.

With that being said,

// utilities.hpp
int getNumOccurrences(char letter, std::string word);
// utilities.cpp
int getNumOccurrences(char letter, std::string word) {
    int count = 0;
    for (char c : word) {
        if (c == letter) {
            count++;
        }
    }
    return count;
}

This is a relatively common problem you might encounter in many scenarios, and there is actually a standard library function that does basically this but much more generally. If you are curious about this, you can read about the std::count and std::count_if functions here.
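
As a quick sketch of what that looks like, the same function can be written with std::count from the <algorithm> header. The name getNumOccurrencesWithCount is made up so that it does not clash with the version above.

```cpp
#include <algorithm>
#include <string>

// Same behavior as getNumOccurrences, but using the standard library.
int getNumOccurrencesWithCount(char letter, const std::string& word) {
    // std::count returns a signed count type, so we convert it to int.
    return static_cast<int>(std::count(word.begin(), word.end(), letter));
}
```

This version also takes the word as a constant reference, which avoids copying the string on each call.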

4.6 bool isEntireWordGuessed(???);

Lastly, we need a function to determine if the entire word has been guessed. To do this, we need to know the word that is being guessed and all of the letters that have already been guessed. This leads us to the following function declaration and definition:

// utilities.hpp
bool isEntireWordGuessed(std::string word, const std::unordered_set<char>& letters_guessed);
// utilities.cpp
bool isEntireWordGuessed(std::string word, const std::unordered_set<char>& letters_guessed) {
    for (char c : word) {
        if (!letters_guessed.contains(c)) {
            return false;
        }
    }
    return true;
}
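
The loop above can also be expressed with std::all_of from <algorithm>. This is only an alternative sketch, and the name isEntireWordGuessedAllOf is hypothetical. It uses count(c) > 0 in place of contains(c), which gives the same answer but also compiles with compilers older than C++20.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>

// Returns true only if every letter of the word is in the guessed set.
bool isEntireWordGuessedAllOf(const std::string& word,
                              const std::unordered_set<char>& letters_guessed) {
    return std::all_of(word.begin(), word.end(), [&](char c) {
        return letters_guessed.count(c) > 0;
    });
}
```

Note that std::all_of returns true for an empty word, which matches what the loop version does.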

4.7 Putting it all Together

At this point, we have written all of the code needed for the hangman game, short of randomizing the word to guess. If you are having trouble getting everything to work together, you can reference the following code that I have written up until this point while writing this tutorial:

4.7.1 utilities.hpp

#pragma once

#include <unordered_set>
#include <string>
#include <utility>

std::string generateRandomWord();

void outputCurrentStatus(std::string word_to_guess,
                         int guesses_remaining,
                         const std::unordered_set<char>& letters_guessed);


std::pair<char, std::string> parseInput(std::string input, const std::unordered_set<char>& letters_guessed);

char getLetterInput(std::unordered_set<char>& letters_guessed);

int getNumOccurrences(char letter, std::string word);

bool isEntireWordGuessed(std::string word, const std::unordered_set<char>& letters_guessed);

4.7.2 utilities.cpp

#include <string>
#include <unordered_set>
#include <iostream>
#include <utility>
#include <cctype> // needed for tolower
#include "utilities.hpp"

std::string generateRandomWord() {
    return "hello";
}

/*
    Example output:
    ---
    You have 5 guesses left.
    You have already guessed: h, e, l, p
    h e l l _
*/
void outputCurrentStatus(std::string word_to_guess, 
                         int guesses_remaining, 
                         const std::unordered_set<char>& letters_guessed) {
    std::cout << "You have " << guesses_remaining << " guesses left.\n";

    std::cout << "You have already guessed: ";
    for (char c : letters_guessed) {
        std::cout << c << ", ";
    }
    std::cout << "\n";

    for (char c : word_to_guess) {
        if (letters_guessed.contains(c)) {
            std::cout << c << " ";
        } else {
            std::cout << "_ ";
        }
    }
    std::cout << "\n";
}

// Requirements:
// 1. only one character in the string
// 2. lowercase ascii letter between a and z
// 3. letter has not already been guessed
// Returns <input, error message> as a pair.
// Input will be the null character if there is an error.
// If no error, the string will be empty
std::pair<char, std::string> parseInput(std::string input, const std::unordered_set<char>& letters_guessed) {
    static const char ASCII_LOWER_A = 97;
    static const char ASCII_LOWER_Z = 122;

    if (input.size() != 1) {
        return {'\0', "Input must be one letter"};
    }

    char letter = tolower(input[0]);

    if (letter > ASCII_LOWER_Z || letter < ASCII_LOWER_A) {
        return {'\0', "Input must be a letter from a-z"};
    }

    if (letters_guessed.contains(letter)) {
        return {'\0', "That letter has already been guessed"};
    }

    return {letter, ""};
}

char getLetterInput(std::unordered_set<char>& letters_guessed) {

    std::cout << "Guess a letter: ";

    // Repeat until valid input
    while (true) {
        std::string input;
        std::cin >> input;

        auto [parsed_input, error_msg] = parseInput(input, letters_guessed);

        if (parsed_input != '\0') {
            letters_guessed.insert(parsed_input);
            return parsed_input;
        }

        std::cout << error_msg << ". Try again: ";
    }
}

int getNumOccurrences(char letter, std::string word) {
    int count = 0;
    for (char c : word) {
        if (c == letter) {
            count++;
        }
    }
    return count;
}

bool isEntireWordGuessed(std::string word, const std::unordered_set<char>& letters_guessed) {
    for (char c : word) {
        if (!letters_guessed.contains(c)) {
            return false;
        }
    }
    return true;
}

4.7.3 main.cpp

#include <iostream>
#include <string>
#include <cctype>
#include <unordered_set>

#include "utilities.hpp"

int main() {
    int guesses_remaining = 5;

    std::string word_to_guess = generateRandomWord();
    std::unordered_set<char> letters_guessed;

    std::cout << "Welcome to Hangman!\n";

    while (guesses_remaining > 0) {

        outputCurrentStatus(word_to_guess, guesses_remaining, letters_guessed);

        char letter_guessed = getLetterInput(letters_guessed);
        std::cout << "\n";

        int num_occurrences = getNumOccurrences(letter_guessed, word_to_guess);

        if (num_occurrences > 0) {
            if (num_occurrences == 1) {
                std::cout << "Correct! There is 1 " << letter_guessed << " in the word." << std::endl;
            } else {
                std::cout << "Correct! There are " << num_occurrences << " " << letter_guessed << "'s in the word." << std::endl;
            }

            if (isEntireWordGuessed(word_to_guess, letters_guessed)) {
                break;
            }
        } else {
            std::cout << "Incorrect! There are no "<< letter_guessed <<"'s in the word." << std::endl;
            guesses_remaining--;
        }
    }

    if (guesses_remaining == 0) {
        std::cout << "You are out of guesses! The word was " << word_to_guess << "." << std::endl;
    } else {
        std::cout << "Congratulations! You guessed the word correctly!\nThe word was: " << word_to_guess << std::endl;
    }

    return 0;
}

5. File Input

5.3.1 Problem Description

There are two main features we need to add:

  1. Reading words from a file and randomly selecting one for guessing
  2. Allowing the user to specify the file to read from as a command line argument

To clarify what this means, let's say there is an executable hangman.out and a file words.txt with the following content:

hello
gamer
suspicious

The player should be able to run the following terminal command

./hangman.out words.txt

This should run the game with each of the three words in the words.txt file having a 1/3 chance of being selected.

We will first focus on feature 1, making the program always read from a words.txt file that we assume exists, and then we will implement feature 2 and let the user specify the file, checking to make sure it actually exists.

5.3.2 File input

We will first implement the file input inside of the generateRandomWord function we wrote earlier.

Much like how we use the <iostream> standard library package to work with the input and output streams, we use the <fstream> package to work with file streams. First, we'll create an std::ifstream object which we will use to read from a file. (Note: the i in ifstream means input. There is an alternate ofstream which deals with file output, which we won't touch here.)

std::string generateRandomWord() {
    std::ifstream file("words.txt");
    std::vector<std::string> words;

    std::string line;
    while (std::getline(file, line)) {
        words.push_back(line);
    }
}

After we open the file stream, we create a vector of strings called words in which we will store all of the words we find in the file. Then, we use the std::getline function to read from the file line by line. Essentially, we give std::getline the input stream (file) and the string to store the next line in (line). The while loop keeps going until it has gone through all of the lines in the file, and in each iteration we simply append the current line to the list of words. All that is left is to select one of these words randomly. We can adapt the rng code we wrote earlier, when we were generating a random character, to this scenario. The only difference is that we need to generate a random index into the words vector.

std::string generateRandomWord() {
    std::ifstream file("words.txt");

    std::vector<std::string> words;

    std::string line;
    while (std::getline(file, line)) {
        words.push_back(line);
    }

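    // The random device and engine are static so they are only set up once.
    static std::random_device rd;
    static std::mt19937 mt(rd());
    // The distribution is not static because its range depends on how many
    // words were just read. This assumes the file had at least one word.
    std::uniform_int_distribution<int> dist(0, static_cast<int>(words.size()) - 1);

    return words[dist(mt)];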
}

With this, as long as you have created the words.txt file (and included <fstream>, <vector>, and <random> in utilities.cpp), you can verify that the program correctly selects a random word from the file.
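
If you want to try the reading loop without creating a file first, here is a minimal sketch that reads from any input stream. The names readWords and readExampleWords are hypothetical, and skipping empty lines is an extra choice of mine that the code above does not make. Because std::ifstream is also a std::istream, readWords would accept an open file in exactly the same way.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one word per line from any input stream, skipping empty lines.
std::vector<std::string> readWords(std::istream& in) {
    std::vector<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            words.push_back(line);
        }
    }
    return words;
}

// Uses a string stream as a stand-in for words.txt.
std::vector<std::string> readExampleWords() {
    std::istringstream fake_file("hello\ngamer\n\nsuspicious\n");
    return readWords(fake_file);
}
```

Skipping empty lines means a stray blank line in the file can never be chosen as the word to guess.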

5.3.3 Command Line Arguments

Up until this point, our main function's signature has been very boring. However, that is about to change. In order to view any command line arguments that the player passes into the program, we have to modify the main function's signature to look like the following:

int main(int argc, char** argv) {
    //
}

Essentially, argv is an array of strings that contains the program name in index 0 and all arguments in subsequent positions, and argc tells you the size of the argv array so you know which indices you can safely access.

As an extra note, char** argv could also be written as char* argv[] since they are essentially the same thing.
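
To make the relationship between argc and argv concrete, here is a small sketch that copies the arguments into a vector of std::string. The helper name collectArgs is hypothetical. It is simply a convenient way to work with the arguments as normal C++ strings.

```cpp
#include <string>
#include <vector>

// Copies the C-style argument array into a vector of std::string.
// Index 0 holds the program name, just like in argv.
std::vector<std::string> collectArgs(int argc, char** argv) {
    std::vector<std::string> args;
    for (int i = 0; i < argc; ++i) {
        args.push_back(argv[i]);
    }
    return args;
}
```

Inside main you would call it as collectArgs(argc, argv), and the size of the returned vector is always equal to argc.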

Since our expected input looks like

./hangman.out words.txt

argc will be 2 and argv[0] will be "./hangman.out" and argv[1] will be "words.txt". Therefore, we can verify this input with the following code:

// main.cpp
int main(int argc, char** argv) {
    if (argc != 2) {
        std::cout << "Expected use: \"./${executable name} ${words file}\"\n" 
                  << "    where the words file contains one possible word per line in plain text.\n";
        return 1; // stop here instead of reading argv[1], which may not exist
    }

    std::string file_name = argv[1];

    int guesses_remaining = 5;

    std::string word_to_guess = generateRandomWord(file_name);
    //...
}
// utilities.cpp
std::string generateRandomWord(std::string file_name) {
    std::ifstream file(file_name);

    if (!file.good()) {
        std::cerr << "Cannot open " << file_name << " for reading. Does it exist?" << std::endl;
        exit(1); // exit is declared in <cstdlib>
    }

    std::vector<std::string> words;
    // ...
}

Note that we use std::cerr instead of std::cout to output to standard error instead of standard out, and that we use exit(1) to prematurely end the program with a code of 1, which, since it is non-zero, signifies that an error occurred. Calling exit(1) here produces the same exit code as returning 1 from the main function.
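
If you would rather keep the decision to exit inside main, in line with the earlier preference for lower level functions without side effects, one possible alternative is to report the failure through the return value. This is only a sketch: loadWords is a hypothetical function that is not part of the tutorial's code, and std::optional needs C++17 or newer.

```cpp
#include <fstream>
#include <optional>
#include <string>
#include <vector>

// Returns the words in the file, or std::nullopt if the file cannot be opened.
// The caller decides whether to print an error and how to exit.
std::optional<std::vector<std::string>> loadWords(const std::string& file_name) {
    std::ifstream file(file_name);
    if (!file.good()) {
        return std::nullopt;
    }

    std::vector<std::string> words;
    std::string line;
    while (std::getline(file, line)) {
        words.push_back(line);
    }
    return words;
}
```

With this design, main can check has_value() on the result, print the error with std::cerr, and return 1 itself.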

6. Conclusion

At this point, congratulations! You have completed the tutorial and hopefully were successful in making the hangman game in C++. If you notice any typos or errors in the tutorial, it would be very helpful if you pointed them out or even submitted a PR to fix them on your own.

If you want to try to expand on this a little bit, you can try adding some of the following features:

  1. Menu inside of the program to create your own word files
  2. Allow words with dashes
  3. Allow phrases instead of just singular words
  4. ASCII art showing the progression of the hangman
  5. Let the player select a level of difficulty