Working with C-strings in C++ - Part 1

Published on 14 Apr 2020

Recently I had to do an assignment that required working with strings and character arrays. I wanted to do a particular thing with a character array and treat it as a string, but wasn't so far into my textbook, so I took to Stack Overflow and didn't learn anything of value.

I went back to my trusty textbook for last semester's Introduction to Programming II module, called Problem Solving with C++ 10th edition by Walter Savitch. I highly recommend it if you're into learning a robust programming language, by the way. This post refers to C++ version 11.

Strings are interesting things, since they are treated as an array of base type char in C++, which it inherited from the C language. I never knew too much about strings and the technicalities surrounding them, since I mostly used Javascript which is a dynamically typed language.

So, what makes a C-string different? Why are C-strings not the same as ordinary arrays of base type char?.

The null character

The most interesting thing I learned was that C-string variables included a null character at the end of the array. But why is this necessary?

Firstly, it is used as a sentinel to indicate the end of the character array. When you read an array of characters and the loop reaches the null character, you know that you've reached the end of the string. It also occupies one element in the array, which means that your declared array size must be one element greater than the number of elements you wish to include.

When you declare a C-string variable, it is wise to add one to the maximum size to ensure that the null character is included. You declare it the same way as you would a normal array of type char, however you can omit the array size. C++ will automatically add the null character to this newly initiated C-string variable, adding 1 more to the length of the array. You can include a size, but you would need to be aware of the fact that if your string is mutated with more characters, you might lose the null character and some weird stuff will possibly happen to your code!

Assigning values and using the comparison operator

In my fresh C++ programmer mind, I thought assigning values to C-strings or comparing two values was as simple as using the assignment and comparison operators as one would with other data types such as int or double. Totally not true! You can assign a value when you initialise the variable using the assignment operator, but try to change the entire array later on this way and you'll get some bizarre results...

Instead, you should use the member function strcpy(someString, "hey"). The problem with this is that it doesn't take into account the fact that the new value, "hey", could be much larger than the declared size of the string, therefore you could potentially lose characters when running this function on a string. Thankfully, there is a safer version of this, strncpy. This function takes an extra argument which gives the maximum number of characters to copy. Just make sure to leave room for the null character ;)

Sometimes you need to compare two C-string values, but I just mentioned you cannot really use the equality (or comparison) operator in this case. It's going to return strange results. Instead, use the strcmp(str1, str2) member function. It compares the characters in the C-string arguments a character at a time. It's basically like running a for-loop and comparing each indexed element with that of the other string, but a bit more involved. If at any point the numeric encoding of the first string is less than the numeric encoding of the second string, the testing stops and returns a negative number. If the encoding is greater, it returns a positive number. If they're both the same, it returns a zero.

If you use the strcmp function as a Boolean expression, such as in an if-else statement to test for equality, then the nonzero value will evaluate to true if the two strings are different and to false if the value is zero. This is kind of weird inverted logic, but still, makes it a bit easier to compare two strings so you do not have to write your own function to do so!

Do I need any library files to use C-strings?

To simply declare them, initialise them and not do any operations like the ones I just mentioned, no. If you do need functions like strcpy(targetString, SourceString) or strcmp(str1, str2), then you definitely need to #include at the beginning of your code.

In Part Two, I will discuss my understanding of additional functions, pitfalls and other interesting things I've learned about using C-strings in C++. Stay tuned!