When using the character data type, we treat each individual symbol as one unit of data. For example, the phrase "abc123" could be stored as six separate characters, but doing so would require six separate storage locations (variables). If we want to store the entire phrase as a single unit of data in just one storage location, we must use a data type that allows multiple characters to be stored under one label. A data type that can be used for this in both C and C++ is referred to as a "C-string" (a chain of characters).

C++ allows programmers to define data objects from a class named string. The C programming language has no formal string data type. Both C and C++ allow programmers to implement strings using arrays of character storage locations that can be referenced collectively with one label or individually by subscript.

Individual character constants (such as 'a' or '5') are written in these languages enclosed in apostrophes (a.k.a. "single-quoted"). Character storage is declared one character at a time using the data type named char. String constants (such as "Hello" or "The answer is:") are written as groups of symbols enclosed in double quotes.

String storage can be done either with string objects or with arrays of characters (known as C-strings). These arrays must be declared with enough elements to store all of the characters in the string plus one additional special character used to indicate the end of the string. This "end-of-string" character is denoted as '\0' and is often referred to as the "null character". Many string-manipulating functions use it to determine where a string ends. This is necessary because a string might not fill all of the character elements available in the array that was declared to store it. The '\0' marks the end of the string data and prevents the elements after it from being processed as part of the string.
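
As a minimal sketch of how the null character separates the string data from the unused part of the array, the function below counts characters until it reaches '\0'. The name countUntilNull is our own illustration, not a library function; the standard strlen function does the same job.

```cpp
#include <cstddef>

// Hypothetical helper (not part of any library): counts the characters
// stored in a C-string by walking the array until '\0' is found.
std::size_t countUntilNull(const char text[])
{
    std::size_t count = 0;
    while (text[count] != '\0')   // stop at the end-of-string character
        ++count;
    return count;                 // the '\0' itself is not counted
}
```

For an array declared as char word[10]="Hello"; this function would report 5 characters even though the array occupies 10 elements, because word[5] holds the '\0'.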

STRING FUNCTIONS

Both the C and C++ languages provide a variety of pre-written functions to help programmers manipulate strings. Some of them copy string data. Others extract sub-strings from larger strings. And others help to attach strings together to form larger strings. Most of these functions receive string input and produce string output. However, some of them involve strings in some way, but don't use string input or produce string output. From the point of view of terminology, any function that involves strings in any way is referred to as a "string function". For more details on string functions, see Chapter 10 in your textbook.

STRING DECLARATION

In C++, C-string arrays can be declared in three different ways, depending on whether
you know the contents of each element in advance and on how you choose to list them.

Option 1 - String Declaration without Initialization

If you do not know the contents of the array in advance of the program's
execution, then you would declare the array in the following manner:

char label[size];

The statement used to declare an array is written in a manner similar to
other variable declarations. The data type is written first, followed by the
variable label, and finally an integer constant (or symbolic constant) in
brackets indicating the size (or quantity of elements to be allocated).
All elements will be of char data type. The identifiers used to label an
array must conform to the same rules as any other identifier in C++ and cannot
duplicate a name already in use by a scalar. An array to store a person's last
name no longer than 15 characters would be declared with the statement:

char LAST[16];

This would declare sixteen char storage locations identified as:
LAST[0], LAST[1], LAST[2], through LAST[15]. The extra one would be provided
to hold the end-of-string character '\0' that is appended to the end of all
string data by many commands in C++.

Option 2 - String Declaration with String Constant Initialization

The act of storing a string into the newly created string variable (array
of characters) could be accomplished in a variety of ways. If the value (data)
was known at the time the array was declared, the array could be both
declared and initialized in the same statement as follows:

char LAST[16]="Andrews";

This would declare sixteen char storage locations identified as:
LAST[0], LAST[1], LAST[2], through LAST[15]. The first seven elements (LAST[0] through
LAST[6]) would be assigned the characters in the name "Andrews", and
LAST[7] would be assigned the end-of-string character '\0'. The remaining
elements of the array (LAST[8] through LAST[15]) would also be set to '\0',
because C++ fills the unlisted elements of an initialized array with zeros.
In any case, most functions that manipulate strings stop processing
the array data when they encounter the first '\0'.

Option 3 - String Declaration with Element Initialization

Because arrays also can be declared and initialized in a manner that involves
listing the values of individual elements, the LAST array also could be declared
and initialized as follows:

char LAST[16] = {'A','n','d','r','e','w','s','\0'};

This approach would have the same effect as the declaration above, but is not
often used in place of the easier form above.
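
To check that the two initialization styles store the same string, we could compare them with the standard strcmp function, which returns 0 when two C-strings match. The function name sameLastName and the two array names below are made up for this sketch.

```cpp
#include <cstring>

// Hypothetical check: declares the same name both ways and compares them.
bool sameLastName()
{
    char fromConstant[16] = "Andrews";                            // Option 2
    char fromElements[16] = {'A','n','d','r','e','w','s','\0'};   // Option 3
    // strcmp returns 0 when both strings hold identical characters
    return std::strcmp(fromConstant, fromElements) == 0;
}
```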

STRING STORAGE

String Storage as Individual Elements

Another (more difficult) approach to storing a name in the string LAST could be
accomplished after the array was declared (as shown in Option 1 above) using
individual assignment statements, such as:

LAST[0]='A'; LAST[1]='n'; LAST[2]='d'; LAST[3]='r';
LAST[4]='e'; LAST[5]='w'; LAST[6]='s';

But the end-of-string character '\0' would also have to be assigned manually as:

LAST[7]='\0';

String Storage as an Entire String Constant

Novice C++ programmers are often surprised to discover that a string cannot
be directly assigned to a C-string variable using a statement such as:

LAST="Andrews"; /* Example of a typical coding error */

Remember that C-string variables are not scalar storage locations, but rather
arrays of characters. In the C++ language, a reference to an array using just its
label (LAST in this example) is interpreted by C++ as a reference to the
address of the first element of the array (in other words &LAST[0]).
It would make no sense to assign a string constant to an address. Thus a function
was developed to help copy string data into a C-string storage location.
The name "Andrews" can be copied into the C-string variable LAST using
the "string copy" function as follows:

strcpy (LAST, "Andrews"); /* Store string data in LAST array */

This would work much like the combined declaration and initialization statement
shown in Option 2 above. The first seven elements (LAST[0] through LAST[6]) would
be assigned the characters in the name "Andrews", and LAST[7]
would be assigned the end-of-string character '\0'. The remaining elements of
the array would be left unchanged by strcpy, so their contents are unknown unless
they were assigned earlier.
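
Because strcpy does not check the size of the destination array, one cautious approach is to test the length of the source string first. The sketch below assumes a helper of our own named storeName, with a capacity parameter giving the number of elements in the destination array; neither is part of the standard library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies source into destination only when the
// characters plus the '\0' fit within capacity elements.
bool storeName(char destination[], std::size_t capacity, const char source[])
{
    if (std::strlen(source) >= capacity)
        return false;                   // no room for the string and its '\0'
    std::strcpy(destination, source);   // copies the characters and the '\0'
    return true;
}
```

With char LAST[16]; declared, the call storeName(LAST, sizeof(LAST), "Andrews") would store the name and return true, while a name of 16 or more characters would be rejected and LAST would be left untouched.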

String Storage of Sub-Strings

A related function was developed to help copy only a portion of a source
string into another C-string storage location. For example, consider the following
declarations:

char STRING1[10]="Nathan";
char STRING2[10];

STRING1 was initialized to have the contents "Nathan". STRING2 was declared,
but given no initial value. If we now wanted to copy only the first three characters
("Nat") from STRING1 to STRING2, we could use the special "limited string
copy" function as follows:

strncpy (STRING2, STRING1, 3); /* Copy at most 3 characters */

Notice the difference in the name of the function. It has an 'n' in the middle of its
name. Also notice the addition of a third actual parameter (3) in the call. This indicates
the maximum quantity of characters to copy. If the function encounters a '\0' in STRING1
before reaching that count, it stops copying from STRING1 and fills the rest of the count
in STRING2 with '\0' characters. If it reaches the count first, as it does here, it does not
write a '\0' at all, so the programmer must add a manual statement to write a '\0' to the end of
STRING2, as in:

STRING2[3]='\0';
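
The limited copy and the manual termination can be packaged together so that the second step is never forgotten. This is only a sketch: copyPrefix is a name we made up, and it assumes the destination array has at least count+1 elements.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most count characters from source and
// then terminates the result. Assumes destination holds count+1 elements.
void copyPrefix(char destination[], const char source[], std::size_t count)
{
    std::strncpy(destination, source, count);  // may not write a '\0'
    destination[count] = '\0';                 // manual end-of-string
}
```

Using the declarations above, copyPrefix(STRING2, STRING1, 3); would leave "Nat" in STRING2.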

KEYBOARD ENTRY OF C-STRINGS

Although the entry of simple C-strings at the keyboard can be handled using the cin object with the extraction operator (>>),
such an approach is risky because the operator stops reading at the first whitespace character and does nothing by default to prevent the user from entering more characters than the character array can hold. To avoid these problems, use
the getline member function of the cin object, as in:

cin.getline(NAME,SIZE);

The first argument (NAME) in the example above is the identifier of the character array.
The second argument (SIZE) limits user input so that no more than SIZE-1 characters are stored, leaving room for the '\0' that getline appends.
A third optional argument can be included to specify a delimiter (other than the default '\n') to signal the end of the input. Beware that when you specify a delimiter other than '\n' in the getline function, any '\n' entered in the input would be stored in the array just like any other character and would not terminate the input of the string.

NOTE: The getline member function of the cin object should not be confused with the global getline function, which is used to input string objects. The source code below demonstrates how to define a string object and read data into it from the keyboard using the global getline function:

OUTPUT OF C-STRINGS

Output of C-strings can be accomplished in two different ways in C++.
One method involves treating the string data as a single unit and using the cout
stream object, as in:

cout << STRINGNAME;

where STRINGNAME represents the name of a C-string variable (character array).

The use of the cout object in this way depends on the C-string variable
having been properly stored with a '\0' (end-of-string) character terminating it. Without
that "null character", cout would have no way of determining how long the string was,
and would continue to access data beyond the boundaries of the character array declared
to hold the string.

The other method for outputting a C-string involves treating the string data formally as
an array of characters and outputting each character separately in a loop as individual
elements of the array. For example, after defining the C-string (character array)

char WORD[10]="Hello";

we could display the five characters within that string with a counting loop as in

for (C=0; C<5; C++) cout << WORD[C];

Note the use of an integer variable (C) to act as a subscript to each element of the
array during each pass of the loop. The use of a counting loop for this purpose requires
that we know the size of the string we plan to display. If we do not know this, we can
determine it using the "string length" function as in:

for (C=0; C<strlen(WORD); C++) cout << WORD[C];

Like most other string functions, the strlen function relies on the
string having been properly terminated with a null character. The expected presence of the
null character would allow us to employ a sentinel controlled loop rather than a counting
loop to output the string in the following manner:

C=0;
while (WORD[C]!='\0')
{
cout << WORD[C];
C=C+1;
}

Of course this still relies on the string having been properly terminated with a null character.
We can combine both control methods into a hybrid control method using the logical
"and" operator (&&) to produce a more reliable output technique. The subscript is tested
first so that WORD[C] is never examined once C reaches SIZE:

C=0;
while (C<SIZE && WORD[C]!='\0')
{
cout << WORD[C];
C=C+1;
}

where SIZE represents a known size of the array used to store the string.