C For C++ Programmers


Before there was C++ there was C.  Developed at Bell Laboratories in the early 1970's, C was used for over 95% of the code in the UNIX operating system kernel.  By the end of the 70's, C compilers were available for most mini and microcomputers.  Superior to BASIC for applications requiring efficient code generation, C became the de facto standard for application software development on minicomputers and that new phenomenon, the PC. Meanwhile, Bjarne Stroustrup, a Bell Labs research computer scientist, released a prototype language called ``C With Classes.''   Originally a preprocessor that generated C code, this effort eventually produced C++.  All the while, C itself continued to dominate application development for personal computer software and real-time control. Both C and C++ are used today for application software, systems software, and embedded systems.

C++ can be viewed as C with richer support for object oriented development.  Conversely, C can be viewed as a primitive variant of C++.   In any event, most legal C programs are also legal C++, and the C++ compiler is able to translate them.  The few exceptions use features considered obsolete in C; these will not be discussed.

Learning C means stripping away some of the comfortable assumptions under which C++ programmers work: while C is simpler than C++, C also makes it easier to defeat type checking, scramble memory, and use pointers in an undisciplined way.   The remainder of this document assumes you know C++ moderately well, and discusses C in this context.

C Control Structures

Free Functions

In C++, one uses classes to structure the system into components. Most of the functions in C++ are members of a class, and thus tightly related to the class's objects. However, C++ also supports free functions: functions that are not part of a class. The free function you are most familiar with is main, which gets things going by creating, initializing, and activating one or more objects. Other examples are numerical functions such as sin and log, which really don't belong to any class.

In C, all functions are free functions. There is no way in C to bind a function to a data structure or object. Any such relationships depend on the discipline of the programming team. The data structures manipulated by C functions must be either global variables or explicit parameters: C has no implicit argument like the this pointer of C++.


C does not have references or reference parameters. All arguments to C's free functions are passed by value. To achieve the effect of call by reference, pointers are used. Consider the following examples:

C++ version:

void swap(int& x, int& y) {
    int t ;

    t = x ; x = y ; y = t ;
}

    . . .

swap(a, b) ;

C version:

void swap(int *x, int *y) {
    int t ;

    t = *x ; *x = *y ; *y = t ;
}

    . . .

swap(&a, &b) ;

The first version is standard C++, using reference parameters to exchange two integers.  The second is the equivalent C version.  As C always passes by value, we have to pass pointers to the two integers being swapped, and use the indirection operator * to access the values.  In the C call, we use the ``address of'' operator & to create the necessary pointers to a and b. In this case, the C++ code is much cleaner, as the pointers are manipulated by the compiler ``under the hood.''

Return Values

C functions are severely constrained as to the types of values they can return:

Scalars are simple values like integers, floating point numbers, and characters (not strings!) .
Structures are like classes with all public data and no operations. Any piece of code can select and modify the elements of a structure.
Pointers in C can address simple values, structures, or arrays.
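The three kinds of return value can be sketched as follows. The structure point and the function names are invented for illustration and are not part of any library. Note that an array cannot be returned by value, which is why functions that produce arrays or strings hand back a pointer instead.

```c
#include <stddef.h>

/* A small structure; the name and fields are illustrative only. */
struct point {
    int x ;
    int y ;
} ;

/* Returns a scalar. */
int sum_coords( struct point p ) {
    return p.x + p.y ;
}

/* Returns a structure by value: the caller receives a copy. */
struct point make_point( int x, int y ) {
    struct point p ;

    p.x = x ;
    p.y = y ;
    return p ;
}

/* Returns a pointer: the address of the largest array element,
   or NULL if the array is empty. */
int *largest( int *values, int count ) {
    int *best = NULL ;
    int i ;

    for ( i = 0 ; i < count ; i++ ) {
        if ( best == NULL || values[i] > *best )
            best = &values[i] ;
    }
    return best ;
}
```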

Function Miscellany

There is no way in C to overload a function (i.e., to have two or more functions with the same name but different argument lists). In a similar vein, there are no template functions in C.
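The usual workaround is one function per argument type, each with its own name. The sketch below uses invented names (max_int, max_double). The standard library follows the same pattern with abs, labs, and fabs.

```c
/* One function per argument type, distinguished by name
   because C cannot overload. */
int max_int( int a, int b ) {
    return ( a > b ) ? a : b ;
}

double max_double( double a, double b ) {
    return ( a > b ) ? a : b ;
}
```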

Other Control Structures

The sequencing, iteration, and decision control structures in C are the same as in C++.  That is, the while and for loops, as well as the if and switch statements, are identical.  However, the types of values and structures you can use in decisions and iterations are much simpler than in C++.

C Data Structures

C data structures are fewer and simpler than those of C++.  Most significantly, there are no classes in C -- the closest we can come to a class is a struct , which is a collection of related data items. The following subsections describe the basic C types and data structures.


The predefined scalar types are the same in C and C++, namely:

Floating point
Numbers with both an integer and fractional component, represented in decimal or scientific notation.  The two subtypes are double (double precision) and float (single precision).  For most computations, double precision is preferable.
Example constants
4.0  3.14159  6.023e23
Integral values.  The default int is the ``natural'' size for the CPU -- in most cases 32 bits.  There are two variants, short int (or short) and long int (or long), which are typically 16 bits and 32 or 64 bits, respectively; the language guarantees only minimum sizes. One can also specify non-negative integers by prefixing a declaration with unsigned.
Characters (type char ) are just very small (1 byte) integers. Though they usually hold a single character, there is nothing to prevent their use as small numbers (other than a desire for sanity).
Example constants
1234  'x'
Note that character constants like 'x' are really small integers whose value is the character's ASCII code. Some non-printing characters have special escape sequences:
\n Newline
\r Return
\t Tab
\ooo  Character with octal ASCII code "ooo".
Because characters are really just small integers, you'll often see code where a character is assigned to an integer. This is considered normal in C, even though it looks a bit odd.
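As a small sketch of character arithmetic, the two functions below treat characters as the integers they are. The function names are invented for illustration, and the upper-casing code assumes an ASCII-style character set in which the letters are contiguous. In real code the toupper function from ctype.h does this job.

```c
/* Converts a decimal digit character to its numeric value,
   or returns -1 if the character is not a digit. */
int digit_value( char c ) {
    if ( c >= '0' && c <= '9' )
        return c - '0' ;
    return -1 ;
}

/* Upper-cases a lower-case letter and returns other characters unchanged.
   Assumes contiguous letter codes, as in ASCII. */
int to_upper_ascii( int c ) {
    if ( c >= 'a' && c <= 'z' )
        return c - 'a' + 'A' ;
    return c ;
}
```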


Arrays in C are declared by giving their size , where the subscripts range from 0 to (size-1). For example, the fragment:

double vector[20] ;
char name[16] ;

shows the declaration of a 20-element array of doubles, vector, and a 16-element array of characters, name. The legal subscripts for the two arrays are 0 to 19 and 0 to 15, respectively.

Multiple dimension arrays (matrices) are treated as arrays of arrays in C.  The following declaration defines a table of 10 character arrays, where each character array contains 16 characters:

char table[10][16] ;

To access the jth character in the ith row, you would write:

table[i][j]

In C, an array's name does not refer to the array's contents! Instead, the name is the address of the first array element (0)! It is hard to over-emphasize this point, as it is the source of many subtle C errors. Consider the following declarations:

char nameA[16] ;
char nameB[16] ;

It is perfectly legal in C to write:

if ( nameA == nameB )

However, the comparison is not between the two 16-character strings, but between the addresses of the two arrays. Given that ``nameA'' and ``nameB'' are different arrays, this comparison will always be false.
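To compare what two character arrays contain, the characters themselves have to be examined. A minimal sketch, assuming both arrays hold properly terminated strings and using strcmp from string.h (same_text is an invented name):

```c
#include <string.h>

/* Returns 1 if the two strings hold the same characters, 0 otherwise.
   Both arguments must be terminated strings. */
int same_text( const char *a, const char *b ) {
    return strcmp( a, b ) == 0 ;
}
```

With this, same_text( nameA, nameB ) compares contents, where nameA == nameB compares only addresses.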

The rationale for this seemingly strange behavior is that it supports passing arrays by reference. When an array is named as an argument, a pointer is actually passed. Consider a function to change all lower-case letters in a string to 'X', and a call to that function using nameA above:

void letter_to_X( char *string, int length ) {
    int i ;

    for ( i = 0 ; i < length ; i++ ) {
        if ( string[i] >= 'a' && string[i] <= 'z' )
            string[i] = 'X' ;
    }
}

    . . .

letter_to_X( nameA, 16 ) ;

There are several things to note:

  1. The argument is a pointer to a character.  This is in line with the notion that an array name's value is the address of its first element.
  2. The length has to be passed in explicitly : there is no way to determine the size of an array from its address alone.
  3. The assignment string[i] = 'X' uses a subscript with a pointer.  This is perfectly normal in C -- the compiler simply uses the pointer as the base address for its subscript calculations.
  4. The assignment changes the contents of nameA because we have a pointer to the array rather than a copy of the array contents.
  5. There is no protection against subscript errors. If the length argument is incorrect, or if the algorithm is wrong, it is possible to access and modify data that is not part of the array.  Unlike the lists, sequences, and strings in the RogueWave library, subscript errors are not caught in C.  Such errors are the source of many subtle failures in C programs.
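One common way to reduce length mistakes is to compute the element count where the array declaration is still visible and pass it along. The sketch below is illustrative: ARRAY_LENGTH and count_lower are invented names. The macro works only on a real array, not on a pointer parameter, where sizeof would give the size of the pointer.

```c
/* Number of elements in a true array. Valid only where the array
   declaration itself is visible, never on a pointer parameter. */
#define ARRAY_LENGTH(a) ( sizeof(a) / sizeof((a)[0]) )

/* Counts the lower-case letters among the first 'length' characters. */
int count_lower( const char *string, int length ) {
    int i ;
    int count = 0 ;

    for ( i = 0 ; i < length ; i++ ) {
        if ( string[i] >= 'a' && string[i] <= 'z' )
            count++ ;
    }
    return count ;
}
```

A caller that declared char nameA[16] could then write count_lower( nameA, (int) ARRAY_LENGTH(nameA) ) instead of repeating the constant 16.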


In C, character string constants are constant arrays of characters. As in C++, such strings are enclosed in double quotes. They are constant because the text of the string is clearly spelled out between the quotes. They are arrays , as they occupy contiguous memory, with the characters numbered from zero. It is even possible (if a bit silly) to subscript strings; for example:


is the character 'd'.

The C (and C++) convention is that all strings are terminated by a null character ('\0'), called NULL in the rest of this document (not to be confused with the NULL pointer constant). All constant strings have a NULL appended by the compiler. All strings that are constructed in character array variables should have a NULL appended. Most string manipulation functions use the NULL as a marker to terminate processing. If the NULL is missing, unrelated areas of memory may be modified and corrupted.
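A short sketch of both points, with a string constant and function names chosen purely for illustration: a string built by hand needs its terminator stored explicitly, and a string constant can be subscripted like any other array.

```c
/* Builds the string "hi" by hand in buffer, which must have room
   for at least 3 characters, and returns buffer. */
char *build_hi( char *buffer ) {
    buffer[0] = 'h' ;
    buffer[1] = 'i' ;
    buffer[2] = '\0' ;   /* without this, the string has no defined end */
    return buffer ;
}

/* Subscripting a string constant: element 3 of "abcdef" is 'd'. */
char fourth_letter( void ) {
    return "abcdef"[3] ;
}
```

The constant "abcdef" occupies seven characters of memory, not six, because of the terminator the compiler appends.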

Because string constants are arrays, the constraints from the previous section apply. In particular, it is impossible in C to assign a string constant to a character array directly, as the ``value'' of a string constant is the address of its first character. Instead, there are "library functions" to handle string copying, etc., the interface to which is available via:

#include <string.h>

Here are a few of the common functions:

char *strcpy(char *to, const char *from)
Copy the string from to string to. The return value is the to pointer, which can then be passed to other functions. It is up to the caller to ensure that:
  1. The from string is properly terminated with a NULL.  The compiler ensures this for string constants.
  2. The to string has enough space allocated to hold all the characters in from plus the terminating NULL !


(void) strcpy(nameA, "Joe Blow") ;

Note that the string is copied to an array, but we already know the array name is a pointer to its first element. Also, the cast of the return value to (void) says that we know there is a return value but we're ignoring it.

char *strcat(char *to, const char *from)
Append string from to the end of string to. The return value is the to pointer, which can then be passed to other functions. It is up to the caller to ensure that:
  1. Both from and to are properly terminated with a NULL.
  2. The to string has enough space allocated to hold all the characters in the combined strings plus the terminating NULL.


(void) strcat(nameA, ", Jr.") ;

Adds the string ", Jr." to the end of nameA. Again, casting the return value to (void) says we're ignoring the return value.

size_t strlen(const char *s)
Returns the length of string s as a size_t, an unsigned integer type. Note that this is the number of characters up to but not including the NULL byte. This may be less than the space actually allocated (for example, if a string does not fill up a character array). If the string was not terminated with a NULL, then strlen will continue scanning through memory until either it finds a NULL by coincidence or it creates a memory violation.


if ( strlen(nameA) < 8 ) {
    (void) strcpy( nameB, nameA ) ;
}

The code above copies the string in nameA to nameB if the string in nameA is less than eight characters long.
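Since strcpy and strcat trust the caller completely, one defensive habit is to wrap them in helpers that check the destination size first. The following is only a sketch under that assumption. copy_checked and append_checked are invented names, not library functions, and the caller must pass the true size of the destination array.

```c
#include <string.h>

/* Copies 'from' into 'to', which has room for 'size' characters
   including the terminator. Returns 1 on success, or 0 (leaving
   'to' untouched) if the string would not fit. */
int copy_checked( char *to, size_t size, const char *from ) {
    size_t needed = strlen( from ) + 1 ;   /* characters plus terminator */

    if ( needed > size )
        return 0 ;
    memcpy( to, from, needed ) ;
    return 1 ;
}

/* Appends 'from' to the terminated string already in 'to'.
   Returns 1 on success, or 0 (leaving 'to' untouched) if the
   combined string would not fit in 'size' characters. */
int append_checked( char *to, size_t size, const char *from ) {
    size_t used = strlen( to ) ;
    size_t needed = strlen( from ) + 1 ;

    if ( used >= size || needed > size - used )
        return 0 ;
    memcpy( to + used, from, needed ) ;
    return 1 ;
}
```

For the 16-character nameA, copying "Joe Blow" and appending ", Jr." both succeed (13 characters plus the terminator), while a further append of " III" is refused rather than overrunning the array.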


As mentioned previously, a C structure is essentially a C++ class with no member functions and with all the data members public. Indeed, if you look in the C++ reference manual, you'll see that this is exactly how C++ defines the meaning of a struct.

Most C programmers use structures to provide an approximation to objects. The object data is stored in the structure's elements, and a set of C functions is developed to manipulate such structures. The difference between C and C++ is that the latter can enforce the access rules. That is, in C++ only member functions can get at the object's data. In C, it is a matter of convention and discipline as to which functions can manipulate a structure's contents.
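A minimal sketch of this convention, using an invented counter type: the data lives in the structure, and each operation takes a pointer to the structure as an explicit first argument, playing the role of the implicit this pointer in C++.

```c
/* An "object" by convention only: nothing stops other code from
   modifying the fields directly. */
struct counter {
    int count ;
    int step ;
} ;

/* Plays the role of a constructor. */
void counter_init( struct counter *self, int step ) {
    self->count = 0 ;
    self->step = step ;
}

/* Plays the role of a member function that modifies the object. */
void counter_advance( struct counter *self ) {
    self->count += self->step ;
}

/* Plays the role of a const member function. */
int counter_value( const struct counter *self ) {
    return self->count ;
}
```

The naming prefix counter_ is a matter of team discipline. The compiler does not tie these functions to the structure in any way.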


struct name {
    char first[16] ;
    char mi ;
    char last[20] ;
} ;

struct student {
    struct name stu_name ;
    int         stu_number ;
    double      stu_debt ;
} ;

struct student rit[15000] ;

In the example above, structure name has three components: a 16 character array for the first name, a single character for the middle initial, and a 20 character array for the last name. The second structure, student, uses the previous structure to define a component for the student's name, as well as two other components for the student number and amount of money owed. Finally, an array of 15000 student structures is defined to hold the overall RIT enrollment.

As with classes, dot notation is used to select components:

rit[5]
The sixth element of the array (a student structure).
rit[5].stu_name
The name of the sixth student (a name structure).
rit[5].stu_name.mi
The middle initial of the sixth student's name (a char).

Unlike arrays, a structure variable refers to the whole structure, not its address.  When structures are passed to functions, a copy is passed, and changes to the copy are not reflected in the original structure. To affect a structure, the function argument must be a pointer.


void clear_debt( struct student *p_student ) {
    p_student->stu_debt = 0.0 ;
    return ;
}

    . . .

clear_debt( &rit[6] ) ;

The function clear_debt expects a pointer to a struct student, and uses this to set the stu_debt component to zero. In the call to clear_debt, the address-of operator & is used to create a pointer to rit[6], the seventh RIT student (subscripts start at 0).
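The difference between passing a structure and passing a pointer to it can be seen side by side in the following sketch. The account structure and both function names are invented for illustration.

```c
struct account {
    double debt ;
} ;

/* Receives a copy: the caller's structure is unchanged. */
void clear_copy( struct account a ) {
    a.debt = 0.0 ;
}

/* Receives a pointer: the caller's structure is modified. */
void clear_original( struct account *p ) {
    p->debt = 0.0 ;
}
```

Calling clear_copy( acct ) leaves acct.debt as it was, while clear_original( &acct ) sets it to zero.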


The need to use the struct keyword, possibly with an asterisk for a pointer, can clutter up a C program's declarations. The typedef construct lets us create suitable aliases for any type we wish:


typedef struct name Name ;
typedef struct student Student ;
typedef struct student *StudentPtr ;

After these declarations, we can use the identifiers Name, Student, and StudentPtr instead of the longer forms using keywords and asterisks. For example, the array of records can be declared as:

Student rit[15000] ;

and the function header becomes:

void clear_debt( StudentPtr p_student ) {


Both C and C++ support pointers to data in memory. We've already seen a couple of uses of pointers in previous examples.  What follows are some more simple examples of pointers in C, using the student records declarations above:


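Declare a pointer to a student structure (the two forms are equivalent; use one or the other):

struct student *p_stu ;
StudentPtr p_stu ;

Make the pointer refer to an element of the array:

p_stu = &rit[6] ;

Increase that student's debt by 2% through the pointer:

p_stu->stu_debt = p_stu->stu_debt * 1.02 ;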

Memory Allocation

In both C and C++, pointers are used primarily to create dynamic data structures like lists and trees. C++ allocates new objects and recycles existing ones with new and delete, respectively. In addition, well-designed class libraries hide much of the complexity behind simpler class interfaces. The RogueWave RWCString class, for example, provides strings that can grow and shrink. The allocation and deallocation needed to support these strings is hidden in the implementation.

In C, many more of these details are visible to clients of a package built from pointers. What is more, the burden of allocating and freeing memory at the right time is on the programmer's shoulders. There is nothing like a C++ destructor in C. The following simple example will demonstrate the key memory management issues in C:

The Problem

Assume we decide to replace the array implementation of RIT's student database with one based on singly linked lists.

Solution Part 1

To do this, we'll define a new structure type StuNode which contains a student structure and a pointer to the next StuNode in the list:

typedef struct stu_node *StuNodePtr ;

typedef struct stu_node {
    Student    sn_student ;
    StuNodePtr sn_next ;
} StuNode ;

Next we'll define a global pointer rit_head to point to the first student in the list. As the list is initially empty, we'll set this pointer to zero (or NULL, a symbolic constant available in file stddef.h):

#include <stddef.h>

    . . .

StuNodePtr rit_head = NULL ;

Solution Part 2

Now we need a function to add a new student to the list (the argument is the student to add). This will require access to the malloc memory allocation function, which is declared in stdlib.h. We'll assume the student is added at the list head:

#include <stdlib.h>

    . . .

void add_student( Student new_stu ) {
    StuNodePtr new_node ;

    new_node = (StuNodePtr) malloc( sizeof(StuNode) ) ;

    new_node->sn_student = new_stu ;
    new_node->sn_next = rit_head ;
    rit_head = new_node ;
}

The first assignment calls the system malloc function with the size (in bytes) of a StuNode structure. The return value is a pointer to newly allocated memory at least as large as that requested, or NULL if the request cannot be satisfied (a check omitted here for brevity). This pointer is of type "void *" (or "char *" in pre-ANSI libraries), rather than "StuNodePtr". To remedy this, we cast the pointer from its real type to the type we want: that's the purpose of "(StuNodePtr)" in front of malloc. A C compiler will convert a "void *" implicitly, but the C++ compiler requires the cast. The remainder of the code:

  1. Copies the student structure argument into the student structure component of the node,
  2. Ensures the node points to whatever is at the head of the list, and
  3. Sets the head pointer to the newly created node.
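The same three steps, with a check on the result of malloc added, might look like the sketch below. To keep it self-contained it uses a simplified stand-in for the student list: each node carries only an integer, and the names NumNode, add_number, and list_length are invented. It also takes the list head as a parameter rather than using a global, which is a design choice of this sketch, not of the code above.

```c
#include <stdlib.h>

/* Simplified stand-in for StuNode: each node carries only a number. */
typedef struct num_node *NumNodePtr ;

typedef struct num_node {
    int        nn_number ;
    NumNodePtr nn_next ;
} NumNode ;

/* Adds a number at the head of the list. Returns 1 on success, or 0
   (leaving the list unchanged) if malloc cannot supply memory. */
int add_number( NumNodePtr *head, int number ) {
    NumNodePtr new_node = (NumNodePtr) malloc( sizeof(NumNode) ) ;

    if ( new_node == NULL )
        return 0 ;

    new_node->nn_number = number ;   /* 1. copy the data into the node  */
    new_node->nn_next = *head ;      /* 2. point at the old list head   */
    *head = new_node ;               /* 3. make the new node the head   */
    return 1 ;
}

/* Counts the nodes by following next pointers until NULL. */
int list_length( NumNodePtr head ) {
    int n = 0 ;

    while ( head != NULL ) {
        n++ ;
        head = head->nn_next ;
    }
    return n ;
}
```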

Solution Part 3

Finally, we need a way to dispose of a node structure when a student is removed from the list. We won't give the details of finding the node in the list and properly unlinking it; we'll assume we simply need to delete the space associated with the node:

void free_node( StuNodePtr p_stu ) {
    free( p_stu ) ;
}

The free function is also in the standard library, and it simply releases the space associated with pointer p_stu. In ANSI C, free accepts a "void *", so any object pointer can be passed without a cast, and it returns nothing, so there is no return value to ignore. (Pre-ANSI code often cast the argument to (char *) and the result to (void).) Once a memory region is freed, no further reference to the region is allowed.  If such pointer access occurs by accident, the program will behave unpredictably.

NOTE: This can happen in C++ as well if there are any pointers to deleted objects!
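The rule against touching freed memory matters as soon as a whole list is released: the next pointer has to be read before the node holding it is freed. A minimal sketch, again using an invented integer node type rather than the student structures:

```c
#include <stdlib.h>

/* Simplified, hypothetical node type for illustration. */
typedef struct num_node {
    int              nn_number ;
    struct num_node *nn_next ;
} NumNode ;

/* Frees every node in the list and returns how many were freed. */
int free_list( NumNode **head ) {
    int freed = 0 ;
    NumNode *node = *head ;

    while ( node != NULL ) {
        NumNode *next = node->nn_next ;   /* save before freeing */
        free( node ) ;                    /* node is now off limits */
        node = next ;
        freed++ ;
    }
    *head = NULL ;   /* leave no dangling pointer behind */
    return freed ;
}
```

Writing node = node->nn_next after the call to free would read from released memory, which is exactly the kind of access that makes a program behave unpredictably.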


Two other keywords that can be prepended to a declaration are extern and static . An extern declaration gives the name and type of a variable or function, but does not allocate any space. Such statements are typically found in header files defining the interface to a C module. Typically the space is allocated when the variable or function is defined in the .c or .C file that implements the module.

A static function or global variable is one whose scope is restricted to the current implementation file. These declarations occur at the top of the source file, before any references to the associated functions or variables. No other module can refer to these names, so they are effectively "hidden" within the implementation file where they are defined. In essence, static provides a crude form of private data and operations.
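The two keywords can be sketched together in one listing. The module name idgen and all identifiers below are hypothetical. In practice the first part would sit in a header file and the second in the matching implementation file.

```c
/* --- would live in a header, e.g. a hypothetical idgen.h --- */
extern int ids_issued ;        /* declaration only: no space allocated */
extern int next_id( void ) ;

/* --- would live in the implementation file, e.g. idgen.c --- */
int ids_issued = 0 ;           /* definition: space is allocated here */

static int last_id = 1000 ;    /* hidden: visible only in this file */

static int bump( void ) {      /* hidden helper function */
    last_id++ ;
    return last_id ;
}

int next_id( void ) {
    ids_issued++ ;
    return bump() ;
}
```

Other modules that include the header can call next_id and read ids_issued, but they cannot name last_id or bump at all.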

In summary, C's data structuring mechanisms are both more primitive and more unstructured than those of C++. They are primitive in that they do not support member functions, protected or private information, or generic (template) structures. They are more unstructured in that neither array subscripts nor pointers are checked for legality. The flexibility of this free-wheeling approach must be balanced against the increased probability of subtle, hard to locate bugs. The smart approach is to do as much as possible in C++, using stable, well-designed library classes.

C Input/Output

C input/output is also primitive compared to C++. There are no input or output streams, nor are the >> and << I/O operators available. As with string manipulation and memory allocation, I/O is supported by a standard library, stdio.h. Access to the library interface is gained by including its header file:

#include <stdio.h>

Here are a few of the functions provided.

int getchar(void)
Returns the next character from standard input as an int. On end-of-file, the special indicator EOF (a negative value) is returned.


int c ;

for ( c = getchar() ; c != EOF ; c = getchar() ) {
    process_char( c ) ;
}

printf(format [,args] )
printf has a variable length argument list. The first argument is a string specifying the output format. Embedded in this string can be formatting specifiers, each preceded by a '%'. The specifiers tell how to format the arguments that follow, from left to right.


printf( "String %s integer %d float %5.2f\n", s, i, f ) ;

The first argument, s, is handled by the %s specifier. This assumes the argument points to a string of characters, which are printed until the terminating NULL is encountered. The second argument, i, is printed as a decimal integer by the %d specifier.

Finally, the last argument, f, is printed as a floating point number by the %5.2f specifier. The "5.2" portion says to use a field 5 characters wide, and to print 2 digits to the right of the decimal point. There are many other output formats, and many other I/O functions as well. For all the details, consult the manual page for the standard I/O library:

man stdio

Please note that printf has no way of verifying the argument types. If you pass an integer where the format requires a pointer, you'll get weird output and possibly a core dump. You just have to be very careful.
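One way to keep specifiers and arguments in step is to wrap the call in a function with ordinary typed parameters, so the compiler checks the types at the call site. The sketch below assumes a C99 library, since it uses snprintf (the variant of printf that writes into a character array of known size), and format_line is an invented name.

```c
#include <stdio.h>

/* Formats one line into buffer using the specifiers discussed above.
   The typed parameters guarantee that %s receives a string, %d an int,
   and %5.2f a double. Returns the number of characters the full text
   requires, not counting the terminator. */
int format_line( char *buffer, size_t size, const char *s, int i, double f ) {
    return snprintf( buffer, size, "String %s integer %d float %5.2f\n", s, i, f ) ;
}
```

Called with "abc", 42, and 3.14159, this produces the text "String abc integer 42 float  3.14" followed by a newline. The value 3.14 is padded with one leading space to fill the 5-character field.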

Compiling C Source Files

The Unix convention is that files ending in .c (lower case c) contain C code, while those ending in .C (upper case C) are for C++. While we have a compiler that handles just the C language, you are probably better off using the C++ compiler, CC, for both C and C++ files. All the programs we'll develop should compile this way, though you may get warning messages from C code because the C++ compiler is pickier about type checking.

NOTE: You may have to define a special .c.o rule in your  Makefiles to have C code compiled by the C++ compiler.