Thinking in C++, 2nd ed. Volume 1

©2000 by Bruce Eckel

4: Data Abstraction

C++ is a productivity enhancement tool. Why else would you make the effort (and it is an effort, regardless of how easy we attempt to make the transition) to switch from some language that you already know and are productive with to a new language in which you’re going to be less productive for a while, until you get the hang of it? It’s because you’ve become convinced that you’re going to get big gains by using this new tool.

Productivity, in computer programming terms, means that fewer people can make much more complex and impressive programs in less time. There are certainly other issues when it comes to choosing a language, such as efficiency (does the nature of the language cause slowdown and code bloat?), safety (does the language help you ensure that your program will always do what you plan, and handle errors gracefully?), and maintenance (does the language help you create code that is easy to understand, modify, and extend?). These are certainly important factors that will be examined in this book.

But raw productivity means a program that formerly took three of you a week to write now takes one of you a day or two. This touches several levels of economics. You’re happy because you get the rush of power that comes from building something, your client (or boss) is happy because products are produced faster and with fewer people, and the customers are happy because they get products more cheaply. The only way to get massive increases in productivity is to leverage off other people’s code. That is, to use libraries.

A library is simply a bunch of code that someone else has written and packaged together. Often, the most minimal package is a file with an extension like lib and one or more header files to tell your compiler what’s in the library. The linker knows how to search through the library file and extract the appropriate compiled code. But that’s only one way to deliver a library. On platforms that span many architectures, such as Linux/Unix, often the only sensible way to deliver a library is with source code, so it can be reconfigured and recompiled on the new target.

Thus, libraries are probably the most important way to improve productivity, and one of the primary design goals of C++ is to make library use easier. This implies that there’s something hard about using libraries in C. Understanding this factor will give you a first insight into the design of C++, and thus insight into how to use it.

A tiny C-like library

A library usually starts out as a collection of functions, but if you have used third-party C libraries you know there’s usually more to it than that because there’s more to life than behavior, actions, and functions. There are also characteristics (blue, pounds, texture, luminance), which are represented by data. And when you start to deal with a set of characteristics in C, it is very convenient to clump them together into a struct, especially if you want to represent more than one similar thing in your problem space. Then you can make a variable of this struct for each thing.

Thus, most C libraries have a set of structs and a set of functions that act on those structs. As an example of what such a system looks like, consider a programming tool that acts like an array, but whose size can be established at runtime, when it is created. I’ll call it a CStash. Although it’s written in C++, it has the style of what you’d write in C:

//: C04:CLib.h
// Header file for a C-like library
// An array-like entity created at runtime

typedef struct CStashTag {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
} CStash;

void initialize(CStash* s, int size);
void cleanup(CStash* s);
int add(CStash* s, const void* element);
void* fetch(CStash* s, int index);
int count(CStash* s);
void inflate(CStash* s, int increase);
///:~

A tag name like CStashTag is generally used for a struct in case you need to reference the struct inside itself. For example, when creating a linked list (each element in your list contains a pointer to the next element), you need a pointer to the next struct variable, so you need a way to identify the type of that pointer within the struct body. Also, you'll almost universally see the typedef as shown above for every struct in a C library. This is done so you can treat the struct as if it were a new type and define variables of that struct like this:

CStash A, B, C;
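To make the linked-list remark concrete, here is a minimal sketch. The names NodeTag, Node, and length( ) are hypothetical and are not part of CLib.h; the point is only that the pointer inside the struct body must use the tag name, because the typedef name does not exist until the declaration is finished:

```cpp
// Hypothetical list node, shown only to illustrate the tag name.
typedef struct NodeTag {
  int value;
  // The typedef name Node is not available yet,
  // so the tag name is used for the pointer type:
  struct NodeTag* next;
} Node;

// Count the nodes reachable from head (zero marks the end).
int length(const Node* head) {
  int n = 0;
  for(const Node* p = head; p != 0; p = p->next)
    n++;
  return n;
}
```

You would build a short list by linking variables together, for example Node b = {2, 0}; followed by Node a = {1, &b}; and then call length(&a).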

The storage pointer is an unsigned char*. An unsigned char is the smallest piece of storage a C compiler supports, although on some machines it can be the same size as the largest. By definition it is exactly one byte long, although the number of bits in that byte is implementation dependent (it is at least eight). You might think that because the CStash is designed to hold any type of variable, a void* would be more appropriate here. However, the purpose is not to treat this storage as a block of some unknown type, but rather as a block of contiguous bytes.

The source code for the implementation file (which you may not get if you buy a library commercially – you might get only a compiled obj or lib or dll, etc.) looks like this:

//: C04:CLib.cpp {O}
// Implementation of example C-like library
// Declare structure and functions:
#include "CLib.h"
#include <iostream>
#include <cassert> 
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;

void initialize(CStash* s, int sz) {
  s->size = sz;
  s->quantity = 0;
  s->storage = 0;
  s->next = 0;
}

int add(CStash* s, const void* element) {
  if(s->next >= s->quantity) //Enough space left?
    inflate(s, increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = s->next * s->size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < s->size; i++)
    s->storage[startBytes + i] = e[i];
  s->next++;
  return(s->next - 1); // Index number
}

void* fetch(CStash* s, int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= s->next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return &(s->storage[index * s->size]);
}

int count(CStash* s) {
  return s->next;  // Elements in CStash
}

void inflate(CStash* s, int increase) {
  assert(increase > 0);
  int newQuantity = s->quantity + increase;
  int newBytes = newQuantity * s->size;
  int oldBytes = s->quantity * s->size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = s->storage[i]; // Copy old to new
  delete [](s->storage); // Old storage
  s->storage = b; // Point to new memory
  s->quantity = newQuantity;
}

void cleanup(CStash* s) {
  if(s->storage != 0) {
    cout << "freeing storage" << endl;
    delete []s->storage;
  }
} ///:~

initialize( ) performs the necessary setup for struct CStash by setting the internal variables to appropriate values. Initially, the storage pointer is set to zero – no initial storage is allocated.

The add( ) function inserts an element into the CStash at the next available location. First, it checks to see if there is any available space left. If not, it expands the storage using the inflate( ) function, described later.

Because the compiler doesn’t know the specific type of the variable being stored (all the function gets is a void*), you can’t just do an assignment, which would certainly be the convenient thing. Instead, you must copy the variable byte-by-byte. The most straightforward way to perform the copying is with array indexing. Typically, there are already data bytes in storage, and this is indicated by the value of next. To start with the right byte offset, next is multiplied by the size of each element (in bytes) to produce startBytes. Then the argument element is cast to an unsigned char* so that it can be addressed byte-by-byte and copied into the available storage space. next is incremented so that it indicates the next available piece of storage, and the function returns the “index number” where the value was stored so that the value can be retrieved using this index number with fetch( ).
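The byte-by-byte copy can be pulled out and looked at on its own. This is only a sketch: copyBytes( ) is a hypothetical helper that does not appear in the library, and it assumes the destination is large enough to hold the copied bytes:

```cpp
// Hypothetical helper: copy 'size' bytes from src into dest,
// starting at byte position 'offset' within dest.
void copyBytes(unsigned char* dest, int offset,
               const void* src, int size) {
  // A void* cannot be indexed, so view the source as bytes:
  const unsigned char* e = (const unsigned char*)src;
  for(int i = 0; i < size; i++)
    dest[offset + i] = e[i];
}
```

In add( ), the offset corresponds to startBytes and the size corresponds to the element size stored in the CStash.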

fetch( ) checks to see that the index isn’t out of bounds and then returns the address of the desired variable, calculated using the index argument. Since index indicates the number of elements to offset into the CStash, it must be multiplied by the number of bytes occupied by each piece to produce the numerical offset in bytes. When this offset is used to index into storage using array indexing, you don’t get the address, but instead the byte at the address. To produce the address, you must use the address-of operator &.

count( ) may look a bit strange at first to a seasoned C programmer. It seems like a lot of trouble to go through to do something that would probably be a lot easier to do by hand. If you have a struct CStash called intStash, for example, it would seem much more straightforward to find out how many elements it has by saying intStash.next instead of making a function call (which has overhead), such as count(&intStash). However, if you wanted to change the internal representation of CStash and thus the way the count was calculated, the function call interface allows the necessary flexibility. But alas, most programmers won’t bother to find out about your “better” design for the library. They’ll look at the struct and grab the next value directly, and possibly even change next without your permission. If only there were some way for the library designer to have better control over things like this! (Yes, that’s foreshadowing.)

Dynamic storage allocation

You never know the maximum amount of storage you might need for a CStash, so the memory pointed to by storage is allocated from the heap. The heap is a big block of memory used for allocating smaller pieces at runtime. You use the heap when you don’t know the size of the memory you’ll need while you’re writing a program. That is, only at runtime will you find out that you need space to hold 200 Airplane variables instead of 20. In Standard C, dynamic-memory allocation functions include malloc( ), calloc( ), realloc( ), and free( ). Instead of library calls, however, C++ has a more sophisticated (albeit simpler to use) approach to dynamic memory that is integrated into the language via the keywords new and delete.

The inflate( ) function uses new to get a bigger chunk of space for the CStash. In this situation, we will only expand memory and not shrink it, and the assert( ) will guarantee that a negative number is not passed to inflate( ) as the increase value. The new number of elements that can be held (after inflate( ) completes) is calculated as newQuantity, and this is multiplied by the number of bytes per element to produce newBytes, which will be the number of bytes in the allocation. So that we know how many bytes to copy over from the old location, oldBytes is calculated using the old quantity.

The actual storage allocation occurs in the new-expression, which is the expression involving the new keyword:

new unsigned char[newBytes];

The general form of the new-expression is:

new Type;

in which Type describes the type of variable you want allocated on the heap. In this case, we want an array of unsigned char that is newBytes long, so that is what appears as the Type. You can also allocate something as simple as an int by saying:

new int;

and although this is rarely done, you can see that the form is consistent.

A new-expression returns a pointer to an object of the exact type that you asked for. So if you say new Type, you get back a pointer to a Type. If you say new int, you get back a pointer to an int. If you want a new unsigned char array, you get back a pointer to the first element of that array. The compiler will ensure that you assign the return value of the new-expression to a pointer of the correct type.
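As a small illustration of the types involved, consider the following sketch. The function names makeInt( ) and makeBytes( ) are hypothetical and exist only for this example:

```cpp
// new int produces an int*:
int* makeInt(int value) {
  int* p = new int;
  *p = value;
  return p;
}

// new unsigned char[count] produces a pointer to the
// first element of the array, an unsigned char*:
unsigned char* makeBytes(int count) {
  unsigned char* b = new unsigned char[count];
  for(int i = 0; i < count; i++)
    b[i] = 0; // The fresh storage is not initialized for you
  return b;
}
```

The caller owns the storage that comes back and must eventually release it, using delete for the single int and delete [] for the array.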

Of course, any time you request memory it’s possible for the request to fail, if there is no more memory. As you will learn, C++ has mechanisms that come into play if the memory-allocation operation is unsuccessful.

Once the new storage is allocated, the data in the old storage must be copied to the new storage; this is again accomplished with array indexing, copying one byte at a time in a loop. After the data is copied, the old storage must be released so that it can be used by other parts of the program if they need new storage. The delete keyword is the complement of new, and must be applied to release any storage that is allocated with new (if you forget to use delete, that storage remains unavailable, and if this so-called memory leak happens enough, you’ll run out of memory). In addition, there’s a special syntax when you’re deleting an array. It’s as if you must remind the compiler that this pointer is not just pointing to one object, but to an array of objects: you put a set of empty square brackets in front of the pointer to be deleted:

delete []myArray;

Once the old storage has been deleted, the pointer to the new storage can be assigned to the storage pointer, the quantity is adjusted, and inflate( ) has completed its job.
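The pairing of each form of new with its matching form of delete can be sketched in a few lines. The function sumWithHeap( ) is hypothetical; it uses the heap only to show the two forms side by side, not because this calculation needs dynamic storage:

```cpp
// Hypothetical helper: sums 1..n using heap storage.
int sumWithHeap(int n) {
  int* values = new int[n]; // Array form of new
  int* total = new int;     // Single-object form of new
  *total = 0;
  for(int i = 0; i < n; i++) {
    values[i] = i + 1;
    *total += values[i];
  }
  int result = *total;
  delete []values; // Array form matches new int[n]
  delete total;    // Plain delete matches new int
  return result;
}
```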

Note that the heap manager is fairly primitive. It gives you chunks of memory and takes them back when you delete them. There’s no inherent facility for heap compaction, which compresses the heap to provide bigger free chunks. If a program allocates and frees heap storage for a while, you can end up with a fragmented heap that has lots of memory free, but without any pieces that are big enough to allocate the size you’re looking for at the moment. A heap compactor complicates a program because it moves memory chunks around, so your pointers won’t retain their proper values. Some operating environments have heap compaction built in, but they require you to use special memory handles (which can be temporarily converted to pointers, after locking the memory so the heap compactor can’t move it) instead of pointers. You can also build your own heap-compaction scheme, but this is not a task to be undertaken lightly.

When you create a variable on the stack at compile-time, the storage for that variable is automatically created and freed by the compiler. The compiler knows exactly how much storage is needed, and it knows the lifetime of the variables because of scoping. With dynamic memory allocation, however, the compiler doesn’t know how much storage you’re going to need, and it doesn’t know the lifetime of that storage. That is, the storage doesn’t get cleaned up automatically. Therefore, you’re responsible for releasing the storage using delete, which tells the heap manager that storage can be used by the next call to new. The logical place for this to happen in the library is in the cleanup( ) function because that is where all the closing-up housekeeping is done.

To test the library, two CStashes are created. The first holds ints and the second holds arrays of 80 chars:

//: C04:CLibTest.cpp
//{L} CLib
// Test the C-like library
#include "CLib.h"
#include <fstream>
#include <iostream>
#include <string>
#include <cassert>
using namespace std;

int main() {
  // Define variables at the beginning
  // of the block, as in C:
  CStash intStash, stringStash;
  int i;
  char* cp;
  ifstream in;
  string line;
  const int bufsize = 80;
  // Now remember to initialize the variables:
  initialize(&intStash, sizeof(int));
  for(i = 0; i < 100; i++)
    add(&intStash, &i);
  for(i = 0; i < count(&intStash); i++)
    cout << "fetch(&intStash, " << i << ") = "
         << *(int*)fetch(&intStash, i)
         << endl;
  // Holds 80-character strings:
  initialize(&stringStash, sizeof(char)*bufsize);
  in.open("CLibTest.cpp");
  assert(in);
  while(getline(in, line))
    add(&stringStash, line.c_str());
  i = 0;
  while((cp = (char*)fetch(&stringStash,i++))!=0)
    cout << "fetch(&stringStash, " << (i - 1) << ") = "
         << cp << endl;
  cleanup(&intStash);
  cleanup(&stringStash);
} ///:~

Following the form required by C, all the variables are created at the beginning of the scope of main( ). Of course, you must remember to initialize the CStash variables later in the block by calling initialize( ). One of the problems with C libraries is that you must carefully convey to the user the importance of the initialization and cleanup functions. If these functions aren’t called, there will be a lot of trouble. Unfortunately, the user doesn’t always wonder if initialization and cleanup are mandatory. They know what they want to accomplish, and they’re not as concerned about you jumping up and down saying, “Hey, wait, you have to do this first!” Some users have even been known to initialize the elements of a structure themselves. There’s certainly no mechanism in C to prevent it (more foreshadowing).

The intStash is filled up with integers, and the stringStash is filled with character arrays. These character arrays are produced by opening the source code file, CLibTest.cpp, and reading the lines from it into a string called line, and then producing a pointer to the character representation of line using the member function c_str( ).

After each CStash is loaded, it is displayed. The intStash is printed using a for loop, which uses count( ) to establish its limit. The stringStash is printed with a while, which breaks out when fetch( ) returns zero to indicate it is out of bounds.

You’ll also notice an additional cast in

cp = (char*)fetch(&stringStash,i++)

This is due to the stricter type checking in C++, which does not allow you to simply assign a void* to any other type (C allows this).

Bad guesses

There is one more important issue you should understand before we look at the general problems in creating a C library. Note that the CLib.h header file must be included in any file that refers to CStash because the compiler can’t even guess at what that structure looks like. However, it can guess at what a function looks like; this sounds like a feature but it turns out to be a major C pitfall.

Although you should always declare functions by including a header file, function declarations aren’t essential in C. It’s possible in C (but not in C++) to call a function that you haven’t declared. A good compiler will warn you that you probably ought to declare a function first, but it isn’t enforced by the C language standard. This is a dangerous practice, because the C compiler can assume that a function that you call with an int argument has an argument list containing int, even if it may actually contain a float. This can produce bugs that are very difficult to find, as you will see.

Each separate C implementation file (with an extension of .c) is a translation unit. That is, the compiler is run separately on each translation unit, and when it is running it is aware of only that unit. Thus, any information you provide by including header files is quite important because it determines the compiler’s understanding of the rest of your program. Declarations in header files are particularly important, because everywhere the header is included, the compiler will know exactly what to do. If, for example, you have a declaration in a header file that says void func(float), the compiler knows that if you call that function with an integer argument, it should convert the int to a float as it passes the argument (this is called promotion). Without the declaration, the C compiler would simply assume that a function func(int) existed, it wouldn’t do the promotion, and the wrong data would quietly be passed into func( ).
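Here is a sketch of what the declaration buys you. The functions half( ) and callWithInt( ) are hypothetical; the declaration of half( ) stands in for what a header file would normally supply:

```cpp
// Declaration, as a header file would provide it:
float half(float x);

float callWithInt() {
  int i = 3;
  // Because the declaration is visible, the compiler
  // converts i to a float before passing it:
  return half(i);
}

// Definition, which could live in another translation unit:
float half(float x) {
  return x / 2;
}
```

The result is 1.5 rather than 1 because the argument arrives as a float. Since this is C++, the call would not compile at all without the declaration; in C, the same call without a declaration is where the wrong data could quietly be passed.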

For each translation unit, the compiler creates an object file, with an extension of .o or .obj or something similar. These object files, along with the necessary start-up code, must be collected by the linker into the executable program. During linking, all the external references must be resolved. For example, in CLibTest.cpp, functions such as initialize( ) and fetch( ) are declared (that is, the compiler is told what they look like) and used, but not defined. They are defined elsewhere, in CLib.cpp. Thus, the calls in CLibTest.cpp are external references. The linker must, when it puts all the object files together, take the unresolved external references and find the addresses they actually refer to. Those addresses are put into the executable program to replace the external references.

It’s important to realize that in C, the external references that the linker searches for are simply function names, generally with an underscore in front of them. So all the linker has to do is match up the function name where it is called and the function body in the object file, and it’s done. If you accidentally made a call that the compiler interpreted as func(int) and there’s a function body for func(float) in some other object file, the linker will see _func in one place and _func in another, and it will think everything’s OK. The func( ) at the calling location will push an int onto the stack, and the func( ) function body will expect a float to be on the stack. If the function only reads the value and doesn’t write to it, it won’t blow up the stack. In fact, the float value it reads off the stack might even make some kind of sense. That’s worse because it’s harder to find the bug.

What's wrong?

We are remarkably adaptable, even in situations in which perhaps we shouldn’t adapt. The style of the CStash library has been a staple for C programmers, but if you look at it for a while, you might notice that it’s rather . . . awkward. When you use it, you have to pass the address of the structure to every single function in the library. When reading the code, the mechanism of the library gets mixed with the meaning of the function calls, which is confusing when you’re trying to understand what’s going on.

One of the biggest obstacles, however, to using libraries in C is the problem of name clashes. C has a single name space for functions; that is, when the linker looks for a function name, it looks in a single master list. In addition, when the compiler is working on a translation unit, it can work only with a single function with a given name.

Now suppose you decide to buy two libraries from two different vendors, and each library has a structure that must be initialized and cleaned up. Both vendors decided that initialize( ) and cleanup( ) are good names. If you include both their header files in a single translation unit, what does the C compiler do? Fortunately, C gives you an error, telling you there’s a type mismatch in the two different argument lists of the declared functions. But even if you don’t include them in the same translation unit, the linker will still have problems. A good linker will detect that there’s a name clash, but some linkers take the first function name they find, by searching through the list of object files in the order you give them in the link list. (This can even be thought of as a feature because it allows you to replace a library function with your own version.)

In either event, you can’t use two C libraries that contain a function with the identical name. To solve this problem, C library vendors will often prepend a sequence of unique characters to the beginning of all their function names. So initialize( ) and cleanup( ) might become CStash_initialize( ) and CStash_cleanup( ). This is a logical thing to do because it “decorates” the name of the function with the name of the struct it works on.
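A sketch of the prefixing convention follows. Both "libraries" here are hypothetical (Queue and Buffer are invented for this example), and each is reduced to a single data member so the naming is all that remains:

```cpp
// First hypothetical library:
typedef struct QueueTag {
  int count;
} Queue;

// Second hypothetical library:
typedef struct BufferTag {
  int bytes;
} Buffer;

// Without the prefixes, both functions would be named
// initialize( ) and would clash in C:
void Queue_initialize(Queue* q) {
  q->count = 0;
}

void Buffer_initialize(Buffer* b, int bytes) {
  b->bytes = bytes;
}
```

With the prefixes in place, a single translation unit can call Queue_initialize(&q) and Buffer_initialize(&b, 16) without any confusion about which function is meant.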

Now it’s time to take the first step toward creating classes in C++. Variable names inside a struct do not clash with global variable names. So why not take advantage of this for function names, when those functions operate on a particular struct? That is, why not make functions members of structs?

The basic object

Step one is exactly that. C++ functions can be placed inside structs as “member functions.” Here’s what it looks like after converting the C version of CStash to the C++ Stash:

//: C04:CppLib.h
// C-like library converted to C++

struct Stash {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
  // Functions!
  void initialize(int size);
  void cleanup();
  int add(const void* element);
  void* fetch(int index);
  int count();
  void inflate(int increase);
}; ///:~

First, notice there is no typedef. Instead of requiring you to create a typedef, the C++ compiler turns the name of the structure into a new type name for the program (just as int, char, float and double are type names).

All the data members are exactly the same as before, but now the functions are inside the body of the struct. In addition, notice that the first argument from the C version of the library has been removed. In C++, instead of forcing you to pass the address of the structure as the first argument to all the functions that operate on that structure, the compiler secretly does this for you. Now the only arguments for the functions are concerned with what the function does, not the mechanism of the function’s operation.

It’s important to realize that the function code is effectively the same as it was with the C version of the library. The number of arguments is the same (even though you don’t see the structure address being passed in, it’s still there), and there’s only one function body for each function. That is, just because you say

Stash A, B, C;

doesn’t mean you get a different add( ) function for each variable.

So the code that’s generated is almost identical to what you would have written for the C version of the library. Interestingly enough, this includes the “name decoration” you probably would have done to produce Stash_initialize( ), Stash_cleanup( ), and so on. When the function name is inside the struct, the compiler effectively does the same thing. Therefore, initialize( ) inside the structure Stash will not collide with a function named initialize( ) inside any other structure, or even a global function named initialize( ). Most of the time you don’t have to worry about the function name decoration – you use the undecorated name. But sometimes you do need to be able to specify that this initialize( ) belongs to the struct Stash, and not to any other struct. In particular, when you’re defining the function you need to fully specify which one it is. To accomplish this full specification, C++ has an operator (::) called the scope resolution operator (named so because names can now be in different scopes: at global scope or within the scope of a struct). For example, if you want to specify initialize( ), which belongs to Stash, you say Stash::initialize(int size). You can see how the scope resolution operator is used in the function definitions:

//: C04:CppLib.cpp {O}
// C library converted to C++
// Declare structure and functions:
#include "CppLib.h"
#include <iostream>
#include <cassert>
using namespace std;
// Quantity of elements to add
// when increasing storage:
const int increment = 100;

void Stash::initialize(int sz) {
  size = sz;
  quantity = 0;
  storage = 0;
  next = 0;
}

int Stash::add(const void* element) {
  if(next >= quantity) // Enough space left?
    inflate(increment);
  // Copy element into storage,
  // starting at next empty space:
  int startBytes = next * size;
  unsigned char* e = (unsigned char*)element;
  for(int i = 0; i < size; i++)
    storage[startBytes + i] = e[i];
  next++;
  return(next - 1); // Index number
}

void* Stash::fetch(int index) {
  // Check index boundaries:
  assert(0 <= index);
  if(index >= next)
    return 0; // To indicate the end
  // Produce pointer to desired element:
  return &(storage[index * size]);
}

int Stash::count() {
  return next; // Number of elements in Stash
}

void Stash::inflate(int increase) {
  assert(increase > 0);
  int newQuantity = quantity + increase;
  int newBytes = newQuantity * size;
  int oldBytes = quantity * size;
  unsigned char* b = new unsigned char[newBytes];
  for(int i = 0; i < oldBytes; i++)
    b[i] = storage[i]; // Copy old to new
  delete []storage; // Old storage
  storage = b; // Point to new memory
  quantity = newQuantity;
}

void Stash::cleanup() {
  if(storage != 0) {
    cout << "freeing storage" << endl;
    delete []storage;
  }
} ///:~
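To see the scope resolution operator in isolation, here is a smaller sketch in which two structs and a global function all use the name initialize( ). Counter and Timer are hypothetical names chosen for this example only:

```cpp
// A global function named initialize( ):
int initialize() {
  return 0;
}

struct Counter {
  int value;
  void initialize(int start);
};

struct Timer {
  int ticks;
  void initialize();
};

// Each definition says which struct it belongs to:
void Counter::initialize(int start) {
  value = start;
}

void Timer::initialize() {
  // Inside a member function the plain name refers to
  // Timer::initialize( ), so the global version is
  // selected with a leading ::
  ticks = ::initialize();
}
```

None of the three names collide. When you call c.initialize(5) for a Counter or t.initialize( ) for a Timer, the type of the variable selects the function.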

There are several other things that are different between C and C++. First, the declarations in the header files are required by the compiler. In C++ you cannot call a function without declaring it first. The compiler will issue an error message otherwise. This is an important way to ensure that function calls are consistent between the point where they are called and the point where they are defined. By forcing you to declare the function before you call it, the C++ compiler virtually ensures that you will perform this declaration by including the header file. If you also include the same header file in the place where the functions are defined, then the compiler checks to make sure that the declaration in the header and the function definition match up. This means that the header file becomes a validated repository for function declarations and ensures that functions are used consistently throughout all translation units in the project.

Of course, global functions can still be declared by hand every place where they are defined and used. (This is so tedious that it becomes very unlikely.) However, structures must always be declared before they are defined or used, and the most convenient place to put a structure definition is in a header file, except for those you intentionally hide in a file.

You can see that all the member functions look almost the same as when they were C functions, except for the scope resolution and the fact that the first argument from the C version of the library is no longer explicit. It’s still there, of course, because the function has to be able to work on a particular struct variable. But notice, inside the member function, that the member selection is also gone! Thus, instead of saying s->size = sz; you say size = sz; and eliminate the tedious s->, which didn’t really add anything to the meaning of what you were doing anyway. The C++ compiler is apparently doing this for you. Indeed, it is taking the “secret” first argument (the address of the structure that we were previously passing in by hand) and applying the member selector whenever you refer to one of the data members of a struct. This means that whenever you are inside the member function of a struct, you can refer to any member of that struct (including another member function) by simply giving its name. The compiler will search through the local structure’s names before looking for a global version of that name. You’ll find that this feature means that not only is your code easier to write, it’s a lot easier to read.

But what if, for some reason, you want to be able to get your hands on the address of the structure? In the C version of the library it was easy because each function’s first argument was a CStash* called s. In C++, things are even more consistent. There’s a special keyword, called this, which produces the address of the struct. It’s the equivalent of the ‘s’ in the C version of the library. So we can revert to the C style of things by saying

this->size = sz;

The code generated by the compiler is exactly the same, so you don’t need to use this in such a fashion; occasionally, you’ll see code where people explicitly use this-> everywhere but it doesn’t add anything to the meaning of the code and often indicates an inexperienced programmer. Usually, you don’t use this often, but when you need it, it’s there (some of the examples later in the book will use this).
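A short sketch shows the two spellings side by side. Box is a hypothetical struct invented for this example, and address( ) is there only to show that this really is the address of the variable the function was called for:

```cpp
struct Box {
  int size;
  void setImplicit(int sz);
  void setExplicit(int sz);
  Box* address();
};

void Box::setImplicit(int sz) {
  size = sz;       // The compiler supplies this-> for you
}

void Box::setExplicit(int sz) {
  this->size = sz; // Same effect, written out by hand
}

Box* Box::address() {
  return this;     // The address of the current struct variable
}
```

If you define Box b; then b.address( ) produces the same value as &b, and the two set functions leave b.size in the same state.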

There’s one last item to mention. In C, you could assign a void* to any other pointer like this:

int i = 10;
void* vp = &i; // OK in both C and C++
int* ip = vp; // Only acceptable in C

and there was no complaint from the compiler. But in C++, this statement is not allowed. Why? Because C is not so particular about type information, so it allows you to assign a pointer with an unspecified type to a pointer with a specified type. Not so with C++. Type is critical in C++, and the compiler stamps its foot when there are any violations of type information. This has always been important, but it is especially important in C++ because you have member functions in structs. If you could pass pointers to structs around with impunity in C++, then you could end up calling a member function for a struct that doesn’t even logically exist for that struct! A real recipe for disaster. Therefore, while C++ allows the assignment of any type of pointer to a void* (this was the original intent of void*, which is required to be large enough to hold a pointer to any type), it will not allow you to assign a void pointer to any other type of pointer. A cast is always required to tell the reader and the compiler that you really do want to treat it as the destination type.

This brings up an interesting issue. One of the important goals for C++ is to compile as much existing C code as possible to allow for an easy transition to the new language. However, this doesn’t mean any code that C allows will automatically be allowed in C++. There are a number of things the C compiler lets you get away with that are dangerous and error-prone. (We’ll look at them as the book progresses.) The C++ compiler generates warnings and errors for these situations. This is often much more of an advantage than a hindrance. In fact, there are many situations in which you are trying to run down an error in C and just can’t find it, but as soon as you recompile the program in C++, the compiler points out the problem! In C, you’ll often find that you can get the program to compile, but then you have to get it to work. In C++, when the program compiles correctly, it often works, too! This is because the language is a lot stricter about type.

You can see a number of new things in the way the C++ version of Stash is used in the following test program:

//: C04:CppLibTest.cpp
//{L} CppLib
// Test of C++ library
#include "CppLib.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main() {
  Stash intStash;
  intStash.initialize(sizeof(int));
  for(int i = 0; i < 100; i++)
    intStash.add(&i);
  for(int j = 0; j < intStash.count(); j++)
    cout << "intStash.fetch(" << j << ") = "
         << *(int*)intStash.fetch(j)
         << endl;
  // Holds 80-character strings:
  Stash stringStash;
  const int bufsize = 80;
  stringStash.initialize(sizeof(char) * bufsize);
  ifstream in("CppLibTest.cpp");
  assure(in, "CppLibTest.cpp");
  string line;
  while(getline(in, line))
    stringStash.add(line.c_str());
  int k = 0;
  char* cp;
  while((cp = (char*)stringStash.fetch(k)) != 0) {
    cout << "stringStash.fetch(" << k << ") = "
         << cp << endl;
    k++; // Increment after printing so the index matches
  }
  intStash.cleanup();
  stringStash.cleanup();
} ///:~

One thing you’ll notice is that the variables are all defined “on the fly” (as introduced in the previous chapter). That is, they are defined at any point in the scope, rather than being restricted – as in C – to the beginning of the scope.

The code is quite similar to CLibTest.cpp, but when a member function is called, the call occurs using the member selection operator ‘.’ preceded by the name of the variable. This is a convenient syntax because it mimics the selection of a data member of the structure. The difference is that this is a function member, so it has an argument list.

Of course, the call that the compiler actually generates looks much more like the original C library function. Thus, considering name decoration and the passing of this, the C++ function call intStash.initialize(sizeof(int)) becomes something like Stash_initialize(&intStash, sizeof(int)). If you ever wonder what’s going on underneath the covers, remember that the original C++ compiler cfront from AT&T produced C code as its output, which was then compiled by the underlying C compiler. This approach meant that cfront could be quickly ported to any machine that had a C compiler, and it helped to rapidly disseminate C++ compiler technology. But because the C++ compiler had to generate C, you know that there must be some way to represent C++ syntax in C (some compilers still allow you to produce C code).

There’s one other change from CLibTest.cpp, which is the introduction of the require.h header file. This is a header file that I created for this book to perform more sophisticated error checking than that provided by assert( ). It contains several functions, including the one used here called assure( ), which is used for files. This function checks to see if the file has successfully been opened, and if not it reports to standard error that the file could not be opened (thus it needs the name of the file as the second argument) and exits the program. The require.h functions will be used throughout the book, in particular to ensure that there are the right number of command-line arguments and that files are opened properly. The require.h functions replace repetitive and distracting error-checking code, and yet they provide useful error messages. These functions will be fully explained later in the book.

What's an object?

Now that you’ve seen an initial example, it’s time to step back and take a look at some terminology. The act of bringing functions inside structures is the root of what C++ adds to C, and it introduces a new way of thinking about structures: as concepts. In C, a struct is an agglomeration of data, a way to package data so you can treat it in a clump. But it’s hard to think about it as anything but a programming convenience. The functions that operate on those structures are elsewhere. However, with functions in the package, the structure becomes a new creature, capable of describing both characteristics (like a C struct does) and behaviors. The concept of an object, a free-standing, bounded entity that can remember and act, suggests itself.

In C++, an object is just a variable, and the purest definition is “a region of storage” (this is a more specific way of saying, “an object must have a unique identifier,” which in the case of C++ is a unique memory address). It’s a place where you can store data, and it’s implied that there are also operations that can be performed on this data.

Unfortunately, there’s not complete consistency across languages when it comes to these terms, although they are fairly well-accepted. You will also sometimes encounter disagreement about what an object-oriented language is, although that seems to be reasonably well sorted out by now. There are languages that are object-based, which means that they have objects like the C++ structures-with-functions that you’ve seen so far. This, however, is only part of the picture when it comes to an object-oriented language, and languages that stop at packaging functions inside data structures are object-based, not object-oriented.

Abstract data typing

The ability to package data with functions allows you to create a new data type. This is often called encapsulation[33]. An existing data type may have several pieces of data packaged together. For example, a float has an exponent, a mantissa, and a sign bit. You can tell it to do things: add to another float or to an int, and so on. It has characteristics and behavior.

The definition of Stash creates a new data type. You can add( ), fetch( ), and inflate( ). You create one by saying Stash s, just as you create a float by saying float f. A Stash also has characteristics and behavior. Even though it acts like a real, built-in data type, we refer to it as an abstract data type, perhaps because it allows us to abstract a concept from the problem space into the solution space. In addition, the C++ compiler treats it like a new data type, and if you say a function expects a Stash, the compiler makes sure you pass a Stash to that function. So the same level of type checking happens with abstract data types (sometimes called user-defined types) as with built-in types.

You can immediately see a difference, however, in the way you perform operations on objects. You say object.memberFunction(arglist). This is “calling a member function for an object.” But in object-oriented parlance, this is also referred to as “sending a message to an object.” So for a Stash s, the statement s.add(&i) “sends a message to s” saying, “add( ) this to yourself.” In fact, object-oriented programming can be summed up in a single phrase: sending messages to objects. Really, that’s all you do – create a bunch of objects and send messages to them. The trick, of course, is figuring out what your objects and messages are, but once you accomplish this the implementation in C++ is surprisingly straightforward.

Object details

A question that often comes up in seminars is, “How big is an object, and what does it look like?” The answer is “about what you expect from a C struct.” In fact, the code the C compiler produces for a C struct (with no C++ adornments) will usually look exactly the same as the code produced by a C++ compiler. This is reassuring to those C programmers who depend on the details of size and layout in their code, and for some reason directly access structure bytes instead of using identifiers (relying on a particular size and layout for a structure is a nonportable activity).

The size of a struct is the combined size of all of its members. Sometimes when the compiler lays out a struct, it adds extra bytes to make the boundaries come out neatly – this may increase execution efficiency. In Chapter 15, you’ll see how in some cases “secret” pointers are added to the structure, but you don’t need to worry about that right now.

You can determine the size of a struct using the sizeof operator. Here’s a small example:

//: C04:Sizeof.cpp
// Sizes of structs
#include "CLib.h"
#include "CppLib.h"
#include <iostream>
using namespace std;

struct A {
  int i[100];
};

struct B {
  void f();
};

void B::f() {}

int main() {
  cout << "sizeof struct A = " << sizeof(A)
       << " bytes" << endl;
  cout << "sizeof struct B = " << sizeof(B)
       << " bytes" << endl;
  cout << "sizeof CStash in C = " 
       << sizeof(CStash) << " bytes" << endl;
  cout << "sizeof Stash in C++ = " 
       << sizeof(Stash) << " bytes" << endl;
} ///:~

On my machine (your results may vary) the first print statement produces 200 because each int occupies two bytes. struct B is something of an anomaly because it is a struct with no data members. In C, this is illegal, but in C++ we need the option of creating a struct whose sole task is to scope function names, so it is allowed. Still, the result produced by the second print statement is a somewhat surprising nonzero value. In early versions of the language, the size was zero, but an awkward situation arises when you create such objects: They have the same address as the object created directly after them, and so are not distinct. One of the fundamental rules of objects is that each object must have a unique address, so structures with no data members will always have some minimum nonzero size.

The last two sizeof statements show you that the size of the structure in C++ is the same as the size of the equivalent version in C. C++ tries not to add any unnecessary overhead.

Header file etiquette

When you create a struct containing member functions, you are creating a new data type. In general, you want this type to be easily accessible to yourself and others. In addition, you want to separate the interface (the declaration) from the implementation (the definition of the member functions) so the implementation can be changed without forcing a re-compile of the entire system. You achieve this end by putting the declaration for your new type in a header file.

When I first learned to program in C, the header file was a mystery to me. Many C books don’t seem to emphasize it, and the compiler didn’t enforce function declarations, so it seemed optional most of the time, except when structures were declared. In C++ the use of header files becomes crystal clear. They are virtually mandatory for easy program development, and you put very specific information in them: declarations. The header file tells the compiler what is available in your library. You can use the library even if you only possess the header file along with the object file or library file; you don’t need the source code for the cpp file. The header file is where the interface specification is stored.

Although it is not enforced by the compiler, the best approach to building large projects in C is to use libraries; collect associated functions into the same object module or library, and use a header file to hold all the declarations for the functions. It is de rigueur in C++; you could throw any function into a C library, but the C++ abstract data type determines the functions that are associated by dint of their common access to the data in a struct. Any member function must be declared in the struct declaration; you cannot put it elsewhere. The use of function libraries was encouraged in C and institutionalized in C++.

Importance of header files

When using a function from a library, C allows you the option of ignoring the header file and simply declaring the function by hand. In the past, people would sometimes do this to speed up the compiler just a bit by avoiding the task of opening and including the file (this is usually not an issue with modern compilers). For example, here’s an extremely lazy declaration of the C function printf( ) (from <stdio.h>):

printf(...);

The ellipses specify a variable argument list[34], which says: printf( ) has some arguments, each of which has a type, but ignore that. Just take whatever arguments you see and accept them. By using this kind of declaration, you suspend all error checking on the arguments.

This practice can cause subtle problems. If you declare functions by hand, in one file you may make a mistake. Since the compiler sees only your hand-declaration in that file, it may be able to adapt to your mistake. The program will then link correctly, but the use of the function in that one file will be faulty. This is a tough error to find, and is easily avoided by using a header file.

If you place all your function declarations in a header file, and include that header everywhere you use the function and where you define the function, you ensure a consistent declaration across the whole system. You also ensure that the declaration and the definition match by including the header in the definition file.

If a struct is declared in a header file in C++, you must include the header file everywhere a struct is used and where struct member functions are defined. The C++ compiler will give an error message if you try to call a regular function, or to call or define a member function, without declaring it first. By enforcing the proper use of header files, the language ensures consistency in libraries, and reduces bugs by forcing the same interface to be used everywhere.

The header is a contract between you and the user of your library. The contract describes your data structures, and states the arguments and return values for the function calls. It says, “Here’s what my library does.” The user needs some of this information to develop the application and the compiler needs all of it to generate proper code. The user of the struct simply includes the header file, creates objects (instances) of that struct, and links in the object module or library (i.e., the compiled code).

The compiler enforces the contract by requiring you to declare all structures and functions before they are used and, in the case of member functions, before they are defined. Thus, you’re forced to put the declarations in the header and to include the header in the file where the member functions are defined and the file(s) where they are used. Because a single header file describing your library is included throughout the system, the compiler can ensure consistency and prevent errors.

There are certain issues that you must be aware of in order to organize your code properly and write effective header files. The first issue concerns what you can put into header files. The basic rule is “only declarations,” that is, only information to the compiler but nothing that allocates storage by generating code or creating variables. This is because the header file will typically be included in several translation units in a project, and if storage for one identifier is allocated in more than one place, the linker will come up with a multiple definition error (this is C++’s one definition rule: You can declare things as many times as you want, but there can be only one actual definition for each thing).

This rule isn’t completely hard and fast. If you define a variable that is “file static” (has visibility only within a file) inside a header file, there will be multiple instances of that data across the project, but the linker won’t have a collision[35]. Basically, you don’t want to do anything in the header file that will cause an ambiguity at link time.

The multiple-declaration problem

The second header-file issue is this: when you put a struct declaration in a header file, it is possible for the file to be included more than once in a complicated program. Iostreams are a good example. Any time a struct does I/O it may include one of the iostream headers. If the cpp file you are working on uses more than one kind of struct (typically including a header file for each one), you run the risk of including the <iostream> header more than once and re-declaring iostreams.

The compiler considers the redeclaration of a structure (this includes both structs and classes) to be an error, since it would otherwise allow you to use the same name for different types. To prevent this error when multiple header files are included, you need to build some intelligence into your header files using the preprocessor (Standard C++ header files like <iostream> already have this “intelligence”).

Both C and C++ allow you to redeclare a function, as long as the two declarations match, but neither will allow the redeclaration of a structure. In C++ this rule is especially important because if the compiler allowed you to redeclare a structure and the two declarations differed, which one would it use?

The problem of redeclaration comes up quite a bit in C++ because each data type (structure with functions) generally has its own header file, and you have to include one header in another if you want to create another data type that uses the first one. In any cpp file in your project, it’s likely that you’ll include several files that include the same header file. During a single compilation, the compiler can see the same header file several times. Unless you do something about it, the compiler will see the redeclaration of your structure and report a compile-time error. To solve the problem, you need to know a bit more about the preprocessor.

The preprocessor directives #define, #ifdef, and #endif

The preprocessor directive #define can be used to create compile-time flags. You have two choices: you can simply tell the preprocessor that the flag is defined, without specifying a value:

#define FLAG

or you can give it a value (which is the typical C way to define a constant):

#define PI 3.14159

In either case, the label can now be tested by the preprocessor to see if it has been defined:

#ifdef FLAG

This will yield a true result, and the code following the #ifdef will be included in the package sent to the compiler. This inclusion stops when the preprocessor encounters the statement

#endif

or

#endif // FLAG

Any non-comment after the #endif on the same line is illegal, even though some compilers may accept it. The #ifdef/#endif pairs may be nested within each other.

The complement of #define is #undef (short for “un-define”), which will make an #ifdef statement using the same variable yield a false result. #undef will also cause the preprocessor to stop using a macro. The complement of #ifdef is #ifndef, which will yield a true if the label has not been defined (this is the one we will use in header files).

There are other useful features in the C preprocessor. You should check your local documentation for the full set.

A standard for header files

In each header file that contains a structure, you should first check to see if this header has already been included in this particular cpp file. You do this by testing a preprocessor flag. If the flag isn’t set, the file wasn’t included and you should set the flag (so the structure can’t get re-declared) and declare the structure. If the flag was set then that type has already been declared so you should just ignore the code that declares it. Here’s how the header file should look:

#ifndef HEADER_FLAG
#define HEADER_FLAG
// Type declaration here...
#endif // HEADER_FLAG

As you can see, the first time the header file is included, the contents of the header file (including your type declaration) will be included by the preprocessor. All the subsequent times it is included – in a single compilation unit – the type declaration will be ignored. The name HEADER_FLAG can be any unique name, but a reliable standard to follow is to capitalize the name of the header file and replace periods with underscores (leading underscores, however, are reserved for system names). Here’s an example:

//: C04:Simple.h
// Simple header that prevents re-definition
#ifndef SIMPLE_H
#define SIMPLE_H

struct Simple {
  int i,j,k;
  void initialize() { i = j = k = 0; }
};
#endif // SIMPLE_H ///:~

Although the SIMPLE_H after the #endif is commented out and thus ignored by the preprocessor, it is useful for documentation.

These preprocessor statements that prevent multiple inclusion are often referred to as include guards.

Namespaces in headers

You’ll notice that using directives are present in nearly all the cpp files in this book, usually in the form:

using namespace std;

Since std is the namespace that surrounds the entire Standard C++ library, this particular using directive allows the names in the Standard C++ library to be used without qualification. However, you’ll virtually never see a using directive in a header file (at least, not outside of a scope). The reason is that the using directive eliminates the protection of that particular namespace, and the effect lasts until the end of the current compilation unit. If you put a using directive (outside of a scope) in a header file, it means that this loss of “namespace protection” will occur with any file that includes this header, which often means other header files. Thus, if you start putting using directives in header files, it’s very easy to end up “turning off” namespaces practically everywhere, and thereby neutralizing the beneficial effects of namespaces.

In short: don’t put using directives in header files.

Using headers in projects

When building a project in C++, you’ll usually create it by bringing together a lot of different types (data structures with associated functions). You’ll usually put the declaration for each type or group of associated types in a separate header file, then define the functions for that type in a translation unit. When you use that type, you must include the header file to perform the declarations properly.

Sometimes that pattern will be followed in this book, but more often the examples will be very small, so everything – the structure declarations, function definitions, and the main( ) function – may appear in a single file. However, keep in mind that you’ll want to use separate files and header files in practice.

Nested structures

The convenience of taking data and function names out of the global name space extends to structures. You can nest a structure within another structure, and therefore keep associated elements together. The declaration syntax is what you would expect, as you can see in the following structure, which implements a push-down stack as a simple linked list so it “never” runs out of memory:

//: C04:Stack.h
// Nested struct in linked list
#ifndef STACK_H
#define STACK_H

struct Stack {
  struct Link {
    void* data;
    Link* next;
    void initialize(void* dat, Link* nxt);
  }* head;
  void initialize();
  void push(void* dat);
  void* peek();
  void* pop();
  void cleanup();
};
#endif // STACK_H ///:~

The nested struct is called Link, and it contains a pointer to the next Link in the list and a pointer to the data stored in the Link. If the next pointer is zero, it means you’re at the end of the list.

Notice that the head pointer is defined right after the declaration for struct Link, instead of a separate definition Link* head. This is a syntax that came from C, but it emphasizes the importance of the semicolon after the structure declaration; the semicolon indicates the end of the comma-separated list of definitions of that structure type. (Usually the list is empty.)

The nested structure has its own initialize( ) function, like all the structures presented so far, to ensure proper initialization. Stack has both an initialize( ) and cleanup( ) function, as well as push( ), which takes a pointer to the data you wish to store (it assumes this has been allocated on the heap), and pop( ), which returns the data pointer from the top of the Stack and removes the top element. (When you pop( ) an element, you are responsible for destroying the object pointed to by the data.) The peek( ) function also returns the data pointer from the top element, but it leaves the top element on the Stack.

Here are the definitions for the member functions:

//: C04:Stack.cpp {O}
// Linked list with nesting
#include "Stack.h"
#include "../require.h"
using namespace std;

void 
Stack::Link::initialize(void* dat, Link* nxt) {
  data = dat;
  next = nxt;
}

void Stack::initialize() { head = 0; }

void Stack::push(void* dat) {
  Link* newLink = new Link;
  newLink->initialize(dat, head);
  head = newLink;
}

void* Stack::peek() { 
  require(head != 0, "Stack empty");
  return head->data; 
}

void* Stack::pop() {
  if(head == 0) return 0;
  void* result = head->data;
  Link* oldHead = head;
  head = head->next;
  delete oldHead;
  return result;
}

void Stack::cleanup() {
  require(head == 0, "Stack not empty");
} ///:~

The first definition is particularly interesting because it shows you how to define a member of a nested structure. You simply use an additional level of scope resolution to specify the name of the enclosing struct. Stack::Link::initialize( ) takes the arguments and assigns them to its members.

Stack::initialize( ) sets head to zero, so the object knows it has an empty list.

Stack::push( ) takes the argument, which is a pointer to the variable you want to keep track of, and pushes it on the Stack. First, it uses new to allocate storage for the Link it will insert at the top. Then it calls Link’s initialize( ) function to assign the appropriate values to the members of the Link. Notice that the next pointer is assigned to the current head; then head is assigned to the new Link pointer. This effectively pushes the Link in at the top of the list.

Stack::pop( ) captures the data pointer at the current top of the Stack; then it moves the head pointer down and deletes the old top of the Stack, finally returning the captured pointer. When pop( ) removes the last element, then head again becomes zero, meaning the Stack is empty.

Stack::cleanup( ) doesn’t actually do any cleanup. Instead, it establishes a firm policy that “you (the client programmer using this Stack object) are responsible for popping all the elements off this Stack and deleting them.” The require( ) is used to indicate that a programming error has occurred if the Stack is not empty.

Why couldn’t the Stack destructor be responsible for all the objects that the client programmer didn’t pop( )? The problem is that the Stack is holding void pointers, and you’ll learn in Chapter 13 that calling delete for a void* doesn’t clean things up properly. The subject of “who’s responsible for the memory” is not even that simple, as we’ll see in later chapters.

Here’s an example to test the Stack:

//: C04:StackTest.cpp
//{L} Stack
//{T} StackTest.cpp
// Test of nested linked list
#include "Stack.h"
#include "../require.h"
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1); // File name is argument
  ifstream in(argv[1]);
  assure(in, argv[1]);
  Stack textlines;
  textlines.initialize();
  string line;
  // Read file and store lines in the Stack:
  while(getline(in, line))
    textlines.push(new string(line));
  // Pop the lines from the Stack and print them:
  string* s;
  while((s = (string*)textlines.pop()) != 0) {
    cout << *s << endl;
    delete s; 
  }
  textlines.cleanup();
} ///:~

This is similar to the earlier example, but it pushes lines from a file (as string pointers) on the Stack and then pops them off, which results in the file being printed out in reverse order. Note that the pop( ) member function returns a void* and this must be cast back to a string* before it can be used. To print the string, the pointer is dereferenced.

As textlines is being filled, the contents of line is “cloned” for each push( ) by making a new string(line). The value returned from the new-expression is a pointer to the new string that was created and that copied the information from line. If you had simply passed the address of line to push( ), you would end up with a Stack filled with identical addresses, all pointing to line. You’ll learn more about this “cloning” process later in the book.

The file name is taken from the command line. To guarantee that there are enough arguments on the command line, you see a second function used from the require.h header file: requireArgs( ), which compares argc to the desired number of arguments and prints an appropriate error message and exits the program if there aren’t enough arguments.

Global scope resolution

The scope resolution operator gets you out of situations in which the name the compiler chooses by default (the “nearest” name) isn’t what you want. For example, suppose you have a structure with a local identifier a, and you want to select a global identifier a from inside a member function. The compiler would default to choosing the local one, so you must tell it to do otherwise. When you want to specify a global name using scope resolution, you use the operator with nothing in front of it. Here’s an example that shows global scope resolution for both a variable and a function:

//: C04:Scoperes.cpp
// Global scope resolution
int a;
void f() {}

struct S {
  int a;
  void f();
};

void S::f() {
  ::f();  // Would be recursive otherwise!
  ::a++;  // Select the global a
  a--;    // The a at struct scope
}
int main() { S s; f(); } ///:~

Without scope resolution in S::f( ), the compiler would default to selecting the member versions of f( ) and a.

Summary

In this chapter, you’ve learned the fundamental “twist” of C++: that you can place functions inside of structures. This new type of structure is called an abstract data type, and variables you create using this structure are called objects, or instances, of that type. Calling a member function for an object is called sending a message to that object. The primary action in object-oriented programming is sending messages to objects.

Although packaging data and functions together is a significant benefit for code organization and makes library use easier because it prevents name clashes by hiding the names, there’s a lot more you can do to make programming safer in C++. In the next chapter, you’ll learn how to protect some members of a struct so that only you can manipulate them. This establishes a clear boundary between what the user of the structure can change and what only the programmer may change.

Exercises

Solutions to selected exercises can be found in the electronic document The Thinking in C++ Annotated Solution Guide, available for a small fee from http://www.BruceEckel.com.

1. In the Standard C library, the function puts( ) prints a char array to the console (so you can say puts("hello")). Write a C program that uses puts( ) but does not include <stdio.h> or otherwise declare the function. Compile this program with your C compiler. (Some C++ compilers are not distinct from their C compilers; in this case you may need to discover a command-line flag that forces a C compilation.) Now compile it with the C++ compiler and note the difference.
2. Create a struct declaration with a single member function, then create a definition for that member function. Create an object of your new data type, and call the member function.
3. Change your solution to Exercise 2 so the struct is declared in a properly “guarded” header file, with the definition in one cpp file and your main( ) in another.
4. Create a struct with a single int data member, and two global functions, each of which takes a pointer to that struct. The first function has a second int argument and sets the struct’s int to the argument value, the second displays the int from the struct. Test the functions.
5. Repeat Exercise 4 but move the functions so they are member functions of the struct, and test again.
6. Create a class that (redundantly) performs data member selection and a member function call using the this keyword (which refers to the address of the current object).
7. Make a Stash that holds doubles. Fill it with 25 double values, then print them out to the console.
8. Repeat Exercise 7 with Stack.
9. Create a file containing a function f( ) that takes an int argument and prints it to the console using the printf( ) function in <stdio.h> by saying: printf("%d\n", i) in which i is the int you wish to print. Create a separate file containing main( ), and in this file declare f( ) to take a float argument. Call f( ) from inside main( ). Try to compile and link your program with the C++ compiler and see what happens. Now compile and link the program using the C compiler, and see what happens when it runs. Explain the behavior.
10. Find out how to produce assembly language from your C and C++ compilers. Write a function in C and a struct with a single member function in C++. Produce assembly language from each and find the function names that are produced by your C function and your C++ member function, so you can see what sort of name decoration occurs inside the compiler.
11. Write a program with conditionally-compiled code in main( ), so that when a preprocessor value is defined one message is printed, but when it is not defined another message is printed. Compile this code experimenting with a #define within the program, then discover the way your compiler takes preprocessor definitions on the command line and experiment with that.
12. Write a program that uses assert( ) with an argument that is always false (zero) to see what happens when you run it. Now compile it with #define NDEBUG and run it again to see the difference.
13. Create an abstract data type that represents a videotape in a video rental store. Try to consider all the data and operations that may be necessary for the Video type to work well within the video rental management system. Include a print( ) member function that displays information about the Video.
14. Create a Stack object to hold the Video objects from Exercise 13. Create several Video objects, store them in the Stack, then display them using Video::print( ).
15. Write a program that prints out all the sizes for the fundamental data types on your computer using sizeof.
16. Modify Stash to use a vector<char> as its underlying data structure.
17. Dynamically create pieces of storage of the following types, using new: int, long, an array of 100 chars, an array of 100 floats. Print the addresses of these and then free the storage using delete.
18. Write a function that takes a char* argument. Using new, dynamically allocate an array of char that is the size of the char array that’s passed to the function. Using array indexing, copy the characters from the argument to the dynamically allocated array (don’t forget the null terminator) and return the pointer to the copy. In your main( ), test the function by passing a static quoted character array, then take the result of that and pass it back into the function. Print both strings and both pointers so you can see they are different storage. Using delete, clean up all the dynamic storage.
19. Show an example of a structure declared within another structure (a nested structure). Declare data members in both structs, and declare and define member functions in both structs. Write a main( ) that tests your new types.
20. How big is a structure? Write a piece of code that prints the size of various structures. Create structures that have data members only and ones that have data members and function members. Then create a structure that has no members at all. Print out the sizes of all these. Explain the reason for the result of the structure with no data members at all.
21. C++ automatically creates the equivalent of a typedef for structs, as you’ve seen in this chapter. It also does this for enumerations and unions. Write a small program that demonstrates this.
22. Create a Stack that holds Stashes. Each Stash will hold five lines from an input file. Create the Stashes using new. Read a file into your Stack, then reprint it in its original form by extracting it from the Stack.
23. Modify Exercise 22 so that you create a struct that encapsulates the Stack of Stashes. The user should only add and get lines via member functions, but under the covers the struct happens to use a Stack of Stashes.
24. Create a struct that holds an int and a pointer to another instance of the same struct. Write a function that takes the address of one of these structs and an int indicating the length of the list you want created. This function will make a whole chain of these structs (a linked list), starting from the argument (the head of the list), with each one pointing to the next. Make the new structs using new, and put the count (which object number this is) in the int. In the last struct in the list, put a zero value in the pointer to indicate that it’s the end. Write a second function that takes the head of your list and moves through to the end, printing out both the pointer value and the int value for each one.
25. Repeat Exercise 24, but put the functions inside a struct instead of using “raw” structs and functions.

[33] This term can cause debate. Some people use it as defined here; others use it to describe access control, discussed in the following chapter.

[34] To write a function definition for a function that takes a true variable argument list, you must use varargs, although these should be avoided in C++. You can find details about the use of varargs in your C manual.

[35] However, in Standard C++ file static is a deprecated feature.

Last Update:02/01/2000