Dangling pointers

A dangling pointer is a pointer to storage that is no longer allocated. Dangling pointers are nasty bugs because they seldom crash the program until long after they have been created, which makes them hard to find. Programs that create dangling pointers often appear to work on small inputs, but are likely to fail on large or complex inputs.

As the world's leading example of an object-oriented programming language that does not rely on garbage collection, C++ makes it easy to create dangling pointers. Here are a few examples of the most popular techniques.

(Note: These examples use C-style strings because we've been using an old version of GNU C++ whose strings do not conform to the new standard.)

    delete [] s1;
    delete [] s2;
    return f (s1, s2);      // s1 and s2 are dangling pointers

This code will probably appear to work unless f, or one of the functions called during the activation of f, happens to allocate heap storage. When the bug does show up, it will probably look like a bug in f or in one of the functions that f calls.
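One way to avoid this particular bug is to finish using the strings before releasing them. The following is a minimal sketch, not the original program: the body of f and the helper name combinedLength are hypothetical, chosen only so that the example is self-contained.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for f: any function that reads both strings.
std::size_t f (const char * s1, const char * s2) {
    return std::strlen (s1) + std::strlen (s2);
}

// Hypothetical caller: copies its arguments, calls f, and only then
// releases the copies.
std::size_t combinedLength (const char * a, const char * b) {
    char * s1 = new char [std::strlen (a) + 1];
    char * s2 = new char [std::strlen (b) + 1];
    std::strcpy (s1, a);
    std::strcpy (s2, b);

    std::size_t result = f (s1, s2);   // s1 and s2 are still allocated here

    delete [] s1;                      // release only after the last use
    delete [] s2;
    return result;
}
```

Saving the result in a local variable costs one extra line, but it guarantees that nothing reads s1 or s2 after they have been deleted.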

typedef Foo_ * Foo;

Foo newFoo (char * x) {
    Foo_ tmp(x);
    return &tmp;
}

This is the classic technique for creating a dangling pointer in C: return the address of a local variable, whose storage is released as soon as the function returns.

typedef char * Foo;

Foo newFoo (char * x) {
    Foo tmp = new char [strlen (x) +1] ;
    strcpy (tmp, x);
    delete [] x;
    return tmp;
}

Here newFoo creates a dangling pointer by deleting the client's C-style string.

typedef char * Foo;

Foo newFoo (char * s) {
    return s;
}

If newFoo is supposed to return a Foo whose lifetime is independent of the lifetime of its argument, then a dangling pointer will be created when a client deletes the C-style string that was passed to newFoo. The bug might appear to lie in the client code, but newFoo would be the real culprit.
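If an independent lifetime is what newFoo promises, one option is for it to copy the string rather than return the argument. A minimal sketch, assuming the caller becomes responsible for calling delete [] on the result (an ownership convention the original does not spell out):

```cpp
#include <cstring>

typedef char * Foo;

// Returns a freshly allocated copy of s.  The argument is neither kept
// nor deleted, so the client may release it whenever it likes.
// Assumption: the caller releases the result with delete [].
Foo newFoo (const char * s) {
    Foo copy = new char [std::strlen (s) + 1];
    std::strcpy (copy, s);
    return copy;
}
```

With this version the client can delete its own string immediately after the call, and the returned Foo remains valid until it is deleted in turn.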

class Foo {
  public:
    Foo (char * x) : len(strlen(x)), name(x) { }
  private:
    int len;
    char * name;
};

Foo newFoo (char * s) {
    return Foo(s);
}

Once again, a dangling pointer will be created when a client deletes the C-style string that was passed to Foo or newFoo.

class Foo {
  public:
    Foo (char * x) {
        len = strlen (x);
        name = new char[len + 1];
        strcpy (name, x);
    }
    virtual ~Foo () {
        delete [] name;
    }
  private:
    int len;
    char * name;
};

Foo newFoo (char * s) {
    Foo foo = Foo(s);
    return foo;
}

This code fixes the previous bug by introducing three new bugs. The most obvious is that the compiler inserts an implicit call to foo.~Foo() when newFoo returns. This implicit call deallocates foo.name. The Foo that is returned by newFoo is a memberwise copy of foo, so unless the compiler happens to optimize the copy away, it contains a dangling pointer.

The other bugs are illustrated by the following client code:

    Foo f1 = newFoo ("hi there");
    Foo f2 = f1;
    Foo f3 = newFoo ("");   // Foo has no default constructor
    f3 = f2;

Since no copy constructor is defined, the compiler will implicitly define one that makes Foo f2 = f1 roughly equivalent to

    Foo f2;
    f2.len = f1.len;
    f2.name = f1.name;

Thus f2.name becomes the same pointer as f1.name.

Similarly, no assignment operator is defined, so the compiler will implicitly define an assignment operator that makes f3 = f2 roughly equivalent to

    f3.len = f2.len;
    f3.name = f2.name;

Thus each of f1, f2, and f3 contains exactly the same pointer. When they go out of scope, that pointer will be deallocated not once, but three times.
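The aliasing can be observed safely with a stripped-down stand-in for Foo. The sketch below is only an illustration: PlainFoo and sharesStorage are hypothetical names, and PlainFoo deliberately has no destructor so that the experiment itself does not delete anything twice.

```cpp
#include <cstring>

// Hypothetical stand-in for Foo with no destructor, so copying it is
// harmless here.  The compiler-generated copy operations copy the
// members one by one, as described above.
struct PlainFoo {
    int len;
    char * name;
};

// Returns true if the implicit copy constructor and the implicit
// assignment operator leave all three objects sharing one buffer.
bool sharesStorage () {
    char * buffer = new char [9];
    std::strcpy (buffer, "hi there");

    PlainFoo f1 = { 8, buffer };
    PlainFoo f2 = f1;                  // implicit copy constructor
    PlainFoo f3 = { 0, 0 };
    f3 = f2;                           // implicit assignment operator

    bool shared = (f1.name == f2.name) && (f2.name == f3.name);
    delete [] buffer;                  // released exactly once
    return shared;
}
```

Add a destructor that deletes name, as the real Foo does, and the single delete at the end of this function turns into three deletes of the same storage.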

A storage leak would be created if we were to remove the destructor or to remove the call to delete, so those are not good alternatives. What we need is a copy constructor and an overloaded assignment operator.

class Foo {
  public:
    Foo (char * x) {
        len = strlen (x);
        name = new char[len + 1];
        strcpy (name, x);
    }
    virtual ~Foo () {
        delete [] name;
    }
    Foo (const Foo & foo);                     // copy constructor
    const Foo & operator= (const Foo &);       // assignment operator
  private:
    int len;
    char * name;
};

//  copy constructor

Foo::Foo (const Foo & foo) {
    len = foo.len;
    name = new char [foo.len + 1];             // + 1 for the terminating null
    strcpy(name, foo.name);
}

//  assignment operator

const Foo & Foo::operator= (const Foo & rhs) {
    delete [] name;
    len = rhs.len;
    name = new char [rhs.len + 1];
    strcpy(name, rhs.name);
    return *this;                              // so x = y = z will work
}

Foo newFoo (char * s) {
    Foo foo = Foo(s);
    return foo;
}

This code still contains a bug. Consider the client code

    Foo f1 = newFoo ("hello");
    Foo f2 = newFoo ("goodbye");
    f1 = flag ? f1 : f2;

The assignment represents an implicit call to f1.operator=(flag ? f1 : f2). Suppose flag is true, so the value of the right hand side of the assignment is a reference to f1. The code for f1.operator= begins by deleting f1.name, which destroys the only copy of the string and, for a moment, leaves f1.name dangling. It then allocates a new, uninitialized array and stores it in f1.name. Because rhs is f1 itself, rhs.name is now that same new array, so strcpy is called with the uninitialized array as both its source and its destination. The behavior is undefined: the original contents of f1 are lost, and strcpy may read past the end of the array while looking for a terminating null character.
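That the right hand side really is f1 itself, and not a copy, can be checked directly. A small sketch using a hypothetical Item type and helper (neither is from the original): when both operands of ?: are lvalues of the same type, the result is an lvalue, so its address can be compared with the address of f1.

```cpp
// Hypothetical type standing in for Foo; only object identity matters.
struct Item {
    int value;
};

// Returns true when the conditional expression designates f1 itself.
bool selectsFirst (bool flag) {
    Item f1 = { 1 };
    Item f2 = { 2 };
    const Item & chosen = flag ? f1 : f2;   // binds to f1 or f2, no copy
    return &chosen == &f1;
}
```

This address comparison is the same test an assignment operator can apply to this and &rhs.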

The solution for this problem is to make the assignment operator check whether this is equal to the right hand side:

const Foo & Foo::operator= (const Foo & rhs) {
    if (this != &rhs) {                        // do nothing on self-assignment
        delete [] name;
        len = rhs.len;
        name = new char [rhs.len + 1];
        strcpy(name, rhs.name);
    }
    return *this;                              // so x = y = z will work
}
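Putting the pieces together, one possible complete version of the class looks like this. It is a sketch rather than the author's final code: the constructor takes const char * so that string literals can be passed under current C++ rules, and the accessor c_str is a hypothetical addition used only for inspecting the contents.

```cpp
#include <cstring>

class Foo {
  public:
    Foo (const char * x) {
        len = static_cast<int> (std::strlen (x));
        name = new char [len + 1];
        std::strcpy (name, x);
    }
    Foo (const Foo & foo) {                    // copy constructor
        len = foo.len;
        name = new char [len + 1];
        std::strcpy (name, foo.name);
    }
    const Foo & operator= (const Foo & rhs) {  // assignment operator
        if (this != &rhs) {                    // skip self-assignment
            delete [] name;
            len = rhs.len;
            name = new char [len + 1];
            std::strcpy (name, rhs.name);
        }
        return *this;                          // so x = y = z will work
    }
    virtual ~Foo () {
        delete [] name;
    }
    // Hypothetical accessor, not part of the original class.
    const char * c_str () const { return name; }
  private:
    int len;
    char * name;
};

Foo newFoo (const char * s) {
    Foo foo = Foo (s);
    return foo;
}
```

With these definitions, the client code f1 = flag ? f1 : f2 leaves f1 unchanged when flag is true, and every Foo owns a separate copy of its string.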

What could be simpler?

Last updated 2 March 1998.