The State of the Language: An Interview with Bjarne Stroustrup : Page 4

C++ founding father assesses the language on the eve of its new standard.


Have you had a chance to program in any of the newer programming languages (Java, C#, Ruby, and so on)? Do you find anything in them that impresses you or is worth commending in terms of novelty, engineering merits, or simplicity?

I have tried a lot of languages, including those you mention. I'd prefer not to do language comparisons. Such comparisons are rarely fair and even less frequently seen to be fair. However, "simple" is not the first word that springs to mind. I note the size of the run-time support involved. I predicted the growth of Java complexity and don't condemn it—I consider complexity an inevitable consequence of serving a large community. The design aims of those languages are not those of C++ and vice versa.

I feel that there is pressure to add to C++ features such as finally, garbage collection, and dynamic array bounds checking as an attempt to appease users of other languages. In reality, these features don't fit well into the design aims of C++: finally can be replaced by RAII and local classes anyway, GC and destructors are mutually exclusive, and runtime bounds checking violates the pay-as-you-go and trust-the-programmer principles. Does C++ truly need a GC? And more generally, where do you draw the line between borrowed features that are indispensable and those that are not?

There is always pressure to add features. Many people think that I and the committee are just bloody-minded and/or stupid not to immediately add their favorite feature, typically a feature they have tried or just heard of in some other language. Often, those same people complain that the committee is adding too many features and that we should remove some of those old and ugly "legacy features." Making changes to a widely used language is not easy. There are distinct limits to how many changes we can add with a reasonable hope that they will be widely useful, rarely harmful or confusing, and not break existing code. People really don't like their existing code to be broken, and making significant extensions 100 percent compatible and properly integrated with all existing and new features can be quite difficult.

I don't know how to draw a sharp line between worthwhile and frivolous extensions, but I do know that no new feature is really "indispensable." I try to evaluate each new suggested feature on its merits and in the context of existing language features, library facilities, known problems, and other proposed features. Since the number of new features we can accept is limited, I try to maximize utility, minimize damage, and minimize implementation cost. Each new language feature is an intricate puzzle, and the more fundamental a new feature is, the more parts of the existing language and existing usage are affected and must be taken into account. For example, the new strongly typed enums were relatively easy to design and implement because they are an isolated feature, but conversely they are also unlikely to have a dramatic impact on the way people design their code and view C++. On the other hand, the new facilities for initialization are likely to impact every user and be highly visible in code. However, their definition touches many points of current usage and definition and is therefore at least an order of magnitude harder to design/define than the enumerations.

finally clauses for try blocks are a minor issue and, as you mention, redundant in that we can use RAII. It can even be seriously argued that providing finally would lead to uglier and more buggy code as programmers used to it in Java (or whatever) found it easier to avoid learning RAII. finally is not on the list of C++0x features.
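
As an illustration of the point about RAII making finally redundant, here is a minimal sketch. The names Guard, events, work, and run are hypothetical and exist only for this example; the event log is there just to make the order of operations visible. The cleanup sits in a destructor, so it runs on both the normal path and the exceptional path without any finally clause.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log, used only to make the order of operations observable.
static std::vector<std::string> events;

// Hypothetical RAII guard: acquires in the constructor, releases in the destructor.
class Guard {
public:
    explicit Guard(std::string name) : name_(std::move(name)) {
        events.push_back("acquire " + name_);
    }
    ~Guard() { events.push_back("release " + name_); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    std::string name_;
};

// The guard is released whether this function returns or throws.
void work(bool fail) {
    Guard g("lock");
    if (fail) throw std::runtime_error("failed");
    events.push_back("done");
}

// Runs work() and reports whether it completed without an exception.
bool run(bool fail) {
    events.clear();
    try {
        work(fail);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Calling run(true) and then inspecting events shows the release recorded even though work exited through an exception.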

The garbage collection issue is not simple. I even think that your question oversimplifies the issues—we need to find a way to combine GC and destructors. We will not see GC in C++0x, but we will see further work on a design for optional and programmer-controlled GC. For C++0x, we will get a definition of what it means for a pointer to be disguised and an ABI for deeming areas of memory "not containing pointers" and/or "cannot be collected." The result of these simple guarantees will be that existing add-on collectors will be more portable and more effective.

First, let's clarify the ideals: We want simple and comprehensive resource management. That is, we want every resource acquired to be released ("no leaks"). To get that, we need a programming model that is simple to use. A complicated model will lead to errors (leaks) when people misuse it or give up on it in favor of (often even more error prone) "home brew" solutions. "Comprehensive" is essential because memory isn't the only resource we need to worry about; we need to handle locks, sockets, file handles, thread handles, etc.

We have RAII (for scoped resource use) plus "smart pointers" (for resources that don't have their lifetimes determined by a simple scope) as a comprehensive solution. From this perspective, the smart pointer types in C++0x complete the RAII technique supported in C++98. Unfortunately, this set of techniques works only where people use it systematically and correctly.

For example:

void f(int i, char* p)
{
    vector<X> v(i);
    string s(p);
    // …
}

Here the storage for elements of v and s is handled automatically by the destructors of vector and string. If X has a non-memory resource (for instance, a lock), vector's destructor will release it. This style of use is simple, widely understood, and efficient.
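
To make the "RAII plus smart pointers" model concrete, here is a small sketch under stated assumptions. The counter held and the body of X are hypothetical stand-ins for a real non-memory resource such as a lock, and f is adapted from the snippet above to return a value so the effect can be checked. The helper make_x is also hypothetical; it shows the smart pointer half of the model, for a resource whose lifetime is not tied to one scope.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical count of resources currently held (stands in for locks).
static int held = 0;

// Hypothetical X: every live object holds one resource.
struct X {
    X() { ++held; }
    X(const X&) { ++held; }
    X& operator=(const X&) = default;
    ~X() { --held; }
};

// Adapted from f above: returns how many resources are held while v is alive.
// On return, vector's destructor destroys every X, and each X releases its resource.
int f(int i, const char* p) {
    std::vector<X> v(i);
    std::string s(p);
    return held;
}

// Smart pointer case: the X is released when the last owner goes away.
std::shared_ptr<X> make_x() {
    return std::make_shared<X>();
}
```

After f(3, "abc") returns, held is back to zero without any explicit cleanup code in f; likewise, an X obtained from make_x is released when the last shared_ptr to it is destroyed.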

The reason that GC is attractive to me is that there are projects where I think that "RAII plus smart pointers" is unlikely to be systematically and correctly used. Examples are projects where exceptions cannot be used, projects with a lot of exception-unsafe "legacy" parts, projects with components developed in a number of places with different programming philosophies and programmer skills, and projects with long-established resource management strategies that don't fit the "RAII plus smart pointers" model. Typically, such projects are valuable, long-lived, expensive to rewrite, and they leak. Add-on garbage collectors have been successfully used to deal with such leaks for over a decade. In some cases, the collector is used simply until the leaks can be plugged; in other cases, it is used because someone gave up plugging all the leaks. This use of GC is sometimes called "litter collection" as opposed to uses where programs leak for the convenience of programmers. My guess is that even with the best education based on RAII, we will have programs that need litter collection "forever": new ones will be written as fast as old ones are made safe.

Note that my aim is not to use GC to hide the problems and complexities of resource management, but to use GC as yet another tool for dealing with resource problems. This is quite different from the view of GC as a panacea. The current C++ techniques for resource management deal more directly with the problem than GC and should be used as the first line of defense against resource management problems. One of the strengths of well-written C++ is exactly that it generates so little garbage. This makes GC in C++ surprisingly (to some) efficient.

I consider it obvious that C++ GC will have to be optional (under some form of programmer control) because some programs do not need GC (they don't leak), some programs cannot afford GC delays (not all collectors offer real-time guarantees), and some programs cannot afford to include a significant run-time support system. One of the two major design issues for C++ GC is how to express this programmer control. The difficulty is to ensure that a program component that relies on the absence of GC (for performance, for instance) is never linked with a component that relies on GC. Remember that there is a widespread use of dynamic linking and plug-ins so in many contexts whole-program analysis is not an option.

As you indicate, the other big issue is how to reconcile GC and destructors. If programmers come to rely on GC to collect their garbage for them, they might "leak" an object for the collector to recycle even though that object had a non-trivial destructor—a destructor that releases a non-memory resource. For example, given GC, I might use a

vector<X*> v;

without providing code to delete the pointers when v is destroyed if I "know" that X does not own a non-memory resource. Such code would be brittle because a change to X (for instance, adding a lock member) could make my code leak. In general, this problem seems intractable for real-world scenarios involving maintenance of code (adding non-trivial destructors) and dynamic linking. However, my impression is that a combination of explicit declarations of destructors releasing non-memory resources ("explicit destructors") and heuristics can eliminate a high percentage of real problems. Remember that neither of the two "pure" alternatives (no-GC and all-GC) consistently leads to perfect memory management either (in the hands of real-world programmers for real-world problems), so the absence of a perfect solution should not deter us from providing a good one.
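
The brittleness described here can be sketched without a real collector. In the example below, everything is hypothetical: open_handles stands in for a non-memory resource, and reclaim_without_destructor is a toy stand-in for a collector that recycles storage without running destructors. It is not how any actual C++ garbage collector is specified to behave; it only shows what is lost if X gains a resource-releasing destructor while callers still rely on collection alone.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <vector>

// Hypothetical count of open non-memory resources (locks, file handles, ...).
static int open_handles = 0;

// Hypothetical X after maintenance: it now owns a non-memory resource.
struct X {
    X() { ++open_handles; }
    ~X() { --open_handles; }
    X(const X&) = delete;
    X& operator=(const X&) = delete;
};

// Allocate raw storage and construct an X in it.
X* allocate_x() {
    void* raw = ::operator new(sizeof(X));
    return new (raw) X;
}

// Toy stand-in for a collector: frees the storage but never calls ~X().
void reclaim_without_destructor(std::vector<X*>& v) {
    for (X* p : v) ::operator delete(static_cast<void*>(p));
    v.clear();
}

// Memory is recycled, but the resources owned by the X objects stay open.
int raw_pointers_then_collect() {
    std::vector<X*> v;
    for (int i = 0; i < 3; ++i) v.push_back(allocate_x());
    reclaim_without_destructor(v);
    return open_handles;
}

// Owning pointers run ~X() when v is destroyed, so nothing stays open.
int owning_pointers() {
    {
        std::vector<std::unique_ptr<X>> v;
        for (int i = 0; i < 3; ++i) v.push_back(std::unique_ptr<X>(new X));
    }
    return open_handles;
}
```

In this sketch the raw-pointer version leaves three handles open even though all memory was reclaimed, while the owning version leaves none.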
