Tip of the Day
Language: C++
Expertise: Beginner
Mar 27, 1997

Virtual Functions

How are virtual functions implemented in C++? What is a virtual table?

Let me start by saying that the C++ language definition does not dictate how a compiler must implement virtual functions. In other words, this is a C++ compiler technology question and not a C++ language question.

That said, I'd like to talk about the "virtual function table," or vtbl as it is sometimes called.

The virtual table concept is one way to implement virtual functions, and many compilers adopt some variation of this general idea. It was the method chosen for the original C++ compiler that Bjarne Stroustrup wrote, and he explains it in his book "The Design and Evolution of C++." What I am going to give here is an explanation of the same concept.

In this implementation, each object of a class that has at least one virtual function has an additional (hidden) data member. This is a pointer to an array of structs of type vtbl_entry.
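
The hidden data member can usually be observed indirectly through sizeof. The following is a minimal sketch, assuming a mainstream compiler that uses a vtbl-style implementation; the names Plain, WithVirtual and hidden_overhead are made up for this illustration, and the exact number of extra bytes is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

// Two otherwise identical classes; only the second has a virtual function.
struct Plain {
    int x;
    void f() {}
};

struct WithVirtual {
    int x;
    virtual void f() {}
};

// Extra bytes each WithVirtual object carries compared to Plain.
// On typical compilers this is the hidden vtbl pointer plus any padding.
std::size_t hidden_overhead() {
    // Guard against unsigned underflow on an unusual implementation.
    assert(sizeof(WithVirtual) >= sizeof(Plain));
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

On such a compiler, hidden_overhead() is at least the size of one pointer. The language itself guarantees nothing about this number.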

Here is how the vtbl_entry may look:

struct vtbl_entry {
	FuncPtr fptr; // a pointer to a function
	int offset; // used in case of multiple inheritance
};

Ignore the offset for now. Now let's look at how the vtbl_entries are set up for some objects.

class Base {
public:
	virtual void f ();
	virtual void g ();

	// ... some data members
};

class Derived : public Base {
public:
	virtual void g (); // overrides Base::g
	// ... some data members
};

// now consider
Base *bp = new Derived;

// This is how the Derived object we created could be laid out.
---------------- ---------------- ---------------- ----------------
----> Start of Base subobject
----> Base::vtbl  { {Base::f,0}, {Derived::g,0} }
----> Base's data members
----> Derived::vtbl { {Derived::g,0} }
----> Derived's data members
---------------- DIAGRAM 1------- ---------------- ----------------

Notice that Base::vtbl has two entries (one per virtual function), and that the entry for g holds Derived::g.

Now a call like

bp->g (); // calls Derived::g because Derived overrides g

can be translated by the compiler to look like

vtbl_entry *entry = &bp->vtbl[index(g)]; // find the entry for function "g"
// This yields the entry holding Derived::g, as shown in diagram 1 above
(entry->fptr)(bp); // bp is the this pointer -- calls Derived::g

Note that a call to any virtual function (not just one that the derived class overrides) is implemented this way. So even the call bp->f() would generate code like the above, except that it would call Base::f, because that is what is stored in the vtbl (see diagram 1).
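
To make the translation concrete, here is a hand-written emulation of the single-inheritance case in ordinary C++. It is only a sketch of the idea, not what any particular compiler emits. BaseObj, DerivedObj, Impl, INDEX_F, INDEX_G and call_virtual are hypothetical names introduced for the illustration, and the functions return a tag instead of doing real work so that the dispatch result can be checked.

```cpp
#include <cassert>

// All names here are hypothetical stand-ins for compiler-generated code.
enum Impl { BASE_F, BASE_G, DERIVED_G };  // identifies which body ran
enum { INDEX_F = 0, INDEX_G = 1 };        // slot numbers: index(f), index(g)

using FuncPtr = Impl (*)(void* self);

struct vtbl_entry {
    FuncPtr fptr;  // a pointer to a function
    int offset;    // unused with single inheritance, always 0 here
};

struct BaseObj {
    const vtbl_entry* vtbl;  // the "hidden" pointer, made explicit
};

struct DerivedObj {
    BaseObj base;  // the Base subobject comes first
    int extra;     // stand-in for Derived's data members
};

Impl Base_f(void*) { return BASE_F; }
Impl Base_g(void*) { return BASE_G; }
Impl Derived_g(void*) { return DERIVED_G; }

// One table per class. Derived reuses Base_f and replaces the g slot.
const vtbl_entry base_vtbl[]    = { {Base_f, 0}, {Base_g, 0} };
const vtbl_entry derived_vtbl[] = { {Base_f, 0}, {Derived_g, 0} };

// What a call such as bp->g() turns into: look up the slot, then call
// through the stored function pointer with bp as the this pointer.
Impl call_virtual(BaseObj* bp, int index) {
    assert(bp != nullptr && bp->vtbl != nullptr);
    const vtbl_entry* entry = &bp->vtbl[index];
    return entry->fptr(bp);
}
```

If a DerivedObj is set up with its vtbl member pointing at derived_vtbl, then call_virtual(&obj.base, INDEX_G) reaches Derived_g, while call_virtual(&obj.base, INDEX_F) still reaches Base_f. The caller never needs to know the dynamic type.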

Now let's consider a more complex example:

class Base1 {
public:
	virtual void f ();
	// Base1's data members
};

class Base2 {
public:
	virtual void f ();
	virtual void g ();
	// Base2's data members
};

class Derived : public Base1, public Base2 {
public:
	virtual void f (); // overrides both Base1::f and Base2::f
	// Derived's data members
};

// now consider the statement 
Base2 * base2 = new Derived;

// This is how the Derived object we created could be laid out.
---------------- ---------------- ---------------- ----------------
----> Base1::vtbl  { {Derived::f,0} }
----> Base1's data members
----> Base2::vtbl  { {Derived::f,-delta(Base2)} , {Base2::g,0} } /**NOTE**/
----> Base2's data members
----> Derived::vtbl { {Derived::f,0} }
----> Derived's data members
---------------- DIAGRAM 2------- ---------------- ----------------

Notice that Base2::vtbl has two entries (one per virtual function), and that the first entry stores Derived::f with -delta(Base2), rather than zero, in the offset field. Let's examine why this is necessary:

// Consider the call

base2->g (); // Since the function g is defined only in Base2, the
// this pointer passed to g must point to the Base2 subobject.
// That is exactly what happens if this call is translated in the
// same way as the bp->g() call in the first example.

// Now consider the call

base2->f(); // Since the function f is overridden by Derived, the
// this pointer passed to this function must point to the entire
// Derived object. If the generated code were exactly the same as
// in the other cases, we would pass a pointer to the Base2 subobject,
// which is wrong: Derived::f() would not be able to correctly access
// Derived's own members or those of its Base1 base class.
// To solve this problem, the compiler-generated code would look like
// the following:

vtbl_entry *entry = &base2->vtbl[index(f)]; // find the entry for function "f"
// This yields the entry holding Derived::f, as shown in diagram 2 above
(entry->fptr)((char *)base2 + entry->offset); // We now add the offset stored in the
// vtbl entry for f, which is -delta(Base2): a negative number of bytes
// representing the distance between the Base2 subobject in Derived and
// the start of the whole Derived object. The cast to char * makes the
// addition count in bytes rather than in units of sizeof(Base2).
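
The offset adjustment can also be emulated by hand. The sketch below rebuilds diagram 2 with plain structs, under the assumption that the Base1 part is laid out first and the Base2 part follows it. Base1Part, Base2Part, DerivedObj, delta_base2, make_derived and call_virtual are hypothetical names, and the byte-level pointer arithmetic mimics what a compiler does internally rather than something to write in application code.

```cpp
#include <cassert>
#include <cstddef>

// All names here are hypothetical stand-ins for compiler-generated code.
using FuncPtr = int (*)(void* self);

struct vtbl_entry {
    FuncPtr fptr;           // a pointer to a function
    std::ptrdiff_t offset;  // byte adjustment added to the this pointer
};

struct Base1Part { const vtbl_entry* vtbl; int b1; };
struct Base2Part { const vtbl_entry* vtbl; int b2; };

struct DerivedObj {
    Base1Part base1;  // at offset 0
    Base2Part base2;  // at offset delta(Base2)
    int d;            // Derived's own data
};

// delta(Base2): distance in bytes from the start of the whole object
// to its Base2 part.
const std::ptrdiff_t delta_base2 = offsetof(DerivedObj, base2);

// Derived's f expects a pointer to the whole object.
int Derived_f(void* self) { return static_cast<DerivedObj*>(self)->d; }
// Base2's g expects a pointer to the Base2 part only.
int Base2_g(void* self) { return static_cast<Base2Part*>(self)->b2; }

const vtbl_entry base1_vtbl_in_derived[] = { {Derived_f, 0} };
const vtbl_entry base2_vtbl_in_derived[] = { {Derived_f, -delta_base2},
                                             {Base2_g, 0} };

DerivedObj make_derived(int b1, int b2, int d) {
    return DerivedObj{ {base1_vtbl_in_derived, b1},
                       {base2_vtbl_in_derived, b2}, d };
}

// A call through a Base2 pointer: the offset is always added, with no
// branching, and is simply 0 for functions that need no adjustment.
int call_virtual(Base2Part* bp, int index) {
    assert(bp != nullptr && bp->vtbl != nullptr);
    const vtbl_entry* entry = &bp->vtbl[index];
    return entry->fptr(reinterpret_cast<char*>(bp) + entry->offset);
}
```

Given an object from make_derived(1, 2, 3) and a pointer to its base2 member, slot 0 reaches Derived_f with the adjusted pointer and returns the d value 3, while slot 1 reaches Base2_g with the unadjusted pointer and returns the b2 value 2.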

Note that the offset is always added to the this pointer before it is passed. In most cases this offset is 0, so it has no effect. The advantage of this implementation is that it is fairly simple and its call sequence does not contain any branching (no if or case statements), and hence it is considered quite fast. -bom C++Pro
