Object construction and virtual methods

Dan and I were chatting at Cenqua last week about some interesting interactions between object constructors and virtual methods in Java. I investigated a bit further, comparing C++, Java and C#, and thought it was worth a blog entry.

Before Java came along I was a C++ developer. There’s plenty to get your head around in C++, especially what the compiler does behind the scenes, and I was pretty clued up on all that. One of the interesting things is what happens to virtual function dispatch during object construction. When Java arrived, one of the things I noticed was that it was quite different in this area. To illustrate the point, consider the following C++ code

#include <iostream>
using std::cout;
using std::endl;

class A {
public:

    virtual void yell()
    {
        cout << "I'm an A" << endl;
    }

    A() {
        cout << "Constructing A" << endl;
        yell();
    }
};

class B : public A {
public:    
    B() {
        cout << "Constructing B" << endl;
    }

    virtual void yell()
    {
        cout << "I'm a B" << endl;
    }
};

int main() {
    A* a = new B();
    a->yell();
    return 0;
}

and its equivalent in Java

public class A {
    public void yell()
    {
        System.out.println("I'm an A");
    }

    public A() {
        System.out.println("Constructing A");
        yell();
    }

    public static void main(String[] args) {
        A a = new B();
        a.yell();
    }
}

class B extends A {
    public B() {
        System.out.println("Constructing B");
    }

    public void yell()
    {
        System.out.println("I'm a B");
    }
};

These two programs behave differently. The C++ output is

Constructing A
I'm an A
Constructing B
I'm a B

whereas the Java output is

Constructing A
I'm a B
Constructing B
I'm a B

So in Java (and C#) a virtual method can be called on an object before the constructor of the class that overrides it has run. In C++ the object is constructed in stages and its type changes over time as each constructor in the class hierarchy runs. Virtual dispatch is active, but only to the methods of the object’s current type.
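One way to see that Java does not stage the object’s type is to ask the object for its class from inside the superclass constructor. The following is a minimal sketch; the names `TypeDuringConstruction`, `Base` and `Derived` are made up for illustration and are not part of the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative assumptions.
class TypeDuringConstruction {
    // Records what each constructor observes, in order.
    static final List<String> log = new ArrayList<>();

    static class Base {
        Base() {
            // getClass() reports the runtime class, which is already Derived here.
            log.add("In Base constructor, runtime type is " + getClass().getSimpleName());
        }
    }

    static class Derived extends Base {
        Derived() {
            log.add("In Derived constructor, runtime type is " + getClass().getSimpleName());
        }
    }

    // Builds one Derived and returns a copy of what the constructors logged.
    static List<String> run() {
        log.clear();
        new Derived();
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Both lines report `Derived`: the object has its final type from the moment the first constructor starts, which is why dispatch from the superclass constructor reaches the subclass override.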

In many respects the C++ approach is safer than Java’s, where a call can go “below” the object’s currently constructed type. In Java there is a danger of a method being executed before the constructor of the class that defines it has run. These errors can be difficult to find because the person who wrote the code has a very strong assumption that the constructor has been called before any method is called. Even reading the code, this assumption will be at play. To understand the behaviour you need to be thinking outside the single class you may be looking at. Here is an example.

If we change the Java code for the class B to this

class B extends A {
    private String data;
    
    public B() {
        System.out.println("Constructing B");
        data = "B data";
    }

    public void yell()
    {
        System.out.println("I'm a B - see my data is " + data);
    }
};

the output is

Constructing A
I'm a B - see my data is null
Constructing B
I'm a B - see my data is B data

To the developer of class B that could be a very surprising result. Despite initializing the data field in the constructor, a method call reveals the data to be uninitialized. C#’s behaviour is the same (except that it does not print the value of the null string as “null”, just as an empty string). Up to this point, things are not too surprising. Things get more interesting when we move the value initializers to the declarations. Here Java’s and C#’s behaviours diverge. Here’s the C# code

class B : A {
    private string data = "B data";
    
    public B() {
        Console.WriteLine("Constructing B");
    }

    public override void yell()
    {
        Console.WriteLine("I'm a B - see my data is " + data);
    }
};

With the equivalent change made to the Java class, the Java output is the same as before, but the C# output is now

Constructing A
I'm a B - see my data is B data
Constructing B
I'm a B - see my data is B data
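For comparison, here is a minimal Java sketch of the same change, with the initializer moved to the field declaration. The wrapper class `FieldInitializerOrder` and the `log` list are assumptions added so the whole thing fits in one file; `A` and `B` follow the earlier example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class so the example is self-contained.
class FieldInitializerOrder {
    // Collects the output lines in the order they are produced.
    static final List<String> log = new ArrayList<>();

    static class A {
        A() {
            log.add("Constructing A");
            yell(); // dispatches to B.yell() when a B is being built
        }

        void yell() {
            log.add("I'm an A");
        }
    }

    static class B extends A {
        // Runs after A's constructor returns and before the body of B().
        private String data = "B data";

        B() {
            log.add("Constructing B");
        }

        @Override
        void yell() {
            log.add("I'm a B - see my data is " + data);
        }
    }

    // Builds a B, calls yell() once more, and returns a copy of the log.
    static List<String> run() {
        log.clear();
        A a = new B();
        a.yell();
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first `yell()` still sees `null`, because the field initializer has not run yet when `A`’s constructor makes the call.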

So in Java the variable initializers run as part of the constructor, after the superclass constructor has returned, but in C# they run before the base class constructor (i.e. before the virtual methods can be called). In Java this means that data changes made in virtual methods may be overwritten by constructor operations and variable initializers. Consider the following updated version of the example

public class A {
    public void yell()
    {
        System.out.println("I'm an A");
    }

    public A() {
        System.out.println("Constructing A");
        setData();
        yell();
    }

    public void setData() {
        // do nothing
    }
    
    public static void main(String[] args) {
        A a = new B();
        a.yell();
    }
}

class B extends A {
    private String data = null;
    
    public B() {
        System.out.println("Constructing B");
    }

    public void yell()
    {
        System.out.println("I'm a B - see my data is " + data);
    }
    
    public void setData() {
        data = "B data";
    }
};

The output is the reverse of the previous version:

Constructing A
I'm a B - see my data is B data
Constructing B
I'm a B - see my data is null

That’s interesting. The data field’s null initializer, run as part of the constructor, undoes the change made in the setData virtual method. Again, the C# equivalent code produces a different result because the initializers are run before the base class constructor makes any virtual calls. It also means that, in Java, having a null initializer on a field actually generates some code in the constructor. If we remove the null assignment on the data field, the value does not get reset in the constructor.
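The difference between the two declarations can be put side by side. This is a minimal sketch; `NullInitializerDemo`, `Base`, `WithNullInitializer` and `WithoutInitializer` are hypothetical names, and the `data()` accessor is added only so the field can be inspected.

```java
// Hypothetical demo class; all names are illustrative assumptions.
class NullInitializerDemo {
    static class Base {
        Base() {
            setData(); // virtual call made during construction
        }

        void setData() {
            // do nothing
        }
    }

    static class WithNullInitializer extends Base {
        // The explicit initializer runs after Base() returns and resets the field.
        private String data = null;

        @Override
        void setData() {
            data = "B data";
        }

        String data() {
            return data;
        }
    }

    static class WithoutInitializer extends Base {
        // No initializer, so nothing overwrites the value set during Base().
        private String data;

        @Override
        void setData() {
            data = "B data";
        }

        String data() {
            return data;
        }
    }

    public static void main(String[] args) {
        System.out.println("with '= null' initializer: " + new WithNullInitializer().data());
        System.out.println("without initializer: " + new WithoutInitializer().data());
    }
}
```

The first line prints `null` and the second prints `B data`, even though the two subclasses differ only in the `= null` on the field.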

I always assumed that null initializers were benign and neither generated code nor changed behaviour. It’s not that simple. The moral of the story is to be very careful calling virtual (non-final) methods in constructors and to know what your language does behind the scenes, as the three languages I’ve looked at here are all different.
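One common way to sidestep the problem in Java is to keep overridable calls out of constructors altogether and make them only once construction has finished, for example from a factory method. The sketch below is just one option, not the only one; `FactoryInsteadOfVirtualCall` and `createAndYell` are hypothetical names, and `yell()` returns its message here instead of printing it so the result is easy to check.

```java
// Hypothetical demo class; the factory method name is an assumption.
class FactoryInsteadOfVirtualCall {
    static class A {
        // The constructor deliberately calls no overridable methods.
        A() {
        }

        String yell() {
            return "I'm an A";
        }
    }

    static class B extends A {
        private String data = "B data";

        @Override
        String yell() {
            return "I'm a B - see my data is " + data;
        }
    }

    // The virtual call happens only after every constructor and
    // field initializer in the hierarchy has run.
    static String createAndYell() {
        A a = new B();
        return a.yell();
    }

    public static void main(String[] args) {
        System.out.println(createAndYell());
    }
}
```

Because `yell()` is first called after `new B()` returns, it can never observe the `data` field in its uninitialized state.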
