Task Configuration in Ant

Task configuration is the process in Ant whereby the values from the XML build files are passed into the appropriate methods in the corresponding Java objects. For example, given the following build file snippet,

<target name="example">
    <echo message="test"/>
</target>

the configuration process will call the setMessage(String) method on the Echo Java task. In addition to setting attributes, configuration is also responsible for creating nested elements and passing any text content to the task.

Throughout Ant's history the configuration process and its supporting object model have evolved significantly. Many of the structures and processing for configuration in the current model are there as a result of Ant's evolution. It is instructive therefore to look at how Ant has evolved and how that impacts Ant's current operation.

Early Days

In the very earliest versions of Ant, the build file consisted of just property definitions, task definitions (taskdefs) and targets which contained tasks. There were no nested elements and the only task setting methods recognized were those that took strings. Taskdefs and properties were not handled by Ant tasks, but were processed directly in the build file parsing stage. As a result, when tasks were processed, the mapping from task name to Java class was completely determined, and the object model built from the build file was very simple.

[Figure: Early Ant object model]

The model involved a single project instance containing multiple targets, with each target containing many tasks. At this point in time the build model contained the actual task instances. For example, if <echo> was encountered while parsing the build file, Ant's task map was queried for the corresponding Java task, org.apache.tools.ant.taskdefs.Echo, and an instance was created.

Ant operates in two phases - a parsing phase and a runtime phase. At this point in Ant's history, the parsing phase would read the build file and produce the build model shown in the diagram. Configuration of tasks was performed in this parsing phase, as was the invocation of the task's init() method. Once all tasks were configured, the runtime phase was started. All of the parsing information was discarded and the requested targets were run, executing their tasks in sequence. In other words, there was no abstract model of the build file available once parsing had finished. The build was represented by the concrete instances of the tasks that could be executed.

The first major change to this approach was to convert taskdefs and properties to tasks. This is why, for a long time, these tasks were the only ones allowed to exist outside of a target - the so-called top-level tasks. There were a few experimental variations along the way, including elimination of top-level tasks and the special treatment of a target named "init". These behaviours were later removed.

For taskdefs to work, the task definition was actually performed in the parsing phase so the class to be examined would be known at parse-time. The required work was done in the init() method (a parse-time method) rather than the execute() method (a run-time method). However, as the property and taskdef operations were now tasks, they could be placed within targets. This created the expectation that these operations could be scoped and would only operate when the containing target was run. This scoping was, however, illusory. Since these two tasks worked at parse-time, they would function even when the containing target was not executed. Needless to say, this caused a great deal of confusion, and the lesson is that the structure of the build file should correlate with the runtime behaviour.

Nested Elements

Nested elements and support for attributes of types other than String were soon added to Ant. After some initial bean-introspection based approaches, these facilities were implemented using reflection in the IntrospectionHelper class. This class is a clever, albeit complex, piece of code which determines what attributes and nested elements a Java class can support. It builds a list of anonymous inner class instances to invoke the various element creation and attribute setting methods. This allows IntrospectionHelper to provide a reasonably simple interface to access these services. Of course, to know what nested elements could be created, the introspection helper needed to be given the class of the object being introspected.
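The idea can be illustrated with a much-reduced sketch. This is not the actual IntrospectionHelper code: the names EchoLike, AttributeIntrospector and AttributeSetter are hypothetical, lambdas stand in for the anonymous inner classes, and only String, int and boolean setters are handled.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a task class; this is not Ant's Echo.
class EchoLike {
    String message;
    int repeat = 1;

    public void setMessage(String message) { this.message = message; }
    public void setRepeat(int repeat) { this.repeat = repeat; }
}

// Hypothetical helper: scans a class once and records how to set each attribute.
class AttributeIntrospector {
    // One entry per supported attribute; converts the raw string and calls the setter.
    interface AttributeSetter {
        void set(Object target, String rawValue) throws Exception;
    }

    private final Map<String, AttributeSetter> setters = new HashMap<>();

    AttributeIntrospector(Class<?> type) {
        for (Method m : type.getMethods()) {
            String name = m.getName();
            if (!name.startsWith("set") || name.length() == 3 || m.getParameterCount() != 1) {
                continue; // not a setXXX(value) style method
            }
            String attribute = name.substring(3).toLowerCase(Locale.ENGLISH);
            Class<?> argType = m.getParameterTypes()[0];
            if (argType == String.class) {
                setters.put(attribute, (target, raw) -> m.invoke(target, raw));
            } else if (argType == int.class) {
                setters.put(attribute, (target, raw) -> m.invoke(target, Integer.parseInt(raw)));
            } else if (argType == boolean.class) {
                setters.put(attribute, (target, raw) -> m.invoke(target, Boolean.parseBoolean(raw)));
            }
            // Other argument types are ignored in this sketch.
        }
    }

    // Looks up the recorded setter for the attribute and applies the raw build file value.
    void setAttribute(Object target, String attribute, String rawValue) {
        AttributeSetter setter = setters.get(attribute.toLowerCase(Locale.ENGLISH));
        if (setter == null) {
            throw new IllegalArgumentException("Class " + target.getClass().getName()
                    + " doesn't support the \"" + attribute + "\" attribute.");
        }
        try {
            setter.set(target, rawValue);
        } catch (Exception e) {
            throw new IllegalStateException("Could not set " + attribute, e);
        }
    }
}
```

Calling new AttributeIntrospector(EchoLike.class).setAttribute(echo, "message", "test") ends up invoking echo.setMessage("test"). Note that the constructor needs the Class object up front, which is exactly the constraint described above.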

Just as attributes are mapped to setXXX() style methods, nested elements are mapped to either addXXX() or createXXX() methods. The former allows Ant to create the object corresponding to the nested element and pass it to the task, while the latter gives the task responsibility for object creation. Ant then configures the created object from the build file. In general, tasks should use the addXXX() form where possible.
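The difference between the two styles can be sketched as follows. The classes SourceDir and CopyLike are hypothetical and exist only for illustration, and the main method plays the role that Ant's configuration code would play in a real build.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical nested element type, standing in for something like a fileset.
class SourceDir {
    String dir;

    public void setDir(String dir) { this.dir = dir; }
}

// Hypothetical task that accepts nested elements in both styles.
class CopyLike {
    final List<SourceDir> added = new ArrayList<>();
    final List<SourceDir> created = new ArrayList<>();

    // addXXX style: the caller instantiates the element and hands it to the task.
    public void addSource(SourceDir source) {
        added.add(source);
    }

    // createXXX style: the task instantiates the element itself and returns it.
    public SourceDir createBackup() {
        SourceDir backup = new SourceDir();
        created.add(backup);
        return backup;
    }
}

class NestedElementDemo {
    public static void main(String[] args) {
        CopyLike task = new CopyLike();

        // addXXX: the configuring code creates the object, passes it in, then configures it.
        SourceDir source = new SourceDir();
        task.addSource(source);
        source.setDir("src");

        // createXXX: the task creates the object; the configuring code only configures it.
        SourceDir backup = task.createBackup();
        backup.setDir("backup");

        System.out.println(task.added.get(0).dir + " " + task.created.get(0).dir);
    }
}
```

With addXXX() the task does not need to know how to construct the element, which is one reason to prefer that form.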

Data Types

Where tasks are the verbs of an Ant build file, data types such as fileset, patternset, etc. are the nouns. They are things or collections of things which are manipulated by the tasks. Of course there are also many things external to Ant, such as files and compilation artifacts, that are also manipulated by Ant tasks.

Initially data types were supported at the top level only. Special case handling in the parsing phase would attempt to find known data types. Data types were not supported within targets.

Dynamic Tasks

There were other problems with parse-time operation of the taskdef task besides its apparent disrespect for target scoping. It was not possible to compile a task class and then taskdef the task into the build. The only solution to this problem was to have taskdef and property operate at run-time rather than parse-time (i.e. to do their work in their execute() method rather than their init() method). If this were possible, a task could be compiled and then taskdef'd into the same build.

The implication of this change, however, was that task classes could not be guaranteed to be known at parse-time. There would be tasks which had not yet been compiled, and whose classes therefore did not exist at parse-time. What attributes and nested elements these tasks would support could not be determined, since IntrospectionHelper would not have access to a class to examine. The solution to this was to store the parsing information and defer the introspection until the task was known.

The RuntimeConfigurable class was introduced to store the information about the build file structure. In effect, this class is the foundation of a lightweight Ant-specific DOM. Each RuntimeConfigurable represents a single non-<target> node in the build file, whether it is a task or a nested element. All aspects of the node are stored in the RuntimeConfigurable instance - the node's attributes, text content and child nodes (as other RuntimeConfigurable instances). The name, RuntimeConfigurable, is probably a slight misnomer since it is not actually configurable in the sense of Ant configuration. It, in fact, maintains the configuration information from the build file, which is eventually applied to tasks and nested elements, which are the real configurable components.
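As a rough picture of the kind of data such a node holds, consider the following sketch. ConfigNode is a hypothetical class, not the real RuntimeConfigurable, and its render() method exists only to show that the stored information is enough to reproduce the original element (attribute values are not escaped here).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified node: records what the parser saw for one element.
class ConfigNode {
    final String elementName;
    final Map<String, String> attributes = new LinkedHashMap<>(); // keeps build file order
    final StringBuilder text = new StringBuilder();
    final List<ConfigNode> children = new ArrayList<>();

    ConfigNode(String elementName) {
        this.elementName = elementName;
    }

    ConfigNode attribute(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    ConfigNode addText(String more) {
        text.append(more);
        return this;
    }

    ConfigNode child(ConfigNode node) {
        children.add(node);
        return this;
    }

    // Rebuilds an XML-like string from the stored attributes, text and children.
    String render() {
        StringBuilder out = new StringBuilder("<").append(elementName);
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            out.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        if (children.isEmpty() && text.length() == 0) {
            return out.append("/>").toString();
        }
        out.append('>').append(text);
        for (ConfigNode node : children) {
            out.append(node.render());
        }
        return out.append("</").append(elementName).append('>').toString();
    }
}
```

Nothing in this structure depends on a Java class for the element, which is what allows the information to be kept until the class becomes available.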

Each task in Ant has a reference to its RuntimeConfigurable instance.

[Figure: RuntimeConfigurable and UnknownElement]

Prior to the task being executed, it would need to be configured from its RuntimeConfigurable instance. This was done by calling the somewhat oddly-named method maybeConfigure(). The standard task implementation of the maybeConfigure() method passed control to the maybeConfigure() method of its RuntimeConfigurable which would configure the task and its nested elements from its stored information and its child RuntimeConfigurables.

With the build file information stored in the RuntimeConfigurables, all that was left to do was to support tasks whose class was not known at parse-time. Since the basic Ant model of project-target-task could not be changed, a new class UnknownElement was introduced to allow the model to support storing this information. UnknownElement extends Task, allowing it to be stored in the Ant object model. It is not, however, really a task - it is a placeholder for something which might become a task, a nested element within a task or something else entirely (Ant would later support dynamic definition of data types, which also are represented by UnknownElements).

When an UnknownElement in a build file has a nested element, a child UnknownElement would be created. For an unrecognized build file element, therefore, Ant would create a tree of UnknownElement instances. Each node in this tree would have its corresponding RuntimeConfigurable. These are, in effect, parallel trees, one composed of UnknownElements and the other of RuntimeConfigurables. It might seem redundant that Ant builds the tree of UnknownElements, since most information could be picked up from the RuntimeConfigurables when the UnknownElement was finally resolved to a real class. The main reason to do this is to support element ids. Every element in Ant, no matter where in the hierarchy, can be given an identifier. If the UnknownElement tree were not built, these ids could not be supported until a task was executed. The ids are therefore attached to the UnknownElements.

UnknownElement's maybeConfigure method was a little different from that of the standard Task. Prior to Task configuration, UnknownElement needs to determine the class which is actually intended to be used. As maybeConfigure is called just prior to task execution, the class for the task should exist at this point. An instance of this real task class is created and is available for introspection. The child elements of the UnknownElement are used to drive the creation of nested elements. The links to the original RuntimeConfigurable instances are updated to pass the relationship from the UnknownElement onto the real Task and its nested elements. Any ids are also transferred to the real elements. Now that the real task exists, its maybeConfigure method would be called to perform the standard configuration of the task.
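The deferred lookup can be sketched as follows. This is only an illustration under simplifying assumptions: SimpleTask, GreetTask and DeferredElement are hypothetical names, the definitions map stands in for Ant's task definitions, and nested elements and ids are left out. The method is called maybeConfigure to echo the text above, but its signature is invented for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical minimal task contract used only by this sketch.
interface SimpleTask {
    void setAttribute(String name, String value);
    String execute();
}

// Hypothetical task whose definition may not exist yet when the build file is parsed.
class GreetTask implements SimpleTask {
    private String who = "world";

    @Override
    public void setAttribute(String name, String value) {
        if (!"who".equals(name)) {
            throw new IllegalArgumentException("unsupported attribute: " + name);
        }
        who = value;
    }

    @Override
    public String execute() {
        return "hello " + who;
    }
}

// Placeholder created at parse-time; it only remembers the element name and attributes.
class DeferredElement {
    private final String elementName;
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private SimpleTask realTask;

    DeferredElement(String elementName) {
        this.elementName = elementName;
    }

    void attribute(String name, String value) {
        attributes.put(name, value);
    }

    // Called just before execution: resolve the name now, then apply the stored attributes.
    SimpleTask maybeConfigure(Map<String, Supplier<SimpleTask>> definitions) {
        if (realTask == null) {
            Supplier<SimpleTask> factory = definitions.get(elementName);
            if (factory == null) {
                throw new IllegalStateException("No definition for <" + elementName + ">");
            }
            SimpleTask task = factory.get();
            attributes.forEach(task::setAttribute);
            realTask = task;
        }
        return realTask;
    }
}
```

A DeferredElement for "greet" can be created before any definition for "greet" exists. As long as the definition has been registered by the time maybeConfigure is called, the real task is created and configured normally.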

Task Containers

Task Containers were introduced to allow Tasks to be composed of other Tasks. The two examples of Task Containers that come with Ant, <sequential> and <parallel>, allow tasks to be grouped for sequential or parallel execution. Targets were also generalized as Task Containers.

[Figure: RuntimeConfigurable and UnknownElement]

Task Containers allow you to build logic tasks where the execution of a set of tasks is controlled by another task. Examples are the <if> and <try-catch> tasks of ant-contrib, which are actually built with instances of the Sequential task.

Task containers turn out to complicate the configuration operation of Ant, because the nested tasks of a Task Container should not be resolved and configured when the TaskContainer task is itself configured. Some of the tasks in the TaskContainer may set property values or define tasks that later tasks in the same container depend on.
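A small sketch shows why the timing matters. It assumes a toy model in which configuring a nested task simply means expanding ${...} property references in its value; NestedStep and ContainerSketch are hypothetical names, and the expansion logic is deliberately naive rather than a copy of Ant's.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical nested task: either defines a property or echoes a message.
class NestedStep {
    final boolean definesProperty;
    final String name;  // property name; null for echo steps
    final String value; // raw text, may contain ${...} references

    private NestedStep(boolean definesProperty, String name, String value) {
        this.definesProperty = definesProperty;
        this.name = name;
        this.value = value;
    }

    static NestedStep property(String name, String value) {
        return new NestedStep(true, name, value);
    }

    static NestedStep echo(String message) {
        return new NestedStep(false, null, message);
    }
}

class ContainerSketch {
    // Replaces ${name} for every defined property; unknown references are left untouched.
    static String expand(String raw, Map<String, String> props) {
        String result = raw;
        for (Map.Entry<String, String> e : props.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    // Runs the steps in order and returns the messages produced by the echo steps.
    static List<String> run(List<NestedStep> steps, boolean configureUpFront) {
        Map<String, String> props = new HashMap<>();
        List<String> eager = new ArrayList<>();
        if (configureUpFront) {
            // Problematic for containers: every value is expanded before any step has run.
            for (NestedStep step : steps) {
                eager.add(expand(step.value, props));
            }
        }
        List<String> output = new ArrayList<>();
        for (int i = 0; i < steps.size(); i++) {
            NestedStep step = steps.get(i);
            // Deferred mode expands the value just before this step executes.
            String value = configureUpFront ? eager.get(i) : expand(step.value, props);
            if (step.definesProperty) {
                props.put(step.name, value);
            } else {
                output.add(value);
            }
        }
        return output;
    }
}
```

For a property step that sets x to 1 followed by an echo of "x=${x}", configuring everything up front leaves the reference unexpanded, while deferring configuration until each step runs lets the echo see the value set by the earlier step.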

Half and Half

At this stage in the evolution of Ant, the build model constructed by the parse phase consisted of some concrete tasks, representing those tasks which were known at parse-time, and UnknownElements for everything which was not known. All elements, both known and unknown, have associated RuntimeConfigurables. This is the half and half model. It has worked well for most build files, but it does suffer from some problems which we can expose with a task name redefinition. Consider the following code:

  <target name="redef">
    <taskdef name="echo" 
             classname="org.apache.tools.ant.taskdefs.Jar"/>
    <echo jarfile="test.jar">
    </echo>
  </target>

Ant 1.4 and previous versions will fail with the above code as follows:


build.xml:32: Class org.apache.tools.ant.taskdefs.Echo 
   doesn't support the "jarfile" attribute.

Since echo is a known task name, the Echo class is put into the build model and not an UnknownElement. When this task is configured from its RuntimeConfigurable, the configuration fails. Ant 1.5 is able to handle this construction through the use of a little trick. When a task is redefined, all existing task instances are marked as invalid and, when configuration is attempted, the task is replaced by the correct class. The addition of a nested element, however, is enough to bring even Ant 1.5 unstuck:

  <target name="redef">
    <taskdef name="echo" 
             classname="org.apache.tools.ant.taskdefs.Jar"/>
    <echo jarfile="test.jar">
      <fileset dir="build"/>
    </echo>
  </target>

Since the parsing code is dealing with a real class and not an UnknownElement, it checks the suitability of nested elements during the parsing phase. If the name of the task above is changed to some unknown name, the code will work. These are boundary cases - core task redefinition such as this is not common at all and not particularly recommended either. Nevertheless, it does highlight issues in the model.

Top-Level Tasks

In Ant 1.5, the build model still only allows targets, datatypes and specific "special" tasks to exist at the top-level of the build. This special-case behaviour in the parsing phase is a little undesirable. As a "definitional" task, it makes sense for taskdef to be allowed at the top-level. The problem is that to add any other definitional tasks would require a change to the Ant core code.

In Ant 1.6, this behaviour was changed. Any task is now allowed at the top-level. All top-level tasks are collected into an implicit target which is executed as soon as parsing is finished. This preserves the essential behaviour of parse-time, top-level tasks while allowing any task to be run at this phase. This should support the creation of new definitional tasks.

Abstract Build Model

Another change introduced in Ant 1.6 is to move away from the half and half build model described above. In Ant 1.6, all build file nodes are represented by UnknownElement instances. As such, the problems of core task redefinition are gone. This gives consistency to the way tasks are processed, especially taskdefs.

From a build file user's perspective, the Ant 1.6 changes will not be that noticeable - core tasks are no longer created at parse-time but are created just prior to execution. The right task still gets created and executed. The most visible difference will be to the <script> task. The script task offers build file writers the most visibility of the internal workings of Ant. As such it is more sensitive to changes in those workings. For example, if a script uses an id to reference a task, it may get different types of objects in different versions of Ant. In Ant 1.5, it would get a real task instance for a core task or a task that has been executed, but an UnknownElement for all other tasks. In Ant 1.6, it will only get a real task if the task has been executed. At all other times it will receive a reference to an UnknownElement. If you are using the script task to navigate around the build model, you need to know and handle these differences. Fortunately, most script task users don't manipulate the build model.

Configuration Walkthrough

Configuration is sufficiently complex that I thought a walkthrough of the whole process would help in understanding it. The description here relates to Ant 1.6.

Task Arguments

The final wrinkle to consider is where a task accepts another task as a nested element. For example, consider the following code in a task:

    public void addNested(Echo nestedEcho) {
        this.nestedEcho = nestedEcho;
    }

This nested element will be configured when the task itself is configured. Normally that is fine, unless the task with this nested element also happens to be a TaskContainer. In that case the tasks it contains might change the definitions of properties after the nested element has been configured. To allow for this, a task can be reconfigured from its RuntimeConfigurable, on request.

Conclusion

Well, that is about all there is to Task configuration in Ant. Where we are today is the result of a long journey of evolutionary development and refinement. The use of UnknownElements and RuntimeConfigurables gives an abstract build model within the confines of the original project-target-task model with which Ant started. A greenfields approach could probably deliver a cleaner model, at least for a short while, but it would render a lot of existing Ant build files and task definitions unsupported.