Build an XML Based Scheduling Utility : Page 3

Complex applications often consist of many individual tasks, each of which may depend upon the successful completion of other tasks. For example, you may want an application to execute only if a preceding series of steps completes without failure, in a specific sequence. Managing such dependency sequences manually quickly becomes a burden. Learn to automate process sequence dependencies with this XML-based scheduling utility.





XML Scheduling Utility Program Outline
The program schedules jobs in two steps. First, it sorts the units (job_sum_boxes) and stores them in the appropriate order in a string array. The sort places dependencies in their proper order by searching the 'run_condition' of each unit for the name of the previously run unit(s) and then placing the matching units in the job stream.

Next, starting with the first unit, the program sorts the subunits (job_boxes) corresponding to each unit and executes the command associated with each subunit. Again, the sort searches the 'run_condition' of each subunit for the name(s) of the previous subunit(s) run.

The code below shows the overall structure of the DOMScheduler class:

public class DOMScheduler {
   // Declare class variables
   /* Code Snipped */
   ...
   ...

   // Constructor
   public DOMScheduler(String xmlFile) {
      DOMParser parser = new DOMParser();
      try {
         parser.parse(xmlFile);
         Document document = parser.getDocument();
         // Initialize the class Variables
         /* Code Snipped */
         …
         for (int loopCounter = 0; loopCounter...) {
            ...
            outerTraverse(document);
         }
         /* Code Snipped */
         ...
         for (int outerLoop = 0; ...) {
            /* Code snipped */
            ...
            traverse(document);
            traverseTwo(document);
         }
      } catch { ... }
   }

   public void outerTraverse(Node node) {}
   public void traverse(Node node) {}
   public void traverseTwo(Node node) {}
   public void innerTraverse(Node node, …..) {}

   public static void main(String[] args) {
      ...
      DOMScheduler domScheduler = new DOMScheduler(args[0]);
   }
}

The DOMScheduler constructor calls the class methods outerTraverse(), traverse() and traverseTwo(). The outerTraverse() method sorts the units. After the units have been sorted, the traverse() and traverseTwo() methods sort the subunits in each of the parent units. The traverseTwo() method calls the innerTraverse() method.

The string arrays jobSumBoxNameinRCondition[] and jobBoxNameinRCondition[] (declared as class variables) store the names of the unit (job_sum_box) and subunit (job_box), respectively, that are currently being processed. The next loop iteration searches for those names in the 'run_condition's of the remaining units and subunits. The tempJobSumBoxNameinRCondition[] and tempJobBoxNameinRCondition[] string arrays are temporary holders for jobSumBoxNameinRCondition[] and jobBoxNameinRCondition[] within the outerTraverse() and traverseTwo() methods.

All the traverse class methods share the same recursive traversal logic, an outline of which is shown below:

public void traverse(Node node) {
   int type = node.getNodeType();
   if (type == Node.ELEMENT_NODE) {
      //Additional logic
      ...
      Attr attrs[] = SortAttributes(node.getAttributes());
      // Process the attributes using attrs[]
   }
   // Recurse into every child of the current node
   NodeList children = node.getChildNodes();
   if (children != null) {
      for (int i = 0; i < children.getLength(); i++) {
         traverse(children.item(i));
      }
   }
}

The comment //Additional logic in the preceding code represents different code for each traverse method. Here's an explanation of the underlying logic for each type.

The outerTraverse() "Additional logic" Code Outline

if (node.getNodeName().equals("job_sum_box")) {
   NodeList nodeList = node.getChildNodes();
   /* Code Snipped */
   ...
   if (nodeList.item(i).getNodeName().equals("run_condition")) {
      searchString = new String(
         nodeList.item(i).getChildNodes().item(0).getNodeValue().trim());
      /* Code Snipped */
      ...
      if (searchString.lastIndexOf(jobSumBoxNameinRCondition[k]) != -1) {
         /* Code Snipped */
         ...
      }
   }
}

The preceding code identifies the units and checks the text value of each unit's run_condition (the searchString variable in the preceding code fragment). The code identifies the unit whose run_condition is none as the first unit to be run. The code makes additional checks on the 'process_code' (not shown above) of each unit to verify whether the unit has already been processed.
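As a rough illustration of the "first unit" check, here is a minimal sketch that uses the standard JAXP parser rather than the Xerces DOMParser class from the article. The XML layout is an assumption: the job_stream root element and the name attribute are hypothetical, as are the FirstUnitFinder class and its firstUnit() method. Only job_sum_box, job_box, and run_condition come from the article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the article's DOMScheduler source.
class FirstUnitFinder {

    // Returns the (assumed) name attribute of the first job_sum_box whose
    // own run_condition is "none", or null if no such unit exists.
    static String firstUnit(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList units = doc.getElementsByTagName("job_sum_box");
            for (int i = 0; i < units.getLength(); i++) {
                Element unit = (Element) units.item(i);
                // Look only at direct children, so the run_condition of a
                // nested job_box is not mistaken for the unit's own.
                NodeList children = unit.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeName().equals("run_condition")
                            && child.getTextContent().trim().equals("none")) {
                        return unit.getAttribute("name");
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse job stream", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<job_stream>"
                + "<job_sum_box name='load'><run_condition>extract</run_condition></job_sum_box>"
                + "<job_sum_box name='extract'><run_condition>none</run_condition></job_sum_box>"
                + "</job_stream>";
        System.out.println(firstUnit(xml));
    }
}
```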

The code then searches the run_condition of all units for the name of this unit (stored in jobSumBoxNameinRCondition[k]) and processes all matching units in the subsequent loop. The tempJobSumBoxNameinRCondition[] string array stores the unit names for the search in the next loop.
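The pass-by-pass search can be sketched with plain collections instead of a DOM tree. This is an illustrative approximation, not the article's code: the RunConditionOrder class, its order() method, and the sample unit names are hypothetical, and the sketch assumes each run_condition holds either none or the name of a preceding unit. It keeps the article's lastIndexOf() style of matching, and it loops until no new units are found rather than up to a fixed limit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the article's DOMScheduler source.
class RunConditionOrder {

    // Orders unit names so that each unit follows the unit named in its
    // run_condition. Keys are unit names, values are run_condition text.
    static List<String> order(Map<String, String> runConditions) {
        List<String> ordered = new ArrayList<>();
        List<String> current = new ArrayList<>();

        // First pass: units whose run_condition is "none" run first.
        for (Map.Entry<String, String> e : runConditions.entrySet()) {
            if (e.getValue().trim().equals("none")) {
                current.add(e.getKey());
            }
        }

        // Later passes: pick up units whose run_condition mentions a unit
        // placed in the previous pass.
        while (!current.isEmpty()) {
            ordered.addAll(current);
            List<String> next = new ArrayList<>();
            for (Map.Entry<String, String> e : runConditions.entrySet()) {
                String name = e.getKey();
                if (ordered.contains(name) || next.contains(name)) {
                    continue; // already placed in the job stream
                }
                for (String previous : current) {
                    if (e.getValue().lastIndexOf(previous) != -1) {
                        next.add(name);
                        break;
                    }
                }
            }
            current = next;
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, String> units = new LinkedHashMap<>();
        units.put("load", "extract");
        units.put("extract", "none");
        units.put("report", "load");
        System.out.println(order(units)); // prints [extract, load, report]
    }
}
```

Substring matching is simple but fragile: a unit named job1 would also match a run_condition that mentions job10, so exact token comparison may be a safer choice in practice.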

The traverseTwo() "Additional logic" Code Outline

if (node.getNodeName().equals("job_box")) {
   NodeList nodeList = node.getChildNodes();
   if (nodeList.item(i).getNodeName().equals("run_condition")) {
      searchString = new String(
         nodeList.item(i).getChildNodes().item(0).getNodeValue().trim());
      ...
      if (searchString.lastIndexOf(jobBoxNameinRCondition[k]) != -1) {
         if (searchString.lastIndexOf("AND") != -1) {
            // Logic to handle AND condition
            // involving a call to the
            // innerTraverse() method
         } else {
            ...
         }
      }
   }
}

A quick glance at the preceding structure reveals how similar the additional logic for outerTraverse() and traverseTwo() is. There are two major differences. First, traverseTwo() searches for the subunits (job_boxes) rather than the units. Second, traverseTwo() contains additional logic to handle multiple AND conditions in the run_condition, which involves a call to the innerTraverse() method.

The traverse() method is a special case of the traverseTwo() method used for the processing of the first subunit in each unit (identified by the none value in its run_condition).

The simple innerTraverse() method verifies that all the subunits contained in the jobNumberStr[] array have already been processed.
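The AND check can be illustrated without the DOM at all. The sketch below is an approximation under stated assumptions: the AndCondition class and allProcessed() method are hypothetical, a Set stands in for the jobNumberStr[] array and the process_code checks, and the run_condition syntax (names separated by the keyword AND) is assumed rather than taken from the sample XML.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating the AND check; not the article's code.
class AndCondition {

    // Returns true when every subunit named in an AND-joined run_condition
    // is already in the set of processed subunits.
    static boolean allProcessed(String runCondition, Set<String> processed) {
        // Assumed syntax: subunit names separated by the keyword AND.
        String[] required = runCondition.trim().split("\\s+AND\\s+");
        for (String name : required) {
            if (!processed.contains(name.trim())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> processed = new HashSet<>(Arrays.asList("job1", "job2"));
        System.out.println(allProcessed("job1 AND job2", processed)); // true
        System.out.println(allProcessed("job1 AND job3", processed)); // false
    }
}
```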

Extending the DOMScheduler Program
The DOMScheduler program contains hardcoded upper bounds for loopCounter (30) and k (10) in the loops. These values are set arbitrarily to account for a job stream of considerable complexity. You can increase them as necessary to work with more complex job streams.

You can avoid the hand-written code used to traverse the DOM tree by using a higher-level API such as JDOM, or by using the DOM Traversal Module, which is part of the DOM Level 2 specification (see Resources).
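For comparison, here is a minimal sketch of the DOM Traversal Module in use. It relies on the standard org.w3c.dom.traversal interfaces and the JAXP parser; the TraversalSketch class, the elementNames() method, and the job_stream root element in the sample XML are hypothetical. Not every DOM implementation supports the Traversal module, so the sketch checks before casting.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the article's DOMScheduler source.
class TraversalSketch {

    // Lists element names in document order using a NodeIterator, so no
    // hand-written recursion over getChildNodes() is needed.
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            if (!(doc instanceof DocumentTraversal)) {
                throw new UnsupportedOperationException(
                        "This DOM implementation lacks the Traversal module");
            }
            // SHOW_ELEMENT skips text, comment, and other non-element nodes.
            NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                    doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
            List<String> names = new ArrayList<>();
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                names.add(n.getNodeName());
            }
            return names;
        } catch (UnsupportedOperationException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<job_stream><job_sum_box><run_condition>none</run_condition>"
                + "<job_box><run_condition>none</run_condition></job_box>"
                + "</job_sum_box></job_stream>";
        System.out.println(elementNames(xml));
    }
}
```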

You could enhance the scheduling utility by including support for additional keywords such as OR and for complex run conditions involving combinations of AND and OR conditions. You could also add support for features such as Job Restart and Job Abort.

While the scheduling utility sample code does not use the success_code element, you should add a success_code check. You can store the output of each executable run in the files specified by the std_out_file and std_err_file elements. The sample project supports only a single job stream, but you can modify the project to support multiple job streams.
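One way such a check might look is sketched below with java.lang.ProcessBuilder. It assumes that success_code holds the integer exit code expected from the executable, which the article does not state. The CommandRunner class, the run() method, and the file names in main() are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper; not part of the article's DOMScheduler source.
class CommandRunner {

    // Runs a command, sends its standard output and standard error to the
    // given files, and reports whether the exit code matches successCode.
    static boolean run(String[] command, File stdOut, File stdErr, int successCode) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectOutput(stdOut)   // role of std_out_file
                    .redirectError(stdErr)    // role of std_err_file
                    .start();
            return process.waitFor() == successCode;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        // Demo: run the current JVM's launcher with -version.
        String launcher = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        boolean ok = run(new String[] {launcher, "-version"},
                new File(tmp, "job_out.txt"), new File(tmp, "job_err.txt"), 0);
        System.out.println(ok ? "success_code matched" : "job failed");
    }
}
```

A scheduler could use the boolean result to decide whether dependent subunits are allowed to run.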

You must populate the XML file as a one-time setup operation. Standard data warehouse applications usually store process flow information in a relational database. It's possible to create an XML file from database tables directly using standard object-relational mapping techniques and tools, thereby automating the process of populating the XML file. Refer to the resources section for a good link to XML-DBMS transformation tools and techniques.

Alternatively, you may be able to populate the XML file programmatically from an external source using data binding tools such as Java Architecture for XML Binding (JAXB). To store the XML, you can serialize the output DOM Tree and write the information to disk for further analysis.
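Serializing a DOM tree can be done with the identity transformer in javax.xml.transform. The sketch below builds a one-unit document and serializes it to a string; passing a StreamResult that wraps a File instead would write it to disk. The DomSerializer class, its methods, and the name attribute are hypothetical, and only the job_sum_box and run_condition element names come from the article.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; not part of the article's DOMScheduler source.
class DomSerializer {

    // Builds a document holding a single unit with the given run_condition.
    static Document sampleUnit(String name, String runCondition) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element unit = doc.createElement("job_sum_box");
            unit.setAttribute("name", name); // assumed attribute
            Element condition = doc.createElement("run_condition");
            condition.setTextContent(runCondition);
            unit.appendChild(condition);
            doc.appendChild(unit);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    // Serializes the DOM tree with an identity transformer.
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // Use new StreamResult(new java.io.File(...)) to write to disk.
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleUnit("extract", "none")));
    }
}
```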

While the sample code runs as a stand-alone executable, you could implement it as a Windows service. This would let you include additional scheduling information in the XML file, for example, "run this stream every 24 hours."

You can easily integrate an XML-based scheduling utility such as the one described in this article into any operational infrastructure that you may have. Such utilities can be a firm step toward simplifying your daily operations.

Manoj Sharma is a software architect with more than six years professional experience in the IT industry in various development and consulting roles. He is also a regular contributor to industry magazines and trade journals. You can reach him at sharma.manoj@rogers.com