Assignment 1, Bonus Points

DOM Parsing for XML Documents
Due Date: March 31st, 2008


You can earn 2 bonus points by extending the statistics of the first assignment with four additional calculations: average height, average length of sibling lists, maximal breadth, and average breadth. As before, consider only element and text nodes when you look for root-to-leaf paths, sibling (children) lists, or nodes on the same level.
By "maximal breadth" we mean the maximal number of nodes that are on the same level, i.e., that have the same distance to the root node.
The "average breadth" is the average number of nodes per level, taken over all levels in the tree.
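
The two breadth definitions can be sketched as a level-by-level traversal. This is only an illustration in Python with xml.dom.minidom, not the required implementation; the helper names counted_children and level_sizes are hypothetical, and skipping whitespace-only text nodes is an assumption chosen to match the node counts in the example run for test.xml.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Contents of test.xml from the example run.
XML = """<book isbn="1-2345-6789-0" year="1994">
<title>TCP/IP Illustrated</title>
<author><last>Stevens</last><first>John</first></author>
<publisher>Addison-Wesley</publisher>
<price currency="USD">65.95</price>
</book>"""


def counted_children(node):
    """Element children plus text children.

    Assumption: whitespace-only text nodes (the line breaks between
    elements) are not counted.
    """
    return [c for c in node.childNodes
            if c.nodeType == Node.ELEMENT_NODE
            or (c.nodeType == Node.TEXT_NODE and c.data.strip())]


def level_sizes(root):
    """Return the number of counted nodes on each level, root level first."""
    sizes, level = [], [root]
    while level:
        sizes.append(len(level))
        # The next level consists of all counted children of the current one.
        level = [c for n in level for c in counted_children(n)]
    return sizes


sizes = level_sizes(minidom.parseString(XML).documentElement)
max_breadth = max(sizes)                # largest level
avg_breadth = sum(sizes) / len(sizes)   # mean over all levels
print(sizes, max_breadth, avg_breadth)  # [1, 4, 5, 2] 5 3.0
```

For test.xml the levels contain 1, 4, 5, and 2 nodes, which gives a maximal breadth of 5 and an average breadth of 12/4 = 3.
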
For the average length of sibling lists, do not count the top-most node (viz. the book node below) as a sibling list of length 1; simply sum the numbers of (element or text) children of all element nodes, and divide by the total number of element nodes.
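
The sibling-list average can be sketched in the same way. Again this is a hedged illustration in Python with xml.dom.minidom; the names is_counted and child_counts are hypothetical, and ignoring whitespace-only text nodes is an assumption that matches the example counts.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Contents of test.xml from the example run.
XML = """<book isbn="1-2345-6789-0" year="1994">
<title>TCP/IP Illustrated</title>
<author><last>Stevens</last><first>John</first></author>
<publisher>Addison-Wesley</publisher>
<price currency="USD">65.95</price>
</book>"""


def is_counted(node):
    """True for element nodes and for text nodes that are not whitespace-only."""
    if node.nodeType == Node.ELEMENT_NODE:
        return True
    return node.nodeType == Node.TEXT_NODE and bool(node.data.strip())


def child_counts(element):
    """Yield the number of counted children of element and of every
    descendant element, in document order."""
    kids = [c for c in element.childNodes if is_counted(c)]
    yield len(kids)
    for c in kids:
        if c.nodeType == Node.ELEMENT_NODE:
            yield from child_counts(c)


root = minidom.parseString(XML).documentElement
counts = list(child_counts(root))
# Sum of all child counts divided by the number of element nodes.
avg_siblings = sum(counts) / len(counts)
print(counts, avg_siblings)
```

For test.xml the child counts are 4 (book), 1 (title), 2 (author), 1 (last), 1 (first), 1 (publisher), and 1 (price), so the average is 11/7, which rounds to 1.5714.
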
Your program from assignment 1 should print these additional numbers when called with the -sb option, precisely as shown below (with the correct numbers, of course).
Round the averages to four digits after the decimal point, and always print exactly four digits after the decimal point, as shown below.
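
The rounding rule corresponds to a fixed-point format with four decimals. A minimal sketch in Python follows; the helper name format_average is hypothetical, and other languages offer an equivalent such as a "%.4f" format string.

```python
def format_average(label, value):
    """Format one statistics line with exactly four digits after the point."""
    return f"{label}: {value:.4f}"


# Values taken from the test.xml example: 17/5, 11/7, and 12/4.
print(format_average("Average height", 17 / 5))
print(format_average("Average length of sibling lists", 11 / 7))
print(format_average("Average breadth", 12 / 4))
```

Trailing zeros are kept, so 3.4 is printed as 3.4000 and 3 as 3.0000.
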

Example Runs

Assume you have a file "test.xml" which consists of the following XML snippet:

<book isbn="1-2345-6789-0" year="1994">
<title>TCP/IP Illustrated</title>
<author><last>Stevens</last><first>John</first></author>
<publisher>Addison-Wesley</publisher>
<price currency="USD">65.95</price>
</book>

Your program should behave as follows (assuming the executable is named "DOMcat"):

> DOMcat -sb test.xml
Total number of nodes: 15
Number of element nodes: 7
Number of attribute nodes: 3
Number of text nodes: 5
Maximal height: 4
Maximal length of sibling list: 4
Number of distinct element names: 7
Number of distinct attribute names: 3
Average height: 3.4000
Average length of sibling lists: 1.5714
Maximal breadth: 5
Average breadth: 3.0000
