Assignment 4

How to Print Query Results:

Assignment 1

<a>
  <b>Hello</b>
  <c></c>
  <b>World</b>
</a>

Here are examples of what your program should print as result, assuming the executable is called XPathEval.
Note that we will always enclose the query in quotes ("") when we test your program.

> XPathEval h.xml "/a/b"
1:
<b>
Hello
</b>
2:
<b>
World
</b>

> XPathEval h.xml "/a/*/text()"
1:
Hello
2:
World

> XPathEval h.xml "//*"
1:
<a>
<b>
Hello
</b>
<c>
</c>
<b>
World
</b>
</a>
2:
<b>
Hello
</b>
3:
<c>
</c>
4:
<b>
World
</b>

For Point 4, we would like you to implement two different ways of outputting the results:

(1) as above, i.e., XML line-by-line fragments of the subtrees at the selected nodes ("-t" option) and
(2) as a list of DAG nodes; a node is of the form N1.N2.N3...Nn where N1 is the last row number of the DAG,
N2 is a child number of the root node, N3 is a child number of the N2th child of the root, etc.

As before, always print the result nodes in document order.

Here is an example, consider this following DAG (stored in a file "dag.txt").


1:c


2:d


3:b[1:2,2:3]


4:b[2,1]


5:a[3,4:4,1]

> XPathEval -t dag.txt "//b[./following-sibling::c]"
1:
<b>
<c>
</c>
<c>
</c>
<d>
</d>
<d>
</d>
<d>
</d>
</b>
2:
<b>
<d>
</d>
<c>
</c>
</b>
3:
<b>
<d>
</d>
<c>
</c>
</b>
4:
<b>
<d>
</d>
<c>
</c>
</b>
5:
<b>
<d>
</d>
<c>
</c>
</b>

> XPathEval -d dag.txt "//b[./following-sibling::c]"
[5.1,5.2,5.3,5.4,5.5]

> XPathEval -d dag.txt "//b[./following-sibling::c]/d[./following-sibling::c]"
[5.2.1,5.3.1,5.4.1,5.5.1]

> XPathEval -t dag.txt "//b[./following-sibling::c]/*[./following-sibling::d]"
1:
<c>
</c>
2:
<c>
</c>
3:
<d>
</d>
4:
<d>
</d>

> XPathEval -d dag.txt "//b[./following-sibling::c]/*[./following-sibling::d]"
[5.1.1,5.1.2,5.1.3,5.1.4]

> XPathEval -d dag.txt "//f"
[]

More Sample Runs

For this document and

the Simple Root Path simpleRoot1=/*/a/b/* this is the result.
simpleRoot2=/b/b/c this is the result.
simpleRoot3=/*/*/*/*/*/*/*/*/b/b/c/text() this is the result.
simple1=//a/c/* this is the result.
thirdPart1=//a[./b/c] this is the result.

For this DAG and

the Simple Root Path simpleRoot3=/b/b/c this is the result for -d:
[94.2.1,94.2.3,94.2.4,94.2.6,94.2.8,94.2.9,94.2.10,94.2.11,94.2.15,94.2.17,94.17.9,94.17.14]
dag1=//a[./following-sibling::c]/b/c this is the result for -t, and this is the result for -d:
[94.1.1.1.1.1.1.2,94.1.1.1.1.1.1.2.1.10.2.4,94.1.1.1.1.1.1.2.1.10.2.7.9.3,94.1.1.1.1.1.1.2.1.10.2.7.9.10,94.1.1.1.1.1.1.2.1.10.2.7.9.15,94.1.1.1.1.1.1.2.1.10.2.8,94.1.1.1.1.1.1.4,94.1.1.1.1.5.1.6.12,94.1.1.2.3.4,94.1.1.2.3.7,94.1.1.2.3.8,94.1.1.3,94.1.1.7,94.1.1.9,94.1.1.12]

Bonus Points

Here you are asked to implement a streaming XPath evaluator. This means that the evaluator should use as little memory as possible;
in particular, it should not store any XML tree in memory. For XML documents (of bounded depth, say, <=20), your program should
only use constant amount of memory. We will test your program with bounded depth documents of arbitrary size (say >1GB) to check
whether the memory consumption of your program is really independent of the size of the document (but only depends on the depth).

Simple Paths your streaming evaluator should work over the simple paths, as defined above. Instead of complete subtrees, you should print
a list of node descriptions (similar as in -d of Part 4, but without the N1; the top-most element node of the document is represented as "0";
the N-th child of that top-most node is represented as "N"; and the M-th child of that node as N.M, etc.).
(2 Points)

Example Runs

> cat h.xml | XPathEval -s "/a/b"
1: 1
2: 3

> cat h.xml | XPathEval -s "/a/*/text()"
1: 1.1
2: 3.1

> cat h.xml | XPathEval -s "//*"
1: 0
2: 1
3: 2
4: 3

> cat tr.xml
<a>
  <b><c/><c/><d/><d/><d/></b>
  <e><d/><c/></e>
  <e><d/><c/></e>
  <e><d/><c/></e>
  <e><d/><c/></e>
  <c/>
  <b><b><c/><d/></b></b>
</a> 

> cat tr.xml | XPathEval -s "//c"
1: 1.1
2: 1.2
3: 2.2
4: 3.2
5: 4.2
6: 5.2
7: 6
8: 7.1.1

> cat tr.xml | XPathEval -s "//b/d"
1: 1.3
2: 1.4
3: 1.5
4: 7.1.2

CRICOS Provider Number: 00098G

XPath Evaluation over Main Memory Structures Due Date: May 18th, 2009

How to Print Query Results:

More Sample Runs

Bonus Points

XPath Evaluation over Main Memory Structures
Due Date: May 18th, 2009