Prague Stringology Conference 2011

Kazuhito Hagio, Takashi Ohgami, Hideo Bannai and Masayuki Takeda

Efficient Eager XPath Filtering over XML Streams

Abstract:
We address the embedding existence problem (often referred to as the filtering problem) over streaming XML data for Conjunctive XPath (CXP). Ramanan (2009) considered Downward CXP, a fragment of CXP that involves only downward navigational axes, and presented a streaming algorithm that solves the problem in O(|P||D|) time using only O(|P|height(D)) bits of space, where |P| and |D| are the sizes of a query P and XML data D, respectively, and height(D) denotes the tree height of D. Unfortunately, the algorithm is lazy in the sense that it does not necessarily report the answer even after enough information has been gathered from the input XML stream. In this paper, we present an eager streaming algorithm that solves the problem with the same time and space complexity. We also show that the algorithm can be easily extended to Backward CXP, a larger fragment of CXP.
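To make the setting concrete, the following is a minimal sketch of stream filtering for a much simpler case than the one treated in the paper: a linear path query using only the child and descendant axes, with no branching predicates. It is not the algorithm of the paper or of Ramanan (2009). The function name `eager_path_filter`, the event tuples, and the query encoding are assumptions made for illustration. The sketch keeps one pair of (|P|+1)-bit masks per open element, so its space is O(|P|height(D)) bits, and it spends O(|P|) time per event. It stops reading as soon as the last query step matches, which is the eager behaviour in this restricted setting.

```python
from typing import Iterable, List, Tuple

Step = Tuple[str, str]   # (axis, label); axis is "child" or "descendant" (assumed encoding)
Event = Tuple[str, str]  # ("start" | "end", label) (assumed encoding)


def eager_path_filter(query: List[Step], events: Iterable[Event]) -> Tuple[bool, int]:
    """Return (matched, events_consumed) for a linear path query."""
    k = len(query)
    # Each frame is (own, reach). Bit i of `own` means steps 1..i are matched
    # with step i placed at this element. `reach` is the OR of `own` over this
    # element and all of its ancestors.
    stack = [(1, 1)]  # virtual document root: the empty prefix is matched
    consumed = 0
    for kind, label in events:
        consumed += 1
        if kind == "end":
            stack.pop()
            continue
        parent_own, parent_reach = stack[-1]
        own = 0
        for i, (axis, want) in enumerate(query, start=1):
            if want != "*" and want != label:
                continue
            # A child step extends a match at the parent; a descendant step
            # extends a match at any ancestor.
            source = parent_own if axis == "child" else parent_reach
            if (source >> (i - 1)) & 1:
                own |= 1 << i
        if (own >> k) & 1:
            return True, consumed  # answer known: stop without reading further
        stack.append((own, parent_reach | own))
    return False, consumed


if __name__ == "__main__":
    doc = [("start", "r"), ("start", "a"), ("start", "c"), ("end", "c"),
           ("start", "b"), ("end", "b"), ("end", "a"),
           ("start", "z"), ("end", "z"), ("end", "r")]
    print(eager_path_filter([("descendant", "a"), ("child", "b")], doc))
```

For conjunctive queries with branches, deciding when enough of the stream has been seen is harder than in this linear case, since several sub-patterns must be satisfied under the same element; that is the situation the abstract's lazy versus eager distinction is about.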
