survey

Structural XML Query Processing

Published: 26 September 2017

Abstract

The boom in new proposals for efficient querying of XML data is now over, and the research world has shifted its attention toward new types of data formats. We therefore believe it is crucial to review what has been done in the area, both to help users choose an appropriate strategy and to help scientists exploit these contributions in new areas of data processing. The aim of this work is to provide a comprehensive study of the state of the art in approaches to the structural querying of XML data. In particular, we start with a description of labeling schemes that capture the structure of the data and of the respective storage strategies. We then deal with the key parts of XML query processing: twig query joins, XML query algebras, optimization of query plans, and selectivity estimation of XML queries. To the best of our knowledge, this is the first work that provides such a detailed description of XML query processing techniques related to structural aspects, and that covers their theoretical and practical features as well as their mutual compatibility and general usability.

References

[1]
Serge Abiteboul, Ioana Manolescu, Neoklis Polyzotis, Nicoleta Preda, and Chong Sun. 2008. XML processing in DHT networks. In IEEE 24th International Conference on Data Engineering. IEEE, 606--615
[2]
Serge Abiteboul, Ioana Manolescu, and Emanuel Taropa. 2006. A framework for distributed XML data management. In International Conference on Extending Database Technology. Springer, 1049--1058.
[3]
Ashraf Aboulnaga, Alaa R. Alameldeen, and Jeffrey F. Naughton. 2001. Estimating the selectivity of XML path expressions for internet scale applications. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB). 591--600.
[4]
Manoj K. Agarwal, Krithi Ramamritham, and Prashant Agarwal. 2016. Generic keyword search over XML data. In Proceedings of the 19th International Conference on Extending Database Technology (EDBT’16). 149--160.
[5]
Shurug Al-Khalifa and H. V. Jagadish. 2002. Multi-level operator combination in XML query processing. In Proceedings of the 2002 ACM CIKM International Conference on Information and Knowledge Management (CIKM). 134--141.
[6]
Shurug Al-Khalifa, H. V. Jagadish, Jignesh M. Patel, Yuqing Wu, Nick Koudas, and Divesh Srivastava. 2002. Structural joins: A primitive for efficient XML query pattern matching. In Proceedings of the 18th International Conference on Data Engineering (ICDE). 141--152.
[7]
Norah Saleh Alghamdi, Wenny Rahayu, and Eric Pardede. 2014. Semantic-based structural and content indexing for the efficient retrieval of queries over large XML data repositories. Future Generation Computer Systems 37 (2014), 212--231.
[8]
Muath Alrammal, Gaetan Hains, and Mohamed Zergaoui. 2011. Path tree: Document synopsis for xpath query selectivity estimation. In 2011 International Conference on Complex, Intelligent and Software Intensive Systems (CISIS). IEEE, 321--328.
[9]
S. Amer-Yahia. 2003. Storage Techniques and Mapping Schemas for XML. Technical Report TD-5P4L7B, AT&T Labs-Research.
[10]
Morton M. Astrahan and Donald D. Chamberlin. 1975. Implementation of a structured english query language. Communications of the ACM 18, 10 (1975), 580--588.
[11]
R. Bača and M. Krátký. 2009. On the efficiency of a prefix path holistic algorithm. In Database and XML Technologies (Lecture Notes in Computer Science), Vol. 5679. Springer--Verlag, 25--32.
[12]
Radim Bača, Michal Krátký, Tok Wang Ling, and Jiaheng Lu. 2012. Optimal and efficient generalized twig pattern processing: A combination of preorder and postorder filterings. The VLDB Journal (2012), 1--25.
[13]
Radim Bača, Petr Lukáš, and Michal Krátký. 2015. Cost-based holistic twig joins. Information Systems 52 (2015), 21--33.
[14]
A. Balmin and Y. Papakonstantinou. 2005. Storing and querying XML data using denormalized relational databases. The VLDB Journal 14, 1 (2005), 30--49.
[15]
Roger Bamford, Vinayak Borkar, Matthias Brantner, Peter M. Fischer, Daniela Florescu, David Graf, Donald Kossmann, Tim Kraska, Dan Muresan, Sorin Nasoi, and others. 2009. XQuery reloaded. Proceedings of the VLDB Endowment 2, 2 (2009), 1342--1353.
[16]
Zhifeng Bao, Tok Wang Ling, Bo Chen, and Jiaheng Lu. 2009. Effective xml keyword search with relevance oriented ranking. In Proceedings of the 2009 IEEE 25th International Conference on Data Engineering. IEEE, 517--528.
[17]
Radim Bača and Michal Krátký. 2008. On the efficient search of an XML twig query in large dataguide trees. In Proceedings of the 12th International Database Engineering 8 Applications Symposium, IDEAS 2008. ACM Press.
[18]
Radim Bača and Michal Krátký. 2009. TJDewey -- On the efficient path labeling scheme holistic approach. In Proceedings of the 1st International Workshop on Benchmarking of XML and Semantic Web Applications, DASFAA. Springer--Verlag.
[19]
Radim Bača, Jiří Walder, Martin Pawlas, and Michal Krátký. 2010. Benchmarking the compression of XML node streams. In Proceedings of the BenchmarX 2010 International Workshop, DASFAA. Springer-Verlag.
[20]
Rudolf Bayer and E. McCreight. 2002. Software pioneers. Springer-Verlag New York. Chapter Organization and Maintenance of Large Ordered Indexes, 245--262.
[21]
Dave Beckett and Brian McBride. 2004. RDF/XML Syntax Specification (Revised). W3C. Retrieved from http://www.w3.org/TR/rdf-syntax-grammar/.
[22]
Nicole Bidoit, Dario Colazzo, Noor Malla, Federico Ulliana, Maurizio Nolé, and Carlo Sartiani. 2013. Processing XML queries and updates on map/reduce clusters. In Joint 2013 EDBT/ICDT Conferences, EDBT’13 Proceedings. Genoa, 745--748.
[23]
Nikos Bikakis, Nektarios Gioldasis, Chrisa Tsinaraki, and Stavros Christodoulakis. 2009. Querying XML data with SPARQL. In International Conference on Database and Expert Systems Applications. Springer, 372--381.
[24]
P. V. Biron and A. Malhotra. October 2004. XML Schema Part 2: Datatypes (Second Edition). W3C. Retrieved from http://www.w3.org/TR/xmlschema-2/.
[25]
Christian Bizer, Tom Heath, and Tim Berners-Lee. 2009. Linked data -- The story so far. International Journal on Semantic Web and Information Systems 5, 3 (2009), 1--22.
[26]
S. Boag, D. Chamberlin, M. F. Fern’andez, D. Florescu, J. Robie, and J. Sim’eon. December 2010. XQuery 1.0: An XML Query Language (Second Edition). W3C. Retrieved from http://www.w3.org/TR/xquery/.
[27]
Philip Bohannon, Juliana Freire, Prasan Roy, and Jérôme Siméon. 2002. From XML schema to relations: A cost-based approach to XML storage. In Proceedings of the 18th International Conference on Data Engineering (ICDE). 64--75.
[28]
Matthias Brantner, Sven Helmer, Carl-Christian Kanne, and Guido Moerkotte. 2005. Full-fledged algebraic Xpath processing in Natix. In Proceedings of the 21st International Conference on Data Engineering (ICDE). 705--716.
[29]
Tim Bray. 2014. The Javascript object notation (Json) data interchange format. Internet Engineering Task Force (IETF) (2014).
[30]
T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. November 2008. Extensible Markup Language (XML) 1.0 (Fifth Edition). W3C. http://www.w3.org/TR/xml/.
[31]
Nicolas Bruno, Nick Koudas, and Divesh Srivastava. 2002. Holistic twig joins: Optimal XML pattern matching. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 310--321.
[32]
Jesús Camacho-Rodríguez, Dario Colazzo, and Ioana Manolescu. 2012. Building large XML stores in the Amazon cloud. In 2012 IEEE 28th International Conference on Data Engineering Workshops (ICDEW). IEEE, 151--158.
[33]
Jesús Camacho-Rodríguez, Dario Colazzo, and Ioana Manolescu. 2015. Paxquery: Efficient parallel processing of complex xquery. IEEE Transactions on Knowledge and Data Engineering 27, 7 (2015), 1977--1991.
[34]
Dirceu Cavendish and K. Selçuk Candan. 2008. Distributed XML processing: Theory and applications. Journal of Parallel and Distributed Computing 68, 8 (2008), 1054--1069.
[35]
Surajit Chaudhuri. 1998. An overview of query optimization in relational systems. In Proceedings of the 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM, 34--43.
[36]
Dunren Che, Karl Aberer, and M. Tamer Özsu. 2006. Query optimization in XML structured-document databases. The VLDB Journal 15, 3 (2006), 263--289.
[37]
Dunren Che, Tok Wang Ling, and Wen-Chi Hou. 2012. Holistic boolean-twig pattern matching for efficient XML query processing. IEEE Transactions on Knowledge and Data Engineering 24, 11 (2012), 2008--2024.
[38]
Bo Chen, Tok Wang Ling, M. Tamer Özsu, and Zhenzhou Zhu. 2007. On label stream partition for efficient holistic twig join. In Database Systems for Advanced Applications, 10th International Conference (DASFAA). 807--818.
[39]
Qun Chen, Andrew Lim, and Kian Win Ong. 2003. D(k)-index: An adaptive structural summary for graph-structured data. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, 134--144.
[40]
Songting Chen, Hua-Gang Li, Jun’ichi Tatemura, Wang-Pin Hsiung, Divyakant Agrawal, and K. Selçuk Candan. 2006. Twigstack: Bottom-up processing of generalized-tree-pattern queries over XML documents. In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB). VLDB endowment, 283--294.
[41]
Ting Chen, Jiaheng Lu, and Tok Wang Ling. 2005. On boosting holism in XML twig pattern matching using structural indexing techniques. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data (SIGMOD). 455--466.
[42]
Zhimin Chen, H. V. Jagadish, Laks V. S. Lakshmanan, and Stelios Paparizos. 2003. From tree patterns to generalized tree patterns: On efficient evaluation of XQuery. In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB). 237--248.
[43]
B. Choi, M. Mahoui, and D. Wood. 2003. On the optimality of holistic algorithms for twig queries. In Database and Expert Systems Applications (Lecture Notes in Computer Science), Vol. 2736. Springer--Verlag, 28--37.
[44]
J. Clark and S. DeRose. November 1999. XML Path Language (XPath) Version 1.0. W3C. Retrieved from http://www.w3.org/TR/xpath.
[45]
Sara Cohen and Maayan Shiloach. 2009. Flexible XML querying using skyline semantics. In 2009 IEEE 25th International Conference on Data Engineering. IEEE, 553--564.
[46]
Wojciech Czerwinski, Wim Martens, Pawel Parys, and Marcin Przybylko. 2015. The (almost) complete guide to tree pattern containment. In Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS 2015, Melbourne, Victoria, Australia, May 31-June 4, 2015. 117--130.
[47]
Alin Deutsch, Yannis Papakonstantinou, and Yu Xu. 2004. The NEXT logical framework for XQuery. In Proceedings of the 30th International Conference on Very Large Data Bases (VLDB). 168--179.
[48]
Paul F. Dietz. 1982. Maintaining order in a linked list. In Proceedings of 14th Annual ACM Symposium on Theory of Computing (STOC 1982). 122--127.
[49]
Matthias Droop, Markus Flarer, Jinghua Groppe, Sven Groppe, Volker Linnemann, Jakob Pinggera, Florian Santner, Michael Schier, Felix Schöpf, Hannes Staffler, and others. 2008. Bringing the XML and semantic web worlds closer: Transforming XML into RDF and embedding XPath into SPARQL. In International Conference on Enterprise Information Systems. Springer, 31--45.
[50]
Matthias Droop, Markus Flarer, Jinghua Groppe, Sven Groppe, Volker Linnemann, Jakob Pinggera, Florian Santner, Michael Schier, Felix Schöpf, Hannes Staffler, and others. 2007. Translating xpath queries into sparql queries. In OTM Confederated International Conferences on the Move to Meaningful Internet Systems. Springer, 9--10.
[51]
F. Du, S. Amer-Yahia, and J. Freire. 2004. ShreX: Managing XML documents in relational databases. In VLDB’04: Proceedings of 30th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., Toronto, ON, Canada, 1297--1300.
[52]
E.-S. M. El-Alfy, S. Mohammed, and A. F. Barradah. 2015. XHQE: A hybrid system for scalable selectivity estimation of XML queries. Information Systems Frontiers (2015), 1--17.
[53]
Iman Elghandour, Ashraf Aboulnaga, Daniel C. Zilio, and Calisto Zuzarte. 2013. Recommending XML physical designs for XML databases. VLDB Journal 22, 4 (2013), 447--470.
[54]
David C. Faye, Olivier Curé, and Guillaume Blin. 2012. A survey of RDF storage approaches. Arima Journal 15 (2012), 11--35.
[55]
Donald Feinberg and Adam M. Ronthal. 2016. Hype Cycle for Information Infrastructure, 2016. Technical Report G00304182. Gartner.
[56]
Thorsten Fiebig, Sven Helmer, Carl-Christian Kanne, Guido Moerkotte, Julia Neumann, Robert Schiele, and Till Westmann. 2002. Anatomy of a native XML base management system. VLDB Journal 11, 4 (2002), 292--314.
[57]
Guilherme Figueiredo, Vanessa Braganholo, and Marta Mattoso. 2010. Processing queries over distributed XML databases. Journal of Information and Data Management 1, 3 (2010), 455.
[58]
Jan Finis, Robert Brunel, Alfons Kemper, Thomas Neumann, Franz Färber, and Norman May. 2013. DeltaNI: An efficient labeling scheme for versioned hierarchical data. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, 905--916.
[59]
Jan Finis, Robert Brunel, Alfons Kemper, Thomas Neumann, Norman May, and Franz Faerber. 2016. Order indexes: Supporting highly dynamic hierarchical data in relational main-memory database systems. VLDB Journal (2016), 1--26.
[60]
Damien K. Fisher and Sebastian Maneth. 2007. Structural selectivity estimation for XML documents. In Proceedings of the 23rd International Conference on Data Engineering (ICDE). 626--635.
[61]
D. Florescu and D. Kossmann. 1999. Storing and querying XML data using an RDMBS. IEEE Data Engineering Bulletin 22, 3 (1999), 27--34.
[62]
Marcus Fontoura, Vanja Josifovski, Eugene Shekita, and Beverly Yang. 2005. Optimizing cursor movement in holistic twig joins. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM’05). ACM, New York, 784--791.
[63]
Juliana Freire, Jayant R. Haritsa, Maya Ramanath, Prasan Roy, and Jérôme Siméon. 2002. StatiX: Making XML count. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 181--191.
[64]
Haris Georgiadis, Minas Charalambides, and Vasilis Vassalos. 2010. Efficient physical operators for cost-based XPath execution. In Proceedings of the 13th International Conference on Extending Database Technology. ACM, 171--182.
[65]
Roy Goldman and Jennifer Widom. 1997. DataGuides: Enabling query formulation and optimization in semistructured databases. In Proceedings of the International Conference on Very Large Data Bases, VLDB 1997. 436--445.
[66]
Nils Grimsmo, Truls A. Bjorklund, and Magnus Lie Hetland. 2010. Fast optimal twig joins. In Proceedings of the 36st International Conference on Very Large Data Bases, VLDB 2010. VLDB Endowment.
[67]
Torsten Grust. 2005. Purely relational FLWORs. In Proceedings of the 2nd International Workshop on XQuery Implementation, Experience and Perspectives (XIME-P), in cooperation with ACM SIGMOD.
[68]
Torsten Grust. June 4-6, 2002. Accelerating XPath location steps. In Proceedings of the 2002 ACM SIGMOD, Madison, USA. ACM Press.
[69]
Torsten Grust, Jan Rittinger, and Jens Teubner. 2008. Pathfinder: XQuery off the relational shelf. IEEE Data Engineering Bulletin 31, 4 (2008), 7--14.
[70]
Torsten Grust, Sherif Sakr, and Jens Teubner. 2004. XQuery on SQL hosts. In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB). 252--263.
[71]
Torsten Grust, Maurice van Keulen, and Jens Teubner. 2003. Staircase join: Teach a relational DBMS to watch its (axis) steps. In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB). 524--525.
[72]
Alan Halverson, Josef Burger, Leonidas Galanis, Ameet Kini, Rajasekar Krishnamurthy, Ajith Nagaraja Rao, Feng Tian, Stratis D. Viglas, Yuan Wang, Jeffrey F. Naughton, and others. 2003. Mixed mode XML query processing. In Proceedings of the 29th International Conference on Very Large Data Bases-Volume 29. VLDB Endowment, 225--236.
[73]
Mohammad Hammoud, Dania Abed Rabbou, Reza Nouri, Seyed-Mehdi-Reza Beheshti, and Sherif Sakr. 2015. DREAM: Distributed RDF engine with adaptive query planner and minimal communication. PVLDB 8, 6 (2015), 654--665.
[74]
T. Härder, M. Haustein, C. Mathis, and M. Wagner. 2007. Node labeling schemes for dynamic XML documents reconsidered. Data 8 Knowledge Engineering 60, 1 (2007), 126--149.
[75]
IBM. 2016. DB2 Database Software. Retrieved from http://www-01.ibm.com/software/data/db2/.
[76]
ISO/IEC 9075-14:2003. 2006. Part 14: XML-Related Specifications (SQL/XML). Int. Organization for Standardization.
[77]
Sayyed Kamyar Izadi, Mostafa S. Haghjoo, and Theo Härder. 2012. S3: Processing tree-pattern XML queries with all logical operators. Data 8 Knowledge Engineering 72 (2012), 31--62.
[78]
H. V. Jagadish, Shurug Al-Khalifa, Adriane Chapman, Laks V. S. Lakshmanan, Andrew Nierman, Stelios Paparizos, Jignesh M. Patel, Divesh Srivastava, Nuwee Wiwatwattana, Yuqing Wu, and Cong Yu. 2002. TIMBER: A native XML database. VLDB Journal 11, 4 (2002), 274--291.
[79]
H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava, and Keith Thompson. 2001. TAX: A tree algebra for XML. In Proceedings of the 8th International Workshop on Database Programming Languages (DBPL). 149--164.
[80]
Haifeng Jiang, Hongjun Lu, and Wei Wang. 2004. Efficient processing of twig queries with OR-predicates. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD). 59--70.
[81]
Haifeng Jiang, Hongjun Lu, Wei Wang, and Beng Chin Ooi. 2003a. XR-tree: Indexing XML data for efficient structural joins. In Proceedings of the 19th International Conference on Data Engineering (ICDE). 253--263.
[82]
Haifeng Jiang, Wei Wang, Hongjun Lu, and Jeffrey Xu Yu. 2003b. Holistic twig joins on indexed XML documents. In Proceedings of 29th International Conference on Very Large Data Bases (VLDB). 273--284.
[83]
Enhua Jiao, Tok Wang Ling, and Chee Yong Chan. 2005. PathStack-: A holistic path join algorithm for path query with not-predicates on XML data. In Proceedings of the 10th International Conference on Database Systems for Advanced Applications (DASFAA). 113--124.
[84]
R. Kaushik, P. Bohannon, J. F. Naughton, and H. F. Korth. 2002a. Covering indexes for branching path queries. In Proceedings of ACM SIGMOD 2002. ACM Press, 133--144.
[85]
R. Kaushik, P. Shenoy, P. Bohannon, and E. Gudes. 2002b. Exploiting local similarity for indexing paths in graph-structured data. In Proceedings of the 18th International Conference on Data Engineering, 2001. 129--140.
[86]
Lukas Kircher, Michael Grossniklaus, Christian Grün, and Marc H. Scholl. 2015. Efficient structural bulk updates on the pre/dist/size XML encoding. In 2015 IEEE 31st International Conference on Data Engineering. IEEE, 447--458.
[87]
M. Klettke and H. Meyer. 2000. XML and object-relational database systems -- Enhancing structural mappings based on statistics. In Lecture Notes in Computer Science, Vol. 1997. 151--170.
[88]
Michal Krátký, Jaroslav Pokorný, and Václav Snášel. 2004. Implementation of XPath axes in the multi-dimensional approach to indexing XML data. In Current Trends in Database Technology, International Workshop on Database Technologies for Handling XML information on the Web, DataX, EDBT 2004 (Lecture Notes in Computer Science), Vol. 3268. Springer--Verlag.
[89]
A. Kuckelberg and R. Krieger. 2003. Efficient structure oriented storage of XML documents using ORDBMS. In Proceedings of the VLDB’02 Workshop EEXTT and CAiSE’02 Workshop DTWeb on Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web -- Revised Papers. Springer-Verlag, London, UK, 131--143.
[90]
Wolfgang Lehner and Kai-Uwe Sattler. 2013. Web-Scale Data Management for the Cloud. Springer. Retrieved from http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-1-4614-6855-4.
[91]
Viktor Leis, Alfons Kemper, and Thomas Neumann. 2013. The adaptive radix tree: ARTful indexing for main-memory databases. In 2013 IEEE 29th International Conference on Data Engineering (ICDE). IEEE, 38--49.
[92]
Guoliang Li, Jianhua Feng, Jianyong Wang, and Lizhu Zhou. 2007. Effective keyword search for valuable lcas over xml documents. In Proceedings of the 16th ACM Conference on Information and Knowledge Management. ACM, 31--40.
[93]
Hanyu Li, Mong-Li Lee, Wynne Hsu, and Gao Cong. 2006. An estimation system for XPath expressions. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). 54.
[94]
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey Scott Vitter, and Ronald Parr. 2002. XPathLearner: An on-line self-tuning Markov histogram for XML path selectivity estimation. In Proceedings of International Conference on Very Large Data Bases (VLDB). 442--453.
[95]
Rung-Ren Lin, Ya-Hui Chang, and Kun-Mao Chao. 2013. A compact and efficient labeling scheme for XML documents. In International Conference on Database Systems for Advanced Applications. Springer, 269--283.
[96]
Jian Liu, Z. M. Ma, and Li Yan. 2013. Efficient labeling scheme for dynamic XML trees. Information Sciences 221 (2013), 338--354.
[97]
Jian Liu and DL Yan. 2016. Answering approximate queries over XML data. IEEE Transactions on Fuzzy Systems 24, 2 (2016), 288--305.
[98]
Jian Liu and X. X. Zhang. 2016. Dynamic labeling scheme for XML updates. Knowledge-Based Systems (2016).
[99]
Ziyang Liu and Yi Chen. 2011. Processing keyword search on XML: A survey. World Wide Web 14, 5--6 (2011), 671--707.
[100]
Jiaheng Lu, Ting Chen, and Tok Wang Ling. 2004. Efficient processing of XML twig patterns with parent child edges: A look-ahead approach. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management (CIKM). 533--542.
[101]
Jiaheng Lu and Irena Holubová. 2017. Multi-model data management: What’s new and what’s next?. In Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017, Venice, Italy, March 21--24, 2017. Volker Markl, Salvatore Orlando, Bernhard Mitschang, Periklis Andritsos, Kai-Uwe Sattler, and Sebastian Breß (Eds.). OpenProceedings.org, 602--605.
[102]
J. Lu, T. W. Ling, C. Y. Chan, and T. Chen. 2005a. From region encoding to extended Dewey: On efficient processing of XML twig pattern matching. In Proceedings of the 31st Conference on Very Large Data Bases, VLDB 2005. VLDB Endowment, 193--204.
[103]
Jiaheng Lu, Tok Wang Ling, Zhifeng Bao, and Chen Wang. 2010. Extended XML tree pattern matching: Theories and algorithms. IEEE Transactions on Knowledge and Data Engineering (TKDE) 2010 23 (2010), 402--416. Issue 3.
[104]
Jiaheng Lu, Tok Wang Ling, Tian Yu, Changqing Li, and Wei Ni. 2005b. Efficient processing of ordered XML twig pattern. In Proceedings of DEXA 2005 (Lecture Notes in Computer Science), Vol. 3588. Springer-Verlag, 300--309.
[105]
Cheng Luo, Zhewei Jiang, Wen-Chi Hou, Feng Yu, and Qiang Zhu. 2009. A sampling approach for XML query selectivity estimation. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology. ACM, 335--344.
[106]
Imam Machdi, Toshiyuki Amagasa, and Hiroyuki Kitagawa. 2009. XML data partitioning strategies to improve parallelism in parallel holistic twig joins. In Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication. ACM, 471--480.
[107]
Imam Machdi, Toshiyuki Amagasa, and Hiroyuki Kitagawa. 2010. Parallel holistic twig joins on a multi-core system. International Journal of Web Information Systems 6, 2 (2010), 149--177.
[108]
Sebastian Maneth, Nikolay Mihaylov, and Sherif Sakr. 2008. XML tree structure compression. In Proceedings of the 19th International Workshop on Database and Expert Systems Applications (DEXA’08). 243--247.
[109]
Ioana Manolescu, Yannis Papakonstantinou, and Vasilis Vassalos. 2009. XML tuple algebra. In Encyclopedia of Database Systems. Springer, 3640--3646.
[110]
Christian Mathis. 2007. Extending a tuple-based XPath algebra to enhance evaluation flexibility. Informatik-Forschung und Entwicklung 21, 3--4 (2007), 147--164.
[111]
Norman May, Sven Helmer, and Guido Moerkotte. 2004. Nested queries and quantifiers in an ordered context. In 2013 IEEE 29th International Conference on Data Engineering (ICDE). IEEE Computer Society, 239--239.
[112]
Philippe Michiels, George A. Mihaila, and Jérôme Siméon. 2007. Put a tree pattern in your algebra. In Proceedings of the 23rd International Conference on Data Engineering (ICDE). 246--255.
[113]
Microsoft. 2016. Microsoft SQL Server 2016. Retrieved from http://www.microsoft.com/en-us/server-cloud/products/sql-server/.
[114]
Salahadin Mohammed, Ahmad F. Barradah, and El-Sayed M. El-Alfy. 2016. Selectivity estimation of extended XML query tree patterns based on prime number labeling and synopsis modeling. Simulation Modelling Practice and Theory 64 (2016), 30--42.
[115]
Salahadin Mohammed, El-Sayed M. El-Alfy, and Ahmad F. Barradah. 2015. Improved selectivity estimator for XML queries based on structural synopsis. World Wide Web 18, 4 (2015), 1123--1144.
[116]
MonetDB BV. 2008. MonetDB/XQuery. MonetDB B.V.http://www.monetdb.org/XQuery.
[117]
Matthias Nicola, Irina Kogan, and Berni Schiefer. 2007. An XML transaction processing benchmark. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. ACM, 937--948.
[118]
Bo Ning, Guoren Wang, and Jeffrey Xu Yu. 2008. A holistic algorithm for efficiently evaluating Xtwig joins. In Proceedings of the 13th International Conference on Database Systems for Advanced Applications (DASFAA 2008) (Lecture Notes in Computer Science), Vol. 4947. Springer--Verlag, 571--579.
[119]
P. O’Neil, E. O’Neil, S. Pal, I. Cseri, G. Schaller, and N. Westbury. 2004. ORDPATHs: Insert-friendly XML node labels. In Proceedings of the 2004 ACM International Conference on Management of Data, SIGMOD 2004. 903--908.
[120]
Oracle. 2016. Oracle Database 12c. (2016). https://www.oracle.com/database/.
[121]
Fatma Özcan, Normen Seemann, and Ling Wang. 2008. XQuery rewrite optimization in IBM DB2 pureXML. IEEE Data Engineering Bulletin 31, 4 (2008), 25--32.
[122]
Yinfei Pan, Ying Zhang, and Kenneth Chiu. 2008. Parsing XML using parallel traversal of streaming trees. In International Conference on High-Performance Computing. Springer, 142--156.
[123]
Neoklis Polyzotis and Minos Garofalakis. 2006a. XSKETCH synopses for XML data graphs. ACM Transactions on Database Systems 31 (September2006), 1014--1063. Issue 3.
[124]
Neoklis Polyzotis, Minos Garofalakis, and Yannis Ioannidis. 2004b. Approximate XML query answers. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data. ACM, 263--274.
[125]
Neoklis Polyzotis, Minos Garofalakis, and Yannis Ioannidis. 2004a. Selectivity estimation for XML twigs. In Proceedings of the 20th International Conference on Data Engineering, 2004. IEEE, 264--275.
[126]
Neoklis Polyzotis and Minos N. Garofalakis. 2002. Structure and value synopses for XML data graphs. In Proceedings of the International Conference on Very Large Data Bases (VLDB). IEEE, 466--477.
[127]
Neoklis Polyzotis and Minos N. Garofalakis. 2006b. XCluster synopses for structured XML content. In Proceedings of the International Conference on Data Engineering (ICDE). IEEE, 63.
[128]
Viswanath Poosala and Yannis E. Ioannidis. 1997. Selectivity estimation without the attribute value independence sssumption. VLDB Journal Vol. 97. 486--495.
[129]
Lu Qin, Jeffrey Xu Yu, and Bolin Ding. 2007. TwigList: Make twig pattern matching fast. In Proceedings of the 10th International Conference on Database Systems for Advanced Applications (DASFAA). 850--862.
[130]
Christopher Re, Jérôme Siméon, and Mary F. Fernández. 2006. A complete and efficient algebraic compiler for XQuery. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). 14.
[131]
Jonathan Robie, Matthias Brantner, Daniela Florescu, Ghislain Fourny, and Till Westmann. 2012. JSONiq: XQuery for JSON. JSON for XQuery (2012), 63--72.
[132]
Kanda Runapongsa and Jignesh M. Patel. 2002. Storing and querying XML data in object-relational DBMSs. In Proceedings of the Worshops XMLDM, MDDE, and YRWS on XML-Based Data Management and Multimedia Engineering-Revised Papers (EDBT’02). Springer-Verlag, London, 266--285.
[133]
Sherif Sakr. 2007. Cardinality-aware and purely relational implementation of an XQuery processor. Ph.D. Dissertation. University of Konstanz. http://www.ub.uni-konstanz.de/kops/volltexte/2007/3259/.
[134]
Sherif Sakr. 2008. Algebra-based XQuery cardinality estimation. International Journal of Web Information Systems 4, 1 (2008), 7--46.
[135]
Sherif Sakr. 2009. XML compression techniques: A survey and comparison. Journal of Computer System Science 75, 5 (2009), 303--322.
[136]
Sherif Sakr. 2016. Big Data 2.0 Processing Systems - A Survey. Springer.
[137]
Sherif Sakr and Ghazi Al-Naymat. 2010. Relational processing of RDF queries: A survey. ACM SIGMOD Record 38, 4 (2010), 23--28.
[138]
Sherif Sakr and Mohamed Medhat Gaber (Eds.). 2014. Large Scale and Big Data - Processing and Management. Auerbach Publications.
[139]
Sherif Sakr, Anna Liu, Daniel M. Batista, and Mohammad Alomari. 2011. A survey of large scale data management approaches in cloud environments. IEEE Communications Surveys and Tutorials 13, 3 (2011), 311--336.
[140]
Sherif Sakr and Eric Pardede (Eds.). 2011. Graph Data Management: Techniques and Applications. IGI Global.
[141]
Airi Salminen and Frank Wm. Tompa. 1994. PAT expressions: An algebra for text search. Acta Linguistica Hungarica 41, 1 (1994), 277--306.
[142]
Stefan Schuh, Xiao Chen, and Jens Dittrich. 2016. An experimental comparison of thirteen relational equi-joins in main memory. In Proceedings of the 2016 International Conference on Management of Data. ACM, 1961--1976.
[143]
Thomas Schwentick. 2007. Automata for XML: A survey. Journal of Computer System Science 73, 3 (2007), 289--315.
[144]
J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. J. DeWitt, and J. F. Naughton. 1999. Relational databases for querying XML documents: Limitations and opportunities. In VLDB’99: Proceedings of the 25th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 302--314.
[145]
Lila Shnaiderman and Oded Shmueli. 2015. Multi-core processing of XML twig patterns. IEEE Transactions on Knowledge and Data Engineering 27, 4 (2015), 1057--1070.
[146]
Adam Silberstein, Hao He, Ke Yi, and Jun Yang. 2005. BOXes: Efficient maintenance of order-based labeling for dynamic XML data. In Proceedings of the 21st International Conference on Data Engineering (ICDE). IEEE, 285--296.
[147]
Igor Tatarinov, Stratis D. Viglas, Kevin Beyer, Jayavel Shanmugasundaram, Eugene Shekita, and Chun Zhang. 2002. Storing and querying ordered XML using a relational database system. In Proceedings of the ACM International Conference on Management of Data (SIGMOD’02). ACM Press, New York.204--215.
[148]
Jens Teubner, Torsten Grust, Sebastian Maneth, and Sherif Sakr. 2008. Dependable cardinality forecasts for XQuery. Proceedings of the VLDB Endowment (PVLDB) 1, 1 (2008), 463--477.
[149]
H. S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn. October 2004. XML Schema Part 1: Structures (Second Edition). W3C. http://www.w3.org/TR/xmlschema-1/.
[150]
Pingfang Tian, Dan Luo, Yaoyao Li, and Jinguang Gu. 2013. XML multi-core query optimization based on task preemption and data partition. In Joint International Semantic Technology Conference. Springer, 294--305.
[151]
Silke Trißl and Ulf Leser. 2007. Fast and practical indexing and querying of very large graphs. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. ACM, 845--856.
[152]
University of Washington Database Group. 2002. The XML Repository. Retrieved from http://www.cs.washington.edu/research/xmldatasets/.
[153]
Henrique Valer, Caetano Sauer, and Theo Härder. 2013. XQuery processing over NoSQL stores. In Grundlagen von Datenbanken. Citeseer, 75--80.
[154]
Andreas M. Weiner and Theo Härder. 2009. Using structural joins and holistic twig joins for native XML query optimization. In Advances in Databases and Information Systems (LNCS), Vol. 5739. Springer, Berlin, 149--163.
[155]
Andreas M. Weiner and Theo Härder. 2010. An integrative approach to query optimization in native XML database management systems. In Proceedings of the 14th International Database Engineering & Applications Symposium (IDEAS’10). ACM, New York, 64--74.
[156]
Y. Wu, J. M. Patel, and H. Jagadish. 2003. Structural join order selection for XML query optimization. In Proceedings of ICDE 2003. IEEE CS, 443--454.
[157]
Yuqing Wu, Jignesh M. Patel, and H. V. Jagadish. 2002. Estimating answer sizes for XML queries. In Proceedings of the 8th International Conference on Extending Database Technology (EDBT). 590--608.
[158]
Yanghua Xiao, Ji Hong, Wanyun Cui, Zhenying He, Wei Wang, and Guodong Feng. 2012. Branch code: A labeling scheme for efficient query answering on trees. In 2012 IEEE 28th International Conference on Data Engineering. IEEE, 654--665.
[159]
W. Xiao-ling, L. Jin-feng, and D. Yi-sheng. 2003. An adaptable and adjustable mapping from XML data to tables in RDB. In Proceedings of the VLDB’02 Workshop EEXTT and CAiSE’02 Workshop DTWeb. Springer-Verlag, London, 117--130.
[160]
Liang Xu, Tok Wang Ling, Huayu Wu, and Zhifeng Bao. 2009. DDE: From Dewey to a fully dynamic XML labeling scheme. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. ACM, 719--730.
[161]
B. Yang, M. Fontoura, E. Shekita, S. Rajagopalan, and K. Beyer. 2004. Virtual cursors for XML joins. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management. ACM, 523--532.
[162]
Masatoshi Yoshikawa, T. Amagasa, T. Shimura, and S. Uemura. 2001. XRel: A path-based approach to storage and retrieval of XML documents using relational databases. ACM Transactions on Internet Technology 1, 1 (2001), 110--141.
[163]
Tian Yu, Tok Wang Ling, and Jiaheng Lu. 2006. TwigStackList¬: A holistic twig join algorithm for twig query with not-predicates on XML data. In Proceedings of the 11th International Conference on Database Systems for Advanced Applications (DASFAA). 249--263.
[164]
Chun Zhang, Jeffrey Naughton, David DeWitt, Qiong Luo, and Guy Lohman. 2001. On supporting containment queries in relational database management systems. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. ACM Press, 425--436.
[165]
Ning Zhang, Peter J. Haas, Vanja Josifovski, Guy M. Lohman, and Chun Zhang. 2005. Statistical learning techniques for costing XML queries. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB). 289--300.
[166]
Ning Zhang, M. Tamer Özsu, Ashraf Aboulnaga, and Ihab F. Ilyas. 2006. XSEED: Accurate and fast cardinality estimation for XPath queries. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). IEEE, 61.
[167]
Junfeng Zhou, Wei Wang, Ziyang Chen, Jeffrey Xu Yu, Xian Tang, Yifei Lu, and Yukun Li. 2016. Top-down XML keyword query processing. IEEE Transactions on Knowledge and Data Engineering 28, 5 (2016), 1340--1353.

Published In

ACM Computing Surveys, Volume 50, Issue 5
September 2018
573 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3145473
  • Editor: Sartaj Sahni
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 September 2017
Accepted: 01 May 2017
Revised: 01 April 2017
Received: 01 May 2016
Published in CSUR Volume 50, Issue 5

Author Tags

  1. XML
  2. structural XML query processing

Qualifiers

  • Survey
  • Research
  • Refereed

Funding Sources

  • ENET project
  • Student Grant Competition project
  • Czech Science Foundation project Nr. GAčR

Cited By

  • (2022) Handling Big Data in Relational Database Management Systems. Computers, Materials & Continua 72:3 (5149-5164). DOI: 10.32604/cmc.2022.028326. Online publication date: 2022
  • (2022) Synthesizing fuzzy tree automata. RAIRO - Theoretical Informatics and Applications 56 (6). DOI: 10.1051/ita/2022005. Online publication date: 21-Jun-2022
  • (2021) Dynamic interleaving of content and structure for robust indexing of semi-structured hierarchical data. Proceedings of the VLDB Endowment 13:10 (1641-1653). DOI: 10.14778/3401960.3401963. Online publication date: 10-Mar-2021
  • (2020) GridTables: A One-Size-Fits-Most H2TAP Data Store. Datenbank-Spektrum 20:1 (43-56). DOI: 10.1007/s13222-019-00330-x. Online publication date: 31-Jan-2020
  • (2019) Querying XML documents using Prolog engines. Information Processing and Management: an International Journal 56:5 (1753-1770). DOI: 10.1016/j.ipm.2019.05.011. Online publication date: 1-Sep-2019
