Datenbanksysteme II

Internas und Implementierung von Datenbanksystemen


Database-Supported
XML Processors

Skalierbares XML Processing dank Datenbanktechnologie


Selected Fun Problems...

Proseminar basierend auf Aufgaben des ACM Programming Contest



Jan Rittinger PDF Print

Jan Rittinger

Coordinates

Address: Universität Tübingen
Wilhelm-Schickard-Institut für Informatik
Lehrstuhl für Datenbanksysteme
Sand 13 · 72076 Tübingen · Germany
Room: B312
Phone: +49 7071 29-75485
Fax: +49 7071 29-5958
E-Mail: This e-mail address is being protected from spambots. You need JavaScript enabled to view it

Research Interest

Since 2003 I'm working on the XQuery compiler Pathfinder. Pathfinder compiles arbitrary XQuery expressions into an 1NF table algebra while faithfully maintaining XQuery semantics.

During my internship at the CWI in 2004 I built the first back-end specific code generator for the MonetDB database system. The combination of this Pathfinder code generator and MonetDB have become known as ``MonetDB/XQuery''. Special bulk-oriented algorithms for XPath step processing (based on the Staircase-Join algorithms) and a syntactical value join recognition turned MonetDB/XQuery into one of the fastest XQuery processors.

Since 2005 I'm building and refining the table algebra of Pathfinder. Based on the algebraic representations our group (and others) build specific back-end adapters for SQL:99-speaking systems, MonetDB, kdb+, and Natix.

In parallel I'm building (and extending) a peephole-style optimizer for the Pathfinder algebra that—in contrast to optimizers in standard SQL systems that have real trouble making use of the unoptimized plans—allows all back-ends to benefit from the performance improvements hiding in the DAG-shaped algebraic plans. A large number of properties such as e.g., constantness, key information, column usage analysis, and functional dependencies and a large set of rewrite rules e.g., leads to ``join graph'' plans where order information and duplicate elimination are restricted to the result, but do not occur inside the evaluation plan anymore.

The effectiveness of the optimizer can be inspected in the plans at the bottom of the page: The unoptimized plan of XMark query Q8 (left) in comparison to the optimized variant (right) forces the back-end to choose a serial evaluation technique without a chance to apply value-based joins. The optimized variant of Q8 in constrast is a lot smaller, works without explicit intermediate ordering operators (look for red operators in the plans), and runs on all back-ends orders of magnitudes faster.


Teaching

Übung Database-Supported XML Processors
Proseminar Selected Fun Problems of the ACM Programming Contest

Short Biography

since 09/2008 Research assistant, Wilhelm-Schickard-Institut für Informatik, Universität Tübingen (Lehrstuhl für Datenbanksystem).
09/2005-08/2008 Research assistant, Department of Informatics, Technische Universität München (Database Systems Group).
03/2006 IBM Certified Database Administrator — DB2 Universal Database V8.1 for Linux UNIX and Windows
08/2005 Master's Degree (M.Sc.) in Computer Science at the Department of Computer & Information Science, University of Konstanz.
09/2004–08/2005 Master Research Student at the Graduiertenkolleg Konstanz, funded by the Deutsche Forschungsgemeinschaft (project Pathfinder).
04/2004–09/2004 Internship at the Centruum voor Wiskunde en Informatica in Amsterdam, The Netherlands. Designed and implemented a MIL code generator for XQuery in the Pathfinder.
01/2004–08/2005 Studies of Information Engineering at the University of Konstanz (Master of Science).
01/2004 Bachelor's Degree (B.Sc.) in Computer Science at the Department of Computer & Information Science, University of Konstanz.
07/2002–10/2002 Internship at the Department for Controlling & Business Administration, Dr. Ing. h.c. F. Porsche AG. Work on the requirement specification for an IT project management database.
10/2000–01/2004 Studies of Information Engineering at the University of Konstanz (Bachelor of Science).

Unoptimized and Optimized variant of XMark query Q8


Q8:
let $auction := doc("auction.xml") return
for $p in $auction/site/people/person
let $a :=
for $t in $auction/site/closed_auctions/closed_auction
where $t/buyer/@person = $p/@id
return $t
return {count($a)}

XMark Q8 unoptimized XMark Q8 optimized
Unoptimized algebra plan representing XMark query Q08 Optimized algebra plan representing XMark query Q08