eXist - An Introduction To Open Source Native XML Database

I am going to introduce you to eXist (www.exist-db.org), a free, open source (GNU LGPL license), native XML database.

XML Schema of Data
This is the XML Schema of our data, which is taken from XBench (TC/SD; see the References section). The schema describes the structure of the data. Figure 8 tells us that "dictionary" is the root element and that it has more than one "e" child. The "hwg" element is a child of "e," and so on.
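
To make the nesting concrete, here is a minimal sketch that walks a tiny document shaped like the XBench data and records which element contains which. Python and its standard xml.etree.ElementTree module are my choice for illustration and are not part of eXist. The sample text content is invented; only the element names come from the schema and queries in this article.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature document: element names follow the article,
# the text content is made up for illustration.
SAMPLE = """
<dictionary>
  <e id="E1">
    <hwg><hw>alpha</hw><pos>n.</pos></hwg>
    <ss><s><qp><q><qd>1900</qd><qt>sample quotation</qt></q></qp></s></ss>
  </e>
  <e id="E2">
    <hwg><hw>minute</hw><pos>n.</pos></hwg>
    <et><cr>E1</cr></et>
  </e>
</dictionary>
"""

def child_map(root):
    """Map each element name to the set of child element names seen under it."""
    children = defaultdict(set)
    for parent in root.iter():      # visits root and every descendant
        for child in parent:        # direct children only
            children[parent.tag].add(child.tag)
    return dict(children)

root = ET.fromstring(SAMPLE)
structure = child_map(root)
for parent, kids in structure.items():
    print(parent, "->", sorted(kids))
```

Run against the real data file, the same function would give a quick summary of the hierarchy that the schema describes formally.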

Querying eXist Using XPath
You can find all of the queries that were used in this article below. The following sections show similar queries taken from XBench. XPath is easier to use and understand than XQuery, so we'll examine the XPath queries first (see Figure 9).

The right-most icon, which looks like a pair of binoculars, is labeled "Query the database with XPath." This is slightly mislabeled, because we can use the same window for XQuery. In the middle of the "Query Dialog," make sure that you select the context /db/XBench. This is the collection in which we stored our data.

Query1
Return the entry matching the headword "minute." This query can be expressed in XPath as:


XPath:
/dictionary/e[hwg/hw="minute"]

Query1 results
<e id="E2">
<hwg>
<hw>minute</hw>
<pr>1/k96SMVL^2</pr>
<pos>n.</pos>
</hwg>
<et>
<cr>E143</cr>
<cr>E180</cr>
<cr>E530</cr>
<cr>E308</cr>
<cr>E215</cr>
<cr>E298</cr>
<cr>E294</cr>
</et>
...

Only one item is returned. Note that the total time for query compilation and execution is 719 msec (milliseconds). This is pretty good!
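
If the predicate syntax is new to you, the following sketch spells out what Query1 asks for: keep every "e" child of "dictionary" that has an "hwg/hw" descendant equal to the given headword. It uses Python's standard xml.etree.ElementTree module (my choice for illustration; eXist itself evaluates the XPath directly). ElementTree supports only a subset of XPath, so the predicate is written as a Python filter. The sample document and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry document; only the element names come from the article.
SAMPLE = """
<dictionary>
  <e id="E1"><hwg><hw>opinion</hw><pos>n.</pos></hwg></e>
  <e id="E2"><hwg><hw>minute</hw><pos>n.</pos></hwg></e>
</dictionary>
"""

def entries_with_headword(root, headword):
    """Mimic /dictionary/e[hwg/hw="..."] on an already parsed document."""
    if root.tag != "dictionary":    # the path is absolute, so the root must match
        return []
    return [
        e for e in root.findall("e")
        # The predicate is true if ANY hwg/hw under this entry equals the headword.
        if any(hw.text == headword for hw in e.findall("hwg/hw"))
    ]

root = ET.fromstring(SAMPLE)
matches = entries_with_headword(root, "minute")
print([e.get("id") for e in matches])
```

Note that the query returns whole "e" elements, not just the headword, which is why the result above includes the "et" and "cr" children.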

Query2
Find the headwords matching the quotation year "1900." This query can be expressed in XPath as:

XPath:
/dictionary/e[ss/s/qp/q/qd="1900"]/hwg/hw

There are 28 matches for this query. Compilation time is 28 msec and execution time is 890 msec, so the total time is less than a second. Compilation time reflects the time spent on reading indexes and applying the appropriate algorithms (more on this topic soon). Execution time is the time needed to access the data (stored in DOM format) and return the parts of it that satisfy the query.

Query2 results
<hw>husbandry</hw>
<hw>supper</hw>
<hw>strand</hw>
<hw>nominated</hw>
<hw>saying</hw>
<hw>coram</hw>
<hw>outwards</hw>
<hw>benches</hw>
<hw>faustuses</hw>
<hw>rhapsody</hw>
<hw>rotten</hw>
<hw>punish</hw>
<hw>favours</hw>
<hw>earth</hw>
<hw>italian</hw>
<hw>waits</hw>
<hw>mention</hw>
<hw>sea</hw>
<hw>compelled</hw>
<hw>rumination</hw>
<hw>outrage</hw>
<hw>liege</hw>
<hw>lifted</hw>
<hw>embrace</hw>
<hw>break</hw>
<hw>profession</hw>
<hw>erecting</hw>
<hw>cinna</hw>
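
Query2 differs from Query1 in that the predicate filters the entries while the trailing path step picks what is returned. The sketch below mirrors that split using Python's standard xml.etree.ElementTree module (an illustration of the semantics, not how eXist evaluates the query). The sample document and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the element names come from the article.
SAMPLE = """
<dictionary>
  <e id="E1">
    <hwg><hw>alpha</hw></hwg>
    <ss><s><qp><q><qd>1900</qd><qt>first</qt></q></qp></s></ss>
  </e>
  <e id="E2">
    <hwg><hw>beta</hw></hwg>
    <ss><s><qp><q><qd>1850</qd><qt>second</qt></q></qp></s></ss>
  </e>
  <e id="E3">
    <hwg><hw>gamma</hw></hwg>
    <ss><s><qp>
      <q><qd>1850</qd><qt>third</qt></q>
      <q><qd>1900</qd><qt>fourth</qt></q>
    </qp></s></ss>
  </e>
</dictionary>
"""

def headwords_quoted_in(root, year):
    """Mimic /dictionary/e[ss/s/qp/q/qd="year"]/hwg/hw and return headword text."""
    result = []
    for e in root.findall("e"):
        # Predicate: at least one quotation date under this entry equals `year`.
        dates = [qd.text for qd in e.findall("ss/s/qp/q/qd")]
        if year in dates:
            # Trailing step: return the headword(s), not the whole entry.
            result.extend(hw.text for hw in e.findall("hwg/hw"))
    return result

root = ET.fromstring(SAMPLE)
print(headwords_quoted_in(root, "1900"))
```

The year is compared as a string, as in the XPath above, and an entry qualifies as soon as any one of its quotations carries that date.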

Querying eXist Using XQuery
As I mentioned earlier, eXist supports XQuery too. In this section we are going to work on two XQuery examples.

Query3
Return quotation text, separated by one unspecified level, matching headword "opinion." This query can be expressed in XQuery as:

XQuery:
for $ent in /dictionary/e
where $ent/*/hw = "opinion"
return
$ent/ss/s/qp/*/qt

Query3 results
<qt>bravely ruthless courts shall lose daringly since the dino</qt>
<qt>doggedly close dinos about the dinos use always about the foxes.ruthlessly
stealt</qt>
<qt>doggedly brave multipliers can nag quietly on the busy realms?dolphins at
the bravely fluf
<cr>E250</cr>
ironic warhorses besides the enticing,
</qt>
...

The query returns 20 items. Compilation time is 16 msec and execution time is 2469 msec, so the total time is just under 2.5 seconds. This is a reasonable time for answering a not-so-easy query. Notice that this query has two stars (*) in it. A star (*) stands for any element.
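
The for/where/return shape of Query3 maps naturally onto a loop, a condition, and a collected result. Here is a minimal sketch of that mapping with Python's standard xml.etree.ElementTree module, which accepts "*" as a path step meaning "any child element." It illustrates the semantics only, and the sample document and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the element names come from the article.
SAMPLE = """
<dictionary>
  <e id="E1">
    <hwg><hw>opinion</hw></hwg>
    <ss><s><qp>
      <q><qd>1900</qd><qt>first sample quotation</qt></q>
      <q><qd>1850</qd><qt>second sample quotation</qt></q>
    </qp></s></ss>
  </e>
  <e id="E2">
    <hwg><hw>minute</hw></hwg>
    <ss><s><qp><q><qd>1900</qd><qt>unrelated quotation</qt></q></qp></s></ss>
  </e>
</dictionary>
"""

def quotation_texts(root, headword):
    """Mimic Query3: quotation text of entries whose headword matches."""
    found = []
    for ent in root.findall("e"):                    # for $ent in /dictionary/e
        headwords = ent.findall("*/hw")              # $ent/*/hw, "*" is any element
        if any(hw.text == headword for hw in headwords):   # where ... = "opinion"
            found.extend(ent.findall("ss/s/qp/*/qt"))      # return $ent/ss/s/qp/*/qt
    # itertext() also gathers text around nested elements such as <cr>.
    return ["".join(qt.itertext()) for qt in found]

root = ET.fromstring(SAMPLE)
print(quotation_texts(root, "opinion"))
```

Each star skips exactly one level, so "*/hw" finds "hw" under "hwg" without naming "hwg," and "qp/*/qt" finds "qt" under whatever single element sits between "qp" and "qt."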

More Stories By Selim Mimaroglu

Selim Mimaroglu is a PhD candidate in computer science at the University of Massachusetts in Boston. He holds an MS in computer science from that school and has a BS in electrical engineering.


