| By Rick Grehan | Article Rating: |
|
| May 9, 2007 11:00 AM EDT | Reads: |
14,322 |
The first element of this class, theWord, holds the string of the word itself. The second is the array of POSSynonyms.
The result lets us create a network of interconnected synonyms. You can get a feel for what it might look like if you examine Figure 2. The ThesaurusEntry object for the word "heat" would include a posSyns array pointing to a POSSynonyms object for the verb synonyms, and another for the noun synonyms. The verb POSSynonyms object would, in turn, include a synonyms' array that points to ThesaurusEntry objects corresponding to the words boil, seethe, simmer, and so on. The POSSynonyms object would include a synonyms' array pointing to ThesaurusEnty objects for warmth, glow, temperature, and so on.
Building a Trie
So much for the storage of words
and connections to synonyms. Now we need a data structure that's
suitable for locating a word so a user can do a search. Note that this
data structure only has to locate words. We don't require
lexicographical ordering (as would be provided in, say, a binary tree
or a B-Tree). While there are several data structures that would serve
for this task, for this example, we'll use a trie.
A trie is a tree-like data structure. However, whereas most tree data structures store one or more entire keys (in our case, keys are words) in each node, a trie chops the key apart and stores pieces of it on each node. So our trie will store one letter on each node.
This will be easier to understand if you look at the diagram in Figure 3. Each node in a trie contains three elements and each element is an array:
- A character array. This holds the letters of the words that the trie stores.
- A node-pointer array. Members of this array point to child nodes in the trie.
- A data-pointer array. Members of this array point to the actual data that the trie indexes.
Published May 9, 2007 Reads 14,322
Copyright © 2007 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Rick Grehan
Rick Grehan is a QA engineer at Compuware's NuMega Labs, where he has worked on Java and .NET projects. He is also a contributing editor for InfoWorld Magazine. His articles have appeared in Embedded Systems Programming, EDN, The Microprocessor Report, BYTE, Computer Design, and other journals. He is also the coauthor of three books on computer programming.
- 4th International Cloud Computing Conference & Expo Starts Today
- Publishing Synergy: Blog, Twitter and Ulitzer
- Performance Tuning Essentials for Java
- Cloud Expo New York Call for Papers Deadline December 15
- Google Wave
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Cloud Computing Can Revitalize Your Career as Software Developer
- SOA World Magazine "Readers' Choice Awards" Voting Is Now Open
- Oracle+MySQL Opponents Take to the Barricades
- Virtualization Expo Call for Papers Deadline December 15
- Oracle Faces Growing Price for MySQL
- SpringSource Moving to Spring 3.0
- 4th International Cloud Computing Conference & Expo Starts Today
- Deputy CIO of the CIA to Keynote 1st Annual GovIT Expo
- Publishing Synergy: Blog, Twitter and Ulitzer
- Performance Tuning Essentials for Java
- Cloud Expo New York Call for Papers Deadline December 15
- Cloud Computing Expo: Exclusive Q&A with Yahoo! SVP Cloud Computing
- Google Wave
- IBM Hardware Chief, Intel VC Exec Arrested in Insider Trading Scam
- Cloud Computing Can Revitalize Your Career as Software Developer
- Oracle-Sun: IBM Reportedly Behind Delay
- Citrix Aims To Cripple VMware’s Cloud Designs
- Oracle Trashes HP Relationship for Sun
- After Ubuntu, Windows Looks Increasingly Bad, Increasingly Archaic, Increasingly Unfriendly
- SCO CEO Posts Open Letter to the Open Source Community
- Simula Labs Launches Hosted Delivery Platform To Enable Enterprise Open Source Adoption
- Where Are RIA Technologies Headed in 2008?
- Source Claims SCO Will Sue Google
- How Open Is "Open"? – Industry Luminaries Join the Debate
- Latest SCO News is Plain Weird
- IBM Tells SCO Court It Can't Find AIX-on-Power Code
- SCO Claims Linux Lifted ELF
- Flashback: Investing in 'Professional Open Source' - Exclusive 2004 Interview with David Skok, Matrix Partners
- HP Starts Pushing Desktop Linux
- Linux Business Week Exclusive: Linux Kernel To Be Re-Written To Counter Microsoft FUD





























