Integrating Enterprise Information on Demand with XQuery - Part 2

In Part I of this article (XML-J, Vol. 4, issue 6), we introduced the enterprise information integration (EII) problem and explained how the XML query language XQuery and related technologies - specifically XML, XML Schema, and Web services - are central to enabling this age-old problem to be successfully addressed at last.

We provided a technical overview of the XQuery language and presented a simple "single view of Customer" example to illustrate XQuery's role in the EII domain. The example was based on an electronics retailer that wanted to share customer information across three portals - portals for customer self-service, credit approval, and product service. The information to be integrated resided in a variety of back-end information sources, including two relational database management systems, an SAP system, and a Web service.

In this installment, our XQuery/EII saga continues. We look at how EII relates to two other technologies designed for integration tasks, namely enterprise application integration (EAI) and extract-transform-load (ETL) tools. We also take a brief look at BEA Liquid Data for WebLogic, an XQuery-based EII offering, and discuss how XQuery and Liquid Data were recently put to use in a telecommunications-related customer project.

What About EAI?
Given the industry buzz around EAI today, a natural question about EII is "so why bother?" That is, why isn't a modern EAI solution alone - for example, a workflow engine with XML-based data transformation capabilities - sufficient to solve the EII problem? The answer is, in principle, that EAI is in fact sufficient to solve the EII problem. A developer could always choose to hand-build a set of workflows, writing one workflow per application-level "query" to deliver the desired information back to the calling applications. In the example from Part I of this article, three hand-tailored workflows could instead be written to provide information retrieval capabilities comparable to our XQuery-based solution. But is that the best approach, in terms of development time and maintenance cost?

The basic question here is when to use a declarative query language (XQuery in the case of modern EII) versus constructing code in a procedural language (a workflow language in the case of EAI). The lessons from the relational database revolution are clear: When applicable, a declarative approach offers significant advantages. Instead of hand-constructing a "query plan" (EAI workflow) to extract the needed data from each of the data sources in some manually predefined order, the EII approach allows a single, smaller, and simpler declarative query to be written.

The resulting benefits should be obvious. First, the user does not need to build each query plan by hand, which could involve a considerable effort. Instead, the user specifies (when defining the core view) what data sources are relevant and what logical conditions relate and characterize the data to be retrieved. Second, queries can be optimized automatically by the EII middleware, resulting in an optimal query execution plan (order of accessing the sources, queries or methods to extract the data, etc.) for each different query. For example, using EAI, one central workflow could be written to retrieve all of the customer information in Part I's example, and then other workflows could be written to first call this workflow and then further filter the results. However, in the EII approach, the query processor will (for each query) prune out irrelevant data sources as well as push SQL selection conditions (such as only retrieving "Open" support cases in Listing 2 of Part I) down to any RDBMS data sources. Third, as the data sources change over time in terms of their schemas, statistics, or performance, the EII user will not be forced to rewrite all of his or her queries. Simply maintaining each base view query and re-optimizing the other queries will adapt their query execution plans to the new situation. In contrast, in the case of EAI, many workflows would have to be rewritten to handle most such changes.
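To make source pruning and predicate pushdown concrete, here is a minimal Python sketch of a toy planner. It is an illustration under invented assumptions, not a description of how any particular EII engine works: the source names, the field lists, the `customer_id` join key, and the `plan` function are all hypothetical, and only equality conditions are handled.

```python
from dataclasses import dataclass

JOIN_KEY = "customer_id"  # assumed key shared by every source


@dataclass(frozen=True)
class Source:
    name: str          # doubles as the table name for RDBMS sources
    kind: str          # "rdbms" or "service"
    fields: frozenset  # view fields this source can supply


# Hypothetical sources behind a single view of customer.
SOURCES = [
    Source("customer_db", "rdbms", frozenset({"customer_id", "name"})),
    Source("support_db", "rdbms", frozenset({"customer_id", "case_id", "status"})),
    Source("credit_service", "service", frozenset({"customer_id", "credit_rating"})),
]


def plan(needed_fields, filters):
    """Build one access step per source that the query actually needs.

    needed_fields: set of view fields the query returns.
    filters: dict mapping a view field to the value it must equal.
    """
    wanted = set(needed_fields) | set(filters)
    steps = []
    for src in SOURCES:
        # Prune sources that contribute nothing beyond the join key.
        if not wanted & (src.fields - {JOIN_KEY}):
            continue
        local = {f: filters[f] for f in sorted(filters) if f in src.fields}
        if src.kind == "rdbms":
            # Push equality conditions down into the SQL sent to the database.
            cols = sorted((wanted & src.fields) | {JOIN_KEY})
            sql = f"SELECT {', '.join(cols)} FROM {src.name}"
            if local:
                sql += " WHERE " + " AND ".join(f"{f} = ?" for f in local)
            steps.append({"source": src.name, "sql": sql, "params": list(local.values())})
        else:
            # Non-SQL sources are called as-is and filtered by the middleware.
            steps.append({"source": src.name, "post_filter": local})
    return steps
```

Calling `plan({"name", "case_id"}, {"status": "Open"})` yields steps for the two databases only: the credit service is pruned because the query asks for nothing it supplies, and the "Open" condition travels to the support database as a SQL `WHERE` clause instead of being applied after retrieval.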

There really isn't an either/or choice to be made between EAI and EII at all. Both technologies have critical roles to play in an overall enterprise integration solution. These technologies are complementary: EII provides ease of data integration, while EAI provides ease of process integration. EII is appropriate for composing integrated views and queries over enterprise data. EAI is the appropriate technology for creating composite applications that orchestrate the functional capabilities of a set of related but independent applications, Web services, etc. Moreover, EII can be used to handily augment EAI in scenarios where workflows need to access integrated data views. For example, if our electronics retailer wanted its order process to offer free shipping to customers who have ordered more than $1,000 of goods during the year and who have accumulated more than 5,000 reward points, the integrated view of customer from Part I could be used to easily access the relevant information from within the order entry workflow.
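The free-shipping scenario can be sketched as a single workflow step that consults the integrated view. The following Python fragment is only an illustration: `CustomerView`, its field names, and `get_customer_view` are hypothetical stand-ins for whatever the EII layer would actually return, and the in-memory dictionary takes the place of a real query.

```python
from dataclasses import dataclass


@dataclass
class CustomerView:
    """Assumed shape of the integrated customer view (field names invented)."""
    customer_id: str
    ytd_order_total: float  # dollars of goods ordered during the year
    reward_points: int


def get_customer_view(customer_id, views):
    """Stand-in for a query against the EII layer; here just a dict lookup."""
    return views[customer_id]


def qualifies_for_free_shipping(view):
    # Both thresholds are strict: "more than" $1,000 and "more than" 5,000 points.
    return view.ytd_order_total > 1000 and view.reward_points > 5000


def order_entry_step(customer_id, views):
    """One step of a hypothetical order entry workflow."""
    view = get_customer_view(customer_id, views)
    return {"customer_id": customer_id,
            "free_shipping": qualifies_for_free_shipping(view)}
```

The workflow itself never needs to know which back-end systems hold order totals or reward points; it asks the view, and the integration logic stays in one place.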

What About ETL?
Another technology related to EII is ETL. In fact, ETL tools are designed precisely for the purpose of integrating data from multiple sources. These tools are therefore another category of software that naturally leads to a "why bother with EII?" question - why isn't ETL technology the answer? As you'll see, the answer is again that both technologies have their place in modern IT architectures.

ETL tools are designed for use in moving data from a variety of sources into a data warehouse for offline analysis and reporting purposes. As the name suggests, ETL tools provide facilities for extracting data from a source; transforming that data into a more suitable form for inclusion in the data warehouse, possibly cleansing it in the process; and then loading the transformed data into the warehouse's database. Typical ETL tools are therefore focused on supporting the design and administration of data migration, cleansing, and transformation processes. These are often batch processes that occur on a daily or weekly basis.
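The three phases can be sketched in a few lines of Python using the standard `sqlite3` module as a stand-in warehouse. This is a toy under stated assumptions rather than a picture of any real ETL product: the `sales_fact` table, the row layout, and the cleansing rules are all invented for illustration.

```python
import sqlite3


def extract(source_rows):
    """Extract: a real tool would read from an operational system."""
    return list(source_rows)


def transform(rows):
    """Transform: cleanse and reshape rows for the warehouse."""
    cleaned = []
    for row in rows:
        if row.get("amount") is None:
            continue  # cleansing rule (assumed): drop incomplete records
        region = row["region"].strip().upper()  # normalize region codes
        cleaned.append((row["sale_id"], region, float(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def run_batch(conn, source_rows):
    """One batch run, of the kind that might be scheduled daily or weekly."""
    load(conn, transform(extract(source_rows)))
```

Note that nothing here answers a query against the live source. The warehouse only reflects the data as of the last batch run, which is the property that separates ETL from EII.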

Data warehouses and the ETL tools that feed them are invaluable for enabling businesses to aggregate and analyze historical information. For example, our electronics retailer might very well want to keep track of customer data, sales data, and product issue data over a period of years in order to analyze customer behavior by geographic region over time, improve their credit card risk model, and so on. A data warehouse is the appropriate place to retain such data and run large analytical queries against it, and ETL technology is the right technology today for creating, cleaning, and maintaining the data in the warehouse. However, ETL is not the right technology for building applications that need access to current operational data - it doesn't support the declarative creation of views or real-time access to operational data through queries.

For applications that need to integrate current information, Part I of this article showed how XQuery can be used to declaratively specify reusable views that aggregate data from multiple operational stores and how XQuery can be used to write XML queries over such integrated views. We also explained how standard database query processing techniques, including view expansion, predicate pushdown, and distributed query optimization, can be applied to XQuery, making XQuery-based EII an excellent technological fit for such applications.

Clearly, both ETL and EII technologies have important roles to play in today's enterprise. ETL serves to feed data warehouses, while EII is an enabler for applications that need timely access to current, integrated information from a variety of operational enterprise data sources. As with EAI, there are also cases where the two technologies come together. As one example, an ETL tool could be used to help create and maintain a cross-reference table to relate different notions of "customer id" for use in creating XQuery-based EII views across different back-end systems. As another example, an ETL-fed data warehouse could be used to build a portal for analyzing the historical behavior of a company's top customers, with an EII tool used to allow click-through inspection of the customers' purchases in the past 24 hours.
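As a rough illustration of the cross-reference idea, the Python sketch below uses a small mapping table to reconcile two systems that identify the same customer differently. The system names, identifiers, and record fields are all hypothetical, and in an XQuery-based EII product the join would be expressed in a view rather than in hand-written code.

```python
# Hypothetical cross-reference rows: (canonical id, system name, local id).
XREF = [
    ("C001", "crm", "10045"),
    ("C001", "billing", "B-778"),
    ("C002", "crm", "10046"),
]


def build_index(xref_rows):
    """Map each (system, local id) pair to its canonical customer id."""
    return {(system, local): canonical for canonical, system, local in xref_rows}


def integrate(crm_rows, billing_rows, index):
    """Merge per-system records into one record per canonical customer."""
    merged = {}
    for system, rows in (("crm", crm_rows), ("billing", billing_rows)):
        for row in rows:
            canonical = index.get((system, row["id"]))
            if canonical is None:
                continue  # unmapped id; a real system would log or queue it
            record = merged.setdefault(canonical, {})
            record.update({k: v for k, v in row.items() if k != "id"})
    return merged
```

An ETL job would own the maintenance of `XREF`, periodically matching and cleansing identifiers, while the integrated view simply reads it at query time.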

Putting XQuery-Based EII to Work
For the reasons discussed in this article, XQuery-based EII middleware is an emerging product segment that promises to deliver the tools and technology needed in this important space. One commercially available XQuery-based middleware product is BEA Liquid Data for WebLogic. Liquid Data can access data from relational database management systems, Web services, packaged applications (through J2EE CA adapters and application views), XML files, XML messages, and, through a custom function mechanism, almost any other data source as well. The architecture of Liquid Data is depicted in Figure 1. Liquid Data provides default XML views of all of its data sources, along with an XQuery-based graphical view and query editor for integrating and enhancing information drawn from one or more data sources. It includes a distributed query processing engine and provides advanced features such as query result caching and access control at both the data-source level and the stored-query level.

As a final example of the applicability of XQuery to enterprise information integration problems, we'll describe an actual customer integration exercise where Liquid Data was put to use. In that project, a large telecommunications vendor wanted to create a single view of order information for one of its business divisions. The goal of the project was to make integrated order information available to the division's customers (other businesses) through a Web portal, so that those customers could log in and check the status of their orders, and to make the same information available to the division's own customer service representatives.

The division had data distributed across multiple systems, including a relational database containing order summary information and two different order management systems. Order details were kept in one or the other of the two order management systems, depending on the type of order. The customer order status portal that the division built using Liquid Data provided a limited view of order details, whereas customer service representatives were permitted to see all of the order data through their own portal. In both cases, it was possible to search for order information by various combinations of purchase order number, date range, and order.
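The shape of that integration can be sketched in Python as follows. Everything in the sketch is invented for illustration, including the order types, field names, sample records, and the choice of which fields a customer may see; the project's real schemas are not described here, and in Liquid Data the equivalent logic would be expressed as XQuery views rather than procedural code.

```python
# All data below is invented for illustration.
ORDER_SUMMARY = {  # stands in for the relational order summary database
    "PO-1001": {"po_number": "PO-1001", "order_type": "type_a", "order_date": "2003-05-02"},
    "PO-1002": {"po_number": "PO-1002", "order_type": "type_b", "order_date": "2003-06-11"},
}
OMS_A = {"PO-1001": {"status": "Shipped", "internal_notes": "expedite"}}
OMS_B = {"PO-1002": {"status": "Pending", "internal_notes": "credit hold"}}

# The order type decides which order management system holds the details.
DETAILS_BY_TYPE = {"type_a": OMS_A, "type_b": OMS_B}

# Assumed subset of fields exposed through the customer-facing portal.
CUSTOMER_FIELDS = {"po_number", "order_date", "status"}


def order_view(po_number, role):
    """Combine summary and details, then restrict the result by role."""
    summary = ORDER_SUMMARY[po_number]
    details = DETAILS_BY_TYPE[summary["order_type"]].get(po_number, {})
    full = {**summary, **details}
    if role == "csr":
        return full  # service representatives see everything
    return {k: v for k, v in full.items() if k in CUSTOMER_FIELDS}


def search(role, po_number=None, date_from=None, date_to=None):
    """Search by any combination of purchase order number and date range."""
    results = []
    for po, summary in ORDER_SUMMARY.items():
        if po_number is not None and po != po_number:
            continue
        # ISO-formatted date strings compare correctly as plain strings.
        if date_from is not None and summary["order_date"] < date_from:
            continue
        if date_to is not None and summary["order_date"] > date_to:
            continue
        results.append(order_view(po, role))
    return results
```

Both portals share one integrated view and differ only in the final projection, which is the kind of reuse that a view-based approach is meant to encourage.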

The use of XQuery-based EII technology enabled the customer to complete their portal project in much less time than they had expected it to take with traditional technologies, and their total cost of ownership was also lower due to the reusability of Liquid Data assets and the low cost of maintenance enabled by EII.

Summary
In this article, we have explained how XQuery is beginning to transform the integration world, making it possible to finally tackle the enterprise information integration problem where past attempts have failed. In Part I we provided an overview of XQuery and illustrated how it could be used to integrate the disparate information sources of a hypothetical electronics retailer. In Part II we discussed the relationship of EII to EAI and ETL technologies and then briefly presented BEA's XQuery-based EII product and described one of the customer projects in which it was used.
