Welcome!

Open Source Cloud Authors: Matt Brickey, Liz McMillan, Elizabeth White, Wesley Coelho, Rainer Ersch

Related Topics: Industrial IoT

Industrial IoT: Article

Integrating Enterprise Information on Demand with XQuery - Part 2

Integrating Enterprise Information on Demand with XQuery - Part 2

In Part I of this article (XML-J, Vol. 4, issue 6), we introduced the enterprise information integration (EII) problem and explained how the XML query language XQuery and related technologies - specifically XML, XML Schema, and Web services - are central to enabling this age-old problem to be successfully addressed at last.

We provided a technical overview of the XQuery language and presented a simple "single view of Customer" example to illustrate XQuery's role in the EII domain. The example was based on an electronics retailer that wanted to share customer information across three portals - portals for customer self-service, credit approval, and product service. The information to be integrated resided in a variety of back-end information sources, including two relational database management systems, an SAP system, and a Web service.

In this article, our XQuery/EII saga continues. In this installment, we look at how EII relates to two other technologies designed for integration tasks, namely enterprise application integration (EAI) and extract-transform-load (ETL) tools. We also take a brief look at BEA Liquid Data for WebLogic, an XQuery-based EII offering, and discuss how XQuery and Liquid Data were put to use recently in a telecommunications-related customer project.

What About EAI?
Given the industry buzz around EAI today, a natural question about EII is "so why bother?" That is, why isn't a modern EAI solution alone - for example, a workflow engine with XML-based data transformation capabilities - sufficient to solve the EII problem? The answer is, in principle, that EAI is in fact sufficient to solve the EII problem. A developer could always choose to hand-build a set of workflows, writing one workflow per application-level "query" to deliver the desired information back to the calling applications. In the example from Part I of this article, three hand-tailored workflows could instead be written to provide information retrieval capabilities comparable to our XQuery-based solution. But is that the best approach, in terms of development time and maintenance cost?

The basic question here is when to use a declarative query language (XQuery in the case of modern EII) versus constructing code in a procedural language (a workflow language in the case of EAI). The lessons from the relational database revolution are clear: When applicable, a declarative approach offers significant advantages. Instead of hand-constructing a "query plan" (EAI workflow) to extract the needed data from each of the data sources in some manually predefined order, the EII approach allows a single, smaller, and simpler declarative query to be written.

The resulting benefits should be obvious. First, the user does not need to build each query plan by hand, which could involve a considerable effort. Instead, the user specifies (when defining the core view) what data sources are relevant and what logical conditions relate and characterize the data to be retrieved. Second, queries can be optimized automatically by the EII middleware, resulting in an optimal query execution plan (order of accessing the sources, queries or methods to extract the data, etc.) for each different query. For example, using EAI, one central workflow could be written to retrieve all of the customer information in Part I's example, and then other workflows could be written to first call this workflow and then further filter the results. However, in the EII approach, the query processor will (for each query) prune out irrelevant data sources as well as push SQL selection conditions (such as only retrieving "Open" support cases in Listing 2 of Part I) down to any RDBMS data sources. Third, as the data sources change over time in terms of their schemas, statistics, or performance, the EII user will not be forced to rewrite all of his or her queries. Simply maintaining each base view query and re-optimizing the other queries will adapt their query execution plans to the new situation. In contrast, in the case of EAI, many workflows would have to be rewritten to handle most such changes.

There really isn't an either/or choice to be made between EAI and EII at all. Both technologies have critical roles to play in an overall enterprise integration solution. These technologies are complementary: EII provides ease of data integration, while EAI provides ease of process integration. EII is appropriate for composing integrated views and queries over enterprise data. EAI is the appropriate technology for creating composite applications that orchestrate the functional capabilities of a set of related but independent applications, Web services, etc. Moreover, EII can be used to handily augment EAI in scenarios where workflows need to access integrated data views. For example, if our electronics retailer wanted its order process to offer free shipping to customers who have ordered more than $1,000 of goods during the year and who have accumulated more than 5,000 reward points, the integrated view of customer from Part I could be used to easily access the relevant information from within the order entry workflow.

What About ETL?
Another technology related to EII is ETL. In fact, ETL tools are designed precisely for the purpose of integrating data from multiple sources. These tools are therefore another category of software that naturally leads to a "why bother with EII?" question - why isn't ETL technology the answer? As you'll see, the answer is again that both technologies have their place in modern IT architectures.

ETL tools are designed for use in moving data from a variety of sources into a data warehouse for offline analysis and reporting purposes. As the name suggests, ETL tools provide facilities for extracting data from a source; transforming that data into a more suitable form for inclusion in the data warehouse, possibly cleansing it in the process; and then loading the transformed data into the warehouse's database. Typical ETL tools are therefore focused on supporting the design and administration of data migration, cleansing, and transformation processes. These are often batch processes that occur on a daily or weekly basis.

Data warehouses and the ETL tools that feed them are invaluable for enabling businesses to aggregate and analyze historical information. For example, our electronics retailer might very well want to keep track of customer data, sales data, and product issue data over a period of years in order to analyze customer behavior by geographic region over time, improve their credit card risk model, and so on. A data warehouse is the appropriate place to retain such data and run large analytical queries against it, and ETL technology is the right technology today for creating, cleaning, and maintaining the data in the warehouse. However, ETL is not the right technology for building applications that need access to current operational data - it doesn't support the declarative creation of views or real-time access to operational data through queries.

For applications that need to integrate current information, Part I of this article showed how XQuery can be used to declaratively specify reusable views that aggregate data from multiple operational stores and how XQuery can be used to write XML queries over such integrated views. We also explained how standard database query processing techniques, including view expansion, predicate pushdown, and distributed query optimization, can be applied to XQuery, making XQuery-based EII an excellent technological fit for such applications.

Clearly, both ETL and EII technologies have important roles to play in today's enterprise. ETL serves to feed data warehouses, while EII is an enabler for applications that need timely access to current, integrated information from a variety of operational enterprise data sources. As with EAI, there are also cases where the two technologies come together. As one example, an ETL tool could be used to help create and maintain a cross-reference table to relate different notions of "customer id" for use in creating XQuery-based EII views across different back-end systems. As another example, an ETL-fed data warehouse could be used to build a portal for analyzing the historical behavior of a company's top customers, with an EII tool used to allow click-through inspection of the customers' purchases in the past 24 hours.

Putting XQuery-Based EII to Work
For the reasons discussed in this article, XQuery-based EII middleware is an emerging product segment that promises to deliver the tools and technology needed in this important space. One commercially available XQuery-based middleware product is BEA Liquid Data for WebLogic. Liquid Data is capable of accessing data from relational database management systems, Web services, packaged applications (through J2EE CA adapters and application views), XML files, XML messages, and, through a custom function mechanism, most any other data source as well. For illustration purposes, the architecture of Liquid Data is depicted in Figure 1. Liquid Data provides default XML views of all of its data sources and provides an XQuery-based graphical view and query editor for use in integrating and enhancing information drawn from one or more data sources. It includes a distributed query processing engine as well as providing advanced features such as support for query result caching and both data-source-level and stored query-level access control.

As a final example of the applicability of XQuery to enterprise information integration problems, we'll describe an actual customer integration exercise where Liquid Data was put to use. In that project, a large telecommunications vendor wanted to create a single view of order information for one of its business divisions. The goal of the project was to make integrated order information available to the division's customers (other businesses) through a Web portal, enabling their customers to log in and check on the status of their orders, as well as making information available to the division's own customer service representatives.

The division had data distributed across multiple systems, including a relational database containing order summary information and two different order management systems. Order details were kept in one or the other of the two order management systems, depending on the type of order. Functionality-wise, a limited view of order details was provided through the customer order status portal that the division built using Liquid Data, whereas customer service representatives were permitted to see all of the order data through their portal. In both cases, it was possible to search for order information by various combinations of purchase order number, date range, and order.

The use of XQuery-based EII technology enabled the customer to complete their portal project in much less time than they had expected it to take with traditional technologies, and their total cost of ownership was also lower due to the reusability of Liquid Data assets and the low cost of maintenance enabled by EII.

Summary
In this article, we have explained how XQuery is beginning to transform the integration world, making it possible to finally tackle the enterprise information integration problem where past attempts have failed. In Part I we provided an overview of XQuery and illustrated how it could be used to integrate the disparate information sources of a hypothetical electronics retailer. In Part II we discussed the relationship of EII to EAI and ETL technologies and then briefly presented BEA's XQuery-based EII product and described one of the customer projects in which it was used.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
"We've been engaging with a lot of customers including Panasonic, we've been involved with Cisco and now we're working with the U.S. government - the Department of Homeland Security," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Calligo, an innovative cloud service provider offering mid-sized companies the highest levels of data privacy and security, has been named "Bronze Sponsor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Calligo offers unparalleled application performance guarantees, commercial flexibility and a personalised support service from its globally located cloud plat...
"We are focused on SAP running in the clouds, to make this super easy because we believe in the tremendous value of those powerful worlds - SAP and the cloud," explained Frank Stienhans, CTO of Ocean9, Inc., in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devic...
"The Striim platform is a full end-to-end streaming integration and analytics platform that is middleware that covers a lot of different use cases," explained Steve Wilkes, Founder and CTO at Striim, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We provide IoT solutions. We provide the most compatible solutions for many applications. Our solutions are industry agnostic and also protocol agnostic," explained Richard Han, Head of Sales and Marketing and Engineering at Systena America, in this SYS-CON.tv interview at @ThingsExpo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that DXWorldExpo has been named “Global Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Digital Transformation is the key issue driving the global enterprise IT business. Digital Transformation is most prominent among Global 2000 enterprises and government institutions.
SYS-CON Events announced today that Datera, that offers a radically new data management architecture, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Datera is transforming the traditional datacenter model through modern cloud simplicity. The technology industry is at another major inflection point. The rise of mobile, the Internet of Things, data storage and Big...
DX World EXPO, LLC., a Lighthouse Point, Florida-based startup trade show producer and the creator of "DXWorldEXPO® - Digital Transformation Conference & Expo" has announced its executive management team. The team is headed by Levent Selamoglu, who has been named CEO. "Now is the time for a truly global DX event, to bring together the leading minds from the technology world in a conversation about Digital Transformation," he said in making the announcement.
"MobiDev is a Ukraine-based software development company. We do mobile development, and we're specialists in that. But we do full stack software development for entrepreneurs, for emerging companies, and for enterprise ventures," explained Alan Winters, U.S. Head of Business Development at MobiDev, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
While the focus and objectives of IoT initiatives are many and diverse, they all share a few common attributes, and one of those is the network. Commonly, that network includes the Internet, over which there isn't any real control for performance and availability. Or is there? The current state of the art for Big Data analytics, as applied to network telemetry, offers new opportunities for improving and assuring operational integrity. In his session at @ThingsExpo, Jim Frey, Vice President of S...
SYS-CON Events announced today that DXWorldExpo has been named “Global Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Digital Transformation is the key issue driving the global enterprise IT business. Digital Transformation is most prominent among Global 2000 enterprises and government institutions.
In his opening keynote at 20th Cloud Expo, Michael Maximilien, Research Scientist, Architect, and Engineer at IBM, discussed the full potential of the cloud and social data requires artificial intelligence. By mixing Cloud Foundry and the rich set of Watson services, IBM's Bluemix is the best cloud operating system for enterprises today, providing rapid development and deployment of applications that can take advantage of the rich catalog of Watson services to help drive insights from the vast t...
SYS-CON Events announced today that EnterpriseTech has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. EnterpriseTech is a professional resource for news and intelligence covering the migration of high-end technologies into the enterprise and business-IT industry, with a special focus on high-tech solutions in new product development, workload management, increased effic...
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of...
SYS-CON Events announced today that Cloud Academy named "Bronze Sponsor" of 21st International Cloud Expo which will take place October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara, CA. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud com...
SYS-CON Events announced today that Cloudistics, an on-premises cloud computing company, has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Cloudistics delivers a complete public cloud experience with composable on-premises infrastructures to medium and large enterprises. Its software-defined technology natively converges network, storage, compute, virtualization, and ...
SYS-CON Events announced today that CHEETAH Training & Innovation will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. CHEETAH Training & Innovation is a cloud consulting and IT training firm specializing in improving clients cloud strategies and infrastructures for medium to large companies.
SYS-CON Events announced today that Datanami has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Datanami is a communication channel dedicated to providing insight, analysis and up-to-the-minute information about emerging trends and solutions in Big Data. The publication sheds light on all cutting-edge technologies including networking, storage and applications, and thei...
The current age of digital transformation means that IT organizations must adapt their toolset to cover all digital experiences, beyond just the end users’. Today’s businesses can no longer focus solely on the digital interactions they manage with employees or customers; they must now contend with non-traditional factors. Whether it's the power of brand to make or break a company, the need to monitor across all locations 24/7, or the ability to proactively resolve issues, companies must adapt to...