Open Source: The Next Frontier for Data Quality Management
Data quality, a pervasive & critical business issue

Data is the fundamental building block of every business. Client, sales, employee, and financial information fuels day-to-day operations. In today's business environment, where data is entered at multiple points and through myriad processes, data quality has become a growing concern for businesses trying to succeed in an ever more competitive atmosphere.

Data quality issues, defined here as incomplete, erroneous, or incompatible data, are part of every business's day-to-day operation. Furthermore, as new flexible data entry options become available, the opportunity for data quality issues to be introduced into enterprise data increases. Overall business strategy is also increasing the prevalence of data quality issues as mergers, acquisitions, and department consolidations become part of almost every business's growth initiatives.
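To make the three categories concrete, here is a minimal sketch of how a record might be checked for incomplete, erroneous, or incompatible values. The field names (`name`, `email`, `signup_date`), the email pattern, and the ISO date format are all assumptions chosen for illustration; a real organization would substitute its own rules.

```python
import re
from datetime import datetime

# Hypothetical field names and rules, chosen only for illustration.
REQUIRED_FIELDS = ("name", "email", "signup_date")
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_record(record):
    """Return a sorted list of issue labels for one record (a dict of strings)."""
    issues = set()

    # Incomplete: a required field is missing or blank.
    for field in REQUIRED_FIELDS:
        if not (record.get(field) or "").strip():
            issues.add("incomplete")

    # Erroneous: a value is present but fails a basic validity rule.
    email = (record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        issues.add("erroneous")

    # Incompatible: a value is present but not in the format the
    # (assumed) target system expects, here an ISO date.
    date_text = (record.get("signup_date") or "").strip()
    if date_text:
        try:
            datetime.strptime(date_text, "%Y-%m-%d")
        except ValueError:
            issues.add("incompatible")

    return sorted(issues)


if __name__ == "__main__":
    sample = {"name": "", "email": "bob-at-example", "signup_date": "05/16/2008"}
    print(classify_record(sample))
```

A check like this could in principle run at each data entry point rather than only in a central system.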

Data quality issues are often latent in an enterprise until a critical business initiative is blocked because the enterprise data can't meet the needs of the business. Companies of every size in every industry are increasingly reporting issues with data quality. The Data Warehousing Institute reported that 50% of its respondents felt that company data quality is worse than the organization thinks. Furthermore, more than half of respondents indicated their organizations had suffered losses due to poor data quality.

Data encompasses all the critical decision-making variables in an organization, including financial data, employee data, client data, prospect data, and inventory data. Relying on data that is erroneous or incomplete can seriously affect the decisions an organization makes and the strategies it employs. Recent research from Aberdeen indicates that the state of a company's data quality directly impacts its growth, profitability, and ability to compete. Poor data quality obscures an organization's view, causing it to miss additional revenue opportunities, risk regulatory issues, and forfeit the intelligence gained from a clear view of business data.

As the prevalence and impact of data quality issues become more apparent, concern over these issues is reaching beyond the IT community to the C-suite. A recent study by the Financial Executives Research Foundation indicates that data quality across the enterprise was the number one concern among respondents, surpassing information security and Sarbanes-Oxley. Finance professionals cited information integrity as the key issue impacting overall corporate operations and performance.

Data quality is every organization's sleeping monster. It quietly erodes profitability, impedes growth, and hinders the implementation of mission-critical business initiatives.

The Limitations of Commercial Data Quality Solutions
Once an organization recognizes its data quality issues and their operational impact, it typically evaluates commercially available solutions to address the problem since most companies lack the IT infrastructure and knowledge to address enterprise data issues. However, for most companies seeking a data quality solution, the evaluation process is a sobering one because most commercially available solutions are costly, complex, and require software licenses and term contracts, while only addressing a portion of the overall issue.

Commercially available data solutions are fundamentally flawed in their implementation model. To be most effective, data quality processes should be deployed at multiple touch points throughout an organization. With commercial solutions, full implementations are almost impossible because they become cost-prohibitive when licenses are expanded to encompass more users and multiple systems.

Commercial solutions are also prohibitive for many organizations because of their term contract commitments, software licenses, and implementation requirements. Price tags for traditional solutions can often total hundreds of thousands of dollars, if not more than a million, not including the human capital within the organization needed to manage the solution in concert with the provider. Such price tags make commercially available data solutions inaccessible to many small and mid-size enterprises that need data quality solutions.

Another drawback of traditional solutions is that they offer only cookie-cutter product approaches to data quality. Since most companies have data issues that are unique due to their specific organizational history and infrastructure, traditional cookie-cutter solutions often require significant programming and custom code development - all requiring additional testing, resources, and money, adding significantly to the complexity of the solution for implementation and service management.

Moreover, support for traditional solutions is typically limited to the providing vendor due to the proprietary software and licenses involved in the implementation of the solution. This restriction further increases the price tag of the conventional solution since support, service, and implementation can total as much as 70% of the purchase price of the solution.

Open Source: The Next Frontier for Data Quality Management
While open source has been gaining traction and attention in many business solutions, data quality has remained an area where open source is not widely used. Open source, however, is well equipped to address the limitations of traditional software-based or SaaS solutions and to create industry-leading data solutions. Open source solutions are inherently better suited to comprehensive data quality management because of their flexibility, cost efficiency, customization, rapid integration, and turnkey scalability options.

A key benefit of open source data quality solutions is that they can be implemented at multiple data entry points throughout an organization because they require no license purchases. This flexibility creates a more comprehensive and longer-term solution than single-point commercial solutions.

Open source data quality solutions also provide a significant cost advantage over conventional quality solutions because they require no software license purchases or management. Software licenses can account for up to 20% of the cost of a traditional implementation, so eliminating them represents a significant cost savings to organizations. Furthermore, software licenses typically come with lengthy contract commitments attached, affecting an organization's cost structure for a significant, if not perpetual, period of time.

Moreover, open source data quality software can be easily customized to address the unique data fingerprint of every organization, eliminating the need to retrofit cookie-cutter traditional solutions with code modifications and custom programming. This customization ability reduces the complexity of the solutions and offers faster implementations, simpler integrations, less testing, and more rapid results than commercial solutions.

Another benefit of open source solutions is that servicing is more flexible and cost-efficient because it isn't tied to proprietary licensing. Service can then be provided by the technology vendor, secondary vendor, or internal resources. Furthermore, the open source community can also provide support and innovation for solutions as they evolve within an enterprise.

Lastly, open source data quality solutions have the added value of being able to run on the new "pay as you go" (utility computing) processing systems for turnkey scalability. This offers a further significant cost advantage over commercial solutions that require licenses tied to hardware. Data solutions are especially prone to scalability issues because of the volume of data undergoing processing. Many traditional solutions are easily strained by these demands, which increases costs, delays results, and reduces the return on investment.

It's clear that an open source solution for data quality offers many benefits to clients over conventional solutions. Open source provides all businesses access to critical data quality solutions that can positively impact their overall profitability, growth, and competitive position. Furthermore, the existence of the open source community gives a solution's users immediate access to shared knowledge and implementation enhancements, rather than waiting months or years for another software release. Open source can offer organizations the most customer-centric data quality solution available in the marketplace today with flexibility, customization, and significant cost advantages.

Research Sources:
•  TDWI. "Taking Data Quality to the Enterprise through Data Governance 2005."
•  Aberdeen Report. "Customer Data Quality, The Roadmap to Growth and Profitability 2007."
•  Technology Issues for Financial Executives 2007 Annual Report.

About Subbu Manchiraju
Subbu Manchiraju is a vice president at Infosolve Technologies, which provides business clients with comprehensive data solutions that leverage the power of their enterprise data to achieve business objectives and create strategic opportunities, without the burdens of cumbersome licensing agreements, complex term contracts, and expensive hardware requirements.

Kasper Sørensen wrote: I absolutely agree with everything you're saying about the advantages of Open Source data quality, but I find it less convincing when faced with the fact that Infosolvetech does not provide an Open Source licensed solution that complies with the Open Source definition! I've tried several times to find the source code for your OpenDQ product, but found that you had to be a paying customer to get it? How open is that? And how do you benefit from a non-existent community? So now my point is obvious... Find another Open Source data quality solution to gain those benefits that you speak of. Try using DataCleaner (which I will gladly admit that I represent), Aggregate Profiler or Open Data Profiler. Respectively: http://www.eobjects.dk/datacleaner http://sourceforge.net/projects/dataquality/ http://sourceforge.net/projects/dataprofiler/