Cloudera Impala Delivers Superior Performance on Open Hadoop Data Over Proprietary Analytic DBMSs

Emerges as Fastest, Most Functional and Proven Way to Run SQL on Hadoop Data

PALO ALTO, CA -- (Marketwired) -- 01/13/14 -- Cloudera, the leader in Apache Hadoop™ based data management platforms, today released the results of performance benchmark testing for its open source interactive SQL query engine, Cloudera Impala. Impala queries across data in an open Hadoop columnar storage format (Parquet) ran on average 2x faster than identical queries on a commercial analytic database management system (DBMS) over its proprietary storage format.

Cloudera delivers an enterprise data hub -- a next-generation platform for secure, powerful, real-time processing and analysis of data at scale. An enterprise data hub must provide data governance and lineage services, support enterprise-grade backup and disaster recovery and offer a wide range of ways to work with the data that it manages. It must support the tools and interfaces on which existing applications and tools rely. Critical among those is real-time SQL access for analytics.

Impala Delivers

Launched in October 2012 and released for general availability in May 2013, Impala enables high speed, interactive SQL analysis of Hadoop data at petabyte scale. Today Impala has emerged as the fastest, most functional and proven way to run SQL on Hadoop data for open source users and enterprise customers alike. The platform has continued to evolve rapidly with deepening support for the ANSI-SQL standard, certified integrations to leading business intelligence tools, sophisticated workload management and consistently superior performance.

Impala deployments continue to proliferate in the enterprise: to date, the platform has been downloaded by more than 5,000 unique organizations globally, demonstrating its appeal and significance. Cloudera continues to work closely with its enterprise customers and the open source community to refine and advance Impala's enterprise features, like Apache Sentry (incubating), the fine-grained, role-based authorization module released this year. Further establishing Impala's leadership in the industry, Hadoop-based solutions from other vendors integrate the Cloudera-created SQL query engine into their own offerings in response to customer demand.

Analytic DBMS Performance on Open Data in the Enterprise Data Hub

Running a diverse set of analytic queries on identical hardware, Impala has successfully eclipsed the performance of a popular proprietary parallel DBMS. The same benchmarks also showed Impala has maintained or widened its performance advantage against the latest release of Apache Hive (0.12).

Furthermore, it has done so on data in an open Hadoop data format. With these results, customers can exceed the SQL performance they get from proprietary databases while preserving the flexibility they enjoy with the Hadoop stack.

The Proof Is In the Data: Impala Shows BI-Class Speed for Mainstream Workloads

To evaluate Impala's query performance against a popular analytic database (referred to as "DBMS-Y"), Cloudera ran a series of 20 queries based on the industry-standard benchmark TPC-DS. The results showed that:

  • Impala ran consistently faster than DBMS-Y: across 20 queries, Impala ran on average 2x faster than DBMS-Y, outperforming DBMS-Y in 17 of the 20 queries. For some queries, Impala was over 4x faster.
  • Queries over open data beat those over proprietary data: even though Impala queries ran on open Hadoop data in the Parquet format and DBMS-Y queries ran on data in its own proprietary format, Impala was still faster.
  • Impala scales linearly and predictably: In tests, Impala maintained identical response times with increased user concurrency and on larger data sets by simply adding new machines at the same rate as the concurrency and data growth.
  • Furthermore, Impala is still more than an order of magnitude faster than Hive: on identical hardware Impala queries ran an average of 24x faster than those run on Apache Hive 0.12 using ORCfile.

No Sleight of Hand, No Gimmicks

Cloudera is committed to leading the industry as a high-integrity business that provides unbiased information to customers and users. Dozens of users have downloaded Cloudera's 100% open source platform, run their own performance evaluations, and shared them publicly. Cloudera places no confidentiality clauses or other proprietary restrictions on the use of its distribution. In addition, Cloudera has made the queries, configuration, hardware specifications, and data available for the open source community to review and evaluate. Information can be found at http://www.cloudera.com/impalaishellafast/

"Interactive exploratory business intelligence is a mainstay workload of the Enterprise Data Hub," said Mike Olson, founder, chief strategy officer and chairman of the Board at Cloudera. "We are proud of how quickly Impala has evolved and the rate at which it is being adopted. With thousands of users now running Impala in production, its significance is indisputable. One year ago, when we released Impala to open source, we knew that it had the potential to eventually play on the same field as some very mature analytic DBMSs, but the results of these performance benchmark tests exceed our very high expectations. In the coming months, we will unveil new enhancements to the platform that will further advance its performance, ease of use and security, extending Impala's benefits for open source users and our enterprise customers."

Learn More About Cloudera Impala
For a more detailed account of the methodology and results from Cloudera's Impala performance benchmark testing against Hive and a proprietary DBMS, visit the Cloudera blog: http://blog.cloudera.com/blog/2014/01/impala-performance-dbms-class-speed

For more information about Cloudera Impala and how to download it for free, visit: http://cloudera.com/content/cloudera/en/products/cdh/impala.html

About Cloudera
Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data: The Enterprise Data Hub. Cloudera offers enterprises one place to store, process and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Founded in 2008, Cloudera was the first and is still today the leading provider and supporter of Hadoop for the enterprise. Cloudera also offers software for business critical data challenges including storage, access, management, analysis, security and search. With over 15,000 individuals trained, Cloudera is a leading educator of data professionals, offering the industry's broadest array of Hadoop training and certification programs. Cloudera works with over 800 hardware, software and services partners to meet customers' big data goals. Leading organizations in every industry run Cloudera in production, including finance, telecommunications, retail, internet, utilities, oil and gas, healthcare, biopharmaceuticals, networking and media, plus top public sector organizations globally. www.cloudera.com

Connect with Cloudera
Read our blog: http://www.cloudera.com/blog/
Follow us on Twitter: http://twitter.com/cloudera
Visit us on Facebook: http://www.facebook.com/cloudera

Cloudera, Cloudera Platform for Big Data and CDH are trademarks or registered trademarks of Cloudera in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Press Contacts

North America
Hope Nicora
Bhava Communications for Cloudera
cloudera@bhavacom.com
+1-510-984-1527

Europe
Richard Botley
Ketchum for Cloudera
LON-Cloudera@ketchum.com
+44 (0) 20 7611 3788

Copyright © 2009 Marketwired. All rights reserved. All the news releases provided by Marketwired are copyrighted. Any forms of copying other than an individual user's personal reference without express written permission is prohibited. Further distribution of these materials is strictly forbidden, including but not limited to, posting, emailing, faxing, archiving in a public database, redistributing via a computer network or in a printed form.
