Cloud Expo: Article

Cloud Analytics Checklist

What are enterprise users looking for from a cloud analytics solution?

In the previous article we looked at how realtime cloud analytics looks set to disrupt the $25B SQL/OLAP sector of the IT industry. What are users looking for from a next-generation post-SQL/OLAP enterprise analytics solution? Let's look at the requirements:

  • Realtime + Historical Data. In addition to analyzing (historical) data held in databases (Oracle, SQL Server, DB2, MySQL) or datastores (Hadoop, Amazon Elastic MapReduce), a next-gen analytics solution needs to be able to analyze, filter and transform live data streams in realtime, with low latency, and to "push" just the right data, at the right time, to users throughout the enterprise. With SQL/OLAP or Hadoop/MapReduce, users "pull" historical data via queries or programs to find what they need. For many analytics scenarios today, however, handling information overload calls for a continuous "realtime push" model where "the data finds the user".

  • External + Internal Data. In the past it was simple: an enterprise had only to deploy a few large specialized systems (ERP, CRM, Supply Chain, Web Analytics) to handle the internal data flowing through the organization. Today, to operate at peak efficiency, a large enterprise needs a detailed, integrated, realtime awareness of every kind of data source that could impact the business, for example information on: customers, partners, employees, competitors, marketing, advertising, pricing, web, news, markets, locations, government data, communications, email, collaboration, social, IT, datacenters, networks, sensors.
  • Unstructured + Structured Data. SQL/OLAP analytics was built on the idea that data would be held in relational databases, and that the data would be highly structured. Today, this no longer applies. Much of the data that is most valuable to an enterprise today is either semi-structured or unstructured.
  • Easy-To-Use. SQL/OLAP has proved to be too complex for most enterprise users who need access to analytics for their work. Excel, with its simple charting, visualization, sharing and collaboration features, provides a much more attractive interface for most users. Other products and services such as QlikView and GoodData also provide ease-of-use, but none of them (Excel included) offers the kind of realtime analytics, scalability and parallel processing required in analytics today. Despite the complexity of SQL/OLAP and its lack of mainstream adoption within the enterprise, a few companies have made it even more complex by adding features to support realtime stream processing. None of these StreamSQL solutions seems to have achieved any widespread adoption to date.
  • Cloud-Based, Pay-Per-Use. Every company looking to compete in the next-generation analytics market will need at least a public cloud offering, and most will also have virtual private cloud and private cloud offerings. Since enterprise data will often be held on more than one cloud, it will be increasingly important to have an "intercloud" capability, where analytics apps can be run simultaneously across multiple (public and/or private) clouds, e.g. across Amazon AWS and Windows Azure.
  • Elastic Scalability, Parallel Processing, MapReduce. With exponentially growing data volumes, it will be essential to offer the elastic scalability and parallel processing required to handle anything from one-off personal data analysis tasks up to the most demanding large-scale analytics apps required by the world's leading organizations in business, web, finance and government.
  • Seamless Integration With Standard Tools (Excel). With 40 million analytics power users using Excel, this is a must for any analytics solution looking to achieve significant market adoption.
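
To make the "pull" versus "realtime push" distinction in the first requirement concrete, here is a minimal Python sketch. It is an illustration only, not Cloudscale's platform or any real product's API: the PushRouter class, the pull_query helper, the event fields (host, latency_ms, region) and the 500 ms threshold are all hypothetical names and values chosen for the example. In the push model a user registers a filter and a transform once, and every matching event on the live stream is delivered as it arrives; in the pull model the user has to run a query over stored history to find the same events.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Predicate = Callable[[Event], bool]
Transform = Callable[[Event], Event]


class PushRouter:
    """Hypothetical in-memory router: matching events are pushed to subscribers."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Predicate, Transform]] = []
        self.inboxes: Dict[str, List[Event]] = defaultdict(list)

    def subscribe(self, user: str, predicate: Predicate,
                  transform: Transform = lambda e: e) -> None:
        # The user states once what they care about (filter) and in what shape (transform).
        self._subscriptions.append((user, predicate, transform))

    def publish(self, event: Event) -> None:
        # Called for every event on the live stream; "the data finds the user".
        for user, predicate, transform in self._subscriptions:
            if predicate(event):
                self.inboxes[user].append(transform(event))


def pull_query(history: List[Event], predicate: Predicate) -> List[Event]:
    # Pull model: the user scans stored (historical) data on demand.
    return [e for e in history if predicate(e)]


def is_slow(event: Event) -> bool:
    # Assumed threshold for the example: anything above 500 ms counts as slow.
    return event["latency_ms"] > 500


router = PushRouter()
router.subscribe(
    "ops",
    is_slow,
    lambda e: {"host": e["host"], "latency_ms": e["latency_ms"]},  # keep only what the user needs
)

stream = [
    {"host": "web-1", "latency_ms": 120, "region": "us"},
    {"host": "web-2", "latency_ms": 870, "region": "eu"},
    {"host": "web-3", "latency_ms": 640, "region": "us"},
]

history: List[Event] = []
for event in stream:
    history.append(event)   # stored for later pull-style queries
    router.publish(event)   # pushed immediately to interested users

slow_hosts_pushed = [e["host"] for e in router.inboxes["ops"]]
slow_hosts_pulled = [e["host"] for e in pull_query(history, is_slow)]
```

Both approaches find the same slow hosts here. The difference is when the user learns about them: the push subscriber receives each event as it is published, while the pull user sees nothing until a query is run.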

At Cloudscale, we've compiled a Cloud Analytics Checklist, showing how various analytics products/services measure up against this set of requirements. If you're thinking about cloud analytics and would like a copy of the Checklist, send a request with your email address via the Cloudscale website (no signup required) or by email to checklist@cloudscale.com, with the word Checklist in the Subject line.

More Stories By Bill McColl

Bill McColl is Founder and CEO of Cloudscale Inc., which is developing a massively parallel cloud-based platform for continuous real-time intelligence on live data streams.

In 2006, he left the Oxford University Computing Laboratory, where for over twenty years he had been head of research in parallel computing and scalable systems. At the time of his departure, he was Professor of Computer Science and Chairman of the Faculty of Computer Science. McColl has published and lectured extensively on the design, analysis and implementation of massively parallel algorithms and systems.

He established and led Oxford Parallel, a major center for research on industrial and business applications of parallel computing at the university. He was also founder and CEO of Sychron Inc., a Silicon Valley VC-backed software company developing massively parallel system software for datacenter and desktop virtualization. Cloudscale Inc. is his second Silicon Valley company.