Big Data Is All the Rage. Why?

We see Big Data solutions daily through tools such as Twitter and LinkedIn

On Monday, December 5, Bob Gourley went on the Enterprise CIO Forum to explain Big Data and why it matters. First, he defined Big Data simply as the data your organization cannot currently analyze. Though some technologists give more precise definitions, this one sums up the challenge enterprises now face. If you can deal with all of your data now, you don’t have a Big Data problem. As soon as you have more data than you can manage well enough to find the answers you need in time to use them, you need a Big Data solution. Structured data in relational databases can also be Big Data, but what we’re really talking about is information whose type and volume exceed what traditional methods can handle. New solutions include MapReduce, originally developed at Google to analyze and index the entire Internet, and Hadoop, which was built to use those methods.

We see Big Data solutions daily through tools such as Twitter and LinkedIn, which analyze massive amounts of information from user accounts and actions to perform searches and generate content in real time. Twitter, for example, looks at millions of tweets fast enough to determine which topics are currently trending, and LinkedIn can analyze your network and profile to suggest people you might want to connect with.
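To make the trending-topics idea concrete, here is a minimal sketch of the map-and-reduce pattern applied to hashtag counting. It is an illustration only, not how Twitter actually computes trends: the function names (`map_phase`, `reduce_phase`, `trending`) and the sample tweets are hypothetical, and everything runs in a single process, whereas a framework such as Hadoop would spread the map and reduce steps across many machines.

```python
from collections import Counter


def map_phase(tweet):
    """Map step: emit a (hashtag, 1) pair for every hashtag in one tweet."""
    return [
        (word.lower(), 1)
        for word in tweet.split()
        if word.startswith("#") and len(word) > 1
    ]


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each distinct hashtag."""
    counts = Counter()
    for hashtag, count in pairs:
        counts[hashtag] += count
    return counts


def trending(tweets, top_n=3):
    """Run map over every tweet, reduce the pairs, and return the top hashtags."""
    pairs = [pair for tweet in tweets for pair in map_phase(tweet)]
    return reduce_phase(pairs).most_common(top_n)


# Hypothetical sample data, for illustration only.
sample_tweets = [
    "Learning #Hadoop today",
    "#hadoop and #BigData at work",
    "More #bigdata news",
    "#Hadoop training signup",
]

print(trending(sample_tweets, top_n=2))  # [('#hadoop', 3), ('#bigdata', 2)]
```

Because each tweet is mapped independently and the counts are combined only in the reduce step, the same logic can in principle be split across many commodity machines, which is the property that makes this style of processing suit very large data sets.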

Big Data doesn’t have one tool or even a toolset; it has a framework. For example, there is a growing and evolving ecosystem around Hadoop, shepherded by Cloudera, including a variety of capabilities for structured and unstructured data. Hadoop itself allows the use of commodity hardware to efficiently store and process massive amounts of unstructured data. According to Gourley, Hadoop is the essence of the current Big Data phenomenon, though there are other niche solutions out there. At the moment, government, finance, energy, and science are all turning to the Hadoop family for their Big Data solutions. Hadoop, formally known as Apache Hadoop, is an open source project managed by the Apache Software Foundation. It is combined with software such as HBase, Hive, and Flume in distributions such as the popular Cloudera’s Distribution Including Apache Hadoop.

Big Data has created a “Cambrian Explosion” of capabilities and uses. For example, by analyzing social media and messages, organizations have distilled members’ “digital characters” to find criminals, rogue traders, and unusual behaviors. Other use cases include detecting cyber attacks and improving internal search results and recommendations to clients, as with the federal government’s USASearch.

Initially, major IT companies were cautiously exploring Big Data solutions, but now many have jumped on the Hadoop bandwagon. Microsoft showed great agility when it recently abandoned its proprietary software Dryad to contribute to the open source Hadoop community, and many major companies, such as SGI, IBM, EMC, and Dell, have their own Hadoop distributions. Users now have the choice to download Cloudera’s distribution and use its management tools to configure and run Hadoop, or to purchase an incredibly powerful piece of hardware from SGI that comes with Hadoop already configured and guaranteed to run. That same firm will sell you training and services while providing patches, making it a good option for an enterprise CIO.

In the next year, expect to see the continuing evolution of management tools for Hadoop clusters. Currently, Cloudera’s configuration manager is the dominant tool and will continue to evolve as well. Dozens of firms are also beginning to provide applications on top of Hadoop, which will allow analysts to interact with Big Data themselves rapidly, without the help of the IT department. And as Hadoop grows more prevalent, now is the time to go out and get training.

More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder of Crucial Point and publisher of CTOvision.com
