Big Data: New Ways to Hadoop with R

Today, there are two main ways to use Hadoop with R and big data:

1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!)

2. Import data from Hadoop to a server running Revolution R Enterprise, via HBase, ODBC (for high-performance Hadoop/SQL interfaces), or by streaming data directly from HDFS to ScaleR's big-data predictive algorithms.
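
To make the "data distillation" idea in the first option concrete, here is a minimal sketch of the map-reduce pattern in plain Python. It is not the rmr API and does not run on Hadoop; the function names (map_record, reduce_group, run_mapreduce) and the sample records are hypothetical, chosen only to show how many raw records collapse into one small summary per key.

```python
from collections import defaultdict


def map_record(line):
    """Map step: turn one raw 'key, value' record into a (key, value) pair."""
    key, reading = line.split(",")
    yield key.strip(), float(reading)


def reduce_group(key, values):
    """Reduce step: distill all values for one key into a single summary."""
    values = list(values)
    return key, {"n": len(values), "mean": sum(values) / len(values)}


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory driver: map every record, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle phase
    return dict(reducer(key, vals) for key, vals in sorted(groups.items()))


# Hypothetical sample data, not taken from the article.
raw = ["a, 1.0", "b, 4.0", "a, 3.0"]
summary = run_mapreduce(raw, map_record, reduce_group)
```

In a real cluster the grouping step is handled by Hadoop itself, and the map and reduce functions would be written in R and passed to rmr; the point of the sketch is only that the output (one row per key) is far smaller than the input.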


And now, there are even more Hadoop platforms supported for use with Revolution R Enterprise. You can use:

  • Cloudera CDH3 or CDH4
  • IBM BigInsights 2
  • New! Hortonworks Data Platform 1.2
  • New! Intel's Distribution for Hadoop (announced today)

And by the end of the year, there will be a third way to use Hadoop with R:

3. Leave the data in Hadoop, and use ScaleR's "in-Hadoop predictive analytics"

We announced today that we are jointly developing in-Hadoop predictive analytics with Hortonworks, and our first demonstrations are taking place now at the Strata conference. It's in the prototype stage right now, but we expect to have it generally available by the end of the year. In the meantime, check out the video below, which explains the three ways of using R and Hadoop together and includes an early demo of our in-Hadoop predictive analytics.

For more details, check out the press release below.

Revolution Analytics press releases: Revolution Analytics Expands Support for Hadoop and Pioneers In-Hadoop Predictive Analytics with Hortonworks

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.
