Big Data as Core, Big Data as Context, and Big Data as Buzzword Bingo

It’s neither particularly newsworthy nor insightful to suggest that ‘Big Data’ gets everywhere these days, but two recent items reminded me of the gulf between credible execution of a big data play and the more questionable tacking of the big data meme onto an otherwise useful product.

Christmas is coming. Which means skating, and pantomimes (Captain Jack! And the Krankies!), and surprisingly expensive daughter shops, and pie with chicken and banana. But in amongst that lot, the weekend’s email and RSS brought news of

an ideal solution to store, manage and archive big data

and a

service built specifically for Fortune 1000 enterprises who want to rapidly explore how big data technology can unlock revenue from their data.

(both with my emphasis)

Infochimps has been around since 2009, and I’ve been following them with interest. CTO and Co-Founder Flip Kromer and I recorded podcasts in 2009 and early 2012, and we continue to meet up from time to time. From humble beginnings, the company grew to become one of a handful of credible Data Market offerings, before moving on to contribute key pieces of code to projects such as VMware’s Serengeti. Earlier this year, Infochimps’ broader ambitions began to become public as the Infochimps Platform rolled out. In August, the Platform gained streaming capabilities that helped propel it beyond any early reliance upon Hadoop. Then, this month, things got really interesting with the arrival of the Infochimps Enterprise Cloud. As Alex Williams reported for TechCrunch on Monday,

Infochimps data scientists and engineers developed the platform so they could collect lots of data and perform complex analytics along the way. A customer can pull in data from CRM systems and any of the other app silos where data pools then combine it with the data from Facebook, Twitter, and other services. The data flows into Infochimps’ data-delivery service and is cleaned up along the way. Data gets enriched, as needed, with other pieces of information such as demographic data.

The service works with any kind of database. Infochimps can implement any combination, including relational for SQL-like queries, and NoSQL for Hadoop jobs and big data storage. Analysis tools on the back-end provide the capability to create visuals and reports.

The company is setting itself some bold targets, seeking to speed up system deployments, making it easier for existing staff to do new things with data they already own, and freeing users to deploy a wide range of big data tools beyond the default of the cuddly elephant. And they’re targeting this directly at the Fortune 1000; companies with huge IT operations, demanding requirements, and an expectation of support, service and quality, all day, every day. For a small company of around 30 employees, which raised $1.55 million back in 2010 and hasn’t reported an investment since, that’s a big ask.

If even a fraction of what the Enterprise Cloud promises is available today, or demonstrably around the corner, then that team of 30 must be spending most of their time fending off a swarm of investors and acquirers. A nice problem to have, but a problem all the same.

I look forward to seeing real examples of the uses to which enterprise customers begin putting the Enterprise Cloud. I’ll also be watching with interest for rumours of acquisition or investment, both of which are bound to come.

The other piece of news also came from an established company. This time, consumer and small business backup provider Genie9. The company has a new backup product out, called Zoolz, and is making much of the integral “Cold Storage™ Technology” (Ugh!) that gives users reasonably straightforward access to Amazon’s very cheap Glacier storage service.

Personally, I meet my backup and archival needs through a combination of Dropbox, Google Drive, Spanning Backup, a Time Capsule and Arq (complete with its own non-™ hooks into Glacier). But that’s me. A one-man band, with a particular set of devices and workflows, and it’s an arrangement that has grown up rather organically.

Zoolz makes perfect sense as a backup solution, and from a brief play with the tool it appears intuitive, capable, and affordable. The Glacier integration is also good, for those things you want to keep, but which you don’t need to access regularly. I have no problem with the tool at all, but what did (and does) bemuse me was the emphasis upon its role in meeting big data requirements.

Zoolz is designed with big data support in mind and will be a game changer to help companies move all their data to the cloud in a secure and fast way that is cheaper than tapes and traditional solutions.

Huh?

The web site devotes a whole page to the big data capabilities of Zoolz, but I’m singularly unconvinced. The whole point about big data, surely, is that you work with it? You pour it into very capable tools that allow you to hold it in (or close to) memory, and you chop and change it in a variety of ways whilst seeking insight? You don’t park it 3-5 hours away in an Amazon cold storage facility and think “job done,” just because Zoolz offers “photo preview”!

Zoolz (through Glacier) offers a place to park large volumes of data that you no longer wish to work with, but it does nothing at all to help people ingest, process, analyse or understand big data. Moving large volumes of data around is slow and expensive. Processes to work with data are often scripted or otherwise automated, and tied into workflows that make sense within the context of the analytic tools (like Hadoop, say) to be used. It’s wholly unclear that Zoolz’s pretty UI and consumer/small business workflows make any sense in that context whatsoever.

Personally, Genie9, I would be proud of what I’ve made in Zoolz. But I’d drop the ‘big data’ stuff. It doesn’t fit.

Bingo card image by Flickr user Sara

More Stories By Paul Miller

Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database.

He blogs at www.cloudofdata.com.
