@DXWorldExpo: Blog Feed Post

Hadoop Big Data Reservoirs

A look at the Hadoop Big Data Reservoir with Platfora’s Peter Schlampp

Peter Schlampp, Vice President of Products and Business Development at Platfora, explains what the Hadoop Big Data Reservoir (HDR) is and is not in a webinar that I watched today. Understanding that distinction is key to pulling business intelligence insights and analytics out of the data. Platfora arrived at its conclusions from interviews with more than 200 enterprise IT professionals working in the big data space. Those interviews identified three key requirements: self-service access, high performance, and security.

The Hadoop Data Reservoir is the central Hadoop cluster for the entire enterprise. It provides storage and serves as the source for business analytics. It also provides processing for data preparation and advanced analytics. Peter and Platfora believe that the HDR eliminates data silos, reduces costs, and makes business analytics agile. He does not see the HDR as a replacement for the enterprise data warehouse, which requires major planning and continuous attention from a large staff. An HDR, by contrast, can be both self-service and flexible, while capturing data that might previously have been dropped.

Meeting the Performance Requirement

Queries must be consistently fast, because BI applications are driving an increasing number of queries. The actions of one user should not impact the performance of many. Users expect sub-second responses, and when they do not get them, they think something is wrong. In its research, Platfora found that most queries are straightforward but big. Platfora has addressed the “big” part by creating pre-calculated summary tables, which summarize the fine-grained data. This reduces the amount of data that must be scanned to answer a question and limits redundant processing. Because these tables are stored in memory, sub-second response times are possible.
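
To make the summary-table idea concrete, here is a minimal sketch in Python. It is not Platfora's implementation; the row layout, field names, and function names are all assumptions chosen for illustration. It shows raw rows being rolled up once into a small in-memory table, which then answers a query without rescanning the raw data.

```python
from collections import defaultdict

# Hypothetical raw event rows: (region, product, revenue).
RAW_EVENTS = [
    ("east", "widget", 10.0),
    ("east", "widget", 5.0),
    ("east", "gadget", 7.5),
    ("west", "widget", 2.5),
]


def build_summary(rows):
    """Pre-compute total revenue and row count per (region, product)."""
    summary = defaultdict(lambda: {"revenue": 0.0, "count": 0})
    for region, product, revenue in rows:
        cell = summary[(region, product)]
        cell["revenue"] += revenue
        cell["count"] += 1
    return dict(summary)


def revenue_by_region(summary, region):
    """Answer a query from the summary table; raw rows are not touched."""
    return sum(
        cell["revenue"]
        for (row_region, _product), cell in summary.items()
        if row_region == region
    )


# Built once, then kept in memory and reused for every query.
SUMMARY = build_summary(RAW_EVENTS)
```

In this sketch a call such as revenue_by_region(SUMMARY, "east") reads three summary cells rather than every raw row, which is where the savings would come from on a large data set.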

Meeting the Self-Service Requirement

A key part of meeting the self-service requirement is making aggregate table creation automatic. Instead of having a human create the tables (and refine them over time), the system has to create them. The tables must also be maintained as new data is added to the data set. Sometimes the aggregate tables are not enough, and users need the ability to drill down through the table into the raw data. Users will also often need to ingest their own data sets to augment the aggregated tables.
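
The maintenance and drill-down requirements can be sketched as follows. This is an illustrative Python example under stated assumptions, not a description of how Platfora works: the AggregateTable class and its method names are hypothetical, and the raw records are held in memory only to keep the example self-contained (in a real deployment they would stay in Hadoop).

```python
from collections import defaultdict


class AggregateTable:
    """Hypothetical aggregate table that updates itself as records arrive."""

    def __init__(self, key_fields, measure):
        self.key_fields = key_fields      # fields the table is grouped by
        self.measure = measure            # numeric field being summed
        self.totals = defaultdict(float)  # aggregated values per key
        self.raw = defaultdict(list)      # stand-in for the raw data store

    def ingest(self, record):
        """Fold one new record into the aggregate; no full rebuild needed."""
        key = tuple(record[field] for field in self.key_fields)
        self.totals[key] += record[self.measure]
        self.raw[key].append(record)

    def total(self, *key):
        """Return the aggregated value for a key (0.0 if never seen)."""
        return self.totals.get(key, 0.0)

    def drill_down(self, *key):
        """Return the raw records behind one aggregate cell."""
        return list(self.raw.get(key, []))
```

A user could call ingest() for each new record, read total("east") for the fast answer, and fall back to drill_down("east") when the aggregate alone does not explain what happened.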

Meeting the Security Requirement

While there is some built-in security in the Hadoop file system, it focuses mainly on file- and directory-based permissions, as well as secure authentication. This does not meet most enterprise security needs, leaving a gap in the security model. According to Platfora, its Hadoop Big Data Reservoir provides the granular security enterprises need without any performance hit.
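
To illustrate the difference between file-level permissions and the more granular security described here, the following Python sketch filters individual rows by user. The policy format, user names, and visible_rows function are assumptions made up for this example; the webinar summary does not describe Platfora's actual security mechanism.

```python
# Hypothetical rows that all live in the same file or directory.
ROWS = [
    {"dept": "sales", "region": "east", "amount": 100},
    {"dept": "sales", "region": "west", "amount": 250},
    {"dept": "hr", "region": "east", "amount": 40},
]

# Hypothetical policy: for each user, the field values they may see.
POLICIES = {
    "alice": {"dept": {"sales"}},
    "bob": {"dept": {"sales", "hr"}, "region": {"east"}},
}


def visible_rows(user, rows, policies=POLICIES):
    """Return only the rows whose field values the user's policy allows."""
    policy = policies.get(user)
    if policy is None:
        return []  # deny by default when no policy exists for the user
    return [
        row
        for row in rows
        if all(row[field] in allowed for field, allowed in policy.items())
    ]
```

A file-level permission would give alice and bob either all three rows or none. With a row-level filter like this one, each user sees a different subset of the same file.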

The Platfora Integrated Platform

Platfora’s Integrated Platform combines three tiers of services into one unified package. The Web-based Business Intelligence Application is a lightweight, self-service application that uses HTML5 to create a rich visual experience for the analyst. The Scale-out, In-Memory Data Mart and Processing Engine enables rapid access to the aggregate stores while retaining the ability to grow (and be used) at scale. Lastly, the Automated Hadoop Refinery constantly creates new aggregate tables, refines the data sets, and maintains the data. It requires little maintenance from IT workers and provides a great deal of reliability.

The Demo

Peter launched into a demonstration that showed how ingesting large sets of data can quickly help BI. This data includes information such as LDAP access and more. Peter demonstrated how to create unique “mount points,” for which he could immediately set access rules, increasing the security around sets of data. He also described how creating a “Lens,” an in-memory aggregate table, is easy and can be done directly by any analyst. The analyst can choose the fields within the raw data to focus the table around, as well as set security rules. Platfora’s visualization capability is called Vizboards. With it, users can choose data points to visualize and explore why things happened, according to the data. These Vizboards can then be shared with other users as needed.

This is a great webinar, not just for seeing what Platfora is doing but for learning more about Hadoop in general. Hadoop Big Data Reservoirs can be used to catch data before it trickles out into the ether and is no longer usable. Be sure to check it out if you’re looking to create functional and usable BI for your agency.

More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder of Crucial Point and publisher of CTOvision.com
