Welcome!

Open Source Cloud Authors: Liz McMillan, AppDynamics Blog, Christopher Keene, Kevin Benedict, XebiaLabs Blog

Blog Feed Post

The PMML Revolution: Predictive analytics at the speed of business

This guest post is by Alex Guazzelli, VP of Analytics at Zementis Inc. -- ed. PMML, the Predictive Model Markup Language, is the de facto standard to represent predictive analytics and data mining models. With PMML, it is extremely easy to move a predictive solution from one system to another, since it avoids proprietary issues and incompatibilities. Companies around the globe are benefiting from PMML to make instant use of their predictive solutions. With PMML, there is no need for custom coding: you can easily move your solution from the scientist’s desktop, where it was built, to the production environment, where it is operationally deployed. Companies also use PMML as the common language between service providers and external vendors. In this way, it defines a single and clear process for the exchange of predictive solutions. It becomes the bridge not only between data analysis, model building, and deployment systems, but also between all the people and teams involved in the analytical process. This is extremely important, since PMML is used to disseminate knowledge and best practices, and to ensure transparency. All the top analytical tools, commercial and open-source, support PMML. And, the language itself has reached a great level of maturity and refinement. PMML 4.1, its latest version, makes it extremely easy for predictive solutions to be represented in an open and standard way. With PMML, you can represent a myriad of pre- and post-processing steps, besides the predictive modeling techniques per se. PMML 4.1 allows for multiple models (model composition, chaining, segmentation, and ensemble, which includes random forest models), to be represented by a single and concise language element. It also allows for model outputs to be transformed into business decisions. Therefore, a PMML file is able to represent the entire solution, from raw data to business decision, with one or multiple predictive models. The availability of a standard such as PMML combined with scoring solutions in the cloud, for Hadoop, and in-database make it possible for predictive analytics to fulfill its promise and crack the big data code. Zementis, Inc. has been in the forefront of PMML-based scoring, first through its ADAPA Scoring Engine, which is available for on-site deployment or as a service on cloud (Amazon and IBM), and lately through its Universal PMML Plug-in which is offered for a range of databases and for Hadoop. Zementis has partnered with Revolution Analytics, so that predictive solutions built in R can benefit from the vast scoring infrastructure already in place. I am proud to be associated with Zementis and excited to be part of an ever-growing PMML community. A PMML package for R that exports all kinds of predictive models is available directly from CRAN. Traditionally, the PMML Package offered support for the following data mining algorithms: ksvm (kernlab): Support Vector Machines nnet: Neural Networks rpart: C&RT Decision Trees  lm & glm (stats): Linear and Binary Logistic Regression Models  arules: Association Rules kmeans and hclust: Clustering Models  Recently, it has been expanded to support:  multinom (nnet): Multinomial Logistic Regression Models; glm (stats): Generalized Linear Models for classification and regression with a wide variety of link functions  randomForest: Random Forest Models for classification and regression (click HERE for examples); rsf (randomSurvivalForest): Random Survival Forest Models; And, this expansion is still on-going as the R community implements support for other packages and techniques. For more on the PMML package, please take a look at the paper we published with Graham Williams from Togaware in “The R Journal”. For that just follow the link below: PMML: An Open Standard for Sharing Models There may be quite a few reasons for you to move your predictive solution from R to an independent deployment platform. Among them, you may want parallel execution on big data or real-time scoring for applications such as fraud detection or recommender systems. With PMML you can easily move your model to the cloud or inside the database for scoring. Or, even have it executed on Hadoop. It is really up to you! On top of that, PMML allows for side-by-side deployment of predictive assets from R as well as other commercial data mining tools, supporting a multi-vendor environment as well as platform independent deployment. More and more companies and individuals are using the PMML standard for the obvious benefits it provides, putting their predictive solutions on the fast track. With PMML, the speed of predictive solutions can be on par with the speed of business. Dr. Alex Guazzelli is the VP of Analytics at Zementis Inc. where he is responsible for developing core technology and predictive solutions under ADAPA, a PMML-based decisioning platform. With more than 20 years of experience in predictive analytics, Dr. Guazzelli holds a PhD in Computer Science from the University of Southern California and has co-authored the book PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics, now in its second edition (paperback and kindle). You can follow him at @DrAlexGuazzelli.

Read the original blog entry...

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

@ThingsExpo Stories
Big Data, cloud, analytics, contextual information, wearable tech, sensors, mobility, and WebRTC: together, these advances have created a perfect storm of technologies that are disrupting and transforming classic communications models and ecosystems. In his session at @ThingsExpo, Erik Perotti, Senior Manager of New Ventures on Plantronics’ Innovation team, provided an overview of this technological shift, including associated business and consumer communications impacts, and opportunities it ...
SYS-CON Events announced today that Venafi, the Immune System for the Internet™ and the leading provider of Next Generation Trust Protection, will exhibit at @DevOpsSummit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Venafi is the Immune System for the Internet™ that protects the foundation of all cybersecurity – cryptographic keys and digital certificates – so they can’t be misused by bad guys in attacks...
It’s 2016: buildings are smart, connected and the IoT is fundamentally altering how control and operating systems work and speak to each other. Platforms across the enterprise are networked via inexpensive sensors to collect massive amounts of data for analytics, information management, and insights that can be used to continuously improve operations. In his session at @ThingsExpo, Brian Chemel, Co-Founder and CTO of Digital Lumens, will explore: The benefits sensor-networked systems bring to ...
Manufacturers are embracing the Industrial Internet the same way consumers are leveraging Fitbits – to improve overall health and wellness. Both can provide consistent measurement, visibility, and suggest performance improvements customized to help reach goals. Fitbit users can view real-time data and make adjustments to increase their activity. In his session at @ThingsExpo, Mark Bernardo Professional Services Leader, Americas, at GE Digital, discussed how leveraging the Industrial Internet a...
There will be new vendors providing applications, middleware, and connected devices to support the thriving IoT ecosystem. This essentially means that electronic device manufacturers will also be in the software business. Many will be new to building embedded software or robust software. This creates an increased importance on software quality, particularly within the Industrial Internet of Things where business-critical applications are becoming dependent on products controlled by software. Qua...
In addition to all the benefits, IoT is also bringing new kind of customer experience challenges - cars that unlock themselves, thermostats turning houses into saunas and baby video monitors broadcasting over the internet. This list can only increase because while IoT services should be intuitive and simple to use, the delivery ecosystem is a myriad of potential problems as IoT explodes complexity. So finding a performance issue is like finding the proverbial needle in the haystack.
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportuni...
Large scale deployments present unique planning challenges, system commissioning hurdles between IT and OT and demand careful system hand-off orchestration. In his session at @ThingsExpo, Jeff Smith, Senior Director and a founding member of Incenergy, will discuss some of the key tactics to ensure delivery success based on his experience of the last two years deploying Industrial IoT systems across four continents.
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, demonstrated how to move beyond today's coding paradigm and shared the must-have mindsets for removing complexity from the develo...
SYS-CON Events announced today that MangoApps will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device.
IoT is rapidly changing the way enterprises are using data to improve business decision-making. In order to derive business value, organizations must unlock insights from the data gathered and then act on these. In their session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, and Peter Shashkin, Head of Development Department at EastBanc Technologies, discussed how one organization leveraged IoT, cloud technology and data analysis to improve customer experiences and effi...
The IETF draft standard for M2M certificates is a security solution specifically designed for the demanding needs of IoT/M2M applications. In his session at @ThingsExpo, Brian Romansky, VP of Strategic Technology at TrustPoint Innovation, explained how M2M certificates can efficiently enable confidentiality, integrity, and authenticity on highly constrained devices.
In today's uber-connected, consumer-centric, cloud-enabled, insights-driven, multi-device, global world, the focus of solutions has shifted from the product that is sold to the person who is buying the product or service. Enterprises have rebranded their business around the consumers of their products. The buyer is the person and the focus is not on the offering. The person is connected through multiple devices, wearables, at home, on the road, and in multiple locations, sometimes simultaneously...
“delaPlex Software provides software outsourcing services. We have a hybrid model where we have onshore developers and project managers that we can place anywhere in the U.S. or in Europe,” explained Manish Sachdeva, CEO at delaPlex Software, in this SYS-CON.tv interview at @ThingsExpo, held June 7-9, 2016, at the Javits Center in New York City, NY.
"We've discovered that after shows 80% if leads that people get, 80% of the conversations end up on the show floor, meaning people forget about it, people forget who they talk to, people forget that there are actual business opportunities to be had here so we try to help out and keep the conversations going," explained Jeff Mesnik, Founder and President of ContentMX, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
The IoT is changing the way enterprises conduct business. In his session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, discussed how businesses can gain an edge over competitors by empowering consumers to take control through IoT. He cited examples such as a Washington, D.C.-based sports club that leveraged IoT and the cloud to develop a comprehensive booking system. He also highlighted how IoT can revitalize and restore outdated business models, making them profitable ...
"delaPlex is a software development company. We do team-based outsourcing development," explained Mark Rivers, COO and Co-founder of delaPlex Software, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
We all know the latest numbers: Gartner, Inc. forecasts that 6.4 billion connected things will be in use worldwide in 2016, up 30 percent from last year, and will reach 20.8 billion by 2020. We're rapidly approaching a data production of 40 zettabytes a day – more than we can every physically store, and exabytes and yottabytes are just around the corner. For many that’s a good sign, as data has been proven to equal money – IF it’s ingested, integrated, and analyzed fast enough. Without real-ti...
"There's a growing demand from users for things to be faster. When you think about all the transactions or interactions users will have with your product and everything that is between those transactions and interactions - what drives us at Catchpoint Systems is the idea to measure that and to analyze it," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York Ci...