The PMML Revolution: Predictive analytics at the speed of business

This guest post is by Alex Guazzelli, VP of Analytics at Zementis Inc. -- ed.

PMML, the Predictive Model Markup Language, is the de facto standard for representing predictive analytics and data mining models. With PMML, it is easy to move a predictive solution from one system to another, since the standard avoids proprietary formats and incompatibilities. Companies around the globe use PMML to put their predictive solutions to work immediately. There is no need for custom coding: you can move your solution from the scientist’s desktop, where it was built, to the production environment, where it is operationally deployed.

Companies also use PMML as the common language between service providers and external vendors. In this way, it defines a single, clear process for the exchange of predictive solutions. It becomes the bridge not only between data analysis, model building, and deployment systems, but also between all the people and teams involved in the analytical process. This matters because PMML is used to disseminate knowledge and best practices, and to ensure transparency.

All the top analytical tools, commercial and open-source, support PMML, and the language itself has reached a high level of maturity and refinement. PMML 4.1, its latest version, makes it easy to represent predictive solutions in an open and standard way. With PMML, you can represent a myriad of pre- and post-processing steps in addition to the predictive modeling techniques themselves. PMML 4.1 allows multiple models (model composition, chaining, segmentation, and ensembles, including random forest models) to be represented by a single, concise language element. It also allows model outputs to be transformed into business decisions. A PMML file can therefore represent the entire solution, from raw data to business decision, with one or multiple predictive models.
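To make the idea of a model as a portable file concrete, here is a minimal sketch of a hand-written PMML 4.1 regression model and a tiny scorer for it. Everything specific in it is an assumption made for illustration: the field names (`area`, `rooms`, `price`), the coefficients, and the `score` helper are hypothetical, and Python is used only as a neutral host language. A real scoring engine handles far more of the standard (transformations, many model types, output post-processing); this only shows that the model's parameters live in the XML rather than in custom code.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.1 documents.
PMML_NS = "http://www.dmg.org/PMML-4_1"

# Hypothetical model for illustration: price = 50000 + 120 * area + 3000 * rooms
PMML_DOC = f"""<PMML version="4.1" xmlns="{PMML_NS}">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="area" optype="continuous" dataType="double"/>
    <DataField name="rooms" optype="continuous" dataType="double"/>
    <DataField name="price" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="price_model" functionName="regression">
    <MiningSchema>
      <MiningField name="area"/>
      <MiningField name="rooms"/>
      <MiningField name="price" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="50000">
      <NumericPredictor name="area" coefficient="120"/>
      <NumericPredictor name="rooms" coefficient="3000"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def score(pmml_text, record):
    """Score one record against the first RegressionTable in a PMML document.

    Hypothetical helper: supports only NumericPredictor terms of a single
    regression table, which is a small subset of what PMML can express.
    """
    ns = {"p": PMML_NS}
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", ns)
    if table is None:
        raise ValueError("no RegressionModel/RegressionTable found")

    # Start from the intercept, then add coefficient * value^exponent per predictor.
    result = float(table.get("intercept", "0"))
    for predictor in table.findall("p:NumericPredictor", ns):
        exponent = int(predictor.get("exponent", "1"))
        value = record[predictor.get("name")]
        result += float(predictor.get("coefficient")) * value ** exponent
    return result


if __name__ == "__main__":
    print(score(PMML_DOC, {"area": 100.0, "rooms": 3}))
```

Because the scorer reads the intercept and coefficients from the document, swapping in a different model means swapping the file, not rewriting the scoring code.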
The availability of a standard such as PMML, combined with scoring solutions in the cloud, for Hadoop, and in-database, makes it possible for predictive analytics to fulfill its promise and crack the big data code. Zementis, Inc. has been at the forefront of PMML-based scoring, first through its ADAPA Scoring Engine, which is available for on-site deployment or as a service in the cloud (Amazon and IBM), and more recently through its Universal PMML Plug-in, which is offered for a range of databases and for Hadoop. Zementis has partnered with Revolution Analytics so that predictive solutions built in R can benefit from the vast scoring infrastructure already in place. I am proud to be associated with Zementis and excited to be part of an ever-growing PMML community.

A PMML package for R that exports a wide range of predictive models is available directly from CRAN. Traditionally, the PMML package offered support for the following data mining algorithms:

ksvm (kernlab): Support Vector Machines
nnet: Neural Networks
rpart: C&RT Decision Trees
lm & glm (stats): Linear and Binary Logistic Regression Models
arules: Association Rules
kmeans and hclust: Clustering Models

Recently, it has been expanded to support:

multinom (nnet): Multinomial Logistic Regression Models
glm (stats): Generalized Linear Models for classification and regression with a wide variety of link functions
randomForest: Random Forest Models for classification and regression
rsf (randomSurvivalForest): Random Survival Forest Models

This expansion is ongoing as the R community implements support for other packages and techniques. For more on the PMML package, see the paper we published with Graham Williams from Togaware in The R Journal, “PMML: An Open Standard for Sharing Models”.

There may be quite a few reasons for you to move your predictive solution from R to an independent deployment platform.
Among them, you may want parallel execution on big data or real-time scoring for applications such as fraud detection or recommender systems. With PMML you can easily move your model to the cloud or inside the database for scoring, or even have it executed on Hadoop. It is really up to you! On top of that, PMML allows for side-by-side deployment of predictive assets from R as well as from commercial data mining tools, supporting a multi-vendor environment as well as platform-independent deployment.

More and more companies and individuals are using the PMML standard for the benefits it provides, putting their predictive solutions on the fast track. With PMML, the speed of predictive solutions can be on par with the speed of business.

Dr. Alex Guazzelli is the VP of Analytics at Zementis Inc., where he is responsible for developing core technology and predictive solutions under ADAPA, a PMML-based decisioning platform. With more than 20 years of experience in predictive analytics, Dr. Guazzelli holds a PhD in Computer Science from the University of Southern California and has co-authored the book PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics, now in its second edition (paperback and Kindle). You can follow him at @DrAlexGuazzelli.


More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.
