First look at SAP Predictive Analysis 1.0.17
With SAP Predictive Analysis 1.0.17 generally available as of 13th June, there are now additional algorithms built in and ready to be applied to your data.
These include 3 algorithms, in my view very powerful ones, that were ported from KXEN, the company SAP acquired at the end of 2013.
InfiniteInsight (II) Clustering which provides supervised clustering. Compared to the k-means algorithm this II Clustering algorithm can be used to
When clustering with R-K-Means, the options are to choose the number of clusters and the features (numerical variables).
II Clustering adds a completely new capability: the tool helps evaluate how many clusters there should be, once you set a range from x to y clusters. Furthermore, there is a target (supervised) variable.
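To make the idea of scanning a range of cluster counts against a target concrete, here is a minimal Python sketch. It is not the II Clustering algorithm, whose internals are not described here. It assumes a plain k-means as a stand-in and a simple "target purity" score. The function names, the scoring rule and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means with a deterministic start (evenly spaced sorted points)."""
    ordered = sorted(points)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: _dist2(p, centers[c])) for p in points]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def target_purity(labels: Sequence[int], targets: Sequence[int], k: int) -> float:
    """Share of rows whose binary target matches the majority of their cluster."""
    agree = 0
    for c in range(k):
        in_cluster = [t for lab, t in zip(labels, targets) if lab == c]
        ones = sum(in_cluster)
        agree += max(ones, len(in_cluster) - ones)
    return agree / len(targets)


def choose_cluster_count(points, targets, k_min: int, k_max: int):
    """Score every k in [k_min, k_max]; return the smallest k with the best score."""
    scores: Dict[int, float] = {
        k: target_purity(kmeans(points, k), targets, k)
        for k in range(k_min, k_max + 1)
    }
    best = max(scores.values())
    return min(k for k, s in scores.items() if s == best), scores


# Toy data (made up): three groups on one numeric feature, binary target.
points = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,), (10.0,), (10.2,), (9.8,)]
targets = [0, 0, 0, 1, 1, 1, 0, 0, 0]

best_k, scores_by_k = choose_cluster_count(points, targets, k_min=2, k_max=4)
print(best_k)  # 3 for this toy data
```

With two clusters the middle group is merged with a neighbour, so the clusters no longer separate the target cleanly. Three clusters do, and a fourth adds nothing, which is why the smallest best-scoring count is returned.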
Evaluating the results that II Clustering provides comes with great interactive visualization capabilities:
The II Clustering target variable must be binary (1 or 0).
II Classification is my favorite new addition to the SAP Predictive Analysis tool, not just because it is a great predictive algorithm, but because of the way it explains how the individual factors influence the target.
Configuring II Classification works just as with the other classification algorithms.
The evaluation of the built predictive model is very good for discussion with the business:
The variable contributions could also be displayed with, for instance, a correlation matrix (corrgram); however, this view is now directly available in SAP PA 1.0.17.
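For comparison, the correlation-based alternative can be sketched in a few lines of Python. This is only an illustration of ranking variables by their Pearson correlation with a binary target. It is not how SAP PA computes its Variable Contributions chart, and the column names and values are made up.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_by_correlation(
    columns: Dict[str, List[float]], target: List[float]
) -> List[Tuple[str, float]]:
    """Return (variable, correlation) pairs, strongest absolute correlation first."""
    pairs = [(name, pearson(values, target)) for name, values in columns.items()]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical input variables and a binary target (illustrative values only).
columns = {
    "visits": [3.0, 1.0, 4.0, 2.0],
    "age": [25.0, 35.0, 45.0, 55.0],
}
target = [0.0, 0.0, 1.0, 1.0]

ranking = rank_by_correlation(columns, target)
for name, r in ranking:
    print(f"{name}: {r:.3f}")
```

A ranking like this only captures linear, one-variable-at-a-time relationships, which is one reason a built-in contributions chart is more convenient for business discussions.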
Receiver Operating Characteristic (ROC - curve):
The ROC curve is a graphical plot which illustrates the performance of a binary classifier system.
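To show what the curve is made of, here is a small Python sketch that computes ROC points by sweeping the decision threshold over predicted scores, plus the area under the curve using the trapezoidal rule. The scores and labels are invented for illustration, and the function names are my own, not part of SAP PA.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    Assumes labels are 1 or 0 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Rows with the same score fall on the same side of any threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Invented classifier scores with their actual binary outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = area_under_curve(curve)
print(curve)
print(round(area, 3))  # 0.889
```

A curve that hugs the top-left corner (area close to 1) means the classifier ranks positives above negatives almost every time, while the diagonal (area 0.5) corresponds to random guessing.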
SAP Predictive Analysis already had many other regression algorithms that are widely used in the predictive community before release 1.0.17; however, with II Regression, and in particular the charting capabilities for the predicted results, things really seem to come together.
Evaluating the II Regression model:
Besides the Variable Contributions chart shown for II Classification, II Regression can also visualize the accuracy of the model:
These were just my first impressions of SAP PA 1.0.17. Besides what has been shown here, I am also impressed by its performance and stability.
Additionally, SAP PA 1.0.17 brings support for HANA Partitioning - data sampling, and the Support Vector Machine algorithm has been added when using SAP HANA Connect.