
College Basketball Analysis Powered by SAP Lumira & Predictive Analytics

2016 College Basketball Analysis and Predictions

With SAP Lumira and SAP Predictive Analytics

Using SAP Lumira & SAP Predictive Analytics, the SAP Data Viz team analyzed data from the top 68 teams to fill out a bracket and determine who will reach the Final Four. Each selection was made based on data, removing any gut feel or school bias from the equation.


Get Involved and Viz the Madness!

Think you know which teams will make it to the finals? Try SAP Lumira and share your insights and predictions.

Here is how you can join the action:

  1. Download SAP Lumira and/or SAP Predictive Analytics
  2. Download the SAP Lumira college basketball dataset or use your own
  3. Share your analysis via the SAP Community Network and tweet your URL along with a picture of your bracket with #VizTheMadness16 @SAPAnalytics
    • Example tweet. Check out my #SAPLumira bracket for #MarchMadness #VizTheMadness16 @SAPAnalytics [URL] [IMAGE OF BRACKET]
  4. Join the SAP Data Viz Team bracket pool and let the trash talk begin! See how your team stacks up against our basketball data viz experts and be entered to win bragging rights for 2016 and an official #VizTheMadness16 certificate.

View and Interact with the Insights and Analysis


Here is the 2016 bracket:


Follow @SAPAnalytics on Twitter for details. Learn more about SAP Lumira today.

What was the Data and How was it Analyzed?

The Data Viz and Predictive Analytics team pulled publicly available data from multiple sources, including ESPN.com, Wikipedia, and GPS Visualizer. The key statistics include College Basketball Power Index (BPI), Defensive/Offensive Quotient, Strength of Schedule Deviation, Conference Rank, Win/Loss, Nets Points vs Average, and Average Scoring Margin. Each individual matchup was compared and analyzed on a team-by-team basis.

Variable Selection


SAP Predictive Analytics identifies the measures that contribute most to our predictive model. Of all 351 Division I basketball teams, we isolate the 68 tournament teams for a robust analysis.

Looking at initial results, we see that Offensive Quotient makes up over 40% of our model.
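To make the idea of a variable's share of a model concrete, here is a minimal sketch. It does not reproduce how SAP Predictive Analytics computes contributions; it assumes a simpler stand-in, where each variable's absolute Pearson correlation with BPI is normalized so that all shares add up to 100%. The team values and the names `pearson` and `contribution_shares` are hypothetical and invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def contribution_shares(variables, target):
    """Normalize each variable's absolute correlation with the target to a share of 1."""
    strengths = {name: abs(pearson(values, target)) for name, values in variables.items()}
    total = sum(strengths.values())
    return {name: s / total for name, s in strengths.items()}

# Hypothetical values for six teams, invented for illustration only.
variables = {
    "offensive_quotient": [9.0, 7.5, 6.0, 4.0, 2.5, 1.0],
    "defensive_quotient": [5.0, 6.5, 3.0, 4.5, 2.0, 3.5],
    "schedule_deviation": [1.0, 0.2, 0.8, 0.1, 0.6, 0.3],
}
bpi = [88.0, 84.0, 79.0, 75.0, 70.0, 66.0]

shares = contribution_shares(variables, bpi)
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.0%}")
```

A correlation-based share ignores interactions between variables, so treat it as a rough first look rather than a substitute for the tool's own contribution chart.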


Next, we can use Category Significance for each variable and determine how each range of values would impact BPI Ranking.


Several matchups, however, are too close to pick confidently, so we dive deeper in SAP Lumira for additional analysis.

Offensive / Defensive Performance & Quality of Wins & Losses vs Quality Opponents


Looking at offensive and defensive performance (quotient) together with strength of schedule, we can better understand a team’s playing style, its effectiveness, and how it wins. Kansas, for example, relies more on its offense and has a higher strength of schedule than a team like Saint Mary's, which has a stronger defense and a significantly lower strength of schedule.

How do teams play against quality opponents? Quality wins against tough opponents can be a great predictor. While Texas A&M has gone unbeaten this year against top-25 ranked teams, they have only played 3 games against them. Kansas is seemingly battle-tested, having played 10 games against top-25 ranked opponents and won 7 of them.
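The two records above rest on very different numbers of games. One way to put a range around each win rate is the Wilson score interval, sketched below. This is our own illustration and not part of the original analysis; the function name `wilson_interval` and the choice of roughly 95% coverage (`z=1.96`) are assumptions.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    margin = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - margin, center + margin

# Records against top-25 opponents as stated in the text above.
records = {"Texas A&M": (3, 3), "Kansas": (7, 10)}

for team, (wins, games) in records.items():
    low, high = wilson_interval(wins, games)
    print(f"{team}: {wins}/{games} = {wins / games:.0%}, interval {low:.2f} to {high:.2f}")
```

Both intervals come out wide and overlapping, which is one way to see why a perfect record over 3 games is thin evidence on its own.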


Team vs Team Analysis & Predictions

Sports fans can use hyperlinked text on the bracket to drill down and analyze different projected games with teams of interest. During the tournament, Gonzaga would struggle against Michigan State, a team that has higher offensive and defensive performance, strength of schedule, and nets points vs average.

Thanks to Predictive Analytics, sports fans can create and train a regression model and apply it to games of interest, predicting the probability of a team winning or losing against an opponent. With this model, we can predict the probability of teams like Michigan State winning or losing a game during the tournament.
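For readers who want to see the idea outside the tool, here is a minimal sketch of a win-probability regression. It assumes a plain logistic regression trained by gradient descent, which may differ from what SAP Predictive Analytics does internally. The training rows are hypothetical stat differences (team A minus team B) invented for illustration, and the names `train_logistic` and `win_probability` are our own.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def win_probability(weights, bias, x):
    """Estimated probability that team A beats team B given stat differences x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical games: (offense diff, defense diff, schedule diff), 1 if team A won.
games = [
    ([4.0, 2.0, 1.5], 1),
    ([3.0, -1.0, 0.5], 1),
    ([1.0, 3.0, 2.0], 1),
    ([-2.0, -1.0, -0.5], 0),
    ([-4.0, 1.0, -1.0], 0),
    ([-1.0, -3.0, -2.0], 0),
]

weights, bias = train_logistic([g[0] for g in games], [g[1] for g in games])
matchup = [2.0, 1.0, 1.0]  # hypothetical matchup where team A leads on every stat
print(f"Estimated win probability for team A: {win_probability(weights, bias, matchup):.2f}")
```

Using differences between the two teams as inputs keeps the model symmetric: swapping the teams flips the signs of the inputs and, when the bias is near zero, roughly mirrors the probability.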


Cluster Analysis


We then turned to SAP Predictive Analytics to crunch through the data, building a cluster analysis to get an idea of how the teams split out based on their BPI Ranking:

We found that clusters of teams with higher Offensive Quotient also had a higher BPI Ranking. Then, we looked at each cluster by strengths and weaknesses.
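The clustering step can be sketched in a few lines. This example assumes k-means on two variables, which is a common clustering method but not necessarily the one SAP Predictive Analytics applies. The team values and BPI ranks are hypothetical, and the function name `kmeans` is our own.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic k-means; the first k points seed the centroids so runs are repeatable."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical (offensive quotient, defensive quotient) pairs for six teams.
teams = [(10.0, 9.0), (2.0, 1.5), (9.5, 8.0), (1.0, 2.0), (10.5, 8.5), (2.5, 1.0)]
bpi_rank = [3, 60, 5, 55, 2, 64]  # hypothetical ranks, lower is better

assignments, centroids = kmeans(teams, k=2)
for c in range(2):
    ranks = [r for r, a in zip(bpi_rank, assignments) if a == c]
    print(f"cluster {c}: centroid {centroids[c]}, mean BPI rank {sum(ranks) / len(ranks):.1f}")
```

Comparing the mean BPI rank per cluster is the same kind of check described above: it shows whether clusters with a higher offensive quotient also sit higher in the ranking.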

The team then built a Decision Tree on the variables from the regression analysis to see whether it lined up with the cluster breakout found in the first two steps:
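As a rough illustration of that check, the sketch below finds the single best split of one variable against cluster labels, which is the first decision a tree would make. It assumes Gini impurity as the split criterion; the tool's actual criterion is not stated here. The values, labels, and the names `gini` and `best_split` are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted Gini) for the best split of the form value <= threshold."""
    best = (None, float("inf"))
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2  # candidate midway between neighboring values
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical offensive quotients and the cluster each team was assigned to.
offensive_quotient = [10.0, 2.0, 9.5, 1.0, 10.5, 2.5]
cluster = [0, 1, 0, 1, 0, 1]

threshold, impurity = best_split(offensive_quotient, cluster)
print(f"split at offensive quotient <= {threshold} with weighted Gini {impurity}")
```

A weighted Gini near zero means one variable alone reproduces the clusters, which is the kind of agreement between the tree and the cluster analysis the team was looking for.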

Be sure to check back to www.sap.com/vizthemadness for updates to the bracket during the tournament.


Past Analysis and Data Story Examples:


*Disclaimer: This analysis and these predictions are an attempt to showcase SAP technology, not to accurately predict sports outcomes. The analysis uses data from March 10th, 2016.
