College Basketball Analysis Powered by SAP Lumira & Predictive Analytics
2016 College Basketball Analysis and Predictions
Using SAP Lumira & SAP Predictive Analytics, the SAP Data Viz team analyzed data from the top 68 teams to fill out a bracket and determine who will reach the Final Four. Each selection was made based on data, removing any gut feel or school bias from the equation.
Get Involved and Viz the Madness!
Think you know which teams will make it to the finals? Try SAP Lumira and share your insights and predictions.
Here is how you can join the action:
- Download SAP Lumira and/or SAP Predictive Analytics
- Download the SAP Lumira college basketball dataset or use your own
- Share your analysis via the SAP Community Network and tweet your URL along with a picture of your bracket with #VizTheMadness16 @SAPAnalytics
- Example tweet: Check out my #SAPLumira bracket for #MarchMadness #VizTheMadness16 @SAPAnalytics [URL] [IMAGE OF BRACKET]
- Join the SAP Data Viz Team bracket pool and let the trash talk begin! See how your team stacks up against our basketball data viz experts and be entered to win bragging rights for 2016 and an official #VizTheMadness16 certificate.
View and Interact with the Insights and Analysis
Here is the 2016 bracket:
What was the Data and How was it Analyzed?
The Data Viz and Predictive Analytics team pulled publicly available data from multiple sources, including ESPN.com, Wikipedia, and GPS Visualizer. The key statistics include College Basketball Power Index (BPI), Defensive/Offensive Quotient, Strength of Schedule Deviation, Conference Rank, Win/Loss, Net Points vs Average, and Average Scoring Margin. Each individual match-up was compared and analyzed on a team-by-team basis.
SAP Predictive Analytics identifies the measures that contribute most to our predictive model. Of all 351 Division I basketball teams, we isolate the 68 tournament teams for a robust analysis.
Looking at initial results, we see that Offensive Quotient makes up over 40% of our model.
Next, we can use Category Significance for each variable to determine how each range of values would impact BPI Ranking.
Several matchups, however, are too close to pick confidently, so we dive deeper in SAP Lumira for additional analysis.
Offensive / Defensive Performance & Quality of Wins & Losses vs Quality Opponents
Looking at offensive / defensive performance (quotient) and strength of schedule, we can better understand a team’s playing style, its effectiveness, and how it wins. Kansas, for example, relies more on its offense and has a higher strength of schedule, compared to a team like Saint Mary's, which has a stronger defense and a significantly lower strength of schedule.
How do teams play against quality opponents? Quality wins against tough opponents can be a great predictor. While Texas A&M has gone unbeaten this year against top-25 ranked teams, it has only played 3 games against them. Kansas is seemingly battle-tested, having played 10 games against top-25 ranked opponents and won 7 of them.
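The sample-size caveat above can be made concrete with a small sketch. It uses only the two records quoted in the text (Texas A&M 3 wins in 3 games, Kansas 7 wins in 10 games against top-25 opponents). The add-one smoothing and the function names are illustrative assumptions on our part, not something SAP Lumira computes.

```python
def raw_win_rate(wins, games):
    """Plain win rate: wins divided by games played."""
    return wins / games


def smoothed_win_rate(wins, games):
    """Add-one (Laplace) smoothing: pulls small samples toward 50%.

    This is an assumed, illustrative adjustment, not part of the SAP analysis.
    """
    return (wins + 1) / (games + 2)


# Records against top-25 opponents, as quoted in the text: (wins, games).
records = {"Texas A&M": (3, 3), "Kansas": (7, 10)}

for team, (wins, games) in records.items():
    print(team,
          round(raw_win_rate(wins, games), 3),
          round(smoothed_win_rate(wins, games), 3))
# Texas A&M 1.0 0.8
# Kansas 0.7 0.667
```

Under this assumed adjustment the gap between the two teams narrows from 30 points to about 13, which is one way to express that a perfect record over 3 games carries less evidence than 7 wins over 10.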
Team vs Team Analysis & Predictions
Sports fans can use hyperlinked text on the bracket to drill down and analyze different projected games with teams of interest. During the tournament, Gonzaga would struggle against Michigan State, a team with higher offensive and defensive performance, strength of schedule, and net points vs average.
With SAP Predictive Analytics, sports fans can create and train a regression model and apply it to games of interest to predict the probability of a team winning or losing against a given opponent. For example, the model can estimate the probability of a team like Michigan State winning or losing a game during the tournament.
We then turned to SAP Predictive Analytics to crunch through the data, building a cluster analysis to get an idea of how the teams split out based on their BPI Ranking:
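For readers without SAP Predictive Analytics at hand, the idea of training a regression model and applying it to a game of interest can be sketched in plain Python. This is a minimal logistic regression under our own assumptions: the training rows, the two features (differences in offensive and defensive quotient between team A and team B), and all function names are hypothetical and are not the model or data used by the SAP team.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def win_probability(weights, bias, x):
    """Predicted probability that team A beats team B."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# HYPOTHETICAL training data, invented for illustration only.
# Each row: (offensive quotient difference, defensive quotient difference),
# computed as team A minus team B. Label 1 means team A won.
games = [(4.0, 1.0), (3.0, -1.0), (2.5, 2.0), (1.0, 0.5),
         (-1.0, -0.5), (-2.0, 1.0), (-3.5, -2.0), (-4.0, 0.0)]
results = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(games, results)

# Apply the trained model to a hypothetical game of interest.
print(round(win_probability(weights, bias, (2.0, 1.0)), 3))
```

Using feature differences keeps the sketch symmetric: swapping the two teams flips the sign of every input, which pushes the predicted probability to the other side of 50%.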
We found that clusters of teams with higher Offensive Quotient also had a higher BPI Ranking. Then, we looked at each cluster by strengths and weaknesses.
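The clustering step can be illustrated with a basic k-means sketch. SAP Predictive Analytics has its own clustering implementation, so the algorithm choice here is our assumption, and the six data points are hypothetical values invented to show the pattern described above (teams with a higher offensive quotient grouping together with a higher BPI).

```python
def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then re-average.

    Returns the final centroids and the cluster assignment from the last pass.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# HYPOTHETICAL (offensive quotient, BPI) pairs, invented for illustration.
teams = [(118, 88), (116, 85), (115, 86), (104, 70), (102, 68), (105, 72)]

# Seed with the first and last point so the run is deterministic.
centroids, clusters = kmeans(teams, [teams[0], teams[-1]])

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

With these made-up numbers the two clusters separate cleanly, and the cluster with the higher average offensive quotient also has the higher average BPI.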
The team then built a Decision Tree on the variables from the regression analysis to check whether it lined up with the cluster breakout found in the first two steps:
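One way to picture how a decision tree can confirm a cluster breakout is to look at a single split. The sketch below searches one variable for the threshold that best separates two cluster labels, scored by Gini impurity. The choice of Gini, the function names, and the data are our assumptions for illustration; they are not the settings or output of the SAP tree.

```python
def gini(labels):
    """Gini impurity for binary labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted Gini) for the best single split."""
    best = (None, float("inf"))
    ordered = sorted(set(values))
    # Candidate thresholds sit halfway between neighbouring values.
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(values)
        if score < best[1]:
            best = (threshold, score)
    return best


# HYPOTHETICAL offensive quotients and cluster labels (1 = higher-BPI cluster).
offensive_quotient = [118, 116, 115, 104, 102, 105]
cluster_label = [1, 1, 1, 0, 0, 0]

print(best_split(offensive_quotient, cluster_label))
# (110.0, 0.0)
```

A weighted impurity of 0.0 means the single split reproduces the cluster labels exactly on this toy data, which is the kind of agreement between tree and clusters the text describes.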
Be sure to check back to www.sap.com/vizthemadness for updates to the bracket during the tournament.
Past Analysis and Data Story Examples:
- March Madness Predictions with SAP Lumira by Craig Powers
- BI2014 and HANA2014 Takeaways by Holger Mueller
- Insights from SAP Insider BI 2014 – BI for the Business User by Cindy Jutras
- March Madness 2014 using SAP Lumira by Sam Ko
- My 8 year old is mad about SAP Lumira by Ryan Oneil
- UK vs UCONN – You Care by Tammy Powlas
- 11 of 16 sweet sixteen predictions with SAP Lumira and SAP Predictive Analysis
*Disclaimer: this analysis and these predictions are an attempt to showcase SAP technology, not to accurately predict sports outcomes. This analysis uses data from March 10th, 2016.