Big Data = The New BI?
>> Written by @Waldemar Adams, Vice President of Business Intelligence, SAP EMEA
Over the last 6-10 months there has been quite an increase in news and discussions about "Big Data". Interestingly enough, a lot of it is attached to Analytics and Business Intelligence as THE use case. This is not really a surprise, as big data has become the umbrella term for some key trends and business opportunities in the BI market.
Business Intelligence and Analytics is this year once again the CIOs' #1 priority (according to Gartner's annual CIO survey) - for good reason, and it has been for many years now.
While the underlying technology of the day changes - ERP, OLAP, and RDBMS in the past; Mobile, Cloud, and in-memory in our decade - the demand stays the same: to better understand what is going on in the company, to optimize performance, and to seize business opportunities.
OLAP and cubes were the promise of better Analytics and Business Intelligence starting in the late 1990s - and that promise held true at the time. But the accelerated growth of data, both from operational sources with more detail and faster update cycles, and the need to understand and analyze unstructured data from sources like social media, has very much changed the game. Data doubles every 18 months, and 70% of this growth comes from unstructured sources. As a result, businesses began to struggle to get the best insight out of their data, hitting the limits more and more often.
In-memory databases started their rocket-like rise in the early 2010s, very much spearheaded by SAP HANA, which was designed to solve this data dilemma. In parallel, Hadoop became more relevant; its billing as the "new database" still needs to pass the reality check.
To set the poles: while HANA was designed as a high-performance analytic appliance that enables all users to analyze corporate KPIs at the speed of thought, Hadoop's core purpose is to be an open-source, easy-to-deploy-and-manage technology for storing lots of data on commodity hardware.
But the promise of 'big data' is not just to host mass data. Simply storing huge amounts of data is, in the end, also possible with a traditional RDBMS. That is why many companies started to suffer from a lack of agility, turning their data into 'dark data' that only consumes space but doesn't add any value. The real benefit is to turn that dark data back into insight, make it available to lots of users, and allow them to gain a better understanding of their relevant piece of the business. By 2015, 15% of the Fortune 500 will be able to fully exploit "Big Data" for competitive advantage.
Data is an asset that needs to be leveraged to drive corporate performance, no matter how much data there is or of which kind, structured or unstructured. And 'big data' is today's #hashtag for this promise.
SAP, with its Real-Time Data Platform (RTDP), is well prepared to host and serve hot data, but also seamlessly integrates Hadoop to store mass data such as web logs or unstructured data from social media sources.
The right answer for an architecture is not one OR the other, but the right mix based on the specific requirements.
Hadoop standalone will most likely not achieve the performance needed for analytic workloads.
HANA and in-memory is ideal as the leading platform to host data for Analytics, as it is tailor-made and built for it.
SAP's Business Intelligence platform is agnostic by design and supports HANA as well as Hadoop, as one of more than 120 supported data sources and technologies.
If a company wants to embark on big data, it can simply add a Hadoop instance to its traditional warehouse.
With one consistent BI layer and its built-in multi-source capabilities, accessing the KPIs from the DWH enriched with data from Hadoop is just a mouse click away. This works well, but depending on the Hadoop distribution it can easily result in slow performance.
To really drive change and leverage the benefits of big data, an in-memory database like SAP HANA is the better starting point.
SAP HANA provides a federated architecture in which certain data can physically reside in different buckets, such as Hadoop or SAP IQ.
This federation technology hides where the data physically resides from both the report designer and the data discovery user. They see only one consistent source of data from the SAP RTDP, which lets them focus on using the data rather than forcing them to tame the data sources.
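The federation idea can be illustrated with a small, hypothetical Python sketch (all class and backend names here are invented for illustration and are not SAP APIs): a thin facade routes each request to the bucket that physically holds the data, while the consumer sees a single, consistent interface.

```python
# Toy illustration of data federation: the consumer queries one facade
# and never learns where each dataset physically resides.
# All names below are hypothetical, not actual SAP interfaces.

class FederatedCatalog:
    def __init__(self):
        self._routes = {}    # logical dataset name -> backend name
        self._backends = {}  # backend name -> its datasets

    def register_backend(self, name, datasets):
        """Register a storage bucket (e.g. 'hana', 'hadoop', 'iq')."""
        self._backends[name] = datasets
        for dataset in datasets:
            self._routes[dataset] = name

    def query(self, dataset):
        """Fetch a dataset by logical name; routing stays hidden."""
        backend = self._routes[dataset]
        return self._backends[backend][dataset]


catalog = FederatedCatalog()
# Hot, aggregated KPIs live in the in-memory store ...
catalog.register_backend("hana", {"revenue_kpi": [1200, 1350, 1500]})
# ... while raw web logs sit in a commodity cluster.
catalog.register_backend("hadoop", {"web_logs": ["GET /home", "GET /cart"]})

# The analyst asks for data by name only - one consistent source.
print(catalog.query("revenue_kpi"))  # served from 'hana'
print(catalog.query("web_logs"))     # served from 'hadoop'
```

The design point mirrors the paragraph above: registering a new bucket changes the routing table, not the consumer's code, which is exactly why report designers and discovery users can ignore where the data lives.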
If you want to read more about SAP's offering in the big data / BI space, here is a link: http://www.sap.com/bigdata
As Obi-Wan would phrase it: May the 'big data' be with you!