Installing the Apache HIVE driver in SAP Lumira
Support for Apache HIVE 0.10 was added in SAP Lumira 1.13.
Connecting to Hadoop is done via the 'Query with SQL' data source option in Lumira. However, to keep Lumira lightweight, not all drivers are distributed with the software. The Hadoop HIVE driver is one of those that must be installed as needed.
This article describes where to find the required files and how to install them in SAP Lumira.
Note: While this article references files for version 0.10 of the Apache Hadoop Hive JDBC driver, this driver is supposed to be backward compatible and should consequently also work with versions 0.7, 0.8 and 0.9 of Apache Hadoop Hive.
While the Hadoop community is working to improve HIVE, it remains a fairly high-latency data source compared to the RDBMS you are used to. As a consequence, the data acquisition step in SAP Lumira will take longer than with other sources. If high interactivity is needed with Hadoop data, you should consider leveraging the SAP HANA big data platform.
While Lumira offers the option to acquire all data from HIVE tables, the most efficient use of Hadoop is obtained by processing queries in the Hadoop cluster, thus leveraging the power of MapReduce. SAP Lumira offers this capability through the 'Query with SQL' option.
SAP Lumira documentation: see the SAP Lumira page on the SAP Help Portal. Section 4.8 of the user guide explains how to install and use 'Query with SQL' drivers. It's always a good idea to read the documentation.
You'll also need SAP Lumira 1.13 or above. Please visit http://www.saplumira.com if you don't know about Lumira or don't have the software.
Getting the list of files required for the driver
The Product Availability Matrix (PAM): http://service.sap.com/pam. The PAM gives you the list of files needed to install the HIVE driver.
1- Once on the PAM site, run a search for 'Lumira'
2- Select SAP Lumira 1.0
You land on the product page. On the right-hand side, under 'Essentials', you'll find the most recent PAM.
3- Open the PAM and go to the page that gives the list of the JAR files required. Keep this list handy.
Note: the list may change over time if Lumira adds support for other HIVE versions so I'm not copying it here. Always check the PAM for the latest list.
Getting the files
Now that we have the list of files, we need to get them from Apache.
hadoop-*-core.jar, where * is the version (e.g. 0.20.2)
This file is found in the Hadoop distribution:
1- Go to the Apache archives server for Hadoop
2- Select the Hadoop version corresponding to the file version you need
4- Open in your preferred archive tool
You'll find the hadoop-*-core.jar
5- Create a directory somewhere on your local drive and extract this file to it. You only need that file so it's not necessary to unzip the whole archive.
You can now close that archive; we will not need it again.
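If you would rather script this step than use an archive tool, the following is a minimal Python sketch of the same idea: it pulls only the files matching a name pattern out of a .tar.gz archive and writes them into a folder. The function name extract_matching_jar and the folder name used in the example call are assumptions made for illustration, not part of Lumira or Hadoop.

```python
import fnmatch
import tarfile
from pathlib import Path, PurePosixPath


def extract_matching_jar(archive_path, pattern, dest_dir):
    """Extract only the archive members whose file name matches `pattern`.

    Files are written directly into `dest_dir` (the directory structure
    inside the archive is not reproduced). Returns the paths written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Compare against the bare file name, ignoring folders in the archive.
            name = PurePosixPath(member.name).name
            if member.isfile() and fnmatch.fnmatch(name, pattern):
                target = dest / name
                target.write_bytes(archive.extractfile(member).read())
                written.append(target)
    return written
```

For example, assuming the downloaded archive is named hadoop-0.20.2.tar.gz, calling extract_matching_jar("hadoop-0.20.2.tar.gz", "hadoop-*-core.jar", "hive_driver_jars") would copy only the core JAR into a hive_driver_jars folder.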
All other JAR files
All other files from the list are found in the HIVE distribution
1- Go to the Apache archive server for HIVE
2- Select the HIVE version currently supported by Lumira (0.10 at this time; again, check the PAM)
3- Download hive-*.tar.gz
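Once the required JARs have been copied from the archives into your local folder, it can be worth checking that none is missing before you move on to Lumira. The sketch below is one way to do that in Python. The function name missing_jars is hypothetical, and the list of patterns must be filled in from the PAM; only hadoop-*-core.jar is taken from this article.

```python
import fnmatch
from pathlib import Path


def missing_jars(folder, required_patterns):
    """Return the patterns from `required_patterns` that match no file in `folder`."""
    names = [p.name for p in Path(folder).iterdir() if p.is_file()]
    return [
        pattern
        for pattern in required_patterns
        if not fnmatch.filter(names, pattern)
    ]


# Placeholder list: copy the actual file names from the PAM.
# Only the first entry comes from this article.
REQUIRED_PATTERNS = [
    "hadoop-*-core.jar",
    # ... remaining entries from the PAM go here
]
```

Calling missing_jars("hive_driver_jars", REQUIRED_PATTERNS) returns an empty list when every pattern has at least one matching file in the folder; any pattern it returns points to a JAR you still need to fetch.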
Installing the driver in SAP Lumira
Now that we have collected all the required files, we're going to install the driver in Lumira.
1- Open Lumira and go to the SQL driver preferences (check the user guide referenced in this article if you don't know how to get there)
2- Select the Apache Hadoop Hive entry and click on 'Install Driver'
3- Navigate to the previously created folder where you stored the JAR files and select them all. At the time of this article, there should be 10 of them.
4- Click 'Open'. The Apache Hadoop Hive entry should now show a green check mark
5- Click 'Done' and restart SAP Lumira for the changes to be applied.
Congratulations! You can now start using Hadoop.