on 05-27-2016 2:19 PM
Hi,
I cannot connect to Hive via HiveServer2 with the extra options (transportMode=http, httpPath=cliservice).
Here is the error I get using PA on my desktop ("Requested array size exceeds VM limit ...", see below).
The Hive DB itself is working fine: I can access it via beeline without any problem.
What does this error message mean? Any clue how to solve this?
Hello JB, Antoine is absolutely correct about support of Hive versions: Hive 1.2 is only supported in Automated mode, and we have not upgraded Expert mode beyond version 0.13. However, purely on the basis of technical compatibility, you could try this. For ODBC/JDBC connectivity in PA we have tried and tested the "HiveServer2 port". This is usually port 10000 and is sometimes also called the Thrift server port. What you are trying is port 10001, which is the HTTP transport port for Hive; that will work from beeline all right. For PA, you should be able to find the "HiveServer2 port" in the Hadoop configuration for Hive and use that port during connectivity without specifying any additional parameters. Will this be okay for you? Thanks, Priti
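For reference, the two connection styles discussed above map to two different HiveServer2 JDBC URL forms. This is a minimal sketch; the host name, database, and helper function are illustrative placeholders, not values from this thread:

```python
# Sketch: the two HiveServer2 JDBC URL forms discussed above.
# Host, port, and database names are placeholders.

def hive_jdbc_url(host, port, database="default", http_path=None):
    """Build a HiveServer2 JDBC URL; passing http_path switches to HTTP transport."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if http_path is not None:
        url += f";transportMode=http;httpPath={http_path}"
    return url

# Binary (Thrift) mode -- the default HiveServer2 port, no extra parameters:
binary_url = hive_jdbc_url("hadoop-node", 10000)
# HTTP mode -- what beeline accepts here, with the extra transport options:
http_url = hive_jdbc_url("hadoop-node", 10001, http_path="cliservice")

print(binary_url)
print(http_url)
```

The binary-mode URL is the one Priti suggests trying from PA, since it needs no additional parameters.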
Hi Jean-Baptiste, sorry for the delayed reply. I made all possible tests configuring the HTTP port instead of Binary from the PA Expert tool, and I ran into the same issue as you. I am afraid this does not seem to be working, so the only option is using 'Binary' transport mode and the associated Thrift server port. This is also the restriction, BTW, in Automated mode, because the driver that's shipped only supports Binary mode. Thanks.
Hi. I'd like to access TSV files from the Hadoop cluster, since the HTTP transport mode is not supported and this is a blocking factor internally. Can I do that through Spark? It seems that the use of Spark inside SAP PA can only be done on Hive tables. Is this correct? Thanks in advance for the information.
Hi Jean-Baptiste,
Automated will work fine with Hive tables created from TSV or other delimited files.
You can use Spark via Native Spark Modeling in Automated, which uses the Spark Scala APIs to build the model on the Hadoop cluster. It currently supports Hive tables.
thanks,
Alan
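Building on Alan's point, one common way to make TSV files visible to Automated as a Hive table is an external table over the files already in HDFS. A rough HiveQL sketch, where the table, column, and path names are made up for illustration:

```sql
-- Sketch: expose tab-delimited files in HDFS as a Hive table.
-- Table name, columns, and HDFS path are placeholders.
CREATE EXTERNAL TABLE my_dataset (
  id INT,
  label STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/my_dataset';
```

Once the table exists, it can be selected like any other Hive table, including from Native Spark Modeling.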
Hi Jean-Baptiste, we finally seem to have worked out the solution for Hive HTTP mode.
If you use the Simba or Hortonworks (Simba-based) drivers, then the ODBC driver can work with HTTP mode on Hive. Using the settings in the driver window, you can specify the transport mode, HTTP path, and other details as per your needs, and then PA can be used for subsequent actions.
This means that the embedded DD driver cannot be used for HTTP mode on Hive.
Let us know if you face any other issues, and whether you can work with Simba or other compliant drivers.
Thanks.
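To make the Simba/Hortonworks suggestion concrete, a Linux odbc.ini DSN entry might look roughly like the sketch below. The key names follow the Simba Hive ODBC driver convention as I recall it; verify them against your driver's install guide, and note that the host, port, and driver path are placeholders:

```ini
; Sketch of an odbc.ini DSN for the Simba/Hortonworks Hive ODBC driver.
; Key names should be checked against your driver's documentation;
; host, port, and file paths are placeholders.
[HiveHTTP]
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
Host=hadoop-node
Port=10001
HiveServerType=2
ThriftTransport=2   ; 2 = HTTP transport
HttpPath=cliservice
AuthMech=3          ; user name and password
```

With a DSN like this in place, PA can connect through ODBC while the driver handles the HTTP transport details.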
Thank you alan.mcshane@sap.com
Hi Jean-Baptiste,
Did Priti's comment help solve your problem?
Regards
@bdel
Hi Jean-Baptiste, please see if this helps. Requested array size exceeds VM limit when crea... | SCN
I see from your screenshot that the Hive DB you are using is version 1.2.1. Is this correct?
If yes, we do not support this particular version in Expert Analytics, only in Automated Analytics. See our PAM here: https://support.sap.com/content/dam/library/ssp/infopages/pam-essentials/Pred_Ana_20.pdf
That could be a reason...
I am not a big data expert; I will loop in colleagues to provide more input.
Thanks & regards
Antoine