
Requested array size exceeds VM limit when creating new data set via Hive (Apache Hadoop)

Former Member

I am trying to create a chain in the "predict" tab of Predictive Analytics 2.0 (expert mode) but I get an issue when creating the new data set.

I tried to create the new data set via "Query with SQL" using the "Apache Hadoop Hive 0.13 Simba JDBC4 HiveServer2" driver.

After entering the user/password and server:port information, I get an error message saying:

"requested array size exceeds VM limit (failed to allocate 1414811712 bytes) (array length 1414811695)"

Is this a heap size configuration problem or a bug with PA trying to create a huge array?

Could you please advise?

Thanks,

Veronica

Accepted Solutions (1)

achab
Product and Topic Expert

Hi Veronica,

Could you please attach a screenshot of the problem, showing the moment you hit it?

Would it be an option for you to use other client tools (I am thinking of Lumira* or the Information Design Tool) and check whether the problem is the same?

Thanks & regards

Antoine

* If you upgrade to Predictive Analytics 2.1 and are licensed to use Lumira as well, you can now install Predictive Analytics and Lumira side by side on your client machine.

Former Member

Hi Antoine,

Attached is a screenshot of the problem.

The message was a bit misleading: the actual issue was that the port in the path (host:port, blacked out here) was incorrect. After correcting the port, the connection was successful.


Cheers,

Veronica

Answers (1)

Former Member

Hi, so what is the correct default port number? I am using 50070 and it gives me the above error.

Thanks.

Former Member

We were using the default port, 10000.

PritiSAC
Advisor

Hi Vijay,

The correct port to use for ODBC/JDBC connectivity is the one associated with HiveServer2 in the Hadoop cluster. The default is 10000, but you can double-check it in the HiveServer2 properties, which are found in the cluster's configuration panel (for example, in Ambari or Cloudera Manager). Port 50070 is for HDFS browsing and is not meant for external JDBC/ODBC connectivity.

Good luck
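As a quick sanity check before pointing the JDBC driver at an endpoint, a small sketch like the one below (placeholder host, assuming the default Thrift port set by hive.server2.thrift.port) simply tests whether the HiveServer2 port is reachable. Aiming the driver at the wrong service, such as the HDFS web UI port 50070, is what produces confusing errors like the one above:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class HiveServer2PortCheck {
    public static void main(String[] args) {
        // Placeholder host; 10000 is the default value of hive.server2.thrift.port.
        String host = "myhadoophost.example.com";
        int port = 10000;

        try (Socket socket = new Socket()) {
            // Attempt a plain TCP connection with a 5 second timeout.
            socket.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("HiveServer2 port " + port + " is reachable on " + host);
        } catch (IOException e) {
            System.out.println("Cannot reach " + host + ":" + port + ": " + e.getMessage());
        }
    }
}
```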