cancel
Showing results for 
Search instead for 
Did you mean: 

what are various methods of importing bulk of data in HANA studio without using data services ?

Former Member

i have to import bulk data in SAP HANA without using data services. is there any methods from which bulks of data can be loaded?

Accepted Solutions (1)

Accepted Solutions (1)

Former Member
0 Kudos

After Digging every  possible methods , i find CTL method is good to load one time data.

But if we use SHell Scripts then we can use Ctl method to load the data .

I have implemented shell scripts over ctl files to load the data whenevr any ctl files is present.

I find it bit easy (if we have full packaged data and only if need to load in HANA ) then this method is easy and good.

To schedule the files is still a hurdle for me.

Former Member
0 Kudos

Just write a cron script which searches a directory for CSVs, then writes a control file and SQL load file for that CSV, and executes it using hdbsql

Which part is causing the challenge and how have you tried to solve it?

John

Former Member
0 Kudos

can we implement any other scheduler  Other than Corn ?

using sleep method i have written scripts.

instead of sleep i am thinking to creatwe scheduling acc to date basis.

hope it will helfull

Former Member
0 Kudos

Why wouldn't you want to use crond? What's the problem with it?

You can use whatever UNIX scheduler you want.

John

Former Member
0 Kudos

I have used Cron ..its good .But can we not run mutliple files under  cron scheduler.?

But now i am stuck up in implementation:

I have to implement a shellscript code where in i have to import only those ctls in HANA  whose csvs are present in a particular folder.

Like i have a folder name A and in that i have updated CSVs which i am scheduling on daily basis. In the A folder only those CSV will come who are updated .So now how can i load data on of those tables whose csv is only present in folder A . like if i have 2 tables on 1st day then only 2 ctl files will run and on the next day if i have 3 tables in A folder then 3 files will run.

This logic is harder for me to implement

Former Member
0 Kudos

i did the automation successfully with the above logic implementation

Saritha_K
Contributor
0 Kudos

Hi Shweta,

Even I have a requirement to load multiple .txt/.csv  files into HANA on a periodic basis. Can you please share your approach on how we can do it?

Regards,

Saritha K

Answers (2)

Answers (2)

rama_shankar3
Active Contributor
0 Kudos

Shweta:

For small volume and demo table files, you can use the import wizard if you are higher than SP4.

However, for high volume buld loads.  The best way is to use the import wizard for high volume flat files.  Please follow John's step-by-step and guidelines to import bulk data.

http://www.saphana.com/community/blogs/blog/2013/04/07/best-practices-for-sap-hana-data-loads

John is a HANA Guru / Veteran and I am a big fan of John's blogs and I have followed his inputs and tips in my projects. John, good to see you back in the blogging area.  You have been missing in action for some time sir! Welcome Back!

Hope this helps.

Rama

Former Member
0 Kudos

You're too kind - I've been busy with customers.

I do have a Java loader which I'm happy to share - creating a GitHub repository now for HANA code snippets. Empty for now, keep checking back.

Feel free to join and add your own code.

http://github.com/sap-hana

John

Former Member
0 Kudos

thanks John . As u mentioned in the very first post that u write shell scripts for the reloading of flat files

in HANA database. and to schedule them we have to use XS wrapper ..can u provide me some examples of XS wrapper . i m not getting the idea of it. As a beginner in HANA.:)

Rama: thanks for info and also about John's .

Former Member
0 Kudos

I pretty much always use the IMPORT command with control files. It is limited in file format (requires CSV) and so I often find myself writing Linux shell scripts to reformat files.

It is easy and cheap and very fast - I see up to 7m rows/second depending on the type of file.

http://help.sap.com/hana/html/sql_import_from.html

John

Former Member
0 Kudos

thanks for reply. i have few doubts..

is the CTL method u r talking about right?

but will this method helps to load the data which in bulk form?

and how to u schedule the CTL files ?

Former Member
0 Kudos

CTL files are ideal for single time bulk loads.

You didn't specify that you wanted to schedule them. You can do this with CTL files using a XS wrapper, and there is a scheduling engine in the roadmap.

But if you want to do periodic loads from flat files into HANA, then you are better using Data Services for sure. It provides good error handling, monitoring and notification, all of which may be important to you.

If you don't want to use Data Services then you could write a loader into XS using Server Side JavaScript, or I have even written them using JDBC. Alternatively, Informatica is now supported.

John

Former Member
0 Kudos

soory for not mentioning of scheduling in earlier post.

thanks alot for the insights..

can u help with how to write loader using JDBC.?

Former Member
0 Kudos

You can do it like for any database - here's an example for MSSQL, you could rewrite this using the HANA driver.

The key is that you addBatch() and executeBatch() to ensure you do bulk inserts rather than individual inserts. This will dramatically improve load performance. I advise batches of 50-200k rows depending on your data - play with the number.

Note that this will only use a single thread (actually two) and you will max out at about 100-200k rows/sec. If you want faster performance, you will have to write a multi-threaded loader which can push multiple batches at once into multiple connections. You will find you will get an improvement in performance up to the number of cores/2.

Typically you should expect 1m rows/sec of inserts on a typical 10 column table. For very wide tables (1-200 columns) this may drop to about 250k rows/sec.

Good Luck!

John

http://blogs.datadirect.com/2012/05/how-to-bulk-insert-jdbc-batches-into-microsoft-sql-server-oracle...