on 04-14-2014 12:01 PM
Hello,
I have a CSV file with a bit more than 320.000 rows.
A table that should store the rows.
Enough hard disk space for the load, I guess.
HANA revision 1.00.68.384084.
My control file looks like this:
IMPORT DATA
INTO TABLE "KTH"."MyTable"
FROM '/cloudfolders/mycsv_utf.csv'
RECORD DELIMITED BY '\n'
FIELDS DELIMITED BY ','
OPTIONALLY ENCLOSED BY '"'
ERROR LOG '/cloudfolders/mycsv_utf_error.txt'
I call the control file from HANA Studio with the following command:
IMPORT FROM CONTROL FILE '/cloudfolders/my_control_file.ctl'
WITH
THREADS 10
BATCH 10000;
About 260.000 rows are imported without problems.
An error file "mycsv_utf_error.txt" gets created (with wrong permissions?).
But the error file stays empty (0 bytes), even when I chmod the file to 777 and re-run the load command.
I need to know which rows cause the error in order to fix the problem (data / table / ...).
Any idea?
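While the error log stays empty, one way to narrow down candidate rows outside of HANA is to pre-check the CSV itself. The following is only a minimal sketch, assuming the rejected rows are structurally malformed (wrong number of fields); it cannot detect type or length mismatches against the table. The function name `find_suspect_rows` is hypothetical, and Python's `csv` module is used here as a stand-in parser, not as a model of how HANA reads the file.

```python
import csv
import io


def find_suspect_rows(lines, delimiter=",", quotechar='"'):
    """Return (line_number, field_count) for every record whose field
    count differs from that of the first record.

    `lines` is any iterable of text lines, e.g. an open file.
    line_number is the physical line on which the record ends.
    """
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    expected = None
    suspects = []
    for row in reader:
        if expected is None:
            # The first record defines the expected number of fields.
            expected = len(row)
            continue
        if len(row) != expected:
            suspects.append((reader.line_num, len(row)))
    return suspects


# Small in-memory demonstration; the third line has one field too many.
sample = io.StringIO('a,b\n1,"x"\n2,"y",extra\n3,"z"\n')
print(find_suspect_rows(sample))
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open('/cloudfolders/mycsv_utf.csv', newline='', encoding='utf-8')`.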
Regards,
I found out a bit more ...
The import behaves differently depending on the settings.
Changing the number of threads or the batch size changes the number of imported rows.
IMPORT FROM CONTROL FILE '/cloudfolders/abrstat.ctl'
WITH
Threads nn
Batch nn
Threads 1 / No Batch: 265.537
Threads 10 / Batch 5000: 269.704
Threads 1 / Batch 50000: 264.420
Threads 10 / Batch 10: 266.756
Threads 100 / No Batch: 286.969
Threads 200 / No Batch: 293.224
Threads: 300 --> No Result at all (Overflow?)
Threads 250: 294.104
So as far as I can see, there is a limit on the threads somewhere above 250 (maybe system dependent?).
I do not see why the number of imported rows depends on the number of threads.
Especially when BATCH is not set (default 1?), it should always fail at the same row. With a larger batch, a whole batch probably fails when one row has invalid data, but with batch = 1 it should always fail at the same row, I guess.
I am really struggling with this import. Without an error log I am not able to find the row that breaks the import. Even with the approach above I am not able to find the error.
So it would be very helpful and kind if somebody had an idea how to solve the problem of the unwritten log file.
By the way: I set the whole folder to 777.
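Since the row counts above do not point to a single failing row, another option is to split the CSV into small files and import each one separately, comparing the imported count with the number of lines in the chunk. The sketch below only does the splitting. It assumes that no quoted field contains a line break, so that one physical line equals one record. The names `chunk_lines` and `write_chunks` and the output file naming are hypothetical, and each chunk would still need its own control file pointing at it.

```python
from itertools import islice


def chunk_lines(lines, chunk_size):
    """Yield consecutive lists of at most chunk_size lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def write_chunks(src_path, dst_prefix, chunk_size):
    """Write src_path as numbered chunk files and return their paths.

    Line endings are copied unchanged (newline="" disables translation).
    """
    paths = []
    with open(src_path, encoding="utf-8", newline="") as src:
        for i, chunk in enumerate(chunk_lines(src, chunk_size)):
            path = f"{dst_prefix}_{i:04d}.csv"
            with open(path, "w", encoding="utf-8", newline="") as dst:
                dst.writelines(chunk)
            paths.append(path)
    return paths
```

A chunk whose imported row count is lower than its line count contains at least one rejected row and can be split again with a smaller `chunk_size`.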
Regards,
Hi Thomas,
I don't have an answer, but I will bookmark your question because I will soon be 'migrating' my content. I remember some issues with my own import, but I didn't realize the number of threads could affect the actual row counts, which doesn't sound too good. Together with added SQL code not executing as expected until ver75, it sounds like we have plenty on our plate to test.
thx,
greg
Hello,
here is some more information about the problem with importing the CSV:
1. Rows with errors:
Because I can't get any error messages (see below), I try to find the error by excluding rows until I find the row with the error.
The following in the CSV caused an error:
Testnumber,Texttextfield
1, "Sampletext"
2, "This is a ""TEXT"" with quotes"
Quotation marks inside text fields are not accepted ...
As far as I can see, this does not conform to RFC 4180 (Common Format and MIME Type for Comma-Separated Values (CSV) Files). See section 2, rule 7:
If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:
"aaa","b""bb","ccc"
2. Error Log:
This issue is not solved yet.
My control file looks like this:
IMPORT DATA
INTO TABLE "KTH"."MyTable2013"
FROM '//csv_xchange/myfile_2013_1000_rows.csv'
RECORD DELIMITED BY '\n'
FIELDS DELIMITED BY ','
OPTIONALLY ENCLOSED BY '"'
ERROR LOG '/csv_xchange/myfile2013_utf_1.txt'
The CSV file gets imported and rows with errors are omitted, but no error is written to the error log file.
The error log file gets created in the correct folder but stays empty.
I also tried setting the permissions of the file and of the whole folder to 777, with no change.
The parent folder looks like this: [attachment not preserved]
The file settings look like this: [attachment not preserved]
I don't see any way left to solve this.
The HANA version is as follows: [attachment not preserved]
Does anybody have the same problem?
Does anybody have a workaround for the quotation problem and/or the error log that is not written?
Regards,
Mansur Esmann i.R. of Thomas
Hello Krishna,
thanks for this link.
This helps me solve the double-quote problem, but there are also other errors in the file, and I have to find them with the help of the error log.
I did not mention that the error file is created by the user hdbadm.
I guess hdbadm is the HANA system user, so if that user can create the file, it should also be able to write to it!?
Is there a log in the HANA developer or admin perspective that reports errors related to IMPORT DATA, or maybe a log of I/O errors (like the error log)?
Regards,
Mansur