06-01-2011 3:05 PM
Hi guys,
I am having the following problem: I have a program that generates a file of 500,000+ records (each with a lot of info) on the server. I have split the program into multiple jobs to improve performance, but the problem is that I now have multiple files (one per job), and I need only one (with all the records).
What can I do, or what can I program, to merge all the generated files into one single file? Also, I cannot process more than 9,000 entries per job, because a dump occurs (selection is too large), so the number of files is difficult to manage manually.
Did you have this problem before?
Many thanks in advance!!
LUCAS
06-01-2011 3:26 PM
Lucas,
Try the OPEN ... FOR APPENDING variant so that the contents are appended to the same file, but then you will not be able to run the jobs in parallel. Do you have to run the jobs in parallel?
Regards,
Suman Jagu
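A minimal sketch of the appending variant Suman mentions (the file path and variable names below are placeholders, not from Lucas's program):

```abap
* Sketch: append a record to an existing server file.
* gv_file and ls_record are placeholder names.
DATA: gv_file   TYPE string VALUE '/tmp/output.dat',
      ls_record TYPE string.

OPEN DATASET gv_file FOR APPENDING IN TEXT MODE ENCODING DEFAULT.
IF sy-subrc = 0.
  TRANSFER ls_record TO gv_file.
  CLOSE DATASET gv_file.
ENDIF.
```

With FOR APPENDING, the file is created if it does not exist; otherwise new records are written after the existing contents.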
06-01-2011 4:00 PM
Unfortunately, yes. I need to keep the parallel processing. In fact, I need it in order to keep the report performant; otherwise it can run for hours...
Best Regards
06-01-2011 4:22 PM
Hi mate,
You can use the "after job" start condition. This way you can chain all the jobs: when the first job ends, the next one begins, and so on.
Then you can add records to the spool with the APPENDING addition.
Regards
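One way to set up such a chain programmatically is with the standard JOB_OPEN / JOB_SUBMIT / JOB_CLOSE function modules; a sketch, assuming the predecessor parameters of JOB_CLOSE (verify the exact interface in your release, and note that zmerge_report and all variable names here are hypothetical):

```abap
* Sketch: schedule a job that starts only after a predecessor
* job finishes. Report name and variables are placeholders.
DATA: lv_jobname       TYPE tbtcjob-jobname VALUE 'ZMERGE_STEP2',
      lv_jobcount      TYPE tbtcjob-jobcount,
      lv_pred_jobname  TYPE tbtcjob-jobname,
      lv_pred_jobcount TYPE tbtcjob-jobcount.

CALL FUNCTION 'JOB_OPEN'
  EXPORTING
    jobname  = lv_jobname
  IMPORTING
    jobcount = lv_jobcount.

SUBMIT zmerge_report VIA JOB lv_jobname NUMBER lv_jobcount AND RETURN.

CALL FUNCTION 'JOB_CLOSE'
  EXPORTING
    jobcount      = lv_jobcount
    jobname       = lv_jobname
    pred_jobcount = lv_pred_jobcount  " predecessor job
    pred_jobname  = lv_pred_jobname.
```

The trade-off is the one Suman pointed out: chained jobs run one after another, so the parallelism is lost.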
06-01-2011 4:56 PM
Looks like you need a little helper program that reads all files one by one and appends the contents to the "big one", to be triggered when all chunks have been processed.
This should be a piece of cake for you, since you already designed a solution involving parallel processing.
Thomas
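The helper program Thomas describes could look roughly like this (a sketch; lt_files, the paths, and lv_target are placeholder names, and how the chunk file names are collected is left open):

```abap
* Sketch of a merge helper: reads each chunk file and appends
* its lines to one target file. All names are placeholders.
DATA: lt_files  TYPE TABLE OF string,
      lv_file   TYPE string,
      lv_target TYPE string VALUE '/tmp/merged.dat',
      lv_line   TYPE string.

OPEN DATASET lv_target FOR OUTPUT IN TEXT MODE ENCODING DEFAULT.
LOOP AT lt_files INTO lv_file.
  OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING DEFAULT.
  CHECK sy-subrc = 0.
  DO.
    READ DATASET lv_file INTO lv_line.
    IF sy-subrc <> 0.
      EXIT.  " end of this chunk file
    ENDIF.
    TRANSFER lv_line TO lv_target.
  ENDDO.
  CLOSE DATASET lv_file.
ENDLOOP.
CLOSE DATASET lv_target.
```

As Thomas notes, the hard part is not the merge itself but knowing when all chunks have been written, so this program must be triggered only after the last job ends.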
06-02-2011 8:43 AM
Ah, I thought it was a spool. Since you are processing them in parallel, it's going to be difficult to handle those files once they are finished. Well, if you really know which one will be the last job, you can launch a report after it which reads all the generated files and joins them into one. For example, add all the file paths to a database table, and at the end read all of them from your join report.
Regards
06-02-2011 9:15 AM
Hi Lucas,
My thoughts...
1) If you are using parallel processing, are you using "PERFORMING/CALLING ... ON END OF TASK"? If yes, then you could have the "helper program" suggested by Thomas inside the same program.
The reason I think it wouldn't be a good idea to have this as a separate program, in case you are not using "PERFORMING/CALLING ... ON END OF TASK", is that you can never be sure whether all the calls spawned in parallel are complete.
2) Creating individual files for each parallel call spawned poses additional challenges, like duplicate file names, etc. (I am sure they can be avoided, but they are still challenges).
3) My suggestion would be to create a Z table and a lock object for the key (maybe the file name), and in each parallel call enqueue/dequeue this key entry and write to the file.
Regards,
Chen
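Chen's third option could be sketched like this. The lock object ZFILELOCK and its generated function modules ENQUEUE_EZFILELOCK / DEQUEUE_EZFILELOCK are assumptions for illustration; in practice they are generated in SE11 from the lock object you define:

```abap
* Sketch: serialize the file write across parallel calls with
* an enqueue lock. Lock object and variable names are
* hypothetical; generate your own lock object in SE11.
CALL FUNCTION 'ENQUEUE_EZFILELOCK'
  EXPORTING
    filename     = lv_file
  EXCEPTIONS
    foreign_lock = 1
    system_failure = 2.
IF sy-subrc = 0.
  OPEN DATASET lv_file FOR APPENDING IN TEXT MODE ENCODING DEFAULT.
  TRANSFER lv_record TO lv_file.
  CLOSE DATASET lv_file.
  CALL FUNCTION 'DEQUEUE_EZFILELOCK'
    EXPORTING
      filename = lv_file.
ENDIF.
```

A real implementation would also retry on foreign_lock instead of skipping the write, since parallel tasks will frequently collide on the lock.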
06-02-2011 2:51 PM
Many thanks for your answers.
To solve the problem, I programmed the jobs to fill a Z table, and then the main program generates one single file from the Z table.
Best Regards,
LUCAS
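Lucas's final approach could be sketched as follows, assuming a hypothetical flat Z table ztab_results filled by the parallel jobs (all names and the path are placeholders):

```abap
* Sketch of the accepted solution: each parallel job INSERTs
* its results into a Z table, and afterwards the main program
* exports the whole table as one file. ztab_results is a
* placeholder for a flat, character-like Z table.
DATA: lt_results TYPE TABLE OF ztab_results,
      ls_result  TYPE ztab_results,
      lv_file    TYPE string VALUE '/tmp/final_output.dat'.

SELECT * FROM ztab_results INTO TABLE lt_results.

OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING DEFAULT.
LOOP AT lt_results INTO ls_result.
  TRANSFER ls_result TO lv_file.
ENDLOOP.
CLOSE DATASET lv_file.
```

This sidesteps the file-merging problem entirely: the database serializes the concurrent INSERTs, so no lock object or file chaining is needed.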