
split a file into N files of 64k

rodrigoalejandro_pertierr
Active Contributor
0 Kudos

hi,

I have a requirement: I need to receive a file, map it, and then split the result into many files of 64K each, because of a limitation of the backend system (no comments).

I saw many XSLT examples that split files by a specific segment, but I need to simulate in my PI 7.1 the functionality that the PI 7.3 file adapter offers, where a file can be split into many files.

I was thinking about an XSLT solution because the performance is better than Java mapping. However, Java mapping is not discarded.

Example file (let's assume it has a size of 130K):

<People>
<Person>            
<name>John</name>            
<date>June12</date>            
<workTime taskID="1">34</workTime>            
<workTime taskID="2">12</workTime>            
</Person>            
<Person>            
<name>John</name>            
<date>June13</date>            
<workTime taskID="1">21</workTime>            
<workTime taskID="2">11</workTime>            
</Person>
<Person>            
<name>Jack</name>            
<date>June19</date>            
<workTime taskID="1">20</workTime>            
<workTime taskID="2">30</workTime>            
</Person>   

</People>

As a result, I need to split the file into three:

FILE1 (60K)

<People>
<Person>  
<name>John</name>  
<date>June12</date>  
<workTime taskID="1">34</workTime>  
<workTime taskID="2">12</workTime>  
</Person>  
<Person>  
<name>John</name>  


FILE2 (60K)

<date>June13</date>  
<workTime taskID="1">21</workTime>  
<workTime taskID="2">11</workTime>  
</Person>
<Person>  
<name>Jack</name>  
<date>June19</date>  
<workTime taskID="1">20</workTime>

FILE3 (10K)

<workTime taskID="2">30</workTime>  
</Person>   

</People>

So as you can see, I need to split the file, but in the end, when all the pieces come together, the final XML should be consistent.

some examples:

http://stackoverflow.com/questions/4433529/split-xml-file-into-multiple-files-based-on-a-threshold-v...

http://www.abbeyworkshop.com/howto/xslt/xslt_split/

http://stackoverflow.com/questions/4169961/xslt-split-output-files-muenchian-grouping

...

...

...


hope it is clear

Accepted Solutions (0)

Answers (3)


former_member181985
Active Contributor
0 Kudos

Hi Rodrigo,

I would recommend using multi-mapping for your case. For example, each multi-message can hold a maximum of 3 records so that the message size (for the file) is <= 64 KB. Evaluate how many records constitute 64 KB or less in your scenario.
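To make that evaluation concrete, here is a minimal Java sketch of the arithmetic. The class `RecordsPerMessage`, the method `maxRecords`, and its parameters are hypothetical names for illustration and not part of PI; the wrapper size and the largest record size have to be measured on your own payload.

```java
// Hypothetical helper (not a PI API): estimates how many records fit into
// one message when every message repeats a fixed wrapper around the records.
final class RecordsPerMessage {

    /**
     * @param limitBytes         size limit per file, e.g. 64 * 1024
     * @param wrapperBytes       bytes used by the enclosing tags in every file
     * @param largestRecordBytes size of the largest single record, in bytes
     * @return number of records that always fit into one message (0 if none)
     */
    static int maxRecords(int limitBytes, int wrapperBytes, int largestRecordBytes) {
        if (limitBytes <= 0 || wrapperBytes < 0 || largestRecordBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Space left for records after the wrapper has been written.
        int available = limitBytes - wrapperBytes;
        if (available <= 0) {
            return 0;
        }
        // Integer division rounds down, so the limit is never exceeded.
        return available / largestRecordBytes;
    }
}
```

For example, with a 64 KB limit, a 19-byte wrapper and records of at most 200 bytes, `maxRecords(64 * 1024, 19, 200)` returns 327. Using the largest record size is a conservative choice: files may end up smaller than 64 KB, but never larger.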

By the way, what is your target system and which adapter are you using? Also, which PI version?

As Ambrish already pointed out, if you do what you explained, the XML message becomes incomplete, and in general XML-based systems cannot process such incomplete messages.

Regards,

Praveen Gujjeti

rodrigoalejandro_pertierr
Active Contributor
0 Kudos

currently it is not an option since

By the way, it doesn't matter if the split cuts a segment or a field value in half. The program in the backend will assemble all the parts into one XML.

Regards

former_member181985
Active Contributor
0 Kudos

Hi,

With Java mapping, you can still go ahead with the multi-mapping concept. The Java mapping code should cut the payload into chunks of 64 KB, one per message; the final chunk can be <= 64 KB. Use content conversion in the file receiver channel to write the record content for each file. This way you can serialize the data:

<Messages>

<Message1><Record>FILE1 XML CONTENT AS CDATA</Record></Message1>

<Message2><Record>FILE2 XML CONTENT AS CDATA</Record></Message2>

<Message3><Record>FILE3 (10 KB) XML CONTENT AS CDATA</Record></Message3>

....

....

<MessageN><Record>FILEN XML CONTENT AS CDATA</Record></MessageN>

</Messages>
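The chunking step itself could look roughly like the following Java sketch. It is only the splitting logic, not a complete PI Java mapping; the class `ByteChunker` and its methods are hypothetical names. It works on raw bytes, so keep in mind that if a chunk is later converted to a String, a multi-byte character cut at a chunk boundary would be corrupted.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a PI API): cuts a payload into fixed-size byte
// chunks. Every chunk has chunkSize bytes except possibly the last one.
final class ByteChunker {

    static List<byte[]> split(byte[] payload, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += chunkSize) {
            // The last chunk ends at the end of the payload.
            int end = Math.min(payload.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(payload, offset, end));
        }
        return chunks;
    }

    // What the backend would do: concatenate the parts in order.
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }
}
```

With a 130 KB payload and `chunkSize = 64 * 1024`, `split` produces three chunks (64 KB, 64 KB and 2 KB), and `join` on those chunks gives back the original bytes.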

Hope this solution helps

Regards,

Praveen Gujjeti

anupam_ghosh2
Active Contributor
0 Kudos

Hi Rodrigo Alejandro Pertierra,

With Java mapping you can control how many bytes you write to the destination, taking extra care that all XML tags are written completely to a file. If the content is pure ASCII or uses a single-byte encoding, each character occupies 1 byte, so you can calculate the size and then split the file. You also need to ensure that each file obtained after splitting is valid XML (depending on whether a parser reads each part on the target side).

Thus I would suggest Java mapping over XSLT. If you are worried about performance, you can use a SAX parser instead of DOM so that PI server memory usage is kept to a minimum.
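As a rough illustration of splitting at record boundaries so that every part stays valid XML, here is a small Java sketch. It assumes the records (for example each `<Person>...</Person>` block) have already been extracted as strings, e.g. by a SAX handler, which is not shown. The class `RecordPacker` and its methods are hypothetical names. Sizes are measured in UTF-8 bytes rather than characters, because non-ASCII characters take more than one byte in UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a PI API): packs whole records into files so that
// each file is wrapped in the root tags and stays within the byte limit.
final class RecordPacker {

    static List<String> pack(List<String> records, String open, String close, int limitBytes) {
        int wrapperBytes = utf8Length(open) + utf8Length(close);
        List<String> files = new ArrayList<>();
        StringBuilder current = new StringBuilder(open);
        int currentBytes = wrapperBytes; // size of the file if closed right now
        boolean hasRecord = false;

        for (String record : records) {
            int recordBytes = utf8Length(record);
            if (wrapperBytes + recordBytes > limitBytes) {
                throw new IllegalArgumentException("a single record exceeds the limit");
            }
            // Close the current file before it would grow past the limit.
            if (hasRecord && currentBytes + recordBytes > limitBytes) {
                files.add(current.append(close).toString());
                current = new StringBuilder(open);
                currentBytes = wrapperBytes;
                hasRecord = false;
            }
            current.append(record);
            currentBytes += recordBytes;
            hasRecord = true;
        }
        if (hasRecord) {
            files.add(current.append(close).toString());
        }
        return files;
    }

    // Byte length in UTF-8, which can be larger than String.length().
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

Unlike a plain byte split, this never cuts a tag or a field value in half, at the cost of files that are usually somewhat smaller than the limit.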

Regards

Anupam  

rodrigoalejandro_pertierr
Active Contributor
0 Kudos

I will try to check it and let you know. Regards.

ambrish_mishra
Active Contributor
0 Kudos

Hi Rodrigo,

The main issue, if you want to split the file based on size, is this: is it fine if the information in one file is incomplete and the partial information goes into another file?

Ambrish

rodrigoalejandro_pertierr
Active Contributor
0 Kudos

By the way, it doesn't matter if the split cuts a segment or a field value in half. The program in the backend will assemble all the parts into one XML.

ambrish_mishra
Active Contributor
0 Kudos

Hi Rodrigo,

I would recommend a 2-step mapping. The first (graphical) mapping should create the entire target structure with all the records. In the second mapping (graphical, or it can be a Java mapping), you take the parent node (returned as XML), pass it to a UDF, do multi-mapping to create chunks of records, and write each chunk to a file with a counter in the file adapter...

Hope it helps!

Ambrish