SFTP duplicate file checking

monikandan_p
Active Participant

Dear Experts,

I have completed an SFTP->PI->File(FTP) development and it is working fine, but I do not want duplicate files to be processed.

Issue in Detail:

Every day our client places one file on the SFTP server, and duplicate files must not be processed. I have therefore enabled the duplicate file check in the sender channel configuration, so that every day PI picks only the latest file from SFTP.

PI picks the latest file and writes it to FTP, but after about two hours it picks some random duplicate file from the SFTP folder and writes that to FTP as well. I increased the modification check interval in the SFTP channel configuration, but the problem still persists.

Kindly suggest a solution if anybody has faced this issue.

Best Regards,

Monikandan.

Accepted Solutions (1)


azharshaikh
Active Contributor

Hi Monikandan,

My few cents:

1. Can you try changing the Processing Sequence to be based on Date instead of Name and check whether that helps?

2. You can use the Archiving option to move the processed file to another directory so that it does not get re-processed (see the sketch below for what that move amounts to).
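
The move itself is handled by the adapter's Archiving settings; purely to illustrate the idea, here is a standalone sketch using the JSch library, where the host, credentials, and paths are made-up placeholders and not anything taken from the PI channel:

import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class ArchiveProcessedFile {
    public static void main(String[] args) throws Exception {
        JSch jsch = new JSch();
        // Placeholder connection details - replace with your own SFTP server.
        Session session = jsch.getSession("piuser", "sftp.example.com", 22);
        session.setPassword("secret");
        session.setConfig("StrictHostKeyChecking", "no");
        session.connect();

        ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
        sftp.connect();

        // Move (rename) the processed file into an archive folder so the
        // sender channel never sees it again on the next poll.
        sftp.rename("/outbound/DAILY_REPORT.csv", "/outbound/archive/DAILY_REPORT.csv");

        sftp.disconnect();
        session.disconnect();
    }
}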

Regards,

Azhar

monikandan_p
Active Participant

Hi Azhar,

     Thanks for your reply.

2. We suggested the archive method to the client, but they will not accept it, and there is no option to delete the file from SFTP. That is why I implemented the duplicate file check in the SFTP channel configuration.

1. Kindly explain how to handle the Processing Sequence in the channel configuration.

Best Regards,

Monikandan

Former Member

You can delete the file after processing, right?

Answers (5)


monikandan_p
Active Participant

Dear All,

Finally, our client agreed to have the file deleted from SFTP once it has been processed by PI.

Thank you all for your suggestions.

Best Regards,

Monikandan


You can use the archive method in this case.

Varsha.


Hi,

In this case, you can use extended receiver determination: read all files with your sender channel, identify today's file, and make sure the target mapping processes only today's file as its first step. Then select the "Ignore error message" checkbox in the receiver determination of the ICO. If there are a huge number of files in your source directory this may not be a good idea, but it will certainly meet your requirement.
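
As a rough illustration of the filtering step inside the mapping, here is a plain-Java sketch; the file naming pattern REPORT_yyyyMMdd.csv is only an assumption, and in PI you would read the actual file name from the dynamic configuration rather than a hard-coded list:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TodaysFileFilter {
    public static void main(String[] args) {
        // Assumed naming convention: REPORT_yyyyMMdd.csv - adjust to the real pattern.
        String today = LocalDate.now().format(DateTimeFormatter.ofPattern("yyyyMMdd"));

        // Stand-in for the list of file names the sender channel picked up.
        List<String> filesInFolder = Arrays.asList(
                "REPORT_20140212.csv", "REPORT_20140213.csv", "REPORT_20140214.csv");

        // Keep only the file whose name carries today's date; older files are
        // treated as duplicates and ignored by the mapping.
        Optional<String> todaysFile = filesInFolder.stream()
                .filter(name -> name.contains(today))
                .findFirst();

        System.out.println(todaysFile.orElse("no file for today"));
    }
}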

Regards,

Umamahesh.

Former Member

Hi Monikandan,

Based on your requirement, the client places only one file a day and you have to process only that file.

The best way to handle this and avoid duplicate processing is to schedule your sender channel to run for only 10-15 minutes once a day, shortly after the file is created in the folder. For example, if the file is placed at 12:15 AM every day, schedule the channel from 12:20 AM to 12:30 AM. In PI the file processing itself will be fast.

Also make sure the polling interval in the channel is longer than the scheduled availability window, so that the channel does not poll a second time and pick the same file again.
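
Just to show the arithmetic behind that rule, a small sketch, where the window and interval values are example numbers only:

public class PollWindowCheck {
    public static void main(String[] args) {
        // Example values only: channel available 12:20-12:30 AM, i.e. a 10-minute window.
        int windowSeconds = 10 * 60;
        int pollingIntervalSeconds = 900; // 15 minutes, longer than the window

        // Polls start at t = 0, then every pollingIntervalSeconds while the window is open.
        int pollsInsideWindow = 1 + (windowSeconds - 1) / pollingIntervalSeconds;

        System.out.println("Polls inside the window: " + pollsInsideWindow); // prints 1
    }
}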

Hope this helps you to resolve the issue.

- Muru

monikandan_p
Active Participant

Dear Murugavel,

    Thanks for your reply

I have already scheduled the channel from 2 PM to 3 PM IST, but once PI has picked the latest file from SFTP it picks some random duplicate file again within that one-hour schedule.

Previously I scheduled it for 30 minutes, but in that case, since there are about 100 duplicate files in the SFTP folder, PI has to check each and every file for duplicates, and that duplicate check takes more than 30 minutes.

Kindly suggest some solution.

Best Regards,

Monikandan

dipen_pandya
Contributor

Hi,

It is better to use the archive method, as Azhar said.

Try to convince your client.

Regards,

Dipen.

Former Member

Hi Monikandan,

As your schedule is between 2 and 3 PM, set the polling interval in the channel to 7200 seconds, i.e. 2 hours.

That will prevent the channel from polling again within the scheduled time frame. Also, please let us know:

Are you picking all files from the source folder, or do you have a naming convention defined?

The files your client places daily: do they overwrite the existing file or create a new one?

Regards,

Muru

Former Member

I have used this feature and it definitely works.

As far as I know, it remembers the size, last-modified timestamp, and filename for each file.

These values are kept for 2 weeks by default, which can be changed by setting the advanced parameter duplicateCheckPersist (a value in minutes).

After this time, it will read the old files again!

Of course, if someone touches a file and changes its modification date, it will be read again.

And you should change your processing sequence to Date.

By the way, the "modification check" will not change the duplicate behaviour. It's used to prevent reading unfinished files.


It just reads a file once, waits for the number of seconds you configured, and then checks again whether the file has changed. If you have many files in a directory, a high value will cause a lot of waiting time...

Also check the SFTP adapter FAQ, OSS Note 1692819.
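
To make the duplicate check behaviour described above a bit more concrete, here is a plain-Java sketch of such a check, keyed on filename, size, and last-modified timestamp with a retention time like duplicateCheckPersist. This is only my understanding of the mechanism, not the adapter's actual code:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DuplicateCheckSketch {
    private final long persistMillis;                       // retention, like duplicateCheckPersist
    private final Map<String, Long> seen = new ConcurrentHashMap<>();

    public DuplicateCheckSketch(long persistMinutes) {
        this.persistMillis = persistMinutes * 60_000L;
    }

    // Returns true if this (filename, size, last-modified) combination was seen recently.
    public boolean isDuplicate(String fileName, long size, long lastModified) {
        long now = System.currentTimeMillis();
        // Forget entries older than the retention period - after that a file is read again.
        seen.values().removeIf(firstSeen -> now - firstSeen > persistMillis);

        String key = fileName + "|" + size + "|" + lastModified;
        return seen.putIfAbsent(key, now) != null;
    }

    public static void main(String[] args) {
        DuplicateCheckSketch check = new DuplicateCheckSketch(20160); // 2 weeks, the default retention
        System.out.println(check.isDuplicate("daily.csv", 1024, 1000L)); // false - first time
        System.out.println(check.isDuplicate("daily.csv", 1024, 1000L)); // true  - exact same file
        System.out.println(check.isDuplicate("daily.csv", 1024, 2000L)); // false - touched file, new timestamp
    }
}

That also shows why a touched file comes through again: the new timestamp produces a new key, exactly as described above.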

monikandan_p
Active Participant

Thanks for your reply Heiko,

I already checked with date and timestamp, but it is still picking a duplicate file, and on the client side no files in the SFTP folder are being modified.

Kindly guide to resolve.

Best Regards,

Monikandan.

Former Member

Actually, it does not matter whether you set those advanced parameters; in your case they are just for logging.

But have you tried setting duplicateCheckPersist to a larger number, such as 525600 (365 x 24 x 60 minutes, i.e. one year)?

Of course, it will read some files again at first, because it has forgotten that they were already read, but after that it should not happen again for a year.