Import Server (MDIS) parallel processing

Former Member · ‎09-16-2008

Hi,

We are running import server, and each night it processes MATMAS XML files from R/3 into our main MDM repository. I have been working to tune this process with some good results, but I have some questions about processing file chunks in parallel. I have lowered the chunk size (in mdis.ini) to 12000 and set the number of chunks that can be processed in parallel to 7. Here is the snippet from mdis.ini:

[ISKU_PRODUSTCAS69MSQL_9_8_4_3]

Chunk Size=12000

No. Of Chunks Proccessed In Parallel=7

I know "Proccessed" isn't spelled right, but MDM put it there, and I know MDIS is finding it because the following appears at the top of the import server log (I also tried correcting the spelling, which made no difference):

[ISKU_PROD] Import Task Started. Chunk size [12000], No. parallel chunks[7]

Now I know that most of our files have multiple chunks since these MATMAS files can be quite large. Here is an example of a section of import log.

1756 2008/09/15 05:44:32.494 Timer: name: Import Records - Stage 1 - Prepare Import Records total ms: 18892.865415 6

1756 2008/09/15 05:44:37.962 Timer: name: Import Records - Stage 2 - Filter/Merge total ms: 5406.211170 6

1756 2008/09/15 05:44:38.697 Timer: name: Import Records - Stage 3 [Time spent on MDS ] total ms: 729.913273 6

1756 2008/09/15 05:44:38.697 Timer: name: Import Records - Stage 4 - exception generation total ms: 0.004424 6

1756 2008/09/15 05:44:38.697 Import action: Skip: 0 Create: 0 Updated (NULL fields only): 0 Updated (all mapped fields): 12000 Replace: 0 Delete [destination]: 0

1756 2008/09/15 05:44:38.728 Timer: name: Import Chunk total ms: 81269.620471

The '6' at the end of each line means that the line refers to chunk number 6, and you can see this chunk contained 12000 records. Now for the question. Despite being told to process 7 chunks in parallel, MDIS processes each chunk one after the other, i.e. sequentially and using the same thread. The import log shows quite clearly one chunk being processed after another.
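One way to check this from the log itself is to turn each Timer line into a start/end interval and compare intervals across chunks. The following is a minimal sketch, not an MDIS tool: it assumes the first column is a thread ID, that each Timer line is written when its stage finishes (so start = timestamp minus "total ms"), and that the trailing number is the chunk number. The names `parse_timers` and `overlaps` are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed layout: <thread> <timestamp> Timer: name: <stage> total ms: <ms> [<chunk>]
TIMER_RE = re.compile(
    r"^(?P<thread>\d+) "
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"Timer: name: (?P<name>.+?) total ms: (?P<ms>\d+(?:\.\d+)?)"
    r"(?: (?P<chunk>\d+))?$"
)

# Timer lines copied from the log excerpt above.
SAMPLE = [
    "1756 2008/09/15 05:44:32.494 Timer: name: Import Records - Stage 1 - Prepare Import Records total ms: 18892.865415 6",
    "1756 2008/09/15 05:44:37.962 Timer: name: Import Records - Stage 2 - Filter/Merge total ms: 5406.211170 6",
    "1756 2008/09/15 05:44:38.697 Timer: name: Import Records - Stage 3 [Time spent on MDS ] total ms: 729.913273 6",
    "1756 2008/09/15 05:44:38.697 Timer: name: Import Records - Stage 4 - exception generation total ms: 0.004424 6",
    "1756 2008/09/15 05:44:38.728 Timer: name: Import Chunk total ms: 81269.620471",
]


def parse_timers(lines):
    """Group (stage name, start, end) intervals by chunk number.

    Lines without a trailing chunk number (e.g. "Import Chunk") are skipped.
    """
    by_chunk = defaultdict(list)
    for line in lines:
        m = TIMER_RE.match(line.strip())
        if not m or m.group("chunk") is None:
            continue
        end = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S.%f")
        # Assumption: the timestamp marks the end of the stage.
        start = end - timedelta(milliseconds=float(m.group("ms")))
        by_chunk[int(m.group("chunk"))].append((m.group("name"), start, end))
    return by_chunk


def overlaps(a, b):
    """True if two (name, start, end) intervals overlap in time."""
    return a[1] < b[2] and b[1] < a[2]


if __name__ == "__main__":
    for chunk, stages in parse_timers(SAMPLE).items():
        for name, start, end in stages:
            print(chunk, name, start.time(), end.time())
```

Run against the full log, any pair of intervals from different chunks for which `overlaps` returns True would indicate that two chunks were being worked on at the same time.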

I've been digging around in the documentation and the OSS notes but can't find anything to indicate why. I read the notes on streaming, but I don't think they apply since I'm only importing one file at a time. And it clearly knows the file has multiple chunks.

For reference, import server is running on a 4 CPU machine with 4GB of memory running Windows 2003 server (32-bit).

All suggestions gratefully received!

Mark

Former Member · ‎09-16-2008

This is queued parallel processing, so you will not see true parallel processing here. It works like a pipelined CPU architecture.

Say Chunk 1 is at stage 1 (Prepare Import Records).

Nothing else happens in parallel at this point.

Now Chunk 1 moves to stage 2 (Filter/Merge).

Then Chunk 2 can move to stage 1 (Prepare Import Records).

Now Chunk 1 moves to stage 3 (MDS).

Then Chunk 2 moves to stage 2 (Filter/Merge).

And Chunk 3 moves to stage 1 (Prepare Import Records).

And so on and so forth.

So if you draw a Gantt chart with all these tasks in swim lanes, you will see that over time up to 'n' tasks are running on different chunks at once, where 'n' is the value you set in the No. Of Chunks parameter.
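The Gantt chart idea can be sketched with a small simulation. This is only an idealised model of the pipeline described above, not MDIS's actual scheduler: it assumes each stage handles one chunk at a time, that a chunk enters a stage as soon as both the chunk and the stage are free, and that at most `max_in_flight` chunks are in the pipeline at once. The function names are hypothetical; the stage durations are the three largest stage timers from the log in the question, rounded to whole milliseconds.

```python
def pipeline_schedule(stage_durations, num_chunks, max_in_flight):
    """Return a list of (chunk, stage, start, end) for an idealised pipeline."""
    n_stages = len(stage_durations)
    end = [[0] * n_stages for _ in range(num_chunks)]
    schedule = []
    for c in range(num_chunks):
        for s, dur in enumerate(stage_durations):
            # The chunk must have finished its previous stage.
            chunk_ready = end[c][s - 1] if s > 0 else 0
            # The stage must have finished the previous chunk.
            stage_free = end[c - 1][s] if c > 0 else 0
            # A pipeline slot must be free before a new chunk may enter stage 1.
            slot_free = end[c - max_in_flight][-1] if s == 0 and c >= max_in_flight else 0
            start = max(chunk_ready, stage_free, slot_free)
            end[c][s] = start + dur
            schedule.append((c + 1, s + 1, start, end[c][s]))
    return schedule


def makespan(schedule):
    """Time at which the last stage of the last chunk finishes."""
    return max(stage_end for _, _, _, stage_end in schedule)


if __name__ == "__main__":
    # Stage 1, 2 and 3 durations in ms, rounded from the log in the question.
    stages = [18893, 5406, 730]
    for chunk, stage, start, stop in pipeline_schedule(stages, 3, 7):
        print(f"chunk {chunk} stage {stage}: {start:>6} -> {stop:>6} ms")
    print("pipelined :", makespan(pipeline_schedule(stages, 3, 7)), "ms")
    print("sequential:", makespan(pipeline_schedule(stages, 3, 1)), "ms")
```

Under these assumptions the longest stage sets the pace: when stage 1 dominates, as it does in the log from the question, the overlap between chunks is limited to the shorter stages 2 and 3, so the schedule can look almost sequential.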

Try checking when the stage 1 task started on Chunk 2 and see whether it overlaps stage 2 of Chunk 1. Please post the full logs; this is a very interesting observation!

Chunk size is like packet size in a communication protocol: the more records you have per chunk, the lower the overhead per record but the longer the wait time; the fewer records per chunk, the higher the overhead per record but the shorter the wait time. So yes, this can be tuned according to your available memory.
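The trade-off can be put into rough numbers. This is a sketch with made-up figures, not measured MDIS values: it assumes a fixed overhead per chunk and a constant cost per record, and it reads "wait time" as the time until one chunk has been fully processed. `chunk_tradeoff` and its parameters are hypothetical.

```python
import math


def chunk_tradeoff(total_records, chunk_size, overhead_ms, per_record_ms):
    """Return (number of chunks, overhead per record in ms, wait per chunk in ms)."""
    num_chunks = math.ceil(total_records / chunk_size)
    # Fixed per-chunk overhead spread across all records in the file.
    overhead_per_record = num_chunks * overhead_ms / total_records
    # Time until a single full chunk has been processed.
    wait_ms = overhead_ms + chunk_size * per_record_ms
    return num_chunks, overhead_per_record, wait_ms


if __name__ == "__main__":
    # Hypothetical file of 120000 records, 500 ms overhead per chunk, 2 ms per record.
    for size in (6000, 12000, 24000):
        print(size, chunk_tradeoff(120000, size, 500, 2))
```

With these assumed costs, halving the chunk size doubles the overhead per record while roughly halving the wait for each chunk.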

-Sudhir.
