Index Steps in Daily Batch

former_member206475
Active Participant

Dear All,

We have a Process Chain which runs daily in our Production system. As its performance is not good, we are planning to see whether we can take the Index Drop and Create steps, which take about an hour, out of the daily run and execute them at some other time.

My question is:

1. Is it good/possible to take these steps out of the daily run?

2. Will this actually improve performance?

3. Will the reporting/reports be impacted then?

4. If we plan to run them at some other time, will it impact the data in the InfoProviders?

Kindly suggest.

Regards

Zabi

Accepted Solutions (1)

karthik_vasudevan
Active Contributor

Hi Syed

Before answering your questions, I would like to know which step is taking a long time: creating the indexes or deleting them? And how many cubes are involved in the variant?

If deleting the indexes is what takes long, here are the answers to your questions.

     1. Is it good/possible to take these steps out of the daily run?

Don't remove the step from the process chain, because index deletion and creation is very important for data load performance. However, you could create a separate process chain containing only the delete index step, which would save you some time.
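The effect is easy to reproduce outside BW with any relational database. Here is a minimal sketch in Python with sqlite3 (a generic illustration, not SAP code): a bulk load runs faster when the secondary index is dropped first and rebuilt once at the end, because the database no longer has to maintain the index row by row.

```python
# Generic illustration (sqlite3, not SAP BW): why dropping a secondary
# index before a bulk load and rebuilding it afterwards beats loading
# with the index in place.
import sqlite3, time, random

def bulk_load(drop_index_first: bool) -> float:
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE fact (dim INTEGER, amount REAL)")
    con.execute("CREATE INDEX idx_dim ON fact (dim)")
    rows = [(random.randrange(10_000), random.random()) for _ in range(500_000)]

    start = time.perf_counter()
    if drop_index_first:
        con.execute("DROP INDEX idx_dim")                  # the "drop index" step
    con.executemany("INSERT INTO fact VALUES (?, ?)", rows)
    if drop_index_first:
        con.execute("CREATE INDEX idx_dim ON fact (dim)")  # the "create index" step
    con.commit()
    return time.perf_counter() - start

print(f"load with index in place      : {bulk_load(False):.2f}s")
print(f"drop, load, rebuild at the end: {bulk_load(True):.2f}s")
```

The exact numbers depend on the database and data volume, but this is the same pattern the drop/create index steps in the chain exploit.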

     2. Will this actually improve performance?

Removing the step will not improve performance. On the contrary, it will decrease it.

     3. Will the reporting/reports be impacted then?

Reports will be impacted if you haven't created the indexes after loading the data, since indexes play a main role in data selection.
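The query side can be sketched the same way (again plain sqlite3, purely illustrative): a selective filter on an indexed column is answered via the index, while the same filter without the index forces a full table scan on every query.

```python
# Generic illustration (sqlite3, not SAP BW): the same selective query
# with and without an index on the filter column. Without the index,
# every lookup scans the full table.
import sqlite3, time, random

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fact (dim INTEGER, amount REAL)")
con.executemany(
    "INSERT INTO fact VALUES (?, ?)",
    [(random.randrange(100_000), random.random()) for _ in range(500_000)],
)

def timed_lookups() -> float:
    start = time.perf_counter()
    for d in range(100):                                  # 100 selective lookups
        con.execute("SELECT SUM(amount) FROM fact WHERE dim = ?", (d,)).fetchone()
    return time.perf_counter() - start

print(f"without index: {timed_lookups():.2f}s")           # full table scans
con.execute("CREATE INDEX idx_dim ON fact (dim)")
print(f"with index   : {timed_lookups():.2f}s")           # index range scans
```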

     4. If we plan to run them at some other time, will it impact the data in the InfoProviders?

This is what I would recommend. You are right: you can delete the indexes before any process chain starts, but after business hours.

If the index creation job is what takes a long time, then you should get help from the Basis team as well, since this is affected by the database configuration.

Let us know first which step is causing the issue; then we can discuss further.

Regards

Karthik

former_member206475
Active Participant

Hi Karthik,

It is the index creation step that takes more time.

karthik_vasudevan
Active Contributor

OK Syed. Then we need some more information.

1) How many cubes are included in that create index variant?

2) Are they full loads, delta loads, or both?

3) How many hours does it take to complete the step? If you check the job details, you can even see which individual cube takes the most time.

4) Do you have historical data in the cubes, or just very recent data? If you have historical data, you could get business approval to delete it.

5) Do you also compress the cubes weekly? Are there aggregates built on them?

Regards

Karthik

Former Member

Hi Syed,


Deleting and creating indexes are the processes that improve load and query performance.

Before data is loaded into a cube, the drop index step deletes the cube's indexes to improve load performance.

If you do not delete the indexes, they also occupy space in the database.

After the data load, we create the indexes on the cube again to improve query performance, allowing faster retrieval of records from the cube.


Thanks

Vivek

Former Member

Hi Syed,

Are there any other process chains that update data to the same cubes used in your problematic process chain? I mean, when the create index step is running for a cube and a DTP is writing data to it at the same time, there will be locking issues that lead to delays. This generally happens with hourly loads.
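The locking delay can be pictured with a generic sketch (Python threading, not the actual database lock manager): when two jobs serialize on the same table lock, each one's elapsed time includes the other's hold time.

```python
# Generic sketch of lock contention (Python threading, not the real DB
# lock manager): an "index build" and a "data load" that need the same
# table lock run one after the other, so each job's elapsed time
# includes the other's hold time.
import threading, time

table_lock = threading.Lock()

def job(name: str, hold_seconds: float) -> None:
    start = time.perf_counter()
    with table_lock:                  # both jobs compete for the same lock
        time.sleep(hold_seconds)      # simulated work while holding it
    print(f"{name}: done after {time.perf_counter() - start:.1f}s "
          f"(own work: {hold_seconds:.1f}s)")

jobs = [threading.Thread(target=job, args=("create index", 2.0)),
        threading.Thread(target=job, args=("DTP load", 2.0))]
for t in jobs:
    t.start()
for t in jobs:
    t.join()
```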

Another point: if the cube data is huge and not compressed, index creation will take a long time.

former_member206475
Active Participant

Hi,

It seems best to run the following outside the chain, separately and beforehand:

index drop

change log delete

cube contents delete

PSA delete

Can I run these before the main chain runs? Will doing this harm the chain run somehow?


karthik_vasudevan
Active Contributor

Hi Syed

These are housekeeping activities. You could even schedule them once a week, when your normal transaction data loads don't run.

And yes, you could also create a separate process chain for this. It will definitely help reduce the runtime of your main chain. There is no harm in doing it.

Regards

Karthik

former_member206475
Active Participant

Thanks. I also wanted to understand the dependency between compression and SAP archiving:

If I compress the cube data, can this affect the archiving process?

We are also planning to implement ADK archiving on huge cube tables.

So if compression is performed regularly, can I still archive the compressed data, then remove that data from the cube and reload it whenever required?

Former Member

Hi Syed,

To archive, you need to compress the data, because only the E fact table (compressed) can be archived.

If your BW version is 7.3, this prerequisite no longer applies.

You can search SCN for more information on both topics.
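Conceptually, compression collapses the request-level records of the F fact table into the request-free E fact table. A toy sketch (plain Python; the field names are invented for illustration):

```python
# Toy sketch of cube compression (plain Python, field names invented for
# illustration): the F fact table keeps one record per request and
# dimension combination; compression drops the request ID and sums the
# key figures into the E fact table, leaving fewer rows to scan or archive.
from collections import defaultdict

# F table rows: (request_id, material, plant, quantity)
f_table = [
    (1, "MAT01", "PL01", 10.0),
    (2, "MAT01", "PL01", 5.0),   # same dims, different load request
    (2, "MAT02", "PL01", 7.0),
    (3, "MAT01", "PL01", 3.0),
]

e_table = defaultdict(float)
for request_id, material, plant, qty in f_table:
    e_table[(material, plant)] += qty    # the request ID is dropped

print(dict(e_table))
# {('MAT01', 'PL01'): 18.0, ('MAT02', 'PL01'): 7.0}  -- 4 rows become 2
```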

And regarding your other question:

It seems best to run the following outside the chain, separately and beforehand:

index drop

change log delete

cube contents delete

PSA delete

Can I run these before the main chain runs? Will doing this harm the chain run somehow?

You can run housekeeping jobs like change log, cube, and PSA data deletion separately (if they are weekly/monthly runs, or deletions of older data).

Since the delete index step is not consuming much time (from your previous post), it is better to keep it in the same process chain, because if the indexes are not dropped before data loading, the load will take longer. That is why it is best practice to include a delete index step at the beginning of the process chain and a create index step at the end (which restores report performance).
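Put together, the split would look like this (a schematic sketch in plain Python; the chain layout and step names are illustrative, not taken from your system):

```python
# Schematic of the recommended split (plain Python, names invented for
# illustration): housekeeping runs in its own chain before the main batch,
# while delete/create index stay wrapped around the loads in the main chain.
housekeeping_chain = [          # scheduled a few hours before the main batch
    "delete change log",
    "delete cube contents",     # only where the business has approved it
    "delete PSA",
]

main_chain = [
    "delete index",             # first: loads run faster without indexes
    "load infopackages / DTPs",
    "create index",             # last: restores query performance
]

assert main_chain.index("delete index") < main_chain.index("load infopackages / DTPs")
assert main_chain.index("create index") > main_chain.index("load infopackages / DTPs")
print("housekeeping:", housekeeping_chain)
print("main batch  :", main_chain)
```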

former_member206475
Active Participant

Hi Jyoti,

The plan is to schedule a separate chain with these housekeeping jobs every day, a few hours before the main batch.

If I do this and the same steps still exist in the main batch, will that be OK? Or should I make sure to remove those steps from the main chain, since the separate chain will take care of them?

Will keeping those steps in the main chain cause any overhead on the main batch, and will they still take some runtime?

former_member206475
Active Participant

Hi Karthik,

There are 4 local chains which are scheduled to run sequentially, one after the other.

However, I see no dependency between them. So if I redesign the batch to run these 4 local chains in parallel, will this help in any way? What benefit can be achieved?

Former Member

Hi Syed,

It's better to remove the steps, since the idea is to reduce the process chain runtime. But if you don't want to make that change in development and import it to production (considering the development hours), then you can leave your main chain unchanged: as the earlier chain already deletes the change log, PSA, and cube data, those steps in the main chain won't take much time, because the data has already been deleted.

karthik_vasudevan
Active Contributor

As you are doing the housekeeping activities before these four chains run, you could run the chains in parallel, provided you have enough resources.

As suggested by Jyothi, please do not make any changes to the existing chains; the process variants should remain the same. Those steps will actually take very little time.

I have the same setup in my project, and it really helps a lot in reducing the runtime, especially the delete index step.

Please have a look at this document as well, which might help you in some scenarios.

Regards

Karthik

former_member206475
Active Participant

Thanks Jyoti.

I had asked a question above on the topic of parallel processing.

In the system there is one application server with 14 background processes.

There are 4 long-running local chains which currently run one after the other.

However, as there is no dependency between them, I would like to run all 4 in parallel. Is this a good choice? Will it help, or make things worse?

karthik_vasudevan
Active Contributor

Hi Syed

As you have only 14 background processes, it is not wise to run all four process chains at the same time. You could try running two at a time.

Or try running all four at once and see whether there is any improvement; only when you run it will you know which steps take more time during parallel runs.

If one process chain has an InfoPackage that runs for two hours, you could use this time to run another chain. So based on the steps in all four chains and the time they take, you can decide on this.

Regards

Karthik

Former Member

Hi Syed,

It is not the best approach to run 4 process chains in parallel, especially when there is no load balancing (only one application server), because each process chain may have multiple steps (e.g. delete index, InfoPackage, DTP, and so on, as per your requirement).

When you run 4 process chains in parallel, every process variant from every chain runs as a background job, which may occupy all of your background processes. That is not only a huge load on the system, it will also degrade system performance; you will end up seeing each chain take much longer to finish.

Please run the process chains one after another, and make sure you schedule them with a certain time gap, maybe 30 minutes, depending on your business call.
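How much the chains would compete for the 14 background processes can be estimated with a toy queueing model (plain Python; the step counts and runtimes are invented for illustration): run in parallel, each chain's steps queue behind the others', so an individual chain takes noticeably longer than its own work time.

```python
# Toy queueing model (plain Python, step counts and runtimes invented for
# illustration): four chains fan their steps out to the 14 available
# background processes. Run in parallel, the chains compete for slots, so
# each individual chain finishes well after its own work time.
from concurrent.futures import ThreadPoolExecutor
import time

BGD_PROCESSES = 14     # background work processes on the single app server
STEPS_PER_CHAIN = 8    # process variants each chain fans out in parallel
STEP_SECONDS = 0.25    # simulated runtime of one step

def run_chain(name: str, bgd_pool: ThreadPoolExecutor) -> None:
    start = time.perf_counter()
    steps = [bgd_pool.submit(time.sleep, STEP_SECONDS)
             for _ in range(STEPS_PER_CHAIN)]
    for step in steps:
        step.result()
    print(f"{name}: {time.perf_counter() - start:.2f}s elapsed "
          f"for {STEP_SECONDS:.2f}s steps")

for chains_at_once in (1, 4):
    print(f"--- {chains_at_once} chain(s) at a time ---")
    with ThreadPoolExecutor(max_workers=BGD_PROCESSES) as bgd:
        with ThreadPoolExecutor(max_workers=chains_at_once) as scheduler:
            jobs = [scheduler.submit(run_chain, f"chain_{i}", bgd)
                    for i in range(1, 5)]
            for job in jobs:
                job.result()
```

In a real system the competing jobs also fight over memory and database resources, so the slowdown is usually worse than this model shows.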
