Partition vs compression

Aummad
Participant

Hi,

Could someone please explain to me what the relation is between partitioning and compression in a cube?

Thanks

Accepted Solutions (1)


Former Member

You've gotten some good responses, so I won't repeat them, but I would just like to clarify the compression process a bit. It is NOT actually "compressing" the data in the same sense as when we use database compression or file zipping and replace repeating values with tokens.

In BW, the compression process moves the data from the F fact table to the E fact table. Along the way, it removes the Request ID from the fact rows (actually, I believe it just sets the value to 0). This allows the data in the cube (to varying degrees) to be summarized further, reducing the number of fact records that end up in the E fact table. So the compressed E fact table can be smaller due to the reduced number of rows in it, not because the data has actually been "compressed".

e.g. Let's say you load store sales data to a cube every day of the month and the time grain of this cube is Calendar Month. By the end of the month, you would have 30 Requests in the F fact table - each Request representing a day's sales. So each store has 30 rows of data for that month. Now if we compress the cube and remove the 30 different Request IDs, the compression process will summarize those 30 rows to just 1 row of sales data for each store (remember the time grain is Calendar Month), resulting in a 30 to 1 reduction in the data volume after compression (assuming each store has sales every day of the month). It's pretty easy to see that this reduction in the number of rows a query must read should result in faster query execution.
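To make that 30-to-1 example concrete, here is a small sketch of what compression does logically: group by the characteristics after forcing the Request ID to 0, and sum the key figures. This is just an illustration in Python - the table layout, field names, and values are made up, not actual BW internals.

```python
from collections import defaultdict

# Illustrative F fact table: 30 daily requests for one store, one month
f_table = [
    {"request_id": r, "store": "S1", "calmonth": "2024-01", "sales": 100}
    for r in range(1, 31)
]

def compress(rows):
    """Summarize rows after setting request_id to 0 (what BW compression does logically)."""
    totals = defaultdict(int)
    for row in rows:
        key = (0, row["store"], row["calmonth"])  # Request ID becomes 0
        totals[key] += row["sales"]
    return [
        {"request_id": k[0], "store": k[1], "calmonth": k[2], "sales": v}
        for k, v in totals.items()
    ]

e_table = compress(f_table)
print(len(f_table), "->", len(e_table))   # 30 -> 1
print(e_table[0]["sales"])                # 3000
```

The sales total is unchanged; only the number of rows a query has to read drops.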

So the degree of compression depends on the granularity of your cube AND the data. Some cubes, due to the data model, can yield little to no reduction in the number of rows (e.g. a Material inventory cube). Also - some SAP documentation refers to Compression as Condensing.

The compression process can optionally perform Zero Elimination, which can further reduce the number of rows in the compressed fact table (again depending on the data). Rows in the F fact table where all the KF values are zero are eliminated, as are any rows where the characteristic values match and all the KFs net to 0 once the rows are summarized. Zero Elimination even goes to the effort of eliminating previously compressed rows in the E fact table that had a KF with non-zero values but, after being updated with newer data, now have all KFs = 0.
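A quick sketch of the Zero Elimination idea, under the same toy model as above (made-up field names, not BW code): after summarizing by characteristics, any row whose key figures all net to 0 is simply not written to the E table.

```python
from collections import defaultdict

def compress_with_zero_elimination(rows):
    """Summarize by characteristics, then drop rows whose key figure nets to 0."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["store"], row["calmonth"])] += row["qty"]
    # Zero Elimination: rows where all key figures are 0 are eliminated
    return {k: v for k, v in totals.items() if v != 0}

rows = [
    {"store": "S1", "calmonth": "2024-01", "qty": 5},
    {"store": "S1", "calmonth": "2024-01", "qty": -5},  # reversal; nets to 0
    {"store": "S2", "calmonth": "2024-01", "qty": 3},
]
result = compress_with_zero_elimination(rows)
print(result)  # only the S2 row survives; S1's posting and reversal cancel out
```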

Hope this helps.

Answers (4)


Former Member

Hi Rob,

Compression: The request ID is deleted from the cube and the data is moved from the F table to the E table. Compression is used to reduce the size of the data in cubes.

Partition: It's done using one of the two criteria 0CALMONTH or 0FISCPER. We use partitioning to improve query performance.

Partition works on the E Table. So for your partition to work, you need to compress the data first.

Regards,

Victor

Former Member

Hi Rob,

I can only talk about my experience with Oracle databases. The following may or may not be true for other databases.

First you have to remember that there are E* and F* tables for an InfoCube. For each request loaded to an InfoCube, the F* table gets a new partition. If you compress the InfoCube, the F* partitions are merged into the (usually) one E* partition. So during the lifecycle of an InfoCube you create partitions each time you load a new request, and you remove partitions when you compress.

You can define partitions for the E* table. These partitions are usually based on a timescale. They are always dependent on an InfoObject of the cube. You can only delete these partitions by deleting the content of the E* tables and reloading again (at least if you stay within the BW tools).

Partitions - regardless of E* or F* type - can improve performance, because Oracle is able to collect data from partitions in parallel.

Kind regards,

Jürgen

Former Member

Hi,

Partitioning: You use partitioning to split the total dataset for an InfoProvider into several smaller, physically independent, and redundancy-free units. This separation improves system performance when you analyze data or delete data from the InfoProvider.

Pre-requisite:

You can only partition a dataset using one of the two partitioning criteria 'calendar month' (0CALMONTH) or 'fiscal year/period' (0FISCPER). At least one of the two InfoObjects must be contained in the InfoProvider.

Compression: When you load data into the InfoCube, entire requests can be added at the same time. Each of these requests has its own request ID, which is included in the fact table in the package dimension. This makes it possible to pay particular attention to individual requests. One advantage of the request ID concept is that you can subsequently delete complete requests from the InfoCube.

However, the request ID concept can also cause the same data record (where all characteristics are the same except the request ID) to appear more than once in the fact table. This unnecessarily increases the volume of data and affects system performance when you analyze data, since each time you execute a query, the system has to perform aggregation using the request ID.

You can eliminate these disadvantages by compressing data and bringing data from different requests together into one single request (request ID 0).


Hope it helps....

Priti

Former Member

Hi,

There is no relation between partitioning and compression of the cube, except that both are used to increase query performance.

One other connection is that both are done on the E fact table of the InfoCube.

REDDY

Former Member

Think of partitions as being mini cubes per partitioned characteristic at the database level.

So if you run a query across two fiscal periods - the SQL gets split into two and hits two small cubes in parallel.

(in reality it doesn't really work that way - but conceptually think of it like that)

Thus if any query uses fiscal period as a range in the filter, the response time is super quick (of course with BIA this all goes out the window).

You can only partition the compressed portion of the cube.

i.e. again as explained in the post above - there are logically two cubes - uncompressed and compressed (and the compressed one, as mentioned above, is partitioned across disk).
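The "mini cubes" picture above can be sketched as partition pruning: a toy model where each partition holds one fiscal period's rows, so a query filtered on a period range only scans the matching partitions (which the database can also read in parallel). The names and layout here are illustrative, not actual database or BW structures.

```python
# Toy E fact table range-partitioned on fiscal period (0FISCPER-style keys)
partitions = {
    "2024001": [{"sales": 10}, {"sales": 20}],
    "2024002": [{"sales": 30}],
    "2024003": [{"sales": 40}],
}

def run_query(wanted_periods):
    """Scan only the partitions whose key falls inside the filter range."""
    touched = sorted(p for p in partitions if p in wanted_periods)
    total = sum(row["sales"] for p in touched for row in partitions[p])
    return touched, total

touched, total = run_query({"2024001", "2024002"})
print(touched)  # the third partition is never read
print(total)    # 60
```

Without the fiscal-period filter, every partition would have to be scanned - which is exactly why a range filter on the partitioning characteristic makes the response so quick.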