on 01-06-2015 2:54 PM
Good day everyone.
I'm new to SAPDS, starting to build a DW from a retail sales system. What is the "best/must used/common" method to make the delta load of the fact table.
I've done delta load for dimension tables using table comparison, but I believe table comparision for fact tables is not a good idea.
Thanks in advance.
Hi Emerson,
why do you think, table comparison is not a good idea?
If you want to have a delta load and you do not have any CDC support from the database, you have to use the table comparison (or create your own comparison in a custom transform or validation).
However, do you really need a delta load on the fact table? It is much easier to create a full load.
Regards
Severin
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hello Severin, thanks for fast reply.
I believe table comparison does a full scan at source and target then calculates the difference, so for a sales fact containing millions of records I think could be not a good idea (if my perception is wrong then I can go with table comparison transform).
Well I mean delta load for loads done after initial full load, so after a full load it only inserts the records that have been changed since last load.
Everything you want to know about the DS Table_comparison transform can be found in
I use that transform both for target CDC of both facts tables and dimensions. If you want to achieve highest performance, you've got to combine with source CDC (based on last_modification timestamp, sequence number, log table, db-based CDC...).
Hello Dirk,
Thanks for your reply and sharing the links, it will help to better understand DS. Will give it a try to the table comparison transform.
For this project target database is IQ 16 and initial load of fact sales data is about 15 million rows. So I'll see if performance is acceptable or will try one of the other methods mentioned here.
Regards.
table comparison for fact table? never ever, table comparison is effective, but not that much. What if you FT consist of 1 billion or more rows? Your DBA will not be happy, if you compare data based on index column, then yes, you can achieve good results. Best option how to implement delta load for fact table is to fix the timestamp of fact table and load anly those where Timestamp is greater than one fixed in fact table.
So.
1. Create a Local variable in script and find the Max timestamp value (there should be one)... select max(timestamp_col) from MyFDacttable
2. Then create DF which extracts data from Source system to target and add this vavariable to WHERE tab of Query.
By implementing this you will extract only those values where timestamp is greater than existing one... You can implement the same for multiple columns
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Another possibility is to use the CDC transformation. But you need a supported database (e.g. ORACLE or SQL Server) and have to activate CDC in the database.
The advantage is, that the DS just loading the new or different rows. The disadvantage is, that the database need a bit more time to insert / update or delete database rows.
User | Count |
---|---|
84 | |
24 | |
12 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.