
Impact and Lineage Analysis setup for a DWH

Former Member

Hello,

My department is responsible for reporting on risk at various levels (portfolios, industries, countries).

For this we collect, on a daily/weekly/monthly basis:

  1. data (csv) on outstanding amounts for all loans/facilities from all our offices,
  2. data (xml) on customer data, including risk rating,
  3. data (xml) on reference data, like countries, statuses, risk levels, etc.,
  4. market data (fx, interest rates, equity listing info),
  5. country risk data (csv).

As an information architect I'm responsible for maintaining the data models around these data.

For a more efficient overview in case of issues, we would also like to make use of the Impact and Lineage Analysis capabilities.

I have already made five conceptual models, one for each of these sources, representing their initially loaded formats, plus one for the data warehouse where everything is collected and cleansed, linking to the entities in the five loaded-file models.

So far so good.

I could continue making new models for an extended data warehouse and finally for the data marts. The point is that this collection of CDMs will ultimately not help me achieve the following:

  • I want to move to an LDM to help my clients (analysts) find their FK columns (and to enable me to add additional information on those relationships), and to introduce extra columns for keying and the history mechanism.
  • I want to insert the processes that glue these models together (merely as stepping stones to documentation outside PD).
  • I want to be able to perform impact and lineage analysis.

I stumbled upon a video demo of an airline ticket example which made it sound rather simple. But I am currently stuck on the following questions:

  1. The data from source 1 (above) are the "real ones": they can be cleansed, mapped onto other values, and linked to a description via the other sources (2-5), which represent the master data around them. But if I draw an FK column to the referenced entity, I lose the information referencing the column back to a column in source 1. If I stick to the column as delivered, I cannot express cases of mapping (data being changed to an internal bank-wide code), nor that the code refers to a category of reference data (which we will probably want to drill down into in the reporting and data mart models). So in my lineage I would like to be able to see, for example, a local product code as delivered by the client in its loans file > mapped to an internal product code stemming from a file that maps the client's local code to the bank-wide product code > and referencing a list of products plus descriptions held in yet another file. So the lineage should fork back three ways for the data item in this example (see the sketch after this list). How should I tackle this case? Ideally I want to solve this puzzle using only a project containing a set of LDMs glued together by a PDM. Please let me know if I have to use Data Movement Models instead to achieve this, but from what I have read about them, they look too much intended for setting up ETL, which is again too much detail.
  2. I only want to use the PDM as glue; I would rather not have to set up a whole list of input items and output items for it. Is there a way? Or does the Mapping Editor already kick in here?
  3. The lineage should especially help me find the lineage of data items appearing in the final models (data marts/reports). Any solution to questions 1 and 2 that does not fulfill this lineage requirement is ultimately not useful, I'm afraid.
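
As a sketch of what I mean by the three-way fork in question 1 (purely illustrative; the file and column names are hypothetical, not actual PowerDesigner objects):

```python
# One data-mart column whose lineage forks back three ways.
lineage = {
    "datamart.loan.product_code": [
        "loans_file.local_product_code",        # as delivered by the office (source 1)
        "product_mapping_file.bankwide_code",   # local code mapped to the bank-wide code
        "product_reference_file.product_code",  # list of products + descriptions
    ]
}

def upstream(column):
    """Return every upstream column feeding the given data-mart column."""
    return lineage.get(column, [])

print(upstream("datamart.loan.product_code"))
```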

If people have some answers for me, or can point me in the right direction for additional information, that would be great! Thanks a lot in advance.

Wim de Groot

Accepted Solutions (1)


Former Member

Hello Wim

It looks to me as if you need to build a Data Movement Model (DMM). I haven't done one before, so I tried a quick experiment with your example.

I created a DMM, and a simple PDM to represent your Data Warehouse. The DMM has a single Transformation Process, which contains a number of steps:

  • See "Data Movement Model.jpg".

I've created a diagram to describe the first transformation, "Import Loan File".

  • See "Process A.jpg"

It'll take some experimentation to get the process levelling right - I think you could get several of your processes into a single transformation task.

I wasn't able to attach a copy of the project folder - if anyone wants it, please ask.

Former Member

Thanks George, and apologies for my late response.

Although I initially thought a DMM would be too detailed, I may be wrong.

I will reconsider with the help of your response.

Thanks very much so far

Cheers

Wim

Answers (1)


Former Member

Wim,

Without some visual examples I'm hard pressed to see your issues clearly. There (probably) isn't an easy answer to your questions. Mappings in PD do have challenges, and generation links, while having fewer limitations, can only be driven by transformation & generation. PD has only a very basic framework for things like implementing surrogate key transformations, mappings & look-ups. (PD customization can help here, by the way, but programming transformations in particular can be tricky.)

Former Member

Hi Martijn,

Thanks for your answer.

Let me focus on one particular question:

Example:

I have a file 1 (Loan) with loan info. One of the columns is country_code and contains NLE.

During/after uploading this file with process A, this code is converted via process B into NL, using data in a mapping file 2 (country mapping). This new code is present in a lookup file 3 (Country-iso2). The check on this reference is done via process C.

After this staging, the loans are uploaded into file 4 in a data warehouse via process D.

And from there into file 5 in a data mart via process E.

I would like to create a project where the above tables, within their separate models, are glued together with the processes.

And via lineage of the loan data mart I would like to see a diagram where files 1-5 and processes A-E line up.

I hope this example helps as a "visualisation".
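
In code terms the chain would look something like this (purely illustrative; the file, column and process names are stand-ins for the real jobs):

```python
loan_file = [{"loan_id": 1, "country_code": "NLE"}]   # file 1 (Loan), uploaded by process A
country_mapping = {"NLE": "NL"}                        # file 2 (country mapping)
country_iso2 = {"NL": "Netherlands"}                   # file 3 (Country-iso2 lookup)

def process_b(rows):
    """Convert the delivered country code using the country mapping (file 2)."""
    return [{**row, "country_code": country_mapping[row["country_code"]]} for row in rows]

def process_c(rows):
    """Check every converted code against the Country-iso2 lookup (file 3)."""
    for row in rows:
        if row["country_code"] not in country_iso2:
            raise ValueError("unknown country code: " + row["country_code"])
    return rows

# Processes D and E would load into the data warehouse (file 4) and the data mart (file 5);
# here they are simple pass-throughs.
warehouse_loans = process_c(process_b(loan_file))   # file 4
datamart_loans = list(warehouse_loans)              # file 5
print(datamart_loans)   # [{'loan_id': 1, 'country_code': 'NL'}]
```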

Any help is much appreciated!

Cheers

Wim

Former Member

The mapping route would create four models: 1, 2, 3 and 4. The processes are not explicitly modeled this way.

The only interesting thing is the lookup mapping:

You map loan.loan_id to target.loan_id.

You map country.country-iso2 to target.country-iso2.

In the Mapping Editor, in the criteria pane of your mappings, you enter loan.country_code = country mappings.country_code.

This way you map the look-up using the Mapping Editor in PowerDesigner.

(I'd remark here that I would automate and architect this type of look-up process differently and not represent it as a mapping exercise, but YMMV).
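
Outside the tool, those three mapping-editor entries express roughly the following lookup join (a hypothetical Python sketch; the table and column names follow the country-code example above, and I've collapsed the mapping and lookup into one table for brevity):

```python
# Hypothetical rows, following the country-code example above.
loans = [{"loan_id": 1, "country_code": "NLE"}]
country_mappings = [{"country_code": "NLE", "country-iso2": "NL"}]

# The criteria-pane entry (loan.country_code = country mappings.country_code)
# acts like a join condition; the two column mappings select the target columns.
target = [
    {"loan_id": l["loan_id"], "country-iso2": m["country-iso2"]}
    for l in loans
    for m in country_mappings
    if l["country_code"] == m["country_code"]
]
print(target)  # [{'loan_id': 1, 'country-iso2': 'NL'}]
```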

I have an example file if you're interested.

http://s7.postimg.org/kyarzngx7/mapping_PD_example_lookup.png

Former Member

Martijn, thanks very much for your answers!

I dived into it and think I grasped it.

On the other hand, I am more and more losing the grip of just having a straightforward set of data models. Instead I got stuck in a jungle of shortcuts/replicas, Mapping Editor and lineage stuff, where I don't see how I will make this transparent for myself, not to mention my expected audience.

So I think I will leave it for now. Maybe this will work out for me in the future.

But thanks again for explaining, much appreciated.

Cheers

Wim

Former Member

Wim,

You're welcome. Of course, the discussion on how to architect, model and maintain your metadata is another ballgame. Don't let PowerDesigner or ER 'modeling' be your guide here, because they don't make you focus on the correct aspects. I always use PD despite its shortcomings, not because of its conceptual strength. Feel free to contact me if you want to discuss architecture and modeling instead of techniques & tools.