
Impact and Lineage Analysis setup for a DWH

Former Member

Hello,

My department is responsible for reporting on risk at various levels (portfolios, industries, countries).

For this we collect, on a daily/weekly/monthly basis:

  1. data (csv) on outstanding amounts for all loans/facilities from all our offices,
  2. data (xml) on customer data, including risk rating,
  3. data (xml) on reference data, like countries, statuses, risk levels, etc.,
  4. market data (fx, interest rates, equity listing info),
  5. country risk data (csv).

As an information architect I'm responsible for maintaining the data models around these data.

For a more efficient overview in case of issues, we would also like to make use of the Impact and Lineage Analysis capabilities.

I have already made five conceptual models, one for each of these sources, representing their initially loaded formats, plus one for the data warehouse where everything is collected and cleansed, linking to the entities in the five loaded-file models.

So far so good.

I could continue making new models for an extended data warehouse and finally for the data marts. The point is that this collection of CDMs will ultimately not help me achieve the following:

  • I want to move to an LDM to help my clients (analysts) find their FK columns (and to enable me to add additional information on those relationships), and to introduce extra columns for keying and the history mechanism.
  • I want to insert the processes that glue these models together (merely as stepping stones to documentation outside PD).
  • I want to be able to perform impact and lineage analysis.

I stumbled upon a video demo of an airline ticket example which made it sound rather simple. But I am currently stuck on the following questions:

  1. The data from source 1 (above) are the "real ones": they can be cleansed, mapped onto other values, and linked to a description via the other sources (2-5), which represent the master data around them. But if I draw an FK column to the referenced entity, I lose the information referencing the column back to a column in source 1. If I stick to the column as delivered, I cannot express cases of mapping (data being changed to an internal bank-wide code), nor that the code refers to a category of reference data (which we will probably want to drill down into in the reporting and data mart models). So in my lineage I would like to be able to see, for example, a local product code as delivered by the client in its loans file > mapped to an internal product code stemming from a file that maps the client's local code to the bank-wide product code > and referencing a list of products plus descriptions held in yet another file. So the lineage should fork back three ways for the data item in this example (see the sketch after this list). How should I tackle this case? Ideally I want to solve this puzzle using only a project containing a set of LDMs glued together by a PDM. Please let me know if I have to use Data Movement Models instead to achieve this, but from what I have read about them, they look too much intended for setting up ETL, which is again too much detail.
  2. I only want to use the PDM as glue; I would rather not have to set up a whole list of input items and output items for it. Is there a way? Or does the Mapping Editor already kick in here?
  3. The lineage should especially help me find the lineage of data items appearing in the final models (data marts/reports). Any solution to questions 1 and 2 that does not fulfill this lineage requirement is ultimately not useful, I'm afraid.
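
As a sketch of what I mean by the three-way fork in question 1 (purely illustrative; the file and column names are hypothetical, not actual PowerDesigner objects):

```python
# One data-mart column whose lineage forks back three ways.
lineage = {
    "datamart.loan.product_code": [
        "loans_file.local_product_code",        # as delivered by the office (source 1)
        "product_mapping_file.bankwide_code",   # local code mapped to the bank-wide code
        "product_reference_file.product_code",  # list of products + descriptions
    ]
}

def upstream(column):
    """Return every upstream column feeding the given data-mart column."""
    return lineage.get(column, [])

print(upstream("datamart.loan.product_code"))
```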

If people have some answers for me, or can point me in the right direction for additional information, that would be great! Thanks a lot in advance.

Wim de Groot

Accepted Solutions (1)


Former Member

Hello Wim

It looks to me as if you need to build a Data Movement Model (DMM). I haven't done one before, so I tried a quick experiment with your example.

I created a DMM, and a simple PDM to represent your Data Warehouse. The DMM has a single Transformation Process, which contains a number of steps:

  • See "Data Movement Model.jpg".

I've created a diagram to describe the first transformation, "Import Loan File".

  • See "Process A.jpg"

It'll take some experimentation to get the process levelling right - I think you could get several of your processes into a single transformation task.

I wasn't able to attach a copy of the project folder - if anyone wants it, please ask.

Former Member

Thanks George, and apologies for my late response.

Although I initially thought a DMM would be too detailed, I may be wrong.

I will reconsider with the help of your response.

Thanks very much so far

Cheers

Wim

Answers (1)


Former Member

Wim,

Without some visual examples I'm hard pressed to see your issues clearly. There (probably) isn't an easy answer to your questions. Mappings in PD do have challenges, and generation links, while having fewer limitations, can only be driven by transformation & generation. PD has only a very basic framework for things like implementing surrogate key transformations, mappings & look-ups. (PD customization can help here, by the way, but programming transformations in particular can be tricky.)

Former Member

Hi Martijn,

Thanks for your answer.

Let me focus on one particular question:

Example:

I have a file 1 (Loan) with loan info. One of the columns is country_code and contains NLE.

During/after uploading this file with process A, this code is converted via process B into NL, using data in a mapping file 2 (country mapping). This new code is present in a lookup file 3 (Country-iso2). The check on this reference is done via process C.

After this staging, the loans are uploaded into file 4 in a data warehouse via process D.

And from there into file 5 in a data mart via process E.

I would like to create a project where the above tables, within their separate models, are glued together with the processes.

And via lineage of the loan data mart I would like to see a diagram where files 1-5 and processes A-E line up.

I hope this example helps as a "visualisation".
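
In code terms the chain would look something like this (purely illustrative; the file, column and process names are stand-ins for the real jobs):

```python
loan_file = [{"loan_id": 1, "country_code": "NLE"}]   # file 1 (Loan), uploaded by process A
country_mapping = {"NLE": "NL"}                        # file 2 (country mapping)
country_iso2 = {"NL": "Netherlands"}                   # file 3 (Country-iso2 lookup)

def process_b(rows):
    """Convert the delivered country code using the country mapping (file 2)."""
    return [{**row, "country_code": country_mapping[row["country_code"]]} for row in rows]

def process_c(rows):
    """Check every converted code against the Country-iso2 lookup (file 3)."""
    for row in rows:
        if row["country_code"] not in country_iso2:
            raise ValueError("unknown country code: " + row["country_code"])
    return rows

# Processes D and E would load into the data warehouse (file 4) and the data mart (file 5);
# here they are simple pass-throughs.
warehouse_loans = process_c(process_b(loan_file))   # file 4
datamart_loans = list(warehouse_loans)              # file 5
print(datamart_loans)   # [{'loan_id': 1, 'country_code': 'NL'}]
```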

Any help is much appreciated!

Cheers

Wim

Former Member

The mapping route would create four models: 1, 2, 3 and 4. The processes are not explicitly modeled this way.

The only interesting thing is the lookup mapping:

You map loan.loan_id to target.loan_id.

You map country.country-iso2 to target.country-iso2.

In the Mapping Editor, in the criteria pane of your mappings, you enter loan.country_code = country mappings.country_code.

This way you map the look-up using the Mapping Editor in PowerDesigner.

(I'd remark here that I would automate and architect this type of look-up process differently and not represent it as a mapping exercise, but YMMV).
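
Outside the tool, those three mapping-editor entries express roughly the following lookup join (a hypothetical Python sketch; the table and column names follow the country-code example above, and I've collapsed the mapping and lookup into one table for brevity):

```python
# Hypothetical rows, following the country-code example above.
loans = [{"loan_id": 1, "country_code": "NLE"}]
country_mappings = [{"country_code": "NLE", "country-iso2": "NL"}]

# The criteria-pane entry (loan.country_code = country mappings.country_code)
# acts like a join condition; the two column mappings select the target columns.
target = [
    {"loan_id": l["loan_id"], "country-iso2": m["country-iso2"]}
    for l in loans
    for m in country_mappings
    if l["country_code"] == m["country_code"]
]
print(target)  # [{'loan_id': 1, 'country-iso2': 'NL'}]
```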

I have an example file if you're interested.

http://s7.postimg.org/kyarzngx7/mapping_PD_example_lookup.png

Former Member

Martijn, thanks very much for your answers!

I dived into it and think I grasped it.

On the other hand, I am more and more losing the grip of just having a straightforward set of data models. Instead I got stuck in a jungle of shortcuts/replicas, Mapping Editor and lineage stuff, where I don't see how I will make this transparent for myself, not to mention my expected audience.

So I think I will leave it for now. Maybe this will work out for me in the future.

But thanks again for explaining, much appreciated.

Cheers

Wim

Former Member

Wim,

You're welcome. Of course, the discussion on how to architect, model and maintain your metadata is another ballgame. Don't let PowerDesigner or ER 'modeling' be your guide here, because they don't make you focus on the correct aspects. I always use PD despite its shortcomings, not because of its conceptual strength. Feel free to contact me if you want to discuss architecture and modeling instead of techniques & tools.