cancel
Showing results for 
Search instead for 
Did you mean: 

Is it possible to do a right outer join when merging 2 datasets?

david_lai
Participant
0 Kudos

Hello Everyone,

I have the following scenario.

1 data table

1 lookup table

I have a requirement where I need to be able to see all values from the lookup table, even if the key does not exist in the data table.

When merging the 2 datasets, it allows you to do a left outer join or inner join.  However it does not allow you to do a right or full outer join.

Hence I can show all the data from the data table, but not necessarily from the lookup table.

Unfortunately it is not possible to flip the tables around on the merge, where the lookup table acts as the data table, as it will complain that the data table's keys are not unique.

If anyone has any suggestions, it would be greatly appreciated.

Thanks!

David

Accepted Solutions (1)

Accepted Solutions (1)

former_member285534
Active Participant
0 Kudos

Hi David,

I've had the same problem before and, as you wrote, right join is not possible as of Lumira v1.28.  A workaround I've used in this cases is:

- Add, manually, dummy rows to you transactional data, one row per each lookup entry, referencing precisely each lookup possible value.

- All the measures of the new dummy rows should be zeros (assuming sum aggregation method)

- The remaining dimensions of your dummy rows should also be dummy values, which in turn would need to be added to their corresponding lookup tables.

After having done these steps, join the data and lookup data sources with inner join.

Example.  Suppose dim_a is the dimension for which you need all possible lookup values to show:

Original transactional data

dim_a  dim_b  measure

a      x      10

c      y      20

Original lookup data for dim_a

dim_a description_a

a     desc_a

b     desc_b

c     desc_c

Original lookup data for dim_b


dim_b description_b

x     desc_x

y     desc_y

Applying workaround

New transactional data

dim_a  dim_b  measure

a      x      10

c      y      20

a      ZZ      0           (dummy row)

b      ZZ      0           (dummy row)

c      ZZ      0           (dummy row)

New lookup data for dim_b


dim_b description_b

x     desc_x

y     desc_y

ZZ    Not applicable       (dummy row)

This workaround is not elegant and it has its inconveniences (annoying dim_b dummy values appear, transactional dummy rows must be mantained when there are changes to dim_a lookup table, etc.), but it may suffice or it may be your only option depending on your analysis needs.

Regards,

Fernando

david_lai
Participant
0 Kudos

Hi Fernando,

This is what I have been doing right now as a workaround, however this works only when the lookup table is not very dynamic.

Otherwise it becomes a nightmare as you need to have a separate task to find all the missing items in the main data set and do an append.

Thanks for your input though

former_member285534
Active Participant
0 Kudos

Hi, I know, it can turn very unpractical.  I hope we find a better solution soon

Fernando

Answers (1)

Answers (1)

ashutosh_rastogi
Active Contributor
0 Kudos

Hi David,

If you don't have the two data sets merged, then you can achieve the use case using blending.

While adding the dimension/measure in the visualization, you should be choosing to add the ones from lookup table and then that becomes your primary dataset for blending. This should help you with what you are trying to achieve.

Regards,

Ashutosh

david_lai
Participant
0 Kudos

Hi Ashutosh,

Unfortunately the purpose of blending is not to achieve a right or full outer join.

A good example comes from the following site, as blending is used so that we don't end up having values multiplied when they shouldn't be on one-to-many situations

In addition, if you were to use linked datasets instead of merge, you need to include the keys in your output.

For example

A.key1, A.key2, A.x, B.key1, B.key2, B.y

We can't have

A.x | B.y

We must include the keys in the output which is not always the desired case.

Regards

David

ashutosh_rastogi
Active Contributor
0 Kudos

Agreed, Joins is not the intended purpose of blending. It allows users to find the correlation between the data from two data sets based on the linked dimensions.

Regards,

Ashutosh