Solved: Join between table variable and column table in a ...

bulent_ozen1 · ‎10-10-2014

Hi all,

Hope everyone is doing fine!

We have our own data management framework implemented using SAP HANA XS and SQLScript.

We have a SAP BW DTP like stored procedure where we call a SAP BW Transformation like SQLScript procedure which returns the result package as a SQLScript table variable.

The "DTP"-like procedure then attempts to join this table variable with the actual SAP HANA column table (which is kind of the SAP BW DSO active table) to detect the delta (the changed records, the new records and the deleted records)

The table variable that contains the result package records is not that big e.g. around 50K. The physical column table (mimicking the active table) is much bigger (e.g. 50M and growing) and the number of columns to be compared is quite high! The (2) columns involved in the join are defined as the primary key on the column store table (hence I assume already would have the concat attribure created on it)

But what we don't much know about is the behaviour of the table variable! How is the join between a table variable and a physical column table actually performed? I assume the table variable gets first materialized! Will concat attribute columns still be created on this materialized view? Does HANA still create and maintain translation tables behind the scnenes?

As the number of records increase in the active column store table in production, we realised that these joins got slower and slower even though the number of records in the table variable remained constant around 50K.

Is there a way to force this join to be executed similar to what we do in ABAP SQL with "FOR ALL ENTRIES IN ..." on the big column table only wihout triggering translation table maintenance (if I am not wrong and if this is what HANA is really doing behind the scenes of course)?

Any help would be appreciated.

Regards

Bulent

lbreddemann · ‎10-10-2014

Hi Bulent and welcome to SCN

Your assumption is somewhat correct.

What actually happens (most of the times) with table variables and joins is that SAP HANA creates a SQL statement where the table variable is "inlined".

For example something like

v_tab1 := select id, name as "USER_NAME" from myusers;

v_tabx := select a."ACCOUNT_NAME", u."USER_NAME"

from accounts a inner join :myusers u

on a.user_id = u.id;

would be transformed to

v_tabx := select a."ACCOUNT_NAME", u."USER_NAME"

from accounts a inner join (select id, name as "USER_NAME" from myusers) u

on a.user_id = u.id;

This allows for more optimizations than simply materializing v_tab1 first.

Anyhow, for the join itself, it does not change too much in this example.

The way joins work for column store data is via translation tables.

So, sure enough the translation tables always need to be maintained.

In principle it's not possible to keep the data-change-discovery process runtime constant when the amount of data grows.

However, the increase of runtime should be linear or better.

Concerning your comment on the join columns being the primary key: the translation tables have nothing to do with the concat attributes. Both are completely different concepts.

Translation tables really are there to be able to map the encoded and possibly compacted dictionaries of two join columns to each other.

SInce the FOR ALL ENTRIES approach is basically running many single look-ups against a table (via IN ... and/or UNION) the effort to resolve the keys would need to be done n-times for a chunk of n rows. With the join approach it would be done only once per chunk.

Cheers,

Lars

Join between table variable and column table in a SQLScript procedure

Accepted Solutions (1)

Accepted Solutions (1)

Answers (0)

Iterating through JSONModel with multiple nested a...

Re: Connecting SAP CAP with SAP Cloud ALM

How to use CopyProvider on Table created by TypeSc...

How to execute/fire onPost when user press enter o...

Re: Connecting SAP CAP with SAP Cloud ALM