cancel
Showing results for 
Search instead for 
Did you mean: 

SAP HANA UNION ALL Database HINT

Former Member
0 Kudos

Dear SAP HANA friends,

I have a question regarding UNION ALL. I have a code where I know that the left or right side will be empty. On top of that there are a lot calculations. Executing only one side gives a performance of 2 seconds. Using the UNION ALL we receive factor 3 which is about 6 seconds. What we see is that the plan is completely different and the UNION ALL is pushed up. Which makes sense from our perspective if both sides of the union all have equal side  but in our case the union all has never to be executed because one of both sides is always empty.

Is there any possibility to

a) give a hint to the UNION ALL to be executed exactly at this point

b) any dummy statement to hinder the push up of the UNION ALL in the plan

c) an equivalent statement to the UNION ALL to fetch data either from the one or the other side

Thanks & Kind Regards,

Klaus

Accepted Solutions (0)

Answers (1)

Answers (1)

lbreddemann
Active Contributor
0 Kudos

Can you provide more insight into the model logic? is it e.g. predictable from user input or other factors which of the two sides can safely be ignored?

In that case, the constant union modelling could help to enable branch pruning.

Former Member
0 Kudos

Hi Lars,

the decision comes from a session context variable.

it is like "select * from tbl1 where SESSION_CONTEXT('xyz') is null union all select * from tbl2 where SESSION_CONTEXT('xyz') is not null"

Kind Regards,

Klaus

lbreddemann
Active Contributor
0 Kudos

I see.

The problem here is that the function SESSION_CONTEXT is evaluated after the base filters on the accessed tables are executed.

This is generally a good strategy, but doesn't work too well in your case, where you want to effectively evaluate a query local constant.

Let's see an example here.

My base table has got hundred millions of records, so whether we hit it or not will make an impact to the query runtime:


set session 'XYZ' = 'X';

select top 10 'X' as "XY" from "TENS_OF_MILLS"

where

    SESSION_CONTEXT('X') is null ;

/*

successfully executed in 906 ms 619 µs  (server processing time: 535 ms 978 µs)

successfully executed in 933 ms 987 µs  (server processing time: 551 ms 514 µs)

successfully executed in 919 ms 52 µs  (server processing time: 551 ms 730 µs)

OPERATOR_NAME           OPERATOR_DETAILS                  EXECUTION_ENGINE  TABLE_NAME      TABLE_TYPE      TABLE_SIZE  OUTPUT_SIZE

ROW SEARCH              'X'                               ROW               ?               ?               ?           10

  LIMIT                 NUM RECORDS: 10                   ROW               ?               ?               ?           10

    FILTER              SESSION_CONTEXT('X') IS NULL      ROW               ?               ?               ?           10,836,377

      COLUMN SEARCH     TENS_OF_MILLS.$rowid$

  (OLTP SEARCH, ENUM_BY: CS_TABLE)  COLUMN            ?               ?               108,363,776 108,363,776

        COLUMN TABLE                                      COLUMN            TENS_OF_MILLS   COLUMN TABLE    108,363,776 108,363,776

*/

We have nearly a full second runtime and the explain plan shows that the table is hit first and that the function expression only is evaluated afterwards. Also note that the whole record set gets converted for row store handling.

From a query execution point of view, this expression is not a constant so pushing it further down would generally not be a good strategy.

Now, for your use case you actually want it to be treated as a constant.

This can be achieved by introducing a constant and applying the function result as a filter to it:


set 'XYZ' = 'X' ; 

select top 10 'X' as "XY" from "TENS_OF_MILLS"

where

    'Y' = SESSION_CONTEXT('XYZ')

UNION ALL

select top 10 'Y' as "XY" from "TENS_OF_MILLS"

where

    'X' = SESSION_CONTEXT('XYZ');

OPERATOR_NAME           OPERATOR_DETAILS                                EXECUTION_ENGINE TABLE_NAME                                  TABLE_TYPE      TABLE_SIZE  OUTPUT_SIZE

COLUMN SEARCH           UNION_COL0                                      COLUMN           ?                                           ?               ?           20        

  COLUMN UNION ALL      ('X', 'Y') UNION_COL0 (ENUM_BY: CS_UNION_ALL)   COLUMN           ?                                           ?               0           20        

    COLUMN SEARCH       'X' (PARALLELIZED, ENUM_BY: CS_UNION_ALL)       COLUMN           #_SYS_QO_COL_7f55b90eff40:4000000000000e6   ?               ?           10        

      LIMIT             NUM RECORDS: 10                                 COLUMN           ?                                           ?               ?           10        

        COLUMN TABLE    FILTER CONDITION: 'Y' = SESSION_CONTEXT('XYZ')  COLUMN           TENS_OF_MILLS                               COLUMN TABLE    108,363,776 108,363,775

    COLUMN SEARCH       'Y' (PARALLELIZED, ENUM_BY: CS_UNION_ALL)       COLUMN           #_SYS_QO_COL_7f55b90f1c30:4000000000000ee   ?               ?           10        

      LIMIT             NUM RECORDS: 10                                 COLUMN           ?                                           ?               ?           10        

        COLUMN TABLE    FILTER CONDITION: 'X' = SESSION_CONTEXT('XYZ')  COLUMN           TENS_OF_MILLS                               COLUMN TABLE    108,363,776 108,363,775

Ok, this might be hard to read thanks to the formatting capabilities n the forum software, but I hope you get the main points here:

1. The constant filter condition is evaluated before/when hitting the table, not afterwards.

2. The whole processing happens in the column store engine.

All that comes at a price of course...

successfully executed in 380 ms 887 µs  (server processing time: 9 ms 738 µs)

successfully executed in 374 ms 484 µs  (server processing time: 8 ms 770 µs)

successfully executed in 383 ms 539 µs  (server processing time: 8 ms 115 µs)

Now you need to find another job that you can spend the 700 saves milliseconds on .

Also: get rid of the IS NOT NULL and IS NULL logic.

If you really feel that you need to use session context variables to parameter-ise SQL code, then don't make your life harder by using ambiguous values like NULL.

Anyhow, that's my take on this.

There you go, now you know!

Former Member
0 Kudos

Hi Lars,

thanks a lot for your reply. We have a table with 29 million records.

We implemented it as suggested by you but with selects on a table and a claculation view.

The select is the following

If we just do the select on the lower part (lines 10-12) we get the following plan and a runtime of 9 seconds

With the union all we get this plan and a runtime of 19 seconds (doubled). This is even true with the second table being empty.

We see that the union all is pushed up. We understand that this makes sense from parallelization perspective. But we do not understand that this doubles the runtime, even that the one side is empty.

Do you have any further advice?

Thanks & Kind Regards,

Klaus

lbreddemann
Active Contributor
0 Kudos

Unfortunately I don't see the screenshots - so at the moment there is not much that I can add to this.

lbreddemann
Active Contributor
0 Kudos

Alright, no I was able to see the execution plans.

Did you also have a look into the second (long running one?).

The column view clearly had been unfolded and the transformations also led to a duplication of the SESSION_CONTEXT condition. The join orders are different and in the second one based on intermediate result sets.

The optimizer doesn't understand what this condition is used for, so it doesn't treat the two query parts independently. One thing i would try here would be the NO_CALC_VIEW_UNFOLDING  hint, so that the calc. view access is not unfolded.