on 03-17-2016 6:48 AM
Dear SAP HANA friends,
I have a question regarding UNION ALL. I have a code where I know that the left or right side will be empty. On top of that there are a lot calculations. Executing only one side gives a performance of 2 seconds. Using the UNION ALL we receive factor 3 which is about 6 seconds. What we see is that the plan is completely different and the UNION ALL is pushed up. Which makes sense from our perspective if both sides of the union all have equal side but in our case the union all has never to be executed because one of both sides is always empty.
Is there any possibility to
a) give a hint to the UNION ALL to be executed exactly at this point
b) any dummy statement to hinder the push up of the UNION ALL in the plan
c) an equivalent statement to the UNION ALL to fetch data either from the one or the other side
Thanks & Kind Regards,
Klaus
Can you provide more insight into the model logic? is it e.g. predictable from user input or other factors which of the two sides can safely be ignored?
In that case, the constant union modelling could help to enable branch pruning.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I see.
The problem here is that the function SESSION_CONTEXT is evaluated after the base filters on the accessed tables are executed.
This is generally a good strategy, but doesn't work too well in your case, where you want to effectively evaluate a query local constant.
Let's see an example here.
My base table has got hundred millions of records, so whether we hit it or not will make an impact to the query runtime:
set session 'XYZ' = 'X';
select top 10 'X' as "XY" from "TENS_OF_MILLS"
where
SESSION_CONTEXT('X') is null ;
/*
successfully executed in 906 ms 619 µs (server processing time: 535 ms 978 µs)
successfully executed in 933 ms 987 µs (server processing time: 551 ms 514 µs)
successfully executed in 919 ms 52 µs (server processing time: 551 ms 730 µs)
OPERATOR_NAME OPERATOR_DETAILS EXECUTION_ENGINE TABLE_NAME TABLE_TYPE TABLE_SIZE OUTPUT_SIZE
ROW SEARCH 'X' ROW ? ? ? 10
LIMIT NUM RECORDS: 10 ROW ? ? ? 10
FILTER SESSION_CONTEXT('X') IS NULL ROW ? ? ? 10,836,377
COLUMN SEARCH TENS_OF_MILLS.$rowid$
(OLTP SEARCH, ENUM_BY: CS_TABLE) COLUMN ? ? 108,363,776 108,363,776
COLUMN TABLE COLUMN TENS_OF_MILLS COLUMN TABLE 108,363,776 108,363,776
*/
We have nearly a full second runtime and the explain plan shows that the table is hit first and that the function expression only is evaluated afterwards. Also note that the whole record set gets converted for row store handling.
From a query execution point of view, this expression is not a constant so pushing it further down would generally not be a good strategy.
Now, for your use case you actually want it to be treated as a constant.
This can be achieved by introducing a constant and applying the function result as a filter to it:
set 'XYZ' = 'X' ;
select top 10 'X' as "XY" from "TENS_OF_MILLS"
where
'Y' = SESSION_CONTEXT('XYZ')
UNION ALL
select top 10 'Y' as "XY" from "TENS_OF_MILLS"
where
'X' = SESSION_CONTEXT('XYZ');
OPERATOR_NAME OPERATOR_DETAILS EXECUTION_ENGINE TABLE_NAME TABLE_TYPE TABLE_SIZE OUTPUT_SIZE
COLUMN SEARCH UNION_COL0 COLUMN ? ? ? 20
COLUMN UNION ALL ('X', 'Y') UNION_COL0 (ENUM_BY: CS_UNION_ALL) COLUMN ? ? 0 20
COLUMN SEARCH 'X' (PARALLELIZED, ENUM_BY: CS_UNION_ALL) COLUMN #_SYS_QO_COL_7f55b90eff40:4000000000000e6 ? ? 10
LIMIT NUM RECORDS: 10 COLUMN ? ? ? 10
COLUMN TABLE FILTER CONDITION: 'Y' = SESSION_CONTEXT('XYZ') COLUMN TENS_OF_MILLS COLUMN TABLE 108,363,776 108,363,775
COLUMN SEARCH 'Y' (PARALLELIZED, ENUM_BY: CS_UNION_ALL) COLUMN #_SYS_QO_COL_7f55b90f1c30:4000000000000ee ? ? 10
LIMIT NUM RECORDS: 10 COLUMN ? ? ? 10
COLUMN TABLE FILTER CONDITION: 'X' = SESSION_CONTEXT('XYZ') COLUMN TENS_OF_MILLS COLUMN TABLE 108,363,776 108,363,775
Ok, this might be hard to read thanks to the formatting capabilities n the forum software, but I hope you get the main points here:
1. The constant filter condition is evaluated before/when hitting the table, not afterwards.
2. The whole processing happens in the column store engine.
All that comes at a price of course...
successfully executed in 380 ms 887 µs (server processing time: 9 ms 738 µs)
successfully executed in 374 ms 484 µs (server processing time: 8 ms 770 µs)
successfully executed in 383 ms 539 µs (server processing time: 8 ms 115 µs)
Now you need to find another job that you can spend the 700 saves milliseconds on .
Also: get rid of the IS NOT NULL and IS NULL logic.
If you really feel that you need to use session context variables to parameter-ise SQL code, then don't make your life harder by using ambiguous values like NULL.
Anyhow, that's my take on this.
There you go, now you know!
Hi Lars,
thanks a lot for your reply. We have a table with 29 million records.
We implemented it as suggested by you but with selects on a table and a claculation view.
The select is the following
If we just do the select on the lower part (lines 10-12) we get the following plan and a runtime of 9 seconds
With the union all we get this plan and a runtime of 19 seconds (doubled). This is even true with the second table being empty.
We see that the union all is pushed up. We understand that this makes sense from parallelization perspective. But we do not understand that this doubles the runtime, even that the one side is empty.
Do you have any further advice?
Thanks & Kind Regards,
Klaus
Alright, no I was able to see the execution plans.
Did you also have a look into the second (long running one?).
The column view clearly had been unfolded and the transformations also led to a duplication of the SESSION_CONTEXT condition. The join orders are different and in the second one based on intermediate result sets.
The optimizer doesn't understand what this condition is used for, so it doesn't treat the two query parts independently. One thing i would try here would be the NO_CALC_VIEW_UNFOLDING hint, so that the calc. view access is not unfolded.
User | Count |
---|---|
93 | |
10 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.