on 03-29-2016 1:45 PM
Hi!
I've noticed a strange behavior of HANA (SPS 08 and 09) with tables loading in memory. I have a big column table partitioned by HASH on the first level and by RANGE (calendar period) on the second level. For clean test I unload the table comletely from memory. After that I run an SQL query with the restriction of calendar period in the way that only single or very few partitions must be analyzed. If that query is missing in SQL plan cache HANA loads all partitions in memory, but only the columns involved in the query. After that I unload the table from memory again and repeat the query. At this moment a query is already in SQL plan cache and only relevant partitions are loaded. This behavior is normal for HANA? The most recent revisions work the same way? If it's normal, please, provide me to official documentation which desrcibe this.
Thanks, Andrey.
Andrey,
Congratulations for the tests.
In our SPS09, we face the same scenario: unload priority affects loading behavior exactly as you detailed.
Also, loading only works right concerning columns in the first level of the partitioning. If a select is executed with a where clause on a column in the second leval of the partitioning, all partitions are loaded into memory.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi!
Just for info. A behaviour when HANA loads all the partitions in memory at the time of SQL statement preparation (instead of loading only the relevant partitions according to SQL query restriction) is expected:
When all partitions are unloaded and you prepare an SQL statement, HANA SQL optimizer first collects statistics from most columns of ALL partitions to be able to porivde optimal execution plans for the later queries.
So that's how it is. Actually, it means that partition pruning is not working during SQL statement preparation. But this is true only for tables with unload priority values 0 - 5 (btw, the default value for most tables is 5). But not everything is so bad, you can change this behavior:
If you want to change the behavior that for prepared statements most
of the columns for all partions are loaded then you need to change the
unload priority for the table from default value 5 to 7.Then only
columns from some larger partitions are loaded, with the burden that
SQL execution plans for the statement may not be optimal dependent
on filter values which are later used to execute the SQL, after the
preparation.
Actually, we noticed that even priority value 6 provides loading of almost only the necessary partitions and columns (you can find in my message of Apr 20, 2016 10:05 PM that HANA still loading very few unnecessary columns and partitions). We did not notice any performance degradation with changing priority from 5 to 6, but the memory is well saved and the queries run faster. Unfortunately, the dependence of loading behavior on the priority values is not documented:
The different unload priority values are not officially documented
and explained in detail, but the value range is from 0-9, with 5 being
default.
So just need to have this in mind.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Once again: you have been observing an effect and mis-attributed the cause.
The table unload priority has no influence on when a column gets loaded back to memory.
And all dramatic writing in bold and over-generalisation of single observation to a general processing scheme doesn't change that.
Hi!
Don't agree. We have a response from support that unload priority affects (would like to highlight it, but the drama is not welcome) the behaviour of columns/partition loading during statement preparation.
In my message from Apr 20, 2016 10:05 PM () I gave an example how to reproduce the influence of unloading priority to columns/partitions loading. Anyone checked it in a real system except me? (in HANA Cloud platform for example) Or the conclusions were made by the text description? I pay special attention that all these three test steps are performed by the three completely new SQL requests to eliminate the SQL cache influence. The uniqueness is provided here by different comment strings /* ..... */.
And returning to my example: if (as you wrote) "The optimiser sampling results, which lead to the initial column load in your example with unload priority 5, are cached and kept." why we observe that during the last request with unloading priority 5 all partitions/columns are loaded again (the test step #3)? The optimizer results are supposed to be cached already, right?
Ok, I once again spend some time to look into this.
The software you're using on HCP is no different than the on premise version, which means they work the same.
Not sure, what confirmation you received from out support on this matter, but the fact is: UNLOAD priority doesn't change whether or not column are loaded during preparation.
In your test, you took the status LOADED after your preparation as an indicator for that partitions had been loaded into memory for the query preparation. And in your example the partitions for the table with UNLOAD PRIORITY 6 weren't loaded when you checked after you prepared the statements.
The problem here is, that by simply checking the loaded state you're dismissing the possibility that the columns might have been loaded, sampled and unloaded early due to the UNLOAD priority.
I repeated my tests and took the effort to actually trace the loading/unloading as well as the partition pruning and can assure you: the UNLOAD PRIORITY has no influence on the column loading during statement preparation.
Concerning the "inexplicable" load behavior you observed: have you ensured that the table partitions don't get loaded by other processes, e.g. the merge dog? In my tests, the "weird" load states disappeared when I ensured the testes tables where excluded from the auto merge.
As a guidance on how to you also can create a test scenario that doesn't mix up all these different complex concepts, here's the list of steps for that:
1. create and fill the tables - all the same, but not the same table:
drop table pruning_seed;
drop table pruning_testUP1;
[...]
drop table pruning_testUP9;
CREATE column TABLE pruning_seed (
DOCNR NVARCHAR(5),
CALMONTH NVARCHAR(6),
SUMM DECIMAL(17,2),
PRIMARY KEY (DOCNR)) UNLOAD PRIORITY 1;
/* PARTITIONING */
ALTER TABLE pruning_seed PARTITION BY HASH ("DOCNR") PARTITIONS 3, RANGE ("CALMONTH") (
partition VALUE = '201401',
partition VALUE = '201402',
partition VALUE = '201403',
partition VALUE = '201404',
partition VALUE = '201405',
partition VALUE = '201406',
partition VALUE = '201407',
partition VALUE = '201408',
partition VALUE = '201409',
partition VALUE = '201410',
partition VALUE = '201411',
partition VALUE = '201412',
partition VALUE = '201501',
partition VALUE = '201502',
partition VALUE = '201503',
partition VALUE = '201504',
partition VALUE = '201505',
partition VALUE = '201506',
partition VALUE = '201507',
partition VALUE = '201508',
partition VALUE = '201509',
partition VALUE = '201510',
partition VALUE = '201511',
partition VALUE = '201512',
partition OTHERS);
INSERT INTO pruning_seed VALUES ('1', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('2', '201502', '100.50');
INSERT INTO pruning_seed VALUES ('3', '201503', '100.50');
INSERT INTO pruning_seed VALUES ('4', '201504', '100.50');
INSERT INTO pruning_seed VALUES ('5', '201505', '100.50');
INSERT INTO pruning_seed VALUES ('6', '201506', '100.50');
INSERT INTO pruning_seed VALUES ('7', '201507', '100.50');
INSERT INTO pruning_seed VALUES ('8', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('9', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('11', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('12', '201502', '100.50');
INSERT INTO pruning_seed VALUES ('13', '201503', '100.50');
INSERT INTO pruning_seed VALUES ('14', '201504', '100.50');
INSERT INTO pruning_seed VALUES ('15', '201505', '100.50');
INSERT INTO pruning_seed VALUES ('16', '201506', '100.50');
INSERT INTO pruning_seed VALUES ('17', '201507', '100.50');
INSERT INTO pruning_seed VALUES ('18', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('19', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('21', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('22', '201502', '100.50');
INSERT INTO pruning_seed VALUES ('23', '201503', '100.50');
INSERT INTO pruning_seed VALUES ('24', '201504', '100.50');
INSERT INTO pruning_seed VALUES ('25', '201505', '100.50');
INSERT INTO pruning_seed VALUES ('26', '201506', '100.50');
INSERT INTO pruning_seed VALUES ('27', '201507', '100.50');
INSERT INTO pruning_seed VALUES ('28', '201501', '100.50');
INSERT INTO pruning_seed VALUES ('29', '201501', '100.50');
merge delta of pruning_seed;
alter table pruning_seed disable automerge;
--> now we have a seeding table, let's create the actual test tables WITHOUT AUTO MERGE!
create column table pruning_testUP1 like pruning_seed with data WITHOUT AUTO MERGE;
[...]
create column table pruning_testUP9 like pruning_seed with data WITHOUT AUTO MERGE;
--> now set the unload priorities and issue a delta merge
alter table pruning_testUP1 unload priority 1;
[...]
alter table pruning_testUP9 unload priority 9;
merge delta of pruning_testUP1;
[...]
merge delta of pruning_testUP9;
Here we are with identical tables, that are excluded from delta merge and that have all different unload priorities.
Unload them from memory:
unload pruning_testUP1;
[...]
unload pruning_testUP9;
Check the load status for all columns and all partitions:
with load_stat as (select table_name, part_id,
sum (case loaded
when 'TRUE' then 1
else 0
end )load_cnt
from m_cs_all_columns
where table_name like 'PRUNING_TESTUP%'
group by table_name, part_id
order by table_name, part_id)
select table_name, string_agg (load_cnt, '/' order by part_id) as load_meter
from load_stat
group by table_name;
TABLE_NAME LOAD_METER
PRUNING_TESTUP1 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP2 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP3 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP4 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP5 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP6 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP7 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP8 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP9 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0
No partition loaded - as expected.
Prepare the statements:
select /* test stmt #1*/ calmonth from pruning_testUP1 where calmonth = '201502' ;
select /* test stmt #2*/ calmonth from pruning_testUP2 where calmonth = '201502' ;
select /* test stmt #3*/ calmonth from pruning_testUP3 where calmonth = '201502' ;
select /* test stmt #4*/ calmonth from pruning_testUP4 where calmonth = '201502' ;
select /* test stmt #5*/ calmonth from pruning_testUP5 where calmonth = '201502' ;
select /* test stmt #6*/ calmonth from pruning_testUP6 where calmonth = '201502' ;
select /* test stmt #7*/ calmonth from pruning_testUP7 where calmonth = '201502' ;
select /* test stmt #8*/ calmonth from pruning_testUP8 where calmonth = '201502' ;
select /* test stmt #9*/ calmonth from pruning_testUP9 where calmonth = '201502' ;
And check the Plan Cache:
select plan_id, statement_hash, statement_string, accessed_table_names, reference_count,
preparation_count, execution_count, total_table_load_time_during_preparation
from M_SQL_PLAN_CACHE
where statement_string like '%test stmt #%';
PLAN_ID STATEMENT_HASH STATEMENT_STRING ACCESSED_TABLE_NAMES REFERENCE_COUNT PREPARATION_COUNT EXECUTION_COUNT TOTAL_TABLE_LOAD_TIME_DURING_PREPARATION
44580002 94d2b5937b0b60c61bdfabf374d609f7 select /* test stmt #9*/ calmonth from pruning_testUP9 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP9(18) 0 1 0 1310
44570002 0ea60eb42735799128a00008b41858f0 select /* test stmt #8*/ calmonth from pruning_testUP8 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP8(18) 0 1 0 1364
44560002 fffa5043402a3992ce9f22a0a3105b4d select /* test stmt #7*/ calmonth from pruning_testUP7 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP7(18) 0 1 0 1355
44550002 536920ec921299930c52a3a59c59a756 select /* test stmt #6*/ calmonth from pruning_testUP6 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP6(18) 0 1 0 1403
44540002 8e7964926f235ffb0e6b25131e3689ba select /* test stmt #5*/ calmonth from pruning_testUP5 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP5(18) 0 1 0 1299
44530002 454a1b96bd35c9bdfb69e5e803ff422e select /* test stmt #4*/ calmonth from pruning_testUP4 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP4(18) 0 1 0 1350
44520002 198caff798de0c26b0b8572ee13ef535 select /* test stmt #3*/ calmonth from pruning_testUP3 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP3(18) 0 1 0 1922
44510002 4cea6e058ee10a030e00be82baadf8b7 select /* test stmt #2*/ calmonth from pruning_testUP2 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP2(18) 0 1 0 1339
44500002 d6cefc1f24249e32aff53f54e22176b7 select /* test stmt #1*/ calmonth from pruning_testUP1 where calmonth = '201502' DEVDUDE.PRUNING_TESTUP1(18) 0 1 0 1350
*/
Each table gets touched, each query had been prepared once and we see a very similar LOAD TIME DURING PREPARATION - irrespective of the UNLOAD PRIORITY.
Checking the load status now produces:
TABLE_NAME LOAD_METER
1 2 3 4 5 6 7
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
PRUNING_TESTUP1 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP2 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP3 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP4 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP5 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP6 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP7 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP8 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP9 0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
All tables (with all the different unload priority settings) have the exact same partitions loaded to memory right now.
In a test before, I could see that the tables with unload priority >5 had already been UNLOADED again.
These observations can be backed by the UNLOAD/LOAD traces.
Feel free to use this example and perform the adequate tracing yourself.
Or not and believe what you want to believe.
Hi, Lars!
Thank you very much for such a detailed test scenario. Just checked it very precisely on our SP82 system. There are my results:
TABLE_NAME;LOAD_METER
PRUNING_TESTUP1;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2
PRUNING_TESTUP2;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2
PRUNING_TESTUP3;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2
PRUNING_TESTUP4;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2
PRUNING_TESTUP5;2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/3/3/3/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/3/2/3/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/2/3/3/2/3/2/3/3/2/2/2/2/2/2
PRUNING_TESTUP6;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP7;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP8;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP9;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
This is exactly what I was writing about all the time I can not understand why do you get a different result. I will check it on HCP.
Just one remark: our SP82 did not like a syntax of:
so I removed "order by...".
UPDATE: just checked it on HCP trial dev. account (SP97), the results are very similar to mine (the syntax "order by..." was incorrect also):
by text:
TABLE_NAME;LOAD_METER
PRUNING_TESTUP1;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3
PRUNING_TESTUP2;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3
PRUNING_TESTUP3;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3
PRUNING_TESTUP4;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3
PRUNING_TESTUP5;3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3
PRUNING_TESTUP6;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP7;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP8;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
PRUNING_TESTUP9;0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/3/0/0/0/0/0/0/0/0/0/0/0
UPDATE: checked it on HCP SP102: HANA MDC (<trial>) 1.00.102.03.1449674847
The syntax was correct at this time, but the results are the same as on SP97: for tables 1 - 5 all partitions are loaded. For 6 - 9 - only three partitions.
Concerning the syntax: I'm using a recent SPS11 HANA. SPS8 is outdated and not further developed. Without the ORDER BY clause in the STRING_AGG function, the result positions may be wrong - so, the output might be misleading.
As mentioned a couple of times:
Do enable tracing for loading, unloading and partitioning to see which partitions actually get touched due to your queries and which do due to other system activity.
By looking for which partitions are still in memory afterwards it's not possible to distinguish between columns that had been touched and left in memory and those who had been touched and dropped from memory again.
If you suspect a bug in your SAP HANA instance and it's in a current revision, then opening a support incident to investigate this would be the right move.
I tested and traced it on our different SPS9 (sp96) system. I turned on load*, unload*, partitioning, optimizer*, query*, sqlopt*, sqlquery, stat* and table* traces. I clearly see that for UP5 and UP6 trace results are pretty the same, partition pruning seems to be working for all test tables. For example:
[4534]{434643}[187/-1] 2016-08-03 12:40:20.581883 d partitioning Pruning.cpp(00098) : Pruning for index RYZHKOV:PRUNING_TESTUP5 (162281): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH.....
[4534]{434643}[187/-1] 2016-08-03 12:40:21.429779 d partitioning Pruning.cpp(00098) : Pruning for index RYZHKOV:PRUNING_TESTUP6 (162282): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH.....
The .loads. trace file also shows that allegedly only DOCNR, CALMONTH and $rowid$ columns are loaded for all the *UP1 - *UP9 tables.
The only difference that I noticed in trace files is the Query rewriting time. It's considerably (10x times) bigger for UP1-UP5 tables:
[...]
UP4 433.425 ms
UP5 436.699 ms
UP6 33.758 ms
UP7 32.494 ms
[...]
As I wrote before, SAP support confirmed to us that for tables with priority 1 - 5 all partitions are loaded by (or for) the HANA SQL optimizer during SQL statement preparation. As I see in my tests this is true for SPS8, 9 and 10. But I cannot see in trace files what exactly causes the loading of all partitions and which columns are involved. In the same time your test on SPS11 shows that probably starting from SPS11 this behaviour is different. Could you test this example on SPS9 or SPS10? I combined complete test scenario in attached file for convenience.
Concerning this:
Lars Breddemann wrote:
By looking for which partitions are still in memory afterwards it's not possible to distinguish between columns that had been touched and left in memory and those who had been touched and dropped from memory again.
I clearly see that for UP5 table these columns are still in memory (on SPS9):
DOCNR for 16 partitions;
CALMONTH, $trex_udiv$ and $rowid$ for all partitions.
For UP6 only columns DOCNR, CALMONTH and $rowid$ are still in memory only in 3 partitions.
I've done my part of testing here and not going to extend this.
You may provide the incident no. where support provided you the confirmation.
Instead of my own test script in a single file it would have made sense to provide the trace files instead.
Also, by actually executing the statement instead of just preparing them, you made the test cases more complex than necessary.
Hi!
I got a chance to test it on the SPS11 system (SP111). Well, it works how it really should work - only relevant partitions are touched in all cases. It seems that something was fixed in SPS11 concerning partitions loading during statements preparartion, but I cannot find what exactly.
For example, table PRUNING_TESTUP1.
SPS9:
SPS11:
UPD: guys, pls, could someone check any of the test cases from this thread (txt files with SQL script) on both SPS11(or 12) and on SPS<=10. Do you observe the same unwanted partitions loading on SPS<=10? Thanks in advance.
Sounds like you provided the partition information in your where clause in the wrong data type. You might want to check with the partitioning trace which partition conditions were understood.
Check the SAP note FAQ partitioning, search this forum (this has been discussed before) or read my book.
The SQL plan cache does not have an influence on partition loading.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Thanks for answer, Lars!
There's something different. I observe direct influense of the SQL plan cache on partition loading (pruning). So I repeated an experiment:
1. Unloaded table from memory.
2. Run SQL-query via ABAP with CALMONTH restrictions in WHERE for using partition pruning (table is RANGE partititioned on 2nd level by CALMONTH). And I sure the query is not in cache.
select count( * ) UP TO 1 rows FROM /bic/azuucalc00 WHERE calmonth = '201701' and /bic/miedivsnd = '1234'.
3. HANA prepares statement (cause it's missing in cache) and during preparation I see that ALL partitions are loaded into memory. At end of preparation (it tooks about 2 minutes) all partitions are loaded:
4. After that I unload table again and repeat the same query. This time the query is in cache already, HANA executes query instantly and loads only the relevant partition:
What I see in SQL plan cache: execution count 3 times (3rd time - test case with wrong data type in WHERE). Preparation - 1 times (about 2 minutes). Execution itself is very quick and tooks about 40 millisec.
That behavior is on SPS 08 and 09. SPS 10 not tested yet.
Also I tested a case with the wrong data type in WHERE condition (I used WHERE calmonth = 201701 (without quotes)) - and it works exactly the same way as described above.
Actually, exactly my case is described here: Musings of an IT Implementor: HANA OOM Error Tip #1 - Partition Tables Correctly (indexes are not important here).
Nope, you are not observing the direct impact of the SQL cache here. Are you debugging SAP HANA kernel code? Are you looking at trace files?
You are really just putting together two observations of yours and construct the connection.
Your claim, expressed more explicitly is:
Parsing a query in SAP HANA executes a full load of ALL partitions of a table.
That's not correct.
In order to find an explanation for your example, it's necessary that we see the exact definition of the table including the partitioning definition and the exact query text, too.
My ad hoc assumption would be that the table had been partition based on CALMONTH (integer value) and now gets queries with CALMONTH (string value).
SAP HANA doesn't convert this automatically and won't use the partition pruning in this case.
For the claim of parsing -> partition loading, this can be easily disproved by manually unloading the table in SAP HANA studio and preparing any statement against the table (right click, prepare or explain will do).
Then check M_CS_ALL_COLUMNS to see which columns are in memory.
Thanks!
Yes, the table is partitioned by integer values of CALMONTH:
HASH 3 FISCPER3,FISCYEAR,FISCVARNT,/BIC/MIESKODNM; RANGE CALMONTH 201000-201100,201100-201200,201200-201300,201300-201400,201401,201402.... (until 2017)
BW attribute CALMONTH however has type NUMC(6).
Is the partitioning definition is wrong in this case and must be reworked on char values of CALMONTH for using pruning?
Are there any options to make pruning to work without repartitioning (some settings or upgrading to more recent HANA revision)?
UPDATE: and if it's no influence of SQL plan cache on pruning or loading, why there is a different loading behavior for cached and not-cached queries?
UPDATE 2: did I understand correctly, that for write-optimized DSO in BW we never get pruning benefits on range partitioning, because all BW attributes are character like and in RANGE definition we can only set integer values?
There's no setting to change that. If you want to have partition pruning work for NUMC values you will have to re-partition correctly.
As for your example: this is because actual preparation of the query included bind variables.
In that case, the partition pruning cannot work properly as the values are not known yet.
However, to prepare the statement some size and cardinality estimations must take place.
So, in this case
a) statement not in cache yet
b) table not in memory yet
c) no values for the partition pruning available
SAP HANA has to load some of the columns into memory in order to prepare the statement.
So, even though there is no direct impact, there you have your connection: parsing/optimizing a statement requires some information from some of the columns, so in the worst case, you will trigger column loading just for the prepare.
As long as the statement is reusable in cache, this doesn't need to be done again, even if the table meanwhile has been unloaded.
Lars, thank you very much for participation!
I tried two versions of query (non-cached) directly from Studio in order to exclude the case with bind variables: (... where calmonth = '201701' ) and (... where calmonth = 201701 ) and pruning did not work for both cases.
Could you provide, please, how to activate partitioning trace for deeper analysis? I did not find it nor in SAP HANA Administration Guide, nor in your book (thanks for the advice).
Ok, there are SAP notes, there is the Administration Guide from the official documentation set and there is Richard's and my book.
In all of these sources it's explained how to setup a trace in SAP HANA.
In my book on page 375 I also cover which traces to activate to see the partitioning pruning at work.
In newer SAP HANA versions, even the explain plan covers which partitions are considered.
On the page 375 it's written: "...it is possible to activate the PARTITIONING trace with trace level INFO or DEBUG."
I did not find nothing about partitioning trace on "Trace configuration" tab in HANA Studio. There are only Database, SQL, Plan trace and others, but no Partitioning. I thought it might be as component of the Database trace, but there'se nothing like "partitioning". Google for SAP "partitioning trace" returns only 3 relevant results. All of them are SCN threads including this one where no explanation of how to activate it.
Anyway, I decided to run a pure test, so I created new empty table with simplified structure and similar partitioning scheme, filled it with some data and checked the queries. All of them are in the attachment.
In case of running NON-CACHED query on single CALMONTH I see that ALL non-empty partions are loaded, but only the involved column and the primary key column:
I checked this query in PlanViz - it's showing that only three relevant partitions are scanned:
So is it the partition pruning at work? If yes, why HANA is loading all partitions at preparation? Or does it mean that partition pruning doesn't work at preparation, but only at execution of the queries?
In case of the same but CACHED query only the relevant partitions and only the involved column are loaded:
How to enable specific traces is explained in the admin guide. Not going to repeat it here.
To your test:
/* CREATE */
CREATE column TABLE pruning_test (
DOCNR NVARCHAR(5),
CALMONTH NVARCHAR(6),
SUMM DECIMAL(17,2),
PRIMARY KEY (DOCNR))
/* PARTITIONING */
ALTER TABLE pruning_test PARTITION BY HASH ("DOCNR") PARTITIONS 3, RANGE ("CALMONTH") (
partition VALUE = '201401',
partition VALUE = '201402',
...
You partition by HASH-RANGE with 3 HASH partitions, which means that there are three partitions for each CALMONTH value.
Then you load data for CALMONTHS between 201501 and 201507.
That's 6 different values for CALMONTH and 18 possibly matching partitions for those.
Now I unload all columns.
unload pruning_test;
Let's check what's loaded:
select column_name, part_id, loaded
from m_cs_all_columns where table_name='PRUNING_TEST'
order by loaded desc;
COLUMN_NAME | PART_ID | LOADED |
SUMM | 52 | FALSE |
CALMONTH | 52 | FALSE |
DOCNR | 52 | FALSE |
$rowid$ | 51 | FALSE |
$trex_udiv$ | 51 | FALSE |
SUMM | 51 | FALSE |
CALMONTH | 51 | FALSE |
DOCNR | 51 | FALSE |
$rowid$ | 50 | FALSE |
...
LOADED = TRUE would come first, so here ALL columns are unloaded.
Let's check the plan cache:
select top 10 statement_hash,plan_id, left (statement_string, 30) as cmd_start,
is_valid, preparation_count, execution_count, max_preparation_time, max_table_load_time_during_preparation
from "PUBLIC"."M_SQL_PLAN_CACHE" where statement_string like '%RUN #%';
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 0 | 15,671 | 0 |
It's just us at the moment - all good.
Now, let's create a new statement to be parsed and check what happens.
select /*RUN #1*/ calmonth from pruning_test where calmonth = '201502' ;
Right click on Prepare Statement... and check plan cache again.
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 1 | 15,671 | 0 |
286e61d6c8b1459dfdd57043604f38f0 | 25,560,002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 0 | 67,377 | 1,260 |
Statement Run #1 has been prepared, but not executed and apparently some of the table columns got loaded...
Let's check:
COLUMN_NAME | PART_ID | LOADED |
DOCNR | 39 | TRUE |
CALMONTH | 39 | TRUE |
$rowid$ | 39 | TRUE |
$rowid$ | 14 | TRUE |
$rowid$ | 64 | TRUE |
CALMONTH | 14 | TRUE |
CALMONTH | 64 | TRUE |
DOCNR | 64 | TRUE |
DOCNR | 14 | TRUE |
DOCNR | 52 | FALSE |
$rowid$ | 51 | FALSE |
$trex_udiv$ | 51 | FALSE |
SUMM | 51 | FALSE |
The internal columns for the record management ($rowid$) and the two columns involved with partitioning CALMONTH and DOCNR have been loaded for 3 different partitions (14, 39, 64).
These three partitions are exactly the partitions where the data can possibly be.
All other partitions remain off memory. Partition pruning works also during preparation.
So far so good. The statement is prepared (or cached as you say).
Now, let's unload the table and re-prepare the statement:
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 2 | 15,671 | 0 |
286e61d6c8b1459dfdd57043604f38f0 | 25,560,002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 0 | 67,377 | 1,260 |
COLUMN_NAME | PART_ID | LOADED |
SUMM | 52 | FALSE |
CALMONTH | 52 | FALSE |
DOCNR | 52 | FALSE |
$rowid$ | 51 | FALSE |
$trex_udiv$ | 51 | FALSE |
...
As we see: nothing happens! The plan is still valid, no new preparation is required and no columns get loaded this time.
Now I run the query...
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 3 | 15,671 | 0 |
286e61d6c8b1459dfdd57043604f38f0 | 25,560,002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 1 | 67,377 | 1,260 |
COLUMN_NAME | PART_ID | LOADED |
CALMONTH | 39 | TRUE |
$rowid$ | 14 | TRUE |
$rowid$ | 39 | TRUE |
CALMONTH | 14 | TRUE |
$rowid$ | 64 | TRUE |
CALMONTH | 64 | TRUE |
CALMONTH | 52 | FALSE |
...
Again no preparation, but execution count increased by 1. As we still only have one part of the partitioning condition, of course all 3 partitions have to be touched again.
Now, let's unload again and prepare another statement...
select /*RUN #2*/ calmonth from pruning_test where calmonth = ?
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 5 | 15,671 | 0 |
286e61d6c8b1459dfdd57043604f38f0 | 25,560,002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 1 | 67,377 | 1,260 |
a58d94ab23fdee706c00e5a3a9529c0d | 25,570,002 | select /*RUN #2*/ calmonth fro | TRUE | 1 | 0 | 561,580 | 438,071 |
KAPOW!
For RUN #2 the preparation cannot make use of any partition pruning, because this information is not available yet.
But let's assume the main underlying idea of the in-memory database holds true and the stuff actually is in memory all the time. Then this is not a problem at all.
And if this prepared statement remains in cache and that it doesn't need to get re-prepared but some of the columns really weren't used at all and got unloaded. Would that lead to constant re-loading of those colunms?
unload, execute, check...
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
257c5cc3b41666c03d6607e999c36f89 | 25,550,002 | select top 10 statement_hash,p | TRUE | 1 | 6 | 15,671 | 0 |
286e61d6c8b1459dfdd57043604f38f0 | 25,560,002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 1 | 67,377 | 1,260 |
a58d94ab23fdee706c00e5a3a9529c0d | 25,570,002 | select /*RUN #2*/ calmonth fro | TRUE | 1 | 1 | 561,580 | 438,071 |
COLUMN_NAME | PART_ID | LOADED |
CALMONTH | 39 | TRUE |
$rowid$ | 14 | TRUE |
$rowid$ | 39 | TRUE |
CALMONTH | 14 | TRUE |
$rowid$ | 64 | TRUE |
CALMONTH | 64 | TRUE |
CALMONTH | 52 | FALSE |
DOCNR | 52 | FALSE |
$rowid$ | 51 | FALSE |
$trex_udiv$ | 51 | FALSE |
...
Nope: alll as we expect.
With the statement being prepared and a value for the bind variable provided only the partitions that cannot be excluded are loaded.
Remember: partition pruning is all about being able to leave out partitions that can safely be ignored with the current set of constraints. If the constraint is left unspecific, e.g. because it's a bind variable, it cannot be used for any partition pruning.
Thanks for testing!
That's really strange, I repeated exactly your actions: unloaded the table, checked in M_CS_COLUMNS that the table is unloaded
COLUMN_NAME | PART_ID | LOADED |
SUMM | 52 | FALSE |
CALMONTH | 52 | FALSE |
DOCNR | 52 | FALSE |
$rowid$ | 51 | FALSE |
$trex_udiv$ | 51 | FALSE |
SUMM | 51 | FALSE |
CALMONTH | 51 | FALSE |
DOCNR | 51 | FALSE |
$rowid$ | 50 | FALSE |
$trex_udiv$ | 50 | FALSE |
after that hit "Prepare statement" in Studio for
select /*RUN #1*/ calmonth from pruning_test where calmonth = '201502' ;
And I see this picture in M_CS_COLUMNS:
COLUMN_NAME | PART_ID | LOADED |
CALMONTH | 63 | TRUE |
DOCNR | 63 | TRUE |
$rowid$ | 62 | TRUE |
$rowid$ | 20 | TRUE |
$rowid$ | 21 | TRUE |
$rowid$ | 61 | TRUE |
$rowid$ | 22 | TRUE |
$rowid$ | 60 | TRUE |
$rowid$ | 23 | TRUE |
$rowid$ | 59 | TRUE |
$rowid$ | 24 | TRUE |
$rowid$ | 58 | TRUE |
$rowid$ | 25 | TRUE |
$rowid$ | 26 | TRUE |
$rowid$ | 57 | TRUE |
$rowid$ | 27 | TRUE |
$rowid$ | 56 | TRUE |
CALMONTH | 15 | TRUE |
$rowid$ | 15 | TRUE |
DOCNR | 16 | TRUE |
All non-empty partitions are loaded, but only the columns: CALMONTH, DOCNR. $rowid$. All as I said before. The other strange thing that I tested this case on three different systems and different releases (SPS 08, 09 and 10) and behavior is exactly the same.
I repeated test many times but all remains the same. Here is SQL cache:
STATEMENT_HASH | PLAN_ID | CMD_START | IS_VALID | PREPARATION_COUNT | EXECUTION_COUNT | MAX_PREPARATION_TIME | MAX_TABLE_LOAD_TIME_DURING_PREPARATION |
286e61d6c8b1459dfdd57043604f38f0 | 2 995 400 002 | select /*RUN #1*/ calmonth fro | TRUE | 1 | 0 | 763 925 | 316 |
bbb26f2490a2156520d6debaf5dc9744 | 2 995 430 002 | select /*RUN #3*/ calmonth fro | TRUE | 1 | 0 | 291 493 | 308 |
e0e787daf339d949650c7dda719178dd | 2 995 420 002 | select /*RUN #2*/ calmonth fro | TRUE | 1 | 0 | 474 523 | 307 |
b2485c758d7c472d494f5820502d2aec | 2 995 440 002 | select top 10 statement_hash,p | TRUE | 1 | 3 | 30 369 | 0 |
2088c58eb6a9b37fb0c4a574a4690076 | 2 995 460 002 | select /*RUN #5*/ calmonth fro | TRUE | 1 | 0 | 366 770 | 308 |
64737b8c2199d5291db26e6699141b8b | 2 995 450 002 | select /*RUN #4*/ calmonth fro | TRUE | 1 | 0 | 284 762 | 302 |
Can it be that there'se something is not configured correctly here, or some component is missing, or some process is not started? Please advice further steps for analysys.
Hi! I checked partitionong trace, there's OK:
2016-04-04 10:09:14.871951 d partitioning Pruning.cpp(00510) : Pruning for index RYZHKOV:PRUNING_TEST (95566): considering part(s) 14, 39, 64 only. Partition spec is HASH 3 DOCNR; RANGE CALMONTH 201401,201402,201403,201404,201405,201406,201407,201408,201409,201410,201411,201412,201501,201502,201503,201504,201505,201506,201507,201508,201509,201510,201511,201512,*. Query values are ((<UNKNOWN> == 201502)).
But all non-empty partitions are loaded nevertheless.
Checked the case with SAP HANA Cloud Platform (Trial Developer Account, HANA XS (<shared>) 1.00.102.03.1449674847).
And I see exactly the same as my results - after preparing of the very first statement
select calmonth from pruning_test where calmonth = '201502';
on the newly created, partitioned, filled, merged and unloaded table I get this:
What's wrong?
Hmm... so you just executed the statement
select calmonth from pruning_test where calmonth = '201502';
right?
The thing is, that SAP HANA Studio by default always prepares and then executes the statement.
You can switch this off in the settings and then you should be able to distinguish between prepare and execute.
I recommend to also make use of JDBC tracing of SAP HANA studio, so that you have a better grip on what commands are actually send to SAP HANA.
Hi, Lars!
I ran "Prepare statement" in studio for
select calmonth from pruning_test where calmonth = '201502' /* some comment to ensure uniqueness */;
I unchecked checkbox "Prepare statements before executions", but nothing changed.
When the new statement is prepared or executed (before execution of the new query it must be pepared, right?) HANA loads almost all partitions as shown in the screenshot.
So what we have:
1. Every new statement must be prepared before first execution;
2. You say that Partition pruning works also during preparation. And on your test case it can be clearly seen that it is so - only partitions 14, 39, 64 are loaded during preparation of the new query;
3. When I try exactly the same test case on the HANA Cloud Platform dev. ed. rel. 102 (to eliminate the influence factor of the third-party system) I see that almost all partitions are loaded.
Could you try the same case on the HANA Cloud Platform dev. ed?
Hi!
I found that partition pruning behaviour during SQL preparation depends on UNLOAD PRIORITY table setting. With values 0 - 5 (my table has default setting - 5) partitioning is not taken into account during SQL statement preparation: mostly all partitions are loaded. But between values 6 and 9 pruning starts working also during preparation: only relevant partitions are loaded.
It's a bit confusing because according to official "help":
UNLOAD PRIORITY specifies that priority of table to be unloaded from memory.
No word on how that affects the loading behaviour.
Ok. Then, how to explain the following example? I run it on the SAP HANA Cloud Platform developer edition.
1.
alter table pruning_test unload priority 5;
unload pruning_test;
select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #01 */;
select column_name, m.part_id, range, loaded
from m_cs_all_columns as m
join M_CS_PARTITIONS as p
on m.schema_name = p.schema_name
and m.table_name = p.table_name
and m.part_id = p.part_id
where m.table_name = 'PRUNING_TEST'
order by loaded desc, m.part_id, column_name desc;
all partitions are touched:
2.
alter table pruning_test unload priority 6;
unload pruning_test;
select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #02 */;
select column_name, m.part_id, range, loaded
from m_cs_all_columns as m
join M_CS_PARTITIONS as p
on m.schema_name = p.schema_name
and m.table_name = p.table_name
and m.part_id = p.part_id
where m.table_name = 'PRUNING_TEST'
order by loaded desc, m.part_id, column_name desc;
much better! the relevant partitions 14, 39 and 64 are loaded! But what are these columns from partitions 22, 30 and 63? I did not request them! Why they have been loaded?
3. (the same as the stage 1)
alter table pruning_test unload priority 5;
unload pruning_test;
select calmonth from pruning_test where calmonth = '201502' /* 20.04.16 #03 */;
select column_name, m.part_id, range, loaded
from m_cs_all_columns as m
join M_CS_PARTITIONS as p
on m.schema_name = p.schema_name
and m.table_name = p.table_name
and m.part_id = p.part_id
where m.table_name = 'PRUNING_TEST'
order by loaded desc, m.part_id, column_name desc;
Again, everything is bad.
Ok, there's some mixup here.
The optimiser sampling results, which lead to the initial column load in your example with unload priority 5, are cached and kept.
When you run your second test with unload priority 6 these sampling results got reused and didn't trigger a column reload.
Once again: unload priority is evaluated only when SAP HANA determines column unloading. It's not used anywhere else, especially not in the query evaluation.
User | Count |
---|---|
86 | |
10 | |
10 | |
9 | |
6 | |
6 | |
6 | |
5 | |
4 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.