on 01-24-2013 4:08 PM
Hi Guru:
I have a SQL statement which will perform a left join between 2 tables, one has 10 million records, one has about 40K.
The join will take about 25s.
The statement is listed below, anyone can tell me how to improve it?
Thanks.
Eric
var_out = SELECT a.fiscalyear,
a.day,
a.comp_code,
a.profit_center,
a.cost_elem,
a.gl_account,
a.function_area,
a.bac_prod,
a.transaction_type,
a.version,
a.cost_center,
a.bac_data_src,
a.sub_accnt,
a.acc_act,
a.s_cost_center,
a.p_profit_ctr,
a.p_func_area,
a.p_comp,
a.segment,
a.p_segment,
sum(b.activity) as activity,
sum(b.activity) as balance,
sum(b.activity) / a.day as average_balance,
a.FISCALYEAR + a.DAY as fiscper,
a.date_sap,
'USD' as currency
from :all_comb as a
left join :all_activities as b
on a.fiscalyear = b.fiscalyear and
a.day >= b.day and
a.comp_code = b.comp_code and
a.profit_center = b.profit_center and
a.cost_elem = b.cost_elem and
a.gl_account = b.gl_account and
a.function_area = b.function_area and
a.bac_prod = b.bac_prod and
a.transaction_type = b.transaction_type and
a.version = b.version and
a.cost_center = b.cost_center and
a.bac_data_src = b.bac_data_src and
a.sub_accnt = b.sub_accnt and
a.acc_act = b.acc_act and
a.s_cost_center = b.s_cost_center and
a.p_profit_ctr = b.p_profit_ctr and
a.p_func_area = b.p_func_area and
a.p_comp = b.p_comp and
a.p_segment = b.p_segment
group by a.fiscalyear, a.day, a.comp_code, a.profit_center, a.cost_elem, a.gl_account, a.function_area, a.bac_prod, a.transaction_type,
a.version, a.cost_center, a.bac_data_src, a.sub_accnt, a.acc_act, a.s_cost_center, a.p_profit_ctr, a.p_func_area, a.p_comp, a.segment, a.p_segment, a.date_sap;
Hi Eric,
Just out of curiosity, can you please try the following.
Use the following hint at the end of the statement after group by clause:
with parameters ('request_flags'='ANALYZE_MODEL');
Disclaimer:
This above mentioned feature is not supported by SAP and has been quite debated and discouraged on this forum. Lars will personally kill me for suggesting this.
http://scn.sap.com/thread/3277920
I am not completely sure if it would improve the performance, but there are discussion threads / wiki pages about this hint, mentioning that it will improve the performance.
I haven't used it personally (for obvious reasons as it is not supported), but just curious to see if it really works as mentioned in the wiki page.
http://wiki.sdn.sap.com/wiki/display/EIM/Performance+of+Hana+-+Single+table+aggregation
Regards,
Ravi
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Ravindra,
how could I ever kill one of the most active persons in this forum? 🙂
But you're not too far off with your suspicion - I won't encourage the usage of the parameter/hint here.
Instead, what I would guess could be the problem is this:
The statement is an SQL statement joining two table variables.
To do that, as this is SQL and not CE-function world - HANA needs to execute whatever creates the tables referenced by the table variables first and then join these materialized temporary result sets.
In addition to this, the join condition is a huge list of fields, that might or might not be required for the join PLUS there is a unequal condition - something the join engine can only do under special conditions as of SPS 5 (look for temporal join!).
All in all I rather would take the approach to run the plan visualisation and check
a) what operation takes longes (don't forget to display the timelines)
and
b) what operation creates large intermediate result sets that need to be handed over to the next operation.
Could easily be that you can rewrite the whole statement into smaller ones with CE-functions that run way quicker.
- Lars
Hi Lars,
Do you have some more documentation on the temporal join? Can't find it anywhere, except for a really small paragraph in the HANA developer guide.
In my revision 47 HANA studio I see the properties of a join that you'll need for the temporal join, but can't get it to work. Keep getting the error "From and to field of the temporal join have to be defined as view attributes on the same path PATH_H_CONTRACT_2"
Best regards,
Stefan
This discussion is more than 2 years old and the SAP HANA system discussed here is no longer current.
Please do open a new discussion on your topic with information on how to reproduce the problem you're facing.
Also, looking back on the original question it occurs that one major mistake in the statement was to not aggregate before the join.
The original statement goes - simplified - like this:
select a.col1, a.col2, ..., sum(b.colA), sum (b.colB)
from a left outer join b
where
a.col1 = b.col1
and a.col2 => b.col2
...
group by a.col1, a.col2;
Assuming that b (in the original statement called "all_activities") is a transaction level table, grouping/aggregating the data first and join the result to table a afterwards could have already improved performance.
Anyhow: please open a new threat for your topic!
Cheers,
Lars (moderator)
If it must be scripted, use the CE function CE_LEFT_OUTER_JOIN.
It looks very close to something that could be modeled as analytic view, except for one join condition <=
Is there any way you could model it as an analytic view, and then consume this in scripted CalcView for any additional logic? Or use restricted measures or calculated attributes/measures?
Cheers,
Jody
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
80 | |
9 | |
9 | |
7 | |
7 | |
6 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.