on 07-10-2013 10:13 PM
Hi folks,
I have a feeling this question is right up Lars alley. I have a view where the user enters a posting date such as 02/01/2013. The view utilizes MKPF and MSEG and each are partitioned by MJAHR which is year such as '2013' as MJAHR is one of the primary keys in both tables. The performance is great when the user includes the year in their selections. However, since they are already choosing posting date we would like to simply extract the YEAR from their posting date selection. We have the code working however the problem is that the execution time takes twice as long. To get to the root of the problem I dissected the execution plan which was quite complex so as a test I created a much more simple SQL and compared the two to see why they were executing differently. In my dumbed down example below it actually takes 7 times as long to extract the year from the posting date! It seems the bulk of the problem is with the JEDistinctAttribute step of column search as well as the REDUCTION PHASE step of the column search. When comparing the two SQL execution plans visually they each have the exact same amount of boxes however the slow version is simply taking a lot longer for nearly each and every step. Here's examples;
CODE THAT RUNS IN 1.5 SECONDS
select | hdr.MBLNR | |
from | "MYSCHEMA"."MKPF" | hdr |
JOIN "MYSCHEMA"."MSEG" mat ON hdr.MBLNR = mat.MBLNR
where | ( |
hdr.MJAHR = '2013'
and hdr.BUDAT between TO_DATE('02-01-2013','MM-DD-YYYY') and TO_DATE('02-01-2013','MM-DD-YYYY')
and mat.BWART = 'MyMovementType'
and mat.WERKS = 'MyPlant')
GROUP BY hdr.MBLNR
CODE THAT DERIVES YEAR FROM POSTING DATE RUNS IN 7.5 SECONDS
select | hdr.MBLNR | |
from | "MYSCHEMA"."MKPF" | a11 |
JOIN "MYSCHEMA"."MSEG" mat ON hdr.MBLNR = mat.MBLNR
where | ( |
hdr.MJAHR between YEAR (TO_DATE('02-01-2013','MM-DD-YYYY')) and YEAR (TO_DATE('02-01-2013','MM-DD-YYYY'))
and hdr.BUDAT between TO_DATE('02-01-2013','MM-DD-YYYY') and TO_DATE('02-01-2013','MM-DD-YYYY')
and mat.BWART = 'MyMovementType'
and mat.WERKS = 'MyPlant')
GROUP BY hdr.MBLNR
Do you see any obvious problems with the slower SQL statement? Strangely in many of the JEDistinctAttribute step the 'number of minitasks' in the second SQL execution plan are often many times larger than the first but I'm not clear why. I figured I would post this here before creating a message with SAP. Thanks for any help or insight!
-Patrick
Hi Patrick,
Obviously I am not even 1% as expert as , and would definitely await his response on this thread ( I even tagged him here, so that it comes to his attention ), but can you please check the second query with COMPONENT function instead of YEAR.
Just out of curiosity and nothing else. Somehow I have a feeling that COMPONENT might be more efficient as compared to YEAR function.
Thanks and regards,
Ravi
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Guys I am able to dissect further and make the example even more simple;
if the where statement has this;
WHERE MJAHR = '2013'
it takes 1 second. If it contains this;
WHERE MJAHR = YEAR('2013')
it takes 7.5 seconds. So it's not even the BETWEEN function that's making it slow, it's the simple YEAR function. I would expect this function to perform once only so I don't understand why it's performing so much slower.
Hi Patrick,
One quick question, is the YEAR function in the SQL statement added by MSTR reporting. Does it come from the Calendar user Input. Is there a way you can avoid it being added to the generated SQL statement.
I'd suggest to open OSS note for this issue. I tried the similar query in my system on Revision 60, but did not face such a drastic performance issue.
Will try again few things and keep you posted.
Regards,
Ravi
HI Patrick and Ravi,
I only had the chance to look into this briefly (means: the following could be all nonsense ), but to me this looks like plain old implicit type cast problems.
The columns where the filter are defined upon are these (at least in my test system):
Name SQL Data Type Dim
BUDAT NVARCHAR 8
MJAHR NVARCHAR 4
But in your SQL you compare them against the result of YEAR() (which is an integer/number) or to_date, which is obviously a date datatype.
This leads to some conversion effort during the query execution:
select hdr.MBLNR
from "MKPF" hdr
JOIN "MSEG" mat ON hdr.MBLNR = mat.MBLNR
where
hdr.MJAHR = '2011'
and hdr.BUDAT between
TO_DATE('02-01-2011','MM-DD-YYYY')
and TO_DATE('02-01-2011','MM-DD-YYYY')
GROUP BY hdr.MBLNR
OPERATOR_NAME OPERATOR_DETAILS
COLUMN SEARCH HDR.MBLNR (LATE MATERIALIZATION, EVALUATOR-BASED JOIN CONSTRAINT)
DISTINCT GROUPING: HDR.MBLNR
JOIN JOIN CONDITION: (INNER) MAT.MBLNR = HDR.MBLNR
COLUMN TABLE
COLUMN TABLE FILTER CONDITION:
HDR.MJAHR = n'2011'
AND TO_DATE(HDR.BUDAT) <= TO_DATE('2011-02-01')
AND TO_DATE(HDR.BUDAT) >= TO_DATE('2011-02-01')
Note how both sides of the condition are implicitly converted into date.
select hdr.MBLNR
from "MKPF" hdr
JOIN "MSEG" mat ON hdr.MBLNR = mat.MBLNR
where
hdr.MJAHR between
YEAR (TO_DATE('02-01-2011','MM-DD-YYYY'))
and YEAR (TO_DATE('02-01-2013','MM-DD-YYYY'))
and hdr.BUDAT between
TO_DATE('02-01-2011','MM-DD-YYYY')
and TO_DATE('02-01-2013','MM-DD-YYYY')
GROUP BY hdr.MBLNR
OPERATOR_NAME OPERATOR_DETAILS
COLUMN SEARCH HDR.MBLNR (LATE MATERIALIZATION, EVALUATOR-BASED JOIN CONSTRAINT)
DISTINCT GROUPING: HDR.MBLNR
JOIN JOIN CONDITION: (INNER) HDR.MBLNR = MAT.MBLNR
COLUMN TABLE FILTER CONDITION:
TO_INT(HDR.MJAHR) <= 2013
AND TO_DATE(HDR.BUDAT) <= TO_DATE('2013-02-01')
AND TO_DATE(HDR.BUDAT) >= TO_DATE('2011-02-01')
AND TO_INT(HDR.MJAHR) >= 2011
Note how MJAHR is converted into integer here.
select hdr.MBLNR
from "MKPF" hdr
JOIN "MSEG" mat ON hdr.MBLNR = mat.MBLNR
where
hdr.MJAHR between
SUBSTR('02-01-2011',7,4)
and SUBSTR('02-01-2013',7,4)
and hdr.BUDAT between
to_dats(TO_DATE('02-01-2011','MM-DD-YYYY'))
and to_dats(TO_DATE('02-01-2013','MM-DD-YYYY'))
GROUP BY hdr.MBLNR
OPERATOR_NAME OPERATOR_DETAILS
COLUMN SEARCH HDR.MBLNR (LATE MATERIALIZATION)
DISTINCT GROUPING: HDR.MBLNR
JOIN JOIN CONDITION: (INNER) HDR.MBLNR = MAT.MBLNR
COLUMN TABLE FILTER CONDITION:
HDR.MJAHR >= n'2011'
AND HDR.BUDAT >= n'20110201'
AND HDR.MJAHR <= n'2013'
AND HDR.BUDAT <= n'20130201'
COLUMN TABLE
The TO_DATS function takes a sql date and creates the SAP string version from it.
As you see, now no conversion is done during runtime. Should run faster - but I need to leave the testing up to you...
Cheers.
Lars
Lars, this is why you are a legend here and my hero. It worked! At least it worked in my dumbed down version via SQL editor. All of this SQL is actually originally being generated from Microstrategy at query runtime so now I'm going to see if we can do the same thing there which I do not see why not. I'm going to have our Microstrategy teammate experiment with that now and close this thread shortly!
Thanks guys.
-Patrick
Hi Patrick,
I am sailing the same boat. MicroStrategy generated Queries are not optimized and there is hardly any scope to define the behavior of the generated queries. We cannot even take it up with SAP, as MSTR is not yet certified for the latest SP revision on HANA, especially around dates. There is explicit conversion of date using to_date function even if the data stored in HANA is in HANA date format.
Thanks for bringing up an important issue. And, as always, very informative reply from "Lars, The Amazing".
Regards,
Ravi
Lars, Ravi & Patrick,
You guys Rock when it comes to sharing knowledge and experiences with HANA development! I admire your patience, dedication and focus in continuous nourishing your HANA skills with the latest!
You guys are Iconic when it comes to HANA technical experience in the SDN forums. Not to forget few other names such as John Appleby, Bala Prabhakar (basis) & Neha Singla who are not much active these days! I am sure they will come back too!
Thanks Guys!
Sincerely,
Rama Shankar
User | Count |
---|---|
101 | |
13 | |
13 | |
11 | |
11 | |
7 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.