on 02-27-2013 3:30 PM
Hi All,
One of my requirements is to return a specified Percentile so I need a percentile calculation function for HANA.
In order to do this, I need to rank the data records by the specified value. It is then expected a user will enter in a percentile and only the value within that percentile is returned for the data record.
Below is an example of the logic and how this applies to the data.
The following raw data records are give, we need to get the 80% percentile:
ID Name Amount
1 Joe 100
1 Jane 30
1 David 200
1 John 10
1 Scott 50
Step 1: rank data records
ID Name Amount
1 John 10
1 Jane 30
1 Scott 50
1 Joe 100
1 David 200
Step 2: return record 80% percentile value.
In this case the 80% percentile value for ID 1 is 120
As far as I know, there are no standard ranking function in HANA. Do I need to use CE function to generate this logic – or is there a better method to do this?
How should I go about doing the percentile calculation?
Thanks for your help, any guidance will be appreciated.
A calculation view is used to define more advanced slices on the data in SAP HANA database.
Returns the first value whose cumulative distribution value is greater than or equal to the percentile value of a constant.
PERCENTILE_DISC( <constant_literal> ) WITHIN GROUP ( ORDER BY <expression> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ] ) <window_specification>
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
You have percentile function in HANA.
PERCENTILE_DISC and PERCENTILE_CONT.
Check the follow link for more details
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Marcus,
Here's a template for calculating percentiles with plain SAP HANA sql:
First create a table with sample data:
CREATE TABLE PERCENTILE_TEST AS (
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
UNION ALL
SELECT RAND()*1000 AS VAL FROM DUMMY
);
Then run the following sql:
SELECT
ROUND(RANKING.ROWNUM / COUNTING.ROW_COUNT * 100) PERCENTILE,
RANKING.VAL
FROM
(SELECT
VAL,
ROW_NUMBER() OVER (ORDER BY VAL ASC) ROWNUM
FROM
PERCENTILE_TEST) RANKING
INNER JOIN
(SELECT
CAST(COUNT(1)/10 AS INTEGER) PERCENTILE_STEP,
COUNT(1) ROW_COUNT
FROM
PERCENTILE_TEST) COUNTING
ON (MOD(RANKING.ROWNUM, COUNTING.PERCENTILE_STEP) = 0)
which will give you this (actual values will differ depending on which random values were generated above):
PERCENTILE;VAL
10;82.1
20;138.9
30;333.5
40;464.2
50;577.9
60;686.1
70;810.1
80;852.6
90;904.2
100;966.7
Hope that helps.
Best regards,
Niclas Kjäll-Ohlsson (Supply chain analytics @ Cisco)
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Marcus,
HANA have ranking functions but I confess I couldn't understand your 80% logic.
http://help.sap.com/hana/html/_esql_functions_window.html
Regards, Fernando Da Rós
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
86 | |
10 | |
10 | |
9 | |
7 | |
7 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.