cancel
Showing results for 
Search instead for 
Did you mean: 

how to calculate the cumulative sum with certain interval in SAP HANA

Former Member
0 Kudos

Hi Experts,

I have a scenario in which I have a column with Boolean values and need to calculate the cumulative values in the interval where there are consecutive ones as mentioned below,

ID          Date                           column 1

1         9-14-2014 14:22:00          1

1          9-14-2014 14:25:00          1

2          9-14-2014 13:22:00          0

1          9-14-2014 15:02:00          0

1          9-14-2014 14:37:00          0

2          9-14-2014 14:25:00          1

2          9-14-2014 14:32:00          1

1          9-14-2014 14:05:00          1

2          9-14-2014 14:45:00          0

2          9-14-2014 14:59:00          0

1          9-14-2014 15:12:00          1

1          9-14-2014 15:18:00          1

1          9-14-2014 15:21:00          1

First needs to group by 'ID' and Order By ' Date' and calculate the calculated column 'cumulative sum' for the consecutive ones in 'column1' as

ID          Date                           column 1          Cumulative sum

1          9-14-2014 14:05:00          1                    1    

1          9-14-2014 14:22:00          1                    2

1          9-14-2014 14:25:00          1                    3

1          9-14-2014 14:37:00          0                    0

1          9-14-2014 15:02:00          0                    0

1          9-14-2014 15:12:00          1                    1

1          9-14-2014 15:18:00          1                    2

1          9-14-2014 15:21:00          1                    3

2          9-14-2014 13:22:00          0                    0

2          9-14-2014 14:25:00          1                    1

2          9-14-2014 14:32:00          1                    2

2          9-14-2014 14:45:00          0                    0

2          9-14-2014 14:59:00          0                    0



Is there any function and way to calculate this  without loops in procedure?


Please help!!. Thank you!!


-Gayathri



Accepted Solutions (0)

Answers (4)

Answers (4)

Former Member
0 Kudos

create column table TAB(ID int, DATE TIMESTAMP, VAL int);

insert into tab values(1,'2014-09-14 14:22:00',1);

insert into tab values(1,'2014-09-14 14:25:00',1);

insert into tab values(2,'2014-09-14 13:22:00',0);

insert into tab values(1,'2014-09-14 15:02:00',0);

insert into tab values(1,'2014-09-14 14:37:00',0);

insert into tab values(2,'2014-09-14 14:25:00',1);

insert into tab values(2,'2014-09-14 14:32:00',1);

insert into tab values(1,'2014-09-14 14:05:00',1);

insert into tab values(2,'2014-09-14 14:45:00',0);

insert into tab values(2,'2014-09-14 14:59:00',0);

insert into tab values(1,'2014-09-14 15:12:00',1);

insert into tab values(1,'2014-09-14 15:18:00',1);

insert into tab values(1,'2014-09-14 15:21:00',1);

select ID,DATE,VAL,row_number() over (partition by ID,GROUPID order by DATE) as CUMULATIVE_SUM from (

select *,(row_number() over (partition by id order by date)-row_number() over (partition by id,val order by date)) as GROUPID

from TAB

)

where VAL = 1

union all select ID,DATE,VAL,VAL as CUMULATIVE_SUM from TAB where VAL = 0

order by ID,DATE

Former Member
0 Kudos

This is achieved using a window function. There's a good example in this SCN post. It's a SQL thing not a HANA-specific thing.

http://scn.sap.com/thread/3160793

henrique_pinto
Active Contributor
0 Kudos

Just to be clear, you need to sum 1 if it's 1 and = 0 if it's 0?

If so, the same LAG() function I talked about in the other question you made should suffice.


Something like:


select id, date, col1, ((lag(col1, 1, 0) over (order by id, date) +col1)*col1) as cummulative_sum from table

The idea is to get the last value from the cummulative sum and sum with the current column 1 value and then multiply by the same column 1 current value. If it's 0, you'll get 0, if it's 1, you'll get it incremented by 1, since the multiplication by 1 is the same value.

This only works if column 1 is always either 0 or 1.

If it can assume other values, then use a case clause checking over the value of column 1:


select id, date, col1, (case col1 when 0 then 0 else col1 + lag(col1,1,0) over (order by id, date) end) as cummulative_sum from table

Former Member
0 Kudos

Hi Henrique,


the SQL,

select id, date, col1, ((lag(col1, 1, 0) over (partition by ID order by  date) +col1)*col1) as cummulative_sum from Table

Give the result as mentioned below, which is not the requirement I am looking for . I have also given the correct values of my requirement below,

ID          Date                           column 1          Cumulative sum     correct Cum_sum

1          9-14-2014 14:05:00          1                    1                              1

1          9-14-2014 14:22:00          1                    2                              2

1          9-14-2014 14:25:00          1                   2                              3

1          9-14-2014 14:37:00          0                    0                              0

1          9-14-2014 15:02:00          0                    0                              0

1          9-14-2014 15:12:00          1                    1                              1

1          9-14-2014 15:18:00          1                    2                              2

1          9-14-2014 15:21:00          1                    2                              3

2          9-14-2014 13:22:00          0                    0                              0

2          9-14-2014 14:25:00          1                    1                              1

2          9-14-2014 14:32:00          1                    2                              2

2          9-14-2014 14:45:00          0                    0                              0

2          9-14-2014 14:59:00          0                    0                              0

we need start the cumulative sum for every time the column 1 has consecutive 1s

henrique_pinto
Active Contributor
0 Kudos

another possibility is to partition over column1 and sum on each partition:


SELECT id, date, col1, sum(col1) over (partition by col1 order by id, date) as cummulative_sum from table1;

Former Member
0 Kudos

Hmmmm. Now I understand the requirement I don't see how this can be done in SQL on any database. I believe you will have to iterate, or change the requirement.

You may want to ask this on a forum like Stack Overflow, where they have SQL gurus. It's possible someone may have an idea.

Former Member
0 Kudos

Hi Henrique,

The SQL,

SELECT id, date, col1, sum(col1) over (partition by col1 order by id, date) as cumulative_sum

will give me the cumulative value for all the col 1 with same value which is not my requirement,

the above will give the result as shown below. I have also added the correct values as per my requirement as well

ID          Date                           column 1          Cumulative sum     correct ***_sum

1          9-14-2014 14:05:00          1                    1                              1

1          9-14-2014 14:22:00          1                    2                              2

1          9-14-2014 14:25:00          1                   2                              3

1          9-14-2014 14:37:00          0                    0                              0

1          9-14-2014 15:02:00          0                    0                              0

1          9-14-2014 15:12:00          1                   4                              1

1          9-14-2014 15:18:00          1                   5                              2

1          9-14-2014 15:21:00          1                    6                              3

2          9-14-2014 13:22:00          0                    0                              0

2          9-14-2014 14:25:00          1                    1                              1

2          9-14-2014 14:32:00          1                    2                              2

for my requirement , I need to find the cumulative value only for the consecutive 1s if there are 0s in between then there should be a break and cumulative sum should start again for the next set of consecutive 1s

Hope you could help. Thanks.

-Gayathri

henrique_pinto
Active Contributor
0 Kudos

Dude, you're allowed to think, you know.

I was just giving some ideas for you to work on, but apparently you don't want anything but the whole solution. That's not what SCN is about. You might as well go and search for an online consultancy site.

Former Member
0 Kudos

Apologies that my replies gave such an impression. I just wanted to explain my requirements correctly. but, Even before starting this discussion I have tried all these solutions you provided and also some other possibility of using window functions, self joins and so .

Only possibility which is giving an correct answer would be using loops in SQL Script but this will lead to performance issue whit large number of data which I do not want. That's the reason I posted this to know if I could take a fresh look of the problem other than using the same methods mentioned in my other discussion

http://scn.sap.com/thread/3618682

Former Member
0 Kudos

Try taking my advice above and post on StackOverflow, and ask how to do this in SQL. I don't believe there is a way because you are looking for a function for which I can't see an obvious set-based solution.

Whatever you do, don't use SQLScript loops for this, they will suck.

Instead, bring the result into an array in the front end and write some C++/Java/Python/JavaScript to achieve your result.

Why do you think performance would be a problem for this? I don't see a way to aggregate your dataset so you must be supplying the whole dataset back to the client. Presumably it's not all that big.

henrique_pinto
Active Contributor
0 Kudos

You could have just said all that in your original post and save my and (everyone else who spent some thought into it)'s time.

As John said, this is not a HANA specific question but rather a SQL question.

StackOverflow is a good starting point, but it brings to me the realization that we do not have a proper place within SCN to ask SQL questions. We have for ABAP & Java, might be time for us to have a space for SQL & JavaScript as well.

Former Member
0 Kudos

It's possible Henrique, but this forum is hardly so oversubscribed that it needs splitting.

Problem is there are few people with hardcore SQL expertise here!

henrique_pinto
Active Contributor
0 Kudos

That's the point - maybe splitting it up would attract those kind of audience.

Former Member
0 Kudos

Fair point

sreehari_vpillai
Active Contributor
0 Kudos

not clear about the requirement

Former Member
0 Kudos

Hi Sreehari,

I will simplify the requirement , below is the sample data, the explanation of the calculation involved,

ID          Date                           column 1          Cumulative sum       Comment

1          9-14-2014 14:05:00          1                    1     (col1) =1

1          9-14-2014 14:22:00          1                    2     (col1+previous cumulative sum) =1+1

1          9-14-2014 14:25:00          1                    3     (col1+Previous cumulative sum) =1+2

1          9-14-2014 14:37:00          0                    0     (col1)=0

1          9-14-2014 15:02:00          0                    0     (col1)=0

1          9-14-2014 15:12:00          1                    1     (col1)=1

1          9-14-2014 15:18:00          1                    2     (col1+previous cumulative sum)=1+1

1          9-14-2014 15:21:00          1                    3     (col1+previous cumulative sum)=1+2