Run HANA Stored procedure in parallel
Dear all,
I have created a stored procedure in HANA which will select the parameter "Property" into an array, iterate over this array and call a second stored procedure ("pattern_Identification") with the parameter "Property" as input parameter. As I have around 20 different "Property" parameters the procedure "pattern_Identification" is called several times sequential and makes the whole process slowly. That's why I would like to parallelize the process.
Please let me know how I can use HANA capability to call this stored procedure parallelly ?
Regards,
Matthias
PROCEDURE "MY_SYSTEM"."My_Project.procedures::run_Pattern_Identification" (
IN personID ALPHANUM(10)
)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
DEFAULT SCHEMA TEST_SYSTEM
AS
BEGIN
DECLARE property_Array VARCHAR(50) ARRAY;
DECLARE a_Length INT;
DECLARE a_Index INT;
DECLARE property VARCHAR(50);
property_Tab = SELECT DISTINCT "PROPERTY" FROM "ENVIRONMENTAL_DATA";
property_Array := ARRAY_AGG(:property_Tab.PROPERTY);
a_Length := CARDINALITY(:property_Array);
FOR a_Index IN 2 .. a_Length DO
property := :property_Array[:a_Index];
CALL "test_Project.procedures::pattern_Identification"(:property); -- Call could be parallelly
END FOR;
END;
Tags:
Lars Breddemann replied
Hi Matthias,
I know I am a bit late in this discussion, but it stuck to the back of my mind.
The main question is of course: why the procedure calls at all?
Looking at what happens in the procedures I figured it's this:
- find all different properties that had been collected so far
- for each of the properties
- find all days with discomfort PLUS the 5 days before that day
- for every day
- read the date and the value of the current property (if available on this date)
- store the result in a table
That's definitively doable in a single SQL statement:
select row_number() over(partition by dates.date order by dates.off desc ) sequence_id
, dates.date
, dates.off as "eventID"
, day_before
, prop.property
, prop."VALUE"
FROM
(select dc."DATE", date_range.off, add_days(dc."DATE", date_range.off) as day_before
from discomfort_t dc
cross join (select 0 as off from dummy union all
select -1 as off from dummy union all
select -2 as off from dummy union all
select -3 as off from dummy union all
select -4 as off from dummy union all
select -5 as off from dummy) date_range) dates
inner join
(select "DATE", property, "VALUE" from environmental_data) prop
on dates.day_before = prop."DATE"
order by dates.date asc, "eventID" desc
The dates "union"-sub-select generates the date offsets you are looking for.
This gets joined with the dates of days of discomfort.
We now have a list of dates of discomfort and the five days before that.
This then gets joined with all the measures properties and values.
Finally the row_number() produces the "sequence_id" starting new with every date of day of discomfort.
I don't have any test data of reasonable size but I don't see why this shouldn't be quicker then the two procedure approach...
Cheers,
Lars
So, now