Design Studio 1.5: View on Parallel Data Source Execution
Parallel execution of data sources is an important feature. It can speed up your application, but it comes with costs that you need to consider. This blog is part of the series "Design Studio 1.5, View on..." and covers the function "Parallel Data Source Execution". For more topics see Design Studio 1.5: What's New in? A (technical) View.
Parallel data source execution was a long-awaited functionality in Design Studio.
The Starting point.
Design Studio in previous releases executed all data sources in sequential order. This means that the total execution time was always the sum of the execution times of all data sources. This behavior could not be changed; the only possibility was to use background processing to initialize some of the data sources a bit later, when some data had already been presented to the user. But this option also has a drawback in queries with variables, due to the merge scenario setup (all queries were reloaded whenever a variable was changed).
What Can You Expect?
Release 1.5 now allows you to execute data sources in parallel. Before you do this, I will discuss the technology behind it and visualize how it affects your system landscape and, potentially, the sizing of the system.
Basic Implementation Details.
I will start with a simple application: 2 data sources (same query) including some variables. When executed without parallel data source execution, the execution times are as shown in the chart below. All times are just examples, used only to visualize the parts that add up to the complete application execution.
Pic. 1. Example of an application w/o parallel execution.
The times which are affected by parallel execution are marked in blue: these are the OLAP and BICS parts of the data source processing time (the OLAP part is the bigger one). This overview does not go into much detail here, as the execution time itself will not change; only the way it is executed will change.
When we look at the total execution time, it consists of a few phases:
- the startup time (redirect on the platform to the application, login time, etc.)
- the load of the framework (probably shorter than in the example, but this depends on the cache, on the PC where you start the browser, and on the browser itself)
- then the Q-startup (Q stands for Query) - this is the data initialization of the application, including the assignment of the data source, up to the submit of variables
- this one can be divided into the OLAP (SAP BW) processing of the query
- and into the rendering of the application
What happens during rendering?
Also, the visualization above does not go into much detail on the rendering time. Of course, some components (like CROSSTAB, CHART or other data-bound components) require the result set. Its retrieval is triggered at the point in time when it is required for the first time; e.g. the getData() function also triggers this.
When we take a detailed view of the rendering part, we can divide it into
- the data (result set) acquisition, which requires calls into SAP BW, and
- the application processing based on the data provided by the result set
The first one, data acquisition, is also executed in parallel, so all user interactions are quicker when parallel execution is activated.
The second one is always executed in the main session and is not parallelized.
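The split between parallel data acquisition and non-parallel application processing can be sketched in a few lines of Python. This is only an illustration of the pattern, not Design Studio code: `fetch_result_set`, `render_component`, the data source names and the sleep times are all hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_result_set(name, backend_seconds):
    """Hypothetical stand-in for a result set request to the back end."""
    time.sleep(backend_seconds)  # simulates the back-end processing time
    return {"source": name, "rows": 3}


def render_component(result_set):
    """Hypothetical stand-in for the application processing of one result set."""
    return f"{result_set['source']}: {result_set['rows']} rows"


def render_all(sources):
    # Data acquisition: one worker per data source, all requests run at once.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(fetch_result_set, n, s) for n, s in sources.items()]
        result_sets = [f.result() for f in futures]  # wait for every request
    # Application processing: sequential, in the calling ("main") thread.
    return [render_component(rs) for rs in result_sets]


rendered = render_all({"DS_1": 0.05, "DS_2": 0.05})
print(rendered)
```

The two simulated requests overlap in time, while the processing step afterwards still handles one result set after the other.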
Pic. 2. Comparison of the execution with parallel workflow.
Where are the changes?
- As you can see, the initial start-up and session creation stay the same, so you will not see any difference in this area.
- Then, with parallel data source execution, the first query is requested in the main session and the second one in a new session. Of course, the new session must be created first (block: NewSession1). This is a bit of overhead, but not too much, depending on the system landscape and the network between BIP and SAP BW.
- Potentially there will be some "wait" time in the session that finishes earlier. This is required to synchronize the script execution on the data sources again, since script execution is not parallelized.
- During rendering, the result set request is also parallelized, which means that the rendering time itself is reduced as well.
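As a rough back-of-the-envelope model of the points above, the following Python sketch compares both modes. It assumes one processing group per additional data source and ignores the non-parallel script and rendering parts; the function names and all numbers are invented for illustration and are not measurements.

```python
def sequential_total(startup, per_source):
    """Sequential mode: the data source times simply add up."""
    return startup + sum(per_source)


def parallel_total(startup, per_source, new_session_overhead):
    """Parallel mode: the first source runs in the main session, every other
    source runs in its own new session and pays the session-creation overhead.
    The slowest session determines the total; quicker sessions wait for it."""
    main_session = per_source[0]
    extra_sessions = [new_session_overhead + t for t in per_source[1:]]
    return startup + max([main_session] + extra_sessions)


# Invented example values (seconds): 2 data sources running the same query.
print(sequential_total(2.0, [3.0, 3.0]))     # 8.0
print(parallel_total(2.0, [3.0, 3.0], 0.5))  # 5.5
```

In this model the gain shrinks when one data source dominates the others, because the total is bounded by the slowest session.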
How does it work for multi-system landscape?
Parallel data source execution is not bound to SAP BW only; applications which are connected to other systems (HANA, BI Universe) will benefit from it as well. The reason is that the data source provider part is executed in separate threads and separate sessions on the server, and this solution is independent of the source system.
Here is a small comparison of how this looks across the complete system landscape.
Pic. 3. System landscape with NO parallel execution.
Without parallel query execution, there is only one connection between the systems, starting from the browser to the BIP web layer and the BIP APS layer. The Design Studio application then opens one session per system used in the data source components. In summary, one user executing the application causes only one session at every level.
This changes when parallel execution of data sources is used. This is the view with parallel execution. The example assumes a few execution groups distributed across the data sources (which does not necessarily make sense in this constellation).
Pic. 4. System landscape with USE OF parallel execution.
What has changed now? The browser still has only one session, and the same is true for the BIP web layer and the BIP APS layer. The change starts between the BIP APS layer and the source systems. With parallel data source execution, the BIP APS layer opens more connections to the source systems (equal to the number of defined groups plus the main connection). This leads to a short discussion of the sizing changes required on all levels.
View on sizing and back-end sessions.
In general, sizing is always divided into memory and processor power.
The general memory consumption should not change in a significant way. The reason is that parallel data source execution still needs to hold the objects in memory, regardless of which execution group they come from. Of course, there is a bit of memory overhead for handling the sessions, but this is nothing we should worry about now.
The processing power of the system landscape will be affected more, especially on the SAP BW server. As parallel data source execution opens more sessions and requests all data (result sets) from the SAP BW system at the same time, the SAP BW system gets a higher load, but of course only for a shorter time frame. This means that the processing power calculation now also needs the "average number of parallel processing groups in applications" as a parameter. The calculation itself is similar to the one in SAP Note 1177020 - SAP BusinessObjects Design Studio - Sizing Information. The parallel sessions should be considered as additional "concurrent users".
If you are using SAP BW, also check the sizing for RFC sessions; see Configuring the SAP System for Parallel RFCs.
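One possible reading of "additional concurrent users" is sketched below in Python. This is an assumption on my side and does not reproduce the formula from the SAP Note; the function name and the example values are hypothetical.

```python
def effective_concurrent_users(concurrent_users, avg_parallel_groups):
    """Each user counts once for the main session plus once per parallel
    processing group (assumed interpretation, not the SAP Note formula)."""
    return concurrent_users * (1 + avg_parallel_groups)


# Invented example: 100 concurrent users, on average 2 groups per application.
print(effective_concurrent_users(100, 2))  # 300
```

The result would then replace the plain user count as input to the processing power calculation.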
Of course, as the processing is parallel, the execution is squeezed into a shorter time frame, which creates a higher load on the server.
Pic. 5. Server load w/o parallel data source execution
Pic. 6. Influence of parallel query execution on server load (dotted line is the non-parallel load)
Looking at the load above, we can see that the server load accumulates (as the parallel execution accesses the SAP BW server from both processing groups at the same time). This influences the required processing power and creates higher load peaks, during both initialization and result set acquisition.
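The accumulation can be illustrated with a toy load profile in Python. The model is deliberately simplified and all values are invented: every data source causes one unit of load per time slot while it is running, and `load_profile` is a hypothetical helper.

```python
def load_profile(durations, parallel):
    """Return the server load per time slot.

    Sequential mode runs the data sources back to back; parallel mode
    starts all of them at slot 0, so their loads add up."""
    horizon = max(durations) if parallel else sum(durations)
    profile = [0] * horizon
    start = 0
    for duration in durations:
        for slot in range(start, start + duration):
            profile[slot] += 1
        if not parallel:
            start += duration  # the next source starts when this one ends
    return profile


print(load_profile([3, 3], parallel=False))  # [1, 1, 1, 1, 1, 1]
print(load_profile([3, 3], parallel=True))   # [2, 2, 2]
```

The total work (the sum over all slots) is the same in both modes; only the peak and the duration change.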
View on Server Sessions (example on SAP BW ABAP).
Every parallel execution group opens a new session against the SAP BW ABAP stack. You can monitor the sessions in transaction SM04. What you should see in local execution mode is:
|User Action||Number of Sessions|
|initialized Design Studio Client||1|
|opened application in design mode||2|
|executed application in local mode||+ 1 (+ 1 per group of data sources)|
|data source initialized||+ 1 per group|
|application closed||- (1 + 1 per group)|
This means that during execution you will see an increasing number of sessions per application opened by the user, until all groups are initialized. The sessions are held by the application until the application is unloaded! As of release 1.5, there is no option to "close active session". Only an explicit user log-off closes all sessions against the SAP BW system.
The most impacted setting will probably be the number of sessions that the system allows on RFC connections. The default is 300, which means (considering only DS sessions) that a runtime user (no client tool in use) can open 300 applications without parallel data source execution. When parallel execution is introduced and the application has 2 additional processing groups, the maximum number decreases to 100 (as every application consumes 3 sessions: the main one and 2 for the processing groups).
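The arithmetic behind these numbers, as a small Python sketch. The function name is hypothetical; the limit of 300 and the 2 processing groups are the values from the example above.

```python
def max_open_applications(session_limit, processing_groups):
    """Every application holds the main session plus one session per
    processing group until it is unloaded."""
    sessions_per_application = 1 + processing_groups
    return session_limit // sessions_per_application


print(max_open_applications(300, 0))  # 300, no parallel execution
print(max_open_applications(300, 2))  # 100, main session + 2 groups
```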
Parallel data source processing makes the execution quicker, but you need to consider the higher load on the SAP BW ABAP server.
Parallel data source execution can be activated only when variable un-merge is activated; see Design Studio 1.5: View on Variable Unmerge Scenario.
Perhaps not all aspects are covered yet; in case of questions, feel free to ask.