SAP Crystal Reports Engine in a Multi-Threaded Visual Studio Application
Because the SAP Crystal Reports engine uses a Concurrent Processor License model limited to three licenses, creating a multi-threaded application should theoretically increase report processing performance. This document explains the details and ramifications of threading when using the SAP Crystal Reports engine.
Test results show that for most applications, threading has limited value and may even appear to negatively impact performance.
It is occasionally tempting to employ threading in an application that uses the SAP Crystal Reports Engine to process reports, on the assumption that threading will speed up the engine. The sections below discuss the technical details and consequences of doing so.
SAP Crystal Reports and the Concurrent Processor License Model
The SAP Crystal Reports Engine uses the Concurrent Processor License (CPL) model. Under this model, the engine is limited to three CPL, meaning that three report processing requests will be accepted at the same time. Subsequent requests are queued and processed when a license becomes available, that is, when a report completes processing, be it a view, export or print. The Windows Event Viewer will display the following warning regarding queued requests:
A Crystal Reports job was delayed x second waiting for a free license to become available. More licenses can be purchased direct from Crystal Decisions or through the Crystal Decisions Online Store.
Note that the message is misleading: it is not possible to purchase more licenses for SAP Crystal Reports. However, more licenses can be obtained with SAP Crystal Reports Server and SAP BusinessObjects BI Platform 4.0.
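The queuing behavior described above can be pictured as a counting semaphore with three permits. The following is a minimal sketch in Python, used here only for illustration. It does not call the Crystal Reports engine; `LicenseGate`, `run_requests` and the sleep that stands in for report processing are all hypothetical names and assumptions, not SAP APIs.

```python
import threading
import time

LICENSE_LIMIT = 3  # assumption: mirrors the three-CPL limit described above


class LicenseGate:
    """Toy stand-in for the engine's license check (not an SAP API)."""

    def __init__(self, limit=LICENSE_LIMIT):
        self._licenses = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.active = 0  # requests currently holding a license
        self.peak = 0    # highest value of `active` seen so far

    def process(self, work_seconds=0.02):
        # A request blocks here while all licenses are in use (the queued state).
        with self._licenses:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            time.sleep(work_seconds)  # pretend to view, export or print
            with self._lock:
                self.active -= 1


def run_requests(request_count):
    """Fire request_count simultaneous requests; return the peak concurrency."""
    gate = LicenseGate()
    threads = [threading.Thread(target=gate.process) for _ in range(request_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.peak
```

Calling `run_requests(12)` starts twelve requests at once, yet the recorded peak never exceeds three; the remaining requests simply wait for a permit, which is the situation the Event Viewer warning reports.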
Testing SAP Crystal Reports in a Multi-threaded Application
Tests were performed using the Crystal Reports Component Engine (In-process RAS) to print reports using different combinations of thread counts and numbers of CPUs on the processing server. The test application ran from 1 to 16 threads on servers with 1, 2, 4 and 8 CPUs.
The results showed that adding threads and CPUs caused degradation in performance due to thread blocking. This follows from the Crystal Reports engine servicing a maximum of three concurrent requests (the three-CPL limit) and from the fact that the reporting engine is tuned to share hardware with an application, such that it does not spawn threads at a rate that will consume all CPUs on a typical modern server. The following figures show the amount of time (in seconds, from the time a report is opened to the time it is closed) for a report to be printed by an application running either 3 or 12 threads on both 1 and 2 CPU servers.
Fig 1. Time (per report) to print 200 reports on a Single CPU machine
Fig 2. Time (per report) to print 200 reports on a Dual CPU machine
From the figures above, it is clear that when running 12 threads on either a single or dual processor machine, about 15 to 20% of reports took over 10 seconds to complete. When the application's thread count matched the three allowed concurrent requests, almost all reports completed in 3 seconds or less. When the application spawns as many as 12 threads, the engine can still service only 3 of them concurrently. Any additional threads are blocked until a license becomes available.
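The effect of blocked threads on per-report time can be reproduced with a small deterministic model. The sketch below is not the test harness used for the figures and does not reproduce the measured numbers. It assumes three licenses, a fixed one-second service time per report, first-come-first-served license hand-out (which the real engine does not guarantee), and worker threads that open their next report as soon as the previous one closes. The function name `simulate` and all parameters are hypothetical.

```python
import heapq


def simulate(report_count, thread_count, licenses=3, service_time=1.0):
    """Return (per_report_times, total_time) for a toy FIFO license model.

    per_report_times[i] is the time from opening report i to closing it,
    including any time spent waiting for a license.
    """
    license_free_at = [0.0] * licenses
    heapq.heapify(license_free_at)

    # Every worker thread opens its first report at time 0.
    first_batch = min(thread_count, report_count)
    waiting = [(0.0, report_id) for report_id in range(first_batch)]
    heapq.heapify(waiting)
    next_report = first_batch

    per_report = [0.0] * report_count
    total_time = 0.0
    while waiting:
        opened_at, report_id = heapq.heappop(waiting)
        free_at = heapq.heappop(license_free_at)
        started_at = max(opened_at, free_at)  # wait if no license is free yet
        closed_at = started_at + service_time
        heapq.heappush(license_free_at, closed_at)
        per_report[report_id] = closed_at - opened_at
        total_time = max(total_time, closed_at)
        # The thread that just finished immediately opens the next report.
        if next_report < report_count:
            heapq.heappush(waiting, (closed_at, next_report))
            next_report += 1
    return per_report, total_time
```

Under these assumptions, `simulate(200, 3)` reports one second for every report, while `simulate(200, 12)` reports up to four seconds per report, because each report is opened and then waits behind nine others. The total time for the batch is 67 seconds in both cases, since the three licenses are the bottleneck in either configuration.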
Assume there are requests for 20 reports to be processed in order from 1 to 20:
In an application spawning 3 threads, the reports were processed in the following order:
In an application spawning 12 threads, the reports were processed in the following order:
The change in order for the 12-thread scenario reflects threads being queued until a license becomes available. There is no guarantee that the first thread to be blocked will be the first thread serviced when a license becomes available. This behavior is by design, as described in the article Crystal Reports 2008 Component Engine Scalability.
Timing tests were performed using the Crystal Reports Component Engine (In-process RAS) to print 200 reports using an application spawning from 1 to 16 threads on machines with 1, 2, 4 and 8 CPUs.
Figure 3 shows that, in general, adding threads neither improves nor degrades the amount of time taken to process the entire run of reports. In combination with Figures 1 and 2, this shows that even though some threads may take longer to return, the overall time taken to process the same batch of print jobs remains relatively constant for a specific machine configuration.
Fig 3. Number of threads vs. time to process 200 reports
Single and dual processor machines behave comparably with respect to the length of time taken to print 200 reports. This holds regardless of the number of threads, even though some threads appear to take longer to process (i.e. threads being blocked while waiting for a free license). Moving to 4 or 8 CPUs does degrade performance, most likely for the reason given in the following quote from the Crystal Reports 2008 Component Engine Scalability article:
“The component reporting engines are tuned to share hardware with an application, and as such do not spawn threads at a rate that will consume all CPUs on a typical modern server.”
Threading SAP Crystal Reports requests does not improve the performance of the application. In some instances, performance may even appear worse in a multi-threaded application. If there is a requirement for sequential processing of the reports, for example batch printing, the optimal configuration would consist of two CPUs and three threads, matching the three CPL.
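One way to keep an application's thread count aligned with the license limit is a fixed-size worker pool. The sketch below uses Python's standard `ThreadPoolExecutor` purely as an illustration of the idea; the same pattern applies to a bounded pool in a Visual Studio application. `print_batch` and `FakePrinter` are hypothetical names, and `FakePrinter` is a placeholder for whatever call actually prints a report, not an engine API.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

LICENSE_LIMIT = 3  # assumption: matches the three CPL; not read from the engine


def print_batch(report_names, print_report, workers=LICENSE_LIMIT):
    """Run print_report over report_names with at most `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() queues jobs in submission order and returns the results
        # in the same order as report_names.
        return list(pool.map(print_report, report_names))


class FakePrinter:
    """Hypothetical stand-in for a real print call; records concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __call__(self, name):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # pretend to print the report
        with self._lock:
            self.active -= 1
        return name + ".done"
```

With the pool capped at three workers, no job is opened before a license could be available for it, so no thread sits blocked inside the engine, and the results come back in the order the reports were requested.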
Dan Paulsen did all of the testing and analysis for this issue. He also created an internal article describing the issue. This Jive Document is only a formatted version of Dan's original article, with a few references added for completeness.
Questions regarding this document should be posted as a Discussion in the SAP Crystal Reports, version for Visual Studio SCN Space.