Understanding SQT & DSI_SQT_MAX_CACHE - WARM STANDBY DATA FLOW
We have a warm standby replication server setup replicating 3 databases from PRIMARY to warm standby. As usual the biggest drag on the performance is the DSI performance issue and the WARM STANDBY server capacity to run the transactions. Presently DIST thread is on.
Our ASE version is 15.0.3ESD3(PDS & RDS) and OS is SUN Solaris SPARC.
Replication server uses rssd(15.7/EBF 24749 SMP SP135) and the Replication server is 15.7.1/EBF 24559 SP208 and the OS is Windows 2008 R2.
I wanted to confirm on 2 main parameters sqt_max_cache_size and dsi_sqt_max_cache_size, during the course of this discussion. Any inputs will be very helpful. It is on the context of the findings using the admin statistics and the whitepaper "sybase replciation server performance & tuning" by i believe Mr. Jeff Tallman.
1. In the whitepaper, page 149 ... it is mentioned that the WS-DSI threads read straight off the inbound queue and the sorting occurs in the cache represented by dsi_sqt_max_cache_size, effectively duplicating the sorting process carried out by the SQT thread.
a. So in this case, if the DIST thread is not set to off - the sorting happens twice ? Please confirm.
b. If the WS-DSI, is reading directly from the inbound queue - where is the SQT getting involved in this scenario? Isn't the data flow as below
REP AGENT -> REP AGENT USER THREAD -> SQM -> WS-DSI(S) -> WS-DSI(E) .
It is clear from the statistics though that, the SQT is being used, since we can see the CacheMemUsed value ranges from 5mb-20mb. The crux is to affirm that sorting is happening twice or once, if the DIST is set to on, in a WARM Standby environment.
2. What is the best way to find the appropriate DSI_MAX_CACHE_SIZE. Is is to just monitor "TransRemoved" i.e. counter 24009 and decrease it till we start having counter 24009 >0 multiple times ?
Our dsi_sqt_max_cache is set to 64mb(for our most active database), and sqt_max_cache_size is set to 32mb.
If we look at dsi_sqt_max_cache parameter "CacheMemUsed", we can always find that it is filled to the brim(64 mb as expected) and "TransRemoved" from dsi_cache is 0 but sqt_cache_size "CacheMemUsed" reached 20mb at max and at 32mb the "Transremoved" reached 1 a couple of time during Nightly
3. It is mentioned that the DIST can be set to off, safely in a Warm Standby. Can we set it to off and then to ON, at any instant ?
This query is asked in the context that i want to ensure that we have correct dsi_sqt_max_cache_size set. Because, when the SQT is down, i believe that our dsi sqt cache size will show up as 64mb(the actual size). At least now with sqt_max_cache_size of 32mb and "CacheMemUsed" of 20MB max, we can consider that our cache is more than what is required .... and without the SQT the dsi_sqt_max_cache_size is always shown to be full(because it fills up).
Furthermore, if SQT is down(due to DIST, being set to off), how will we find the # of open, closed, read and truncated command ?
4. What are the pros and cons of setting DIST off in a Warm Standby. Mainly can there be any -ve effect at all ?
5. Is there any good document for SQL DML replication? We are thinking of implementing this on our environment to fasten our replication further along with maybe parallel DSI.
6. For the SQT counters "SQTOpenTrans","SQTClosedTrans","SQTReadTrans" and "SQTTruncTrans" .. should we consider the counter_total,counter_max or counter_last columns of the rs_statdetail table ?
Is there any document which explains which of the counter columns must be considered for each of the display_name. Even after rs_statcounters.description colum, i wanted to know .. which counters matters most and what should be the corresponding column to draw inference about the counter.
Our observations for the admin_who & 10 minutes stas collection are attached in the text file and the .png file
Thank you very much!!