on 10-26-2015 11:42 AM
Hi Experts.
good evening everybody.
Could you please let me know what are properties and where we need to enable for semi join .
Best Regards,
Reddeppa K.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi,
check the value MIN_ACTIVATION_THRESHOLD_FOR_ SEMI_JOIN_RULE in the DF admin tool.
This is the fraction of tuples returned by a semi join compared to a full table scan in order for Data Federator to consider that the semi join is useful. If you have to retrieve too many values the semi join becomes less useful and Data Federator executes a table scan instead. For example, if you have a table with 10M rows, and you set the minimum activation threshold to 1000, then 10M / 1000 = 10 000. Data Federator uses the semi join operator if it calculates that it will fetch less than 10 000 rows to execute the semi join. Raise this value if you want to use semi join less often. Lower this value if you want to use semi join more often. type: string needs restart? no default value: 100
Amit
Query Engine generate semi join in optimization plan when finds it is profitable. The calculation (if it profitable or not) is based on Statistics and System Parameters. You need to update or change the Statistics as well as the system parameters.
If in the execution time semi join is not executed in this case you must increase the size of the cache used in semi join operation to put the distinct values of the Dimension. You must increase the size of the cache in System Parameters: SEMI_JOIN_DIMENSION_CACHE_MEMORY_SIZE. By default is 1024 (kilo bytes) by dimension. So if you have 3 dimensions you must increase 3 * 1024 for example. For the system parameter.
ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT which is by default true, please set ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT = false. When it is TRUE, you are basically letting the optimizer decide the cardinality limit.
You have to update the statistics. you can right click on the query, or on each table and you can execute ‘Compute’ (cardinality or statistics) and ‘Selection and children’, or you can set manually the User Cardinality. Ensure that the statistics are valid. Make sure there is no error message ..like “unknown” Cardinality. You can do this from Query Statistics and with Query Plan.
- Increase the size of JAVA Heap (xmx) for APS hosting DF services
- Increase parameter MAX _CONCURRENT_MEMORY_CONSUMING_QUERIES from 5 to 10
- EXECUTOR_TOTAL_MEMORY : between 75% and 80%
- MIN_ACTIVATION_THRESHOLD_FOR_SEMI_JOIN_ RULE. The cardinality of some tables has to be greater than the MIN_ACTIVATION_THRESHOLD_FOR_SEMI_JOIN_ RULE
The user can see in ‘Query Plan’ all dimensions that are involved in Semi join. The user must click In ‘Query Plan’ on ‘Semi-Join’ and he\she will display in the ‘Details’ panel the filtered columns. The user must click on ‘Filtered Columns’ to see the Dimensions involved in Semi Join.
To do that, run the query, from the Query plan, click the Plan and open each schema and table and see if you generate the semi-join
Hi Kebede,
I didn't find any Filtered Columns in details panel. i only found Schema, Keys ,Statistics, capabilities. Data Federation SQL and Native Connector Query .
Following Parameters i set.:
System Parameters:
Updated statistics.
SEMI_JOIN_DIMENSION_CACHE_MEMORY_SIZE=6114 (6*1024) (using 1 column only for join condition)
MAX _CONCURRENT_MEMORY_CONSUMING_QUERIES=10
MIN_ACTIVATION_THRESHOLD_FOR_ SEMI_JOIN_RULE=10000
ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT = false
EXECUTOR_TOTAL_MEMORY =80%
Connector Configuration: (the database which one having more data)
maxValueINInclause:50000
Semijoinmaxqueries:10
pls let me know still i need to set any other parameters
Thanks.
User | Count |
---|---|
95 | |
11 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.