cancel
Showing results for 
Search instead for 
Did you mean: 

Semi Join.

Former Member
0 Kudos

Hi Experts.

good  evening everybody.

Could you please let me know what are properties and where  we need to enable for semi join .

Best Regards,

Reddeppa K.

Accepted Solutions (0)

Answers (1)

Answers (1)

amitrathi239
Active Contributor
0 Kudos

Hi,

You can update the semi join parameters in Data Federation Admin tool.

Under the system Parameters tab->SEMI_JOIN_EXECUTION_STRATEGIES

Amit

Former Member
0 Kudos

Hi Amit,

Thanks for quick reply ..

I have followed above link .. but semi join not triggered.

Thanks,

Reddeppa K.

amitrathi239
Active Contributor
0 Kudos

Hi,

have you checked how queries are running in the DF Admin tool->Query monitoring tab.

Amit

Former Member
0 Kudos

Yes ..

Creating 2 separate queries.

Thanks,

reddeppa K.

Former Member
0 Kudos

Hi Amit,

I will take some details in to my requirement.

i have 2 tables from different  database both are from oracle.

Fact Table- - 10 Millions.

Dimension table -- 0.5 Millions.

In this case Semi join will trigger ?

thanks,

reddeppa K.

amitrathi239
Active Contributor
0 Kudos

Hi,

I haven't tested on oracle yet.Tested and working with SAP BW universe.

Amit

Former Member
0 Kudos

Hi Amit,

it is possible to give details  steps ?

Reddeppa K.

amitrathi239
Active Contributor
0 Kudos

Hi,

check the value MIN_ACTIVATION_THRESHOLD_FOR_ SEMI_JOIN_RULE in the DF admin tool.

This is the fraction of tuples returned by a semi join compared to a full table scan in order for Data Federator to consider that the semi join is useful. If you have to retrieve too many values the semi join becomes less useful and Data Federator executes a table scan instead. For example, if you have a table with 10M rows, and you set the minimum activation threshold to 1000, then 10M / 1000 = 10 000. Data Federator uses the semi join operator if it calculates that it will fetch less than 10 000 rows to execute the semi join. Raise this value if you want to use semi join less often. Lower this value if you want to use semi join more often. type: string needs restart? no default value: 100

Amit

Former Member
0 Kudos

Hi Amit,

i increased above parameter 1000 and 10000 . but no use.

Thanks,

reddeppa K.

sonet_kebede
Advisor
Advisor
0 Kudos

Query Engine generate  semi join in optimization plan when  finds it is profitable. The calculation (if it profitable or not) is based on Statistics and System Parameters.  You need to update or change the Statistics  as well as the system parameters.

If in the execution time semi join is not executed in this case you must increase the size of the cache used in semi join operation to put the distinct values of the Dimension. You must  increase the size of the cache in System Parameters: SEMI_JOIN_DIMENSION_CACHE_MEMORY_SIZE. By default is 1024 (kilo bytes) by dimension. So if you have 3 dimensions you must increase 3 * 1024 for example. For the system parameter.

ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT which is by default true, please set ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT = false.  When it is TRUE, you are basically letting the optimizer decide the cardinality limit.

You have to update the statistics.  you can right click on the query, or on each table and you can execute ‘Compute’ (cardinality or statistics) and ‘Selection and children’, or you can set manually the User Cardinality.  Ensure that the statistics are valid.  Make sure there is no error message ..like “unknown” Cardinality.  You can do this from Query Statistics and with Query Plan.

- Increase the size of JAVA Heap (xmx) for APS hosting DF services

- Increase parameter MAX _CONCURRENT_MEMORY_CONSUMING_QUERIES from 5 to 10

- EXECUTOR_TOTAL_MEMORY : between 75% and 80%

- MIN_ACTIVATION_THRESHOLD_FOR_SEMI_JOIN_ RULE.  The cardinality of some tables has to be greater than the MIN_ACTIVATION_THRESHOLD_FOR_SEMI_JOIN_ RULE

Former Member
0 Kudos

Hi Kebede,

Thanks for your recommendation.

Could you please let me know where we can see whether semi join is triggered or no t?

Thanks.

sonet_kebede
Advisor
Advisor
0 Kudos

The user can see in ‘Query Plan’ all dimensions that are involved in Semi join. The user must click In ‘Query Plan’ on ‘Semi-Join’ and he\she will display  in the ‘Details’ panel the filtered columns. The user must click on ‘Filtered Columns’ to see the Dimensions involved in Semi Join.

To do that, run the query, from the Query plan, click the Plan  and open each schema and table  and see if you generate the semi-join

Former Member
0 Kudos

Hi Kebede,

I didn't find any Filtered Columns in details panel. i only found Schema, Keys ,Statistics, capabilities. Data Federation SQL and Native Connector Query .

Following Parameters i set.:

System Parameters:

Updated statistics.

SEMI_JOIN_DIMENSION_CACHE_MEMORY_SIZE=6114 (6*1024) (using 1 column only for join condition)

MAX _CONCURRENT_MEMORY_CONSUMING_QUERIES=10

MIN_ACTIVATION_THRESHOLD_FOR_ SEMI_JOIN_RULE=10000

ACTIVATE_SEMI_JOIN_DIMENSION_RUNTIME_CARDINALITY_LIMIT = false

EXECUTOR_TOTAL_MEMORY =80%


Connector Configuration: (the database which one having more data)

maxValueINInclause:50000

Semijoinmaxqueries:10

pls let me know still i need to set any other parameters

Thanks.