cancel
Showing results for 
Search instead for 
Did you mean: 

When to Use Inverted Keys

former_member205400
Active Participant
0 Kudos

We had some tables that were really large and using a lot of space. (The tables were already partitioned btw)

On a deeper dive we see that the tables internal object sizes were larger than the tables themselves. The tables had multiple keys and we saw those keys were large.

We changed those tables to inverted hashes ( I believe we did 4 tables ) and the size of those internal objects decreased. We didn't see any noticeable degradation of sql performance..

So ... we are looking to establish criteria for when we want to use inverted hashes.

I ran this qry:

select distinct a.schema_name,a.table_name, a.columns, a.internal, a.internalPK, b.record_count
from
(
select schema_name, table_name,
sum(case when my_type = 'columns' then memory end) as columns,
sum(case when my_type = 'internal' then memory end) as internal,
sum(case when my_type = 'internalPK' then memory end) as internalPK
from
(
--M_CS_COLUMNS - Runtime information of columns of column tables
select schema_name, table_name, 'columns' as my_type, memory
from
(
select schema_name, table_name,  sum(memory_size_in_total) as memory
from sys.m_cs_columns
group by schema_name, table_name
)
union
--M_CS_ALL_COLUMNS - Runtime information from all columns of column tables, including internal ones
select schema_name, table_name, 'internal' as my_type, memory
from
(
select schema_name, table_name, sum(memory_size_in_total) memory
from sys.m_cs_all_columns
group by schema_name,table_name
)
union
select schema_name, table_name, 'internalPK' as my_type, memory
from
(
select schema_name, table_name, sum(memory_size_in_total) memory
from sys.m_cs_all_columns
where column_name='$trexexternalkey$'
group by schema_name, table_name
)
)
where memory > 0
group by schema_name, table_name
) as a left join sys.m_cs_tables as b on a.schema_name=b.schema_name and a.table_name=b.table_name
where a.internalPK > a.columns
order by 1

and it produced the following:

So you can see that when the internalPK is greater than the size of the columns then it *could* be a candidate.

Anyone have some other criteria? 2,951,038,304 is about 3 Gb.

So maybe we shouldn't get concerned until the size is over 20 GB? Comments?

Mike

Accepted Solutions (0)

Answers (1)

Answers (1)

lbreddemann
Active Contributor
0 Kudos

Good question!

My personal view on this is that you need to decide whether or not the concatenated primary key in its default implementation delivers enough business value (don't ask me how to prove that) that it rectifies the additional memory usage.

Putting the size against the table payload doesn't really help here, although it provides some perspective.

A big contra to HASHED primary or unique key implementation is the missing ability to support range scans.

On the other hand, if the main scenario you want to see supported by the primary or unique key is the SELECT SINGLE, that is a select with the full specification of the key, then it might be worth the effort to change to the HASHED key implementation.

Other than that, there are no hard limits for the "allowed" memory consumption that would be OK.

former_member205400
Active Participant
0 Kudos


Lars,

Good Answer I think !!! Too bad its not black and white.

Just so I'm in sync with you what I thought you said though is that if I had a key with this table ZGPD_REPLEN_COST_WKLY like:

PRIMARY KEY INVERTED HASH ("SUNDAY_WE_LOAD_DAY_KEY", "REPOBJ", "MRP_IND", "MATNR", "PSPNR", "TYPE", "KSTAR", "MANDT")

1) if I do a select like:

select SUNDAY_WE_LOAD_DAY_KEY, someOtherfields

from my_schema.ZGPD_REPLEN_COST_WKLY

where SUNDAY_WE_LOAD_DAY_KEY between 9140 and 9250

then you are saying this is just a bit slower or very slow, or will not run?


2) if I do a select like:

select SUNDAY_WE_LOAD_DAY_KEY, someOtherfields

from my_schema.ZGPD_REPLEN_COST_WKLY

where SUNDAY_WE_LOAD_DAY_KEY = 9140 and

REPOBJ=OR000021048115 and

MRP_IND=='OR' and

MATNR='2X123545' and

PSPNR=892916 and

TYPE='Z0001' and

KSTAR=0092736 and

MANDT='010'

then this would be very fast !!!

Right?

Mike