Looking for more info on CONCAT_ATTRIBUTE

patrickbachmann
Active Contributor
0 Kudos

Hi folks,

I'm looking at the SQL behind some of our larger tables replicated from SAP and see that some have extra concatenated attributes similar to this:

ALTER TABLE "OURSCHEMA"."MSEG" WITH PARAMETERS ('CONCAT_ATTRIBUTE' = ('$LGORT$MATNR$WERKS$', 'LGORT', 'MATNR', 'WERKS'))

Does anybody have any further documentation about this: why it exists exactly and what the benefits are?  At first I thought it was simply a concatenation of KEY fields (and most indeed are), but some of the fields being concatenated are not KEYs.  Some of these concatenated attributes look like common ways I might want to join to other tables, so I wonder why they wouldn't just all be KEY fields to begin with.  So I'm wondering about the logic behind this and whether SAP has pre-chosen all of these combinations.  The other reason I'm asking is that I was told by SAP in the past, when looking into huge memory consumption issues for BSEG for example (i.e. I noticed in some cases that record count growth did not directly tie to memory consumption growth), that some additional memory is required for concatenated attributes.  So I just want to understand this better and cannot find any documentation anywhere on this.  I noticed some very old SQL Script guides (HANA SPS 01!) had very vague examples of CONCAT_ATTRIBUTE, but it was removed in later documentation.

Thanks for any information you may have on this subject,

-Patrick

Accepted Solutions (1)

former_member184768
Active Contributor
0 Kudos

Hi Patrick,

I will try to mention as much as I know, but would be more keen on getting a reply from Lars, "The Great".

  • First thing, you can find some details on these columns in m_cs_all_columns.
  • These columns are generated whenever you define a set of columns as KEY columns OR you have a join on the table using a set of columns.
  • So if you join a table with a set of column combinations, a new column that concatenates these columns will be created.

I'll try to put a working example here in some time.

Regards,

Ravi


former_member184768
Active Contributor
0 Kudos

Here is the example:

create column table t_dept (dept_id tinyint primary key, dname varchar(5));
create column table t_emp (emp_id tinyint primary key, ename varchar(5), dept_id tinyint);

insert into t_dept values (1, 'ddd');
insert into t_dept values (2, 'ddd');
insert into t_dept values (3, 'ccc');

insert into t_emp (emp_id, ename, dept_id) values (1, 'aaa', 2);
insert into t_emp (emp_id, ename, dept_id) values (2, 'bbb', 1);
insert into t_emp (emp_id, ename, dept_id) values (3, 'ccc', 3);
insert into t_emp (emp_id, ename, dept_id) values (4, 'ddd', 1);

select * from m_cs_all_columns where table_name in ('T_DEPT', 'T_EMP') and column_name like '%$%';

-- Join on one set of columns
select * from t_emp a, t_dept b where a.dept_id = b.dept_id and a.ename = b.dname;

select * from m_cs_all_columns where table_name in ('T_DEPT', 'T_EMP') and column_name like '%$%'; -- check the columns

-- Join on a different set of columns
select * from t_emp a, t_dept b where a.emp_id = b.dept_id and a.ename = b.dname;

select * from m_cs_all_columns where table_name in ('T_DEPT', 'T_EMP') and column_name like '%$%'; -- check the columns again

So my understanding is that these columns are created to perform the COLUMN-to-COLUMN join operation as explained in the document at http://scn.sap.com/docs/DOC-44174. Please refer to the section "How does Join work".

Also, please wait for the HANA SUPERGURU (read: Lars) to reply on this thread with much more valuable information and in-depth analysis.

Regards,

Ravi

patrickbachmann
Active Contributor
0 Kudos

Ahhhh, this is very fascinating and makes sense!  Ok, so it's very important to realize that as new content is developed against the same views, we could potentially consume more memory if additional joins are used, since each new join combination creates a new concatenated attribute.  Good stuff Ravi, thanks.

patrickbachmann
Active Contributor
0 Kudos

Ravi, the more I research this, the more I can see the mysterious columns in M_CS_ALL_COLUMNS, which answers my question.  I'll still keep this thread open until tomorrow in case Lars the great has anything to add.  Thanks again.

-Patrick

lbreddemann
Active Contributor
0 Kudos

Hey guys!

Busy times right now (moving from Austria to Germany...), so just a brief answer for now:

In general the column store engines are designed to work on, well, single columns.

Whenever there is a combined multi-column request to be worked on, the default way to deal with it is/was to create a hidden concatenated column that persists the combined columns.

This of course affects the memory size of column store tables and it happens automagically.

You perform a join via multiple columns? If not present yet, SAP HANA will create the required concat-attribute.

You defined a multi-key index or primary key? A concat-attribute will be created.

While this is annoying from a DBA point of view, it allows for high-speed join operator implementations that can be shared for single and multi-column joins.

All in all there are a couple of hidden columns, only two of them ($trexexternalkey$ and $col1$col2$...$) are concat-attributes.

The others ($rowid$, $udiv...$) are internal management information that is changed/modified by the SAP HANA DBMS constantly. Things like visibility of records, uniqueness during a merge cycle and so on are handled by those.
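For anyone who wants to see these hidden columns for themselves, a query against M_CS_ALL_COLUMNS like the following could work. This is a sketch; the INTERNAL_ATTRIBUTE_TYPE column and its exact values may differ between revisions, so take the column list as an assumption:

```sql
-- List the hidden/internal columns of a table together with their memory footprint
select column_name,
       internal_attribute_type,  -- distinguishes concat attributes from rowid/udiv management columns
       memory_size_in_total
  from m_cs_all_columns
 where schema_name = 'OURSCHEMA'
   and table_name  = 'MSEG'
   and column_name like '%$%'
 order by memory_size_in_total desc;
```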

More to come (I hope soonish).

For now: have a great weekend!

- Lars

patrickbachmann
Active Contributor
0 Kudos

Thanks guys this is great stuff.  Lars you don't need to elaborate anymore this explains everything to me for my needs at least.  Good luck with your move!

-Patrick

rama_shankar3
Active Contributor
0 Kudos

Great Points Ravi and Lars - thanks!

lbreddemann
Active Contributor
0 Kudos

Hey Patrick,

although you really didn't want me to do anything beyond this thread, I thought this piece would be interesting to some:

Cheers, Lars

justin_molenaur2
Contributor
0 Kudos

Lars, could you shed a little more light on the usage of the $trexexternalkey$ column? In the VizPlan I can see that this is referenced quite a bit during join operations, and looking at the contents it looks to be the concatenation of the keys of a given table. There is also some type of prefix involved (10, 2, 3, etc.) that is not obvious to me.

Thanks for all the awesome info!

Justin

lbreddemann
Active Contributor
0 Kudos

Hi again.

$trexexternalkey$ is a special concat attribute as it is used for multi-column primary keys.

The prefixes are length indicators for variable string parts of the primary key.

Now that you know that: does it matter?

- Lars

Answers (2)


justin_molenaur2
Contributor
0 Kudos

I was pointed to this thread by SAP Support while working an OSS message.

On my current implementation, we were experiencing issues where the analytic views were being invalidated after stored procedures were executed that drop and recreate a given table. This was meant to address the requirement of full table drop and reload scenarios (rebuilding a table every time); here we were using DROP/CREATE/INSERT with the CREATE statement explicitly defining the DDL of the new table.

When I analyzed the export SQL of a given table before it was dropped and reloaded, I could clearly see the ALTER TABLE ... WITH PARAMETERS ('CONCAT_ATTRIBUTE' ... statement present in the DDL. However, per the procedure, when the table was dropped and recreated without the concatenated attributes present, the entire model was invalidated and could only be fixed by reactivation (which of course re-introduced the ALTER/CONCAT piece of the table).

So I guess this is just an FYI - any analytic views with multi column joins cannot have the tables dropped and recreated without the concatenated attributes. Maybe there is a way to retrieve the metadata from system tables to ensure the structure is recreated the same as it was before, but I haven't looked into that too deeply quite yet.

If TRUNCATE were available in stored procedures this would not be an issue, but unfortunately it is not allowed at this time. So the solution for anyone looking for drop/reload is to perform a DELETE/INSERT, which is much slower than a TRUNCATE or DROP/CREATE.

Regards,

Justin

lbreddemann
Active Contributor
0 Kudos

Ahhh!

Finally I got it

Ok, so for some reason that I won't comment on, you want to regularly drop and re-create tables and fill them with data. Besides the fact that this (materialized snapshots) is exactly what we want to omit with SAP HANA, it's still doable without any issues even with concat attributes.

The point here is to be explicit in your data modelling.

Since the procedure that generates the DDL for your re-creation uses the dictionary information, it simply doesn't 'see' the hidden concat attributes.

What it does see are properly defined indexes on column store tables.

So, what you could do in that situation is this:

  1. Create table
  2. Load data
  3. Create indexes that match the columns of the multi-column joins for all involved tables
  4. Activate information models
    this activation will not create new hidden concat attributes but simply reuse the ones created by the indexes

    >> RELOADING PROCEDURE STARTS HERE <<
  5. Drop your table(s)
    (clearly the information models cannot be used right now, which is a different issue altogether)
  6. Repeat steps 1 - 3

Your reloading procedure would now be steps 5 and 6 (where step 6 really is steps 1 - 3).

That way, you not only prevent the automatic creation of the concat attributes but also gain visibility of them through the index definitions.
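To make step 3 concrete, using the MSEG example from the top of the thread (a sketch; the index name is made up for illustration):

```sql
-- An explicit index covering the multi-column join columns.
-- This creates the concat attribute up front, visible in the catalog,
-- instead of letting the model activation create a hidden one.
create index "IDX_MSEG_JOINCOLS" on "OURSCHEMA"."MSEG" ("LGORT", "MATNR", "WERKS");
```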

There you go, now you know.

- Lars

justin_molenaur2
Contributor
0 Kudos

Lars, I never got a notice on your reply so I just found this, appreciate all the awesome information! I agree we want to do away with materialized tables, but in some cases the logic in the requirement makes it infeasible (for performance reasons) to do in a view based solution.

If I understand you correctly, there are two ways that hidden concatenated columns are created

1) Through explicit definition of indices

2) Through activation of a model with a multi-column join

What you are saying is to explicitly define indices that support the required joins in the models, and ensure they are recreated each time in the DDL. This makes sense in the context of how these are used.

One further observation now that I am informed on concatenated attributes: does the order of the columns matter in the concatenated attributes, or does the order in which the joins are created at design time impact the order? For example, I noticed on a certain table that there are concatenated attributes that contain the exact same columns, just in a different order. So what I see is something like below, where there may be two dimensions (attribute views) that contain the same joins, but maybe under the covers there are two concatenated attributes being created.

Attribute 1: MATERIAL, CUSTOMER, DISTRIBUTION_CHANNEL

Attribute 2: DISTRIBUTION_CHANNEL, CUSTOMER, MATERIAL

What would be the reason for this? I guess this would be pretty important for the above conversation where we explicitly define indices (to create concatenated attributes), we would somehow have to ensure that both of these indices were available to support both joins in the model?

Regards,

Justin

lbreddemann
Active Contributor
0 Kudos

HI Justin,

just to klugscheiss around a bit: there's a third event that could trigger the creation of a hidden concatenated attribute: performing multi-column joins in pure SQL.

And good thinking on your second question.

The answer is: no, the order of the columns in the concatenated column doesn't matter.

A tuple like [Peter|Paul|Mary] is equal to a tuple like [Paul|Mary|Peter] and HANA considers this during the build up of the translation tables that are required to map records from join side A to join side B.

BTW: as I briefly mentioned, with SPS 7 the whole topic of automatic creation of concat attributes has been mitigated quite a bit.

You can however purposely create multiple indexes on the same set of columns with a different order of columns. For the join this doesn't play a role, but it could lead to different sizes of concat attributes (think about prefix compression on top of the dictionary and larger clusters of same values in the value vector).

Still, if HANA (pre SPS 7) does find a concat attribute that covers the set of columns that is required for the join no new hidden concat attribute will be created.

And you'll notice that the concat attributes that are created always contain the columns in alphabetical order - not in the order the join had been specified in the statement.

- Lars

justin_molenaur2
Contributor
0 Kudos

First of all, thanks for entertaining my questions. I had to look up the translation on klugscheiss, makes more sense now

This gets a little interesting with your explanation and the example I am drawing from the system I am currently working on. Here I have two dimensions that use the same 4 columns to join (DISTR_CHAN, SPART, VKORG, KUNWE), but I am seeing two separate concatenated attributes in the SQL of the underlying table. These are the same four fields, just in a different order, and neither of them is in alphabetical order. These were generated by the activation of the model and not through one of the other two methods.

lbreddemann
Active Contributor
0 Kudos

Interesting.

Actually I only ever had a closer look into the automatic generation of concat attributes due to SQL joins.

The Modeller basically creates the join concats explicitly - so, feel free to disregard what I wrote for this case.

I cannot really explain why it would be necessary in your example to have the two concat attributes.

However, the double use of the same join fields for different dimensions is not exactly a standard star schema

Anyhow, I just tried to reproduce this on SPS 7 and failed - just one concat per table had been created, although the XML for the analytic view clearly shows that the order of join columns is different.

Maybe this is another optimization that was introduced - really, don't know.

- Lars

Former Member
0 Kudos

Hi Lars,

The "Playing around a bit with SAP HANA" presentation is awesome and I learned a lot from it.

I have a few questions regarding the presentation, where you said that as of SPS 8 the concat attributes will not be created.

But in our case (currently using SPS 8) the concat attributes have been created automatically while joining tables / running select statements.

How can we reduce our table space and remove the concat attributes? Please help with this, or suggest some other good ways.

Awaiting on your reply!

Regards,

Shaj

lbreddemann
Active Contributor
0 Kudos

Hey Shaj,

as I described, the creation of concat attributes is governed by an indexserver parameter:

indexserver.ini - [JOINS] - OPTIMIZE_METAMODEL = ON | OFF | AUTO (Default)

You may want to check the setting of this parameter.
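Checking and setting it could look like this (a sketch using the standard configuration views/commands; run with appropriate privileges):

```sql
-- Check the current value (no row returned means the default AUTO is in effect)
select file_name, section, key, value
  from m_inifile_contents
 where file_name = 'indexserver.ini'
   and section   = 'joins'
   and key       = 'optimize_metamodel';

-- Set it explicitly on system level, if you want to know for sure which setting is active
alter system alter configuration ('indexserver.ini', 'SYSTEM')
  set ('joins', 'optimize_metamodel') = 'auto' with reconfigure;
```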

Also, are you sure that the concat columns had been created with SPS 8, or could it be that these had already been there from an earlier revision?

- Lars

Former Member
0 Kudos

Hi Lars,

Yes, the concat attributes are still there in SPS 8.

Regards,

Shajahan.S


lbreddemann
Active Contributor
0 Kudos

How is the parameter set in your environment?

Can you provide a test case that actually demonstrates the behavior?

- Lars

Former Member
0 Kudos

Hi Lars,

I have checked the indexserver.ini file in our system; under the joins section I only found translator_cache_size.

I didn't find the option optimize_metamodel. Please let me know how to proceed.

Regards,

Shaj

lbreddemann
Active Contributor
0 Kudos

You might want to check the documentation again

If the parameter doesn't show up, its initial value is used in your case.

You may set the parameter explicitly so that you know for sure which setting is active.

Former Member
0 Kudos

Hi Lars,

Does the initial value mean ON (create + use concats)?

Which option is the recommended one?

lbreddemann
Active Contributor
0 Kudos

The default setting in fact is the recommended setting.

If you change the parameter setting you might end up with a worse performance.

I think I made the implications and suggested approaches rather clear in the presentations.

For SQL queries it's a good idea to drop the concats and create indexes instead.

For information models you want to keep the automatically created concats (with SPS 9 onwards there is an option to choose whether or not to create them in the modeller UI).

All in all, it's important not to overdo the removal of the concats just for the sake of saving memory. Always test the performance of your queries after dropping them.

- Lars

former_member182302
Active Contributor
0 Kudos

Hi Lars.

For the columns created in an earlier version, is it recommended to drop them?

Or should we check the performance of the queries and take a call?

Regards,

Krishna Tangudu

lbreddemann
Active Contributor
0 Kudos

That's actually up to you.

I'd probably start with getting an overview on what concats are actually there and try to find what queries those serve.

With that knowledge it's then possible to take informed action and take it step by step.

After all it is a decision of query speed vs. space consumption. If the performance benefit is too small to justify the additional storage usage, dropping looks like a good idea to me.
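Such an overview could start with something like this (a sketch; filtering on the '$col1$col2$' naming pattern while excluding the known management columns):

```sql
-- Concat attribute candidates ranked by space consumption
select schema_name, table_name, column_name, memory_size_in_total
  from m_cs_all_columns
 where column_name like '$%$'
   and column_name not in ('$rowid$', '$trexexternalkey$')
   and column_name not like '$udiv%'
 order by memory_size_in_total desc;
```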

But there is no general: "drop them all"-statement or recommendation.

- Lars

Former Member
0 Kudos

Lars, is there a way to restrict new dynamic column-to-column or concat-attributes from being created?  We have only 16 GB for the data and now have 24 GB used by the different joins, most of which are wrong or don't even make sense.  Or at least a time-to-live for concat-attributes?

lbreddemann
Active Contributor
0 Kudos

Hi Ram,

currently there is no supported way to do this.

If you have a system (obviously dev system) where you find a lot of these concat_attributes, you can manually drop them.

Next time they need to be used, they will be recreated anyhow (or via activation of information model).

As a more general approach to explicitly model multi-column join support, you could define corresponding indexes on the column combinations you like to support.

All remaining concat_attributes then would be candidates for removal.

- Lars

justin_molenaur2
Contributor
0 Kudos

Per my post late last week, I was thinking more about this comment:

"If you have a system (obviously dev system) where you find a lot of these concat_attributes, you can manually drop them.

Next time they need to be used, they will be recreated anyhow (or via activation of information model)."

Per what I have observed, if you DROP the concatenated attributes of a given table and those are in turn used in some other views, this will be a problem, as the view will be invalidated and rendered unusable unless it is activated once more. But the comment is directed only at a DEV system, where it can be quickly corrected.

Regards,

Justin

lbreddemann
Active Contributor
0 Kudos

That's what I wrote

"Next time they need to be used, they will be recreated anyhow (or via activation of information model)."


Maybe it is important to make it double clear:

it's not recommended or wise to perform a drop of a concat attribute in a live production system.

Just as you don't drop indexes on a classic database of a production system without having checked the side-effects before.

- Lars

justin_molenaur2
Contributor
0 Kudos

Yes, fair enough - you did say it. Just want to make clear that views become completely unexecutable ("view is invalidated" error) until re-activation.

Thanks for your insight here, always appreciated!

Regards,

Justin

justin_molenaur2
Contributor
0 Kudos

Lars - while I have your ear in this thread - had one more thought on this topic, which is also touched on within OSS note 1702641 under point 3.

I know that it is not possible to create an analytic view with an attribute view and fact table where the join keys come from two or more separate physical tables within the attribute view. For example, attempting to create some type of rule table with columns like MARA-MATNR and KNA1-KUNNR then joining to MARA/KNA1 then to a fact is not allowed in HANA currently.

Perhaps this limitation exists because multi-column joins need these concatenated columns to support them? Since you cannot persist a concatenated column spanning two tables, I assume that may be the reason for the limitation?

Regards,

Justin

lbreddemann
Active Contributor
0 Kudos

Hi Justin,

That's not how I understand it.

Analytic views are a way to model star-schema-queries. A star schema is defined by having exactly one fact table where all key figures/measures come from and the dimensions that take all the filters. This very pattern (filter on the "outer" tables, applied to the aggregated fact table) is the basis for the OLAP column view processing.

Allowing multiple sources for aggregation would require materializing the joined result set for the fact table and that would kill performance.

The thing is: if you don't have a star schema, then the olap column view is maybe not the right modelling tool. You can use calc views instead and don't have the limitation.

This has nothing to do with the concat attributes.

Even with single column joins the limitation is there.

- Lars

justin_molenaur2
Contributor
0 Kudos

Lars, thanks for the response. Perhaps I was not so clear in what I was trying to explain above - I am good with the star schema concept

What I was trying to say is that this scenario is allowed in HANA modeling (multi column attribute to fact join with all keys coming from same physical table), traditional snowflake.

While this is not allowed (multi column attribute to fact join where not all keys come from the same physical table).

My correlation with above is related to the need for the materialized concatenated attributes when doing multi column joins. Since the second scenario cannot really have concatenated attributes materialized (they exist in two separate physical tables), that may be the reason it's not allowed.

Hope this makes more sense.

An example where I hit this block is table 1 = MARA, table 3 = KNA1. Table 2 would hold some arbitrary groupings of customer and material attributes (material group, customer group) that map to some business definitions. Each line in the fact could technically resolve to the attributes in Table 2, and by filtering Table 2 we could force the fact to get filtered. This was accomplished in a traditional DB before, but cannot be replicated in HANA without some type of materialization due to the limitation.

Regards,

Justin

lbreddemann
Active Contributor
0 Kudos

Hmm....

I see what you want to do there.

But in fact this is not a proper star or snowflake schema.

I don't say this model is wrong but just that it doesn't follow the star/snowflake schema definition on which the olap view model is built upon.

Even with attribute views or dimension clouds or whatever you like to call them, the join to the fact table always must come from one table only.

Other DBMS are not different in that - they will process the request, but not with their star schema optimization. They'll apply regular general SQL processing for that instead.

Just like SAP HANA does if you create the query in SQL or in a calculation view.

Using the olap view in SAP HANA in fact is just a fancy way to tell it: look, this is a proper star schema, please optimize it for this.

Mapping non-BW table designs into a star schema is not always possible and if you check the HANA live content, you'll find that most views are in fact calculation views with SQL in them to express the sometimes really complex relationships of ECC tables.

- Lars