cancel
Showing results for 
Search instead for 
Did you mean: 

Partitioning a table on a single-host SAP HANA system

joseph_gonzales
Participant
0 Kudos

Hello SAP HANA Friends:

Has anyone successfully partitioned an SAP HANA columnar table on a single-host system?

I need a little guidance and direction on accomplishing this. As I understand it, a single-host system has one

indexserver (master). I do not understand how I can partition a table when I only have one indexserver available.

I would appreciate some direction on how to do this.

Regards,

Joe Gonzales

856 912 1136

Accepted Solutions (1)

Accepted Solutions (1)

lbreddemann
Active Contributor
0 Kudos

Hi Joseph,

technically nothing prevents you from partitioning tables on a single node system.

Why would we?

In fact, distributing data over multiple nodes is just one use case for table partitions.

When partitioning a table, every single partition gets its own table-data structure internally, so that every partition effectively behaves like a table in its own. Except for the fact that we take care of which partition to work on, based on your SQL statement.

A classic example is the partitioning by the age of the records.

Assume that most of the data you work and report on relates to the recent days and that you barely ever look at data from a year ago.

For such a scenario one can create partitions, e.g. one for each month.

The nice thing here is, that all the data you're currently not interested in, doesn't need to be in memory all the time. It can safely be unloaded, without any negative impact.

Delta merge operations only need to be performed for the current partition, as no data is changed in the "old-months"-partitions.

Upon query time, the HANA optimizer is clever enough (most of the times), that you're only looking for current data (you need to include a WHERE condition for the partitioning key for this into your query) and will only touch the corresponding partition(s).

Also, multiple partitions can be processed in parallel - so if your query should include data from partition 1 and partition 2, those two can be processed at the same time, making the query execution faster in total.

So, yes, you can use partitioning on a single node system with just one indexserver.

And it also can make sense 🙂

- Lars

Former Member
0 Kudos

Lars,

we are on a memory limited AWS instance and we have data from multiple clients which we rarely use but a single one. If we partitioned the data in the tables by client id and queriy only for the dominant client, how would HANA handle loading data into memory?

Thanks

Albrecht

lbreddemann
Active Contributor
0 Kudos

Well, in that case, the partitions that are not used will get unloaded as soon as the memory is required for other columns/partitions to work on.

Unfortunately for you, your dominant client will tend to use most of the space. So the effect of memory saving can safely be expected to be smallish.

- Lars    

joseph_gonzales
Participant
0 Kudos

Hi Lars:

Thanks for your response.

Could you help me understand what is means and what is the expected impact if I see that I have some partitions that are Fully Loaded, and other partitions that are Partially Loaded?

Regards,

Joe G.

856 912 1136

lbreddemann
Active Contributor
0 Kudos

Hi Joseph,

SAP HANA reads the data of column store tables into memory upon the first use after restart.

This is done on column basis, so only those columns that you actually refer to will be loaded to memory.

With partitioned tables, this can be done one a finer level of granularity: per column per partition.

If for some partitions all columns had been requested, you'll see them as FULLY LOADED.

For other partitions, where not all data was requested yet, PARTIALLY LOADED will be displayed.

- Lars

Answers (1)

Answers (1)

Former Member
0 Kudos

Hi Joseph,

There are two scenarios to partitioning a table.  One is across several nodes in a cluster and the other is partitioning the table within one node.  The former is normally referred to as a 'distributed table' (i.e. the table is distributed across several nodes).

You can partition the table within a single node for several reasons such as overcoming the 2B limit on the table entries, or for partitioning the table based on some criteria that improves the query performance in some cases.

When creating the table, there are several partitioning parameters available which you can see in the SAP HANA SQL Reference documentation available here: http://help.sap.com/hana_platform

Hope this helps

-Vivek