on 03-28-2013 9:15 PM
Hello SAP HANA Friends:
Has anyone successfully partitioned an SAP HANA columnar table on a single-host system?
I need a little guidance and direction on accomplishing this. As I understand it, a single-host system has one
indexserver (master). I do not understand how I can partition a table when I only have one indexserver available.
I would appreciate some direction on how to do this.
Regards,
Joe Gonzales
856 912 1136
Hi Joseph,
technically nothing prevents you from partitioning tables on a single node system.
Why would we?
In fact, distributing data over multiple nodes is just one use case for table partitions.
When partitioning a table, every single partition gets its own table-data structure internally, so that every partition effectively behaves like a table in its own. Except for the fact that we take care of which partition to work on, based on your SQL statement.
A classic example is the partitioning by the age of the records.
Assume that most of the data you work and report on relates to the recent days and that you barely ever look at data from a year ago.
For such a scenario one can create partitions, e.g. one for each month.
The nice thing here is, that all the data you're currently not interested in, doesn't need to be in memory all the time. It can safely be unloaded, without any negative impact.
Delta merge operations only need to be performed for the current partition, as no data is changed in the "old-months"-partitions.
Upon query time, the HANA optimizer is clever enough (most of the times), that you're only looking for current data (you need to include a WHERE condition for the partitioning key for this into your query) and will only touch the corresponding partition(s).
Also, multiple partitions can be processed in parallel - so if your query should include data from partition 1 and partition 2, those two can be processed at the same time, making the query execution faster in total.
So, yes, you can use partitioning on a single node system with just one indexserver.
And it also can make sense 🙂
- Lars
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Well, in that case, the partitions that are not used will get unloaded as soon as the memory is required for other columns/partitions to work on.
Unfortunately for you, your dominant client will tend to use most of the space. So the effect of memory saving can safely be expected to be smallish.
- Lars
Hi Joseph,
SAP HANA reads the data of column store tables into memory upon the first use after restart.
This is done on column basis, so only those columns that you actually refer to will be loaded to memory.
With partitioned tables, this can be done one a finer level of granularity: per column per partition.
If for some partitions all columns had been requested, you'll see them as FULLY LOADED.
For other partitions, where not all data was requested yet, PARTIALLY LOADED will be displayed.
- Lars
Hi Joseph,
There are two scenarios to partitioning a table. One is across several nodes in a cluster and the other is partitioning the table within one node. The former is normally referred to as a 'distributed table' (i.e. the table is distributed across several nodes).
You can partition the table within a single node for several reasons such as overcoming the 2B limit on the table entries, or for partitioning the table based on some criteria that improves the query performance in some cases.
When creating the table, there are several partitioning parameters available which you can see in the SAP HANA SQL Reference documentation available here: http://help.sap.com/hana_platform
Hope this helps
-Vivek
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
87 | |
10 | |
10 | |
10 | |
7 | |
6 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.