Compression technique in HANA + comparisons
I'm working with SAP HANA right now and trying to understand how the database works as a whole.
I found this video and PDF about SAP HANA compression (Compression | SAP HANA), but which technique does HANA actually use? Does it use all of them, do you choose one yourself, or how exactly does it work?
Additionally, it would be great to know whether there are any comparisons with other database compression techniques, so I can evaluate whether the compression rate is good or not.
I appreciate any help!
John Appleby replied
If you are interested in this then you might consider watching the HPI in-memory course at http://open.hpi.de
The primary compression is dictionary encoding of columns.
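To make dictionary encoding concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not HANA's actual implementation: the function names are made up, and keeping the dictionary sorted is an assumption of this sketch.

```python
def dictionary_encode(column):
    """Return (dictionary, attribute_vector) for a list of column values."""
    # Assumption of this sketch: the dictionary holds the distinct values in
    # sorted order, and a value's position in it serves as its value ID.
    dictionary = sorted(set(column))
    value_id = {value: i for i, value in enumerate(dictionary)}
    # The attribute vector stores one small integer ID per row
    # instead of the full value.
    attribute_vector = [value_id[value] for value in column]
    return dictionary, attribute_vector


def dictionary_decode(dictionary, attribute_vector):
    """Rebuild the original column by looking up each value ID."""
    return [dictionary[i] for i in attribute_vector]


cities = ["Berlin", "Paris", "Berlin", "Berlin", "Rome", "Paris"]
dictionary, attribute_vector = dictionary_encode(cities)
print(dictionary)        # ['Berlin', 'Paris', 'Rome']
print(attribute_vector)  # [0, 1, 0, 0, 2, 1]
```

The saving comes from storing each distinct value once and replacing every row with an integer ID, so columns with few distinct values benefit the most.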
Secondary compression is applied to the attribute vector of the column. There are several algorithms, including run-length encoding, cluster encoding, prefix encoding, etc. This reduces the size of the attribute vector.
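As a rough illustration of one of these secondary algorithms, here is a small Python sketch of run-length encoding applied to a vector of integer value IDs. The function names and the (value ID, run length) pair layout are assumptions for illustration only; HANA's internal format will differ.

```python
def rle_encode(attribute_vector):
    """Collapse consecutive equal value IDs into (value_id, run_length) pairs."""
    runs = []
    for value_id in attribute_vector:
        if runs and runs[-1][0] == value_id:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([value_id, 1])  # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value_id, run_length) pairs back into the full vector."""
    vector = []
    for value_id, length in runs:
        vector.extend([value_id] * length)
    return vector


vector = [0, 0, 0, 0, 1, 1, 2, 2, 2, 0]
runs = rle_encode(vector)
print(runs)  # [(0, 4), (1, 2), (2, 3), (0, 1)]
```

Run-length encoding pays off only when equal IDs sit next to each other, for example in a sorted column. On a vector with no repeats it would take more space than the original, which is one reason a database would offer several algorithms rather than a single one.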
HANA doesn't try to compress too aggressively, because the goal is to reduce memory bandwidth usage without hurting overall performance. Therefore all compression algorithms are scan- and cache-friendly.