on 03-27-2015 9:07 AM
Hi experts,
I have a question: what happens if the master index server fails in a scale-out landscape without a standby node?
Does the master role switch automatically from the failed master index server to one of the slave index servers?
Thanks in advance,
Regards,
When no standby server is available and a node fails, the whole db instance fails.
In such an unsupported scenario, there is no node that could take care of the data that was handled by the failed node.
It can be solved by replacing the failed node with one that works.
As each node should be working on its own shard of the data, just assigning the failed node's share to any of the remaining nodes would overload that node.
And if you had a node that would still have enough capacity to take over the work of a failed node, then this should be your standby-node anyhow.
The point here is not that the master name server node is affected. There are two other nodes that would take over the master name server role.
The point is that one of the nodes failed and if it cannot be brought up again, you're lacking a full node to run this distributed system.
You have to replace the node then with a working one.
That's how sharding works.
To mitigate the inherent single point of failure, we only support scale out with standby nodes.
Without that, you're out of luck.
- Lars
OK, I think this is the last one.
Suppose I have only two nodes in a scale-out system and no more. If one of them crashes, is the only solution to reconfigure SAP HANA to run on a single node (assuming that node has the resources)? Is that correct?
Would the reconfiguration be a backup and restore onto one node?
Thanks in advance,
Regards,
No. Why would you back up and restore in this case?
The storage is shared in a SAP HANA cluster.
It doesn't "go bad" when a server node fails.
What you need to have is another server with the SAP HANA software on it that you attach to the storage. Then you "introduce" the new node to the system and can continue to work with no committed transaction lost.
- Lars
So you're asking if it is possible to reduce a scale-out system back to a single-node system?
This is a major system landscape design change and not the reaction to a server fault.
Also, it's only possible under the condition that the single-node system would be able to cope with the total load of the system.
In general yes, if you could squeeze all data to be processed by a single node, then you can "migrate" back to a single node system. And yes, that would be done via the restore of a backup.
As said above, this is not the scenario you asked about before - recovering from a failed node.
Remember, SAP HANA uses a shared nothing (SHARDING) approach. If resources fail and no redundant resources are available, then the system fails.
- Lars
If you have n active nodes with X TB of RAM each, it means your total sizing requires n*X TB of RAM to be available for your system to operate properly. Also, an n-node scale-out system will already have placed all tables and distributed all partitions across these n available nodes. If one of those n nodes fails, you end up with (n-1)*X TB of available RAM. The tables/partitions of the failed node, which are persisted in its disk area, then have no available memory to be loaded into, and that's why your whole instance fails.
You can of course redistribute the tables/partitions across your (n-1)-node scale-out system, provided (n-1)*X TB of RAM is enough for all of your tables. But this is not without considerable (> a few minutes) downtime.
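To make the arithmetic above concrete, here is a minimal sketch in Python. It is not an SAP tool; the function name `can_redistribute` and its parameters are made up for illustration, and it only compares raw RAM totals. Real sizing also has to leave room for working memory, so treat a "yes" from this check as a necessary condition, not a sufficient one.

```python
def can_redistribute(active_nodes: int, ram_per_node_tb: float,
                     data_tb: float, failed_nodes: int = 1) -> bool:
    """Hypothetical helper: does (n - failed) * X TB still cover the data?

    active_nodes    -- n, the number of active (non-standby) nodes
    ram_per_node_tb -- X, RAM per node in TB
    data_tb         -- total size of all tables/partitions in TB
    """
    surviving = active_nodes - failed_nodes
    if surviving <= 0:
        # No active node left to take over any data.
        return False
    return surviving * ram_per_node_tb >= data_tb


# 4 nodes with 2 TB each, one fails: 3 * 2 = 6 TB remain.
print(can_redistribute(4, 2.0, 7.0))  # 7 TB of data does not fit
print(can_redistribute(4, 2.0, 5.0))  # 5 TB of data fits
```

Even when the check passes, the redistribution itself still means downtime, as said above.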
The recommendation of having at least 1 standby node allows near-zero downtime, since the failed node's requests will be taken over by the standby node (which will also mount the disk area holding the tables/partitions of the failed node). In this case, pending requests that were being processed by the failed node might fail, but new requests will be processed properly (since the master node will already route them to the newly active former standby node).
In particular, it's a possible configuration to have a 2-node scale-out system, 1 active and 1 stand-by. However, you won't have 2*X TB available RAM, you'll have only X TB (it's active-passive, not active-active). This configuration is called host auto failover in the context of SOH and allows true HA (zero downtime) for single node systems.
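The active-passive point can be written down the same way. This is only a sketch of the arithmetic, with a hypothetical function name (`usable_ram_tb`): standby hosts add no usable RAM, they only wait to replace a failed active host.

```python
def usable_ram_tb(total_hosts: int, standby_hosts: int,
                  ram_per_host_tb: float) -> float:
    """Hypothetical helper: RAM available for data, counting active hosts only."""
    if standby_hosts < 0 or standby_hosts >= total_hosts:
        raise ValueError("need at least one active host and no negative standby count")
    active_hosts = total_hosts - standby_hosts
    return active_hosts * ram_per_host_tb


# 2 hosts with X = 1 TB each, 1 active + 1 standby: only X TB is usable.
print(usable_ram_tb(2, 1, 1.0))
# 4 hosts with 2 TB each, 3 active + 1 standby.
print(usable_ram_tb(4, 1, 2.0))
```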
Best,
Henrique.