CPU over-commit in VMware cluster environment

Former Member

I have read SAP Note 1056052. Guideline #5 states that SAP systems should not use memory over-commit for virtual machines running Windows with SQL Server. My question is whether this recommendation applies to CPU as well. Is it common to over-commit CPU for virtual machines that run dialog instances?

My environment is VMware vSphere 4.1 with four ESX servers: ESX1 and ESX2 for production, ESX3 for DEV, and ESX4 for QAS. In the initial placement, the PRD CI+DB resides on ESX1 with all resources allocated, and three PRD DIs, each configured with 4 vCPUs, reside on ESX2 (total CPU capacity: 8 cores, so CPU is over-committed here). Distributed Resource Scheduler (DRS) is enabled in the VMware cluster. I expect DRS to initiate vMotion of one or two DIs from ESX2 to ESX3 or ESX4, which have spare resources, when PRD is heavily loaded.
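To make the over-commitment on ESX2 concrete, here is a minimal sketch of the arithmetic. The helper name `overcommit_ratio` is hypothetical and not part of any VMware tool; it only divides the configured vCPUs by the physical cores given in the description above.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total configured vCPUs divided by physical cores.

    A value above 1.0 means the host is CPU over-committed.
    """
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# ESX2 as described: three dialog instances with 4 vCPUs each on 8 cores.
esx2_ratio = overcommit_ratio([4, 4, 4], 8)
print(f"ESX2 vCPU-to-core ratio: {esx2_ratio:.2f}")  # 12 vCPUs / 8 cores = 1.50
```

A ratio of 1.5 only describes the configuration. Whether it causes contention depends on how many of those vCPUs are busy at the same time.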

Do you think the above system landscape and configuration is practical and reasonable? Thanks in advance for your help.

Accepted Solutions (1)

Former Member

Yes, you can overcommit CPUs. See [Resource Management Guide|http://www.vmware.com/pdf/vsphere4/r41/vsp_41_resource_mgmt.pdf] for details.

Be aware that you could run into a situation where each of the three PRD DIs uses its fully allocated resources at the same time, so that contention occurs. In that case, response times of all three DIs will get worse unless one of the DIs is moved to a less utilized ESX server.

How your DRS behaves depends only on how you configure it. This is also described in the Resource Management Guide.

Matthias

Former Member

Matthias, thank you for your reply. You are right about CPU contention under high workload when CPU is over-committed. That is exactly where DRS can benefit SAP. I am now testing whether CPU over-commitment slows the system down under normal workload, or whether the performance loss stays within an acceptable range. What is your view on this?

At the same time, let's consider a different configuration. I remove the CPU over-commitment, for example by running 4 VMs with 2 vCPUs each on an 8-core ESX server. When the system is constantly stressed, DRS triggers vMotion of one VM off that ESX server. Comparing the response time before and after the vMotion of that DI, I didn't see any improvement at all. Does this make sense? Can I conclude that DRS doesn't help improve SAP performance when CPU is not over-committed on the DI server?

Edited by: Fred Zhou on Nov 3, 2010 9:21 AM

Former Member

The reason why vMotion moves a VM off a busy server is the following: on an ESX server, you always have to set aside some resources for a) the Console OS and b) the VMkernel. So if you have 8 cores and assign 8 cores to VMs, there is still a little contention between the VMs and the COS + VMkernel. That's the reason why DRS moves a VM away from such a fully utilized server. The performance gain might not be very big, because the contention was not critical.

As a rule of thumb, I'd say that how CPU overcommitment affects performance depends heavily on the workload and on the CPU family. There are workload and CPU combinations where the [overall throughput can even rise|http://www.sdn.sap.com/irj/sdn/linux?rid=/webcontent/uuid/90bf7257-166b-2d10-2fbb-ec137a5c9b86] [original link is broken] due to overcommitment, because certain CPU features are triggered. In general, I'd say overall throughput will stay the same or can even increase (while throughput per vCPU will of course decrease), but latency will get worse because vCPUs cannot be scheduled in real time.
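The remark that throughput per vCPU decreases can be illustrated with a rough sketch. It assumes an idealized model that is not taken from the VMware documentation: all vCPUs are busy at once, the scheduler shares cores evenly, and scheduling overhead is ignored. The function name `per_vcpu_share` is hypothetical.

```python
def per_vcpu_share(total_vcpus, physical_cores):
    """Upper bound on the fraction of one core each vCPU receives
    when every vCPU is runnable at the same time (idealized fair sharing)."""
    if total_vcpus <= 0 or physical_cores <= 0:
        raise ValueError("both arguments must be positive")
    # A vCPU can never use more than one full core.
    return min(1.0, physical_cores / total_vcpus)


# Over-committed case from this thread: 12 vCPUs on 8 cores.
print(f"{per_vcpu_share(12, 8):.2f}")  # 0.67 of a core per vCPU
# Non-over-committed case: 8 vCPUs on 8 cores.
print(f"{per_vcpu_share(8, 8):.2f}")   # 1.00 of a core per vCPU
```

In this simplified model the host still delivers 8 cores of work in both cases, which matches the point that overall throughput can stay the same while each vCPU gets less. It says nothing about latency, which is where the scheduling delays show up.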

I recommend reading the [ESX CPU scheduling guide|http://www.vmware.com/resources/techresources/10131].

Matthias

Former Member

This thread can be closed. Thanks again for all your help.

Answers (0)