vCPU vs Physical CPU

nelis
Active Contributor
0 Kudos

Hi,

Just trying to get my head around vSphere with possible future implementation...

From what I understand from reading various white papers etc., with HT enabled on the physical hardware a single vCPU is (or can be) equivalent to a single HT thread of a physical core, i.e. quad core (8 threads) = 8 vCPUs... correct?

So when SAP tells you to configure e.g. database parameters according to the number of cores, do you count physical or virtual ones (in other words, should a vCPU be treated as a single core)? When I see benchmarks from SAP, e.g. [this benchmark|http://download.sap.com/download.epd?context=B7691794A7D3E12043C201290ABF37F5DED04E103D4E310018D06CBF7953BE17F69E02BCFBFF4510737F16C3ADA5246C], are they using HT or not (it isn't very clear, unless I am missing something)?

Thanks.

Nelis

Accepted Solutions (1)

Former Member
0 Kudos

Hi my friend

The number of physical CPU cores is what matters for optimizing the database and SAP instance, not the number of CPU sockets or any virtual CPUs (hyper-threading). If it's a 4-way host with 4 cores per CPU, then there are 16 physical cores recognized and available to applications.

Hyper-threading, on the other hand, presents one additional virtual CPU per physical core, so the OS sees 2 CPUs per core instead of 1. Some 7-8 years ago it improved performance by approximately 10%, since dual-core CPUs hadn't been released yet. But since then it cannot even compete with multi-core CPU power, so usually hyper thread should be disabled in BIOS unless you're clearly told to by hardware/database/SAP vendors.

Regards,

nelis
Active Contributor
0 Kudos

Thanks for your reply.

> ...so usually hyper thread should be disabled in BIOS unless you're clearly told to by hardware/database/SAP vendors.

I'm not so sure this is correct.

According to [this case study|http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/60458fab-e8f5-2c10-409e-d53eedcb7d54?QuickLink=index&overridelayout=true] between Intel, VMWare and SAP, HT "...enables greater consolidation ratios by supporting more virtual machines, or a greater number of vCPUs assigned per virtual machine." ...so it's more like they are telling you to enable it. In fact, in that case study they do use HT with SAP/VMWare. Or perhaps that is only for the Dialog instances ? If that is the case then you would require two separate VM hosts, one with HT(for DI's) and one without for CI/Database if you want to use complete virtualization ?

Nelis

Former Member
0 Kudos

You're right!! Thanks for pointing it out, I'm terribly sorry for the incorrect info, I need some chocolate now......

Note 1329360 - Using Itanium Hyperthreading on HP-UX

In general, the use of Hyperthreading should improve performance for most SAP workloads.

SAP with Microsoft SQL Server 2008 and SQL Server 2005 Best Practices

SAP supports Windows Server 2008 and the Hyper-V™ technology for their NetWeaver 7.0 based applications and later releases of SAP software. It also is expected that SAP will support Windows Server 2008 R2, which is expected to be released in the second half of 2009 by Microsoft. Windows Server 2008 R2 is expected to introduce support for up to 256 CPU threads and live migration of Hyper-V virtual machines via a Shared Cluster File System.

At least on the Intel side, simultaneous Hyperthreading is introduced again. Measurements with first generations of these processors in the labs of Intel, Microsoft and hardware vendors revealed that the new Hyperthreading can increase throughput 20 to 40 percent.

nelis
Active Contributor
0 Kudos

So back to my original question, is that benchmark with or without HT enabled and what is the recommendation when told by SAP to change parameters according to the amount of core's in a system ?

Personally I can't see how SAP could expect a single HT thread or vCPU now to become the equivalent of a physical core, hence my question. They must say with or without HT.

Nelis

Former Member
0 Kudos

> is that benchmark with or without HT enabled

It is enabled, indicated "2 processors / 8 cores / 16 threads".

> what is the recommendation when told by SAP to change parameters according to the amount of core's in a system ?

Where a CPU count determines an optimization parameter, the documentation usually tells you whether or not hyperthreading counts. For SQL Server, for example, the number of physical CPU cores determines how many data files should be used, as indicated in the best practices white paper:

We look at cores where one socket today usually has up to 6 cores and CPU threads, which usually is the number of cores of a socket when no Hyperthreading is used, or double the amount of cores when Hyperthreading is available and used.
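The relationship described in that quote (threads equal cores without Hyperthreading, double the cores with it) can be written down as a minimal sketch. The function and parameter names are made up for illustration, and the "4 cores per socket" figure is only inferred from the benchmark line "2 processors / 8 cores / 16 threads":

```python
def logical_threads(sockets: int, cores_per_socket: int, hyperthreading: bool) -> int:
    """Return the number of CPU threads the OS would see.

    Assumes two hardware threads per core when Hyperthreading is on,
    and one thread per core when it is off.
    """
    threads_per_core = 2 if hyperthreading else 1
    return sockets * cores_per_socket * threads_per_core


# Benchmark configuration quoted above: 2 processors / 8 cores / 16 threads
print(logical_threads(sockets=2, cores_per_socket=4, hyperthreading=True))
# Same hardware with Hyperthreading disabled in the BIOS
print(logical_threads(sockets=2, cores_per_socket=4, hyperthreading=False))
```

So when a parameter is documented "per core", the physical count (8 here) and the thread count (16 here) differ by exactly that factor of two, which is why it matters which one the documentation means.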

Regards,

Former Member
0 Kudos

Hi Nelis,

HT didn't bring a performance boost for SAP SD workload. SMT does (up to 30 %). How many cores per SAP suggestion can be translated in SMT enabled vCPUs is not necessarily the question which has to be asked here. Anyhow, a recommendation "per core" is very vague as core performance differ significantly. It's more important how you configure your ESX computing resources in terms of CPU overcommitment. On a machine where SMT is enabled, a 4 vCPU machine can use 4 full cores because of SMT's capability to assign idle core resources to the demanding thread. So you get used to your machine having a certain level of performance. But when other virtual machines request computing resources, the performance of the machine could decrease significantly.

The Resource Management Guide for vSphere 4 gives a pretty good overview of how ESX treats multithreaded environments.

Kind regards,

Matthias

nelis
Active Contributor
0 Kudos

Ok. I'm still not clear on one thing though, what determines the number of vCPU's in a host, is it only the number of cores or will a HT enabled host allow for more vCPU's ?

Thanks, I'll do some more reading to get a better idea of multithreading in ESX and come back to this thread if necessary.

Nelis

Former Member
0 Kudos

To be precise, VMware doesn't speak of cores but of "Hardware Execution Contexts" (HECs). Every HEC can be addressed separately by the CPU scheduler. When you use HT, a core consists of two HECs.

nelis
Active Contributor
0 Kudos

1 vCPU = 1 HEC = 1 HT thread on an HT enabled host ?

...phew, difficult to get a straight answer

So when you size a VMware environment, would you not take into account how many vCPUs are available and whether HT is enabled, or is it preferable not to enable it? You can see why I asked my original question. Why am I seeing case studies both with and without it enabled? Obviously it must have some effect.

> How many cores per SAP suggestion can be translated in SMT enabled vCPUs is not necessarily the question which has to be asked here. Anyhow, a recommendation "per core" is very vague as core performance differ significantly.

Well, let's say I want to take a bunch of hardware-based systems, some with very specific requirements, and virtualize them. One requirement is that the database systems are set to use at least 4 cores for good IO. Now clearly, when sizing such an environment I would need to determine the number of vCPUs supported. Also, if the vCPUs were based on an HT-enabled system, then my conclusion is that the performance would certainly not be the same as on a system without HT enabled. My guess is it would also be far easier to oversubscribe an HT-enabled host.

Anyway, let me rather go do some more reading first then revert back if necessary, I'm probably just confusing myself...

Thanks for your time.

Nelis

Former Member
0 Kudos

Hi Nelis,

> 1 vCPU = 1 HEC = 1 HT thread on an HT enabled host ?

Yes.

> ...phew, difficult to get a straight answer

Yes.

The problem is that resource management is a very dynamic topic. I don't want customers to follow rigid recommendations, because afterwards I have to process the tickets customers open because of bad performance. In most cases, the reason is a lack of sizing / determination of the expected workload. And that they don't read the Resource Management Guide...

Nevertheless, let's make an example:

You have an ESX server with 8 cores and HT enabled, so it is capable of scheduling 16 vCPUs at the same time. Well, not exactly, because HEC 0 is always reserved for the Console OS of ESX. So you have 15 vCPUs to execute your stuff.

Now you have three VMs: two VMs with 8 vCPUs each, one VM with 4 vCPUs. In that case, you have an overcommitment of 5 vCPUs - which is absolutely not a problem for VMs of an average load.
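The arithmetic in this example can be sketched in a few lines. This is only an illustration of the numbers in the post, not of how ESX actually accounts for resources. The function names are invented, and the single reserved context for the Console OS is taken from the statement above rather than from any VMware documentation:

```python
def schedulable_contexts(cores: int, ht_enabled: bool, reserved: int = 1) -> int:
    """Hardware execution contexts (HECs) left for VMs.

    Assumes two HECs per core with HT, one without, and `reserved`
    HECs set aside (HEC 0 for the Console OS, as described in the post).
    """
    contexts = cores * (2 if ht_enabled else 1)
    return contexts - reserved


def overcommitment(vm_vcpus: list[int], available: int) -> int:
    """Number of configured vCPUs beyond the available HECs (0 if none)."""
    return max(0, sum(vm_vcpus) - available)


available = schedulable_contexts(cores=8, ht_enabled=True)
print(available)                             # 16 HECs minus 1 reserved
print(overcommitment([8, 8, 4], available))  # 20 vCPUs configured vs. 15 HECs
```

With the three VMs from the example, 20 configured vCPUs against 15 usable contexts gives the overcommitment of 5 mentioned above.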

Let's assume two VMs are idle and one 8 vCPU VM gets fully utilized. In that case, this VM gets the processing resources of the whole ESX server, as the vCPUs in an HT environment usually are getting scheduled on different cores. So the workload on each vCPU can utilize the full core because the other thread on the core is idle. Only one vCPU has to share Core 0 (which contains HEC 0) with the ESX Console OS.

Now, let's assume each of the three VMs has a utilization of 75 %, so the ESX host would also be fully utilized. Don't think like "how many HECs does the 4 vCPU machine have now". If you count in HECs, you could assume that it uses 3 HECs and 1 HEC (the view from inside the guest machine would be: 1 CPU) would be "offline" or something. This IS NOT the case. The guest machine utilizes all 4 of the CPUs it sees, and the ESX CPU scheduler schedules the workload among the available HECs.

If you want to prioritize a certain VM so that its scheduling is guaranteed no matter how big the overall workload is, you can set reservations. And because ESX does not count in HECs when scheduling a machine's workload, the reservation is given in MHz, because this is the smallest unit of processing power the ESX CPU scheduler is aware of.

So if you are more confused than before - read the Resource Management Guide. Some say it has a much better wording than I have

Kind regards,

Matthias

nelis
Active Contributor
0 Kudos

Thanks Matthias for your contribution, it's much appreciated.

Just to finish off this thread....

After some additional reading I came across some useful information which hadn't been presented to me and which basically addresses my concerns about Hyper-threading in a virtualized environment. The best practice is to have HT enabled on a VMware host. My concern was that a vCPU allocated on a hyper-threaded host would give suboptimal performance as opposed to one on a host without HT enabled.

As it turns out, even with HT enabled you can still keep a VM from sharing a hyper-threaded core by a) assigning a vCPU to a complete core so that no other vCPU can use it, or b) allowing vCPUs of the same VM (SMP) to share a core while vCPUs of other VMs cannot. That's not to say that either of these options is recommended, but I get a warm fuzzy feeling inside knowing that I have a choice.

The winning statement for me was:

ESX systems manage processor time intelligently to guarantee that load is spread smoothly across all physical cores in the system. If there is no work for a logical processor it is put into a special halted state that frees its execution resources _and allows the virtual machine running on the other logical processor on the same core to use the full execution resources of the core._

Nelis

Former Member
0 Kudos

Hi Matthias,

I have a couple of questions because not everything is clear to me:

1) For an initial sizing, can we take it as a recommendation to assign 1 vCPU per HT thread?

2) Another related question: the SAP Online Help states that on Windows, one CPU core should support 7 work processes (http://help.sap.com/saphelp_nw73ehp1/helpdata/en/49/38111dc6d21ec9e10000000a42189b/content.htm). How about a core with 2 vCPUs? Still 7, or 7 per vCPU? Same for Linux?

Thanks,

Christian

Former Member
0 Kudos

Hello Christian,

When you run your Virtual Machines on hardware with active hyperthreading (HT), or simultaneous multithreading (SMT), things are a little different. It is hard to create a rule of thumb for this. Also, the 7 WPs / core rule is a very, very rough estimate. It depends on the application inside NetWeaver and on the type of users who are working on the system. Such recommendations are virtualization agnostic, therefore standard SAP performance rules apply.

With HT / SMT enabled on x86 CPUs, the computing resources of the core are shared between two hardware threads, so the core is used more efficiently. Mostly, the NetWeaver stack will profit from having more execution units. The downside is that a specific single-thread performance cannot be guaranteed, as there might always be work coming in from the other thread on the core. So if you have a workload which demands high computing throughput, you might see a performance degradation with active SMT.

Whether SMT is active or not, VMware ESX does not statically "map" threads or cores to vCPUs. But one vCPU will always be executed on one hardware execution context, as we call it. This is a thread on SMT-active hardware, and a core on non-SMT hardware.

On hardware with active SMT, when you have more vCPUs assigned to Virtual Machines than there are cores available, it will happen that two vCPUs (from the same Virtual Machine or from different Virtual Machines; this decision is up to the hypervisor) each use one thread of the same core and share its computing resources. So neither of the two vCPUs will have its full computing capability at that time. The hypervisor schedules CPUs very dynamically and tries to avoid such situations, but the higher your over-commitment ratio is, the more likely you are to run into such resource contention.
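A back-of-the-envelope way to see when this sharing becomes unavoidable is a simple counting argument. This is a sketch under strong assumptions, not a model of the ESX scheduler: every core has exactly two threads, all the counted vCPUs want to run at the same instant, and the scheduler spreads them across cores as widely as possible. The function name is invented for illustration:

```python
def min_shared_cores(cores: int, runnable_vcpus: int) -> int:
    """Smallest number of cores that must run two vCPUs at once.

    Assumes two hardware threads per core and an ideal placement that
    fills one thread on every core before using any second thread.
    """
    if runnable_vcpus > 2 * cores:
        raise ValueError("more runnable vCPUs than hardware threads; some must wait")
    return max(0, runnable_vcpus - cores)


print(min_shared_cores(cores=8, runnable_vcpus=8))   # each vCPU can have a core to itself
print(min_shared_cores(cores=8, runnable_vcpus=12))  # some cores must carry two vCPUs
```

As long as the number of simultaneously busy vCPUs stays at or below the core count, no vCPU has to share a core in this idealized picture. Beyond that, every additional busy vCPU forces one more core to split its resources between two threads.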

A more extensive article, which takes into account not only SMT but also NUMA effects, can be found here:

[NUMA, Hyperthreading and NUMA.PreferHT|http://frankdenneman.nl/2010/10/numa-hyperthreading-and-numa-preferht/]

Kind regards,

Matthias

Former Member
0 Kudos

There is a caveat though.

If you are going to run your hardware at or close to 100% utilization, Hyperthreading can hurt single-threaded performance by up to 30%.

Hyperthreading relies on a physical core being in some sort of wait state, so it can schedule a second thread on the same physical core and take advantage of this "idle time".

Now if you are running the physical cores at 100% (lots of big single-threaded, CPU-intensive jobs) or massively overcommitting, you don't get much idle time, so a smaller thread ends up waiting for resources.

Hyperthreading works great if you have lots of small threads, i.e. online transactional workloads or web-type workloads.

AMD has gone down a different route, with lots of smaller cores on the same die (no Hyperthreading), which tends to have less of an impact when running at high CPU utilization.
