10 Pitfalls That Can Impact VMware Performance

Written by SPK Blog Post

Published on January 22, 2013

Categories: Data Engineering | Engineering Operations | Infrastructure | PDM/PLM-Product Data and Lifecycle Management | Software Development & Release Management

Ensuring servers provide consistent performance is a primary goal for all infrastructure management services. A large portion of our servers run in a virtualized environment, and the added complexity can present some challenging performance issues. The common solution of throwing faster CPUs and more RAM at performance problems may resolve most of your issues, but in many cases it takes deeper analysis to uncover the not-so-obvious bottlenecks.

Here are 10 pitfalls I’ve encountered that may be negatively impacting your VMware environment:

VMware tools. Yes, this is a very obvious item. VMware Tools not only provides an optimized NIC driver but, more importantly, includes a memory ballooning driver. The balloon driver encourages your guest to swap out inactive memory pages, which can be very useful, particularly on over-committed hosts. The pitfall I frequently run into is that our Linux machines are patched and rebooted on a regular basis. Some of these updates include a new kernel, and when that is the case, the VMware Tools kernel modules need to be rebuilt. This sounds like a good candidate for a custom Nagios plugin! The plugin could run lsmod and make sure the VMware modules are present.
Storage tradeoffs. In a perfect world, we want large, fast, inexpensive disks in our storage array. Large disks in the 2, 3, and 4TB range are typically limited to the SATA variety. Conversely, building a pure SSD-based storage array could easily run you into the $60k range for only 10TB of space. SAS is a great middle ground. 10K RPM drives are now available in 900GB 2.5″ form factors, so density is a plus there as well. In my experience, slow storage is one of the most common bottlenecks. A 10-spindle SATA array with a quality RAID controller may provide disappointing results when coupled with an intense workload such as virtualized databases. SSD caches can be implemented in a couple of ways to help boost performance. From VMware’s perspective, vSphere 5 now lets you migrate VM swap files onto SSD disks. From the array’s perspective, RAID controllers may feature SSD caching as well. One example is Adaptec’s maxCache feature.
Cores vs Clock Speed. Back to my comment about throwing more CPUs at a performance problem: there are cases where more is not necessarily better. It’s important to match your CPU type to the workload. For instance, if your VM workload consists of a few single-threaded applications, you will want the fastest CPUs available, not more CPU cores. However, if your VM workload consists of something like a virtual desktop infrastructure (VDI), you’ll likely care more about the total number of CPU cores available to the host.
Host density. While it may be great to tout the “consolidation ratio” you’ve achieved to upper management, the reality is that you need to be prepared for a host failure at some point. When that moment arrives, assuming your remaining hosts even have the spare capacity, how quickly can the VMs from the failed host recover? When pricing out a new environment, perhaps it makes sense to reduce the specs of several hosts slightly so that an additional one can fit in the budget.
Network bandwidth. Most servers nowadays include two gigabit Ethernet interfaces, sometimes four. Two will be enough to get you by, but it is not ideal. Consider a situation where you have management and vMotion traffic on NIC A, and VM Network traffic on NIC B. You could potentially lose management access to your host if vMotion traffic saturates the network. For new installations, consider migrating to 10Gb Ethernet, which should provide more than adequate bandwidth for all traffic combined.
Lack of Capacity planning. For some reason or another, when customers hear the term virtual, they assume that there’s no incremental cost involved in adding VMs. In actuality, we know that nothing is free. Add the fact that VMs are trivial to provision, and we frequently find ourselves giving in to requests without much thought. Instead of completely pushing back on the customer, perhaps make it a policy that each new virtual machine that comes online should have a capital budget associated with it. When host density reaches a certain point, the budget should have enough to cover a new host along with the supporting storage, licensing, and other infrastructure costs.
Inventory. This goes hand in hand with capacity planning. Know what VMs live in your environment, who owns them, and what applications are tied to them. Quarterly or even yearly queries out to your customers may reveal that a significant number of VMs are associated with retired applications or cancelled projects.
Resource Pools. Configuring resource pools can be tedious and time-consuming, but they may make your life much easier in the long run. If you find it difficult to carve out resource pools based on departments or functional groups, a quick win is to simply create a “prod” and a “dev” pool. Non-critical development or test machines can be pooled together with a smaller amount of resource shares. Additionally, you could leverage host affinity so that critical machines run on your newer, faster hosts.
Lack of visibility. Visibility is an important part of ensuring consistent performance. Often, a customer will mention to me that a VM “feels” slower. In order to make an accurate comparison, we need historical metrics. While the built-in vSphere performance tools are great, I find myself looking immediately at Veeam ONE instead. Veeam provides a nice consolidated view of all your vSphere instances, with dashboard graphs that make problems easy to pinpoint.
Expectations. The hardware you’ve been blessed with can only perform so well. Keeping expectations in line may be all there is to the solution. For new projects, perform not only functional testing of your application, but also a performance qualification. If possible, do the same for P2V conversions. You may uncover a potential performance issue even before going live.
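
The Nagios plugin idea from the VMware tools item could look something like the sketch below. It is only a starting point, not a finished plugin. The module names in REQUIRED_MODULES are assumptions (vmw_balloon and vmxnet3 are the names used by the drivers shipped with the Linux kernel, while modules built by VMware Tools may appear under other names such as vmmemctl), so check what lsmod reports on a known-good guest and adjust the list. The exit codes follow the usual Nagios convention of 0 for OK, 2 for CRITICAL, and 3 for UNKNOWN.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check that expected VMware modules are loaded."""
import subprocess

# Assumed module names; replace with what lsmod shows on a healthy guest.
REQUIRED_MODULES = ("vmw_balloon", "vmxnet3")

OK, CRITICAL, UNKNOWN = 0, 2, 3  # conventional Nagios plugin exit codes


def loaded_modules(lsmod_output):
    """Return the set of module names in lsmod output, skipping the header row."""
    lines = lsmod_output.strip().splitlines()[1:]
    return {line.split()[0] for line in lines if line.strip()}


def check_modules(lsmod_output, required=REQUIRED_MODULES):
    """Return (exit_code, message) for the given lsmod output."""
    missing = sorted(set(required) - loaded_modules(lsmod_output))
    if missing:
        return CRITICAL, "CRITICAL - missing modules: " + ", ".join(missing)
    return OK, "OK - all modules loaded: " + ", ".join(required)


def main():
    """Run lsmod, print a one-line status, and return the Nagios exit code."""
    try:
        result = subprocess.run(
            ["lsmod"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"UNKNOWN - could not run lsmod: {exc}")
        return UNKNOWN
    code, message = check_modules(result.stdout)
    print(message)
    return code
```

To run it as a plugin, finish the script by calling sys.exit(main()) so that Nagios sees the exit code. Keeping the parsing in check_modules, separate from the lsmod call, makes it easy to test against saved lsmod output.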

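The spare-capacity question in the host density item can be roughed out with a few lines of arithmetic. The sketch below is a deliberate simplification: it looks only at total memory, ignores CPU, overcommit, and whether each individual VM fits on a single host, and all of the host and VM sizes are hypothetical numbers chosen for illustration.

```python
def survives_host_failure(host_ram_gb, vm_ram_gb, failed_hosts=1):
    """True if all VM memory still fits after losing the largest hosts."""
    keep = max(0, len(host_ram_gb) - failed_hosts)
    surviving = sorted(host_ram_gb)[:keep]  # worst case: the biggest hosts fail
    return sum(vm_ram_gb) <= sum(surviving)


vms = [110] * 5  # hypothetical: five VMs at 110 GB each, 550 GB in total

# Same 768 GB of total RAM, split across three larger or four smaller hosts.
print(survives_host_failure([256, 256, 256], vms))       # False: 512 GB left
print(survives_host_failure([192, 192, 192, 192], vms))  # True: 576 GB left
```

With the same total memory, the four-host layout tolerates a single host failure where the three-host layout does not, which is the argument for trading slightly lower specs for one more host.
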
Next Steps:

Contact SPK and Associates to see how we can help your organization with our ALM, PLM, and Engineering Tools Support services.
Read our White Papers & Case Studies for examples of how SPK leverages technology to advance engineering and business for our clients.

Michael Solinap
Sr. Systems Integrator
