Solved Mystery: DRS, Powering On VM fails with “insufficient resources”
Once there was a DRS-enabled cluster with a mystery: when trying to start a virtual machine, the very generic error "Insufficient resources" appeared. However, checking all hosts in the cluster revealed that there were enough resources left to start this specific virtual machine. Still, the VM would not power on and kept failing with said error.
At this point, the usual questions might be the following:
- On which host does DRS try to start the specific VM?
- Can the VM be started on the DRS-selected host manually? (e.g. when DRS is disabled or via ESXi Host Client)
- Are all VMs affected or just specific ones? Does this also apply to new VMs?
- Does a different virtual machine hardware version change the behavior?
- Is there any ISO mounted to the VM which might prevent DRS from starting the VM somewhere else?
- Are there any CPU or RAM reservations?
- Are there any DRS Rules or Host Affinity rules which might conflict?
- Does restarting management services change the situation?
To keep it short: It was none of the above. Everything looked fine, but DRS was indeed behaving strangely.
What we did:
- We created a new virtual machine with 4 MB of RAM, no disk or even a disk controller, and low CPU and RAM shares. The start still failed on a relatively powerful host with enough resources.
- When disabling DRS (Be careful with this in production! This deletes ALL resource pools!), the VM could be powered on. Once DRS was set back to manual, partially automated or fully automated, the issue persisted.
Just as I was about to end the remote session, I had one more sudden idea: checking the "Advanced Options" of DRS. And there we go: there was one setting called LimitVMsPerESXHostPercent set to 0, which I hadn’t heard of so far. Once this specific setting was removed for testing purposes, the VM could be powered on and DRS behaved as expected.
It turned out that this setting was introduced back in the vSphere 5.5 days to limit the number of virtual machines per host. As the setting was set to 0 (for unknown reasons in the customer’s environment), it seems to have had this unexpected side effect in their environment for quite a while.
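To get a feeling for what a value of 0 can mean, here is a minimal Python sketch of how the limit is commonly described: the mean number of VMs per host (counting the VMs about to be powered on) plus the configured percentage of that mean, rounded down. The formula is an assumption taken from public descriptions of the setting, not from VMware source code, and the function names are made up for illustration. It is not confirmed that this is exactly what happened in the customer's cluster.

```python
import math


def vm_limit_per_host(total_vms, host_count, buffer_percent):
    """Assumed formula: mean VMs per host plus buffer_percent of that mean, rounded down."""
    mean = total_vms / host_count
    return math.floor(mean + mean * buffer_percent / 100)


def hosts_accepting_power_on(vms_per_host, buffer_percent, new_vms=1):
    """Return the indexes of hosts that are still below the assumed per-host limit."""
    total = sum(vms_per_host) + new_vms
    limit = vm_limit_per_host(total, len(vms_per_host), buffer_percent)
    return [index for index, count in enumerate(vms_per_host) if count < limit]


# 4 hosts with 20 VMs each, 12 VMs about to be powered on, 50 percent buffer.
print(vm_limit_per_host(4 * 20 + 12, 4, 50))  # 34
# The same cluster with the setting at 0: the limit collapses to the mean.
print(vm_limit_per_host(4 * 20 + 12, 4, 0))   # 23
```

Under this assumption, a value of 0 leaves no headroom above the cluster average, so any host that already carries at least the average number of VMs would be rejected as a power-on target, no matter how much CPU and RAM it has left.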
To read more about the setting:
https://blogs.vmware.com/vsphere/2015/05/drs-keeps-vms-happy.html
http://www.yellow-bricks.com/2013/09/02/whats-new-in-vsphere-5-5-for-drs/
Key takeaway: It’s definitely worth checking the "Advanced Options" of DRS when DRS behaves differently than you would expect.
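If you want to make that check a routine, a small script can list the advanced options and flag the ones that cap VMs per host. The following is only a sketch: it assumes a pyVmomi-style cluster object that exposes the options as key/value pairs under cluster.configurationEx.drsConfig.option (to my understanding of the vSphere API, so please verify this in your environment). The watchlist and the function names are made up for illustration, and a stand-in object replaces the real cluster so the example runs without a vCenter.

```python
from types import SimpleNamespace

# Options worth a second look; this watchlist is an illustrative assumption.
WATCHLIST = ("LimitVMsPerESXHost", "LimitVMsPerESXHostPercent")


def drs_advanced_options(cluster):
    """Return the DRS advanced options of a cluster as a {key: value} dict."""
    options = cluster.configurationEx.drsConfig.option or []
    return {opt.key: opt.value for opt in options}


def flagged_options(options, watchlist=WATCHLIST):
    """Keep only the options that appear on the watchlist."""
    return {key: value for key, value in options.items() if key in watchlist}


# Stand-in for a real cluster object (hypothetical data, not a live query).
fake_cluster = SimpleNamespace(
    configurationEx=SimpleNamespace(
        drsConfig=SimpleNamespace(
            option=[SimpleNamespace(key="LimitVMsPerESXHostPercent", value="0")]
        )
    )
)

print(flagged_options(drs_advanced_options(fake_cluster)))
```

With a real connection, the cluster object would come from your vCenter inventory instead of the stand-in. Anything the script flags still deserves a manual look, since the setting may well have been put there on purpose.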
Thanks, this saved me many hours of debugging.