HA Response Types – Host Isolation – VMware KB

Host isolation can be a troubling thing in the enterprise virtual data center. As we have moved forward with ESX, we have seen it happen from time to time. Somewhere at the beginning we skipped our homework, and we have since found that when we run into host isolation it almost always means some VMs are down and unresponsive. It also helps to understand how HA differs from DRS and vMotion. Today I read a VMware KB article that describes the HA response types, and their timing, for a host isolation event. I recall going through this in our Fast Track class, where we learned how to keep it from happening.

The Best Practice “Rule of Thumb”:
When configuring and setting up your enterprise, it’s important to know the little things that help you avoid potential downtime. Currently, host isolation pretty much means you have zero control over your VMs while the host is flagged as isolated. There are a couple of ways to avoid this: set manual thresholds for the isolation response time, so that more time passes before the host is declared isolated, or double the redundancy of your management network heartbeat. Doing so can also help show you exactly what you are having a problem with. I have seen host isolation responses triggered by storage connectivity problems, by the network going down, or by some configuration that was missed and left you with no redundancy. Setting manual thresholds is important for avoiding those worst-case scenarios. Below is the quick way of handling these things without touching any manual advanced options. The KB lists this as well: VMware KB: VMware High Availability host isolation resp…
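To make the threshold and redundancy idea a bit more concrete, here is a minimal Python sketch. It is only a simplified model of the reasoning, not how the HA agent is actually implemented. The class `HeartbeatNetwork`, the function `is_host_isolated`, the network names, and the 15-second default are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HeartbeatNetwork:
    """One management/heartbeat network as seen from a host (hypothetical model)."""
    name: str                       # illustrative label, e.g. "mgmt-a"
    seconds_since_last_beat: float  # how long this network has been silent


def is_host_isolated(networks: List[HeartbeatNetwork],
                     threshold_seconds: float = 15.0) -> bool:
    """Flag the host as isolated only if EVERY heartbeat network has been
    silent for at least the threshold. The default threshold is an assumed
    value for this sketch, not a documented setting."""
    if not networks:
        raise ValueError("at least one heartbeat network is required")
    return all(n.seconds_since_last_beat >= threshold_seconds for n in networks)


if __name__ == "__main__":
    single = [HeartbeatNetwork("mgmt-a", 20.0)]
    redundant = [HeartbeatNetwork("mgmt-a", 20.0), HeartbeatNetwork("mgmt-b", 1.0)]

    print(is_host_isolated(single))                          # one silent network
    print(is_host_isolated(redundant))                       # second network still alive
    print(is_host_isolated(single, threshold_seconds=30.0))  # longer threshold
```

In this model, a second heartbeat network that is still alive keeps the host from being flagged, and a longer threshold gives a brief network blip time to clear before anything happens to the VMs.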

HA Response Types

Leave powered on – When a network isolation occurs on the host, the state of the virtual machines remains unchanged and the virtual machines on the isolated host continue to run, even if the host can no longer communicate with other hosts in the cluster. This setting also reduces the chances of a false positive. A false positive in this case is an isolated heartbeat network, but a non-isolated virtual machine network and a non-isolated iSCSI/NFS network.
“So your network is basically up: your storage network is up and your VM network is up, but for some reason your host appears to have a problem. Fix the problem on your host to resolve the issue. You may have to reboot it in the end, but at least this way you know the VMs are still up. KEEP IN MIND that even when this happens, I have found that you often can’t get to the VM consoles because of the nature of the problem. Sometimes rebooting may be the best option.”
Power off – When a network isolation occurs, all virtual machines are powered off. It is a hard stop. A Power off response is initiated on the fourteenth second and a restart is initiated on the fifteenth second.
“So basically, have a virtual DR solution in place to keep from suffering from these catastrophes. Unless you are running high-performance NFS and taking your snapshots on the storage side, this could be a long outage. You might want to be sure to use VDR or another backup solution like Veeam. I know there are more, but at the very least you want to guarantee that you can restart in the event of a failure.”
Shut down – When a network isolation occurs, all virtual machines running on that host are shut down via VMware Tools. If this is not successful within 5 minutes, a Power off response type is executed.
“I am not sure about this one, but I would think it is a new option that many will choose, and it kind of fits a ‘DR avoidance’ scenario. Instead of expecting a full hard outage, we go with an option that gives an acceptable amount of downtime because we have redundancy at the VM layer. We document the power-offs, learn why they happen, and then move on. In some cases it may not even be a worry, if you don’t have downtime. Redundancy is the key.”
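The three response types above can be summarized in a small Python sketch. This is only an illustrative model of the behavior as the KB describes it, not VMware code or any real API. The names `IsolationResponse` and `plan_response` are hypothetical, and the timings (power off on the fourteenth second, restart on the fifteenth, a 5-minute shutdown timeout) are taken straight from the descriptions above.

```python
from enum import Enum
from typing import List, Optional


class IsolationResponse(Enum):
    LEAVE_POWERED_ON = "leave powered on"
    POWER_OFF = "power off"
    SHUT_DOWN = "shut down"


# Timings quoted from the KB text above.
POWER_OFF_AT_S = 14          # hard power off on the fourteenth second
RESTART_AT_S = 15            # restart initiated on the fifteenth second
SHUTDOWN_TIMEOUT_S = 5 * 60  # guest shutdown gets 5 minutes before a power off


def plan_response(response: IsolationResponse,
                  guest_shutdown_seconds: Optional[float] = None) -> List[str]:
    """Return the ordered steps for an isolated host under the given response.

    guest_shutdown_seconds is how long the guest needs to shut down cleanly;
    None means the shutdown never completes (for example, VMware Tools is
    not responding). It is only used for SHUT_DOWN.
    """
    if response is IsolationResponse.LEAVE_POWERED_ON:
        return ["VMs keep running on the isolated host"]
    if response is IsolationResponse.POWER_OFF:
        return [
            f"t={POWER_OFF_AT_S}s: hard power off of all VMs",
            f"t={RESTART_AT_S}s: restart initiated",
        ]
    # SHUT_DOWN: try a clean guest shutdown first, fall back to a hard stop.
    steps = ["guest shutdown requested via VMware Tools"]
    if guest_shutdown_seconds is None or guest_shutdown_seconds > SHUTDOWN_TIMEOUT_S:
        steps.append(f"not finished within {SHUTDOWN_TIMEOUT_S}s: hard power off")
    return steps


if __name__ == "__main__":
    for step in plan_response(IsolationResponse.SHUT_DOWN, guest_shutdown_seconds=600):
        print(step)
```

Laying it out this way makes the trade-off easier to see: Leave powered on does nothing to the VMs, Power off is a fast hard stop, and Shut down is a graceful attempt that becomes a hard stop only when the guest does not finish in time.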

Additional Info

http://www.vmware.com/files/pdf/VMwareHA_twp.pdf

I particularly like the material from page 21 onward in the document linked above. Have a good read and Merry Christmas!

***Disclaimer: The thoughts and views expressed on VirtualNoob.wordpress.org by Chad King in no way reflect the views or thoughts of his employer or of any other company. These are his personal opinions, which he has formed on his own. Also, products improve over time and some things may be out of date. Please feel free to contact us and request an update and we will be happy to assist. Thanks!~

About Cwjking

I am an IT professional who has worked in the industry for over 10 years. I started in Microsoft administration and solutions, and I was also a freelance consultant for small businesses. Ever since I first saw virtualization I have been fascinated by the concept. I currently specialize in VMware technology and consult daily on many different types of VMware solutions. My current role covers hands-on administration, technical design, and consulting.

Posted on December 24, 2010, in Training. 2 Comments.

  1. Check out Duncan Epping’s blog at yellow-bricks.com. He is a VMware employee who posts a lot on DRS and HA. He also just released a book about HA and DRS.

    • Yes, I visit Duncan’s site quite often and have worked with him on some community postings. He knows this stuff inside and out. He also makes an effort to get into the architecture and design rather than just giving you the watered-down version. I cannot wait to read some of his new book either. HA, DRS, and vMotion have some people confused, and he definitely lays it out quite well. I just saw VMware post this from the VMwarekb Twitter account and decided to give a friendly reminder… :). In fact, I will probably update this blog with a link to his articles over at Yellow-Bricks.com.
