Network Troubleshooting 101 – vSphere VM Guest (Updated)

General Information (VMware information): (Updated)
What is Beacon Probing?
If you don't know, I would give the VMware KB article on beacon probing a read.
You can find it under Resources at the end of this post.

Let’s hope I can do this some “Justice”

A few weeks ago I ran into an issue where VM NICs would randomly go down. The only way I could get them back online was to vMotion the VM to another host; resetting the port connection on the dvSwitch seemed to fix the issue. So I wanted to highlight some basic troubleshooting steps and things you can do to better pinpoint where the issue may be occurring.

  1. From the guest VM having the issue, attempt to ping any other VM on the same host, switch, and portgroup. This helps you pinpoint whether the issue is related to the physical or the virtual networking.
  2. Enabling Beacon Probing can help detect upstream failures (be sure to read up on it). Beacon probing increases bandwidth utilization and CPU cycles on an ESXi host, so consider the tradeoffs.
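The reasoning behind step 1 can be sketched in a few lines of Python. This is only an illustration of the decision, not a VMware tool: the function name `localize_fault` and the second input (a ping to a VM on a different host) are assumptions added here for contrast. The idea is that traffic between two VMs on the same host, switch, and portgroup normally stays inside the virtual switch, so a failure there points away from the physical network.

```python
from enum import Enum


class Suspect(Enum):
    NONE = "no fault observed"
    PHYSICAL = "uplinks or physical switching"
    VIRTUAL = "guest or virtual networking"


def localize_fault(same_host_ping_ok: bool, off_host_ping_ok: bool) -> Suspect:
    """Hypothetical helper: guess where to look first from two ping results.

    same_host_ping_ok: ping to a VM on the same host, switch, and portgroup.
    off_host_ping_ok:  ping to a VM on another host (an assumption added for
                       this sketch; the post itself only describes the first test).
    """
    if not same_host_ping_ok:
        # The traffic never needed a physical NIC, so start with the guest
        # and the virtual networking.
        return Suspect.VIRTUAL
    if not off_host_ping_ok:
        # Local traffic works but traffic leaving the host does not.
        return Suspect.PHYSICAL
    return Suspect.NONE


if __name__ == "__main__":
    # Same-host ping works, off-host ping fails.
    print(localize_fault(True, False).value)
```

Treat the result as a hint about where to start looking rather than a diagnosis.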

Why do I need to enable it?
VMware recommends introducing this change for either permanent or temporary use. As stated above, beacon probing can help detect failures that occur upstream. When the failure occurs on the VM, this helps you isolate the issue to either the VMware virtual networking or the physical switching, which narrows down the troubleshooting. If the failure happens on the VM and the virtual switch uplinks trigger an alert, that alert helps you pinpoint the issue.

Implementation Instructions:

Enabling Beacon Probing for Distributed Virtual Switch for a vCenter:

  1. Before enabling Beacon Probing, I will engage James Hendrock to capture bandwidth utilization before and after the change.
  2. Log in to the vSphere Client and connect to the vCenter you want to change
  3. Browse to Home > Inventory > Networking
  4. Expand the following objects in the tree to the left: (see figure)
  5. Right Click a Port Group under the dvSwitch switch > Edit settings…
  6. Highlight under policies the Teaming and Failover: (see figure)
  7. Select the Network Failover Detection dropdown (should currently say Link Status only) > Select Beacon Probing (see figure)
  8. Click OK
  9. Repeat steps 5-8 for all other Port Groups on the dvMgmt dvSwitch.

Take Away:

  1. Beacon probing is best used with a 3-pNIC configuration, and an N+2 switching design is highly recommended.
  2. With only 2 pNICs, whether or not you use an N+1 configuration, beacon probing will detect a downstream failure but has no way of knowing which uplink is bad.
  3. In a 2-pNIC configuration, if one uplink fails, the host raises a redundancy-lost message and at the same time sends traffic down both pNICs (shotgun mode) to make sure it gets through.
  4. If you are going to use Beacon Probing, you must also consider the network design.
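To make the 2-pNIC versus 3-pNIC difference concrete, here is a deliberately simplified Python model of the decision. It is not VMware's implementation: the function `evaluate_beacons`, the `heard` mapping, and the state names are all made up for this sketch. It assumes each uplink sends beacons and records which other uplinks' beacons it received.

```python
from typing import Dict, List, Set, Tuple


def evaluate_beacons(heard: Dict[str, Set[str]]) -> Tuple[str, List[str]]:
    """Toy model of beacon probing (hypothetical, for illustration only).

    heard maps each uplink name to the set of uplinks whose beacons it received.
    Returns a state and a list of uplinks:
      ("healthy", [])            every uplink exchanged beacons with a peer
      ("failover", failed)       isolated uplinks can be identified
      ("shotgun", all_uplinks)   nobody hears anybody, so the bad uplink is
                                 unknown and traffic goes out on all of them
    """
    uplinks = sorted(heard)
    # An uplink counts as connected if it exchanged a beacon, in either
    # direction, with at least one other uplink in the team.
    connected = {
        u
        for u in uplinks
        if any(v in heard[u] or u in heard[v] for v in uplinks if v != u)
    }
    if len(connected) == len(uplinks):
        return ("healthy", [])
    if not connected:
        return ("shotgun", uplinks)
    return ("failover", sorted(set(uplinks) - connected))


if __name__ == "__main__":
    # Two uplinks, beacons lost: no way to tell which side is broken.
    print(evaluate_beacons({"vmnic0": set(), "vmnic1": set()}))
    # Three uplinks, vmnic2 isolated: the other two still see each other.
    print(evaluate_beacons({
        "vmnic0": {"vmnic1"},
        "vmnic1": {"vmnic0"},
        "vmnic2": set(),
    }))
```

With two uplinks a lost beacon leaves no healthy pair to compare against, which is why the third pNIC matters in this model: two uplinks that still hear each other are enough to single out the isolated one.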

Resources:
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1005577&sliceId=1&docTypeID=DT_KB_1_1&dialogID=273348945&stateId=0 0 273356736
http://blogs.vmware.com/networking/2008/12/using-beaconing-to-detect-link-failures-or-beaconing-demystified.html
http://www.bctechnet.com/?p=10
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012819&sliceId=1&docTypeID=DT_KB_1_1&dialogID=273348945&stateId=0 0 273356736
http://frankdenneman.nl/2011/02/ip-hash-versus-lbt/
http://www.yellow-bricks.com/2010/08/04/high-physical-switch-cpu-load/

***Disclaimer: The thoughts and views expressed on VirtualNoob.wordpress.com and by Chad King in no way reflect the views or thoughts of his employer or of any other company. These are his personal opinions, which are formed on his own. Also, products improve over time and some things may be out of date. Please feel free to contact us and request an update and we will be happy to assist. Thanks!

About Chad King

I am an IT professional who has worked in the industry for over 10 years. I started in Microsoft administration and solutions and was also a freelance consultant for small businesses. Since I first saw virtualization I have been fascinated by the concept. I currently specialize in VMware technology and consult daily on many different types of VMware solutions. I have experience in all domains related to virtualization and cloud solutions. My background spans help desk, administration, engineering, implementation, and design, which allows me to provide creative solutions with an understanding of the gaps at all levels.

Posted on January 10, 2012, in Troubleshooting. 4 Comments.

  1. I’ve recently been searching for more in-depth information on how beacon probing really works in vSphere, and I think that point 3 of your post (In a 2-pnic configuration BOTH UPLINKS WILL FAIL because of the failure, however you will know that it was something downstream. That is the only real benefit of using Beacon Probing) is not true.

    I’ve found this article which describes what really happens:

    http://blogs.vmware.com/networking/2008/12/using-beaconing-to-detect-link-failures-or-beaconing-demystified.html

    The important thing with a 2-pnic configuration is that one failed uplink doesn’t fail both vmnics. Instead the host starts to send traffic to both uplinks because it cannot determine which one failed; they call it shotgun mode. The visual indicator is that there is no red x mark on either of the two vmnics. However, there is an alarm at the ESX host level stating that link redundancy is lost.

    I’ve made a test and it fully confirms that behaviour.

    • But does it disable the VMnics? Or is just degraded and still functional?

      • With a 2-vmnic configuration, when neither vmnic receives beacon probing packets, it doesn’t disable either of them. It raises an alarm at the host level that redundancy is lost and starts to send traffic to both vmnics (because it doesn’t know which one is failing), so there is no loss of connectivity and both vmnics look like they function fine. You can’t tell which one is failing from the interface.

      • ahh yes, I recall reading that as well. Thanks for commenting so this is good information. For whatever reason it just never clicked. I have made the updates! thanks again!
