
Network Troubleshooting 101 – vSphere VM Guest (Updated)

General Information (VMware information): (Updated)
What is Beacon Probing?
If you don't know, I would give the VMware KB article on it a read.
You can find the KB links under Resources at the end of this post.

Let's hope I can do this some justice.

A few weeks ago I ran into an issue where VM NICs would just randomly go down. The only way I could get them back online was to vMotion the VM to another host, and resetting the port connection on the dvSwitch also seemed to fix the issue. So I wanted to highlight some basic troubleshooting steps and things you can do to better pinpoint where the issue may be occurring.

  1. From the guest VM having the issue, attempt to ping any other VM on the same host, switch, and portgroup. This will help you pinpoint whether the issue is related to the physical or the virtual networking.
  2. Enabling beacon probing can help detect upstream failures. (Be sure to read up on it.) Enabling beacon probing will increase bandwidth utilization and CPU cycles on an ESXi host, so consider the tradeoffs.
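
The ping test in step 1 can be scripted from inside the guest. This is only a rough sketch, not a VMware tool: the function names and the idea of testing one same-host peer plus one off-host target are my own assumptions. It relies on the fact that traffic between two VMs on the same host, vSwitch, and portgroup never leaves the host, so it does not depend on the physical uplinks.

```python
import platform
import subprocess


def ping(host, count=2, timeout_s=2):
    """Return True if `host` answers an ICMP echo, using the OS `ping` command."""
    # Windows uses -n for the packet count, most other systems use -c.
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    cmd = ["ping", count_flag, str(count), host]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=count * timeout_s + 5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def classify(same_host_ok, off_host_ok):
    """Map two ping results to a rough guess about where the fault sits.

    same_host_ok: a peer VM on the same host, switch, and portgroup answered.
    off_host_ok:  a target that has to cross the physical uplinks answered.
    """
    if same_host_ok and off_host_ok:
        return "no fault seen"
    if same_host_ok:
        # Local switching works, so look at the uplinks and physical switches.
        return "suspect physical/uplink"
    # Even host-local traffic fails, so look at the vNIC, port, or portgroup.
    return "suspect virtual"


# Example (hypothetical addresses, replace with your own peers):
#   print(classify(ping("192.0.2.10"), ping("192.0.2.1")))
```

Treat the result as a hint for where to look first, not a diagnosis; a peer VM that is simply powered off or firewalled will also fail the ping.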

Why do I need to enable it?
VMware recommends making this change for either permanent or temporary use. As stated above, beacon probing can help detect failures that occur upstream. When a failure occurs on the VM, this will help you isolate the issue to either the VMware virtual networking or the physical switching, which helps with troubleshooting. If the failure happens on the VM and the virtual switch uplinks also trigger an alert, that helps you pinpoint the issue.

Implementation Instructions:

Enabling Beacon Probing for Distributed Virtual Switch for a vCenter:

  1. Before enabling beacon probing, I will engage James Hendrock to capture bandwidth utilization before and after the change.
  2. Log in to the vSphere Client and connect to the vCenter you want to change it on.
  3. Browse to Home > Inventory > Networking.
  4. Expand the following objects in the tree to the left: (see figure)
  5. Right-click a Port Group under the dvSwitch > Edit Settings…
  6. Under Policies, highlight Teaming and Failover: (see figure)
  7. Open the Network Failover Detection dropdown (it should currently say Link Status only) > Select Beacon Probing (see figure)
  8. Click OK.
  9. Repeat steps 5-8 for all other Port Groups on the dvMgmt dvSwitch.

Take Away:

  1. Beacon probing is best used with a 3-pNIC configuration, and an N+2 switching design is highly recommended.
  2. With only 2 pNICs, whether or not you use an N+1 configuration, beacon probing will detect an upstream failure but has no way of knowing which uplink is bad.
  3. In a 2-pNIC configuration, if one uplink fails it will trigger a redundancy lost message, but at the same time it will just shotgun traffic down both pNICs to make sure communication gets through.
  4. If you are going to use beacon probing, you must also consider the network design.
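
To see why 2 pNICs cannot tell which uplink is bad while 3 can, here is a toy model in Python. It is a deliberately simplified sketch and not how ESXi actually implements beaconing: I assume each uplink sends a beacon that every other uplink should hear, and that an uplink cut off upstream neither delivers nor receives beacons. The uplink names and the `diagnose` function are made up for illustration.

```python
def diagnose(uplinks, isolated):
    """Toy beacon-probing model.

    uplinks:  list of uplink names in the team.
    isolated: set of uplinks that are cut off somewhere upstream.
    Returns (suspects, mode) where mode is "normal", "failover", or "shotgun".
    """
    # A beacon gets from one uplink to another only if neither end is isolated.
    hears = {
        u: {v for v in uplinks
            if v != u and u not in isolated and v not in isolated}
        for u in uplinks
    }
    healthy = [u for u in uplinks if hears[u]]

    if len(healthy) == len(uplinks):
        return [], "normal"
    if not healthy:
        # Nobody hears anybody, so every uplink is a suspect and the
        # only safe option is to send traffic out of all of them.
        return list(uplinks), "shotgun"
    # At least two uplinks still hear each other, so the silent ones stand out.
    return [u for u in uplinks if not hears[u]], "failover"


# Two pNICs, one cut off upstream: the failure is seen but not located.
print(diagnose(["vmnic0", "vmnic1"], {"vmnic1"}))

# Three pNICs, one cut off upstream: the bad uplink is singled out.
print(diagnose(["vmnic0", "vmnic1", "vmnic2"], {"vmnic2"}))
```

With two uplinks, each side only knows it stopped hearing the other, so both look equally guilty. The third uplink acts as the tie-breaker.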

Resources:
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1005577&sliceId=1&docTypeID=DT_KB_1_1&dialogID=273348945&stateId=0%200%20273356736
http://blogs.vmware.com/networking/2008/12/using-beaconing-to-detect-link-failures-or-beaconing-demystified.html
http://www.bctechnet.com/?p=10
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1012819&sliceId=1&docTypeID=DT_KB_1_1&dialogID=273348945&stateId=0%200%20273356736
http://frankdenneman.nl/2011/02/ip-hash-versus-lbt/
http://www.yellow-bricks.com/2010/08/04/high-physical-switch-cpu-load/

***Disclaimer: The thoughts and views expressed on VirtualNoob.wordpress.com by Chad King in no way reflect the views or thoughts of his employer or of any other company. These are his personal opinions, formed on his own. Also, products improve over time and some things may be out of date. Please feel free to contact us and request an update and we will be happy to assist. Thanks!~

Virtualization – Looking back?

So in the past couple of weeks I have been looking at all sorts of different hardware platforms to run a vSphere environment. In our environment at work we run all kinds of different implementations. We use Dynamic Desktop Virtualization (DDV), Virtual Desktop Infrastructure (VDI), remote site virtualization, and finally server virtualization. We started late in the game but have made relatively great leaps in moving forward. We currently have about 5000 VDI desktops, 300 DDV desktops (very new), and over 2000 virtual servers at 2 data centers, not to mention the remote sites as well.

So obviously this raises the question: why haven't we looked at what UCS, vBlock, or other hardware platforms have to offer? Something tells me that in the heat of getting things onto a virtual platform we had to just go with something very quick. Our big virtual push began over a year ago, and as of today we still leverage HP hardware. Most recently we built a cluster of 12 DL380s, which are performing much better than what we have seen with our previous implementations. This, however, still doesn't address other concerns like centralized management, randomized workloads, and consolidation of other things like networking, which UCS and HP can bring to the table. Not to mention Xsigo? I probably left a few out, but please forgive my ignorance.

When it's all said and done, I am looking back and asking why. I still think it would be great to see what UCS could bring to the table, but in our environment we deal with many different types of workloads. We have UNIX, Windows, Essbase, Oracle, SQL, and Active Directory (one of the largest), to name only a few, and we have yet to begin virtualizing these systems. Granted, we have gotten some Dev and Cert areas done, but we all know production is a different beast altogether. We implemented HP blades on our first go-around, with no 10G there. Now we have implemented DL380s, and still no 10G there either. Cisco UCS can bring a lot more ROI in the long run when you just look at network consolidation and management.

However, performance isn't really where we are hurting. The place I see the most room for improvement in the virtual infrastructure is the storage side of things. How do you maintain solid disk reads and writes without running into hot spots and taking a huge hit in performance? I have seen systems like IBM XIV and the NetApp PAM card that can really help with that, though I think NetApp is the better fit because our environment has many different kinds of workloads. I really hope we see different hardware platforms being tested this next year and that we quit taking all this stuff from HP. Don't get me wrong, I like HP and they do have good products for virtualization. I personally would like to see some competition and answers to some of these questions. Anyway, here is a bit on our environment.

Feel free to comment!

