Well, I have been at it again: trying to learn enough about Cisco UCS to better understand what it can do. I already know there is a lot of potential and that we probably don’t use it to its full capacity.
The other day a colleague and I were talking about general slowness in cloud environments, and he mentioned that we could improve performance for all the VMs by moving them from the E1000 adapter to VMXNET3. Now, I am fully aware of the benefits and features of VMXNET3, but I have to say I was very reluctant to buy into the idea that EVERY VM now gets a 10Gb link. At first thought, that terrified me. What if a VM all of a sudden decided to GO NUTS and completely saturate the link? That would impact other VMs, would it not? Yes, that could happen on a “RARE” occasion, but you obviously have to understand your design and how Cisco UCS works.
Now on to the other observations and misconceptions I had about VMXNET3. From what I have researched and gathered, most articles point to an increase in overall performance. Others reported that host-to-host communication improved even more than the percentages seen in outbound traffic. One blog post stated nearly a 300 percent increase! That’s very impressive. So now I can confidently say that if you are using Cisco UCS you should definitely consider using the VMXNET3 driver. (NOTE: You cannot use FT with VMXNET3)
So how exactly does all this tie into my Cisco UCS post?
In short it’s this link here.
“The revolutionary Cisco® UCS M81KR Virtual Interface Card (VIC) helps increase application performance and consolidation ratios, with 38 percent greater network throughput, complementing the latest increases in Cisco Unified Computing System™ CPU performance and memory capacity. The virtual interface card and the Cisco Unified Computing System together set a new standard for balanced performance and efficiency.”
Now the VIC seems pretty cool, but what I found a little disappointing is that most companies will only really use something like this for a particular “use case”. It’s also curious that they don’t get into other things like upstream traffic and how it would affect host-to-host communication. The other disappointing factor was that they tested this using RHEL, which I can understand, but it wasn’t really a real-world test. All they wanted to prove was that by offloading network traffic to UCS you get better performance. Now, this doesn’t mean I wouldn’t still want to know what it is capable of. Even so, they did show how much further traffic improved with the interface card and VMXNET3 together.
Now down to the nitty gritty:
1) Limitation on the total number of network interfaces for VMs
a) A half-height blade can only have 1 VIC = 128 virtual interfaces
b) A full-height blade can have a maximum of 2 VICs = 128-256 virtual interfaces
2) Doesn’t really benchmark Windows, which matters in the scheme of things considering MOST environments RUN Windows.
3) Doesn’t really go into detail on how you would bind these NICs between UCS and the vSphere hypervisor. It only covers allocating a MAC in UCS and then using VMDirectPath for the NIC. (This is probably simpler than I think.)
4) They don’t cover host to host, but they do cover chassis to chassis, and it is great to see that kind of performance. But come on, show us host to host!!!
5) Scenario 3 isn’t really clear on the VM Ethernet interface used. It says “Default enic”, so my guess is they couldn’t use anything but VMXNET3. Not sure why it says that.
6) No statistics on how CPU performance was affected in each scenario
7) Does this mean there is no need for Nexus 1000V switching, since you can use the VIC to set up your interfaces within UCS itself? (This would be my biggest reason: hand off to Net Eng = WIN!)
8) Lastly, VMware vCloud Director uses templates and is automated. How could you creatively design this to work with an automated cloud solution? (I mean, heck, I would love the performance. The only thing I can think of is a VCO plug-in for UCS tied into the VCO/VCD plug-in, maybe? This is why I say “use case”.)
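To put the interface limits from point 1 into numbers, here is a rough Python sketch. The 128 interfaces per VIC and the VIC counts per blade come from the list above; the function names, the `reserved` count, and the vNICs-per-VM values are made up for illustration and are not from Cisco's material.

```python
# Rough capacity check for the VIC limits in point 1 above.
# Hypothetical helper names; only the 128-per-VIC figure and the
# per-blade VIC maximums come from the post.

VIFS_PER_VIC = 128                 # virtual interfaces per VIC
MAX_VICS = {"half": 1, "full": 2}  # maximum VICs per blade form factor


def max_virtual_interfaces(form_factor: str, vics: int) -> int:
    """Total virtual interfaces a blade can present with the given VIC count."""
    limit = MAX_VICS[form_factor]
    if not 1 <= vics <= limit:
        raise ValueError(f"{form_factor}-height blade supports 1 to {limit} VICs")
    return vics * VIFS_PER_VIC


def max_vms(form_factor: str, vics: int, vnics_per_vm: int, reserved: int = 0) -> int:
    """How many VMs fit if each VM consumes vnics_per_vm dedicated interfaces.

    'reserved' is an assumed number of interfaces held back for the host itself.
    """
    usable = max_virtual_interfaces(form_factor, vics) - reserved
    return max(usable, 0) // vnics_per_vm


if __name__ == "__main__":
    print(max_virtual_interfaces("half", 1))               # 128
    print(max_virtual_interfaces("full", 2))               # 256
    print(max_vms("full", 2, vnics_per_vm=2, reserved=8))  # 124
```

The point of the arithmetic: if every VM gets its own dedicated interface(s), the interface count, not CPU or memory, can become the cap on consolidation for a blade.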
Obviously this is a lot of information, but I would honestly like to test this in my own environment and see how well it performs. Our cloud platform offers everything from WebLogic and Oracle to SQL and more. Anyway, let me know your thoughts; any other information would be greatly appreciated! Yes, I know I am a noob.
***Disclaimer: The thoughts and views expressed on VirtualNoob.wordpress.com and Chad King in no way reflect the views or thoughts of his employer or any other views of a company. These are his personal opinions which are formed on his own. Also, products improve over time and some things maybe out of date. Please feel free to contact us and request an update and we will be happy to assist. Thanks!~
Well, as anyone starting a new job knows, you almost always hit a point where things get a little slow and you have to find things to do. If your VMware environments are anything like the majority of them out there, chances are you can do some remediation. First off, you have to give credit where it’s due, and I can honestly say that Alan Renouf and Luc Dekens both do a fabulous job of bringing all kinds of cool scripts to the table. Anyway, let’s get on with it.
First things first: run over to Quest and grab PowerGUI Free, then get the VMware Quest PowerPack and the VMware Community PowerPack. You also want to pick up Alan’s vCheck, which is one of the most excellent tools ever!!
(Note: props to Alan and Kirk, who spent the majority of their time working on these excellent tools!)
Here is usually where I start:
1. Modify the vCheck to your liking (refer to the link on Alan’s blog for any questions). The things I usually end up modifying in this script are:
- Snapshot age (I change it to 3 days; 72 hours is long enough)
- Update NTP to your NTP server
- Change the datastore free space remaining threshold
- Disable detecting dead paths to LUNs (seems to hang for me at times)
- Adjust the VM free space threshold (though personally I think it needs to be % based, not MB free)
- Change vCenter alerts to something appropriate (I use 7 days)
- Change the VM removal time frame (I also use 7 days)
2. Now run the script and check out your remediation items. Pay attention to certain issues like:
- vMotion constraints because of attached CD-ROMs
- Datastores low on space (the PowerPack can help with this)
- VMware Tools out of date, having issues, or not installed at all
- The above appear to be the more common ones. I run this script weekly.
3. From the PowerPacks I usually run the following scripts:
- Best Practice Queries > Disk Queries > Orphaned VMDKs
- Best Practice Queries > Disk Queries > RDMs
- Best Practice Queries > Disk Queries > Thin Disk
- Community PowerPack > Resource Pools > Ballooning Script
- Virtual Machine > VM with over X number CPU’s
- CD-ROMs mounted to VM
4. In the interim I will rerun vCheck, or I will run single scripts on an as-needed basis:
- Snapshots > All Snapshots
- Virtual Machines > HAL Information (not really an issue with Win2k8)
- Virtual Machines > CPU Ready %
- Virtual Machines > VM with active memory ballooning
- Waste Finder > If I feel like doing some deeper Datastore Cleanup
- Powered off VM
- Scan VMs for NIC drivers (update to or install VMXNET3 if using E1000)
- Check the disk alignment of all your VMs, both Linux and Windows, and each drive, then update templates
- Enable LPS for certain VMs if needed. Windows doesn’t enable it by default, but ESX 3.5 and up does. Yields memory savings depending on the app.
- Check to ensure Windows 2008 templates and VMs have the WDDM display driver
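The two thresholds from step 1 that I care about most, snapshot age and free space, can be sketched in a few lines. This is only an illustration in Python of the logic, not vCheck’s actual code (vCheck is a PowerShell script). The inventory data, the field names, and the 15 percent cutoff are all made up; only the 72-hour snapshot age comes from the list above.

```python
from datetime import datetime, timedelta

SNAPSHOT_MAX_AGE = timedelta(hours=72)  # 3 days, as in step 1
MIN_FREE_PERCENT = 15.0                 # hypothetical cutoff, pick your own


def old_snapshots(snapshots, now, max_age=SNAPSHOT_MAX_AGE):
    """Return the snapshots whose age exceeds max_age."""
    return [s for s in snapshots if now - s["created"] > max_age]


def low_space(datastores, min_free_percent=MIN_FREE_PERCENT):
    """Flag datastores by percent free rather than by an absolute MB/GB figure."""
    flagged = []
    for ds in datastores:
        free_pct = 100.0 * ds["free_gb"] / ds["capacity_gb"]
        if free_pct < min_free_percent:
            flagged.append((ds["name"], round(free_pct, 1)))
    return flagged


if __name__ == "__main__":
    now = datetime(2011, 1, 10, 12, 0)
    snaps = [
        {"vm": "vm-a", "created": datetime(2011, 1, 6, 9, 0)},   # older than 72h
        {"vm": "vm-b", "created": datetime(2011, 1, 8, 12, 0)},  # 48h old
    ]
    stores = [
        {"name": "ds-big", "capacity_gb": 1000, "free_gb": 100},   # 10% free
        {"name": "ds-small", "capacity_gb": 200, "free_gb": 100},  # 50% free
    ]
    print([s["vm"] for s in old_snapshots(snaps, now)])
    print(low_space(stores))
```

Both sample datastores have the same 100 GB free, but only the large one is flagged. That is why a percentage-based check makes more sense to me than a fixed MB-free number.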
At first glance some of these items may not make sense, but you have to consider your own environment. HAL is a good one to run, really more so the first time around, just to make sure your older stuff (Windows 2k3 and 2k) is using the right HAL for the vCPU count. I also like to run a script to dismount all ISOs from the VMs. Now, you may want to make sure it’s not a VMware Tools ISO mounted to the VM. If it is, you can possibly get a pop-up for a Linux VM, and the VM will appear to be unresponsive until someone answers the pop-up with a yes or no. Keep in mind remediation is about starting with the quick and easy and then working your way down. It takes time and creativity.
Now, you will have challenges when remediating some things, like snapshots when they are really big. I will add a second part describing what I normally do; in most cases a clone fixes the issue.
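The ISO check described above boils down to splitting VMs with a mounted ISO into “safe to dismount” and “look at this one first”. Here is a minimal Python sketch of that split. The VM records are invented, and the path fragments used to spot a VMware Tools ISO are an assumption on my part; check what the Tools ISO path looks like in your own environment before trusting them.

```python
# Assumed path fragments that indicate a VMware Tools installer ISO.
# Verify these against your own hosts; they are not taken from any script.
TOOLS_ISO_MARKERS = ("/isoimages/", "/tools-isoimages/")


def split_mounted_isos(vms, markers=TOOLS_ISO_MARKERS):
    """Split VMs with a mounted ISO into (safe_to_dismount, needs_review).

    VMs whose ISO path looks like a VMware Tools image go to needs_review,
    since dismounting may trigger a prompt that blocks the VM until answered.
    """
    safe, review = [], []
    for vm in vms:
        iso = vm.get("iso")
        if not iso:
            continue  # nothing mounted
        if any(marker in iso.lower() for marker in markers):
            review.append(vm["name"])
        else:
            safe.append(vm["name"])
    return safe, review


if __name__ == "__main__":
    inventory = [  # hypothetical data
        {"name": "web01", "iso": "[ds1] iso/win2k8.iso"},
        {"name": "lin01", "iso": "[] /usr/lib/vmware/isoimages/linux.iso"},
        {"name": "db01", "iso": None},
    ]
    safe, review = split_mounted_isos(inventory)
    print("dismount:", safe)
    print("review first:", review)
```

In practice you would feed this from whatever inventory export you already have, then only bulk-dismount the first list.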
(NOTE: I will be adding additional links later on)