vCenter Orchestrator – Working with SSL and Secure LDAP with MS Active Directory

A few days back I decided to tackle something that I finally figured out, and I thought it would be a good idea to share it since others seem to run into it from time to time. vCenter Orchestrator (VCO) is getting a lot more attention these days because of the automation it can bring to your VMware virtual environment. I won’t go into detail about everything it can do, as there is plenty of that out there. Instead I want to focus on security, specifically on working with Secure LDAP (LDAPS) over SSL. There are probably sources out there that speak to this, but many of the ones I read and reviewed were lacking in some way. Some deferred to other articles that were even more vague. My only goal is to enable you to connect vCenter Orchestrator securely to Microsoft Active Directory and be able to modify it.

What did it take?

Well here are some suggestions and key things to consider when standing this up.

  1. What accounts does VCO need?
  2. When using the Active Directory plug-in, what further permissions does the account making AD changes need in Active Directory?
  3. What exactly do the VCO users do and how do they work? (I would like VMware to detail this as current documentation is lacking)
  4. If using a Certificate Authority server what kind of certificate do I need to establish the SSL connection to a Domain Controller?
  5. How do I allow LDAPS on my Domain Controller?
  6. What Group Policy Configurations may be needed?
  7. Any other caveats to this?

Now I want to go through a standard process of setting up LDAP-SSL and credit specific outside sources for their contributions and help.

Enabling LDAP SSL Requirements

  • Stand up a MS Certificate Authority Server (This is to line up with my scenario)
  • Issue a Domain Certificate Template on the Certificate Authority specifically for LDAP-SSL. I used specific Subject Name settings for it. Note: the template name is just what I used to standardize LDAP SSL for the VCO design.
  • On the root CA export the Root Certificate
  1. From the MS Root CA server:
  2. Go to Start > Run > Type MMC
  3. Go to File > Add/Remove Snap-in
  4. Add Certificates: click the “>” button
  5. Choose Local Host for connection
  6. Expand Certificates > Personal > Click Certificates
  7. In this repository you will find the Root CA Certificate. You will know this because the Certificate Template is a “Root Certificate”.
  8. Highlight the certificate and click All Tasks > Export > Next > Next (.DER) > Name and Save > Next > Finish
  9. Log into VCO and click the Network and then the SSL tab as shown:
  10. Go to the Import from file options:
  11. Click the search icon and browse to the location. Click Import. After importing you will see the following:
  12. The common name will only show in VCO if you are using the common name as an option on the SSL certificate, as I described for the domain controller. Most CA Root Certificates import with no issue. I just like to know what certificates I have installed.
  13. Once you have deployed the LDAP SSL certificate to your Domain Controllers (ALL OF THEM) and imported the root CA (the same one that signed the DCs’ certificates), you can establish an SSL connection with the Domain Controllers.

Note: You will have to check “enable SSL” on both the LDAP and the Active Directory plug-in configurations. You will also have to ensure ports 389 and 636 are open between VCO and the Active Directory servers. If you have issues, temporarily turn off any firewall to rule out networking as the cause. If the connecting accounts have access and network connectivity is not an issue, you will see wonderful green lights on everything.
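If you want to rule out networking before blaming certificates or accounts, a quick TCP check from any machine on the same network segment as VCO can help. The following is only a minimal sketch in Python using the standard library; the function names are my own, and it only proves that a TCP connection can be opened on ports 389 and 636, not that the LDAPS handshake or the bind succeeds.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ldap_ports(host, ports=(389, 636), timeout=3.0):
    """Map each port (LDAP 389 and LDAPS 636 by default) to True/False for reachability."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, `check_ldap_ports("dc01.example.local")` (a placeholder host name) returns a dictionary such as `{389: True, 636: False}`, which would point at a firewall or at LDAPS not being enabled on that Domain Controller.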

Connecting with SSL Authentication and why it’s needed

At this point it’s clear this is somewhat straightforward, but note that missing any of these steps will result in a broken SSL connection to your domain controllers. Using a root CA makes things a lot easier. Usually I would just deploy a trusted signed SSL certificate to my appliance, but in my case that feature of VCO was actually broken. The real use case behind this is being able to fully automate Active Directory user creation. Allowing VCO to run an Active Directory workflow that creates an enabled user requires LDAPS. You can create disabled users all day, but when it comes to enabling and modifying them you have to be granted access. SSL with LDAP ensures the handshake with Active Directory is solid and that you can connect securely. However, this is only a small part of the puzzle. We will now cover the users.

At first you will need two or three users. Some things are left unclear in the following PDF paper from VMware: http://www.vmware.com/pdf/ad_plugin_10_users_guide.pdf

I will say that it is a good starting point for using Active Directory, but note that you cannot run any real workflows without setting up SSL securely when working with Active Directory. VMware doesn’t really address this well in my opinion, and there is a lot of fragmentation out there. Another good resource was this white paper: http://communities.vmware.com/docs/DOC-13959

I was still left with trying to figure some things out…

My suggestion when setting up the accounts

So in total from my standpoint when setting up VCO with AD (Active Directory) Access you will need 3 separate accounts.

  • One for the LDAPS connection
  • One for the primary connection to AD
  • One for the “Shared” session for AD

My assumption is that VCO uses these accounts for specific tasks, and from a security standpoint it may not be ideal to use a single account for two functions, especially when talking to AD. VCO essentially uses Java to make API calls to Active Directory. The LDAP account specified in the LDAP configuration is used for authentication and connection for the LDAPS functions (this is poorly documented by VMware, so it’s my assumption). The purpose of the accounts used in the AD plug-in is beyond me. I know for certain one is used for accessing AD and making changes, but I am not sure about the point of the “Shared” session and what its implications are. I would like more visibility on this…

If you are having issues…

I would not be surprised if this happens to you… so I would recommend the following for troubleshooting:

So if your lights will not go green for connecting on SSL check the following:

  • Retrace your steps and ensure you have the CA root certificate on your VCO appliance
  • Ensure you have the LDAP SSL certificate deployed to your AD servers for LDAP auth.
  • Ensure the VCO user account in LDAP is able to access and connect to AD. You can test by configuring SSL on the appliance; if you get all green lights that is a good sign connectivity is working.
  • A second test could be to log on with the VCO user account and ensure you can browse AD.
  • On the LDAP button there is a “test login” tab. Do a test login, but ensure the user is a member of the vCO admin group you configured VCO to use for access.
  • Make sure ports 389 and 636 are open from VCO to the LDAP servers (AD in this case)
  • Update 1: you can check for GPO signing as documented here: (This link tells you how to enable it, but it is the same way to set it to no signing. Ideally a GPO is applied through more granular means than the default domain policy. You can create a specific GPO tied to an OU that contains your domain controllers and apply it there instead.)

So Chad, my appliance is connected on SSL but my workflows break dude… I get an error like this:

Exception

Unable to create a new user: InternalError: Failed to create user account… [LDAP: error code 50 – 00000005: SecErr: DSID-031521D0, problem 4003 (INSUFF_ACCESS_RIGHTS), data 0 ] (Dynamic Script Module name : createUserWithPassword#6) (Dynamic Script Module name : createUserWithPassword#9)

This one can stink so my suggestion is the following:

  • Make sure your VCO user accounts have proper permissions in Active Directory.
  • You can do further testing (if you don’t have the ability to do the following) by using another LDAP tool like LDAP administrator. You can set it up exactly as you would VCO and even use the VCO service account to prove it.
  • The easy way would be: create a new user that is part of no groups, make it a domain admin only, add it to the VCO admin group, and configure VCO for that account. (Note: It’s important to ensure the VCO admin group is not restricted in any way through a delegation of permissions. I found this issue when someone else created the account and, although it had domain permissions, it was restricted.)
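It helps to read the exception text itself before changing anything: LDAP result code 50 is the standard "insufficient access rights" result, so the error quoted earlier is a permissions problem on the AD side rather than an SSL problem. As a rough sketch (the function names and the sample constant are my own, and the message format is assumed to match the one shown in this post), the relevant fields can be pulled out of a workflow log like this:

```python
import re

# Patterns are based only on the error text quoted in this post.
_CODE_RE = re.compile(r"LDAP: error code (\d+)")
_PROBLEM_RE = re.compile(r"problem (\d+) \(([A-Z_]+)\)")


def parse_ldap_error(message):
    """Extract the LDAP result code and the AD problem id/name from an error string."""
    code = _CODE_RE.search(message)
    problem = _PROBLEM_RE.search(message)
    return {
        "ldap_code": int(code.group(1)) if code else None,
        "problem_id": int(problem.group(1)) if problem else None,
        "problem_name": problem.group(2) if problem else None,
    }


def is_permission_error(message):
    """True when the message carries LDAP result code 50 (insufficient access rights)."""
    return parse_ldap_error(message)["ldap_code"] == 50


# Sample taken from the exception shown above (shortened).
SAMPLE = (
    "Unable to create a new user: InternalError: Failed to create user account... "
    "[LDAP: error code 50 - 00000005: SecErr: DSID-031521D0, "
    "problem 4003 (INSUFF_ACCESS_RIGHTS), data 0 ]"
)
```

If `is_permission_error(SAMPLE)` is true, as it is here, the next stop is the delegation on the VCO service account and its group, not the certificates.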

Run your workflows after trying some of those and that should take care of you. One consideration I have to make clear is that the VCO accounts that access AD need to be set up with some sort of delegated permissions in their own group. Since you have to use two, it would make sense to give this group the delegated AD functions that the workflows need to perform. I do this today for specific use-case accounts in our environments. This is also known as Role Based Access Control (RBAC). I will cover that in another article, but I wanted to bring some attention to VCO and some of my challenges with it. I hope this is helpful to someone out there.

vCloud Director – 1.5 RHEL 5 Bug Hot CPU ADD – Quick Work Around

So I was doing some testing in vCloud Director 1.5 and noticed my RHEL Linux 5 vApp wasn’t able to enable Virtual CPU Hot add.

I went in and checked my vCenter settings to see what the deal was:

Changing the setting in my vCenter updated it in my vCloud Director.

The alternative to this workaround would be to change the template version within vCloud Director to RHEL version 6.

You will notice the Virtual CPU hot add becomes available to check. I used this method on existing templates and it did not seem to break the templates.
However, if you are trying to create new templates of RHEL 6 with a RHEL 5 OS you may want to make sure your SCSI controller is correct. Again, changing it on my vApps seemed to have no impact on the OS currently installed.

It’s apparently a bug in vCloud Director, and @Lamw was kind enough to help me out.

vSphere ESXi – Copying VMs – what you may not know

So this article is more of an FYI than anything. I wanted to bring some attention to this as some may be puzzled by why the hypervisor stinks at performing large copies, especially when working with VM disk files (@Lamw can verify this as well). I think it is important to highlight the distinction: the CP command is for files (and a VM is by definition a set of files, per VMware) but not for the virtual disk (VMDK) files. I am sure there could be plenty of conspiracy theories about why this is the case, but it has actually been this way for a while. If I were one of the age-old VMware guys out there this probably would not have caught me off guard, because it has been around, or published I should say, since VI3 (ESX 3). Since I haven’t finished my Back to the Future DeLorean ride yet, well, I just didn’t know.

Background Information

During a particular situation I was copying some data from one ESX host to another. This was basically a copy using the Datastore Browser in the vSphere client. I had staged some files from an NFS mount and wanted to copy them over to the SAN datastores. This NFS mount was read-only, so a storage migration would not work because it would require removing the VMDK files on the NFS mount after the copy. I could do some clones, but only so many at a time. What I decided was to pop open the Datastore Browser and do a copy and paste from the NFS mount to the SAN datastore. It’s also important to understand that the Datastore Browser uses HTTP GET and PUT, not CP. Keep in mind this is over 10Gb Ethernet (NFS), copying to the SAN over a 4Gb FC HBA. It took a while to do the copy but I didn’t really notice.

After staging all the data on the new SAN datastore I then had to turn it over to another ESX host that had yet another datastore, separate from the one hosting all the VMDK files. So there again, another copy… This time I noticed how slow it was really going, even from datastore to datastore. I knew that the copy process would more than likely run over the management interface, but even that was on a 10Gb Ethernet connection, so that should have been screaming as well. Not the case… So as a last test I decided to try a copy between two datastores mounted to the same host. I still averaged around 20-50kbs, which is pretty terrible. So no matter how I went about it, performance was terrible. I pretty much knew it had to do with the process at this point, although I wasn’t sure why. Across these scenarios I used different methods: SCP applications, the Datastore Browser, and CP in the ESXi shell.

Trying a Different Approach

So after talking with VMware support and confirming my suspicion that the issue was with the process (using CP), we went through the very same scenarios I noted above to rule out any other issues. We tested different-protocol datastores, non-shared datastore copies, shared datastore copies, and local datastore-to-datastore copies, all with the same effect, even when copying just a single disk. Of course at this point the support guy was a little stumped and had to get off the line to go talk to someone else. Usually that means they need someone with a fresh set of eyes or more experience to help out, and sure enough he came back with another suggestion: use cloning and storage migrations as a test. I of course didn’t think of this, but when he mentioned it I pretty much had a Homer Simpson “DOH!” moment. I guess by then my head was hurting from trying to figure this stuff out. When we did the storage migrations and clones it was actually MUCH faster. In fact, after the support call we did some testing: I could do 10 storage migrations in the time of 1 VM copy using the CP command. In some cases it was more than 10 to one. Granted, I now had the additional step of adding the VM guest to inventory, but that wasn’t as bad as taking 1 hour to copy 1 virtual machine. Note: the array was not VAAI capable.

What does this mean?

Yeah, so that is the million dollar question, isn’t it? Well, CP has pretty much been deprecated since VI3, but it’s better said as “not to be used for handling virtual disks”. To better understand, read for yourself: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1000936

In http://www.vmware.com/pdf/esx_3p_scvcons.pdf page 3

NOTE: notice the words “SIGNIFICANT PERFORMANCE IMPROVEMENTS”

So all this to tell you that CP is not a very good solution for doing mass copies or datastore copies. For me this presents a problem when using other tools like VEEAM SCP, PuTTY SCP, etc. So make sure you know what you want to accomplish beforehand, as you don’t want to end up with headaches like I did. I know some of you may think it was a waste of a VMware case, but any time I can find information like this and share it with others, it is invaluable to me. To add to my findings, I should also mention that VMKFSTOOLS also ensures the integrity of the disk and is more suited to these things by design. I think VMware intentionally focused on VMKFSTOOLS as the solution because I don’t think CP was ever intended to be used for this, given its lack of functionality. It may have something to do with licensing as well.
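For anyone who wants to script disk copies with VMKFSTOOLS instead of CP, the clone form is `vmkfstools -i <source.vmdk> <destination.vmdk> -d <format>`. Here is a small sketch in Python that only builds and prints that command line so it can be reviewed before being run in the ESXi shell. The helper names and the datastore paths in the usage note are hypothetical, not from my environment.

```python
import shlex

# Disk formats accepted by this sketch for the -d option.
ALLOWED_FORMATS = {"thin", "zeroedthick", "eagerzeroedthick"}


def vmkfstools_clone_cmd(src_vmdk, dst_vmdk, disk_format="thin"):
    """Build the argument list for cloning a virtual disk with vmkfstools -i."""
    if disk_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported disk format: {disk_format}")
    return ["vmkfstools", "-i", src_vmdk, dst_vmdk, "-d", disk_format]


def format_cmd(argv):
    """Render an argument list as a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

For example, `format_cmd(vmkfstools_clone_cmd("/vmfs/volumes/nfs_staging/vm01/vm01.vmdk", "/vmfs/volumes/san_ds01/vm01/vm01.vmdk"))` produces a single `vmkfstools -i ... -d thin` line to paste into the shell. Note that the source should be the descriptor `.vmdk`, not the `-flat.vmdk` file.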

One Last Thing:

This was a huge pain at the time of moving data between the NFS and SAN because I really didn’t have an automated solution for doing the copies. Many of you know that VEEAM FAST SCP before the new version did not have 64-bit support. I didn’t have any 32-bit machines and I didn’t want to waste time hacking away. However, I did want to mention that VEEAM released a new version of the product, known as VEEAM free backup; you can get that here. I also did some testing and was very impressed with the copying speeds compared to the CP command. Another nice thing is that even if you have no virtual machines registered in vCenter, it still picks them up in the copy process as VMs. Not to mention you can get statistics and automate and schedule copy jobs with the application. For me and what I do this is priceless. Simplicity, automation, and reporting, all free! I love it! Thanks to VEEAM for listening to all those out there wanting an improved solution. They did a good job. NOTE: Thanks again to @Lamw for pointing this out: the Datastore Browser uses HTTP GET/PUT, not CP. I will correct this in the post later.

vSphere – vCloud – The Query 1.5 API “feature” pageSize bug?

Alright, this is going to be difficult for me to really explain, so I will do my best to do it justice. First, I am not a coder and I do not know the ins and outs of the API and code. What I will attempt to explain is how you can reproduce this issue on your VCD instance. I also want to note this is vanilla VCD 1.5 with no updates yet. I currently have a case open with VMware and have yet to resolve it.

Let’s get to the nitty gritty.

First off, I want to say that I am not 100% sure whether other queries produce the same effect. This issue seems to happen only with the adminVM query.

First I would recommend reading about connecting to the REST API on Will’s blog over at VMware:

http://blogs.vmware.com/vsphere/2012/03/exploring-the-vcloud-rest-api-part-1.html

Now that you have read that and understand how to connect to the REST API I will show you an example of a basic adminVM query.
(Note: you need to have over 128 VCD vApps to reproduce this type of issue)

GET http://vcd.url.com/api/query?type=adminVM

This showed me that I had 333 results returned, however on the 1st page I only found 128. Now the way the script talked to the VCD API was rather plain: it was basically running this query and dumping the output to an XML file. The idea was that this was similar to the 1.0 API, where I could get all the data I wanted dumped into an XML file. This wasn’t the case. It seems I couldn’t get around this 128 limit. So I decided to try the next query:

GET http://vcd.instance.com/api/query?type=adminVM&pageSize=999

After running it I still got 333 results returned but only 128 on the single page, EVEN after specifying pageSize=999, so this isn’t the end of it… let’s dig deeper. After further research I actually found documented proof that this was a hard setting somewhere.

Page 212 of the VCD 1.5 API Guide taken from here: http://www.vmware.com/pdf/vcd_15_api_guide.pdf

So it became obvious to me at this point that no matter what your query is, it would always default to 128 objects per page. So, at someone’s recommendation, I also tried changing this hard setting in the global.properties file located in the following directory on the vCloud Director cells:

/opt/vmware/vcloud-director/etc

add/change the following: restapi.queryservice.maxPageSize=1024

I added this to the global.properties file and restarted the VCD cell services. Can you guess what happened? Nothing… this didn’t change anything at all. In fact, it still remained broken. Folks, this still wasn’t the worst part. Let’s cover the part that I believe is a true bug in the API; someone on Twitter also commented that there is a possible bug in the adminVM query.

Let’s say I do a query with pageSize=135 and my query returns 153 results. We get the usual 128 results per page. Here is an example of the commands I used:

GET https://vcd.url.com/api/query?type=adminVM&pageSize=135&page=1&fields=name&sortAsc=name

Sorting ascending gives me an alphabetical listing of all my vApp names so I can find a breaking point for my virtual machines (I know my ABC’s and what should be next, so to speak). So I copy and paste the results into Notepad++ and it shows me 128 entries for the page size of 135 (give or take a few for other lines returned that are not relevant to the query). The bug as discussed is evident: it doesn’t show the other 7 entries it should be showing. Remember, we set the page size to 135. So now let’s take a peek at page 2.

GET https://vcd.url.com/api/query?type=adminVM&pageSize=135&page=2&fields=name&sortAsc=name

So after you run this query you will see the rest of the 153 results. However, if you take notes you will notice that it is in fact completely missing the 7 other entries. So basically your query takes the 7 it could NOT list and dumps them out somewhere in the Cloud… So what does this mean aside from the fact that there is a bug?

You will need to use a looping construct and not specify a page size greater than 128. (see Will’s comments below)
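Since I am not a coder, take the following as a rough sketch of what such a looping construct could look like in Python rather than a tested client. The `fetch_page` argument is a hypothetical callable that you would write yourself: it should perform the authenticated GET for one page and return that page's records as a list (an empty list once you are past the end). The helper names are my own; only the query parameters `type`, `page`, `pageSize`, `fields`, and `sortAsc` come from the examples in this post.

```python
from urllib.parse import urlencode

MAX_PAGE_SIZE = 128  # the per-page cap observed in this post


def build_query_url(base_url, query_type, page, page_size=MAX_PAGE_SIZE, **extra):
    """Build a query URL for one page, refusing page sizes above the observed cap."""
    if page_size > MAX_PAGE_SIZE:
        raise ValueError(f"pageSize must not exceed {MAX_PAGE_SIZE}")
    params = {"type": query_type, "page": page, "pageSize": page_size}
    params.update(extra)  # e.g. fields="name", sortAsc="name"
    return f"{base_url}/api/query?{urlencode(params)}"


def iter_all_records(fetch_page, page_size=MAX_PAGE_SIZE):
    """Yield every record by walking pages 1, 2, 3... until a short or empty page.

    fetch_page(page, page_size) is a caller-supplied (hypothetical) function
    that returns the list of records for the requested page.
    """
    page = 1
    while True:
        records = fetch_page(page, page_size)
        if not records:
            break
        yield from records
        if len(records) < page_size:
            break  # a short page means this was the last one
        page += 1
```

A real client could also stop based on the total reported in the response (the 333 in my case) instead of waiting for a short page, which avoids one extra request when the total is an exact multiple of the page size.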

This is a bug and I don’t think I could make it any clearer. I wish I could’ve provided some screenshots, but I think anyone who does their due diligence will see what I am talking about. If you have 2000 VCD vApps and you use a page size of 500 you would lose 372 results on each page. No matter how you specify the page size or modify global.properties, it’s just broken, plain and simple. If someone would like to provide some screenshots I would be happy to put them up here to show some better detail.
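To put numbers on it, here is a small Python sketch that models the behavior as I observed it: each page starts where the requested page size says it should, but only the first 128 records of that page come back. The function name and the model itself are my own assumptions based on the two examples in this post, not anything from VMware documentation.

```python
import math


def records_skipped(total, requested_page_size, cap=128):
    """Count records that never appear when pages are truncated to `cap` entries."""
    if requested_page_size <= cap:
        return 0
    pages = math.ceil(total / requested_page_size)
    skipped = 0
    for page in range(1, pages + 1):
        first = (page - 1) * requested_page_size            # offset where this page starts
        on_page = min(requested_page_size, total - first)   # records that belong on this page
        skipped += max(0, on_page - cap)                    # anything past the cap is lost
    return skipped
```

With my numbers, `records_skipped(153, 135)` gives the 7 missing entries, and `records_skipped(2000, 500)` gives 372 lost on each of the 4 pages, 1488 in total.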

If you want to discuss in further detail feel free to comment and I will follow up.

UPDATE: After reviewing some things with VMware I found out this is actually a true bug in the vCloud 1.5 API. The good news is that there is a fix slated to be published in August; perhaps they will allow a private fix if you really need it. Stay tuned. If anyone has information aside from this, please provide it and I will link it! Thanks again. Also, this is not related to any particular query parameter; it has more to do with how the Query service works.

vSphere 5 – Storage pt.3 LUN Sizing – Why it matters..

Well, I guess I am on a roll this week. I feel like a lot of my themes have been around storage and VMware this week. I don’t think that is a bad thing, but I am seeing some gaps out there as far as considerations and recommendations go. My only point in this post is to share my thoughts and what you should consider when facing this after your vSphere 5 upgrade or install. I have to wonder just how many enterprises out there have seriously pushed the envelope of LUN sizing in VMware. One has to think: “If you are carving up large LUNs, does that mean you’re scaling up?” There are so many implications to consider when designing your storage. Some of the more critical pieces are IOps, the cluster size, and what your target workload is. With bigger LUNs this is something you have to consider, and I do think it is common knowledge for the most part.

There are so many things to consider when deciding on a LUN size for vSphere 5. I sincerely believe VMware is sometimes putting us all in a situation of scaling up. The limitations of SDRS and Fast Provisioning have really got my mind thinking. It’s going to be hard to justify a design around what used to be a 16 node cluster when you are trying to make a call on whether you really want to use some of these other features. You have heard me say this before but I will say it again: it seems more and more that VMware is targeting this at small to medium sized businesses while offering features that mean larger companies (with much bigger clusters) now have to invest even more time in reviewing their current designs and standards. Hey, that could be a good thing :) . Standards to me are a huge factor for any organization. That part seems to take the longest to define and in some cases even longer to get other teams to agree to. I don’t think VMware thought about some of those implications, but I am sure they did their homework and knew just where a lot of this was going to land…

With that being said I will stop my rambling on about these things and get to the heart of the matter or better yet heart of the storage.

So, after performing an upgrade I have been wondering what LUN size would work best. I believe I have some pretty tough storage and a solid platform (Cisco UCS) so we can handle some IOps. I wanted to share some numbers with you that I found very, VERY interesting. I have begun to entertain the notion of utilizing Thin Provisioning even further. However, we are all aware that VMware still has an issue with the UNMAP command, which I have pointed out in previous blogs (here). Being put between a rock and a hard place, I believe update 1 to vSphere 5 at least addressed half of my concern. The other half it didn’t address is the fact that I now have to defer to a manual process that involves an outage to reclaim that Thin Provisioned space… I guess that is a problem I can live with given the way we use our storage today. It doesn’t cause us too much pain, but it is a pain nonetheless.

Anyways, so here is my homework on LUN sizing and how to get your numbers (Estimates):
(Note: This is completely hypothetical and not related to any specific company or customer; this will also include Thin Provisioning and Thick)

  • Factor an Average IOps per LUN (if you can from your storage vendor or from vCenter or an ESXi host)

    Take the IOps per all production LUNS and divide it by the number of datastores

    Total # IOps / # of Datastores

  • Gather the average numbers of virtual machines per datastore

    Total # VM’s / # of Datastores

    Try to use Real World production virtual machines

  • Decide on the LUN size and use your current baseline as a multiplication factor.

    So if you want to use 10TB datastores and you are currently using 2TB datastores, divide the new size by the current one:

    10TB / 2TB = 5 (this is your multiplication factor for IOps and VM:Datastore ratio)

So now let’s use an example to put this to practical use… and remember to factor in free space for maintenance; I always keep 10% free.

Let’s say we have a customer with the following numbers before:

16 VM’s per Datastore

1200 I/Ops Average per Datastore (we will have to account for peak too)

2TB Datastore LUNS

Now for the math (let’s say the customer is moving to 10TB LUNs, so this would be a factor of 5):

16 x 5 = 80 VM’s per Datastore (Thick Provisioned)

1200 x 5 = 6,000 IOps per Datastore…

Not bad at all, but now let’s seriously take a look at thin provisioning, where the numbers are QUITE different. Let’s say we check our storage software and it tells us that on average a 2TB LUN only really uses 500 GB of space for the 16 VMs per Datastore. Let’s go ahead and factor some room in here (10% for alerting and maintenance purposes this time around). You can also download RVTools to get a glimpse of actual VM usage versus provisioned for some thin numbers.

First off:

16 VMs per 500GB, times 4 for the 2TB LUN, makes 64 thin VMs per 2TB Datastore.

Multiply that by the new LUN size factor: 9TB / 2TB = 4.5 (the 10TB LUN minus 10% reserved for alerting and maintenance; this could also be considered conservative)

64 x 4.5 = 288 Average VM Per 10TB Datastore (and that 1 TB reserved too!)

We aren’t done yet; here come the IOps, and let’s use 1500 IOps. Since we multiplied the VMs by a factor of 4, we want to do the same for the average IOps:

1500 x 4 = 6000 per 2TB LUN; Using thin provisioning on VMs

6000 x 4.5 = 27000 IOps per LUN.

So this leaves us with the following numbers for thick and thin:

VM to 10TB Datastore ratios:

80 Thick

288 Thin

IOps to 10TB Datastore ratios:

6000/IOps Thick Provisioning

27000/IOps Thin Provisioning
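If you want to plug in your own baseline, the arithmetic above fits in a few lines of Python. This is just a sketch of the estimate as I worked it out by hand; the function and parameter names are my own, 2TB is treated as 2000GB to match the "times 4" step, and the inputs are the hypothetical numbers from this post (1200 IOps for thick, 1500 IOps for thin).

```python
def scale_factor(new_lun_tb, old_lun_tb, reserve_pct=0):
    """Ratio of usable new LUN size to the old LUN size, after reserving reserve_pct percent."""
    usable_tb = new_lun_tb * (100 - reserve_pct) / 100
    return usable_tb / old_lun_tb


def thick_estimate(vms_per_ds, iops_per_ds, old_lun_tb, new_lun_tb):
    """Scale VM count and IOps linearly with LUN size (thick provisioning)."""
    factor = scale_factor(new_lun_tb, old_lun_tb)
    return vms_per_ds * factor, iops_per_ds * factor


def thin_estimate(vms_per_ds, iops_per_ds, provisioned_gb, used_gb,
                  old_lun_tb, new_lun_tb, reserve_pct=10):
    """Scale by thin density (provisioned / actually used) and by usable LUN size."""
    density = provisioned_gb / used_gb  # e.g. 2000 / 500 = 4
    factor = scale_factor(new_lun_tb, old_lun_tb, reserve_pct)
    return vms_per_ds * density * factor, iops_per_ds * density * factor


# Hypothetical numbers from this post: 2TB LUNs today, 10TB LUNs planned.
thick_vms, thick_iops = thick_estimate(16, 1200, old_lun_tb=2, new_lun_tb=10)
thin_vms, thin_iops = thin_estimate(16, 1500, provisioned_gb=2000, used_gb=500,
                                    old_lun_tb=2, new_lun_tb=10)
```

These reproduce the 80 and 288 VMs and the 6000 and 27000 IOps listed above. Keep in mind this is a straight linear scaling; it says nothing about whether the array, queues, or hosts can actually sustain that load.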

So, I hope this brings to light some things you will have to think about when choosing a LUN size. Also note that this is probably more of a service provider type of scenario; as we all know, most may use a single 64TB LUN, though I am not sure I would recommend that. It all comes down to use-case and how it can be applied. This also raises the question of what the point of some of those other features is if you leverage Thin Provisioning. Here are some closing thoughts and things I would recommend:

  • Consider Peak loads for your design; the maximum IOps you may be looking for in some cases
  • Get an average/max per VM datastore ratio (locate your biggest Thin VM)
  • Consider tiered storage and how it could be better utilized
  • Administration and management overhead; essentially the larger the LUN, the less overall provisioning time and so on.
  • VAAI capable array for those Thin benefits (running that reclaim UNMAP script..)
  • Benchmark, Test using some other tools on that bigger LUN to ensure stability at higher IOps
  • Lastly the storage array benchmarks and overall design/implementation
  • The more VMs you can scale on a LUN, the more it can affect your cluster design; you may not want to enable your customers to scale that much
  • Alerting considerations and how you will manage it efficiently to not be counterproductive.
  • Consider other things like SDRS (fast provisioning gets ridiculous with Thin Provisioning)
  • Storage latency and things like Queues can be a pain point.

I hope this helps some of those out there who have been wondering about some of this stuff. The LUN size for me dramatically affects my cluster design and what I am looking to achieve. You also want to load test your array or at least get some proven specs on it. I currently work with HDS VSP arrays and these things can handle anything you can throw at them. You can add whatever additional resources you need, whether it be capacity, IOps, processing or whatnot, and easily scale out or up. Please share your thoughts on this as well. Here are some great references:

http://www.yellow-bricks.com/2011/07/29/vmfs-5-lun-sizing/
http://serverfault.com/questions/346436/vmware-vmfs5-and-lun-sizing-multiple-smaller-datastores-or-1-big-datastore
http://communities.vmware.com/thread/334553
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2014849 

Note: these numbers are hypothetical but it’s all in the numbers.

vSphere – vCloud – Fast Provisioning – My Thoughts…

Yeah, some would say this post is probably overdue, but lately I have sincerely been thinking: have we been drinking some Kool-Aid around this feature? I couldn’t help but have some concerns around possible implementation of this feature in VCD installations. I, in particular, am not completely sold on it. Here are just some quick reasons why.

  1. It’s a very “new” feature in regards to VCD which is still early in its years as a cloud platform.
  2. No way of currently updating those linked clones, unlike VMware View. (some admin overhead as well as using local and shared catalogs)
  3. Added complexity (with linked images, snapshot chains, and how you have to handle storage motion)
  4. By default ALL linked clone images are misaligned. (VMware has yet to address this problem) In some cases this could be a compounding factor causing some additional I/O overhead.
  5. Design has to be carefully considered and evaluated with a max of 8 node clusters (This will affect current installations as well)

So yeah, I know I look like the bad guy but I seriously think this release was targeted more at SMB than anything. IMO, this is more of a feature for smaller businesses because now they don’t have to go out and spend all that crazy dough on a VAAI capable array (Hooray for them :)), which raises the question…

Why do you need to enable this feature if you already leverage VAAI capable arrays?

It just seems to me that Fast Provisioning is a little pre-mature in its release. Although VCD continues to improve I think this features needs some serious improving before some bigger shops may decide to utilize it. The other down is that we have yet to see any real progress on the UNMAP problem and it’s now treated as a manual task we should run during certain times… or outages I should say. That really blows because we all know what kinds of benefits and problems thin provisioning on some array can cause. For the most part, it’s just really bad reporting… lol.

Here are some other sources I would seriously recommend reading so you can learn for yourself if it’s really worth it. Also, do your homework and be careful not to put the cart before the ox. Some people drink the Kool-Aid and don’t think to question or ask “What’s really under the hood?”. Fast Provisioning should never be compared to VMware View. It’s similar but not identical. I would definitely recommend reading Nick’s blog, which opened my eyes to what he calls the “Fallacies”, and of course Chris has a good read.

http://datacenterdude.com/vmware/vcd-fast-provisioning-vaai-netapp/
http://www.chriscolotti.us/vmware/info-vcloud-director-fast-provisioned-catalog-virtual-machines/
http://www.kendrickcoleman.com/index.php?/Tech-Blog/vcloud-director-15-features-that-effect-limitation-and-design.html

vSphere – Networking – ESXi Single NIC VDS Management Migration

Well, I wasn’t sure how to name this post, as VMware continues to use all kinds of different lingo for all of their bells and whistles. I had the unique opportunity to begin migrating management interfaces, also known as vmkernel interfaces, from a VSS to a VDS. This can present a lot of struggles, but it seems to me that VMware has really improved this functionality in the later versions of vSphere. I recall running into many kinds of issues when doing this on 4.0. So far, testing with a vCenter 5 server and a mix of 4.1 and 5.0 hosts has proved to be seamless and non-disruptive. However, I would still highly recommend considering all your options and testing this method THOROUGHLY before ever touching production environments.

I was able to migrate a single physical NIC running ESXi management from a VSS to a VDS. This video covers how I did that. The reason for the video is that I got all kinds of senseless Google links when trying to search for something documented. So, I did myself a favor and published one.

Remember, this is a test and it is only applicable for me to use in a few environments. In most cases I use redundant NICs. Now the real kicker is that migrating back from a VDS to a VSS requires a bit more thinking and planning, especially if you only have access to a single pNIC. Maybe I will cover that some other time… for now try to use two. Also, this may be a solution for environments running a single 10Gb NIC that need to use PVLANs or centralize management.

Up and Coming – CommVault Round 3…?

No way…

Yes, you read it right. Expect more on this over the next couple of weeks as I learn more. I will get to play with it in some capacity, so maybe I can share some good stuff. I am going to be covering some considerations and designs, hopefully.

We all know that this was a hot topic a while back, but now that the dust has settled maybe I can make some progress. CommVault has made some pretty strong improvements over the past couple of months. They have covered some of the pain points I had with it in a pretty short amount of time. I don’t think this would’ve happened had some folks not made a point to show some of these things openly. My number one gripe back in the day was that neither CommVault NOR VEEAM had “TRUE” vCloud Director compatibility, IN MY OPINION (all caps shows my emphasis). I am sure that by now this and maybe a few other things have changed.

I am not sure I will be doing any comparisons per se, but it will be good to see how the latest and greatest stacks up. I think by now I can easily say I know what industry standards look like for VMware backups and what the expectations and performance should be. One thing you can take to the bank is that I plan to do my homework just like last time. If there is something I don’t like or think needs improving, I will most certainly write about it.

All the folks out there that read this blog need to understand there isn’t a lot of information out there around some of these topics I cover. I would encourage anyone reviewing a CommVault solution to do their homework. There are a lot of things to consider when going with a backup product. Let’s hope this time around I don’t have to pull any punches…

Update vSphere 5 – My two cents err problems

What’s the deal man?

Well, to be honest, I have run into two very specific issues, and what I want to reiterate is how crucial it is to review updates before just deploying a normal vSphere 5 implementation. First off, I want to say that Update 1 was released in the middle of my upgrade to vSphere 5. With that comes the dilemma. Coordinating the update process and procedure should always be critical. You should also do your due diligence and review the updates along with their bugs. I honestly have to give credit to the VMware Community, which has definitely helped me identify problems beforehand and figure out how to avoid and work around them. Now on to my issues.

Issue Number 1: Broken sVmotion (Storage Migration)

Well, this one was obvious, but being the optimist I am, I didn’t think I would run into this little issue. However, it appears to be an ESXi special feature for vSphere 5! I would highly recommend reviewing the following KB if you are having problems performing storage vMotions on vSphere 5/vCloud 1.5. I believe it is actually an issue with the ESXi hypervisor, because prior to Update 1 there was a patch you could install on your ESXi box. Please see the following reference for resolution:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2012122

FIX? > Install UPDATE 1 or ESXi Patch 2

Issue Number 2: vCenter Network Alarm Feature!

So, the key words to stress in this issue are probably ones that make many CRINGE: Test and Prod should always be the SAME. We all know how important that is, but SERIOUSLY, how many of us actually MIRROR everything, even the alarms? This is more of an issue with standards and procedures than anything… again I am reminded of the “9 parts planning and 1 part implementing” rule, or the “Your poor planning doesn’t constitute an emergency on my part”.

If the statement above doesn’t tell you what happened, then this KB most certainly can. :)
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007231

FIX? > Yeah, just read the KB, it’s quite ridiculous… oh wait, just install Update 1?

So, I am writing this to tell you that I would recommend applying Update 1, or using the in-place upgrade to vSphere 5 Update 1. Oh, and just so you know I warned you, it still doesn’t support the following build number:

NOTE: I would highly recommend updating from vanilla 4.1 to avoid the special VMware feature of PSOD.

Last but not least, I thought it would be equally important to highlight a video that we can all share and relate to when facing unexpected results… It’s not exactly the same, but I can definitely relate to the frustrations.

Enjoy!

vSphere 5 – Storage pt.2 vCloud and vSphere Migrations

The point..

So in my last post I covered some things to think about when looking at the new VMFS-5 partitions. Obviously the point of moving to the new VMFS is to gain all the benefits explained in that previous post. What you will see in this post are just the types of migrations. I also shared some resources at the bottom for those of you who want to dig deeper. There isn’t a ton of documentation out there highlighting this, nor the special *features* of vSphere 5 (sVmotion issues??) that you may run into. So let’s hope I do this justice. On to the blog!

Adding VMFS-5 to the vCloud

  1. Log in to vSphere and ensure you have a new LUN provisioned (covered above in how to:)
  2. Log in to the vCloud Director web interface as an administrator.
  3. Click the “System” tab and click on Provider VDCs. Right-click a PVDC and select “Open”.
  4. After opening the PVDC, select the Datastores tab and then click the +/- button to add/remove datastores.
  5. Browse through the datastores by clicking the > button or by searching in the top right. When you have located your datastore, highlight it, then click the button and then click “OK”. Disregard the warning.


(Note: the yellow highlights are ways you can search and browse through datastores. This is very handy when there are many to look through)


(Note: Highlight in yellow shows the datastore added successfully. This is a 20TB Datastore)

You will now see the datastore in the datastore summary tab for that PVDC

Migrating Virtual Machines for vCloud Director to the “new” VMFS-5 LUN.

  1. Make sure the vApp is NOT a linked clone. If it is a linked clone, refer to the references below.
  2. Ensure the datastore you want to Storage vMotion the virtual machine to is also provisioned to the Org VDC. Do this by opening the Org VDC and selecting the “Datastores” tab.

    Note: you can see both datastores are attached to this VDC with the organization known as App1

  3. You can then log in to the vSphere Client against the noted vCenter and perform a Storage vMotion. Another way of doing a Storage vMotion is to use the script William Lam wrote (see references below).
  4. If you need to perform the sVmotion, follow the method below.

NOTE: I would highly recommend that you roll out Update 1 to all vCloud components. It includes a few major fixes that allow operations to run more smoothly. More importantly, without it the only way to sVmotion vCloud VMs is to turn them off. This is a pretty common issue with vanilla vSphere 5/vCloud 1.5 rollouts, and I experienced it myself. For more information please see the references at the bottom.
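The checks above (linked clone, datastore provisioned to the Org VDC, Update 1 level) are easy to forget when you are moving a batch of VMs, so here is a minimal sketch of them as a pre-flight check. Everything in it is an assumption for illustration: `VmInfo`, `svmotion_blockers`, and the datastore names are hypothetical and do not come from any VMware API. You would fill the values in by hand or from your own inventory export.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class VmInfo:
    """Hypothetical hand-filled summary of a vCloud-managed VM."""
    name: str
    is_linked_clone: bool   # fast provisioned VM?
    powered_on: bool
    at_update_1: bool       # vCloud 1.5 / vSphere 5 components at Update 1?


def svmotion_blockers(vm: VmInfo, target_datastore: str,
                      org_vdc_datastores: Set[str]) -> List[str]:
    """Return the reasons a plain vSphere Storage vMotion should not be run yet."""
    blockers = []
    if vm.is_linked_clone:
        # Linked clones have to be moved through the API (see references).
        blockers.append("linked clone: use the API method from the references")
    if target_datastore not in org_vdc_datastores:
        blockers.append("target datastore is not provisioned to the Org VDC")
    if vm.powered_on and not vm.at_update_1:
        # Before Update 1 the VM may have to be powered off first.
        blockers.append("not at Update 1: power the VM off first")
    return blockers


# Example with made-up names: a powered-on VM on a vanilla 1.5 install.
vm = VmInfo(name="app1-web01", is_linked_clone=False,
            powered_on=True, at_update_1=False)
for reason in svmotion_blockers(vm, "vmfs5-lun01", {"vmfs3-lun01"}):
    print(f"{vm.name}: {reason}")
```

An empty list means none of the caveats from this post apply. It does not guarantee the migration itself will succeed.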

Migrate a Virtual Machine with Storage VMotion in vSphere

Use migration with Storage VMotion to relocate a virtual machine’s configuration file and virtual disks while the virtual machine is powered on. You cannot change the virtual machine’s execution host during a migration with Storage VMotion. (Note: if the VM is managed by vCloud and is not at 1.5 Update 1, you may need to power off the virtual machine to perform the sVmotion. If the virtual machine is a fast provisioned VM (linked clone), you will need to perform the sVmotion through an API.)

Procedure

  • Ensure you are not moving a vCloud vApp. If you are, please follow the above process first.
  • Display the virtual machine you want to migrate in the inventory.
  • Right-click on the virtual machine, and select Migrate from the pop-up menu.
  • Select Change datastore and click Next.
  • Select a resource pool (the same) and click Next.
  • Select the destination datastore:
    To move the virtual machine configuration files and virtual disks to a single destination, select the datastore and click Next.
    To select individual destinations for the configuration file and each virtual disk, click Advanced. In the Datastore column, select a destination for the configuration file and each virtual disk, and click Next.
  • Select a disk format and click Next:

    Option: Same as Source
    Description: Use the format of the original virtual disk. If you select this option for an RDM disk in either physical or virtual compatibility mode, only the mapping file is migrated.

    Option: Thin provisioned
    Description: Use the thin format to save storage space. The thin virtual disk uses just as much storage space as it needs for its initial operations. When the virtual disk requires more space, it can grow in size up to its maximum allocated capacity. This option is not available for RDMs in physical compatibility mode. If you select this option for a virtual compatibility mode RDM, the RDM is converted to a virtual disk. RDMs converted to virtual disks cannot be converted back to RDMs.

    Option: Thick
    Description: Allocate a fixed amount of hard disk space to the virtual disk. The virtual disk in the thick format does not change its size and from the beginning occupies the entire datastore space provisioned to it. This option is not available for RDMs in physical compatibility mode. If you select this option for a virtual compatibility mode RDM, the RDM is converted to a virtual disk. RDMs converted to virtual disks cannot be converted back to RDMs.

    NOTE: Disks are converted from thin to thick format or thick to thin format only when they are copied from one datastore to another. If you choose to leave a disk in its original location, the disk format is not converted, regardless of the selection made here.

  • Review the page and click Finish.
  • A task is created that begins the virtual machine migration process.
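The disk format rules in the procedure above trip people up, so here is a small sketch that encodes them as I read them. It is only an illustration of the rules quoted above, not VMware’s logic. The function name `resulting_disk_format` and the string values ("same", "thin", "thick", "rdm", "physical", "virtual") are made up for this example.

```python
from typing import Optional


def resulting_disk_format(selection: str, source_format: str, disk_moves: bool,
                          rdm_mode: Optional[str] = None) -> str:
    """Return the format a disk ends up in after the migration wizard.

    selection:     "same", "thin" or "thick" (the wizard option)
    source_format: current format, e.g. "thin", "thick" or "rdm"
    disk_moves:    True if the disk is copied to another datastore
    rdm_mode:      None for a normal disk, else "physical" or "virtual"
    """
    if selection not in ("same", "thin", "thick"):
        raise ValueError(f"unknown selection: {selection}")
    if rdm_mode == "physical" and selection != "same":
        # Thin and Thick are not available for physical compatibility RDMs.
        raise ValueError("thin/thick is not available for physical mode RDMs")
    if not disk_moves:
        # A disk left in its original location is never converted.
        return source_format
    if selection == "same":
        # For an RDM only the mapping file is migrated, so it stays an RDM.
        return source_format
    # Normal disks are converted. A virtual mode RDM becomes a virtual disk
    # of the chosen format, and that cannot be reversed.
    return selection


print(resulting_disk_format("thin", "thick", disk_moves=True))
print(resulting_disk_format("thin", "thick", disk_moves=False))
```

The first call shows a thick disk converted on the way to a new datastore, and the second shows the same selection being ignored because the disk stays where it is.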

References:

Linked Clones:
http://www.virtuallyghetto.com/2012/04/scripts-to-extract-vcloud-director.html
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1014249

Storage Motion Issue:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2012122

How To’s sVmotion CLI/VCO style:
http://www.virtuallyghetto.com/2012/02/performing-storage-vmotion-in-vcloud.html
http://www.virtuallyghetto.com/2012/02/performing-storage-vmotion-in-vcloud_19.html
http://geekafterfive.com/2012/03/06/vcloud-powercli-svmotion/
http://geekafterfive.com/tag/vcloud/
http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-501-virtual-machine-admin-guide.pdf

Storage Considerations for vCloud:
http://www.vmware.com/files/pdf/techpaper/VMW_10Q3_WP_vCloud_Director_Storage.pdf
