I’ve been working with a lot of clients over the past couple years helping them adopt TFS Lab Management. One discussion that always comes up is how to architect the infrastructure required to run TFS Lab. I’m going to try and put down in writing the advice I usually give so I have somewhere to point people to in the future.

There are 3 main components in TFS Lab:

Hyper-V Host(s) – A server to host the running Virtual Machines (and yes, it must be Hyper-V)

Library Server(s) – A place to store the VM Templates and stored VMs. This is essentially just a network file share.

SCVMM Server – The centralized server that manages all this infrastructure.

The Hyper-V host must be a physical server (no, you can't create a VMware VM and run Hyper-V inside of that – well, a co-worker of mine actually had a client that got that to work, but the performance was horrendous, so don't do that). This means that setting up TFS Lab for the first time will require you to purchase/acquire at least one physical server.

If your datacenter runs on VMware, like so many of my clients' do, it's usually not a big deal to purchase a server specifically for TFS Lab that sits off in a corner running Hyper-V. In fact, even if you run your datacenter on Hyper-V already, I'd still recommend isolating your TFS Lab from your main virtualization infrastructure (i.e. set up a new SCVMM and hosts; don't try to reuse your existing ones).

Single-Server Deployment

Most of my clients start by adopting TFS Lab for one specific project, with the intention that if they like what they see they will scale up its use across the rest of their projects in the future. What I usually recommend to get started is to purchase a single beefy server (more below on typical server specs and price) and run all components off that one server.

For a single team project this works great. One nice thing is that because everything is on a single server, your network infrastructure won't be a factor in performance.

Note: SCVMM requires a SQL Server to store its configuration data. This is not pictured here, but I will almost always use the same SQL Server that TFS uses for its configuration/collection databases (i.e. not any of the servers pictured here).

Notice that I host the SCVMM instance inside of a Virtual Machine (so SCVMM manages the host that it is actually running on – sounds kind of wacky, but it works fine). This is contrary to the Microsoft guidance. My co-workers and I have set up TFS Lab for many clients, typically putting SCVMM inside a VM, and have had no issues. In fact, there are some important benefits you gain by doing this. Most importantly, it becomes easier to move the SCVMM server off to a different physical host down the road (as we will do in the examples below). If you install the SCVMM server on the physical machine (like so many people tend to do), when it comes time to scale out your Lab infrastructure it is much harder to relocate that SCVMM server elsewhere.

Physical Hardware Advice

For teams starting with TFS Lab, the typical server hardware I recommend is around a $15,000 server. Nowadays, that should translate to a server with around 16 physical cores (32 logical with HyperThreading), 128 GB of RAM, and 6x 2TB 7.2k RPM SATA drives (2 drives in a 2TB RAID 1 array for the native OS + Library, 4 drives in a 4TB RAID 10 array for the Host). Obviously, this depends on the size of the team and the complexity/size of the environment required for your application, but probably 90% of the teams I help set up Lab for the first time end up going with around a $15k server to start. Note: there is Microsoft guidance somewhere that recommends not to use Hosts with more than 48 GB of RAM; that guidance is outdated and misleading IMO, and I suggest you disregard it.

I usually create Lab VMs with 1 CPU + 4 GB RAM, and typically budget about 100 GB per VM for the VHD + snapshots. Leaving some resources for the host OS, the above machine specs would allow you to create ~30 VMs.
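To sanity-check that ~30 figure, here's the back-of-the-envelope math. The 8 GB host-OS reservation is my assumption (the post only says "some resources"); adjust to taste:

```python
def max_vms(total_ram_gb=128, host_os_ram_gb=8, ram_per_vm_gb=4,
            vm_storage_tb=4, storage_per_vm_gb=100):
    """How many VMs fit, limited by whichever resource runs out first."""
    by_ram = (total_ram_gb - host_os_ram_gb) // ram_per_vm_gb
    by_disk = (vm_storage_tb * 1000) // storage_per_vm_gb
    return min(by_ram, by_disk)

print(max_vms())  # 30 -- RAM is the bottleneck: (128 - 8) / 4
```

With this spec RAM runs out before disk (the 4 TB array would hold ~40 VMs), which is why bumping RAM is usually the first upgrade I look at.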

Note: TFS Lab performance is very dependent on the disk subsystem. You want to maximize the number of spindles to increase parallelism, so I usually advise going with many cheaper 7.2k RPM drives to maximize spindles and data density. For ultimate performance SSD is an obvious choice, but TFS Lab still requires a significantly large amount of storage, making SSD too expensive for most teams (hopefully that will change in the next couple of years).

There are two scenarios where performance can be an issue: the large Library->Host transfers that occur when you deploy a new Environment, and the operation of an existing environment. I tend to focus on the latter, which means paying close attention to the disks used by the Host(s). I spent a bunch of time recently working with a client to diagnose performance issues; to benchmark disk performance the SQLIO tool and this article are priceless.

Also be careful when it comes to SAN. SAN storage tends to be much more expensive than locally attached disk (especially in the capacities we're usually dealing with for TFS Lab – watching the face of a SAN engineer when requesting 10 TB of storage is a fun exercise regardless =)), and there are many more moving parts in a SAN, which means more potential bottlenecks.
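To put rough numbers on the "maximize spindles" argument, here's a quick sketch using commonly cited ballpark random-IOPS figures. These per-drive numbers are rules of thumb, not vendor specs, and the "roughly similar spend" comparison is my assumption:

```python
# Rule-of-thumb random IOPS per spindle (ballpark figures, not vendor specs)
IOPS_7200_SATA = 75   # cheap high-capacity 7.2k RPM SATA drive
IOPS_15K_SAS = 175    # faster, lower-capacity 15k RPM SAS drive

six_cheap = 6 * IOPS_7200_SATA  # six cheap spindles
two_fast = 2 * IOPS_15K_SAS     # two fast spindles (roughly similar spend -- my assumption)

print(six_cheap, two_fast)  # 450 350 -- more spindles wins on aggregate random I/O
```

The cheap drives win on aggregate random I/O and give you several times the raw capacity on top, which is exactly what the snapshot-heavy TFS Lab workload wants.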

Scaling To a 2nd Team

Let's imagine that we've got a single-server TFS Lab set up and running, the team using it loves it, and a 2nd team (separate TFS Team Project) wants to start using it.

Sure, you could share the same single-server setup across multiple team projects. But unless money/hardware is extremely tight I wouldn't recommend doing that. The problem is that both teams will be sharing the same finite set of hardware resources (CPU, memory, disk), and there usually isn't much visibility across teams. What can happen is Team A spins up a bunch of Lab Environments in the morning, then when Team B tries to spin some up in the afternoon they get "No Suitable Host Available" errors, because Team A has already consumed the host's available resources. You can combat this either by over-provisioning the hardware so that it's unlikely to get maxed out, or by ensuring that the teams sharing the same Host(s) communicate well to avoid stepping on each other's toes.
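A toy sketch of that failure mode. The team sizes and the 120 GB free figure are hypothetical, and this only mimics the shape of the placement check, not SCVMM's actual logic:

```python
def place_environment(host_free_ram_gb, vms, ram_per_vm_gb=4):
    """Deploy an environment of `vms` VMs, or fail when the host is out of RAM."""
    needed = vms * ram_per_vm_gb
    if needed > host_free_ram_gb:
        raise RuntimeError("No Suitable Host Available")
    return host_free_ram_gb - needed

free = 120                              # GB of RAM available for VMs on the shared host
free = place_environment(free, vms=20)  # Team A, morning: uses 80 GB, 40 GB left
try:
    place_environment(free, vms=15)     # Team B, afternoon: needs 60 GB -> fails
except RuntimeError as e:
    print(e)                            # No Suitable Host Available
```

Team B's deployment fails even though the host looked generously sized on paper, which is the whole argument for dedicated hardware per team.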

Instead, what I prefer to do is have dedicated hardware for each Team Project: specifically, dedicated Host(s) and Library for each Team Project. The SCVMM instance will still be shared between all Team Projects. If we take the above Single-Server/Single-Team-Project architecture and scale it out for a 2nd Team Project, it might look like below:

In this scenario, what I’ve done is dedicate the original server (#1) to Team Project A, and bought 2 new servers: another $15k server for Team Project B, and a smaller/cheaper server (~$2500) dedicated to run SCVMM.

I moved the SCVMM Virtual Machine off of Server #1 onto the new Server #3. Because SCVMM was in a VM, this migration is extremely simple. SCVMM is a shared resource across all Team Projects, so I don't want it to reside on any hardware dedicated to a specific Team Project. In this scenario the server that hosts the SCVMM Virtual Machine doesn't even need to run Hyper-V; this is the one case where I don't mind hosting the VM in the organization's primary virtualization infrastructure (even if that's VMware).

Also of note is that I configure multiple Libraries (one per Team Project), and for this scenario, where each Team Project only has a single host server, I place the Library on the same physical server as the Host. This has the benefit that the large Library->Host and Host->Library transfers never need to hit the network.

Scaling a Single Large Team Project

The other important scenario is when you have a single Team Project that outgrows the single-server deployment. They simply need more resources (# of VMs, CPU, RAM, disk) than the original single server can provide. In this case I aim for an architecture something like this:

As in the previous example, I've moved the SCVMM VM off to its own host. Since the Library is now shared between several hosts, I also move it off to some centralized location. In the image above I have it located on the physical server that hosts the SCVMM VM; however, you could also place it on its own dedicated physical server (if I had multiple Team Projects I would definitely do this, as you'll see below), or you could place it in a VM hosted on the same host (Server #3) or elsewhere. You want to pay attention to the network routing between your Library and Hosts. There will be large transfers happening (potentially hundreds of gigs at once), so you want the network between them to be fast and short. Typically you want to ensure that the Hosts and Library are connected to the same physical network switch (I'm starting to see more people putting a 10GigE switch in place just for TFS Lab, even if the rest of their network is still slower).
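To see why the switch matters, here's a rough transfer-time calculation. The 300 GB environment size and ~80% wire efficiency are my assumptions for illustration:

```python
def transfer_minutes(size_gb, link_gbps, efficiency=0.8):
    """Minutes to move size_gb over a link of link_gbps at the given wire efficiency."""
    seconds = (size_gb * 8) / (link_gbps * efficiency)
    return seconds / 60

print(round(transfer_minutes(300, 1)))   # ~50 minutes on 1 GigE
print(round(transfer_minutes(300, 10)))  # ~5 minutes on 10 GigE
```

A deploy that ties up the team for most of an hour on 1 GigE becomes a coffee break on 10 GigE, which is why the dedicated-switch investment tends to pay for itself quickly.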

Mature TFS Lab Infrastructure

The final example is combining these various scenarios together into an organization that has many Team Projects, some large enough to require multiple hosts, and some where a single-server will suffice:

Configuring TFS to Dedicate Hosts/Libraries to Team Projects

Something to note is that the TFS Admin Console only allows you to assign Host Groups and Libraries to Team Project Collections, not to individual Team Projects. Assigning them to individual Team Projects is still possible, but you have to use the command line rather than the TFS Admin GUI. First assign your Host Group(s) and Libraries to the Team Project Collection in the GUI (make sure to turn Auto-Provision off). Then run the following commands to assign the various host groups and libraries to specific Team Projects:
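From memory, the relevant tool is TfsLabConfig.exe (in the TFS Tools directory on the application tier), with the CreateTeamProjectHostGroup and CreateTeamProjectLibraryShare subcommands. The sketch below shows the general shape with placeholder collection URL, project, and names; treat the exact flag spellings as an assumption and verify them against TfsLabConfig /help before running:

```
TfsLabConfig CreateTeamProjectHostGroup /collection:http://tfs:8080/tfs/DefaultCollection
    /TeamProject:"Team Project A" /Name:"Host Group A"

TfsLabConfig CreateTeamProjectLibraryShare /collection:http://tfs:8080/tfs/DefaultCollection
    /TeamProject:"Team Project A" /Name:"Library A"
```

Repeat the pair once per Team Project, pointing each at the host group and library share you dedicated to that team.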