TFS Shipping Cadence

I’ve been meaning to write about this for a while but somehow the days just slip by and I never find the time.

Team Foundation Service

If you are a reader of my blog then you’ve been seeing my posts on our service updates for months now. But let me rewind a bit and start at the beginning.

About 2 years ago we began the journey to bring TFS to the cloud. In the beginning it was just an experiment – Can we port TFS to Azure and have a viable cloud hosted solution? It took a summer to prove that we could do it and the fall/winter to shore it up and make it production ready. So a little over a year ago, we decided we were serious about this and started asking what a product plan for it looked like.

Obviously our background is as an on-premises mission critical server team – we’ve been doing that for 10 years. Further, we’re part of the Microsoft “machine” and that has its own set of ingrained practices. We shipped on 2-3 year cycles. I believe we had/have a very good 2-3 year cycle with strong customer engagement, good planning, agile release mechanisms (like Power Tools), etc. – but still, it’s a 2-3 year cycle. We knew, going into the cloud space, that wasn’t going to work.

In the cloud, people expect freshness. They expect progress. If your site/app hasn’t progressed recently, people start to assume it’s dead. We needed to rethink a lot about what we do. That thinking started with trying to figure out what we wanted. Like people often do, we started with what we knew and tried to evolve from there. We spent a few months thinking through “What if we do major releases every year and minor releases every 6 months?”, “Major releases every 6 months, patches once a month?”, “What if we do quarterly releases – can we get the release cycle going that fast?”, etc. We spent time debating what constitutes a major release vs a minor release. How much churn are customers willing to tolerate? We went round and round. Ultimately we concluded we were just approaching the question wrong. When a change this big is necessary – forget where you are, ask where you want to be, and then ask what it would take to get there.

So late last summer, we were shipping service updates about every 4-6 months and we made the call that we were going to go to weekly updates. The goal was to ship new features (not just bug fixes) every week. To some degree it was a statement. Let’s figure out how fast we can go. Some asked why not every day? – certainly some services out there do that. Ultimately I felt that it just wasn’t necessary for our product/customer base/size of team. Maybe someday that will make sense but I didn’t feel it did for where we are today. To avoid having the capabilities delivered in this weekly cadence be random, we decided on a 6 month planning horizon. We’d plan in roughly 6 month chunks and then deliver in weekly increments.

We started working through this last fall (I’ll write more about this effort later) and gradually turned up the frequency from 4-6 month updates. The team was already executing with a Scrum based process, using aligned 3 week sprints. As our release cycles got shorter and shorter, we realized that those 3 week sprints actually formed a natural cadence for the team. We plan the sprint, we build it, we deliver a “potentially shippable increment” – except, in this case, it wasn’t “potentially”; now it really was “shippable”. Because of this natural alignment, we decided to stop the cycle tuning process at 3 weeks and ship feature updates to the service every 3 weeks rather than every week. We knew we still needed ways to update the service to resolve high priority issues more frequently than that, so we instituted a “Patch Monday” plan that says any given Monday we can, if needed, roll out important but not urgent service fixes. We also can roll out urgent fixes any given day, and sometimes do – literally within a few hours of discovering the issue.

The last piece fell into place when we realized that 6 month planning wasn’t enough to give the team a clear view of where we were going, so we added a 12-18 month “vision” to make sure we were all rowing in the same direction.

So our cadence is:

12-18 month vision – This is pretty high level and describes the kinds of scenarios we want to enable. It often includes some storyboard that demonstrates the value but is not intended to be either design or a feature list – it is just illustrative of the kind of experience we want to create.

6 month planning – In this window we get more crisp about what features we are building. Here we work out high level cross team commitments – often our work requires coordination across multiple teams. It’s still not design but rather agreement about what scenario we’re delivering and what features support that scenario.

3 week sprints – We generally keep a sprint schedule looking out 2-3 sprints for each feature team. It has a lot more detail for “this sprint” than the next couple but it gives us some clarity on what is coming and when, allows the next level of dependency planning and allows us to understand what kind of progress we are making on our scenarios and balance work where we need to. At the end of every sprint – we deploy to production. Some of what we deploy may be “hidden” behind a feature flag. That enables us to deploy in progress work and, where appropriate, expose it to limited sets of users to test/give feedback.
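The feature flag mechanism described above can be sketched in a few lines. This is a minimal, illustrative model (not TFS’s actual implementation – the class and flag names here are hypothetical): work deployed mid-development stays hidden unless a flag is either rolled out globally or the user is in a limited preview group.

```python
class FeatureFlags:
    """A minimal feature-flag registry (illustrative sketch only)."""

    def __init__(self):
        # flag name -> {"enabled": bool, "preview_users": set of user ids}
        self._flags = {}

    def register(self, name, enabled=False, preview_users=None):
        self._flags[name] = {
            "enabled": enabled,
            "preview_users": set(preview_users or []),
        }

    def is_enabled(self, name, user=None):
        flag = self._flags.get(name)
        if flag is None:
            return False  # unknown flags default to off: deployed work does no harm
        if flag["enabled"]:
            return True   # fully rolled out to everyone
        return user in flag["preview_users"]  # exposed to a limited set of users


flags = FeatureFlags()
# Hypothetical flag: the feature is deployed but only "alice" can see it.
flags.register("kanban-board", enabled=False, preview_users={"alice"})

print(flags.is_enabled("kanban-board", user="alice"))  # True  (preview user)
print(flags.is_enabled("kanban-board", user="bob"))    # False (still hidden)
```

The key design point is that the default is always “off”: code behind an unregistered or disabled flag ships with the sprint payload but is unreachable until someone deliberately turns it on.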

Patch Monday – Every Monday we are capable of deploying important but non-urgent service fixes. All teams know that if they have something they need to get in, there’s a window every Monday. If there’s nothing needed, we don’t deploy and most of the time we don’t need to.

Daily hotfixes – On any given day, we can patch the service with a hotfix if we have any urgent service issue. In practice, this seems to wind up happening about once every 6 weeks (once every other sprint). It’s usually the result of some regression that got deployed with a sprint payload but sometimes it’s something else like a new load induced problem, etc.

I expect as we continue to learn and mature this may evolve further. Maybe someday we’ll break the aligned sprint model and then go to weekly deployments. But for now, this is working well for us and seems to be working well for the consumers of TFSPreview, so we’ll keep doing it.

Team Explorer Clients

Once we had started to settle on a pretty rapid cadence for service updates, we realized we were going to have another problem. Some of our new service features are going to require updates to the clients to really expose them in a way that works great for developers. This means that having a 3 week cadence for the service and a 2 year cadence for the client (or even 1 year if you count a service pack in the middle) really isn’t going to work. So last fall, as the service cadence firmed up, we started looking at what to do about the client.

It didn’t take much thinking for us to realize that significant changes to the Visual Studio client every 3 weeks was probably not going to fly. Most customers don’t want to update their clients that often. The quality assurance model for an on-premises release has to be more rigorous than a cloud service because fixing an issue once it’s deployed is much harder. Etc. So we started looking at a model that had a few constraints:

Ship often enough to not hold the service evolution back.

Provide a better vehicle than we’ve ever had before to provide responsive innovation and improvements based on customer feedback.

Ensure the quality of on-premises updates is high.

Deliver updates in a way that is reasonably easy to discover, has minimal impact, and has a reasonable acquisition experience.

Assume not everyone will take every update so it needs to be easy to skip updates and then later pick up new ones.

We ultimately settled on quarterly updates as a reasonable trade-off between the costs of frequent updates and the lag with the service. However, it’s clear that to accomplish this, we’ll need to be thoughtful about what changes we make in these quarterly updates. 3 months is not enough time to run a full validation pass for arbitrary sets of changes. As such we’ll have to focus mostly on changes “higher in the stack” to minimize the potential destabilizing impact. To use a ridiculous example, making significant feature changes to the .NET Framework every 3 months and deploying them to the world would be a very bad idea given our current abilities.

We introduced the Visual Studio Extensions and Updates feature in VS 2010. As we looked at mechanisms for delivering updates to VS, it was the most appealing of the options we considered, which also included Microsoft Update. Unfortunately, in 2010, it didn’t support the full power we needed to update the breadth of components we felt we might need to update. Fortunately, having started to think this through last fall, we were able to pull together a plan to extend the VS Update mechanism in VS 2012 to support the power that we need.

So the plan, now that VS 2012 has shipped, is to move to a quarterly update cadence for our clients. This won’t, of course, eliminate the need for us to do longer cadence “major updates” too. So expect major releases to continue, but I’m very happy to be able to provide continuing value on a regular basis.

Team Foundation Server

Once we had locked our plans for service updates every 3 weeks and client updates every quarter, the next obvious question was about our on premises TFS server. It’s clear that we have a large number of customers who are going to continue to want to use an on premises solution – you might say that’s our bread and butter. We also have some very good hosting partners filling needs that our Team Foundation Service doesn’t address who would like to be able to provide the latest capabilities. How are these groups of people going to feel about waiting 2 years for features that the online service gets within 3 weeks of release? In fact it’s worse than that. A few months before we released TFS 2012 we had to start locking down the churn and as a result the service was getting new features that were not in TFS 2012 *before* TFS 2012 even shipped.

On the other hand, inasmuch as it is true that not everyone wants to update their client every 3 months, it’s even more true that not everyone wants to update their server every 3 months. Further, we don’t have as clean and simple a mechanism for updating the server as we do for the client. It’s also the case that the QA process for a mission critical server is even more involved and costly than for a client.

All this taken together, I’d rather not try to update the on-premises server every 3 months. However, as we’ve started to figure out how to put any cadence plan for the server into action, we are finding that it actually depends a great deal on what kinds of capabilities we are trying to deliver. So we’ve ultimately landed in a place where we’ve said we aren’t going to make a firm commitment to a cadence for the on-premises server. Instead, we’ll “play it by ear”. At this point, it’s clear that we *will* need to update the on-premises server in our first quarterly update later this year. Once we’ve been through that cycle, we’ll revisit the cadence question and see if we are in a better position to lock on a cadence or whether we continue to “play it by ear.”

Conclusion

It’s a long post and I’m sorry about that but I wanted to give you some flavor of the journey. The summary is:

Team Foundation Service updates every 3 weeks

Visual Studio Client updates quarterly

Team Foundation Server updates more frequently than every 2 years, but the details are still being worked out. We’ll definitely deliver one this fall but then we’ll see after that.

Join the conversation

Good to read and understand your thinking. We already have folk asking us when our in house TFS server will have the Kanban features…so the sooner you ship the better. I'm liking 3 months if you guys make each update cumulative and high quality.

Maybe you can post about how User Voice plays into the team's thinking. The list on user voice has been the same top 25-45 for the last year and I tend to agree with most of the votes. Any view into if/how user voice influences your larger 12-18 month vision?

4 years ago

Chris Kadel

Brian – thanks again for helping to lay out expectations for partners and customers the way you have been doing. Much appreciated!

4 years ago

Benjamin Cheung

What does "Visual Studio Client updates quarterly" mean? Are you going to be including updates to not just Visual Studio 2012, but also Visual Studio 2010, Visual Studio 2008, Visual Studio 2005, and the MSSCCI providers?

We STILL have users on Visual Studio 2008 because of SSRS report development/SSIS package development, and BizTalk development. We also have Visual Studio 2010 users because we've moved to SSDT development (thank the dear lord…took long enough).

Now I can understand if you drop the compatibility of updating Team Explorer…but please give us updated MSSCCI providers for the clients as I'm afraid to look but somewhere I'm sure there's someone in my org still writing VB6 code….<shudders>

We actually started an effort about a year ago to re-write our VB6 apps to .NET because Windows 8 was supposed to NOT support VB6…but then you guys caved and supported VB6 and so now those apps will still be updated for the next 3-5 years….PLEASE kill off support for VB6.

Benjamin, I thought about covering other client versions in this post but dropped it for length. I'll say a few words. No, this will not include quarterly updates of all major releases – just of the latest (so VS 2012 for now). That said, yes, lots of people use older versions, and no, they aren't screwed. We have a VERY high compat bar so, in general, you should expect the things you use in older clients to keep working. You should *not* expect all new features to show up in your older clients.

Scott, I'll comment a bit more about UserVoice when I announce our quarterly update. The truth is that we put up the UserVoice site a bit late in the 2012 cycle for it to have significant impact. That's part of the problem with longer cycles. You end up having too much plan committed and less flexibility.

The second issue is that some of the things at the top of our UserVoice list are "BIG" things. They may not seem it but here's one of the problems with this kind of voting system. If I could either give you #1 or #10 through #20, you might actually choose 10-20. In other words, the voting is unable to take into account the size of the effort.

That said, we do have a bunch of the UserVoice suggestions high on our backlog and I expect we'll be making some progress on them. In fact, I might argue we already made some progress on #2 with the Git-tf bridge we recently released.

Thanks Brian! I've been hoping to hear more about how Microsoft plans to use the Visual Studio 2012 Product Updates feature. Not much has really been said about it and I keep hoping to hear more about how Microsoft intends to use it. I think this post uncovered a part of that, but I'm curious what other kinds of "updates" Microsoft would look to push out as part of the product update – i.e. will it just be stability / performance / security patches to Visual Studio or will new features be included?

Also excited to see the new TFS(ervice) features becoming accessible to us on-premise folks. : )

Matt, Updates will include both features and fixes. However that will vary a lot based on the stack. Stuff lower in the stack (IDE plumbing, editor, compilers, etc) that has a lot of dependencies on it will likely get mostly stability/perf fixes. Stuff higher in the stack will have more feature work. The best way to get a feel for this is to watch as we start to share our plans for what goes in Quarterly Update 1 (QU1). Stay tuned for more on that.

Wow, that's great news. I'm looking forward to working with the TFS service – by the way when is the pricing going to be announced? I'm not updating our TFS server (new OS and new SQL server) until I know if the TFS service is going to be cheap enough to fit in our budget. 🙂

Brian – You mentioned "Fortunately, having started to think this through last fall, we were able to pull together a plan to extend the VS Update mechanism in VS 2012 to support the power that we need."

Love that the updates will be more often. Also like the updates to the pending changes, with one exception…

The company I work for decided quite a while ago to place the work items in a different TFS project than the source (don't ask, but it had to do with using TFS for source, but not bug tracking, and then migrating that later).

The problem I have (and maybe there is a solution) is that I can no longer easily switch between pending changes and the work items because they are in different projects – it takes 2-3x longer than in 2010, when pending changes was a separate window.

The only other small "complaint" is that I don't see a way to remove the hierarchy from the pending changes window. I never used it in 2010, as it added extra vertical information I didn't need all the time…

@Chad – one way you can work around the switching projects to associate work items is to create personal queries targeting the work items project under the project where your source code lives. Doing this will allow you to use the "Queries" drop down on the Pending Changes page to launch queries you use most often. The trick here is to change the @project macro to the name of the work items project.

Great post! As interesting as it was to hear about the development of Visual Studio and TFS, it was just as interesting to understand the evolution of the release lifecycle for a cloud service. I was hoping you could share more about the branching strategy you are using to support this shipping cadence.

Where you can share project status and team updates with your social network. It advertises what you are doing, but also creates curiosity for collaboration, etc.

4 years ago

arnold

Brian we are looking to integrate Clarity, Remedy, Rally, Dynamics CRM, and Fogbugz with TFS 2012, are there any sync providers you'd recommend? Does your cloud version of TFS provide syncing/import/export capabilities beyond what is available in the on premises version of TFS?

4 years ago

Mike

Thanks for the info, Brian. I'd be really interested in finding out how you structure the TFS source with respect to branching when it comes to shipping features, i.e. do you branch at the end of an iteration, and merge in patches/bug fixes as required, do feature branches and merge back into a trunk, etc?

Nothing at the moment but it's definitely the kind of thing we're interested in. We're working on a web extensibility model that will allow people to do all kinds of nifty things. And, of course, we'll use it too.

Brian

3 years ago

Cameron W.

If you don't mind, I would like to ask how you manage your code base. With the amount of developers/testers etc… that you must have on your teams, I would like to know what process (general if needed) you are using for checking in/merging and maintaining your code. Do you use feature branches, one branch, etc…?

I ask because we use a similar process but find that at the end of our cycle we are always trying to clean out code that wasn't 100% finished or find ourselves behind due to the time it takes to manage our code base.

I would love to hear how you manage your code to be able to turn around clean/tested/usable on a 3 week basis.

@Cameron, That's a long topic and the full answer would not fit in a comment but I'll give you a summary. We used to make VERY heavy use of feature branches. Over time, we found that as we increased our cadence the overhead of managing integrations became too high and the latency of merges introduced drift that led to unexpected bugs. Add to that mistakes made in the merging process, and we have reduced our reliance on feature branches. We still use them, just less than we used to.

The TFS team – let's say ~100 devs and testers – works mostly in one branch together, checking in regularly and being responsible for delivering tests for their changes and verifying the quality of their checkins to keep branch quality good. We still use feature branches when we are going to make a more disruptive architectural change that will last for weeks or months and would make it difficult for people not involved with the change to continue to be productive. At any given time, we might have a couple in flight.

Part of being able to sustain a rapid cadence is breaking down work into smaller chunks that we would not have previously called "finished" but are now finished in a new sense. They are a complete change – though they may be only a fragment of the end user feature. And they "do no harm" even if they do nothing whatsoever (the code is disabled, for example). This practice allows more developers to share a branch without destroying each other's productivity, while enabling maintenance of quality and rapid cadence.
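The “do no harm” idea above can be illustrated with a small sketch (my own wording, not the team’s actual code – the function and flag names are hypothetical). An incomplete feature fragment is checked in fully wired up but unreachable, so the shared branch stays shippable every sprint:

```python
# Flipped on only when the feature is complete; until then the fragment ships dark.
NEW_PREVIEW_RENDERING_ENABLED = False

def render_pending_changes(changes):
    """Render a list of pending changes; the in-progress path is disabled."""
    if NEW_PREVIEW_RENDERING_ENABLED:
        # In-progress work: a complete change in itself, but only a fragment
        # of the end-user feature. Safe to check in because it cannot run.
        return [f"{change} [preview]" for change in changes]
    # Existing behavior, entirely unchanged for users.
    return list(changes)

print(render_pending_changes(["edit a.cs", "add b.cs"]))  # → ['edit a.cs', 'add b.cs']
```

The discipline this buys is that every check-in either changes nothing observable or is deliberately switched on – which is what lets many developers share one branch without destroying each other’s productivity.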

We also have a lot of systems – like gated checkins, rolling test suites, regular performance and stress testing, a very close relationship between developers and testers, etc that help enable it.

Hopefully this gives some insights. If you have more specific questions, I'm happy to try to answer them.

Brian

3 years ago

Jason Brown

Where can I learn more about how to setup a "rolling test suite"? I can't seem to find any documentation on MSDN on the topic or with MTM beyond just setting up a test plan.