Meta

This last month I put together a presentation on version control. As far as I’m aware, everyone I work with uses some form of version control. My purpose with the presentation was many fold. I and a lot of thins that had been prrcelating that I wanted to think about and express. This as an excellent opportunity for that.

Git vs SVN

One thing I know is a sore point for some developers is the idea that git is taking over. There are a lot of people espousing the idea that git is better than svn in every way. This has created a conflict culture because there are a lot of developers making good use of svn and most have heard the primary arguments in favor of git and found the arguments are not compelling enough to motivate a switch.

This is okay.

There is no war on centralized version control, and if there is, the only people pushing it are dumbasses. (This coming from someone who was one such dumbass a couple years ago.)

It’s anti-agile to prescribe something like that. Each team works differently and needs to determine for themselves if the benefits outweigh the costs.

Continuous Delivery

Recently I read a book called continuous delivery. It is a comprehensive blowhard description of how devops “needs” to be. It covers testing strategies, and branching strategies. While a lopsided view of the world, it offered potential solutions that I have found fascinating.

Trunk Based Development vs Feature Branching

This offers not just a workflow that allows for potentially faster delivery, but also a realistic way to use SVN. An alternative to the branching hoopla around distributed systems.

DevOps

It also gave me an opportunity to talk about DevOps in a reasonably safe environment. It’s a political word at the moment and while I don’t want to get involved in that, I do want to talk about the DevOps movement.

This movement is about working together. Communicating on a level that hasn’t classically been done. It’s there because a lot of the problems we face are due to the fact that in our workflow, we develop a product and throw it over a virtual wall to operations to “handle”. They don’t understand what we need done and we don’t understand what they need done. It’s a matter of sharing knowledge and searching for a deeper understanding of what we’re doing. It’s hard because we don’t want to spread ourselves thin, but in order to deliver a quality product, you have to be more than just a cog in the machine. You don’t have to know exactly “how” to do everything, but you have to have a basic understanding of “what” is happening.

But I digress, I was able to tie this in to version control by talking about how DevOps is also concerned with automation. Creating scripts to do all parts of the deployments. These scripts, this automation needs to be in version control as well. They “can” be in the same repository, or they can be in a separate repository. But managing dependencies, flipping all the right switches, needs to be maintained in much the same way code is controlled. Because it is code.

Database Version Control

And of course, you can’t talk about Version Control without talking about database version control… but this one requires its own article.

I was asked to take a look at a thought model for a Cloud Delivery Platform. I’m not going to post that graphic, because it wasn’t ready for prime time. It was confused and I’m not really sure what it was trying to show. So I created a bit of a diagram showing what I thought was important. (Just click on it, it doesn’t seem to fit right, and the next size down is too small to read.)

Halfway through I realized I’m not terribly good at making these things. I’ll get better. But there’s a few things I wanted to draw attention to in this graphic.

The Roles

This was the most important thing to me is to highlight the crossover of roles for each step. There are few steps in the process that should be perfomed by a single role. Devops is about communication.

Local Development

Coding, building, testing, packaging are all things that should be done at the same time. You don’t code 1000 lines then build. You build with every small change. In the same way, you should have tests written with the small increments. And that gets extended to “packaging”.

Packaging

I’m probably missing a very important word in my vocabulary here, but I’m going to go easy on myself, it’s late. What I’m meaning here is the process of setting up an image, or a deployable product. The configuration that will collect external dependencies and run tests when it gets to the deployed environment. This should all be done locally first. It should all be put into version control, separate from the application code. And this needs to be done in collaboration between the developer, who knows the application, and operations (or as they’ve been renamed in my organization, “Devops Engineers”) those who know how to package things appropriately.

Local Concurrence vs Cloud Linearity

Why isn’t spell check underlining “linearity”. That can’t be real. As I mentioned before, the things done locally are all done at the same time (build, test, package). That’s all just development. When it gets to the cloud, it’s a done product. Nothing should be further developed there. Nothing should be getting reworked. Everything needs to come entirely from Version Control and no manual finagling. (obviously?) So the cloud portion is linear. Everything happens in order. If it fails at any point, it’s sent back to local dev and fixed there.
Maybe this is all so obvious, it doesn’t need to be said. Maybe I’d feel better if it were just said anyway.

I’ve talked about this before, but I made a pretty picture for it recently to help explain it.

SVN is a centralized repository that we use for controlling deployments, sensitive data, and storing environment specific configs. The majority of the code is contained within Git (github), the development, feature branching workflows etc.

This works well because SVN is good at being central and having a linear history. Git is good at branching and going nuts with workflow.

My previous Git vs SVN made some errors based on old information. I’m hoping this will be a more accurate account.

Git Pros:

DVCS (Distributed Version Control System). What this means is it works off of the idea that every user copies the entire VCS and runs their own repository locally. This has several benefits.

Speed. This makes development/working with the repository much faster as almost every operation does not require you to contact the “primary” repository.

Tiny Commits. This allows / encourages tiny commits, so changes are tracked more closely and can be more easily separated out.

Control over who merges. A merge can be attempted by any access layer, but this will generate a pull request which can then be reviewed and approved.

No network access required.

Lightweight Branching. While feature branching in SVN is technically possible, it’s also a pain in the ass. Git made branching a first class feature. Easy branching means more people will probably make use of more advanced workflows.

Much smaller repositories. Due to better compression.

Repo file formats are simpler. So repair is easy and corruption is rare.

Creating a repository is trivial. So people who want to use repositories to track tiny personal projects, can just do so without any setup.

SVN Pros:

Narrow Clones. You can make a checkout of just one subdirectory deep in a hierarchy, download only the files related to that directory, and still be able to make commits. This seems useful in massively large repositories.

Comfort. People are already used to the linear structure and commands of SVN and that makes sense to them. Switching to DVCS is a cost. Most developers find the switch painful because it goes against what they’ve done for so long.

Deals with binary files better. If you’re tracking massive amounts of non-textual data, SVN handles it better as Git’s compression won’t work as well and it’ll be copying over everything.

Better classical / hierarchical model. (As opposed to patch/change based revisions.) It keeps in place a simplified model, good for keeping release history. Good for if you only make large commits and don’t want / need to record small dev changes. From a Git perspective, think of fast-forwarding every pull request.

SVN Lock. This provides more top down control over the repository.

Supports empty directories. This is probably important to someone.

Shorter and (mostly) predictable revision numbers.

SVN encourages large commits, Git encourages smaller commits. Simplicity or granularity. We all started out used to the former, but the latter is catching on quickly.

What seems to be the bottom line for most people is that SVN is the old and Git is the new. This is the direction of things, and it’s a new way of thinking about workflows and how VC can actually be doing things OTHER than just storing data.

The only issue that really matters to us is the pain of the switch. The problem is if we don’t understand the benefits that we’re getting from the switch, there’s never going to be enough of an impetus to switch.

SVN is a minor improvement on the CVS design. A linear, central repository. To their credit, the developers of SVN acknowledge this, from the Red Book:

Subversion was designed to be a successor to CVS, and its originators set out to win the hearts of CVS users in two ways—by creating an open source system with a design (and “look and feel”) similar to CVS, and by attempting to avoid most of CVS’s noticeable flaws. While the result wasn’t—and isn’t—the next great evolution in version control design, Subversion is very powerful, very usable, and very flexible.

Git is a Distributed Version Control System (DVCS). It’s important to distinguish it as “decentralized”. What that means is that each user creates a literal “fork” of the project.

What Git offers that’s important is lightweight branching and significantly better merging. These are what make advanced workflows (feature branches) possible. See Git Flow.

Branching by itself isn’t where the magic is. What makes Git bounds above SVN is actually the merging. What makes the merging so much better is actually how the history of commits is kept. Git is a Directed Acyclic Graph (DAG) and SVN is done linearly.

What this means is the SVN merge doesn’t take into account previous merges of the branch, it merges the files directly. They’re both doing 3-way merges, but the 3rd that is used (the common ancestor) isn’t the same because the linear history of SVN isn’t taking into account previous merges. So basically SVN goes back way further because it doesn’t know where the last merge between branches was. The result is a lot more conflicts that have to be manually resolved.

The above is outdated. As of SVN version 1.5, (2008), SVN includes meta data on merge histories, so it uses the 3-way merge to similar effect as Git now. I’m going to have to write a new one of these after more research.