Don’t do svn-to-git repository conversions with git-svn!

It has come to my attention that some help pages on the web are still recommending git-svn as a conversion tool for migrating Subversion repositories to git. DO NOT DO THIS. You may damage your history badly if you do.

Reminder: I am speaking as an expert, having done numerous large and messy repository conversions. I’ve probably done more Subversion-to-git lifts than anybody else, I’ve torture-tested all the major tools for this job, and I know their failure modes intimately. Rather more intimately than I want to…

There is a job git-svn is reasonably good at: live gatewaying to a Subversion repo, allowing you to pretend it’s actually in git. Even in that case it has some mysterious bugs, but the problems from these can usually be corrected after the fact.

The problem with git-svn as a full importer is that it is not robust in the presence of repository malformations and edge cases – and these are all too common, both as a result of operator errors and scar tissue left by previous conversions from CVS. If anyone on your project has ever done a plain cp rather than “svn cp” when creating a tag directory, or deleted a branch or tag and then recreated it, or otherwise offended against the gods of the Subversion data model, git-svn will cheerfully, silently seize on that flaw and amplify the hell out of it in your git translation.
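Before converting, one way to check a repository for these two particular scars is to scan a dump of it. What follows is a minimal sketch, not something taken from git-svn or reposurgeon: it reads a Subversion dump stream (the kind `svnadmin dump` produces) and flags tag or branch directories that were added without copy history, or that were recreated after being deleted. The function names are made up for illustration, and the code assumes a standard `tags/` and `branches/` layout at the repository root.

```python
def iter_records(stream):
    """Yield the header dict of each record in a Subversion dump stream."""
    while True:
        line = stream.readline()
        if not line:
            return                      # end of stream
        if not line.strip():
            continue                    # blank separator between records
        headers = {}
        while line.strip():
            key, _, value = line.decode("utf-8").rstrip("\n").partition(": ")
            headers[key] = value
            line = stream.readline()
        # Skip the property/text body so its bytes are never read as headers.
        stream.read(int(headers.get("Content-length", "0")))
        yield headers


def find_suspect_nodes(stream, prefixes=("tags/", "branches/")):
    """Return (revision, path, reason) for suspicious tag/branch creations."""
    deleted = set()
    suspects = []
    revision = None
    for headers in iter_records(stream):
        if "Revision-number" in headers:
            revision = int(headers["Revision-number"])
            continue
        path = headers.get("Node-path")
        # Only look at direct children such as tags/v1.0 (assumed layout).
        if path is None or not path.startswith(prefixes) or path.count("/") != 1:
            continue
        action = headers.get("Node-action")
        if action == "delete":
            deleted.add(path)
        elif action == "add" and headers.get("Node-kind") == "dir":
            if "Node-copyfrom-path" not in headers:
                suspects.append((revision, path, "added without copy history"))
            if path in deleted:
                suspects.append((revision, path, "recreated after deletion"))
    return suspects
```

Calling `find_suspect_nodes` on a dump file opened in binary mode returns a list of findings to inspect by hand. The sketch ignores `replace` actions and nonstandard layouts, so an empty result is not proof that the history is clean.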

The result is likely to be a repository that looks just right enough at the head end to hide damage further back in the history. People often fail to notice this because they don’t actually spend much time looking at old revisions after a repository conversion – but on the rare occasions when history damage bites you it’s going to bite hard.

Don’t get screwed. Use git-svn for live gatewaying if you must, remaining aware that it is not the safest tool in the world. But for a full conversion use a dedicated importing tool. There are around a half-dozen of these; beware the ones that are wrappers around git-svn, because while they may add a few features they can’t do that much to address its weaknesses.

Ideally you want an importer with good documentation, a comprehensive end-to-end test suite full of examples of common Subversion-repository malformations, a practice of warning you when it trips over weirdness, and a long track record of success on large, old, and nasty repositories. And if you learn of any tool with all those features other than reposurgeon please let me know about it.

Please share this and link to it so the warning gets as widely distributed as possible.

P.S.: It has been pointed out that I could provide more positive guidance. Here it is: the DVCS Migration HOWTO.

Thanks. As the saying goes, all models are wrong, but some models are useful.

> … incorporate at least some of it into my docs and howtos.

Yeah, that would be great.

Looking around on the web, there are a lot of people whose mental models of git are broken (at least a few in a very similar fashion to the way mine was), and there are a lot of people whose mental models work just fine. But even a lot of the people with useful mental models of git — who can tell you the right thing to do in most situations — cannot correctly describe the reasons, either because to them the reasons are first principles and completely obvious, requiring no explanation; or, more commonly, because they have developed byzantine mental rationalizations for why they should or shouldn’t do things in a particular way.

This leads to numerous instances where people who know their mental models are bad (because they got unexpected results) post questions, and people who know the right thing to do, but not why, post answers complete with made-up rationales, but then bristle at the challenges to their fantasies. This leads to many fruitless and unsatisfying exchanges.

Even when it’s not contentious, there is usually no understanding conveyed about this particular issue of future merges. For example:

It seems clear to me that the original mess was caused by exactly the sort of partial merge I describe. Yet the focus of the answer is, first, on cleaning the mess in the most difficult way possible, and second on how to avoid it in the future by always performing a particular magic incantation ritual developed by high priests, and then third — oh, you don’t use the command line? Sorry, you can’t use this ritual.

There is nothing there that says “The reason this happened is that you previously partially merged these two branches, and in doing so, you convinced git that the proper thing to do whenever merging them again is to delete these changes.”

> you convinced git that the proper thing to do whenever merging them again is to delete these changes.

To be pedantic, of course, git isn’t actually deleting the changes on the next merge. Rather, it is only merging any changes to the data that occur after the immediate antecedents of the partial merge. And since those changes were not made in that section of the DAG, they will not be candidates for merging.

But it _feels_ like git is deleting those changes, because there were these files in your directory, and you type merge, and poof! they are gone…
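The effect is easier to see in a toy model. The sketch below is not git's actual merge machinery: it treats a commit as a whole-file snapshot, finds a single merge base by walking parents, and does a file-level three-way merge. All names (`Commit`, `merge_base`, `three_way_merge`, the file names) are invented for the illustration. A "partial" merge commit records `feature1` as a parent but leaves out `b.c`; after that, merging in either direction yields a tree without `b.c`.

```python
class Commit:
    """A toy commit: a name, a full snapshot {path: content}, and parents."""
    def __init__(self, name, tree, parents=()):
        self.name = name
        self.tree = dict(tree)
        self.parents = tuple(parents)


def ancestors(commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(cur.parents)
    return seen


def merge_base(a, b):
    """Return the common ancestor that no other common ancestor descends from."""
    common = ancestors(a) & ancestors(b)
    best = [c for c in common
            if not any(c is not other and c in ancestors(other) for other in common)]
    if len(best) != 1:
        raise ValueError("toy model handles only a single merge base")
    return best[0]


def three_way_merge(base, ours, theirs):
    """File-level three-way merge; None stands for 'file absent'."""
    result = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            pick = o
        elif o == b:
            pick = t            # only their side changed since the base
        elif t == b:
            pick = o            # only our side changed since the base
        else:
            raise ValueError(f"conflict in {path}")
        if pick is not None:
            result[path] = pick
    return result


def merge(ours, theirs, name):
    base = merge_base(ours, theirs)
    return Commit(name, three_way_merge(base.tree, ours.tree, theirs.tree),
                  parents=(ours, theirs))


root = Commit("root", {"core.c": "v1"})
feature1 = Commit("feature1", {"core.c": "v1", "a.c": "A", "b.c": "B"}, [root])
main1 = Commit("main1", {"core.c": "v2"}, [root])
# Partial merge: recorded as a merge of feature1, but b.c was left out by hand.
partial = Commit("partial", {"core.c": "v2", "a.c": "A"}, [main1, feature1])
feature2 = Commit("feature2",
                  {"core.c": "v1", "a.c": "A", "b.c": "B", "c.c": "C"}, [feature1])

into_main = merge(partial, feature2, "merge-into-main")
into_feature = merge(feature2, partial, "merge-into-feature")
print(sorted(into_main.tree))     # ['a.c', 'c.c', 'core.c']
print(sorted(into_feature.tree))  # ['a.c', 'c.c', 'core.c']
```

In both merges the base is `feature1`, which already contains `b.c`. Merging into main, the feature side has not touched `b.c` since the base, so there is nothing to bring over. Merging into the feature branch, the only change to `b.c` since the base is its absence on main, so it vanishes.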

Speaking of old and useless things, have you heard that SourceForge has started hijacking projects and distributing crapware installers? It gets really funny because most of those projects moved over to github (or elsewhere) when SourceForge introduced (optional, at the time) crapware installers.

Eric, that Hacker News comment is (approximately) two years old. In the past week, it seems SourceForge “hijacked” (claimed curator ownership for) the SourceForge pages for several projects that had moved to other repositories, such as Gimp, Firefox, and Subversion as “mirror hosts”; this was apparently prompted in part by the fact SourceForge does not delete project pages after such a migration “for sake of retaining materials of historical value”. It’s possible that the tools for redirection were improperly used in the case of these projects, but given the number and profile of the projects affected, I find this somewhat doubtful. You can read more on the position of both SourceForge and the GIMP developers in Ars Technica.

>Eric, that Hacker News comment is (approximately) two years old. In the past week, it seems SourceForge “hijacked” (claimed curator ownership for) the SourceForge pages for several projects that had moved to other repositories

Nasty.

I guess the acid test for whether they have really become evil will be whether they give the WinGimp page back to Jernej Simončič.

Ah, @Patrick Maupin, I hadn’t considered the Lanham Act angle at all. Firefox is a registered trademark of Mozilla and crapware added to a Firefox installer would definitely cause considerable harm to Mozilla and its trademark. (The GIMP, OTOH, has no such luck.)

>@esr: two interesting off-topic articles about guns and gun control in the U.S.:

First one looks like a classic example of torturing the data to make it confess the results you want. From the summary:

“Rather than looking for a single state that matches Connecticut’s demographics, they performed a statistical analysis that created a synthetic state that tracked Connecticut’s pattern of firearm homicides before the law’s passage. This state was composed of a weighted rate from a number of different states.”

Those of us familiar with statistical flimflamming will smell a very large rat in the phrases “synthetic state” and “weighted rate”. By choosing the weights it is likely that you can produce any result you want.

And I think that’s what’s been done here, since longitudinal studies elsewhere show a negative correlation between crime and civilian gun ownership.

Another problem with git-svn is that, if I remember correctly, it is no longer actively developed, as the author and main developer no longer needs it for his day-to-day work. That doesn’t mean it is unmaintained: there were two commits to git-svn this year. On the other hand: only two commits.

If git-svn were actively developed, you would be able to use git notes to store svn-id info, instead of having to choose whether to have it in commit message (changing it), or not (and losing information).
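To make that tradeoff concrete: git-svn normally appends a `git-svn-id:` line to each commit message it imports. Here is a minimal sketch of the parsing half of the notes idea, assuming the usual `git-svn-id: <url>@<rev> <uuid>` shape. The helper name `split_svn_id` is hypothetical, and attaching the extracted data to the commit (for example with `git notes add`) is left out.

```python
import re

# Assumed trailer shape: "git-svn-id: <url>@<revision> <repository-uuid>"
SVN_ID = re.compile(
    r"^git-svn-id: (?P<url>.+)@(?P<rev>\d+) (?P<uuid>\S+)\s*$",
    re.MULTILINE,
)


def split_svn_id(message):
    """Return (message without the trailer, trailer fields or None)."""
    match = SVN_ID.search(message)
    if match is None:
        return message, None
    # Drop the trailer line and tidy the trailing whitespace it leaves behind.
    clean = (message[:match.start()] + message[match.end():]).rstrip() + "\n"
    info = {
        "url": match["url"],
        "rev": int(match["rev"]),
        "uuid": match["uuid"],
    }
    return clean, info
```

With something like this, the commit message could stay as it was written in Subversion while the revision mapping travels alongside it instead of inside it.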

It’s a pity that the idea of using remote-helpers to interact with repositories (i.e. svn:// or svn+ssh:// URLs for repositories) doesn’t seem to have taken off…

git-cvsimport is the same way, largely unmaintained. It still requires the old version of cvsps from before ESR forked the project, and sadly, the distribution packaging for both Arch and Debian recommends installing cvsps and using it for the conversion of CVS repositories. While git-cvsimport has a use case for live gatewaying with an active CVS server, it’s rather disappointing that such broken tools continue to be recommended and used, and worse that they are provided by Git upstream itself. I’m sure they mean no harm, but it’s sad.

That’s the problem of “software rot” – tools which once were cutting edge, very useful to have, are now no longer actively maintained. Perhaps those tools should be moved out of Git proper into contrib/ area?

@Jakub:
>That’s the problem of “software rot” – tools which once were cutting edge, very useful to have, are now no longer actively maintained.

Case in point, we are currently living in the dark ages of desktop UI design. XFCE and MATE struggle on among the ruins of the glory that was Windows Classic and the splendor that was GNOME. The barbarian hordes of Metro, Unity, and Shell run amok, burning desktop environments, neutering toolkits, and lobotomizing theming and configuration tools. The once prosperous land of GTK+ has been pillaged by the savage army of GTK-minus, which calls itself “version three”…

Oh please. GNOME always was terribad. Nasty, crufty layers of complexity stacked atop one another with dependencies in between, like rotten lasagna with spiderwebs in it. And for what? To make Linux more like Windows? If I wanted Windows I know where to find it. I missed that hype train in 1999 and I’m glad I did to this day.

In terms of desktop environments for Unix, nothing comes close to OPENSTEP/Cocoa in terms of ease of use and development. So if you want Unix with a nice desktop, buy a Mac.

I interact with my Linux machines primarily through shell and Emacs, running i3 for graphical apps, and that’s just the way it should be.

Sounds like you’ve got the “desktop” (i.e., overall GUI) config you want – so why the slam of Linux? OTOH, why not run i3 on your Mac?

I’m far from an average, typical user. I literally grew up around Unix machines — using command lines and tweaking config files by hand is no big deal to me. So what holds true for me regarding an acceptable interface doesn’t hold true for everybody. 99% of the people out there need a nice friendly graphical desktop — which GNOME pretends to be but never gets quite “there”. (Although GNOME 3.x comes closer than most prior attempts.) Those people would be better off with Macs.

In order for Linux to succeed as a general purpose computing platform for the masses, the open source community must adopt a stance of relentless focus on the end user at the expense of all other concerns. The bazaar nature of open source means that there is no leader and no central vision to make that a reality.

Note also that a slam against GNOME is not a slam against Linux. Linux is great for what it is. It’s a really powerful Unix OS, but to cross over as a modern workstation there has to be a culture of “put the user first, no exceptions” within the FOSS community and I’m just not seeing that ever develop. Not like Apple and Microsoft have.

Personally, I’ve tried several tiling WMs and found them restrictive; but I want to give credit to i3’s devs because they took the trouble to localize it: i3 detected I was using a Spanish keyboard and automatically bound “focus right” to $MOD+ntilde (“ñ”) instead of $MOD+semicolon, since that’s the key we have to the right of the L key. (Likewise, it bound “move right” to $MOD+Shift+ntilde. I say this for the sake of completeness. I’m a completeness freak! XD)

I also admire dwm’s devs because of their devotion to the Unix philosophy and because of dwm’s fluid window-management model (this blog post is what actually led me to realize said fluidity). Of course, dwm’s downside is all the tweaking and scripting you need to do to turn it into something remotely resembling a desktop environment; I’m not knowledgeable enough for that yet. (No, the editing-a-header-file thing doesn’t scare me; several web pages explain how to do it and, fortunately for me, it requires no knowledge of C. I’ve gotten the hang of it. :D)

But think about it: under i3, if you’re viewing workspace X and want to move one of its windows to workspace Y, you have to move the window to Y and then switch to viewing said workspace. That’s two commands. If, instead, you want to bring a window from Y to X, you first have to switch from viewing X to viewing Y. That’s three commands. Under dwm, you just tag the window in question so you can then easily bring it to view and/or expel it from view, as many times as you want… all as part of one seamless flow. Sure, tagging windows is a configuration chore you have to do upfront; but it pays off later. Besides, in the config (header) file you can establish rules for specific programs, including automatically assigning certain tags to them; thus, there’s no need to do said chore every single time you start dwm.

OTOH, I don’t know how well dwm handles multi-monitor; for some, including our host, that’s a potential deal breaker. FWIW, the official site does include a section about that.

> Those people would be better off with Macs.

Too expensive for some of us, I fear. And what about the walled garden? :-(

> the open source community must adopt a stance of relentless focus on the end user

There may be room for some hope. Have you tried the Elementary distro? I haven’t, but they say it was designed with attention to user experience.

@Jeff:
Before version 3, GNOME was, from the user perspective, about the best DE I’ve ever run across, especially in terms of configurability. It’s funny that you use the “make Linux more like Windows” line, I’ve more often heard GNOME 2 / MATE accused of making Linux too Mac-like.

I won’t argue with you on how crufty the internals of GNOME and GTK might be: my comment on the neutering of toolkits was directed at regressions in GTK 3 that are directly visible to the user, not at how much better or worse it may be from the developer’s perspective.

> Before version 3, GNOME was, from the user perspective, about the best DE I’ve ever run across, especially in terms of configurability. It’s funny that you use the “make Linux more like Windows” line, I’ve more often heard GNOME 2 / MATE accused of making Linux too Mac-like.

Making Linux more like Windows comes from the GNOME founder himself, Miguel de Icaza, who was long a fan of Microsoft software and how it integrates. In particular GNOME ripped off the Windows object model COM with Bonobo, which was later swapped out for D-Bus for greater interoperability.

Even if git-svn, git-cvsexportcommit and (to a lesser extent) git-cvsimport were actively maintained today, tools meant for bidirectional interaction, or to act as a “fat” client (or server, as in the case of git-cvsserver) for another SCM, or for incremental import, have different tradeoffs than programs meant from the start for one-time conversion, such as svn2git.

I’ve got to agree. From a usability standpoint Gnome 2.4 – 2.8 were great, probably the best, smoothest, most transparent desktop experience I’ve ever had, particularly where the ability to set up my workflow as I prefer was concerned. Gnome’s “vision” for 3.* got their project forked twice (at least) and pretty much lost them the Linux desktop.

I would go so far as to say that the one-two punch of Gnome plus Canonical’s Unity coming out in very quick succession was the very worst reverse we’ve ever seen to Linux on the desktop. (Unity was particularly bad for new users – most of the people I’ve spoken to are very confused by the interface, mainly because there isn’t a menu and they can’t find basic programs!)

The good news is that Xfce4 is (mostly) very nice. I get the same semi-transparent panel(s) and the ability to add/remove stuff from those panels. I get the same control over fonts and colors on the desktop as with Gnome 2.* (with the exception of being unable to set up a differently colored “offset” background font to increase readability), and the GUI does some nice tricks. However, it’s not quite as good as Gnome 2.*. Xfce insists on placing icons in a grid, and it doesn’t move windows around on the taskbar quite as smoothly as Gnome 2.*. Unfortunately, I think Xfce relies on Gnome libraries for most of its functions, so purists may not want to go there. Some of the theming is also off, and I’m not sure whether that’s a problem with particular themes or whether Xfce has some fault that causes theming issues.

grtp.co is a server for an outfit called Gratipay, which is apparently something like Patreon. While it apparently is not malware, I suspect it’s done something with the DNS or certificate settings on Jon’s phone. They’ve got a github, so I suppose it’s possible that someone else is using their code…

> Ah. Currently yielding $32.50 a week.
>
> Kind of silly that it’s not more than that. Or am I using the wrong contribution service?

Personally, I’d give you money if I could; I’m in your debt. But I’m unemployed and use a currency that’s worth much less than the one you use; in order to get a half-decent amount of dollars, I would have to give an amount of pesos I cannot afford to. I’m sure you’ll understand.

> In order for Linux to succeed as a general purpose computing platform for the masses, the open source community must adopt a stance of relentless focus on the end user at the expense of all other concerns.

Yeah, except no. Android is a pile of mistakes heaped on mistakes. Mistake #1 was basing apps on a GC’d VM runtime rather than native apps (and trying to do an end run around Oracle’s IP rights in the process, for which they are bound to pay in the courts). Mistake #2 was allowing uncontrolled multitasking. They’re still paying for this, as the UI jank and sound latency problems are still unsolved in Android, after six years in the wild.

The reality is that Android’s purpose is to ensure users keep seeing Google’s ads. It was never a platform intended to serve the user (except maybe in the “To Serve Man” sense, serving users’ eyeballs to advertisers).

_“Ideally you want an importer with good documentation, a comprehensive end-to-end test suite full of examples of common Subversion-repository malformations, a practice of warning you when it trips over weirdness, and a long track record of success on large, old, and nasty repositories. And if you learn of any tool with all those features other than reposurgeon please let me know about it.”_