Posts tagged ‘dvcs’

Git vs. Mercurial: Please Relax

Everyone’s up in arms to embrace distributed version control as the new must-have tool for the developer in the know. Though many people have not yet migrated from Subversion, those that have almost invariably extoll the virtues of their particular choice. But though all of the major DVCS’s have features that set them above the previous generation of centralized systems, none stands head-and-shoulders above the others as Subversion does among the last generation: each of them was designed for a specific purpose, and each of them will serve those with different habits, workflows and development styles differently. Having used both git and Mercurial for the better part of a year, I’ve had the opportunity to compare the two. It saddened me to see a Twitter-based debate flamewar erupt over which is better, so I thought I’d do my best to try and ease the tension – with analogies!

Git is MacGyver

 

great man or greatest man?

 

 

Git’s design philosophy is unmistakably that of Unix: unlike Subversion, CVS, or Mercurial, git is not one monolithic binary but a multitude of individual tools, ranging from high-level “porcelain” commands such as git-pull, git-merge, and git-checkout to low-level “plumbing” commands such as git-apply, git-hash-object and git-merge-file. So, like MacGyver, you can do just about anything you need with Git – this includes totally awesome Wiki engines, issue trackers, filesystems, sysadmin tools – everything short of fuse repair:

As such, git is not so much a version control system as it is a tool for building your own version-controlled workflow. For example, when faced with the fact that no git tool performs the equivalent of hg addremove – a useful Mercurial command that adds all untracked files and removes all missing files – I found one line to a script originally written by James Robey:


#!/usr/bin/env bash
# git-addremove
git add .
git ls-files -deleted | xargs git rm

Git’s branching, tagging, merging, and rebasing are near flawless: git’s merging algorithm is close to omniscient, having once merged 12 Linux kernel patches simultaneously. Additionally, git provides you with tools to go back in time and edit your commit history – useful for those of us who have left certain critical elements out of a commit and had to quickly recommit with a helpful message such as “oops”. (And when I say “those of us”, I mean “every developer, ever”.) Personally, I elect to only use this feature to edit my last commit (using git commit --amend); I have never needed or wanted to meddle further with the past. git is also extremely fast thanks to its C codebase.

There is no better emblem of git’s flexibility than GitHub. GitHub’s rise to success has been positively meteoric, and with good reason. It’s a brilliantly-designed site that serves as more than a pretty, browsable frontend to my source tree in that it brings a social aspect to programming – using Git, I can fork anyone’s project, make my changes, petition for them to be included in the main repository, and pull other people’s changes to mine. Though it took a while for me to adjust to the anarchic notion of every user having their own equally-valid fork of a project – shouldn’t there be one definitive version of a project? – I realized its potential when working with other contributors to Nu. Add the fact that GitHub is one of the most solid and reliable services I’ve ever used, and you’ve got what very well might be the deal-breaker in the fight for DVCS dominance.

On the other hand, migrating from Subversion/CVS to git requires a lot of work. Linus has made it clear that he disagrees with the fundamental ideas behind Subversion/CVS, referring to SVN as “the most pointless project ever started”. As such, the git project has consciously made no effort to make the migration to git easy: the revert command in Subversion resets your current working copy to the last commit, but in git undoes a supplied patch and commits the changes needed to remove that patch. (The equivalent command for svn revert in git is git reset --hard HEAD^.) Whining about this on the git mailing list is a little like this:

Get it? No? Too bad.

Get it? No? Too bad.

Apparently the choices that Linus et. al made when designing git are sensible – if, of course, you understand the internal structure of git and how it stores your data. I’m afraid I’m only halfway through PeepCode’s great Git Internals, so I can’t comment on whether that statement is true. But I have to admit that if I had to read a $9, 100-page PDF to learn every new tool I downloaded, I would have no time and no money.

This brings us to another of git’s faults: its documentation is terrible. Man pages are no longer a sufficient replacement for a good, well-updated wiki or reference work; git’s wiki still has a long, long way to go. Add in the fact that since, like many OS X users, I installed git through MacPorts, only the main git tool comes with a man page, leaving me to consult the Web to find out exactly how to format revision specifiers. I’ve observed that developers that were able to learn git from colleagues already familiar with git and its internals tend to have a higher opinion of it, in contrast to people such as myself that had to waste a lot of time digging around through Google and the man pages.

However, considering the fact that git is supposed to be a platform, one would suppose that it would have clear, bridgable functions to reuse in your own C projects and bridge to other languages. One would be completely wrong – libgit.a is a joke, and the Ruby git gem (and my own vastly-inferior Nu/Git bindings) depend on running shell commands and parsing the output, which really gets quite tiresome after a while. The fact that console output may differ from platform to platform and any new feature may change the format of console output makes me very reluctant to commit to maintaining my Nu/Git bridge. (I swear, that’s the reason. It’s not because I’m lazy.)

In conclusion, Git is perfect for command-line wizards, people with large teams and complicated projects, and those who need their DVCS to be endlessly configurable. Certain developers have a workflow which, when interrupted, causes much grief and lamentation – if that description fits you, then git is almost certainly what you want, because it can be molded to fit the most esoteric workflow. Solo developers and those accustomed to working with centralized VCS’s may find git to be hostile, unfriendly and needlessly complex. When I work on a large project with many committers, I prefer git and GitHub.

Mercurial is James Bond


mercurial |mərˌkyoŏrēəl|
adjective
1) Subject to sudden or unpredictable changes of mood or mind: his mercurial temperament.

Though there have been many unfortunate open-source project names, Mercurial (also referred to by its command-line-tool name, hg) is both apt and unfortunate: though it is definitely speedy, both in terms of learning curve and execution speed, it is also at times inconsistent, maddening, and unpredictable. Mercurial is like James Bond: though they are not suited for each and every job, put them in a situation for which they are prepared and you will get things done. (If your programming job is as exciting as a Bond movie, please get in touch with me right away when one of your programmers is killed in action.)

In contrast to git’s philosophy of providing a flexible platform built out of individual components, Mercurial is monolithic and (relatively) inflexible. Developers who like to keep their system clean will probably appreciate the fact that hg installs one binary in contrast to the 144 that make up git, and developers who think that git’s ability to edit your previous commits is moronic, unnecessary, and dangerous will appreciate the simplicity hg provides by omitting that particular feature.

Compared to git, hg’s branching, merging and tagging systems are equally powerful and only slightly slower. The only current flaw in Mercurial’s branching system – and sweet crouching Jesus, is it ever a huge flaw – is that deleting named branches is unbelievably difficult: as far as I can tell, the only way to do so is to learn and enable the patch-queuing system (about which I have heard raves, but have not had the time yet to sit down and grok) and use the hg strip command, or install the local-branches extension. Selenium currently recommend you use tags instead of branches, which practically redefines the concept of a half-assed solution.

Despite that glaring flaw, the rest of hg is excellent. It functions almost identically to Subversion in the commands that it shares, and the new concepts – branching, merging, etc. – are easily learned and intuitive. Whereas I’m still learning how to do relatively basic things in git, I learned pretty much all of hg’s functionality in about a day. If you’re familiar with Subversion, transitioning to Mercurial should be a piece of cake – the functions you’re familiar with will be there, and the new functions are easy-to-learn and well-documented.

Though I’ve never tried to integrate Mercurial’s functionality into my own projects, I hear that since it’s written in Python it’s very easy just to import its classes and call them programatically rather than parse the output of shell scripts. I wanted to write a Mercurial frontend for OS X (I was planning to call it ‘hermetic’ – get the elaborate literary pun?), but the viral nature of the GPL discouraged me – since no company has granted me stock options for my code, I’m a little reluctant to just give away the fruits of my labors. Mercurial’s answer to GitHub is BitBucket, which I have not tried yet. If I do, I will update this entry posthaste.

In conclusion, Mercurial is the yin to git’s yang: those such as myself who are constantly experimenting with new ways to work and write code will object less to the restrictions that hg imposes on workflows. After switching to Mercurial for a small two-person project last year, my collaborator observed that Mercurial feels a lot more Mac-like – usability and smoothness of operation trump Unix philosophy when necessary. If I don’t have to share my code with anyone, I tend to use Mercurial in order to get things done faster.

So, What’s My Point?

To paraphrase Colin Wheeler, it’s OK to proselytize to those who have not switched to a distrubuted VCS yet, but trying to convert a git user to Mercurial (or vice-versa) is a waste of everyone’s time and energy. If you want to switch to a DVCS, then here are five easy steps:

  1. Evaluate your workflow and decide which tool suits you best.
  2. Learn how to use your chosen tool as well as you possibly can.
  3. Help newbies to make the transition.
  4. Shut up about the tools you use and write some code.

August 7, 2008 at 7:12 pm 27 comments


About Me



I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.

If you like this, you might want to check out my Twitter or Tumblr, both of which are occasionally about code.

Blog Stats

  • 733,784 hits