Posts filed under ‘problems’

Git vs. Mercurial: Please Relax

Everyone’s up in arms to embrace distributed version control as the new must-have tool for the developer in the know. Though many people have not yet migrated from Subversion, those that have almost invariably extoll the virtues of their particular choice. But though all of the major DVCS’s have features that set them above the previous generation of centralized systems, none stands head-and-shoulders above the others as Subversion does among the last generation: each of them was designed for a specific purpose, and each of them will serve those with different habits, workflows and development styles differently. Having used both git and Mercurial for the better part of a year, I’ve had the opportunity to compare the two. It saddened me to see a Twitter-based debate flamewar erupt over which is better, so I thought I’d do my best to try and ease the tension – with analogies!

Git is MacGyver


great man or greatest man?



Git’s design philosophy is unmistakably that of Unix: unlike Subversion, CVS, or Mercurial, git is not one monolithic binary but a multitude of individual tools, ranging from high-level “porcelain” commands such as git-pull, git-merge, and git-checkout to low-level “plumbing” commands such as git-apply, git-hash-object and git-merge-file. So, like MacGyver, you can do just about anything you need with Git – this includes totally awesome Wiki engines, issue trackers, filesystems, sysadmin tools – everything short of fuse repair:

As such, git is not so much a version control system as it is a tool for building your own version-controlled workflow. For example, when faced with the fact that no git tool performs the equivalent of hg addremove – a useful Mercurial command that adds all untracked files and removes all missing files – I found one line to a script originally written by James Robey:

#!/usr/bin/env bash
# git-addremove
git add .
git ls-files -deleted | xargs git rm

Git’s branching, tagging, merging, and rebasing are near flawless: git’s merging algorithm is close to omniscient, having once merged 12 Linux kernel patches simultaneously. Additionally, git provides you with tools to go back in time and edit your commit history – useful for those of us who have left certain critical elements out of a commit and had to quickly recommit with a helpful message such as “oops”. (And when I say “those of us”, I mean “every developer, ever”.) Personally, I elect to only use this feature to edit my last commit (using git commit --amend); I have never needed or wanted to meddle further with the past. git is also extremely fast thanks to its C codebase.

There is no better emblem of git’s flexibility than GitHub. GitHub’s rise to success has been positively meteoric, and with good reason. It’s a brilliantly-designed site that serves as more than a pretty, browsable frontend to my source tree in that it brings a social aspect to programming – using Git, I can fork anyone’s project, make my changes, petition for them to be included in the main repository, and pull other people’s changes to mine. Though it took a while for me to adjust to the anarchic notion of every user having their own equally-valid fork of a project – shouldn’t there be one definitive version of a project? – I realized its potential when working with other contributors to Nu. Add the fact that GitHub is one of the most solid and reliable services I’ve ever used, and you’ve got what very well might be the deal-breaker in the fight for DVCS dominance.

On the other hand, migrating from Subversion/CVS to git requires a lot of work. Linus has made it clear that he disagrees with the fundamental ideas behind Subversion/CVS, referring to SVN as “the most pointless project ever started”. As such, the git project has consciously made no effort to make the migration to git easy: the revert command in Subversion resets your current working copy to the last commit, but in git undoes a supplied patch and commits the changes needed to remove that patch. (The equivalent command for svn revert in git is git reset --hard HEAD^.) Whining about this on the git mailing list is a little like this:

Get it? No? Too bad.

Get it? No? Too bad.

Apparently the choices that Linus et. al made when designing git are sensible – if, of course, you understand the internal structure of git and how it stores your data. I’m afraid I’m only halfway through PeepCode’s great Git Internals, so I can’t comment on whether that statement is true. But I have to admit that if I had to read a $9, 100-page PDF to learn every new tool I downloaded, I would have no time and no money.

This brings us to another of git’s faults: its documentation is terrible. Man pages are no longer a sufficient replacement for a good, well-updated wiki or reference work; git’s wiki still has a long, long way to go. Add in the fact that since, like many OS X users, I installed git through MacPorts, only the main git tool comes with a man page, leaving me to consult the Web to find out exactly how to format revision specifiers. I’ve observed that developers that were able to learn git from colleagues already familiar with git and its internals tend to have a higher opinion of it, in contrast to people such as myself that had to waste a lot of time digging around through Google and the man pages.

However, considering the fact that git is supposed to be a platform, one would suppose that it would have clear, bridgable functions to reuse in your own C projects and bridge to other languages. One would be completely wrong – libgit.a is a joke, and the Ruby git gem (and my own vastly-inferior Nu/Git bindings) depend on running shell commands and parsing the output, which really gets quite tiresome after a while. The fact that console output may differ from platform to platform and any new feature may change the format of console output makes me very reluctant to commit to maintaining my Nu/Git bridge. (I swear, that’s the reason. It’s not because I’m lazy.)

In conclusion, Git is perfect for command-line wizards, people with large teams and complicated projects, and those who need their DVCS to be endlessly configurable. Certain developers have a workflow which, when interrupted, causes much grief and lamentation – if that description fits you, then git is almost certainly what you want, because it can be molded to fit the most esoteric workflow. Solo developers and those accustomed to working with centralized VCS’s may find git to be hostile, unfriendly and needlessly complex. When I work on a large project with many committers, I prefer git and GitHub.

Mercurial is James Bond

mercurial |mərˌkyoŏrēəl|
1) Subject to sudden or unpredictable changes of mood or mind: his mercurial temperament.

Though there have been many unfortunate open-source project names, Mercurial (also referred to by its command-line-tool name, hg) is both apt and unfortunate: though it is definitely speedy, both in terms of learning curve and execution speed, it is also at times inconsistent, maddening, and unpredictable. Mercurial is like James Bond: though they are not suited for each and every job, put them in a situation for which they are prepared and you will get things done. (If your programming job is as exciting as a Bond movie, please get in touch with me right away when one of your programmers is killed in action.)

In contrast to git’s philosophy of providing a flexible platform built out of individual components, Mercurial is monolithic and (relatively) inflexible. Developers who like to keep their system clean will probably appreciate the fact that hg installs one binary in contrast to the 144 that make up git, and developers who think that git’s ability to edit your previous commits is moronic, unnecessary, and dangerous will appreciate the simplicity hg provides by omitting that particular feature.

Compared to git, hg’s branching, merging and tagging systems are equally powerful and only slightly slower. The only current flaw in Mercurial’s branching system – and sweet crouching Jesus, is it ever a huge flaw – is that deleting named branches is unbelievably difficult: as far as I can tell, the only way to do so is to learn and enable the patch-queuing system (about which I have heard raves, but have not had the time yet to sit down and grok) and use the hg strip command, or install the local-branches extension. Selenium currently recommend you use tags instead of branches, which practically redefines the concept of a half-assed solution.

Despite that glaring flaw, the rest of hg is excellent. It functions almost identically to Subversion in the commands that it shares, and the new concepts – branching, merging, etc. – are easily learned and intuitive. Whereas I’m still learning how to do relatively basic things in git, I learned pretty much all of hg’s functionality in about a day. If you’re familiar with Subversion, transitioning to Mercurial should be a piece of cake – the functions you’re familiar with will be there, and the new functions are easy-to-learn and well-documented.

Though I’ve never tried to integrate Mercurial’s functionality into my own projects, I hear that since it’s written in Python it’s very easy just to import its classes and call them programatically rather than parse the output of shell scripts. I wanted to write a Mercurial frontend for OS X (I was planning to call it ‘hermetic’ – get the elaborate literary pun?), but the viral nature of the GPL discouraged me – since no company has granted me stock options for my code, I’m a little reluctant to just give away the fruits of my labors. Mercurial’s answer to GitHub is BitBucket, which I have not tried yet. If I do, I will update this entry posthaste.

In conclusion, Mercurial is the yin to git’s yang: those such as myself who are constantly experimenting with new ways to work and write code will object less to the restrictions that hg imposes on workflows. After switching to Mercurial for a small two-person project last year, my collaborator observed that Mercurial feels a lot more Mac-like – usability and smoothness of operation trump Unix philosophy when necessary. If I don’t have to share my code with anyone, I tend to use Mercurial in order to get things done faster.

So, What’s My Point?

To paraphrase Colin Wheeler, it’s OK to proselytize to those who have not switched to a distrubuted VCS yet, but trying to convert a git user to Mercurial (or vice-versa) is a waste of everyone’s time and energy. If you want to switch to a DVCS, then here are five easy steps:

  1. Evaluate your workflow and decide which tool suits you best.
  2. Learn how to use your chosen tool as well as you possibly can.
  3. Help newbies to make the transition.
  4. Shut up about the tools you use and write some code.

August 7, 2008 at 7:12 pm 27 comments

mogenerator, or how I nearly abandoned Core Data

Unfortunately, my Macbook Pro is out of commission due to a broken fan (I’ve sent it in to Apple; they’d better send it back soon! I’m dying here!), so I’m just going to blog about a mini-renaissance I had when working with Core Data last week.

When I first saw Core Data, I couldn’t believe my eyes. Apple’s developer tools now had a simple way to graphically model as many parts of an MVC application as possible – Interface Builder for the view, Cocoa Bindings (and custom logic) for the controller, and now Core Data for the modeling. Add in the fact that I had heard nothing but praise for it, and I was completely sold. I created the model for my application, prototyped a view, and flipped to the documentation on Core Data’s NSManagedObject.

It was, to say the least, an unpleasant surprise. Though I adored the fact that Core Data would take care of undo/redo, saving, archival formats, and saving data, I didn’t want to start using valueForKey:. setValue: forKey, primitiveValueForKey:, and setPrimitiveValue: forKey: instead of the mutator methods that I had grown to love for their combination of ease-of-use and added maintainability. Sure, I could have made a million subclasses of NSManagedObject, but the idea of doing that manually struck me as tedious – and if there’s anything I loathe, it’s tedium. And updating every object every time I made a change to the Core Data model struck me as a maintainability nightmare.

Though it may reflect poorly on me as a programmer, I considered abandoning Core Data. The magic which it brought to undo/redo/saving/archiving could not overcome the reluctance I had to view all my objects as NSManagedObjects – which, frankly, seems like quite a breach of the Model part of the MVC philosophy.

But hope lay in wait. At the bottom of some Google results about subclassing NSManagedObject, I found this page from Jonathan ‘Wolf’ Rentzsch – one of my idols, both for his coding and his vocal pro-Cocoa advocacy – about an unbelievably clever tool named mogenerator.

In short, mogenerator reads the data you have stored in an .xcdatamodel file, extracts the information about each Entity one has created, creates two subclasses of NSManagedObject for each Entity, changes the .xcdatamodel file automatically so that your Entities inherit from their proper classes, and adds code for all necessary accessor and mutator methods – in short, it removes everything I resented about Core Data.

Why two subclasses? Because if I decide to make a change to the data model, I don’t wont to worry about overwriting my own code with the automatically generated code that mogenerator creates for me. To solve this problem, Wolf has his program use one of the two files for automatically generated code, and allows one to use the second – which extends the first file’s class – for one’s own nefarious purposes. If you ever update the .xcdatamodel file, all you need to do is run mogenerator again, and only the file with the automatically generated code will be overwritten; you can be sure that the custom application logic you’ve written will stay intact.

This is a stunningly useful program, and I don’t know why people aren’t proclaiming it’s merits hither and yon. mogenerator allows one to forgo all compromise with Core Data – you get all the advantages of an NSManagedObject without sacrificing the familiar paradigm of creating specific .c/.h files for each object’s code. My thanks go out to Rentzsch for such an amazing tool.

December 19, 2006 at 6:24 pm 4 comments

It’s Hard Out Here for an MVC Advocate

Edit: I finally got the Interface Builder palette mentioned herein working. I’ll post a screenshot sometime later. I also cleared up some language.

In the past, I have alluded to the fact that I am a diehard Model-View-Controller advocate. I stay remarkably lax on other issues – I don’t mind breaking encapsulation, enjoy both static and dynamic typing, and even advocate paradigms other than OOP for certain applications. However, when it comes to GUI or web application development, I will defend Smalltalk’s Model-View-Controller paradigm to the death. In my still-inchoate Cocoa application, I’m using Interface Builder for the view, Core Data for the model, and Cocoa Bindings + my own code for the controller.

The problem emerges when I need to use the smattering of custom widgets that I’ve selected. Since creating Interface Builder palettes is so difficult, it’s hard to get people to make them – but MVC falls apart the moment you have to exit out of Interface Builder to make visual changes to your own instances of custom widgets.

Therefore, I am stuck with a hard decision. Do I break the MVC design pattern and make a zillion little subclasses of NSView, in which I stick a whole bunch of initialization code? It would be a lot easier, especially when one considers how hard it is to create an IBPalette.

But it’s so ugly! I don’t want a custom-made NSView subclass for each color CTGradient that I want! And until Chad Weider releases a CTGradientWell (please, please, please, pretty please?) making a PTGradientView will be quite difficult.

Bah. Any comments/help/IB Palettes will be appreciated.

December 6, 2006 at 5:47 pm 5 comments

About Me

I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.

If you like this, you might want to check out my Twitter or Tumblr, both of which are occasionally about code.

Blog Stats

  • 707,333 hits