Posts filed under ‘rant’

Git vs. Mercurial: Please Relax

Everyone’s up in arms to embrace distributed version control as the new must-have tool for the developer in the know. Though many people have not yet migrated from Subversion, those that have almost invariably extoll the virtues of their particular choice. But though all of the major DVCS’s have features that set them above the previous generation of centralized systems, none stands head-and-shoulders above the others as Subversion does among the last generation: each of them was designed for a specific purpose, and each of them will serve those with different habits, workflows and development styles differently. Having used both git and Mercurial for the better part of a year, I’ve had the opportunity to compare the two. It saddened me to see a Twitter-based debate flamewar erupt over which is better, so I thought I’d do my best to try and ease the tension – with analogies!

Git is MacGyver


great man or greatest man?



Git’s design philosophy is unmistakably that of Unix: unlike Subversion, CVS, or Mercurial, git is not one monolithic binary but a multitude of individual tools, ranging from high-level “porcelain” commands such as git-pull, git-merge, and git-checkout to low-level “plumbing” commands such as git-apply, git-hash-object and git-merge-file. So, like MacGyver, you can do just about anything you need with Git – this includes totally awesome Wiki engines, issue trackers, filesystems, sysadmin tools – everything short of fuse repair:

As such, git is not so much a version control system as it is a tool for building your own version-controlled workflow. For example, when faced with the fact that no git tool performs the equivalent of hg addremove – a useful Mercurial command that adds all untracked files and removes all missing files – I found one line to a script originally written by James Robey:

#!/usr/bin/env bash
# git-addremove
git add .
git ls-files -deleted | xargs git rm

Git’s branching, tagging, merging, and rebasing are near flawless: git’s merging algorithm is close to omniscient, having once merged 12 Linux kernel patches simultaneously. Additionally, git provides you with tools to go back in time and edit your commit history – useful for those of us who have left certain critical elements out of a commit and had to quickly recommit with a helpful message such as “oops”. (And when I say “those of us”, I mean “every developer, ever”.) Personally, I elect to only use this feature to edit my last commit (using git commit --amend); I have never needed or wanted to meddle further with the past. git is also extremely fast thanks to its C codebase.

There is no better emblem of git’s flexibility than GitHub. GitHub’s rise to success has been positively meteoric, and with good reason. It’s a brilliantly-designed site that serves as more than a pretty, browsable frontend to my source tree in that it brings a social aspect to programming – using Git, I can fork anyone’s project, make my changes, petition for them to be included in the main repository, and pull other people’s changes to mine. Though it took a while for me to adjust to the anarchic notion of every user having their own equally-valid fork of a project – shouldn’t there be one definitive version of a project? – I realized its potential when working with other contributors to Nu. Add the fact that GitHub is one of the most solid and reliable services I’ve ever used, and you’ve got what very well might be the deal-breaker in the fight for DVCS dominance.

On the other hand, migrating from Subversion/CVS to git requires a lot of work. Linus has made it clear that he disagrees with the fundamental ideas behind Subversion/CVS, referring to SVN as “the most pointless project ever started”. As such, the git project has consciously made no effort to make the migration to git easy: the revert command in Subversion resets your current working copy to the last commit, but in git undoes a supplied patch and commits the changes needed to remove that patch. (The equivalent command for svn revert in git is git reset --hard HEAD^.) Whining about this on the git mailing list is a little like this:

Get it? No? Too bad.

Get it? No? Too bad.

Apparently the choices that Linus et. al made when designing git are sensible – if, of course, you understand the internal structure of git and how it stores your data. I’m afraid I’m only halfway through PeepCode’s great Git Internals, so I can’t comment on whether that statement is true. But I have to admit that if I had to read a $9, 100-page PDF to learn every new tool I downloaded, I would have no time and no money.

This brings us to another of git’s faults: its documentation is terrible. Man pages are no longer a sufficient replacement for a good, well-updated wiki or reference work; git’s wiki still has a long, long way to go. Add in the fact that since, like many OS X users, I installed git through MacPorts, only the main git tool comes with a man page, leaving me to consult the Web to find out exactly how to format revision specifiers. I’ve observed that developers that were able to learn git from colleagues already familiar with git and its internals tend to have a higher opinion of it, in contrast to people such as myself that had to waste a lot of time digging around through Google and the man pages.

However, considering the fact that git is supposed to be a platform, one would suppose that it would have clear, bridgable functions to reuse in your own C projects and bridge to other languages. One would be completely wrong – libgit.a is a joke, and the Ruby git gem (and my own vastly-inferior Nu/Git bindings) depend on running shell commands and parsing the output, which really gets quite tiresome after a while. The fact that console output may differ from platform to platform and any new feature may change the format of console output makes me very reluctant to commit to maintaining my Nu/Git bridge. (I swear, that’s the reason. It’s not because I’m lazy.)

In conclusion, Git is perfect for command-line wizards, people with large teams and complicated projects, and those who need their DVCS to be endlessly configurable. Certain developers have a workflow which, when interrupted, causes much grief and lamentation – if that description fits you, then git is almost certainly what you want, because it can be molded to fit the most esoteric workflow. Solo developers and those accustomed to working with centralized VCS’s may find git to be hostile, unfriendly and needlessly complex. When I work on a large project with many committers, I prefer git and GitHub.

Mercurial is James Bond

mercurial |mərˌkyoŏrēəl|
1) Subject to sudden or unpredictable changes of mood or mind: his mercurial temperament.

Though there have been many unfortunate open-source project names, Mercurial (also referred to by its command-line-tool name, hg) is both apt and unfortunate: though it is definitely speedy, both in terms of learning curve and execution speed, it is also at times inconsistent, maddening, and unpredictable. Mercurial is like James Bond: though they are not suited for each and every job, put them in a situation for which they are prepared and you will get things done. (If your programming job is as exciting as a Bond movie, please get in touch with me right away when one of your programmers is killed in action.)

In contrast to git’s philosophy of providing a flexible platform built out of individual components, Mercurial is monolithic and (relatively) inflexible. Developers who like to keep their system clean will probably appreciate the fact that hg installs one binary in contrast to the 144 that make up git, and developers who think that git’s ability to edit your previous commits is moronic, unnecessary, and dangerous will appreciate the simplicity hg provides by omitting that particular feature.

Compared to git, hg’s branching, merging and tagging systems are equally powerful and only slightly slower. The only current flaw in Mercurial’s branching system – and sweet crouching Jesus, is it ever a huge flaw – is that deleting named branches is unbelievably difficult: as far as I can tell, the only way to do so is to learn and enable the patch-queuing system (about which I have heard raves, but have not had the time yet to sit down and grok) and use the hg strip command, or install the local-branches extension. Selenium currently recommend you use tags instead of branches, which practically redefines the concept of a half-assed solution.

Despite that glaring flaw, the rest of hg is excellent. It functions almost identically to Subversion in the commands that it shares, and the new concepts – branching, merging, etc. – are easily learned and intuitive. Whereas I’m still learning how to do relatively basic things in git, I learned pretty much all of hg’s functionality in about a day. If you’re familiar with Subversion, transitioning to Mercurial should be a piece of cake – the functions you’re familiar with will be there, and the new functions are easy-to-learn and well-documented.

Though I’ve never tried to integrate Mercurial’s functionality into my own projects, I hear that since it’s written in Python it’s very easy just to import its classes and call them programatically rather than parse the output of shell scripts. I wanted to write a Mercurial frontend for OS X (I was planning to call it ‘hermetic’ – get the elaborate literary pun?), but the viral nature of the GPL discouraged me – since no company has granted me stock options for my code, I’m a little reluctant to just give away the fruits of my labors. Mercurial’s answer to GitHub is BitBucket, which I have not tried yet. If I do, I will update this entry posthaste.

In conclusion, Mercurial is the yin to git’s yang: those such as myself who are constantly experimenting with new ways to work and write code will object less to the restrictions that hg imposes on workflows. After switching to Mercurial for a small two-person project last year, my collaborator observed that Mercurial feels a lot more Mac-like – usability and smoothness of operation trump Unix philosophy when necessary. If I don’t have to share my code with anyone, I tend to use Mercurial in order to get things done faster.

So, What’s My Point?

To paraphrase Colin Wheeler, it’s OK to proselytize to those who have not switched to a distrubuted VCS yet, but trying to convert a git user to Mercurial (or vice-versa) is a waste of everyone’s time and energy. If you want to switch to a DVCS, then here are five easy steps:

  1. Evaluate your workflow and decide which tool suits you best.
  2. Learn how to use your chosen tool as well as you possibly can.
  3. Help newbies to make the transition.
  4. Shut up about the tools you use and write some code.

August 7, 2008 at 7:12 pm 27 comments

Five Things that Suck About Objective-C and Cocoa

Things have been quiet here in this blog. Too quiet. As such, I’m keeping the name-five-things-you-hate-about-a-language-you-like ball rolling, having seen it rolled with zest and vigor by brian d foy, Titus, Jacob Kaplan-Moss and Vincent.

Without further ado:

Five Things that Suck About Objective-C/Cocoa:

  1. Syntax for NSString literals. For the uninitiated, in Objective-C code enclosing "insomnia" in simple double-quotes creates a C-style char[] string; if you wish to use the far more powerful and versatile Objective-C NSString class, you must add an @ (making @"insomnia"). Backwards-compatibility with C is a good and useful thing, but why require more keystrokes to do the most commonly-used thing? I myself last weekend puzzled over a wonderfully non-specific “invalid reciever” error for a long time before realizing that I forgot an @ when sending strings to an arrayWithObjects: method. Aside from that, why use the @-sign as a prefix for NSStrings when it’s used in a plethora of other places, such as @interface, @implementation, @end and @selector? Though 90% of my @-key-presses in Textmate are prefixes to NSStrings, I can’t have a keypress of @ automatically expand to @”” – there are too many other things to do with the poor little @-sign. The oft-neglected | (pipe or vertical bar, I’ve heard both) character is far less disruptive to the flow of typing.
  2. reallyLongAndCamelCasedMethodNamesGetAnnoying. I refer specifically to the lovely NSWorkspace method openURLs: withAppBundleIdentifier: options: additionalEventParamDescriptor: launchIdentifiers:
    And ObjC method names can be concise yet informative – take for example NSString’s compare: options: range: locale:. I must admit, this complaint is not entirely valid, especially considering Textmate/XCode’s fancy code completion.
  3. No operator overloading. Come on, guys – why reject this crucial part of Smalltalk heritage? I, for one, am sick of writing objectAtIndex and objectForKey: as compared to Python’s []. Though Smalltalk allows one to define new operators, I’d be perfectly happy to settle for a few overloadable operators (string concatenation is desperately needed).
  4. Mysterious helper methods. I didn’t know of the existence of NSHomeDirectory(), NSTemporaryDirectory(), or NSClassFromString() until very recently. True, this is my fault, but I think that F-Script’s idea of storing all of these methods in a singleton System object is excellent, and much more in line with Objective-C’s Smalltalk heritage. (Actually, I have a half-finished ObjC class that makes NSBeep() and all those other miscellaneous C functions into class methods; if there’s any interest, I’ll finish and release it. I suppose that makes this complaint invalid. Oh well.)
  5. File management is a mess. Essential code is scattered throughout NSWorkspace (in all its brain-dead glory), NSFileManager, NSFileHandle, NSPipe, NSDirectoryEnumerator, and NSData – few things are as infuriating as hunting down the correct class that does exactly what I want. (Actually, no. Finding a better solution after thirty minutes of hacking around some perceived inadequacy is worse.

So there you have it. To be honest, it took me quite a while to write this, mainly because Objective-C is such a great language and Cocoa is such a great set of libraries. I suppose that the imperfections in a consistently useful and friendly toolkits stand out, and in retrospect I sort of feel guilty for my picky attitude. After all, it could be much, much worse.

April 5, 2007 at 11:59 pm 3 comments

Why I Dig Objective-C

I apologize if this entry is incoherent or trivial. But I need to build up the habit of blogging.

I recently picked up a copy of Refactoring. I hadn’t heard of that book before I read Steve Yegge’s thoughts on it – indeed, my entire concept of refactoring had previously been limited to Eclipse’s built-in tools. To describe this book as an eye-opener would be an understatement; not only has it taught me many coding strategies (upon reading the entry for Introduce Null Object, I practically screamed “Why didn’t I think of that?”), but it’s also changed many of my ideas about the way I code.

Fowler’s clear, direct writing is one of the strengths of Refactoring; during the book, he introduces the concept of the code smell – a term which I had not heard before, and which perfectly describes a situation with which I have been familiar ever since I started coding – and points out that a sure-fire sign of smelly code is the presence of many comments; if you need to explain a method in excruciating detail, then you’re probably made it too complex.

One of the reasons why I adore Eclipse for Java development is because it shows JavaDoc attributes in its Intellisense methods – I don’t need to navigate through the Java API when I can hit Ctrl+Space and see the arguments the method takes. Without Eclipse, I have to either rely on memory (and in my opinion, life’s too short to memorize the proper sequence of arguments that a proper BufferedReader instance takes) or waste time navigating the API docs.

Objective-C, on the other hand, doesn’t have this – because method signatures can be broken up into multiple pieces. Take a look at these two method signatures – taken straight from the documentation:

- (NSRange)rangeOfString:(NSString *)subString options:(unsigned)mask range:(NSRange)aRange


public NSRange rangeOfString(String s, int i, NSRange nsrange)

When looking at the first method signature, I know that the second object will be an integer that controls the masking – simply by virtue of the fact that, like its parent Smalltalk, its methods recieve messages via keyword messaging. I can also make the assumption – and it is only an assumption – that - (NSRange)rangeOfString:(NSString *)subString options:(unsigned)mask and - (NSRange)rangeOfString:(NSString *)subString are valid methods. With the Java API, on the other hand, I have no clue what the second argument does; if I had to know, I’d need to be using Eclipse or XCode, not Emacs or Textmate. Textmate’s Cocoa completions don’t show any HeaderDoc information – but that doesn’t matter: the keywords tell me what the arguments are used for.

It seems to me that Objective-C’s syntax lends itself to clarity by its very nature; Java, on the other hand, depends on other programmers being clear and helpful in their method signatures. And as a short perusal through The Daily WTF reveals, programmers can be very, very unhelpful at times.

January 12, 2007 at 6:19 pm 1 comment

Blocks != Functional Programming

Joel Spolsky is one of my heroes. He has a vast amount of insightful articles that rank among the clearest and most relevant software writing today, and his blog gets more hits in a day than mine ever will. (Speaking of which, I hit 2000 visitors yesterday – around 10x more than I ever thought I’d get.) He’s a very smart cookie, and when he speaks, people listen. But last week, while browsing the top Reddit articles of all time, I was surprised to see his article Can Your Programming Language Do This? at #4. While it’s a good primer on Javascript abstraction, I don’t think it deserves as many points as it recieved. I spent the next few days thinking about why this article bothered me so much – Joel certainly didn’t say anything untrue, attack any favorite language of mine, or make some outlandish claim. But then I realized that Joel’s article fit together in a pattern of recent articles, all of which bothered me slightly.

Here’s what I realized: it seems that every language under the sun is being evangelized as an excellent functional programming language simply because it supports a few paradigms from FP.

Or, restated: Anonymous functions do not a functional language make.

The most egregious example of a pundit claiming a language is functional when it’s clearly not is Eric Kidd’s well-known Why Ruby is an acceptable Lisp. Kidd tells us explicitly that Ruby is a denser functional language than Lisp – and I’ll be the first one to admit that if I were to debate the “denser” part of that sentence, I wouldn’t know what I was talking about.

But Ruby is not functional – Wikipedia calls it a reflective, object-oriented programming language, and I agree with them. Yes, you can have block arguments to methods, continuations, generators, reflection, and metaprogramming – but it isn’t functional, for two reasons.

1. It’s hard to carry around functions as objects.

I really don’t know why Ruby hates parentheses so much – it’s probably part of its Perl heritage. In Ruby, you can call methods without sticking superfluous parentheses in there – take a look at this Python code:

" I'll write about Cocoa soon; disaster struck the app I was writing ".strip().lower().split()

Now take a look at the equivalent Ruby code:

" Apple's releasing a tool
with XCode 3 which completely supersedes my Cocoa app - so I'm very depressed right now ".strip.downcase.split

Though you could put parentheses in front of strip, downcase, and split, Ruby will work just fine without them. Now, this feature makes for far fewer parentheses, thereby making code significantly more readable. But what if I want a previously-declared function as an argument? If I type in the name, Ruby will just evaluate the function. Sure, I could use the kludge that is Symbol.to_proc, but that’s ugly – and it wraps the function inside a Proc object, which has to be called with the call(*args) method. And that’s just ugly. In Python, all you need to do is type the function’s name to use it as an object, and append a pair of parentheses if you need to call it.

2. Variables are.

A purely functional language only has immutable variables. Ruby doesn’t. (Yes, I know LISP isn’t purely functional. But it adheres to so many other FP paradigms that we can overlook that.)

But I’m getting distracted, so I’ll cut the above point short.

Anyway, what I wanted to say was this – just because your pet language has support for anonymous functions/closures doesn’t make it a functional language. Sure, Python has lambda and list comprehensions (which are taken from Haskell, a purely functional language) – but it’s not functional, it’s object-oriented. Yes, Ruby has blocks (even if you do have to wrap them in Procs), but it’s not functional. Javascript may have support for anonymous functions, but its syntax can be traced back to Algol and the birth of imperative programming language. Hell, even Objective-C has blocks if you include the F-Script framework, and it’s the farthest thing from functional there is.

In conclusion, don’t say your language is a functional one just because you borrowed a few ideas from Lisp. If you want a real functional language, try OCaml, Haskell, ML, or Scheme. Calling imperative/OOP languages functional just makes the term meaningless.

January 2, 2007 at 6:27 pm 10 comments

About Me

I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.

If you like this, you might want to check out my Twitter or Tumblr, both of which are occasionally about code.

Blog Stats

  • 674,937 hits