Posts filed under ‘programming’

Git vs. Mercurial: Please Relax

Everyone’s rushing to embrace distributed version control as the new must-have tool for the developer in the know. Though many people have not yet migrated from Subversion, those who have almost invariably extol the virtues of their particular choice. But though all of the major DVCSs have features that set them above the previous generation of centralized systems, none stands head-and-shoulders above the others as Subversion does among the last generation: each of them was designed for a specific purpose, and each of them will serve those with different habits, workflows and development styles differently. Having used both git and Mercurial for the better part of a year, I’ve had the opportunity to compare the two. It saddened me to see a Twitter-based flamewar erupt over which is better, so I thought I’d do my best to ease the tension – with analogies!

Git is MacGyver

 

[Image caption: great man or greatest man?]

Git’s design philosophy is unmistakably that of Unix: unlike Subversion, CVS, or Mercurial, git is not one monolithic binary but a multitude of individual tools, ranging from high-level “porcelain” commands such as git-pull, git-merge, and git-checkout to low-level “plumbing” commands such as git-apply, git-hash-object and git-merge-file. So, like MacGyver, you can do just about anything you need with Git – this includes totally awesome Wiki engines, issue trackers, filesystems, sysadmin tools – everything short of fuse repair.

As such, git is not so much a version control system as it is a tool for building your own version-controlled workflow. For example, when faced with the fact that no git tool performs the equivalent of hg addremove – a useful Mercurial command that adds all untracked files and removes all missing files – I found one line to a script originally written by James Robey:


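#!/usr/bin/env bash
# git-addremove: stage new and changed files, then remove tracked files that are missing
git add .
# --deleted lists tracked files that no longer exist on disk; -z and -0 keep odd filenames intact
git ls-files --deleted -z | xargs -0 git rm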

Git’s branching, tagging, merging, and rebasing are near flawless: git’s merging algorithm is close to omniscient, having once merged 12 Linux kernel patches simultaneously. Additionally, git provides you with tools to go back in time and edit your commit history – useful for those of us who have left certain critical elements out of a commit and had to quickly recommit with a helpful message such as “oops”. (And when I say “those of us”, I mean “every developer, ever”.) Personally, I elect to only use this feature to edit my last commit (using git commit --amend); I have never needed or wanted to meddle further with the past. git is also extremely fast thanks to its C codebase.

There is no better emblem of git’s flexibility than GitHub. GitHub’s rise to success has been positively meteoric, and with good reason. It’s a brilliantly-designed site that serves as more than a pretty, browsable frontend to my source tree in that it brings a social aspect to programming – using Git, I can fork anyone’s project, make my changes, petition for them to be included in the main repository, and pull other people’s changes into mine. Though it took a while for me to adjust to the anarchic notion of every user having their own equally-valid fork of a project – shouldn’t there be one definitive version of a project? – I realized its potential when working with other contributors to Nu. Add the fact that GitHub is one of the most solid and reliable services I’ve ever used, and you’ve got what very well might be the deciding factor in the fight for DVCS dominance.

On the other hand, migrating from Subversion/CVS to git requires a lot of work. Linus has made it clear that he disagrees with the fundamental ideas behind Subversion/CVS, referring to SVN as “the most pointless project ever started”. As such, the git project has consciously made no effort to make the migration to git easy: the revert command in Subversion discards the uncommitted changes in your working copy, but in git it takes an existing commit and creates a new commit that undoes it. (The equivalent of svn revert in git is git reset --hard HEAD for the whole working tree, or git checkout -- <path> for individual files.) Whining about this on the git mailing list is a little like this:

Get it? No? Too bad.


Apparently the choices that Linus et al. made when designing git are sensible – if, of course, you understand the internal structure of git and how it stores your data. I’m afraid I’m only halfway through PeepCode’s great Git Internals, so I can’t comment on whether that statement is true. But I have to admit that if I had to read a $9, 100-page PDF to learn every new tool I downloaded, I would have no time and no money.

This brings us to another of git’s faults: its documentation is terrible. Man pages are no longer a sufficient replacement for a good, well-updated wiki or reference work; git’s wiki still has a long, long way to go. To make matters worse, like many OS X users I installed git through MacPorts, and only the main git tool came with a man page, leaving me to consult the Web to find out exactly how to format revision specifiers. I’ve observed that developers who were able to learn git from colleagues already familiar with git and its internals tend to have a higher opinion of it than people such as myself, who had to waste a lot of time digging through Google and the man pages.

However, considering the fact that git is supposed to be a platform, one would suppose that it would have clear, bridgeable functions to reuse in your own C projects and bridge to other languages. One would be completely wrong – libgit.a is a joke, and the Ruby git gem (and my own vastly-inferior Nu/Git bindings) depend on running shell commands and parsing the output, which really gets quite tiresome after a while. The fact that console output may differ from platform to platform and any new feature may change the format of console output makes me very reluctant to commit to maintaining my Nu/Git bridge. (I swear, that’s the reason. It’s not because I’m lazy.)
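To make the fragility concrete, here is a minimal Python sketch of the parse-the-output approach. It assumes the layout printed by git status --porcelain (a two-character status code, a space, then the path); the sample text and the parse_status helper are made up for illustration and are not part of any real git binding.

```python
def parse_status(output):
    """Parse `git status --porcelain`-style text into (code, path) pairs.

    Assumes each line is a two-character status code, one space, then the
    path. Renames, quoted paths and other special cases are ignored here.
    """
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        code, path = line[:2], line[3:]
        entries.append((code, path))
    return entries


# Hypothetical sample output; a real binding would capture this from a subprocess.
sample = " M README\n?? notes.txt\nD  old.c\n"
for code, path in parse_status(sample):
    print(repr(code), path)
```

Every assumption in that docstring is a place where a change in the tool’s output would silently break the binding, which is exactly the maintenance burden described above.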

In conclusion, Git is perfect for command-line wizards, people with large teams and complicated projects, and those who need their DVCS to be endlessly configurable. Certain developers have a workflow which, when interrupted, causes much grief and lamentation – if that description fits you, then git is almost certainly what you want, because it can be molded to fit the most esoteric workflow. Solo developers and those accustomed to working with centralized VCSs may find git to be hostile, unfriendly and needlessly complex. When I work on a large project with many committers, I prefer git and GitHub.

Mercurial is James Bond


mercurial |mərˌkyoŏrēəl|
adjective
1) Subject to sudden or unpredictable changes of mood or mind: his mercurial temperament.

Though there have been many unfortunate open-source project names, Mercurial (also referred to by its command-line-tool name, hg) is both apt and unfortunate: though it is definitely speedy, both in terms of learning curve and execution speed, it is also at times inconsistent, maddening, and unpredictable. Mercurial is like James Bond: though neither is suited to each and every job, put either in a situation for which it is prepared and you will get things done. (If your programming job is as exciting as a Bond movie, please get in touch with me right away when one of your programmers is killed in action.)

In contrast to git’s philosophy of providing a flexible platform built out of individual components, Mercurial is monolithic and (relatively) inflexible. Developers who like to keep their system clean will probably appreciate the fact that hg installs one binary in contrast to the 144 that make up git, and developers who think that git’s ability to edit your previous commits is moronic, unnecessary, and dangerous will appreciate the simplicity hg provides by omitting that particular feature.

Compared to git, hg’s branching, merging and tagging systems are equally powerful and only slightly slower. The only current flaw in Mercurial’s branching system – and sweet crouching Jesus, is it ever a huge flaw – is that deleting named branches is unbelievably difficult: as far as I can tell, the only way to do so is to learn and enable the patch-queuing system (about which I have heard raves, but have not had the time yet to sit down and grok) and use the hg strip command, or install the local-branches extension. Selenic currently recommend you use tags instead of branches, which practically redefines the concept of a half-assed solution.

Despite that glaring flaw, the rest of hg is excellent. It functions almost identically to Subversion in the commands that it shares, and the new concepts – branching, merging, etc. – are easily learned and intuitive. Whereas I’m still learning how to do relatively basic things in git, I learned pretty much all of hg’s functionality in about a day. If you’re familiar with Subversion, transitioning to Mercurial should be a piece of cake – the functions you’re familiar with will be there, and the new functions are easy-to-learn and well-documented.

Though I’ve never tried to integrate Mercurial’s functionality into my own projects, I hear that since it’s written in Python it’s very easy just to import its classes and call them programmatically rather than parse the output of shell commands. I wanted to write a Mercurial frontend for OS X (I was planning to call it ‘hermetic’ – get the elaborate literary pun?), but the viral nature of the GPL discouraged me – since no company has granted me stock options for my code, I’m a little reluctant to just give away the fruits of my labors. Mercurial’s answer to GitHub is BitBucket, which I have not tried yet. If I do, I will update this entry posthaste.

In conclusion, Mercurial is the yin to git’s yang: those such as myself who are constantly experimenting with new ways to work and write code will object less to the restrictions that hg imposes on workflows. After switching to Mercurial for a small two-person project last year, my collaborator observed that Mercurial feels a lot more Mac-like – usability and smoothness of operation trump Unix philosophy when necessary. If I don’t have to share my code with anyone, I tend to use Mercurial in order to get things done faster.

So, What’s My Point?

To paraphrase Colin Wheeler, it’s OK to proselytize to those who have not switched to a distributed VCS yet, but trying to convert a git user to Mercurial (or vice-versa) is a waste of everyone’s time and energy. If you want to switch to a DVCS, then here are four easy steps:

  1. Evaluate your workflow and decide which tool suits you best.
  2. Learn how to use your chosen tool as well as you possibly can.
  3. Help newbies to make the transition.
  4. Shut up about the tools you use and write some code.

August 7, 2008 at 7:12 pm 27 comments

How Tim Burks and Nu Stole the Show at C4[1]

Edit: Fixed some factual inaccuracies about the language itself.

Tim Burks, noted contributor to RubyCocoa and creator of RubyObjC, gave a talk at C4[1] about his experiences with creating a Ruby <-> Objective-C bridge, and the problems he overcame in doing so. It was an interesting presentation, and we were all suitably appreciative when he showed his custom visual chip-design software written in Ruby with a Cocoa interface.

And then he dropped a bombshell.

For the past year, Tim’s been working on a new dialect of Lisp – written in Objective-C – called Nu. Here are its features (more precisely, here are the ones that I remember; I was so awestruck that many went over my head):

  • Interpreted, running on top of Objective-C.
  • Scheme-y syntax. Everything is an s-expression (data is code, code is data). Variable assignment is done without let-clauses (which are a pain in the ass) – all one has to do is (set varname value).
  • Variable sigils to indicate variable scope.
  • True object-orientation – everything is an object.
  • True closures with the do-statement – which, incidentally, is how Ruby should have done it.
  • Macros. HOLY CRAP, MACROS! When Tim showed us an example of using define-macro for syntactical abstraction, Wolf Rentzsch and I started spontaneously applauding. His example even contained an example of absolutely beautiful exception handling that should be familiar to anyone with any ObjC or Ruby experience.
  • Symbol generation (__) to make macros hygienic and prevent variable name conflicts.
  • Nu data objects are Cocoa classes – the strings are NSStrings, the arrays NSArrays, etc.
  • Ability to create new Obj-C classes from inside Nu.
  • Interfaces with Cocoa libraries – you can access Core Data stores from within Nu in a much easier fashion than pure ObjC, thanks to Tim’s very clever idea of using a $session global to store the NSManagedObjectModel, NSManagedObjectContext, and NSPersistentStoreCoordinator.
  • Ruby-style string interpolation with #{}.
  • Regular expressions.
  • Positively drool-inducing metaprogramming, including a simulation of Ruby’s method_missing functionality.
  • A web-based templating system similar to ERb in 80 lines of Nu code – compare that with the 422 lines of code in erb.rb.

Tim showed us a MarsEdit-like blog editor written entirely in Nu, using Core Data as its backend – and then showed us the built-in Nu web server inside that program, complete with beautiful CSS/HTML/Ajax.

As F-Script is to Smalltalk, so Nu is to Lisp. Tim said that he hopes someday to open-source Nu; if he does, he will introduce what is quite possibly the most exciting development in the Lisp-related community in a long time. I don’t think I speak for just myself when I say I cannot wait to get my hands on it.

August 12, 2007 at 3:09 pm 4 comments

Map, Filter and Reduce in Cocoa

After working in Scheme, Python or Ruby, all of which (more or less) support function objects and the map(), filter() and reduce() functions, languages that lack them seem somewhat cumbersome. Cocoa manages to implement these paradigms almost correctly.

map()

One would think that Objective-C’s ability to pass methods around in the form of selectors would make writing a map() method easy. Observe, however, the crucial differences in NSArray’s closest equivalent to map(). (For those unfamiliar with it, map(), when given an array and a method/function taking one argument, returns the result of applying the function to each item of the array.)

From the NSArray documentation:

makeObjectsPerformSelector:

Sends the aSelector message to each object in the array, starting with the first object and continuing through the array to the last object.

- (void)makeObjectsPerformSelector:(SEL)aSelector
Discussion

The aSelector method must not take any arguments. It shouldn’t have the side effect of modifying the receiving array.

This differs from map() on no fewer than two levels. Firstly, even in the NSMutableArray subclass, the selector is not supposed to modify the receiving array, and the method itself returns void. Frankly, I can think of few situations in which I would need to map a function onto an array and then throw the results away; the point of map() is to be able to apply a function quickly to every element of an array and get back the changes! Secondly, an unaware or hurried programmer would think that this method was implemented so that one could write code like this:

- (void)printAnObject:(id)obj
{
    // Use a literal format string; never pass arbitrary text as NSLog's format.
    NSLog(@"%@", [obj description]);
}

and then do this:
[anArray makeObjectsPerformSelector: @selector(printAnObject:)];

This is not the case – the above code would send the printAnObject: message to each element, not call printAnObject: with each element as its argument. I’m sure that to some it seems obvious, but I, for one, found this to be an insidiously tricky wart.
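For anyone who wants the distinction spelled out, here is a rough Python sketch (Python being one of the languages mentioned at the top of this post). The Item class and both helper functions are made up for illustration: make_objects_perform only approximates what makeObjectsPerformSelector: does, while the map() call shows what the hurried programmer was hoping for.

```python
class Item:
    """Made-up stand-in for an element of the array."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return "Item(%s)" % self.name


def make_objects_perform(items, method_name):
    """Rough analogue of makeObjectsPerformSelector: each element is the
    receiver of the message, and any return value is thrown away."""
    for item in items:
        getattr(item, method_name)()


def print_an_object(obj):
    """The function one hoped would be called once per element."""
    return "printing " + obj.describe()


items = [Item("a"), Item("b")]
make_objects_perform(items, "describe")      # results are discarded
mapped = list(map(print_an_object, items))   # map() passes each element as the argument
print(mapped)
```

In the first call the elements are the receivers; in the second they are the arguments, and the results come back as a new list.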

However, there is a (limited) workaround.

NSArray’s valueForKey: method (somewhat counter-intuitively) returns the result of invoking valueForKey: on each of the array’s elements. As such, one can map KVC accessors onto arrays. For example:

NSArray *arr = [NSArray arrayWithObjects: @"Get", @"CenterStage", @"0.6.2", @"because", @"it", @"rocks", nil];
[arr valueForKey: @"length"]; // returns [3, 11, 5, 7, 2, 5]

This can be used in many helpful ways; sadly, it only works on KVC-compliant properties/methods.

Anyway, moving on…

filter()

With OS X 10.4, Apple introduced the NSArray filteredArrayUsingPredicate: method, which allows one to filter an array based on criteria established by an NSPredicate. Observe:

NSArray *arr = [NSArray arrayWithObjects: @"This", @"is", @"the", @"first", @"CenterStage", @"release", @"I", @"helped", nil];
arr = [arr filteredArrayUsingPredicate: [NSPredicate predicateWithFormat: @"SELF.length > 5"]]; // arr is now [@"CenterStage", @"release", @"helped"]

Verbose, but useful.
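For comparison, here is roughly the same filter in Python, where the predicate is an ordinary function rather than a format string. This is only a sketch for contrast, not a suggestion that Cocoa offers anything like it.

```python
words = ["This", "is", "the", "first", "CenterStage", "release", "I", "helped"]

# Keep only the words longer than five characters, as the predicate above does.
long_words = list(filter(lambda w: len(w) > 5, words))
print(long_words)
```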

reduce()

Frankly, Apple don’t give us any way to do this in pure Cocoa. The best way I’ve found (if you really need this, which is less often than one needs map() and filter()) is to use the F-Script framework and apply the \ operator to an array. Unfortunately, this takes a bit of overhead.
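For readers who have not met reduce(), here is a Python sketch of what is missing: a left fold that walks the array once and combines each element into a running result. The name reduce_left is invented here; it is not a Cocoa or F-Script API, and the check against functools.reduce is only there to show the two agree.

```python
from functools import reduce


def reduce_left(function, items, initial):
    """Hand-rolled left fold: combine items into one value, left to right."""
    accumulator = initial
    for item in items:
        accumulator = function(accumulator, item)
    return accumulator


numbers = [1, 2, 3, 4]
total = reduce_left(lambda acc, n: acc + n, numbers, 0)

# The standard library's reduce() gives the same answer.
assert total == reduce(lambda acc, n: acc + n, numbers, 0)
print(total)
```

A Cocoa version would presumably need a selector taking two arguments plus an explicit loop, which is much of why it feels cumbersome.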

In conclusion, Cocoa and Objective-C almost bring us the joys of functional programming. We can only hope that Leopard and Obj-C 2.0 improve on these in some way.

List comprehensions, anyone?

June 18, 2007 at 10:42 pm 4 comments


About Me



I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.

If you like this, you might want to check out my Twitter or Tumblr, both of which are occasionally about code.
