jQuery is a Monad

It’s said that every Haskell programmer writes their own monad tutorial, and with good reason: once you finally understand the definition and capabilities of a monad, you’ll be eager to try and break the mystique surrounding the concept of monads as a whole. To the outsider, monads are an impenetrable barrier to truly understanding Haskell; they’re cursed with a very unfortunate name, have bizarre syntax, and seem to do a thousand things at once. However, monads aren’t hard to understand when you see them in action.

As such, I present to you what may be the most widely-used monadic library in any language: the jQuery library, designed to bring Javascript back to its roots in functional programming and make AJAX and animations easy. The jQuery object, accessible through the jQuery variable or the $ shortcut, allows you to query or make DOM elements using CSS selectors or XPath queries, as shown below:

$("div"); // all div elements
$("span.moveable"); // all spans with the moveable class
$("em,strong"); // all emphasized and strong tags
$("div[img][3]"); // the third div that contains an image

It may be somewhat alarming, but jQuery all but eliminates the need for instance variables thanks to its method chaining. If I wanted to take all the span elements, fade them in, then set their text to “Alert!”, all without jQuery, I would need instance variables to save the found elements and to keep track of elapsed time, and I would have to perform each operation sequentially. In jQuery, it’s much easier:

$("span").fadeIn("slow").text("Alert!");

The returned result of fadeIn is not null, as one might otherwise assume, but the very same jQuery container $("span") upon which it was called. That way, I was able to call the .text("Alert!") method, which likewise returns the same container. Method chaining like this gives an immense dose of power, concision and readability to a previously flabby language. In addition to DOM manipulation, jQuery provides powerful AJAX shortcuts, attractive yet unobtrusive animations, and extensive CSS manipulation.
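
The mechanics behind this are worth spelling out. Here is a minimal sketch of how such chaining could be implemented (the Wrapper type is hypothetical, for illustration only, and is not jQuery's actual source):

```javascript
// A toy chainable wrapper: every method does its work on the wrapped
// elements, then returns `this` so calls can be strung together.
function Wrapper(elements) {
  this.elements = elements;
}

Wrapper.prototype.fadeIn = function () {
  this.elements.forEach(function (el) { el.visible = true; });
  return this; // returning the same wrapper is what makes chaining possible
};

Wrapper.prototype.text = function (str) {
  this.elements.forEach(function (el) { el.text = str; });
  return this;
};

// Usage: both calls operate on the very same wrapped set.
var spans = new Wrapper([{}, {}]);
spans.fadeIn().text("Alert!");
```

The entire trick is that last `return this` line; a method that returned nothing would break the chain immediately.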

The Monadic Laws

Now that you have a basic idea of the concepts upon which jQuery is founded, let’s take a look at the three monadic laws. (The concept of monads and the monadic laws were first codified in the branch of mathematics known as category theory – beware, the previous link is dense as hell.)

The first monadic law is that a monad is a wrapper around another type. In Haskell, one has the IO String type, which is returned from functions that read from files, console input, or system calls – IO is a monad that wraps the String data type. jQuery obviously satisfies this condition, as it wraps DOM nodes retrieved through given queries.

The second monadic law is just as simple: all monads must have a function to wrap themselves around other data types. jQuery clearly has ways to apply itself to DOM nodes – you use the querying facilities to traverse the DOM, and if you’re feeling especially saucy, you can pass the results of document.getElementsByTagName and its siblings to the jQuery object. Haskell refers to this as a type constructor – a function that takes some data and wraps that data inside a new type. (jQuery’s type constructor is its parentheses.)

The third monadic law, and the only one that’s even remotely complicated, is that all monads must be able to feed the value or values that they wrap into another function, as long as that function eventually returns a monad. fadeIn(), text(), and all the other chainable functions are examples of this – they take the elements held inside the jQuery object, perform their function on them, then rewrap them in the jQuery object and return it. And don’t think that you’re limited to the built-in functions on the jQuery object – using the map() function, you can pass an anonymous function that will be called on each DOM element inside the jQuery object. map() will still return the jQuery object upon which it was called.

So, let’s review. Monads are abstract data types that satisfy three conditions:
1) They wrap themselves around other data types
2) They have an operation, confusingly called return, that actually performs the aforementioned wrapping
3) They have an operation called bind that allows you to feed the value wrapped inside the monad into another function, as long as that function returns a monad.
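
In JavaScript terms, the three conditions can be sketched with a toy wrapper (the Box type and its unit/bind names here are hypothetical, chosen for illustration, and are not part of jQuery):

```javascript
// 1) Box wraps another type (here, an array of values).
function Box(values) {
  this.values = values;
}

// 2) "return": wraps plain data inside the type.
Box.unit = function (values) {
  return new Box(values);
};

// 3) "bind": feeds each wrapped value into fn, which must hand back
// a Box, then flattens the results into a single Box.
Box.prototype.bind = function (fn) {
  var out = [];
  this.values.forEach(function (v) {
    out = out.concat(fn(v).values);
  });
  return Box.unit(out);
};

// Usage: the chain never leaves the Box, just as jQuery chains
// never leave the jQuery object.
var doubled = Box.unit([1, 2, 3]).bind(function (n) {
  return Box.unit([n * 2]);
});
// doubled.values is now [2, 4, 6]
```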

Bam. That’s it. Simple – almost so simple as to be useless. But monads are like objects in that though they are conceptually very simple, they are immensely powerful. In Haskell, monads are used as an abstract datatype to represent actions – since they can be chained together, they’re a perfect fit for traditional imperative programming or code that depends on the outside world. Any keyboard input or file input in Haskell is wrapped inside an IO monad, serving to indicate that this part of the program is dependent on the outside world. By indicating that only certain sections of your program depend on external, unpredictable input, you not only make your debugging easier but also ensure that the rest of your functions depend only on their inputs. If you’re interested in learning about the other ways that Haskell uses monads or learning a stricter definition of monads, check out Jeff Newburn’s All About Monads.

“This is all well and good”, you say, “but how does jQuery’s monadic implementation manifest itself in common jQuery idioms?” Well, I’m glad that you asked.

Cautious Computations

Each language has its own way of dealing with passing a null object to a function that expects a non-null object or sending a message to a null object. Objective-C returns nil, Java throws NullPointerExceptions, and C – well, C segfaults, but what else is new? The jQuery equivalent of this would be trying to manipulate the contents of an empty jQuery object, like so:

$([]).fadeOut().text("THE WORLD HAS BROKEN!");

By calling jQuery’s type constructor with an empty array as a parameter, we get an empty jQuery object; even though I’m calling fadeOut() and text() on nothing at all, jQuery will fail gracefully rather than spew errors all over the console. Much like Objective-C’s behavior when messaging nil or Ruby’s andand, this allows you to safely chain together a long series of computations that may fail at some point. From a higher-level perspective, this is very similar to Haskell’s Maybe monad, used to represent computations that might fail, such as a hashtable lookup.
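
The Maybe analogy can be made concrete with a toy JavaScript sketch (the Maybe type here is hypothetical, not part of jQuery): once any step in the chain fails, every later step is skipped, and no exception is ever thrown.

```javascript
// A hypothetical Maybe-style wrapper: a step signals failure by
// producing null, and bind short-circuits past all later steps.
function Maybe(value) {
  this.value = value;
}

Maybe.prototype.bind = function (fn) {
  // Once we hold null, pass the failure along untouched.
  return this.value === null ? this : new Maybe(fn(this.value));
};

// Usage: a failed hashtable lookup poisons the rest of the chain.
var table = { a: 1 };
var result = new Maybe(table)
  .bind(function (t) { return "missing" in t ? t.missing : null; }) // fails
  .bind(function (n) { return n + 1; }); // never runs

// result.value is null, and no exception was thrown along the way
```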

State Transformations

When I heard that variables in Haskell could never change, I was horrified. Sure, I knew there are ways to work around this – recursion is the primary way – but I knew that there had to be some corner case where I would need to destructively update variables. It turns out I didn’t need to worry, because one of the most useful monads is the State monad. Not only does the State monad provide a vehicle in which to bind variables like you would in a regular language, but it also provides a useful semantic distinction – the very presence of the State monad in a function’s type implies that it depends on some state. (Now that I think about it, Haskell’s type system alone is more expressive than most languages.)

jQuery can be seen as a state monad too – it encapsulates a set of DOM nodes and allows you to chain stateful computations upon them. There are simple methods to change what is matched – add() adds new elements to the current object, contents() matches all children of the wrapped DOM nodes, and so on and so forth. The andSelf() and end() methods are much more interesting and much more reminiscent of Haskell’s state monad. Let’s take a look at how they work:

$("div.sidebar").find("a").andSelf().addClass('disabled');

In the above snippet, the $("div.sidebar") call finds a div element with the sidebar class, and the find("a") call matches all links inside the sidebar. Were we to manipulate the jQuery object right now, only the links would be modified – instead, we add andSelf(), which re-adds the matched div element. We then add the ‘disabled’ class. end() performs the converse of andSelf() – it removes the elements matched by the previous destructive operation:

$("div.sidebar").find("a").addClass('disabled').end().fadeOut();
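
One plausible way to implement this behavior (a sketch using a hypothetical Match type, not jQuery's actual internals, though jQuery does keep a similar previous-object link) is to have each narrowing operation remember the set it came from, so end() can pop back to it:

```javascript
// Each narrowing operation records the previous matched set,
// giving us a stack of states to restore with end().
function Match(elements, prev) {
  this.elements = elements;
  this.prev = prev || null;
}

Match.prototype.filter = function (pred) {
  // Narrow the set, remembering where we came from.
  return new Match(this.elements.filter(pred), this);
};

Match.prototype.andSelf = function () {
  // Re-add the previously matched set to the current one.
  var prev = this.prev ? this.prev.elements : [];
  return new Match(prev.concat(this.elements), this);
};

Match.prototype.end = function () {
  // Pop back to the set matched before the last narrowing operation.
  return this.prev || this;
};

// Usage:
var m = new Match([1, 2, 3, 4]).filter(function (n) { return n % 2 === 0; });
// m.elements -> [2, 4]; m.end().elements -> [1, 2, 3, 4]
```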

Conclusions

  1. Monads aren’t esoteric, abstruse computer science – they’re useful.
  2. You probably have used monads but just haven’t realized it.
  3. jQuery is awesome.

And with that, I will go and observe DC descend into delicious chaos.

January 18, 2009 at 4:53 am 20 comments

A Deeply Skeptical Look at C++0x

Today I saw a link to an article entitled C++ Gets an Overhaul on Hacker News detailing C++0x, the proposed set of standards for the new generation of C++. Working on Half-Life 2 mods and a DirectX game in C++ left me with the impression that C++ was complicated, ugly, bloated and inexpressive, and I was eager to see if these latest changes – other than garbage collection, about which I had already heard – would bring a much-needed dose of simplicity and ease-of-use to C++. Alas, it was not to be. As I read the article, I became so incensed with the sheer stupidity of some of the changes being introduced that I began mocking them out loud. In an attempt to preserve these witticisms for posterity, I will juxtapose extracts from this article and my thoughts.

Introduction

Ten years after the ratification of the first ISO C++ standard, C++ is heading for no less than a revolution.

Seriously, though, a revolution? The sort of revolution in which average people are freed from unjust laws, or the sort in which average people are crushed under infighting among the elite? Or is it the sort in which everyone’s head gets chopped off?

C++0x, the new C++ standard due in 2009, brings a new spirit and new flesh into the software development world.

How the hell do you pronounce ‘C++0x’ anyway? I’ve got a few ideas:

  • “cee-plus-plus-zero-exx” (logical)
  • “cee-plus-plus-hexadecimal” (what else would 0x mean?)
  • “cee-zero-ex” (terrible, sounds like the name of an energy drink)
  • “cox” (has a nice, if profane, exclamatory ring to it)
  • “cee-tox” (rhymes with detox, raising uncomfortable questions of addiction)
  • “cuttocks” (rhymes with “buttocks”, which is automatically funny)
  • “terrible”

I think I’ll go with the last one.

Brace yourself for state-of-the-art design idioms, even better performance, and a plethora of new features such as multithreading, concepts, hash table, rvalue references, smarter smart pointers, and new algorithms.

Sweet crouching Jesus! Brace yourselves! Hash tables? Multithreading? Truly, the future has arrived, and it’s going to kick your ass.

No doubt you’ll find a lot to like in C++0x!

Replace “like” with “mock” and you have yourself a true statement, sir.

New Core Features

The two most important features of C++0x are concepts and concurrency support. Concepts enable programmers to specify constraints on template parameters, thus making generic programming and design immensely simpler and more reliable.

Go on, try to think of a worse or more ambiguous name for a proposed language addition than ‘concepts’. No luck? Me neither.

Variadic templates, template aliases (also called template typedefs), and static_assert—though not directly related to concepts—will also make the use of templates in generic libraries more intuitive, flexible, and less error prone.

Please, please don’t add language features just to make the lives of library designers easier. Add features to make the average programmer’s life easier.

The importance of a standardized concurrency API in C++ can’t be overstated

Technically, it could; I could say “a standardized concurrency API in C++ will end world hunger”. But to step away from relentlessly mocking hoary clichés, I’d like to point out that compiled imperative languages like C++ are the absolute worst languages in which to do multithreading. The whole model of imperative programming is based around the introduction of changes in state – and when the state is being modified by hundreds of concurrent processes, your previously-bug-free program becomes nondeterministic. Multithreading is really, really hard for a vast array of reasons. Some of the smartest people in the world are trying to figure out how to make programming languages play nice with multiple cores now that Moore’s Law is on its way out, and there’s a reason that they’re using Haskell or inventing new languages rather than pretending that yet another layer on the tower of hacks that is C++ will fix the problem.

As multicore processors are becoming widespread, you simply can’t afford to remain stuck in the single-threaded era, or compromise on platform-dependent APIs. At last, there’s a portable, standardized and efficient multithreading library for C++.

There is very little about C++ that can be described as “portable” – just ask the Firefox team. Even changing compilers from g++ to Visual C++ involves working through a host of cross-compiler flaws and incompatibilities, and actually porting across operating systems – or, God forbid, architectures – involves more #ifdef’s than you can shake a preprocessor at. (Whether a compiled language can be considered to be portable at all is another matter entirely.) Deep in your heart, you know that you’re going to have to make a lot of changes to make sure that your program compiles across architectures and OS’s – why not recognize this fact and use each target OS’s most powerful threading API?

Rvalue references are yet another silent revolution. While most users will probably not even know they exist (read my interview with Bjarne Stroustrup), rvalue references enable library designers to optimize containers and algorithms by implementing move semantics and perfect forwarding easily, thus reducing unneeded copy operations.

I have to force myself not to write this in as sarcastic a manner as possible, and this is the sort of statement that doesn’t help. C++ doesn’t need any more optimization, because it’s plenty fast already. The reason that Java beat C++ in the 90’s wasn’t because Java had features that C++ didn’t; it was because Java and Sun were willing to take a small performance hit in exchange for happier programmers, more readable code and cleaner algorithms. Silent revolutions won’t help C++; the only thing that can really save its life now is to remove features from the language. Yes, users will bitch and moan, but they do that anyway. Don’t let your language linger and die while it tries to bear the burden of backwards compatibility.

Automatic type deduction is made possible by the new keywords auto and decltype which deduce the type of an object from its initializer and capture the type of an expression without having to spell it out, respectively.

Wow, uh, that’s actually exactly what C++ needed. Someone is listening to me!

Adding auto and decltype also paves the way for a new function declaration syntax. The function’s return type appears after the -> sign:
auto func(int x)->double {return pow(x, 2);}

There are so many things wrong with this that I barely know where to start. The arrow? The fact that in contravention of who-knows-how-many-years of tradition, return type is going after the function and parameter names? The fact that though auto supposedly deduces that this is a function, it can’t figure out that this function would return a double, forcing the programmer to give two seemingly contradictory types?

Lambda expressions and closures are another prominent feature of C++0x. A lambda expression is a nameless function defined at the place where it’s called. It is similar to a function object except that the programmer is rid of the burden of declaring a class with a constructor, defining an overloaded () operator and instantiating a temporary object of that class—this tedium now becomes the compiler’s job. Here’s an example of a lambda expression:
//a lambda expression is used as an argument
myfunc([](int x, int y) -> int {return x+y;});

The lambda expression is indicated by the lambda introducer [] followed by a parameter list in parentheses. The optional return type comes next, following the -> sign. Finally, the lambda block itself is enclosed in braces.

Oh, man. Instead of the square brackets to introduce an anonymous function, why not use syntax that is a) meaningful and b) not used everywhere else? Off the top of my head, square brackets perform array declaration, array initialization, and are overloadable operators – why use them to declare anonymous functions as well? Why not take a cue from Python and use an actual keyword?

myfunc(lambda: (int x, int y) -> int {return x+y;} )

Clearer, no? What about function? Anything except this. How do we even tell where the anonymous function ends and the other arguments begin?

Other Additions

Some C++0x features are meant to simplify recurring programming tasks and minimize boilerplate code. Most of these “convenience features” were borrowed from other programming languages.

Maybe they’re recognizing that boilerplate code is a Bad Thing. Perhaps we’re getting typesafe macros? Better syntax for function pointers?

These convenience features include a null pointer literal called nullptr

We have a null pointer literal. It’s called NULL. It’s in C. Honestly.

C++0x enhances compatibility with other independent International Standards. The first set of additions is designed to bring the C++ in closer agreement with the ISO/IEC 9899:1999 Standard C (C99 for short).

Well, better a decade late than never, I guess.

The influence of the recent Unicode 4.0 standard is also reflected in C++0x. C++98 defines a wide char type called wchar_t, that has an implementation-defined size. In the mid-1990s, it was assumed that wchar_t would be sufficient for supporting Unicode but this turned out to be a false hope. The unspecified size of wchar_t prohibits portable UTF encoding in C++98. C++0x solves this problem by introducing two new character types with standardized sizes: char16_t and char32_t that are specifically designed to support portably all the Unicode 4.0 codesets and encoding schemes (UTF8, UTF16 and UTF32).

No, God Damnit. In introducing yet another character type, C++0x just made the problem worse. Now a newbie has to Google his way through forests of documentation and reports just to find out which character type to use. Considering that very few people have the know-how to use wchar_t anyway, why not simply redefine wchar_t to use UTF8? If your “solution” involves introducing two new character types into a language already fraught with unnecessary typedef’s, it’s not a solution. Period.

Some support for Unicode strings in its Standard Library will be available too.

Excuse me? If you’re going to introduce an entirely new character type, then you’d better provide C++ programmers with a coherent reason to switch beyond “because we said so.” The lure of a Unicode-embracing standard library would do much to spur adoption of these char16_t and char32_t types, harebrained though they may be; considering that strings are possibly the most fragile thing in C++, thanks to a million competing implementations, I sure as hell wouldn’t switch my already-working code to use a new character type if I weren’t provided with a comprehensive set of reliable and fast string manipulation libraries.

A new library for manipulating regular expressions is defined in the C++0x header. Regular expression support has been noticeably lacking in C++—especially among web programmers, designers of XML parsers, and other text-processing applications.

There’s a reason for that, and it’s because scripting languages do regular expressions better than compiled languages. Continually tweaking a regular expression to capture just the right amount of information gets very tiresome when you have to wait ten seconds for your latest code to build, rather than having Perl or Ruby interpret your code for you.

C++ is my forefathers’ language, the language of the 80’s, the language of enormous videogames, segmentation faults, bluescreens, and linker errors. The programmers of my generation want to work in languages that teach us how to love our own potential before we have to hate others’ restrictions. C++0x makes no effort to step in this direction, and it truly saddens me that future generations of programmers will be taught in this language.

August 20, 2008 at 7:07 pm 40 comments

Git vs. Mercurial: Please Relax

Everyone’s up in arms to embrace distributed version control as the new must-have tool for the developer in the know. Though many people have not yet migrated from Subversion, those that have almost invariably extoll the virtues of their particular choice. But though all of the major DVCS’s have features that set them above the previous generation of centralized systems, none stands head-and-shoulders above the others as Subversion does among the last generation: each of them was designed for a specific purpose, and each of them will serve those with different habits, workflows and development styles differently. Having used both git and Mercurial for the better part of a year, I’ve had the opportunity to compare the two. It saddened me to see a Twitter-based debate flamewar erupt over which is better, so I thought I’d do my best to try and ease the tension – with analogies!

Git is MacGyver

(image: MacGyver – great man or greatest man?)
Git’s design philosophy is unmistakably that of Unix: unlike Subversion, CVS, or Mercurial, git is not one monolithic binary but a multitude of individual tools, ranging from high-level “porcelain” commands such as git-pull, git-merge, and git-checkout to low-level “plumbing” commands such as git-apply, git-hash-object and git-merge-file. So, like MacGyver, you can do just about anything you need with Git – this includes totally awesome Wiki engines, issue trackers, filesystems, sysadmin tools – everything short of fuse repair.

As such, git is not so much a version control system as it is a tool for building your own version-controlled workflow. For example, when faced with the fact that no git tool performs the equivalent of hg addremove – a useful Mercurial command that adds all untracked files and removes all missing files – I adapted a short script originally written by James Robey:


#!/usr/bin/env bash
# git-addremove
git add .
git ls-files --deleted | xargs git rm

Git’s branching, tagging, merging, and rebasing are near flawless: git’s merging algorithm is close to omniscient, having once merged 12 Linux kernel patches simultaneously. Additionally, git provides you with tools to go back in time and edit your commit history – useful for those of us who have left certain critical elements out of a commit and had to quickly recommit with a helpful message such as “oops”. (And when I say “those of us”, I mean “every developer, ever”.) Personally, I elect to only use this feature to edit my last commit (using git commit --amend); I have never needed or wanted to meddle further with the past. git is also extremely fast thanks to its C codebase.

There is no better emblem of git’s flexibility than GitHub. GitHub’s rise to success has been positively meteoric, and with good reason. It’s a brilliantly-designed site that serves as more than a pretty, browsable frontend to my source tree in that it brings a social aspect to programming – using Git, I can fork anyone’s project, make my changes, petition for them to be included in the main repository, and pull other people’s changes to mine. Though it took a while for me to adjust to the anarchic notion of every user having their own equally-valid fork of a project – shouldn’t there be one definitive version of a project? – I realized its potential when working with other contributors to Nu. Add the fact that GitHub is one of the most solid and reliable services I’ve ever used, and you’ve got what very well might be the deal-breaker in the fight for DVCS dominance.

On the other hand, migrating from Subversion/CVS to git requires a lot of work. Linus has made it clear that he disagrees with the fundamental ideas behind Subversion/CVS, referring to SVN as “the most pointless project ever started”. As such, the git project has consciously made no effort to make the migration to git easy: the revert command in Subversion resets your current working copy to the last commit, but git’s revert undoes an existing commit by creating a new commit that reverses its changes. (The rough equivalent of svn revert in git is git reset --hard HEAD.) Whining about this on the git mailing list is a little like this:

Get it? No? Too bad.

Apparently the choices that Linus et al. made when designing git are sensible – if, of course, you understand the internal structure of git and how it stores your data. I’m afraid I’m only halfway through PeepCode’s great Git Internals, so I can’t comment on whether that statement is true. But I have to admit that if I had to read a $9, 100-page PDF to learn every new tool I downloaded, I would have no time and no money.

This brings us to another of git’s faults: its documentation is terrible. Man pages are no longer a sufficient replacement for a good, well-updated wiki or reference work; git’s wiki still has a long, long way to go. Add in the fact that since, like many OS X users, I installed git through MacPorts, only the main git tool comes with a man page, leaving me to consult the Web to find out exactly how to format revision specifiers. I’ve observed that developers that were able to learn git from colleagues already familiar with git and its internals tend to have a higher opinion of it, in contrast to people such as myself that had to waste a lot of time digging around through Google and the man pages.

However, considering the fact that git is supposed to be a platform, one would suppose that it would have clear, bridgeable functions to reuse in your own C projects and bridge to other languages. One would be completely wrong – libgit.a is a joke, and the Ruby git gem (and my own vastly-inferior Nu/Git bindings) depend on running shell commands and parsing the output, which really gets quite tiresome after a while. The fact that console output may differ from platform to platform and any new feature may change the format of console output makes me very reluctant to commit to maintaining my Nu/Git bridge. (I swear, that’s the reason. It’s not because I’m lazy.)

In conclusion, Git is perfect for command-line wizards, people with large teams and complicated projects, and those who need their DVCS to be endlessly configurable. Certain developers have a workflow which, when interrupted, causes much grief and lamentation – if that description fits you, then git is almost certainly what you want, because it can be molded to fit the most esoteric workflow. Solo developers and those accustomed to working with centralized VCS’s may find git to be hostile, unfriendly and needlessly complex. When I work on a large project with many committers, I prefer git and GitHub.

Mercurial is James Bond


mercurial |mərˌkyoŏrēəl|
adjective
1) Subject to sudden or unpredictable changes of mood or mind: his mercurial temperament.

Though there have been many unfortunate open-source project names, Mercurial (also referred to by its command-line-tool name, hg) is both apt and unfortunate: though it is definitely speedy, both in terms of learning curve and execution speed, it is also at times inconsistent, maddening, and unpredictable. Mercurial is like James Bond: though they are not suited for each and every job, put them in a situation for which they are prepared and you will get things done. (If your programming job is as exciting as a Bond movie, please get in touch with me right away when one of your programmers is killed in action.)

In contrast to git’s philosophy of providing a flexible platform built out of individual components, Mercurial is monolithic and (relatively) inflexible. Developers who like to keep their system clean will probably appreciate the fact that hg installs one binary in contrast to the 144 that make up git, and developers who think that git’s ability to edit your previous commits is moronic, unnecessary, and dangerous will appreciate the simplicity hg provides by omitting that particular feature.

Compared to git, hg’s branching, merging and tagging systems are equally powerful and only slightly slower. The only current flaw in Mercurial’s branching system – and sweet crouching Jesus, is it ever a huge flaw – is that deleting named branches is unbelievably difficult: as far as I can tell, the only way to do so is to learn and enable the patch-queuing system (about which I have heard raves, but have not yet had the time to sit down and grok it) and use the hg strip command, or install the local-branches extension. The Selenium project currently recommends using tags instead of branches, which practically redefines the concept of a half-assed solution.

Despite that glaring flaw, the rest of hg is excellent. It functions almost identically to Subversion in the commands that it shares, and the new concepts – branching, merging, etc. – are easily learned and intuitive. Whereas I’m still learning how to do relatively basic things in git, I learned pretty much all of hg’s functionality in about a day. If you’re familiar with Subversion, transitioning to Mercurial should be a piece of cake – the functions you’re familiar with will be there, and the new functions are easy-to-learn and well-documented.

Though I’ve never tried to integrate Mercurial’s functionality into my own projects, I hear that since it’s written in Python it’s very easy just to import its classes and call them programmatically rather than parse the output of shell scripts. I wanted to write a Mercurial frontend for OS X (I was planning to call it ‘hermetic’ – get the elaborate literary pun?), but the viral nature of the GPL discouraged me – since no company has granted me stock options for my code, I’m a little reluctant to just give away the fruits of my labors. Mercurial’s answer to GitHub is BitBucket, which I have not tried yet. If I do, I will update this entry posthaste.

In conclusion, Mercurial is the yin to git’s yang: those such as myself who are constantly experimenting with new ways to work and write code will object less to the restrictions that hg imposes on workflows. After switching to Mercurial for a small two-person project last year, my collaborator observed that Mercurial feels a lot more Mac-like – usability and smoothness of operation trump Unix philosophy when necessary. If I don’t have to share my code with anyone, I tend to use Mercurial in order to get things done faster.

So, What’s My Point?

To paraphrase Colin Wheeler, it’s OK to proselytize to those who have not switched to a distributed VCS yet, but trying to convert a git user to Mercurial (or vice-versa) is a waste of everyone’s time and energy. If you want to switch to a DVCS, then here are four easy steps:

  1. Evaluate your workflow and decide which tool suits you best.
  2. Learn how to use your chosen tool as well as you possibly can.
  3. Help newbies to make the transition.
  4. Shut up about the tools you use and write some code.

August 7, 2008 at 7:12 pm 27 comments

“Haskell Curry? Yes, I dated his daughter.”

Last week I had the opportunity to talk with Alonzo Church, Jr. (he prefers to be called “Al”), son of the Alonzo Church (you know, the one who only invented the freaking lambda calculus). We had a lovely conversation; we talked about Alan Turing, Fortran, COBOL, the future of computer science, and all sorts of other interesting topics. The highlight of our conversation, though, was this:

Me: “So, I’m learning a new-ish programming language named Haskell right now. It’s very strange.”

Al: “Did you say Pascal?”

Me: “No, Haskell. It’s named after a logician named Haskell Curry –“

Al: “Oh, Professor Curry! Yes, I knew him! My father worked with him!”

Me: “No way! That’s awesome!”

Al: “Yes! In fact, I dated his daughter!”

Me: “You DOG!”

August 21, 2007 at 4:44 pm 11 comments

How Tim Burks and Nu Stole the Show at C4[1]

Edit: Fixed some factual inaccuracies about the language itself.

Tim Burks, noted contributor to RubyCocoa and creator of RubyObjC, gave a talk at C4[1] about his experiences with creating a Ruby <-> Objective-C bridge, and the problems he overcame in doing so. It was an interesting presentation, and we were all suitably appreciative when he showed his custom visual chip-design software written in Ruby with a Cocoa interface.

And then he dropped a bombshell.

For the past year, Tim’s been working on a new dialect of Lisp – written in Objective-C – called Nu. Here are its features (more precisely, here are the ones that I remember; I was so awestruck that many went over my head):

  • Interpreted, running on top of Objective-C.
  • Scheme-y syntax. Everything is an s-expression (data is code, code is data). Variable assignment is done without let-clauses (which are a pain in the ass) – all one has to do is (set varname value).
  • Variable sigils to indicate variable scope.
  • True object-orientation – everything is an object.
  • True closures with the do-statement – which, incidentally, is how Ruby should have done it.
  • Macros. HOLY CRAP, MACROS! When Tim showed us an example of using define-macro for syntactical abstraction, Wolf Rentzsch and I started spontaneously applauding. His example even contained an example of absolutely beautiful exception handling that should be familiar to anyone with any ObjC or Ruby experience.
  • Symbol generation (__) to make macros hygienic and prevent variable name conflicts.
  • Nu data objects are Cocoa classes – the strings are NSStrings, the arrays NSArrays, etc.
  • Ability to create new Obj-C classes from inside Nu.
  • Interfaces with Cocoa libraries – you can access Core Data stores from within Nu in a much easier fashion than pure ObjC, thanks to Tim’s very clever idea of using a $session global to store the NSManagedObjectModel, NSManagedObjectContext, and NSPersistentStoreCoordinator.
  • Ruby-style string interpolation with #{}.
  • Regular expressions.
  • Positively drool-inducing metaprogramming, including a simulation of Ruby’s method_missing functionality.
  • A web-based templating system similar to ERb in 80 lines of Nu code – compare that with the 422 lines of code in erb.rb.

Tim showed us a MarsEdit-like blog editor written entirely in Nu, using Core Data as its backend – and then showed us the built-in Nu web server inside that program, complete with beautiful CSS/HTML/Ajax.

As F-Script is to Smalltalk, so Nu is to Lisp. Tim said that he hopes someday to open-source Nu; if he does, he will introduce what is quite possibly the most exciting development in the Lisp-related community in a long time. I don’t think I speak for just myself when I say I cannot wait to get my hands on it.

August 12, 2007 at 3:09 pm 4 comments

Inform 7: Natural-Language Programming Lives

When most programmers think of natural-language programming, they usually think of AppleScript – an ambitious yet doomed-to-failure language with English-like syntax that Apple developed in order to automate OS scripting tasks. Frankly, AppleScript in its current incarnation is pretty dismal – any reasonably complex script is positively overflowing with ends and tells, making it read like Ruby’s slower, dumber cousin. (Incidentally, 99% of the pain of AppleScript can be removed with Ruby or Python bridges.)

These painful memories, combined with other influential condemnations, will probably lead many programmers to shun Inform 7 and its mission to be “built by writing natural English-language sentences.” However, if you will wait a moment, I will show you the masterful elegance with which Inform 7 mixes modern programming paradigms and the English language.

For those of you unfamiliar with Inform, it’s a system for writing text adventures – well-known examples include Adventure, Zork, and the famously difficult adaptation of The Hitchhiker’s Guide to the Galaxy. Inform was created in 1993; in its first six versions, games were written in a procedural programming language with a syntax looking like a cross between Perl and Applescript. Now, however, games are written entirely in English and then translated with the I7 compiler.

The following is not meant to be a full-fledged I7 tutorial; the official site does that far better than I ever could. This is simply a demonstration of how I7 manages to express complex programming concepts entirely in English.

(more…)

June 18, 2007 at 11:51 pm 3 comments

Map, Filter and Reduce in Cocoa

After working in Scheme, Python or Ruby, all of which (more or less) support function objects and the map(), filter() and reduce() functions, languages that don’t can seem somewhat cumbersome. Cocoa almost gets these paradigms right.

map()

One would think that Objective-C’s ability to pass methods around as objects in the form of selectors would make writing a map() method easy. Observe, however, the crucial differences between the NSArray equivalent and map(). (For those unfamiliar with it, map(), when given an array and a method/function taking one argument, returns the result of applying the function to each item of the array.)

From the NSArray documentation:

makeObjectsPerformSelector:

Sends the aSelector message to each object in the array, starting with the first object and continuing through the array to the last object.

- (void)makeObjectsPerformSelector:(SEL)aSelector
Discussion

The aSelector method must not take any arguments. It shouldn’t have the side effect of modifying the receiving array.

This differs from map() on no fewer than two levels. Firstly, even in the NSMutableArray subclass, the selector is not allowed to have side effects. Frankly, I can think of few situations in which I would want to send a side-effect-free message to every element and throw away the results; the point of map() is to be able to apply a function quickly to every element of an array and get back the changes! Secondly, an unaware or hurried programmer would think that this method was implemented so that one could write code like this:

- (void)printAnObject:(id)obj
{
    NSLog(@"%@", obj); // use a format string; passing obj directly to NSLog is unsafe
}

and then do this:
[anArray makeObjectsPerformSelector: @selector(printAnObject:)];

This is not the case – the above code sends the printAnObject: message to each element of the array (which does not implement it), rather than calling printAnObject: with each element as its argument. I’m sure that to some this seems obvious, but I, for one, found it an insidiously tricky wart.
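To make the distinction concrete, here is a rough Python analogy (a sketch; Python has no selectors, so methods-called-on-each stand in for makeObjectsPerformSelector:):

```python
words = ["Get", "CenterStage"]

# map(): an outside function is applied to each element
lengths = list(map(len, words))          # [3, 11]

# makeObjectsPerformSelector:-style: each element performs a method itself
uppercased = [w.upper() for w in words]  # ['GET', 'CENTERSTAGE']

print(lengths, uppercased)
```

The first form takes any one-argument function; the second only works when the elements themselves respond to the message, which is exactly the trap described above.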

However, there is a (limited) workaround.

NSArray’s valueForKey: method (somewhat counter-intuitively) returns the result of invoking valueForKey: on each of the array’s elements. As such, one can map KVC accessors onto arrays. For example:

NSArray *arr = [NSArray arrayWithObjects: @"Get", @"CenterStage", @"0.6.2", @"because", @"it", @"rocks", nil];
[arr valueForKey: @"length"]; // returns [3, 11, 5, 7, 2, 5] (as NSNumbers)

This can be used in many helpful ways; sadly, it only works on KVC-compliant properties/methods.

Anyway, moving on…

filter()

With OS X 10.4, Apple introduced the NSArray filteredArrayUsingPredicate: method, which allows one to filter an array based on criteria established by an NSPredicate. Observe:

NSArray *arr = [NSArray arrayWithObjects: @"This", @"is", @"the", @"first", @"CenterStage", @"release", @"I", @"helped", nil];
arr = [arr filteredArrayUsingPredicate: [NSPredicate predicateWithFormat: @"SELF.length > 5"]]; // arr is now [@"CenterStage", @"release", @"helped"]

Verbose, but useful.
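For comparison, the same filter expressed with Python’s filter() function (a sketch; a lambda stands in for the NSPredicate):

```python
words = ["This", "is", "the", "first", "CenterStage", "release", "I", "helped"]

# keep only the strings longer than five characters
long_words = list(filter(lambda w: len(w) > 5, words))
print(long_words)  # ['CenterStage', 'release', 'helped']
```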

reduce()

Frankly, Apple doesn’t give us any way to do this in pure Cocoa. The best way I’ve found (if you really need it, which is less often than one needs map() and filter()) is to use the F-Script framework and apply the \ operator to an array. Unfortunately, this adds a bit of overhead.
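For reference, here is what reduce() actually does, sketched in Python – it folds a two-argument function over a sequence, carrying an accumulator along:

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# fold addition over the list: ((1 + 2) + 3) + 4
total = reduce(lambda acc, x: acc + x, numbers)
print(total)  # 10
```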

In conclusion, Cocoa and Objective-C almost bring us the joys of functional programming. We can only hope that Leopard and Obj-C 2.0 improve on these in some way.

List comprehensions, anyone?
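For the curious, a Python list comprehension subsumes map() and filter() in a single expression – something Cocoa has no answer to:

```python
words = ["Get", "CenterStage", "because", "it", "rocks"]

# filter (length > 5) and map (len) in one comprehension
result = [len(w) for w in words if len(w) > 5]
print(result)  # [11, 7]
```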

June 18, 2007 at 10:42 pm 4 comments


About Me



I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.

If you like this, you might want to check out my Twitter or Tumblr, both of which are occasionally about code.
