Adventures in Pythonic Encapsulation

November 3, 2006 at 2:59 am 3 comments

Python has received its fair share of criticism for its lack of support for information hiding. Although it is a solidly object-oriented language, Python has historically declined to enforce this facet of object orientation. Other programming languages have implemented encapsulation in a variety of ways:

  • All instance variables in Smalltalk, the canonical OO language, are private; in order to access them from outside the object, the programmer must write explicit accessor and mutator methods for each variable.
  • C++, Java, and C# rely on the public, private, and protected keywords to control which code may access a member.
  • Cocoa’s Objective-C code strongly encourages the programmer to use key-value coding for encapsulatory purposes.
  • Ruby does ultra-sexy encapsulation through the attr_accessor family of methods (attr_reader, attr_writer, and attr_accessor).

However, Pythonistas like myself often assert that “we’re all consenting adults here.” While this attitude is refreshing in this day of slavish devotion to OOP principles, the Python community has realized that, in order to avoid alienating newcomers, Python should offer some form of encapsulation. This article shows the various ways in which Python supports encapsulation and/or information hiding.

For the sake of argument, let’s implement a Student class:

class Student(object):
    def __init__(self, name):
        self.name = name

Easy enough, right? Now, let’s write a unit test. Note that it depends on the name attribute being stored as a string.

george = Student("George Washington")
print george.name.lower()

When run, this test will produce the following output: george washington
But what if *cue dramatic music* we changed the implementation of the Student class so that it stored the name variable as a list of strings?

class Student(object):
    def __init__(self, name):
        self.name = name.split()

Now, running the above unit test will give us the following error:
AttributeError: 'list' object has no attribute 'lower'
So how could we modify the Student class so that people could still interact with name as a string, even though it is stored as a list? Read on.

Method 1: Getters and setters
Python, being an underscore-happy language (not that there’s anything wrong with that!), automatically mangles the names of attributes that begin with two underscores and do not also end with two. That is a fancy way to say that, in a Student class, a variable written as __name will actually be stored under the name _Student__name at runtime. We can therefore write the following code using getters and setters, and thereby encapsulate the details of the instance variables inside the class. (I’m using the Smalltalk-style naming convention here, where foo reads the variable foo and setFoo sets it, though the same applies equally well to Java’s getFoo and setFoo paradigm.)

class Student(object):

    def __init__(self, name):
        self.__name = name.split()

    def name(self):
        return " ".join(self.__name)

    def setName(self, newName):
        self.__name = newName.split()

However, in order to get the unit test to work, we need to change it to this:

george = Student("George Washington")
print george.name().lower()

After that, everything works dandily.

Advantages of this method:

  • It follows other languages’ paradigms for accessor methods
  • It doesn’t require much effort to implement

Disadvantages of this method:

  • The end-user must change his/her programs to use accessor methods.
  • It can be unclear whether one’s using the Java or Smalltalk accessor paradigm.
  • The end user can still access the internal object; in this case, by using _Student__name
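To make that last point concrete, here is a minimal sketch of how name mangling behaves. It is written for Python 3 (so print is a function, unlike the Python 2 snippets in this post) and reuses the Student class from this method; the variable names outside the class are just for illustration.

```python
class Student(object):
    def __init__(self, name):
        # Written as __name, stored on the instance as _Student__name.
        self.__name = name.split()

    def name(self):
        return " ".join(self.__name)


george = Student("George Washington")

# The accessor is the intended way in.
print(george.name())

# Outside the class body the name is not mangled, so this lookup fails.
try:
    george.__name
except AttributeError:
    print("no attribute called __name from outside the class")

# The mangled name is an ordinary attribute, so nothing is truly hidden.
print(george._Student__name)
```

In other words, the double underscore guards against accidental name clashes more than it guards against a curious caller.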


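Method 2: __getattribute__ and __setattr__

By overriding the __getattribute__ and __setattr__ methods, one can intercept every read of, and every assignment to, an attribute of an instance. (The similarly named __getattr__ is only called when normal lookup fails, which is why the example uses __getattribute__.) This gives transparent encapsulation: callers see a string while the class stores a list. The stored list is still reachable through the instance’s __dict__, so these are not truly private members. Here’s an example; it’s best written, not explained.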

class Student(object):

    def __init__(self, name):
        self.name = name

    def __getattribute__(self, attr):
        if attr == "name":
            return " ".join(self.__dict__[attr])
        else:
            return object.__getattribute__(self, attr)

    def __setattr__(self, attr, value):
        if attr == "name":
            self.__dict__[attr] = value.split()
        else:
            object.__setattr__(self, attr, value)

Advantages of this method:

  • Transparent encapsulation; __setattr__ and __getattribute__ are called even when it appears that the user is reading or assigning a plain instance variable.

Disadvantages of this method:

  • Verbose.
  • Complicated to implement.
  • Prone to errors.
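Python has two read hooks, and they are easy to mix up, which is one reason this method is error-prone. The sketch below (Python 3; the Lookup and Intercept classes are hypothetical and exist only for this illustration) shows when each one fires.

```python
class Lookup(object):
    """Hypothetical class: defines only __getattr__."""

    def __init__(self):
        self.present = "stored normally"

    def __getattr__(self, attr):
        # Called only after normal attribute lookup has failed.
        return "fallback for %s" % attr


class Intercept(object):
    """Hypothetical class: defines only __getattribute__."""

    def __init__(self):
        self.present = "stored normally"

    def __getattribute__(self, attr):
        # Called for every attribute read, including attributes that exist.
        value = object.__getattribute__(self, attr)
        if attr == "present":
            return value.upper()
        return value


lookup = Lookup()
print(lookup.present)   # found normally, so __getattr__ is never consulted
print(lookup.missing)   # lookup fails, so __getattr__ supplies a value

intercept = Intercept()
print(intercept.present)  # passes through __getattribute__ and is upper-cased
```

Delegating to object.__getattribute__ inside the hook is what avoids infinite recursion; reading self.anything there goes back through the same hook.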


Method 3: properties

Python 2.2 introduced a new mechanism for attribute access, called properties. Simply put, they’re a faster, easier way to get the transparent attribute access that Method 2 achieves by overriding __getattribute__ and __setattr__. Here’s an example:

class Student(object):
    def __init__(self, name):
        self.__name = name.split()

    def getName(self):
        return " ".join(self.__name)

    def setName(self, newName):
        self.__name = newName.split()

    name = property(getName, setName)

Now every read of, and every assignment to, the name attribute is routed through getName and setName, so the original unit test works unchanged. And, since getName and setName are ordinary methods, programmers who are used to the accessor-method paradigm can call those directly.
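Here is a short usage sketch of the property-based class, written for Python 3 (print as a function). It assumes __init__ stores the split list, so that getName has a list of words to join.

```python
class Student(object):
    def __init__(self, name):
        self.__name = name.split()   # internal storage: list of words

    def getName(self):
        return " ".join(self.__name)

    def setName(self, newName):
        self.__name = newName.split()

    name = property(getName, setName)


george = Student("George Washington")
print(george.name.lower())    # attribute-style read, routed through getName

george.name = "John Adams"    # attribute-style write, routed through setName
print(george.getName())       # accessor-method style still works
```

The caller never sees the list; both styles of access go through the same two methods.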

Advantages of this method:

  • Transparent encapsulation
  • Very elegant
  • Allows for multiple paradigms of accessing values.

Disadvantages of this method:

  • Very verbose. However, using lambda expressions makes things a bit more compact, and this crazy-awesome code allows for automated property generation. (Psst! Guido! If you want your language to be flocked to like Ruby, allow for metaprogramming like this! Incorporate this into the Python code base! At the very least, give us some ways to make syntactic sugar, like Ruby’s class_eval and instance_eval.)
  • Determined people can still access the internal object – however, is this truly a disadvantage? There are times when it’s useful to break encapsulation, such as running a proper debugger.
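For a rough idea of what automated property generation can look like, here is a small sketch in Python 3. It is not the code linked above; the accessor helper and the grade attribute are hypothetical names invented for this illustration, and the helper simply stores each value under a single-underscore name.

```python
def accessor(attr):
    """Hypothetical helper: build a read/write property backed by '_<attr>'."""
    storage = "_" + attr

    def getter(self):
        return getattr(self, storage)

    def setter(self, value):
        setattr(self, storage, value)

    return property(getter, setter)


class Student(object):
    # One line per attribute, loosely in the spirit of Ruby's attr_accessor.
    name = accessor("name")
    grade = accessor("grade")

    def __init__(self, name, grade):
        self.name = name      # goes through the generated setter
        self.grade = grade


george = Student("George Washington", 12)
print(george.name)
george.grade = 11
print(george._grade)          # the backing attribute is still reachable
```

A generated property like this adds nothing over a plain attribute until the getter or setter does real work, but it leaves room to swap in a custom pair later without changing callers.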


In conclusion:

These days, I’m using properties. They seem to be the best way to implement encapsulatory principles – and if the attribute methods get incorporated into the Python built-in object class, I shall never pine for Ruby’s attr_accessor again. And that’s a good thing, because I truly want Python to succeed.

Any inaccuracies? Complaints? Death threats? Compliments? Leave a comment!

Entry filed under: code, encapsulation, oop, python.

3 Comments

  • 1. Chris Ryland  |  December 26, 2006 at 4:41 pm

    Patrick–There’s something missing in the first part of this post: there’s no “title” reference at all that would cause the purported error.

    Also, I think Guido most definitely doesn’t want people to flock to Python like Ruby, at least not for the reasons illustrated by that kind of crazy-awesome code. ;-) He’s more of a meat-and-potatoes guy (with a little spice where it makes sense), and the language definitely reflects that. I tend to agree with him in the end, even though the hacker urge still pops up every so often.

    Cheers!

  • 2. Patrick Thomson  |  December 26, 2006 at 6:53 pm

    Oh dear; that’s embarrassing. Fixed.
    As to your assertion about Guido, I totally agree with his meat-and-potatoes style. But allowing Pythoneers to write the sort of crazy-awesome code that Ruby is noted for would be a blessing.

  • 3. Justin  |  February 3, 2007 at 5:20 pm

    I do not think that stuff like this should go into the stdlib.

    It obfuscates behaviour a lot and keeps the language from staying simple. Simple is better than complex.

    class Student(object):
        name = property(lambda self: self.__name, lambda self, x: None, lambda self: None)

        def __init__(self, name):
            self.__name = name

    This is great and enough for making a read only attribute. It is readable and every python programmer should know the property builtin anyway.

    If you really want something like name_setter, I guess you can achieve this with a Metaclass.


About Me



I'm Patrick Thomson. This was a blog about computer programming and computer science that I wrote in high school and college. I have since disavowed many of the views expressed on this site, but I'm keeping it around out of fondness.
