Python Gotchas: Class Attributes versus Instance Members

Links of the Week will be delayed until tomorrow or Friday.

Today, I’d like to share with you a real treat. It happened well into the evening last night when my coherent thought processes were lost in a semi-drunken stupor induced by lack of sleep (I don’t drink, but I do have slight insomnia sometimes). What is most ironic is that in five or so years of writing Python (off and on), I had never encountered this particular problem.


What I encountered was a peculiar aspect of class attributes not well-covered by Python tutorials or the Python documentation. It is, however, mentioned in Learning Python: 2nd Edition by Mark Lutz and David Ascher. On page 318 they explain: “All the statements inside the class statement run when the class statement itself runs (not when the class is later called to make an instance).” They then go on to state that variables assigned at the “top-level of a class statement will be shared by all instances” of that class (not duplicated for each instance individually). Because of this, they point out, it is important to think of Python attributes assigned at the top level of the class body (outside __init__) as something like the static class members you would encounter in C++ and Java.
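
To make the quoted point concrete, here is a minimal sketch (assuming Python 3; the names events and Tracked are made up for illustration and do not come from the book). It records when the class body runs versus when __init__ runs:

```python
# Hypothetical names (events, Tracked), chosen only for illustration.
events = []

class Tracked(object):
    # This statement runs once, while the class statement itself executes.
    events.append("class body ran")
    shared = []  # a single list object, stored on the class

    def __init__(self):
        # This runs once per instance, each time Tracked() is called.
        events.append("__init__ ran")

# The class body has already run, before any instance exists.
events_before = list(events)

a = Tracked()
b = Tracked()

print(events_before)         # ['class body ran']
print(events)                # ['class body ran', '__init__ ran', '__init__ ran']
print(a.shared is b.shared)  # True: both instances see the same list
```

The class body ran exactly once, so there is exactly one shared list, no matter how many instances are created.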

The reason I never encountered this before is most likely due in no small part to the fact that I have generally written Python classes like this, precisely as I learned when I was reading Learning Python:

class SomeClass(object):
    def __init__(self):
        self.somelist = []
        self.somedict = {}

Learning by example is great, isn’t it? There’s also the other option of writing them like this:

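class SomeClass(object):
    # Class attributes: created once, when the class statement runs.
    somelist = []
    somedict = {}

    def __init__(self):
        # Instance attributes: created fresh for each object; they hide
        # the class attributes of the same name.
        self.somelist = []
        self.somedict = {}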

Notice that in both cases, the variables are assigned in the __init__ method. In the latter case this works because the class attributes somelist and somedict are not overwritten but shadowed: __init__ creates instance members with the same names, and attribute lookup on an instance finds those first. Each instance therefore gets its own list and dictionary, while the class-level ones still exist (unused) on the class itself.
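
A small sketch of that shadowing, assuming Python 3 and using the built-in vars() to peek at each object's own attributes (only somelist is shown, to keep it short):

```python
class SomeClass(object):
    somelist = []           # class attribute, created once

    def __init__(self):
        self.somelist = []  # instance attribute, created per object

a = SomeClass()
b = SomeClass()
a.somelist.append(1)

print(vars(a))             # {'somelist': [1]}
print(vars(b))             # {'somelist': []}
print(SomeClass.somelist)  # []: the class-level list is untouched
```

The append went to a's own list. The class attribute is still there, but nothing reaches it through an instance anymore.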

Now, what happens if we change this around? Since I’ve been writing in a mix of languages recently (PHP and C#), I’ve grown sorely accustomed to including the instance members but not always initializing them in the constructor. It’s a bad habit, but for some reason, I felt it was appropriate shorthand for Python. Let’s take a look at what happens:

Python 2.5.4 (r254:67916, Jun  1 2009, 16:46:59)
[GCC 4.1.2 (Gentoo 4.1.2 p1.3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Example(object):
...     test = {}
...
>>> c = Example()
>>> c.test["offset"] = "Sure is a test!"
>>> c.test
{'offset': 'Sure is a test!'}
>>> x = Example()
>>> x.test
{'offset': 'Sure is a test!'}
>>>

Oops. x.test now displays the same thing that c.test does. If we were to modify the dictionary through x.test, those changes would also show up in c.test. Why? Because test is not an instance member, it is a class attribute. There is only one dictionary, it lives on the class, and every instance of the class sees it.
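
You can check that there really is only one dictionary. This is a minimal sketch, assuming Python 3, that repeats the session above as a script and uses is and vars() to inspect it:

```python
class Example(object):
    test = {}

c = Example()
x = Example()
c.test["offset"] = "Sure is a test!"

print(vars(c))                  # {}: c has no attributes of its own
print(c.test is x.test)         # True: same object
print(c.test is Example.test)   # True: and it belongs to the class
print("test" in vars(Example))  # True
```

Neither instance owns test. Both lookups fall through to the class and return the same object.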

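There is an apparent exception to this rule. When dealing with integers and strings, for example, the following works just fine in Python 2.5. It is not really the type that makes the difference, though. Writing c.n = 2 is an assignment, which creates a new instance attribute on c that hides the class attribute, whereas c.test["offset"] = ... modifies the one shared dictionary in place. Integers and strings are immutable, so assignment is the only way to "change" them, and you never see the sharing: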

>>> class Example(object):
...     n = 1
...
>>> c = Example()
>>> c.n
1
>>> c.n = 2
>>> x = Example()
>>> x.n
1
>>> c.n
2
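
Here is a rough sketch of that distinction, assuming Python 3. It shows that assigning through an instance behaves the same way for a dictionary as for an integer, and that deleting the instance attribute lets the class attribute show through again:

```python
class Example(object):
    n = 1
    test = {}

c = Example()
x = Example()

c.n = 2              # assignment: creates c's own 'n'
c.test = {"k": "v"}  # assignment works the same way for a dict

print(sorted(vars(c)))          # ['n', 'test']: both now live on c
print(x.n, x.test, Example.n)   # 1 {} 1: nothing shared was touched

del c.n     # remove c's own 'n' ...
print(c.n)  # 1: ... and lookup falls back to the class attribute
```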

For comparison, in Java, fields declared in the class body are per-instance by default and work as you would expect:

Example.java

public class Example
{
    public static void main (String args[])
    {
        ExampleInstance ex = new ExampleInstance();
        System.out.println("First instance string: " + ex.test);
        ex.test = "I have changed it now.";
        System.out.println("First instance string (changed): " + ex.test);
        ExampleInstance ex2 = new ExampleInstance();
        System.out.println("Second instance string: " + ex2.test);
    }
}

ExampleInstance.java

public class ExampleInstance
{
    public String test = "This is just a test";
}

And the output:

First instance string: This is just a test
First instance string (changed): I have changed it now.
Second instance string: This is just a test

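However, if we include the keyword static, the field is shared by all instances, which is close to what Python does with class attributes. (One difference: in Java, assigning to a static field through an instance changes the shared field, whereas in Python assigning through an instance creates a new instance attribute.)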

ExampleInstance.java

public class ExampleInstance
{
    public static String test = "This is just a test";
}

Which will output:

First instance string: This is just a test
First instance string (changed): I have changed it now.
Second instance string: I have changed it now.

Perhaps this rant exists because I find it rather strange that a language designed around the principle “There should be one-- and preferably only one --obvious way to do it” and in which nothing should be a surprise to the programmer would do something so unusual by default. Yes, I realize classes behave something like modules in the sense that they operate as a container object for their attributes. However, it is an unexpected behavior for those who write in and are exposed to other languages that don’t exhibit this same modus operandi.

Oh, and I should mention that while I was doing research for this post, I stumbled upon a really good, concise explanation of this language feature posted to the Python mailing list just last month. The irony.

So, here’s the catch. If you’re wanting OOP behavior that mimics most closely that of the C-derived/inspired languages (C++, Java, PHP), here’s what you can do. You can create the class attribute (overkill) and redefine it in the constructor:

Python 2.5.4 (r254:67916, Jun  1 2009, 16:46:59)
[GCC 4.1.2 (Gentoo 4.1.2 p1.3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Example(object):
...     test = {}
...     def __init__(self):
...             self.test = {}
...
>>> c = Example()
>>> c.test["offset"] = "This won't appear in the next one."
>>> c.test
{'offset': "This won't appear in the next one."}
>>> x = Example()
>>> x.test
{}
>>>

You can, preferably, limit the definition to the constructor alone:

Python 2.5.4 (r254:67916, Jun  1 2009, 16:46:59)
[GCC 4.1.2 (Gentoo 4.1.2 p1.3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Example(object):
...     def __init__(self):
...             self.test = {}
...
>>> c = Example()
>>> c.test["offset"] = "Same here."
>>> c.test
{'offset': 'Same here.'}
>>> x = Example()
>>> x.test
{}
>>>

Or you can use an instance like a struct and assign individual attributes to it as you see fit (not recommended for reasons related to duck typing, since clients wouldn’t be guaranteed that every instance has the attribute):

Python 2.5.4 (r254:67916, Jun  1 2009, 16:46:59)
[GCC 4.1.2 (Gentoo 4.1.2 p1.3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Example(object):
...     pass
...
>>> c = Example()
>>> c.test = {"offset": "This won't work."}
>>> c.test
{'offset': "This won't work."}
>>> x = Example()
>>> x.test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Example' object has no attribute 'test'
>>>

The second method of defining instance members in the constructor is your best bet because it’s less typing and you’re not hiding class attributes behind instance members of the same name. Of course, this is completely counter to everything you know if you’re coming from a background most familiar with C-derived languages!
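
As an aside, if you like declaring members at the top of the class, newer versions of Python offer a way to do that safely. The sketch below assumes Python 3.7 or later and the standard library's dataclasses module, neither of which exists in the Python 2.5 used for the sessions above:

```python
from dataclasses import dataclass, field

@dataclass
class Example:
    # default_factory is called once per instance by the generated
    # __init__, so every object gets its own fresh dictionary.
    test: dict = field(default_factory=dict)

c = Example()
x = Example()
c.test["offset"] = "Same here."

print(c.test)  # {'offset': 'Same here.'}
print(x.test)  # {}
```

Under the hood this still initializes the member in a constructor, it is just generated for you. Writing test: dict = {} directly is rejected by dataclass with a ValueError, precisely to prevent the shared-dictionary surprise described above.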

What this did teach me is that, sometimes, you don’t know everything you think you do. Surprises can crop up, even after years of use. In my case, the surprise was derived from design patterns I had been using in PHP and C# since you can define and set instance variables in the class itself. Don’t do this in Python. It won’t work.

I take that back. It will work–just not the way you expected.
