Class vs. Instance Namespaces

A namespace is a mapping from names to objects, with the property that there is zero relation between names in different namespaces. They’re usually implemented as Python dictionaries, although this is abstracted away.

Depending on the context, you may need to access a namespace using dot syntax (e.g., object.name_from_objects_namespace) or as a local variable (e.g., object_from_namespace). As a concrete example:
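A minimal sketch of the two access styles, assuming Python 3. The class `Config` and the function `read_timeout` are hypothetical names used only for illustration:

```python
class Config:
    # 'timeout' is a name in the namespace of the Config class.
    timeout = 30


def read_timeout():
    # This 'timeout' is an unrelated name in the function's local namespace.
    timeout = 5
    return timeout


# Dot syntax reaches into the class namespace.
via_dot = Config.timeout
# A bare name is resolved in the function's own local namespace.
via_local = read_timeout()
print(via_dot, via_local)
```

The two names are spelled the same but live in different namespaces, so neither affects the other.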

Python classes and instances of classes each have their own distinct namespaces, represented by the pre-defined attributes MyClass.__dict__ and instance_of_MyClass.__dict__, respectively.
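To see both namespaces side by side, here is a small sketch in Python 3. The body of `MyClass` (one class variable `class_var` and one instance variable `i_var`) is an assumption made for the example:

```python
class MyClass:
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var


foo = MyClass(2)

# The class namespace holds class_var (alongside __init__ and other entries).
print("class_var" in MyClass.__dict__)
# The instance namespace holds only what was assigned on the instance.
print(foo.__dict__)
```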

When you try to access an attribute from an instance of a class, Python first looks in the instance namespace. If it finds the attribute there, it returns the associated value. If not, it then looks in the class namespace and returns the attribute if it’s present, raising an AttributeError otherwise. For example:
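The lookup order can be sketched as follows, again assuming a `MyClass` with a class variable `class_var` and an instance variable `i_var` (the attribute name `missing` is hypothetical):

```python
class MyClass:
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var


foo = MyClass(2)

# Found directly in the instance namespace (foo.__dict__).
found_on_instance = foo.i_var
# Not in foo.__dict__, so the lookup falls back to MyClass.__dict__.
found_on_class = foo.class_var

# Present in neither namespace, so Python raises AttributeError.
try:
    foo.missing
    lookup_failed = False
except AttributeError:
    lookup_failed = True
```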

The instance namespace takes precedence over the class namespace: if there is an attribute with the same name in both, the instance namespace will be checked first and its value returned. Here’s a simplified version of the code for attribute lookup:
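What follows is a rough Python 3 sketch of that idea, not the interpreter's real algorithm. The function name `instance_lookup` is hypothetical, and the sketch deliberately ignores descriptors, `__getattr__`, and `__slots__`:

```python
def instance_lookup(inst, name):
    """Simplified lookup: instance namespace first, then the class hierarchy."""
    if name in inst.__dict__:
        return inst.__dict__[name]
    # Walk the class and its base classes in method resolution order.
    for cls in type(inst).__mro__:
        if name in cls.__dict__:
            return cls.__dict__[name]
    raise AttributeError(name)


class MyClass:
    class_var = 1


foo = MyClass()
bar = MyClass()
# Shadow the class attribute on one instance only.
bar.__dict__["class_var"] = 2

print(instance_lookup(foo, "class_var"), instance_lookup(bar, "class_var"))
```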

At the namespace level… we’re setting MyClass.__dict__['class_var'] = 2. (Note: this isn’t the exact code, which would be setattr(MyClass, 'class_var', 2), because __dict__ returns a dictproxy (called mappingproxy in Python 3), a read-only wrapper that prevents direct assignment, but it helps for demonstration’s sake.) Then, when we access foo.class_var, class_var has a new value in the class namespace and thus 2 is returned.
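A short sketch of this behavior, assuming Python 3 and a minimal `MyClass` with a single class variable:

```python
class MyClass:
    class_var = 1


foo = MyClass()
before = foo.class_var

# Rebinding through the class updates the class namespace.
MyClass.class_var = 2
after = foo.class_var

# Item assignment on MyClass.__dict__ is rejected, because the attribute
# is a read-only proxy rather than a plain dict.
try:
    MyClass.__dict__["class_var"] = 3
    direct_write_allowed = True
except TypeError:
    direct_write_allowed = False
```

Note that `foo.__dict__` stays empty throughout: the instance keeps finding `class_var` on the class.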

If a Python class variable is set through an instance, the new value applies only to that instance. The assignment creates an instance variable with the same name that shadows the class variable and is available, intuitively, only on that instance. For example:

At the namespace level… we’re adding the class_var attribute to foo.__dict__, so when we look up foo.class_var, we get 2. Meanwhile, other instances of MyClass will not have class_var in their instance namespaces, so they continue to find class_var in MyClass.__dict__ and thus return 1.
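The shadowing described above can be sketched like this, assuming Python 3 and a second, hypothetical instance named `bar`:

```python
class MyClass:
    class_var = 1


foo = MyClass()
bar = MyClass()

# Assigning through the instance writes to foo.__dict__ only.
foo.class_var = 2

print(foo.class_var, bar.class_var, MyClass.class_var)
print(foo.__dict__, bar.__dict__)
```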

Mutability

Quiz question: What if your class attribute has a mutable type? You can manipulate (mutilate?) the class attribute by accessing it through a particular instance and, in turn, end up manipulating the referenced object that all instances are accessing (as pointed out by Timothy Wiseman).

This is best demonstrated by example. Let’s go back to the Service I defined earlier and see how my use of a class variable could have led to problems down the road.

My goal was to have the empty list ([]) as the default value for data, and for each instance of Service to have its own data that would be altered over time on an instance-by-instance basis. But in this case, we get the following behavior (recall that Service takes some argument other_data, which is arbitrary in this example):
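A sketch of the problem, assuming a `Service` class shaped roughly as described (a class-level `data = []` and an arbitrary `other_data` argument). The exact definition is an assumption, since it is not shown in this section:

```python
class Service:
    data = []  # one list object, stored in the class namespace

    def __init__(self, other_data):
        self.other_data = other_data


s1 = Service(["a", "b"])
s2 = Service(["c", "d"])

# append() mutates the shared list; it does not create s1.__dict__['data'].
s1.data.append(1)

print(s1.data, s2.data, Service.data)
```

All three names refer to the same list object, so the change made through `s1` is visible everywhere.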

In this case, we’re adding s1.__dict__['data'] = [1], so the original Service.__dict__['data'] remains unchanged.
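By contrast, plain assignment through the instance behaves as sketched here, under the same assumed `Service` definition:

```python
class Service:
    data = []  # shared default in the class namespace

    def __init__(self, other_data):
        self.other_data = other_data


s1 = Service(["a", "b"])
s2 = Service(["c", "d"])

# Assignment (rather than mutation) creates a new entry in s1.__dict__.
s1.data = [1]

print(s1.data, s2.data, Service.data)
```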

Unfortunately, this requires that Service users have intimate knowledge of its variables, and is certainly prone to mistakes. In a sense, we’d be addressing the symptoms rather than the cause. We’d prefer something that was correct by construction.

My personal solution: if you’re just using a class variable to assign a default value to a would-be Python instance variable, don’t use mutable values. In this case, every instance of Service was going to override Service.data with its own instance attribute eventually, so using an empty list as the default led to a tiny bug that was easily overlooked. Instead of the above, we could’ve either:

Stuck to instance attributes entirely, as demonstrated in the introduction.

This only makes sense if you will want your typical instance of MyClass to hold just 10 elements or fewer—if you’re giving all of your instances different limits, then limit should be an instance variable. (Remember, though: take care when using mutable values as your defaults.)

Tracking all data across all instances of a given class. This is sort of specific, but I could see a scenario in which you might want to access a piece of data related to every existing instance of a given class. To make the scenario more concrete, let’s say we have a Person class, and every person has a name. We want to keep track of all the names that have been used. One approach might be to iterate over the garbage collector’s list of objects, but it’s simpler to use class variables. Note that, in this case, names will only be accessed as a class variable, so the mutable default is acceptable.
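One way the Person scenario might look, as a sketch in Python 3. The attribute name `all_names` and the sample names are assumptions:

```python
class Person:
    all_names = []  # shared on purpose: one list for the whole class

    def __init__(self, name):
        self.name = name
        # Always go through the class so the shared list is never shadowed.
        Person.all_names.append(name)


joe = Person("Joe")
bob = Person("Bob")
print(Person.all_names)
```

Appending via `Person.all_names` rather than `self.all_names` makes the intent explicit: the list belongs to the class, not to any one person.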

Under the Hood

Note: If you’re worrying about performance at this level, you might not want to be using Python in the first place, as the differences will be on the order of tenths of a millisecond. Still, it’s fun to poke around a bit, and it helps for illustration’s sake.

Recall that a class’s namespace is created and filled in at the time of the class’s definition. That means that we do just one assignment—ever—for a given class variable, while instance variables must be assigned every time a new instance is created. Let’s take an example.
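A sketch of such an example in Python 3. The helper functions `class_default` and `instance_default` and the `calls` list are hypothetical, added only to record when each assignment runs:

```python
calls = []  # records when each assignment actually runs


def class_default():
    calls.append("class")
    return 2


def instance_default():
    calls.append("instance")
    return 2


class Bar:
    y = class_default()  # runs once, while the class body executes

    def __init__(self, x):
        self.x = x


class Foo:
    def __init__(self, x):
        self.y = instance_default()  # runs on every instantiation
        self.x = x


bars = [Bar(1), Bar(2)]
foos = [Foo(1), Foo(2)]
print(calls)
```

Creating two `Bar` instances triggers no further calls, while each `Foo` instance adds one.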

When we look at the byte code, it’s again obvious that Foo.__init__ has to do two assignments, while Bar.__init__ does just one.
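The bytecode claim can be checked with the standard `dis` module. This sketch assumes CPython 3, where attribute assignment compiles to a `STORE_ATTR` instruction (an implementation detail), and uses simplified `Foo` and `Bar` definitions. The helper `count_attribute_stores` is hypothetical:

```python
import dis


class Bar:
    y = 2

    def __init__(self, x):
        self.x = x


class Foo:
    def __init__(self, x):
        self.y = 2
        self.x = x


def count_attribute_stores(func):
    """Count STORE_ATTR instructions in a function's bytecode."""
    return sum(
        1 for ins in dis.get_instructions(func) if ins.opname == "STORE_ATTR"
    )


foo_stores = count_attribute_stores(Foo.__init__)
bar_stores = count_attribute_stores(Bar.__init__)
print(foo_stores, bar_stores)
```

Calling `dis.dis(Foo.__init__)` prints the full listing if you want to read the instructions directly.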

In practice, what does this gain really look like? I’ll be the first to admit that timing tests are highly dependent on often uncontrollable factors and the differences between them are often hard to explain accurately.

However, I think these small snippets (run with the Python timeit module) help to illustrate the differences between class and instance variables, so I’ve included them anyway.

Note: I’m on a MacBook Pro with OS X 10.8.5 and Python 2.7.2.

Initialization

10000000 calls to `Bar(2)`: 4.940s
10000000 calls to `Foo(2)`: 6.043s

Initializing Bar is faster by over a second across 10,000,000 calls (roughly 18%), so the difference appears to be more than noise, although a single run can’t establish statistical significance.

So why is this the case? One speculative explanation: we do two assignments in Foo.__init__, but just one in Bar.__init__.
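A comparable measurement could be set up as sketched below, assuming Python 3 and simplified `Foo` and `Bar` definitions. The call count is reduced so the sketch finishes quickly, and the absolute numbers will differ from machine to machine:

```python
import timeit

setup = """
class Bar:
    y = 2
    def __init__(self, x):
        self.x = x

class Foo:
    def __init__(self, x):
        self.y = 2
        self.x = x
"""

number = 100_000  # far fewer calls than the 10,000,000 quoted above
bar_seconds = timeit.timeit("Bar(2)", setup=setup, number=number)
foo_seconds = timeit.timeit("Foo(2)", setup=setup, number=number)

print(f"{number} calls to Bar(2): {bar_seconds:.3f}s")
print(f"{number} calls to Foo(2): {foo_seconds:.3f}s")
```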

Assignment

Note: timeit runs its setup code only once per timing run, not before every execution of the statement, so we have to reinitialize our variable inside the timed statement itself. The second line of times represents the above times with the previously calculated initialization times deducted.

From the above, it looks like Foo only takes about 60% as long as Bar to handle assignments.

Why is this the case? One speculative explanation: when we assign to Bar(2).y, we first look in the instance namespace (Bar(2).__dict__['y']), fail to find y, and then look in the class namespace (Bar.__dict__['y']) before making the proper assignment. When we assign to Foo(2).y, we do half as many lookups, as we immediately assign to the instance namespace (Foo(2).__dict__['y']).
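The assignment timings discussed above could be reproduced along these lines. This is a sketch under assumptions: simplified `Foo` and `Bar` definitions, an arbitrary assigned value of 15, and a reduced call count. Because the instance is rebuilt inside the timed statement, the initialization time is measured separately and subtracted:

```python
import timeit

setup = """
class Bar:
    y = 2
    def __init__(self, x):
        self.x = x

class Foo:
    def __init__(self, x):
        self.y = 2
        self.x = x
"""

number = 100_000

# Each timed statement includes construction as well as the assignment.
bar_total = timeit.timeit("Bar(2).y = 15", setup=setup, number=number)
foo_total = timeit.timeit("Foo(2).y = 15", setup=setup, number=number)

# Construction alone, measured the same way.
bar_init = timeit.timeit("Bar(2)", setup=setup, number=number)
foo_init = timeit.timeit("Foo(2)", setup=setup, number=number)

# Rough assignment-only estimates; small values here are sensitive to noise.
bar_assign_only = bar_total - bar_init
foo_assign_only = foo_total - foo_init
print(f"Bar assignment estimate: {bar_assign_only:.4f}s")
print(f"Foo assignment estimate: {foo_assign_only:.4f}s")
```

The subtraction is crude, so repeated runs (for example with `timeit.repeat`) give a better sense of how stable the estimates are.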

In summary, though these performance gains won’t matter in reality, these tests are interesting at the conceptual level. If anything, I hope these differences help illustrate the mechanical distinctions between class and instance variables.

In Conclusion

Class attributes seem to be underused in Python; a lot of programmers have different impressions of how they work and why they might be helpful.

My take: Python class variables have their place within the school of good code. When used with care, they can simplify things and improve readability. But when carelessly thrown into a given class, they’re sure to trip you up.

Appendix: Private Instance Variables

One thing I wanted to include but didn’t have a natural entrance point…

Python doesn’t have private variables, so to speak, but another interesting relationship between class and instance naming comes with name mangling.

In the Python style guide, it’s said that pseudo-private variables should be prefixed with a double underscore: ‘__’. This is not only a sign to others that your variable is meant to be treated privately, but also a way to prevent access to it, of sorts. Here’s what I mean:

Look at that: the instance attribute __zap is automatically prefixed with the class name to yield _Bar__zap.

While still settable and gettable using a._Bar__zap, this name mangling is a means of creating a ‘private’ variable as it prevents you and others from accessing it by accident or through ignorance.
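A minimal sketch of the mangling behavior in Python 3, assuming a `Bar` class whose `__init__` sets `self.__zap = 1` (the value 1 is arbitrary):

```python
class Bar:
    def __init__(self):
        self.__zap = 1  # stored as _Bar__zap because of name mangling


a = Bar()

# Outside the class body, the unmangled name is not found on the instance.
try:
    a.__zap
    plain_access_works = True
except AttributeError:
    plain_access_works = False

# The mangled name is still reachable if you spell it out.
mangled_value = a._Bar__zap
print(a.__dict__)
```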

Edit: as Pedro Werneck kindly pointed out, this behavior is largely intended to help out with subclassing. In the PEP 8 style guide, it is seen as serving two purposes: (1) preventing subclasses from accessing certain attributes, and (2) preventing namespace clashes in these subclasses. While useful, name mangling shouldn’t be seen as an invitation to write code with an assumed public-private distinction, such as is present in Java.
