Operator Overload! Learn how to change the behavior of equality operators.

By: Mark Mruss

Note: This article was first published in the November 2007 issue of Python Magazine.

While the equality operator works great on numbers and strings, the way it treats your custom objects is not very useful by default. This article looks into overloading the equality operator so that you can easily compare instances of your custom classes.

In my experience as a professional programmer, testing for equality between two instances of a class is a fairly common task. In other words, you are comparing the data that each instance contains and checking whether the data in one instance is identical to the data in the other.

One of the nice features of Python is that it has a default equality operator defined for any custom objects that you create. The unfortunate thing about this default equality operator is that it doesn’t provide the functionality that you expect. This is because the equality operator (==) actually performs an identity comparison, rather than an equivalence test. If you were to run the following code:

[code lang=”python”]
if object_one == object_two:
    print("Equal")
[/code]

By default, Python actually checks whether object_one is object_two (the same comparison that can be made using the is keyword) instead of determining whether object_one is equivalent to object_two. Fortunately for us, overloading the default equality operator in Python is a relatively easy task. There are, however, some “gotchas” and other interesting features of which one should be aware.

An operator can be difficult to define, and like many programming definitions, sometimes the definition only serves to confuse the matter further. In general though, you can think of operators as being very similar to the operators that you encountered in Math class, such as the + operator, the - operator, and so forth.

In programming languages we generally encounter binary operators. This means that each operator takes two operands. An operand serves as input to an operator. For example, in the statement:

[code lang=”python”]
2 + 6
[/code]

+ is a binary operator that takes two operands, 2 and 6 as inputs. Similarly, in this statement:

[code lang=”python”]
my_value - 6
[/code]

- is an operator that takes two operands, my_value and 6, as inputs.

Operator overloading is a programming term that means replacing the default behaviour of an operator with your own implementation for a given type of object. An example of this (although something that you should never do) would be to overload the + operator to actually perform subtraction when it is applied to your class.

Now that the definitions are out of the way, let’s look at an example where one might want to overload the equality operator. For this example I will bring back a favourite example from my Computer Science days: the Student class:
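The listing that followed here did not survive in this copy of the article. A minimal sketch of what such a class and comparison might look like, assuming a Student that stores only a name and a student_number (the two data members used later in the article); the sample values are made up:

```python
class Student(object):
    """A simple record type with no comparison methods defined."""

    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number


# Two separate objects holding identical data (values are illustrative).
mark = Student("Mark", 1001)
mark_two = Student("Mark", 1001)

# No __eq__ is defined, so == falls back to the default identity comparison.
if mark == mark_two:
    result = "Equal"
else:
    result = "Not Equal"
print(result)  # prints "Not Equal"
```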

Here, as in the previous example, “Not Equal” will be printed out. This is because, as mentioned earlier, the default implementation of the equality operator is to perform an identity comparison. In other words, the default equality operator asks, is mark the same object as mark_two? In Python the equality comparison depends on the type of objects being compared. For custom classes that you or I will create, the equality comparison will perform an identity comparison by comparing the object’s internal id. In other words, it will only result in True if the objects being compared actually are each other. For example:
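A short sketch of that identity case, assuming the same hypothetical Student class (repeated here so the snippet runs on its own):

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number


mark = Student("Mark", 1001)
mark_two = mark  # a second name for the same object, not a copy

same_object = (mark == mark_two)  # default == compares identity
print(same_object)                # prints True
print(id(mark) == id(mark_two))   # prints True: both names share one id
```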

Note: The equality comparison for built-in objects and types like numbers, strings, lists, tuples, and mappings behaves differently. Numbers are compared arithmetically. Strings are compared using the numerical values of their characters. The comparison of lists and tuples is simply a comparison of their corresponding inner values, while mappings compare equal when their sorted (key, value) lists compare equal.[2]
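To make the note concrete, here is a small sketch of value-based equality on built-in types. The variable names are only for illustration:

```python
# Numbers are compared by value, even across numeric types.
numbers_equal = (1 == 1.0)

# Strings compare equal when they contain the same characters.
strings_equal = ("py" + "thon" == "python")

# Lists and tuples compare element by element, so two distinct
# list objects with the same contents are equal.
first = [1, 2, 3]
second = [1, 2, 3]
lists_equal = (first == second)
lists_identical = (first is second)
tuples_equal = ((1, "a") == (1, "a"))

# Dictionaries compare their key/value pairs; the order in which
# the pairs were written does not matter.
dicts_equal = ({"a": 1, "b": 2} == {"b": 2, "a": 1})

print(numbers_equal, strings_equal, lists_equal, tuples_equal, dicts_equal)
print(lists_identical)
```

Every equality check above evaluates to True, while lists_identical is False because first and second are two separate objects.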

What we would like is for a comparison of two separate Student objects that hold the same data to result in “Equal” being printed out, i.e. a true equality comparison as opposed to an identity comparison. In order to do this we need to change the default functionality of the equality operator. In other words, we need to overload it.

In general, operator overloading in Python means adding a special function to your class that will perform the function of the operator it is meant to represent. There are two ways in which one can overload the equality operator in Python: 1) the first method is to use the __eq__ function, a so-called “rich comparison” function. “Rich comparison” functions are functions that overload specific comparison operators (i.e. __eq__ to overload ==). 2) The second is to use the __cmp__ function, which is used to overload all comparison operators if no “rich comparison” functions are present.

Since __cmp__ is used to override all comparison operators (==, !=, <, <=, >, >=), I would suggest using the “rich comparison” method unless you are using a version of Python that is earlier than version 2.1, or you are convinced that you know what <= means to our Student class. Note also that __cmp__ exists only in Python 2; it was removed in Python 3. Let’s forget about the __cmp__ function for now and focus on using the “rich comparison” functions to overload the equality operator.

“Rich comparison” functions can return any value, but you should try to return a value that is, or can be, interpreted as a boolean value. This is important because these functions will often be used in situations where the return value will be used in a boolean comparison.

When using the “rich comparison” functions it is important to know which functions are being called internally. For example, when we run:

[code lang=”python”]
student_one == student_two
[/code]

If __eq__ exists in the Student class, the following is actually being called:

[code lang=”python”]
student_one.__eq__(student_two)
[/code]

When we run:

[code lang=”python”]
student_two == student_one
[/code]

The following is actually called:

[code lang=”python”]
student_two.__eq__(student_one)
[/code]

As you can see, it is the operand on the left-hand side whose __eq__ function will be called first. It is important to note that the right-hand operand’s __eq__ function only comes into play as a fallback: if the left-hand operand does not define __eq__, or its __eq__ returns NotImplemented (discussed below), Python tries the right-hand operand’s __eq__ as the reflected operation.
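One way to watch this happen is to record which object’s __eq__ runs. The sketch below assumes a hypothetical Student class whose __eq__ appends the name of self to a list before comparing; the calls list and the sample values exist only for this illustration:

```python
calls = []  # records whose __eq__ ran, in order


class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        calls.append(self.name)  # note which operand's method was invoked
        return (self.name == other.name and
                self.student_number == other.student_number)


student_one = Student("One", 1)
student_two = Student("Two", 2)

student_one == student_two  # runs student_one.__eq__(student_two)
student_two == student_one  # runs student_two.__eq__(student_one)
print(calls)  # prints ['One', 'Two']
```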

Let’s start off with a simple, but incorrect, example (the reasons for its incorrectness will be explained below):
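The original listing is missing from this copy. A sketch of what a naive overload might look like, assuming the Student class stores name and student_number, and assuming a hypothetical Professor class that has a name but no student_number:

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        # Incorrect: blindly assumes other has both attributes.
        return (self.name, self.student_number) == (
            other.name, other.student_number)


class Professor(object):
    """Hypothetical class with a name but no student_number."""

    def __init__(self, name):
        self.name = name


mark = Student("Mark", 1001)
mark_two = Student("Mark", 1001)
print(mark == mark_two)  # prints True: the data now decides equality

failure = None
try:
    mark == Professor("Mark")
except AttributeError as error:
    failure = str(error)
    print(failure)  # complains that Professor has no student_number
```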

The way we are overriding the equality operator is not correct because it automatically assumes that the other object has the name and student_number data members. There are a number of methods to get around this problem, including: 1) using the hasattr function, or 2) using the isinstance function. The hasattr function simply tells us whether an object has a specific attribute or not, so we can use it to check that other has the attributes we are looking for before actually querying them. Here is a quick example illustrating how to do this:
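The original listing is missing here as well. A minimal sketch of the hasattr approach, again assuming a hypothetical Professor class that only has a name:

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        # Only query the attributes once we know other has them.
        if hasattr(other, "name") and hasattr(other, "student_number"):
            return (self.name == other.name and
                    self.student_number == other.student_number)
        return False


class Professor(object):
    """Hypothetical class with a name but no student_number."""

    def __init__(self, name):
        self.name = name


student = Student("Guido", 1001)
professor = Professor("Guido")
print(student == professor)                # prints False
print(student == Student("Guido", 1001))   # prints True
```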

First, we check to see if other has the name and student_number attributes. If it does, we proceed as normal. If it does not, we simply return False. When we compare the professor and the student we get “False” as expected.

What’s nice about this method is that we don’t have to care what type other is. We only care whether or not it contains the attributes we need to compare. However, the drawback to this approach is that you have to test for the existence of each attribute. Although this may not always be a big deal, if you are dealing with fifty data members in your classes this can quickly become a pain in the neck.

Another solution to the problem with our first overloading example is to use the isinstance function to make sure that other is an instance of our class type. This has the drawback of forcing other to be the same type as your class. In practice however, I believe this to be more of an advantage than a disadvantage.
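A sketch of the isinstance approach, under the same assumptions as before (a Student with name and student_number, and a hypothetical Professor class used only for the comparison):

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        # Compare data only when other really is a Student.
        if isinstance(other, Student):
            return (self.name == other.name and
                    self.student_number == other.student_number)
        return False


class Professor(object):
    """Hypothetical class that happens to share both attribute names."""

    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number


student = Student("Guido", 1001)
professor = Professor("Guido", 1001)
print(student == professor)                # prints False: wrong type
print(student == Student("Guido", 1001))   # prints True
```

Note that the Professor here has identical data yet still compares unequal, which is exactly the type restriction described above.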

The first thing we do is check the variable other to make sure that it is an instance of the Student class. If it is, we then compare all of the data members in the Student class. If other is not an instance of the Student class, we return False.

In my opinion, this is the preferred method since knowing that the class is the correct type is often important. The hasattr method seems more appropriate for simple data containers like a “rect” or “vector” class where you are only interested in three or four data members.

Up until this point in time we have been returning False when our __eq__ function does not support the type of object passed in as other. While this is acceptable and correct given the Python documentation, it seems to be “proper” to actually return NotImplemented. According to the Python documentation, “Numeric methods and rich comparison methods may return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.)” [4] In other words, if the left operand returns NotImplemented, Python will attempt to use the right-hand operand’s equality operator. And if that does not exist, Python will fall back to the default equality operator.

We can return NotImplemented from our Student class if the operand passed in is not an instance of the Student class:
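The original listing is missing; the sketch below shows one way this might look. The names guido and rob follow the article, while the Professor class body and the sample values are assumptions:

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if not isinstance(other, Student):
            # We do not know how to compare against this type.
            return NotImplemented
        return (self.name == other.name and
                self.student_number == other.student_number)


class Professor(object):
    """Hypothetical class that does not overload ==."""

    def __init__(self, name):
        self.name = name


guido = Student("Guido", 1001)
rob = Professor("Rob")

direct_result = guido.__eq__(rob)
print(direct_result is NotImplemented)  # prints True
print(guido == rob)                     # prints False, after the fallback
```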

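With this version, comparing a Student (guido) with a Professor (rob) using guido == rob calls guido.__eq__(rob), which returns NotImplemented. As a result, the reflected operation is attempted: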

[code lang=”python”]
rob == guido
[/code]

Because the Professor class does not have the equality operator overloaded, the default operation is executed and False is printed out just like we wanted.

NotImplemented is useful because instead of returning False, which means that the two operands are not equivalent, you return a value that says that the comparison between the operands has not been implemented.

Now that we know how to overload the equality operator, it stands to reason that we have the opposite operation, the inequality operator (!=), covered as well. But not so fast. In Python 2, the version this article was written for, the inequality and equality operators are handled separately, meaning that inequality is not simply the opposite of equality. (Python 3 changed this: by default, != returns the inverse of __eq__ unless __eq__ returns NotImplemented.) This means that whenever you overload the equality operator in Python 2, you have to be sure to overload the inequality operator as well. If you don’t, you might get some strange results. For example, when we use the current code (without the inequality operator overloaded), the following:

In the first comparison the overloaded equality operator is used, and results in True being printed. Because the inequality operator is not overloaded, the second comparison uses the default inequality operator (the identity comparison), and under Python 2 True is printed because guido and guido_too are not the same instance.
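The two comparisons in question are missing from this copy. A sketch, assuming two Student objects named guido and guido_too that hold the same made-up data. Be aware that the surprising result described in the text is Python 2 behaviour; Python 3 derives != from __eq__ when __ne__ is absent, so there the second comparison gives False:

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return (self.name == other.name and
                self.student_number == other.student_number)

    # Deliberately no __ne__ here.


guido = Student("Guido", 1001)
guido_too = Student("Guido", 1001)

equal = (guido == guido_too)      # True: our __eq__ compares the data
not_equal = (guido != guido_too)  # Python 2: True (default, identity-based)
                                  # Python 3: False (inverse of __eq__)
print(equal)
print(not_equal)
```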

Thankfully once you have overloaded the equality operator, overloading the inequality operator is very easy. As a general rule, you have to return the opposite of the equality operator, but because we are working with NotImplemented, we have to do a bit more processing to ensure that we don’t return False when we really want to return NotImplemented. Here is how we can overload the inequality operator in the Student class:
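The original listing is missing; the sketch below shows one way to write it, using the equal_result name from the explanation that follows. The class layout and sample values are assumptions:

```python
class Student(object):
    def __init__(self, name, student_number):
        self.name = name
        self.student_number = student_number

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return (self.name == other.name and
                self.student_number == other.student_number)

    def __ne__(self, other):
        equal_result = self.__eq__(other)
        if equal_result is NotImplemented:
            # Pass "not implemented" along instead of negating it.
            return NotImplemented
        return not equal_result


guido = Student("Guido", 1001)
guido_too = Student("Guido", 1001)

print(guido == guido_too)  # prints True
print(guido != guido_too)  # prints False
print(guido != "Guido")    # prints True: falls back to the default
```

Negating NotImplemented directly would hide the "not implemented" signal from Python, which is why the check comes first.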

First, we call self.__eq__ to test whether or not we are equal to other. We then check to make sure that equal_result is not NotImplemented. If it is not, we know that the equality test was implemented and we can safely return its opposite. If the result for the equality comparison was NotImplemented, we return NotImplemented for the inequality comparison.

Note: It is safe to use the is check on NotImplemented (rather than an isinstance check) because NotImplemented is a singleton, meaning that there is only ever one instance of NotImplemented at any time.

While it may seem like operator overloading should become part of every class that you write, a word of warning is necessary. There is a large school of thought that views operator overloading as a dangerous programming technique. They argue that overloading operators changes the default way that an operator works, and not always correctly. Moreover, instead of overriding the equality operator, one can simply add an is_equal_to function to perform the equality check.

The logic behind this criticism is that when someone is using a class or reading some code that you wrote, they will be unable to tell what the equality operator is doing. For example, if they see:
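The snippet in question is missing from this copy. A sketch of the kind of code a reader might come across, where MyClass is a hypothetical class; its definition is included only so the snippet runs, so imagine it living in another module that the reader has not opened:

```python
class MyClass(object):
    """Hypothetical class defined elsewhere; this version does not
    overload the equality operator."""

    def __init__(self, value):
        self.value = value


first = MyClass(10)
second = MyClass(10)

comparison = (first == second)
print(comparison)  # True or False? It depends on how MyClass is written.
```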

What gets printed out? True or False? If “MyClass” overrode the equality operator then True will be printed. However, if the equality operator is not overloaded, the standard Python behaviour of equality will result in False being printed out.

While it’s true that overloading the equality operator does change the default way that Python functions, I feel that it’s generally a safe and beneficial addition to your classes, especially since, unless people know the ins and outs of the equality operator, they will generally assume that it should work the way it does when you overload it. Like all the decisions that you make when working with Python, context is key.