Note: the Vietnamese version of this article is available at the link below.
https://duongnt.com/mixin-python-vie
Unlike some other languages such as C# or Java, Python supports multiple inheritance of classes. This enables some interesting techniques, one of which is the mixin. Mixins are small classes whose purpose is to extend the functionality of other classes. They are not meant to work by themselves, and we never instantiate them directly.
Our sample classes
Let’s say we have a simple Person class, which has two attributes: name and age.
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
By default, when we compare two objects of type Person, Python compares their references.
p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # False
If we want to compare them by attributes, we have to override the __eq__ method.
class Person:
    # __init__ omitted
    def __eq__(self, o):
        if not isinstance(o, Person):
            return False
        return self.__dict__ == o.__dict__
p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # True
Now, let’s say we have another class called Country, and we also want to compare its objects by attributes.
class Country:
    def __init__(self, name, area):
        self._name = name
        self._area = area

    def __eq__(self, o):
        if not isinstance(o, Country):
            return False
        return self.__dict__ == o.__dict__
Perhaps you can already see the problem: the __eq__ method is duplicated. If we ever want to change its implementation, we will have to modify every affected class.
Mixin to the rescue
Instead of defining the __eq__ method inside each class, we will move it into an auxiliary class. Let’s call it CompareByAttributeMixin.
class CompareByAttributeMixin:
    def __eq__(self, o):
        # type(self) is the concrete class that uses the mixin
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__
Then we can add this mixin to both Person and Country, and comparison by attributes will work as expected.
class Person(CompareByAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age

class Country(CompareByAttributeMixin):
    def __init__(self, name, area):
        self._name = name
        self._area = area
p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # True
c1 = Country('Vietnam', 331700)
c2 = Country('Vietnam', 331700)
print(c1 == c2) # True
Small classes like CompareByAttributeMixin, which exist only to extend the functionality of other classes, are called mixins.
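As a small illustration of what the type check inside the mixin buys us, the sketch below attaches the same mixin to two hypothetical classes, City and Town (these names are not part of the article's example). Objects of the same class with equal attributes compare equal, while objects of different classes do not, even though their attribute dictionaries are identical.

```python
class CompareByAttributeMixin:
    def __eq__(self, o):
        # Only objects of the same concrete class (or a subclass) can be equal
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__


# City and Town are hypothetical classes used only for this illustration
class City(CompareByAttributeMixin):
    def __init__(self, name):
        self._name = name


class Town(CompareByAttributeMixin):
    def __init__(self, name):
        self._name = name


print(City('Hue') == City('Hue'))  # True
print(City('Hue') == Town('Hue'))  # False
```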
Adding multiple mixins to the same class
If you try to calculate the hash of a Person or Country object, you will encounter an error.
print(hash(p1))
This raises a TypeError.
Exception has occurred: TypeError
unhashable type: 'Person'
This is because a class that overrides __eq__ without also defining __hash__ has its __hash__ implicitly set to None, as documented in the Python data model reference. Hashing is therefore disabled for that class.
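This rule can be checked in isolation. The sketch below uses two hypothetical classes, Plain and WithEq, which are not part of the article's example: the first keeps the default hash inherited from object, while the second loses it as soon as it defines __eq__.

```python
class Plain:
    pass


class WithEq:
    def __eq__(self, other):
        return isinstance(other, WithEq)


# Plain inherits the default implementation from object
print(Plain.__hash__ is object.__hash__)  # True

# Defining __eq__ without __hash__ sets __hash__ to None
print(WithEq.__hash__)  # None

try:
    hash(WithEq())
except TypeError as exc:
    # The message names the unhashable class
    print(exc)
```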
To enable hashing of Person objects, we have to override the __hash__ method and calculate the hash of an object based on its attributes. Let’s create another mixin called HashFromAttributeMixin.
class HashFromAttributeMixin:
    """
    Assuming that all attributes are hashable
    """
    def __hash__(self):
        return sum(hash(v) for v in self.__dict__.values())
We can now add the new mixin to both Person and Country.
class Person(HashFromAttributeMixin, CompareByAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age

class Country(HashFromAttributeMixin, CompareByAttributeMixin):
    def __init__(self, name, area):
        self._name = name
        self._area = area
Hashing of Person and Country objects will now work.
print(hash(p1) == hash(p2)) # True
print(hash(c1) == hash(c2)) # True
Note: Generally, it is a bad idea to enable hashing based on mutable attributes. If an attribute changes after the object has been stored in a hash-based collection, the object's hash value changes too, and the object ends up in the wrong hash bucket. In particular, if we use an object of type Person or Country as a key in a dictionary and its hash changes, we can no longer retrieve the corresponding value. Please consider the code above as a demonstration only.
Pay attention to the order of mixins
Keen-eyed readers may have noticed that we put HashFromAttributeMixin before CompareByAttributeMixin in the list of base classes. What happens if we reverse this order?
class PersonUnhashable(CompareByAttributeMixin, HashFromAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age
p_unhashable = PersonUnhashable('John', 32)
print(hash(p_unhashable))
The snippet above will raise the familiar TypeError.
Exception has occurred: TypeError
unhashable type: 'PersonUnhashable'
Checking the __hash__ attribute of PersonUnhashable confirms that it is None.
print(PersonUnhashable.__hash__) # None
By contrast, the __hash__ method of Person is inherited from HashFromAttributeMixin, as expected.
print(Person.__hash__) # <function HashFromAttributeMixin.__hash__ at 0x000002551968B160>
To understand why, we need to know how Python decides which method to use in case of multiple inheritance.
Multiple inheritance and the MRO
One of the main problems with multiple inheritance is deciding which method to call when several parent classes define the same method. Python solves this with the class’s MRO (method resolution order), which is an ordered list that starts with the class itself and continues through its parent classes. Whenever we look up a method on a class, the interpreter first checks whether that class defines the method in question. If it does not, the interpreter walks along the MRO until it finds the first parent that defines it.
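A minimal sketch of this lookup rule, using hypothetical classes A, B, AB and BA that are not part of the article's example: both parents define greet, and the one that comes first in the MRO wins.

```python
class A:
    def greet(self):
        return 'A'


class B:
    def greet(self):
        return 'B'


class AB(A, B):
    pass


class BA(B, A):
    pass


# Neither AB nor BA defines greet, so the first parent in the MRO is used
print(AB().greet())  # A
print(BA().greet())  # B

print([cls.__name__ for cls in AB.__mro__])  # ['AB', 'A', 'B', 'object']
print([cls.__name__ for cls in BA.__mro__])  # ['BA', 'B', 'A', 'object']
```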
Let’s check the MRO of Person and PersonUnhashable.
print(Person.__mro__)
# (<class '__main__.Person'>, <class '__main__.HashFromAttributeMixin'>, <class '__main__.CompareByAttributeMixin'>, <class 'object'>)
print(PersonUnhashable.__mro__)
# (<class '__main__.PersonUnhashable'>, <class '__main__.CompareByAttributeMixin'>, <class '__main__.HashFromAttributeMixin'>, <class 'object'>)
In the example above, the first class in PersonUnhashable's MRO that defines __hash__ is CompareByAttributeMixin. Because that class overrides __eq__ without defining __hash__, CompareByAttributeMixin.__hash__ has been implicitly set to None. The lookup stops there and never reaches HashFromAttributeMixin, which means we cannot calculate the hash of a PersonUnhashable object.
On the other hand, the first class in Person's MRO that defines __hash__ is HashFromAttributeMixin. Because of that, we can calculate the hash of a Person object using its attributes.
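The walk described in the two paragraphs above can be approximated by hand. The helper first_definer below is hypothetical and only a rough model of attribute lookup on a class: it scans the MRO and returns the first class whose own namespace contains the given name. The mixins are repeated, and the two person classes are reduced to empty bodies, so that the snippet runs on its own.

```python
class CompareByAttributeMixin:
    def __eq__(self, o):
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__


class HashFromAttributeMixin:
    def __hash__(self):
        return sum(hash(v) for v in self.__dict__.values())


class Person(HashFromAttributeMixin, CompareByAttributeMixin):
    pass


class PersonUnhashable(CompareByAttributeMixin, HashFromAttributeMixin):
    pass


def first_definer(cls, name):
    """Return the first class in cls's MRO whose own namespace has `name`."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


print(first_definer(Person, '__hash__').__name__)
# HashFromAttributeMixin
print(first_definer(PersonUnhashable, '__hash__').__name__)
# CompareByAttributeMixin

# The entry found for PersonUnhashable is the implicit None
print(vars(CompareByAttributeMixin)['__hash__'])  # None
```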
As we’ve seen above, the MRO depends not only on the inheritance graph but also on the order in which parent classes are listed in a class declaration. Behind the scenes, the interpreter uses the C3 linearization algorithm to build the MRO. The algorithm itself is not very hard to follow, but I doubt we will ever need to calculate the MRO of a class by hand.
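One visible consequence of C3 linearization is that Python refuses to create a class when the parent orderings contradict each other. The sketch below uses hypothetical classes A, B, AB, BA and Conflict, none of which are part of the article's example: AB requires A before B, BA requires B before A, so no consistent MRO exists for a class inheriting from both.

```python
class A:
    pass


class B:
    pass


class AB(A, B):  # requires A to come before B
    pass


class BA(B, A):  # requires B to come before A
    pass


conflict_raised = False
try:
    # No ordering of A and B can satisfy both AB and BA
    class Conflict(AB, BA):
        pass
except TypeError as exc:
    conflict_raised = True
    print(exc)

print(conflict_raised)  # True
```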
Conclusion
Mixins are a useful way to take advantage of the power of multiple inheritance. By understanding how the MRO is determined in Python, we can make sure that combining multiple mixins does not produce any unexpected results.