Note: the Vietnamese version of this article is available at the link below.

https://duongnt.com/mixin-python-vie


Unlike languages such as C# or Java, Python supports multiple inheritance of classes. This enables some interesting features, one of which is the mixin. Mixins are small classes whose purpose is to extend the functionality of other classes. They are not meant to work by themselves, and we never instantiate them directly.

Our sample classes

Let’s say we have a simple Person class, which consists of two attributes: name and age.

class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age

By default, when comparing two objects of type Person, we will compare their references.

p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # False

If we want to compare them by attributes, we have to override the __eq__ method.

class Person:
    # omitted

    def __eq__(self, o):
        if not isinstance(o, Person):
            return False
        return self.__dict__ == o.__dict__

p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # True

Now, let’s say we have another class called Country, and we also want to compare its objects by attributes.

class Country:
    def __init__(self, name, area):
        self._name = name
        self._area = area

    def __eq__(self, o):
        if not isinstance(o, Country):
            return False
        return self.__dict__ == o.__dict__

Perhaps you can already see the problem: the __eq__ method is duplicated. If we ever want to change its implementation, we will have to modify every affected class.

Mixin to the rescue

Instead of defining the __eq__ method inside each class, we will move it into an auxiliary class. Let’s call it CompareByAttributeMixin.

class CompareByAttributeMixin:
    def __eq__(self, o):
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__

Then we can add this mixin into both Person and Country, and comparison by attributes will work as we expected.

class Person(CompareByAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age

class Country(CompareByAttributeMixin):
    def __init__(self, name, area):
        self._name = name
        self._area = area

p1 = Person('John', 32)
p2 = Person('John', 32)
print(p1 == p2) # True

c1 = Country('Vietnam', 331700)
c2 = Country('Vietnam', 331700)
print(c1 == c2) # True

Small classes like CompareByAttributeMixin, which exist only to extend the functionality of other classes, are called mixins.

Adding multiple mixins to the same class

If you try to calculate the hash of a Person or Country object, you will encounter an error.

print(hash(p1))

This will throw.

Exception has occurred: TypeError
unhashable type: 'Person'

This is because a class that overrides __eq__ without also defining __hash__ has its __hash__ implicitly set to None, as documented in the description of object.__hash__ in the Python data model reference. Instances of such a class are unhashable.
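The rule can be checked directly. The sketch below uses two hypothetical classes, PlainPoint and EqOnlyPoint, which are not part of the example above; they exist only to contrast a class that inherits everything from object with one that overrides __eq__ alone.

```python
class PlainPoint:
    """Hypothetical class that overrides neither __eq__ nor __hash__."""

    def __init__(self, x):
        self.x = x


class EqOnlyPoint:
    """Hypothetical class that overrides __eq__ only."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, EqOnlyPoint) and self.x == other.x


# __hash__ is inherited from object, so hashing works (identity-based).
plain_hashable = callable(PlainPoint.__hash__)

# Defining __eq__ without __hash__ makes Python set __hash__ to None.
eq_only_hash = EqOnlyPoint.__hash__

try:
    hash(EqOnlyPoint(1))
    raised = False
except TypeError:
    raised = True

print(plain_hashable, eq_only_hash, raised)  # True None True
```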

To enable hashing of Person objects, we have to override the __hash__ method and calculate the hash of an object based on its attributes. Let’s create another mixin called HashFromAttributeMixin.

class HashFromAttributeMixin:
    """
    Assuming that all attributes are hashable
    """

    def __hash__(self):
        return sum(hash(v) for v in self.__dict__.values())

We can now add the new mixin to both Person and Country.

class Person(HashFromAttributeMixin, CompareByAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age

class Country(HashFromAttributeMixin, CompareByAttributeMixin):
    def __init__(self, name, area):
        self._name = name
        self._area = area

Hashing of Person and Country objects will now work.

print(hash(p1) == hash(p2)) # True
print(hash(c1) == hash(c2)) # True

Note: Generally, it is a bad idea to enable hashing based on mutable attributes. If the object’s hash value changes while it is stored in a hash-based collection, it will be in the wrong hash bucket. For example, if we use an object of type Person or Country as a key in a dictionary and its hash changes, we can no longer retrieve the corresponding value. Please consider the code above as a demonstration only.
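To make the warning concrete, here is a small sketch that reuses the two mixins and the Person class. The salaries dictionary and its value are made up for illustration. The behavior shown is what CPython's dict does for a single-entry dictionary.

```python
class CompareByAttributeMixin:
    def __eq__(self, o):
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__


class HashFromAttributeMixin:
    def __hash__(self):
        return sum(hash(v) for v in self.__dict__.values())


class Person(HashFromAttributeMixin, CompareByAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age


p = Person('John', 32)
salaries = {p: 1000}  # hypothetical data, for illustration only

hash_before = hash(p)
found_before = p in salaries

p._age = 33  # mutating an attribute changes hash(p)

hash_after = hash(p)
found_after = p in salaries

print(found_before)  # True
print(found_after)   # False: the key is still stored, but under the old hash
```

The entry has not been deleted. It is still in the dictionary (len(salaries) is 1), but a lookup with the mutated key now looks in a different bucket.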

Pay attention to the order of mixins

Maybe readers with keen eyes have already noticed that we put HashFromAttributeMixin before CompareByAttributeMixin in the list of mixins. What will happen if we reverse this order?

class PersonUnhashable(CompareByAttributeMixin, HashFromAttributeMixin):
    def __init__(self, name, age):
        self._name = name
        self._age = age

p_unhashable = PersonUnhashable('John', 32)
print(hash(p_unhashable))

The snippet above will raise the familiar TypeError.

Exception has occurred: TypeError
unhashable type: 'PersonUnhashable'

And this is the result when we check the __hash__ method of PersonUnhashable.

print(PersonUnhashable.__hash__) # None

In contrast, the __hash__ method of Person is inherited from HashFromAttributeMixin, as we expected.

print(Person.__hash__) # <function HashFromAttributeMixin.__hash__ at 0x000002551968B160>

To understand why, we need to know how Python decides which method to use in case of multiple inheritance.

Multiple inheritance and the MRO

One of the main problems with multiple inheritance is finding out which method to call if multiple parent classes define the same method. Python solves this by using the class’s method resolution order (MRO), an ordered list that starts with the class itself and continues with its parent classes. Whenever we call a method on an object, the interpreter first checks if the object’s class implements the method in question. If it does not, the interpreter then walks along the class’s MRO until it finds the first parent that implements the method.
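The walk described above can be approximated in a few lines. The helper find_defining_class and the classes Base, LoudMixin and Child are hypothetical names introduced for this sketch. Real attribute lookup also involves descriptors, metaclasses and instance dictionaries, so treat this as a simplified model, not the interpreter's actual implementation.

```python
def find_defining_class(cls, name):
    """Return the first class in cls.__mro__ whose own namespace defines name.

    Simplified model of method lookup: it only inspects each class's
    own __dict__, in MRO order, and returns None if nothing is found.
    """
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


class Base:
    def greet(self):
        return 'hello from Base'


class LoudMixin:
    def greet(self):
        return 'HELLO FROM LoudMixin'


class Child(LoudMixin, Base):
    pass


owner = find_defining_class(Child, 'greet')
print(owner.__name__)   # LoudMixin
print(Child().greet())  # HELLO FROM LoudMixin
```

Child does not define greet itself, so the search moves on to LoudMixin, which is listed first among the parents and therefore comes before Base in the MRO.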

Let’s check the MRO of Person and PersonUnhashable.

print(Person.__mro__)
# (<class '__main__.Person'>, <class '__main__.HashFromAttributeMixin'>, <class '__main__.CompareByAttributeMixin'>, <class 'object'>)

print(PersonUnhashable.__mro__)
# (<class '__main__.PersonUnhashable'>, <class '__main__.CompareByAttributeMixin'>, <class '__main__.HashFromAttributeMixin'>, <class 'object'>)

In the example above, the first class in PersonUnhashable’s MRO that defines __hash__ is CompareByAttributeMixin. Because that mixin overrides __eq__ without defining __hash__, Python has implicitly set CompareByAttributeMixin.__hash__ to None. The lookup stops at that None entry and never reaches HashFromAttributeMixin, which means we cannot calculate the hash of a PersonUnhashable object.
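The implicit None is an ordinary entry in the mixin's namespace, which the sketch below inspects. It also shows one possible workaround for the reversed order: naming the intended __hash__ in the class body. PersonExplicitHash is a hypothetical class added for illustration. Listing HashFromAttributeMixin first, as Person does, remains the simpler fix.

```python
class CompareByAttributeMixin:
    def __eq__(self, o):
        if not isinstance(o, type(self)):
            return False
        return self.__dict__ == o.__dict__


class HashFromAttributeMixin:
    def __hash__(self):
        return sum(hash(v) for v in self.__dict__.values())


# The mixin's own namespace holds an explicit __hash__ entry set to None.
has_entry = '__hash__' in vars(CompareByAttributeMixin)
entry = vars(CompareByAttributeMixin)['__hash__']
print(has_entry, entry)  # True None


class PersonExplicitHash(CompareByAttributeMixin, HashFromAttributeMixin):
    # Hypothetical workaround: define __hash__ in the class body, so the
    # lookup stops here before it reaches CompareByAttributeMixin.
    __hash__ = HashFromAttributeMixin.__hash__

    def __init__(self, name, age):
        self._name = name
        self._age = age


p1 = PersonExplicitHash('John', 32)
p2 = PersonExplicitHash('John', 32)
print(p1 == p2)              # True
print(hash(p1) == hash(p2))  # True
```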

On the other hand, the first class in Person’s MRO that implements __hash__ is HashFromAttributeMixin. Because of that, we can calculate the hash of a Person object using its attributes.

As we’ve seen above, the MRO not only depends on the inheritance graph, but also depends on the order in which parent classes are listed in a class declaration. Behind the scenes, the interpreter uses the C3 linearization algorithm to build the MRO. The algorithm itself is not very hard to follow, but I doubt we will ever need to manually calculate the MRO of a class.
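For the curious, here is a compact sketch of C3 linearization. The functions c3_merge and c3_linearize are hypothetical helpers written for this article, not the interpreter's code, and the sketch assumes ordinary classes with a consistent hierarchy. Its result is compared against the __mro__ that Python computes itself.

```python
def c3_merge(sequences):
    """Merge several linearizations following the C3 rule."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable if it is not in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError('Cannot create a consistent MRO')
        result.append(head)
        # Remove the chosen head from every sequence and drop empty ones.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def c3_linearize(cls):
    """Linearize cls: the class itself, then the merge of its parents'
    linearizations and the list of its direct parents."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearize(b) for b in bases] + [bases])


class CompareByAttributeMixin:
    pass


class HashFromAttributeMixin:
    pass


class Person(HashFromAttributeMixin, CompareByAttributeMixin):
    pass


class PersonUnhashable(CompareByAttributeMixin, HashFromAttributeMixin):
    pass


print(tuple(c3_linearize(Person)) == Person.__mro__)  # True
print(tuple(c3_linearize(PersonUnhashable)) == PersonUnhashable.__mro__)  # True
```

In practice, reading cls.__mro__ is all we need. The sketch is only meant to show why the order of the parent list matters.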

Conclusion

Mixins are a useful way to take advantage of the power of multiple inheritance. And by understanding how the MRO is determined in Python, we can make sure that combining multiple mixins does not produce any unexpected results.

The author is a software developer from Vietnam who currently lives in Japan.
