Python Protocol Classes (and Type Hints)

Explore the concepts of Python protocol classes and type hints in this tutorial. Perfect for all Python developers.

TL;DR

  • Type hints, a concept that protocol-based type checking builds upon, enhance code readability and enable static type checking, but don’t enforce type checks at runtime.
  • Python 3.8 introduced protocol classes, allowing developers to focus on expected behaviors rather than inheritance.
  • Using protocols, types like Rectangle and Circle can fit the same expectations if they implement all required behaviors.
  • If you’re familiar with Java, you’ll find that protocols in Python serve a purpose akin to interfaces in Java.

Type hints in Python

In Python, the following piece of code is a perfectly valid function definition:

def multiply(a, b):
    return a * b

But what if someone tried to call the function with two strings (e.g. "10" and "5") instead of actual integers? 🤔

TypeError: can't multiply sequence by non-int of type 'str'

BAM!!! That gave us a TypeError 🫢

How could we make it more clear that this function expects integers as input? This is where type hints come into play:

def multiply(a: int, b: int) -> int:
    return a * b

In a nutshell, type hints allow us to specify the types of a function’s input & output variables. Their purpose is to enhance code readability and enable the use of static type checking tools.

The semantics of type hints were initially laid out in PEP 484, expanding on the syntax for function annotations that was introduced earlier in PEP 3107.

It’s important to note that type hints by themselves don’t perform any type checking. This means that we could still pass strings to the annotated multiply function and get the same TypeError.

Hinting structural subtypes

Though very useful, type hints as introduced by PEP 484 had one important limitation that the Python community couldn’t simply ignore:

Type hints introduced in PEP 484 can be used to specify type metadata for static type checkers and other third party tools. However, PEP 484 only specifies the semantics of nominal subtyping. In this PEP we specify static and runtime semantics of protocol classes that will provide a support for structural subtyping (static duck typing).

PEP 544

In other words, PEP 544 extends the capabilities of Python’s type hinting system by allowing for the declaration of structural subtypes in addition to nominal ones (originally introduced in PEP 484).

In contrast to nominal subtyping, structural subtyping means that a type is considered a subtype of another type if it has the same properties or behaviors, regardless of whether it was explicitly declared to be a subtype.(( https://peps.python.org/pep-0544/#nominal-vs-structural-subtyping ))(( https://en.wikipedia.org/wiki/Structural_type_system ))

So in essence, what structural subtyping allows you to do is specify the expected behaviors of a function’s input and output objects, even if their explicit (nominal) types don’t possess those behaviors.

But how exactly does such a specification look in Python? Enter protocol classes. Introduced in Python 3.8(( https://docs.python.org/3/whatsnew/3.8.html#typing )), the ability to define protocols via special classes gives a fresh spin to how the language handles both duck typing and structural subtyping.

To illustrate, the next section will present an example use case for how to implement a protocol class. But first, a quote:

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

– The Duck test

An example protocol class

To recap, the idea behind protocols is all about letting you focus on what an object can do, not what it is. With that being said, let’s consider the following code sample:

from typing import Protocol

class HasArea(Protocol):
    def area(self) -> float:
        ...

def compute_area(shape: HasArea) -> float:
    return shape.area()

class Rectangle:
    def __init__(self, width: float, height: float):
        self.width = width
        self.height = height
    
    def area(self) -> float:
        return self.width * self.height

class Circle:
    def __init__(self, radius: float):
        self.radius = radius
    def area(self) -> float:
        return 3.14159 * self.radius * self.radius

In the realm of nominal object orientation, the compute_area function would define expected object types based purely on their parent classes. This means that input objects would have to directly or indirectly inherit from a known parent class such as Shape:

def compute_area(shape: Shape) -> float:
    return shape

Granted, the differences between the above two definitions of the compute_area function might at first seem cosmetic. But there are a number of benefits that become evident upon closer examination.

FAQ

What is nominal subtyping?

Nominal subtyping means that a type is considered a subtype of another type if and only if it is explicitly stated so in the code. This explicit statement is usually done through inheritance.(( https://peps.python.org/pep-0544/#nominal-vs-structural-subtyping ))(( https://en.wikipedia.org/wiki/Nominal_type_system ))

What is static type checking?

Static type checking is the verification of correct type usage of a program’s variables, functions, and other constructs based on the static analysis of its source code (i.e. before it’s executed).(( https://en.wikipedia.org/wiki/Type_system#Static_type_checking ))

For example, if you were trying to calculate the sum of a number and a text string in your code, your static type checker would flag this as an error before the program gets executed.

If it was not for static type checking, protocol classes as defined by PEP 544 wouldn’t really add anything new to Python, right?

Yes, that’s right. Python’s dynamic nature has always allowed for structural subtyping. Thanks to duck typing, objects can be used interchangeably if they have the same methods and properties, regardless of their actual class hierarchy.

What are the key benefits of using Python’s protocol classes and type hints?

🛈 The response to this FAQ was in part generated by ChatGPT-4 and has been manually reviewed & edited by me for factual accuracy.

  1. Focus on Behavior Over Inheritance: Emphasizes the expected behaviors and capabilities of objects rather than their strict inheritance lineage, facilitating more practical and flexible design in diverse ecosystems.
  2. Enhanced Code Readability: Improves code clarity by specifying expected behaviors of a function’s input and output objects, making it clearer what functionalities are required.
  3. Facilitates Static Type Checking: Enables the use of static type checking tools, allowing for a more robust and comprehensive type checking process, especially beneficial in dynamically typed languages.
  4. Supports Duck Typing Philosophy: Aligns with the concept of duck typing, where an object’s suitability for a purpose is determined by its behaviors and properties, not just its type.
  5. Promotes Robust and Resilient Type Structures: Leads to the development of more robust and resilient type structures, which is valuable in a dynamic and evolving programming environment.
  6. Greater Flexibility in Code Design: Offers flexibility in code design, allowing developers to create innovative and efficient design patterns without being constrained by specific superclass inheritance.
  7. Facilitates Code Reusability: Enhances code reusability by making it easier to reuse existing code with any new classes that fit the structural criteria.
  8. Reduces Coupling Between Code Segments: Helps in reducing coupling in a program, minimizing the dependency between different parts of the codebase.
  9. Easier Integration with Third-party Code: Simplifies the integration with external libraries or modules by allowing the creation of classes that conform to expected structures without altering the inheritance hierarchy.
  10. Improved Code Maintainability: Leads to easier maintenance of the codebase, as changes to class hierarchies or structures don’t necessitate widespread modifications, as long as the structural contracts are respected.
  11. Promotes Interface-based Programming: Encourages thinking in terms of interfaces and behaviors, leading to cleaner, more abstract, and testable code.
  12. Simplifies Unit Testing: Makes creating mock objects and test doubles easier in unit testing, as only the required structure needs to be replicated, not the entire class hierarchy.
  13. Better Alignment with Dynamic Typing: Enhances expressiveness and dynamism in dynamically typed languages without compromising clarity.

Is it true that Python had something called protocols even before the introduction of protocol classes?

Yes, that’s correct! The folks behind the introduction of protocol classes have the following to say on the subject:

We propose to use the term protocols for types supporting structural subtyping. The reason is that the term iterator protocol, for example, is widely understood in the community, and coming up with a new term for this concept in a statically typed context would just create confusion.

This has the drawback that the term protocol becomes overloaded with two subtly different meanings: the first is the traditional, well-known but slightly fuzzy concept of protocols such as iterator; the second is the more explicitly defined concept of protocols in statically typed code. The distinction is not important most of the time, and in other cases we propose to just add a qualifier such as protocol classes when referring to the static type concept.

PEP 544

What is object-oriented programming?

Object-oriented programming (OOP) is a programming paradigm focused on “objects” that encapsulate both data (attributes) and behavior (methods).(( https://en.wikipedia.org/wiki/Object-oriented_programming )) This stands in contrast to procedural programming, where data and the flow of execution aren’t logically grouped together, as seen with structs in the C programming language.(( https://en.wikipedia.org/wiki/Struct_(C_programming_language) ))

What is an interface?

In computing, an interface acts as a bridge between two distinct software and/or hardware components to communicate and share data.(( https://en.wikipedia.org/wiki/Interface_(computing) ))

Moreover, in OOP (and especially in Java), the term “interface” can also refer to a blueprint or contract of behaviors. A type that adheres to this blueprint is said to implement the interface.

Sources

Published

Leave a comment

Your email address will not be published. Required fields are marked *