Use type annotations to make Python code easier to read

It should take about 10 minutes to read this article.

Python is a dynamically typed language, so we do not need to declare a variable's type explicitly when we define it, as in the following example:

a = 2
print('1 + a =', 1 + a)

Output:

1 + a = 3

Here we first declare a variable a and assign it the value 2, then print the result, and the program outputs the correct answer. At no point did we declare what type a is.

But what if we turn a into a string? Rewrite it as follows:

a = '2'
print('1 + a =', 1 + a)

Output:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Python raises the error immediately. The cause is that we tried to add a string to a number; the two types are different and cannot be added.

Now let's rewrite the statement above as a function definition:

def add(a):
    return a + 1

This defines a function that takes one parameter, adds 1 to it, and returns the result.

If we call it as follows, passing in a number:

add(2)

then the result 3 is returned normally. But if we pass in an argument that is not of the type we expect, such as a string, a TypeError is raised, just as before.

Because Python does not require us to declare types, the function definition alone does not tell us what type of argument should be passed in.

This causes a lot of inconvenience. For a complex function, we cannot tell what types its parameters take without some additional documentation.

This is where Python's type annotations become important.
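To make the problem concrete before moving on, here is a minimal sketch: nothing in the signature of add says which types it accepts, so a wrong argument is only discovered when the call actually runs. The try/except wrapper is added purely for illustration.

```python
def add(a):
    return a + 1


print(add(2))  # works: the argument is a number

try:
    add('2')  # fails only at run time: str + int is not defined
except TypeError as exc:
    error = exc
    print('TypeError:', exc)
```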

Type annotations

PEP 484 introduced type hints in Python 3.5, and PEP 526 added variable annotations in Python 3.6, so we can rewrite the code above as follows:

a: int = 2
print('5 + a =', 5 + a)

def add(a: int) -> int:
    return a + 1

The syntax comes down to two points: to annotate a variable or parameter, write a colon after its name followed by the type; to annotate a function's return value, write an arrow (->) after the parameter list followed by the return type.

PEP 8 specifies the formatting: no space before the colon and one space after it, and one space on each side of the arrow.

With declarations like these, anyone who reads the function definition later knows what types of arguments to pass in. For example, when calling add, we know that the argument should be an int rather than a string, which is very intuitive.

Note, however, that these annotations are only hints and have no effect at run time. For example, if we call add with a float instead of an int, Python neither reports an error nor converts the argument:

add(1.5)

We passed in the float value 1.5. Here is the result:

2.5

The result is output normally, and 1.5 was not cast to 1; otherwise the result would have been 2.

Therefore, type and variable annotations are just a hint and have no practical effect on execution.
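As a small sketch of this point: the annotations are stored as metadata on the function, in its __annotations__ attribute, and are never checked when the function is called.

```python
def add(a: int) -> int:
    return a + 1


# The annotations are recorded on the function object...
print(add.__annotations__)

# ...but nothing enforces them when the function is called.
result = add(1.5)
print(result)
```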

However, some IDEs can read type annotations and warn about mismatches. PyCharm, for example, detects when the type of an argument in a call does not match the annotation and shows a warning.

For the call above, PyCharm shows the following message:

Expected type 'int', got 'float' instead
This inspection detects type errors in function call expressions. Due to dynamic dispatch and duck typing, this is possible in a limited but useful number of cases. Types of function parameters can be specified in docstrings or in Python 3 function annotations.

There are also tools that perform type checking, such as mypy. Once it is installed, you can run mypy on a Python script to find calls that do not conform to the type annotations.
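As a rough sketch of that workflow, assume the script below is saved as example.py (a hypothetical file name). mypy can be installed with pip install mypy and run as mypy example.py. The report quoted in the comment shows the general shape of mypy's message; the exact wording may differ between versions.

```python
# example.py (hypothetical file name)

def add(a: int) -> int:
    return a + 1


# Python itself runs this call without complaint...
value = add(1.5)
print(value)

# ...but `mypy example.py` is expected to flag the call with a message
# along the lines of:
#   error: Argument 1 to "add" has incompatible type "float"; expected "int"
```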

The above is just a simple int example. Let's look at how to annotate more complex data structures, such as lists, tuples, and dictionaries.

Lists, tuples, and dictionaries are represented by list, tuple, and dict, so it is natural to write the declarations like this:

names: list = ['Germey', 'Guido']
version: tuple = (3, 7, 4)
operations: dict = {'show': False, 'sort': True}

This looks fine, and each variable is indeed annotated with the corresponding type, but the annotations do not reflect the structure of the list or the tuple. From the annotation we know only that names is a list; we do not know that its elements are of type str. Likewise, we do not know that each element of the version tuple is an int. Annotations such as list and tuple alone are therefore very “weak”, and we need a stronger form of type declaration.

This is where the typing module comes in. It provides much “stronger” type support: for example, List[str] represents a list of str elements, and Tuple[int, int, int] represents a tuple of length 3 whose elements are all int. The declarations above can therefore be rewritten as follows:

from typing import List, Tuple, Dict

names: List[str] = ['Germey', 'Guido']
version: Tuple[int, int, int] = (3, 7, 4)
operations: Dict[str, bool] = {'show': False, 'sort': True}

In this way, the type of the variable can be reflected very intuitively.

The typing module is part of the Python standard library, so we can use it directly without installing any third-party modules.

typing

Let's look at the typing module in detail. We will mainly cover the commonly used annotation types, such as List, Tuple, Dict, and Sequence. Once we understand how each one is used, we can annotate any variable with ease.

These types are imported directly from the typing module, for example:

from typing import List, Tuple

List

List is the generic version of list and is basically equivalent to it. It is followed by square brackets that contain the type of the list's elements. For example, a list of numbers can be declared as follows (Union, also imported from typing, is described later in this article). Note that List[int or float] does not express this, because the expression int or float evaluates to just int:

var: List[Union[int, float]] = [2, 3.5]

Declarations can also be nested:

var: List[List[int]] = [[1, 2], [2, 3]]
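A brief sketch of how these annotations read in a function signature. The function flatten is a hypothetical helper written for this example, not part of typing.

```python
from typing import List


def flatten(rows: List[List[int]]) -> List[int]:
    """Concatenate the inner lists into one flat list."""
    flat: List[int] = []
    for row in rows:
        flat.extend(row)
    return flat


print(flatten([[1, 2], [2, 3]]))  # [1, 2, 2, 3]
```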

Tuple, NamedTuple

Tuple is the generic version of tuple. The square brackets that follow it declare, in order, the types of the elements that make up the tuple. For example, Tuple[X, Y] means that the first element of the tuple is of type X and the second element is of type Y.

For example, to declare a tuple holding a name, an age, and a height, whose types are str, int, and float respectively, you can write:

person: Tuple[str, int, float] = ('Mike', 22, 1.75)

Type nesting can also be used.

NamedTuple is the typed version of collections.namedtuple and is used in much the same way, except that each field can carry a type annotation. Personally, though, I don't recommend NamedTuple; I recommend using the attrs library to declare classes that represent data instead.
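For reference, here is a minimal sketch of what a NamedTuple declaration looks like, using the same name, age, and height fields as the tuple above. The class name Person is chosen for this example.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    age: int
    height: float


person = Person('Mike', 22, 1.75)

# Fields are available by name as well as by position.
print(person.name, person[1], person.height)

# A NamedTuple instance is still an ordinary tuple.
print(isinstance(person, tuple))
```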

Dict, Mapping, MutableMapping

Dict is the generic version of dict, and Mapping is the generic version of collections.abc.Mapping. According to the official documentation, Dict is recommended for annotating return types and Mapping for annotating parameters. Both are used the same way: the square brackets that follow declare the key type and the value type, for example:

def size(rect: Mapping[str, int]) -> Dict[str, int]:
    return {'width': rect['width'] + 100, 'height': rect['height'] + 100}

Here Dict is used as the return value type annotation and Mapping is used as the parameter type annotation.

MutableMapping is a subclass of Mapping that adds the operations for modifying the mapping, and many libraries use MutableMapping instead of Mapping.
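One way to see why Mapping suits parameters: it promises only read access, so the function accepts any mapping, including read-only ones. The sketch below uses a version of size that reads both keys; MappingProxyType from the standard library's types module provides a read-only view of a dict.

```python
from types import MappingProxyType
from typing import Dict, Mapping


def size(rect: Mapping[str, int]) -> Dict[str, int]:
    # Only reads from rect, so any Mapping is acceptable.
    return {'width': rect['width'] + 100, 'height': rect['height'] + 100}


plain = {'width': 10, 'height': 20}
read_only = MappingProxyType(plain)  # read-only view of the same data

print(size(plain))
print(size(read_only))
```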

Set, AbstractSet

Set is the generic version of set, and AbstractSet is the generic version of collections.abc.Set. According to the official documentation, Set is recommended for annotating return types and AbstractSet for annotating parameters. Both are used the same way: the square brackets that follow declare the type of the elements in the set, for example:

def describe(s: AbstractSet[int]) -> Set[int]:
    return set(s)

Here Set is used as the return value type annotation and AbstractSet is used as the parameter type annotation.

Sequence

Sequence is the generic version of collections.abc.Sequence. In some cases we do not need to distinguish strictly between a list and a tuple for a variable or parameter, and we can use the more general type Sequence instead. It is used in the same way as List, for example:

def square(elements: Sequence[float]) -> List[float]:
    return [x ** 2 for x in elements]
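A short usage sketch: because the parameter is annotated as Sequence, both a list and a tuple are acceptable arguments.

```python
from typing import List, Sequence


def square(elements: Sequence[float]) -> List[float]:
    return [x ** 2 for x in elements]


print(square([1, 2, 3]))    # a list argument
print(square((1.5, 2.5)))   # a tuple argument
```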

NoReturn

NoReturn annotates a function that never returns normally, for example one that always raises an exception. A function that simply returns no value should be annotated with -> None instead. For example:

def fail() -> NoReturn:
    raise RuntimeError('failed')
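The difference between -> None (returns normally, without a value) and -> NoReturn (never returns normally) can be sketched as follows. The function names are made up for this example.

```python
from typing import NoReturn


def hello() -> None:
    # Returns normally, just without a value.
    print('hello')


def fail(message: str) -> NoReturn:
    # Never returns normally: control always leaves via the exception.
    raise RuntimeError(message)


returned = hello()

try:
    fail('something went wrong')
except RuntimeError as exc:
    caught = str(exc)
```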

Any

Any is a special type that represents all types. Static type checkers treat every type as compatible with Any. Any parameter or return value without an annotation is treated as Any by default, so the following two function declarations are completely equivalent:

def add(a):
    return a + 1

def add(a: Any) -> Any:
    return a + 1

This is similar to object, since every type is a subclass of object. But if we annotate the parameter as object, a static type checker reports an error for the expression a + 1, because object does not support addition, while with Any it does not. For details, see the official documentation: https://docs.python.org/zh-cn/3/library/typing.html?highlight=typing#the-any-type.
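The difference between Any and object can be sketched as follows. Both functions work at run time; the difference lies in what a static type checker such as mypy accepts, as noted in the comments. The function names are hypothetical.

```python
from typing import Any


def add_any(a: Any) -> Any:
    # Accepted by a type checker: any operation is allowed on an Any value.
    return a + 1


def show(a: object) -> str:
    # Only operations defined on every object, such as repr(), pass type
    # checking here; writing `a + 1` would be reported as an error.
    return repr(a)


print(add_any(2))
print(show(2))
```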

TypeVar

TypeVar defines a type variable that can be constrained to a specific set of types. For example, a type variable constrained to int, float, and None accepts a number or an empty value but rejects other types, such as list or dict.

For example, a person's height can be an int, a float, or None, but not a dict, so it can be declared like this:

Height = TypeVar('Height', int, float, None)

def get_height(height: Height) -> Height:
    return height

Here we declare a type variable Height using TypeVar and use it to annotate both the parameter and the return value, so the return type always matches the type of the argument. When we only need to say that a value is one of several types, with no link between parameter and return type, Union (described below) is the usual choice.
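A common use of TypeVar is to link the type of a parameter to the return type of a generic function. The sketch below is a minimal illustration; T and first are names chosen for this example.

```python
from typing import Sequence, TypeVar

T = TypeVar('T')  # unconstrained: stands for whatever type the caller passes


def first(items: Sequence[T]) -> T:
    """Return the first element; the return type follows the element type."""
    return items[0]


print(first([1, 2, 3]))    # a type checker infers int
print(first(('a', 'b')))   # a type checker infers str
```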

NewType

NewType declares a type with a special meaning. Take the tuple from the Tuple example: we want it to represent a person, but annotating it as a plain Tuple does not make that obvious, so we can declare a dedicated type for it with NewType:

Person = NewType('Person', Tuple[str, int, float])
person = Person(('Mike', 22, 1.75))

At run time, person is still an ordinary tuple, and we can operate on it just like any other tuple.
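A small sketch of how NewType behaves at run time: calling the new type simply returns its argument, so the distinction exists only for static type checkers. UserId and get_user_name are hypothetical names.

```python
from typing import NewType

UserId = NewType('UserId', int)


def get_user_name(user_id: UserId) -> str:
    return f'user-{user_id}'


uid = UserId(42)

print(type(uid))            # still a plain int at run time
print(get_user_name(uid))
```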

Callable

Callable is the type of callable objects and is usually used to annotate a function. For example, the add function we declared earlier is a Callable:

print(Callable, type(add), isinstance(add, Callable))

Output:

typing.Callable <class 'function'> True

Although type(add) reports function, isinstance confirms that add is indeed a Callable.

When annotating with Callable, use the form Callable[[Arg1Type, Arg2Type, ...], ReturnType] to declare both the parameter types and the return type, for example:

def date(year: int, month: int, day: int) -> str:
    return f'{year}-{month}-{day}'

def get_date_fn() -> Callable[[int, int, int], str]:
    return date

Here we first declare a function date that takes three int parameters and returns a str. The function get_date_fn returns date itself, so its return type can be annotated as Callable, with the parameter types and return type of the returned function given in the square brackets.
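Callable is just as useful for annotating a parameter that receives a function. A minimal sketch, with hypothetical names apply and multiply:

```python
from typing import Callable


def apply(fn: Callable[[int, int], int], x: int, y: int) -> int:
    # fn must accept two ints and return an int.
    return fn(x, y)


def multiply(x: int, y: int) -> int:
    return x * y


print(apply(multiply, 3, 4))
print(apply(lambda x, y: x + y, 3, 4))
```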

Union

Union is the union type: Union[X, Y] means either type X or type Y.

A union type of a union type is equivalent to the flattened type:

Union[Union[int, str], float] == Union[int, str, float]

Union types with only one parameter collapse into the parameter itself, for example:

Union[int] == int

Redundant parameters are skipped, for example:

Union[int, str, int] == Union[int, str]

When comparing union types, the parameter order is ignored, for example:

Union[int, str] == Union[str, int]

This is useful when declaring function parameters. For example, a function may accept either a function name given as a string or the function itself:

def process(fn: Union[str, Callable]):
    if isinstance(fn, str):
        # str2fn and process
        pass
    elif isinstance(fn, Callable):
        fn()

Declarations like this are very common in library function definitions.
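The equivalences listed above can be checked directly at run time, and the process pattern can be fleshed out a little. In this sketch, the registry dictionary and the greet function are assumptions standing in for whatever string-to-function lookup a real library would use.

```python
from typing import Callable, Dict, Union

assert Union[Union[int, str], float] == Union[int, str, float]
assert Union[int] == int
assert Union[int, str, int] == Union[int, str]
assert Union[int, str] == Union[str, int]


def greet() -> str:
    return 'hello'


# Hypothetical lookup table from names to functions.
registry: Dict[str, Callable[[], str]] = {'greet': greet}


def process(fn: Union[str, Callable[[], str]]) -> str:
    if isinstance(fn, str):
        fn = registry[fn]  # resolve the name to a function
    return fn()


print(process('greet'))
print(process(greet))
```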

Optional

Optional means that a value can be either None or the declared type; that is, Optional[X] is equivalent to Union[X, None].

Note that this is not the same as an optional parameter. When used as a parameter annotation, it does not mean that the argument may be omitted, but that None may be passed as the argument.

For example, a function that returns None when it succeeds and returns an error message when something goes wrong can be declared like this:

def judge(result: bool) -> Optional[str]:
    if result:
        return 'Error Occurred'
    return None
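The distinction between Optional and an optional (defaulted) parameter can be sketched like this. Both functions are hypothetical examples.

```python
from typing import Optional


def shout(text: Optional[str]) -> str:
    # The argument is required, but None is an accepted value.
    if text is None:
        return ''
    return text.upper()


def greet(name: str = 'world') -> str:
    # The argument may be omitted, but None is not an intended value.
    return f'hello, {name}'


print(repr(shout(None)))
print(shout('hi'))
print(greet())

try:
    shout()  # omitting the argument is still an error
except TypeError as exc:
    missing = str(exc)
```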

Generator

To annotate a generator, use Generator. Its declaration is a little special: the square brackets take three parameters, YieldType, SendType, and ReturnType, for example:

def echo_round() -> Generator[int, float, str]:
    sent = yield 0
    while sent >= 0:
        sent = yield round(sent)
    return 'Done'

Here, the type of the value that follows the yield keyword is YieldType, the type of the value that the yield expression evaluates to is SendType, and the type of the value in the generator's return statement is ReturnType.

In many cases, of course, a generator only yields values. We then need neither SendType nor ReturnType, and both can be set to None, for example:

def infinite_stream(start: int) -> Generator[int, None, None]:
    while True:
        yield start
        start += 1
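To see the three type parameters in action, the sketch below drives echo_round by hand: next() receives the first yielded value, send() passes a float in and gets the rounded int back, and the return value arrives on StopIteration.

```python
from typing import Generator


def echo_round() -> Generator[int, float, str]:
    sent = yield 0
    while sent >= 0:
        sent = yield round(sent)
    return 'Done'


gen = echo_round()
first = next(gen)        # YieldType: the initial 0
second = gen.send(2.6)   # SendType in, YieldType out: round(2.6)

try:
    gen.send(-1.0)       # a negative value ends the loop
except StopIteration as stop:
    result = stop.value  # ReturnType: the generator's return value

print(first, second, result)
```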

Case study

Next, let's look at a real project and see how the commonly used types appear in practice.

The library we will look at is requests-html, developed by Kenneth Reitz. Its GitHub address is https://github.com/psf/requests-html. We will mainly look at how some of the types in its source code are declared.

The source code of this library is a single file, https://github.com/psf/requests-html/blob/master/requests_html.py. Let's look at some of the typing definitions and function definitions in it.

First, the typing definitions are as follows:

from typing import Set, Union, List, MutableMapping, Optional

_Find = Union[List['Element'], 'Element']
_XPath = Union[List[str], List['Element'], str, 'Element']
_Result = Union[List['Result'], 'Result']
_HTML = Union[str, bytes]
_BaseHTML = str
_UserAgent = str
_DefaultEncoding = str
_URL = str
_RawHTML = bytes
_Encoding = str
_LXML = HtmlElement
_Text = str
_Search = Result
_Containing = Union[str, List[str]]
_Links = Set[str]
_Attrs = MutableMapping
_Next = Union['HTML', List[str]]
_NextSymbol = List[str]

Here you can see that the main types used are Set, Union, List, MutableMapping, and Optional, all of which were explained above. Union is used several times to declare new types: for example, _Find is either a list of Element objects or a single Element object, and _Result is either a list of Result objects or a single Result object. In addition, _Attrs is really a dictionary type, which is represented here by MutableMapping rather than by Dict or Mapping.
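Aliases like these are ordinary assignments, and a function can use them to accept either a single value or a list. A minimal sketch modelled on the _Containing alias above; the function normalize is hypothetical and is not taken from requests-html.

```python
from typing import List, Union

_Containing = Union[str, List[str]]


def normalize(containing: _Containing) -> List[str]:
    """Always return a list, whether one string or several were given."""
    if isinstance(containing, str):
        return [containing]
    return list(containing)


print(normalize('python'))
print(normalize(['python', 'typing']))
```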

Next, look at the declaration of an Element class:

class Element(BaseParser):
    """An element of HTML.

    :param element: The element from which to base the parsing upon.
    :param url: The URL from which the HTML originated, used for absolute_links.
    :param default_encoding: Which encoding to default to.
    """

    __slots__ = [
        'element', 'url', 'skip_anchors', 'default_encoding', '_encoding',
        '_html', '_lxml', '_pq', '_attrs', 'session'
    ]

    def __init__(self, *, element, url: _URL, default_encoding: _DefaultEncoding = None) -> None:
        super(Element, self).__init__(element=element, url=url, default_encoding=default_encoding)
        self.element = element
        self.tag = element.tag
        self.lineno = element.sourceline
        self._attrs = None

    def __repr__(self) -> str:
        attrs = ['{}={}'.format(attr, repr(self.attrs[attr])) for attr in self.attrs]
        return "<Element {} {}>".format(repr(self.element.tag), ' '.join(attrs))

    @property
    def attrs(self) -> _Attrs:
        """Returns a dictionary of the attributes of the :class:{{EJS2}}
        ({{EJS3}}_).
        """
        if self._attrs is None:
            self._attrs = {k: v for k, v in self.element.items()}

            # Split class and rel up, as there are ussually many of them:
            for attr in ['class', 'rel']:
                if attr in self._attrs:
                    self._attrs[attr] = tuple(self._attrs[attr].split())

        return self._attrs

The __init__ method takes several parameters and annotates two of them, url and default_encoding, with _URL and _DefaultEncoding. In addition, the attrs property uses _Attrs to annotate its return type.

On the whole, the parameter types and return types are clearly annotated, which greatly improves the readability of the code.

The above is a detailed introduction to type annotations and the typing module.

The code in this section can be obtained by replying “Type Annotation” to the official account “Attack Coder”.

Cui Qingcai

Jingmi blogger, author of “Python3 Web Crawler Development”
