A First Look at Python's Generics System and Type Constraints

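Subscripting a class
When reading Python source code you will sometimes see a definition such as class Dataset(Generic[T_co]):. Generic is a class, yet it can be subscripted directly. This is made possible by the special method __class_getitem__.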

class Box:
    def __class_getitem__(cls, item):
        print(cls, item)


var = Box[int, bool, str]  # prints: <class '__main__.Box'> (<class 'int'>, <class 'bool'>, <class 'str'>)
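As a minimal sketch of the same hook, the variant below returns a value instead of printing it. The returned tuple is only an illustration and is not how the typing module represents a subscripted class. It also shows that a single subscript arrives as-is, while several subscripts arrive packed into one tuple.

```python
class Box:
    """Hypothetical class that reports what it was subscripted with."""

    def __class_getitem__(cls, item):
        # One subscript is passed unchanged; several are passed as a tuple.
        params = item if isinstance(item, tuple) else (item,)
        return (cls.__name__, params)


single = Box[int]
multiple = Box[int, bool, str]
print(single)    # ('Box', (<class 'int'>,))
print(multiple)  # ('Box', (<class 'int'>, <class 'bool'>, <class 'str'>))
```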

A more concrete and more complex application of this method appears later in this article.

Declaring type variables with typing.TypeVar

typing.TypeVar declares a type variable, which can then be used as a type hint. For example:

import typing

_F = typing.TypeVar("_F")

def func():
    return 1

data: _F = func

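Code like this has little practical value, though. What we want is stricter type checking of variables and better editor hints, so we can attach constraints to the type variable. These constraints are not enforced at runtime; a static type checker or IDE only reports a warning, which follows from Python's dynamic nature. For example: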

_F = typing.TypeVar("_F", bound=typing.Callable[..., int])

Here we give _F an upper bound: it should be a callable that accepts any number of arguments and returns an int. Another example:

from typing import TypeVar

T = TypeVar("T", int, float)

Here T is restricted to exactly int or float.
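The two declarations above can be put to work as follows. This is a sketch under assumptions: the logged decorator, the double function and the calls list are hypothetical names introduced only for illustration. It shows a bound type variable preserving the type of a decorated function, a constrained type variable tying a return type to an argument type, and the fact that neither is checked at runtime.

```python
import functools
from typing import Callable, List, Tuple, TypeVar, cast

_F = TypeVar("_F", bound=Callable[..., int])  # upper bound
T = TypeVar("T", int, float)                  # exactly int or float

calls: List[Tuple[str, int]] = []  # hypothetical call log


def logged(func: _F) -> _F:
    """Hypothetical decorator: the result has the same static type as func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((func.__name__, result))
        return result

    return cast(_F, wrapper)


def double(x: T) -> T:
    """Returns the same type it receives (int -> int, float -> float)."""
    return x * 2


@logged
def add(a: int, b: int) -> int:
    return a + b


total = add(1, 2)
doubled_int = double(2)
doubled_float = double(1.5)
# Nothing is enforced at runtime: this call succeeds, although a static
# type checker is expected to report it.
doubled_str = double("ab")  # type: ignore
```

A bound accepts any subtype of the bound, whereas a constrained type variable must resolve to exactly one of the listed types.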
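In fact typing.TypeVar is quite flexible and has several configurable options. Its __init__ signature, as it appears in the typing.py source quoted in this article (newer Python versions add further options), is: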

def __init__(self, name, *constraints, bound=None,
             covariant=False, contravariant=False):

Basic information about a type variable declared with TypeVar can also be read at runtime. For example:

from typing import TypeVar

T = TypeVar("T", int, float)
print(T.__name__)  # T
print(T.__constraints__)  # (<class 'int'>, <class 'float'>)
# ... more usages
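A few more of these runtime attributes are sketched below. The describe helper is a hypothetical name used only for this illustration; the attributes it reads (__bound__, __covariant__, __contravariant__) mirror the keyword arguments of the TypeVar signature shown earlier.

```python
from typing import Callable, TypeVar

T = TypeVar("T", int, float)
_F = TypeVar("_F", bound=Callable[..., int])
T_co = TypeVar("T_co", covariant=True)


def describe(tv: TypeVar) -> dict:
    """Hypothetical helper: collect the runtime metadata of a type variable."""
    return {
        "name": tv.__name__,
        "constraints": tv.__constraints__,
        "bound": tv.__bound__,
        "covariant": tv.__covariant__,
        "contravariant": tv.__contravariant__,
    }


info = {tv.__name__: describe(tv) for tv in (T, _F, T_co)}
for name, details in info.items():
    print(name, details)
```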

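The typing module offers a rich set of protocols and abstract types that can serve as bounds, covering most of the behaviours a Python type can have. For example: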

from typing import Awaitable, SupportsAbs, SupportsRound, TypeVar

SR = TypeVar("SR", bound=SupportsRound)  # SR should support round()
SW = TypeVar("SW", bound=Awaitable)  # SW should be awaitable
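A short sketch of such a bound in use, assuming a hypothetical sort_by_magnitude function. SupportsAbs is a runtime-checkable protocol, so isinstance can also be used to test whether an object provides __abs__.

```python
from typing import List, SupportsAbs, TypeVar

SA = TypeVar("SA", bound=SupportsAbs)


def sort_by_magnitude(values: List[SA]) -> List[SA]:
    """Hypothetical helper: sort values by abs(); SA documents that requirement."""
    return sorted(values, key=abs)


ordered = sort_by_magnitude([3, -5, 1.5])
int_supports_abs = isinstance(-3, SupportsAbs)   # int defines __abs__
str_supports_abs = isinstance("x", SupportsAbs)  # str does not
print(ordered, int_supports_abs, str_supports_abs)
```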

In addition, the typing module provides many basic building blocks such as List, Dict and Union.

from typing import Dict, TypeVar

T = TypeVar("T", int, float)
TD = Dict[str, T]

td: TD = {}
td["a"] = 1
td["b"] = "2"  # a type checker may warn here: the value type does not match

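TD describes a dictionary whose keys are strings and whose values are int or float. Strictly speaking TD is a generic alias with one free type variable, and type checkers differ in how they treat the bare name (mypy, for instance, reads an unparameterized alias as Dict[str, Any]), so writing TD[int] or TD[float] states the intent explicitly.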
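The alias can be parameterized like any other generic type. The sketch below inspects the result at runtime; the names IntTD and prices are hypothetical and exist only for this illustration.

```python
from typing import Dict, TypeVar

T = TypeVar("T", int, float)
TD = Dict[str, T]

# Substituting the free type variable yields a concrete alias.
IntTD = TD[int]
prices: TD[float] = {"apple": 1.5}

print(TD.__parameters__)  # the free type variables of the alias
print(IntTD.__origin__)   # the underlying runtime class
print(IntTD.__args__)     # the substituted type arguments
```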
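Covariance is a less intuitive concept, but the feature is occasionally needed. A type variable declared with covariant=True tells the type checker that a generic type parameterized with a subclass may be used where the same generic type parameterized with the base class is expected. For example: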

T_co = TypeVar("T_co", covariant=True)
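What covariance permits can be sketched with a read-only container. Animal, Dog, ReadOnlyBox and describe are hypothetical names chosen for this illustration. The code runs the same with or without covariant=True; the difference only shows up in a static type checker, which is expected to accept passing a ReadOnlyBox[Dog] where a ReadOnlyBox[Animal] is required.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Animal:
    def name(self) -> str:
        return "animal"


class Dog(Animal):
    def name(self) -> str:
        return "dog"


class ReadOnlyBox(Generic[T_co]):
    """Hypothetical container that only ever hands its item out."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item


def describe(box: ReadOnlyBox[Animal]) -> str:
    return box.get().name()


dog_box: ReadOnlyBox[Dog] = ReadOnlyBox(Dog())
# Accepted statically because ReadOnlyBox is covariant in its parameter.
label = describe(dog_box)
print(label)  # dog
```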

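The __init_subclass__ method
The name can be misleading: this method is called when a subclass is defined, not when an instance of the subclass is created. Keyword arguments can also be passed to it from the class definition.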

class Base:
    def __init_subclass__(cls, config=None, **kwargs):
        cls.config = config
        print(f"Subclass {cls.__name__} created with config: {config}")
        super().__init_subclass__(**kwargs)


class Sub1(Base, config="config1"):
    pass


class Sub2(Base, config="config2"):
    pass
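A common use of this hook is a class registry. The sketch below assumes hypothetical names (Plugin, JsonPlugin, CsvPlugin and the key argument) and shows that the registry is filled as soon as the subclasses are defined, before any instance exists.

```python
class Plugin:
    """Hypothetical base class that records every subclass defined with a key."""

    registry = {}

    def __init_subclass__(cls, key=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if key is not None:
            Plugin.registry[key] = cls


class JsonPlugin(Plugin, key="json"):
    pass


class CsvPlugin(Plugin, key="csv"):
    pass


# Both classes are registered already; no instance has been created yet.
print(sorted(Plugin.registry))  # ['csv', 'json']
created = Plugin.registry["json"]()
```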

Using Generic

from typing import Generic, List, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Dataset(Generic[T_co]):
    def __init__(self, data: List[T_co]):
        self.data = data

    def get_data(self) -> List[T_co]:
        return self.data

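d: Dataset[int] = Dataset([1, 2, 3])  # the generic parameter provides type hints
print(Dataset[int].__origin__)        # the subscripted class (Dataset); available because Dataset inherits from Generic
print(Dataset[int].__args__)          # the type arguments, here (int,); available because Dataset inherits from Generic
print(Dataset[int].__parameters__)    # the type variables still unbound, here (); available because Dataset inherits from Generic

For reference, the source of the Generic class in typing.py is:
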
class Generic:
    """Abstract base class for generic types.

    A generic type is typically declared by inheriting from
    this class parameterized with one or more type variables.
    For example, a generic mapping type might be defined as::

      class Mapping(Generic[KT, VT]):
          def __getitem__(self, key: KT) -> VT:
              ...
          # Etc.

    This class can then be used as follows::

      def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
          try:
              return mapping[key]
          except KeyError:
              return default
    """
    __slots__ = ()
    _is_protocol = False

    @_tp_cache
    def __class_getitem__(cls, params):
        """Parameterizes a generic class.

        At least, parameterizing a generic class is the *main* thing this method
        does. For example, for some generic class `Foo`, this is called when we
        do `Foo[int]` - there, with `cls=Foo` and `params=int`.

        However, note that this method is also called when defining generic
        classes in the first place with `class Foo(Generic[T]): ...`.
        """
        if not isinstance(params, tuple):
            params = (params,)

        params = tuple(_type_convert(p) for p in params)
        if cls in (Generic, Protocol):
            # Generic and Protocol can only be subscripted with unique type variables.
            if not params:
                raise TypeError(
                    f"Parameter list to {cls.__qualname__}[...] cannot be empty"
                )
            if not all(_is_typevar_like(p) for p in params):
                raise TypeError(
                    f"Parameters to {cls.__name__}[...] must all be type variables "
                    f"or parameter specification variables.")
            if len(set(params)) != len(params):
                raise TypeError(
                    f"Parameters to {cls.__name__}[...] must all be unique")
        else:
            # Subscripting a regular Generic subclass.
            for param in cls.__parameters__:
                prepare = getattr(param, '__typing_prepare_subst__', None)
                if prepare is not None:
                    params = prepare(cls, params)
            _check_generic(cls, params, len(cls.__parameters__))

            new_args = []
            for param, new_arg in zip(cls.__parameters__, params):
                if isinstance(param, TypeVarTuple):
                    new_args.extend(new_arg)
                else:
                    new_args.append(new_arg)
            params = tuple(new_args)

        return _GenericAlias(cls, params,
                             _paramspec_tvars=True)

    def __init_subclass__(cls, *args, **kwargs):
        super().__init_subclass__(*args, **kwargs)
        tvars = []
        if '__orig_bases__' in cls.__dict__:
            error = Generic in cls.__orig_bases__
        else:
            error = (Generic in cls.__bases__ and
                        cls.__name__ != 'Protocol' and
                        type(cls) != _TypedDictMeta)
        if error:
            raise TypeError("Cannot inherit from plain Generic")
        if '__orig_bases__' in cls.__dict__:
            tvars = _collect_parameters(cls.__orig_bases__)
            # Look for Generic[T1, ..., Tn].
            # If found, tvars must be a subset of it.
            # If not found, tvars is it.
            # Also check for and reject plain Generic,
            # and reject multiple Generic[...].
            gvars = None
            for base in cls.__orig_bases__:
                if (isinstance(base, _GenericAlias) and
                        base.__origin__ is Generic):
                    if gvars is not None:
                        raise TypeError(
                            "Cannot inherit from Generic[...] multiple types.")
                    gvars = base.__parameters__
            if gvars is not None:
                tvarset = set(tvars)
                gvarset = set(gvars)
                if not tvarset <= gvarset:
                    s_vars = ', '.join(str(t) for t in tvars if t not in gvarset)
                    s_args = ', '.join(str(g) for g in gvars)
                    raise TypeError(f"Some type variables ({s_vars}) are"
                                    f" not listed in Generic[{s_args}]")
                tvars = gvars
        cls.__parameters__ = tuple(tvars)

We can see that once it inherits from the generic class, our custom Dataset class supports the Dataset[int] syntax. This works because Generic implements __class_getitem__(cls, params).
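The effect of the two methods in the source above can be observed at runtime. This is a sketch: __init_subclass__ stores the collected type variables in __parameters__, and subscripting returns an alias object rather than a new class. The __orig_class__ attribute set on instances created through the alias is an implementation detail of the typing module and is shown here only for illustration.

```python
from typing import Generic, List, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Dataset(Generic[T_co]):
    def __init__(self, data: List[T_co]):
        self.data = data


# Set by Generic.__init_subclass__ when Dataset was defined.
print(Dataset.__parameters__)

# Produced by Generic.__class_getitem__: an alias object, not a new class.
alias = Dataset[int]
print(alias.__origin__, alias.__args__)

# Calling the alias creates an ordinary Dataset instance.
d = alias([1, 2, 3])
print(type(d) is Dataset)
```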
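Something looks odd here, though. Generic.__class_getitem__(cls, params) returns a _GenericAlias object, so Generic[T] is roughly equivalent to _GenericAlias(cls, (T,)), and a class written as class Dataset(Generic[T_co]) would seem to have a _GenericAlias instance as its base rather than Generic itself. Yet tools such as PyCharm show that Dataset does inherit from Generic. The reason is that _GenericAlias inherits from _BaseGenericAlias, which defines a key special method, __mro_entries__. This method lets an object that is not a class substitute real classes for itself in the list of bases while a class is being created, which shows how flexible Python is. The implementation is: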

def __mro_entries__(self, bases):
    res = []
    if self.__origin__ not in bases:
        res.append(self.__origin__)
    i = bases.index(self)
    for b in bases[i+1:]:
        if isinstance(b, _BaseGenericAlias) or issubclass(b, Generic):
            break
    else:
        res.append(Generic)
    return tuple(res)

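Following the logic of this function: it adds the alias's __origin__ to res if it is not already among the bases, then checks whether any later base is itself generic. If none is, it also appends Generic to res.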
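The mechanism itself is not specific to typing. The sketch below uses a hypothetical Alias class to show an instance being replaced by a real class in the bases, and then checks the same behaviour on a class that inherits from Generic[T]. The original, unreplaced bases are kept in __orig_bases__.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Base:
    pass


class Alias:
    """Hypothetical non-class object that can still appear in a base list."""

    def __init__(self, origin):
        self.origin = origin

    def __mro_entries__(self, bases):
        # Replace this object with the real class it stands for.
        return (self.origin,)


class Child(Alias(Base)):
    pass


class Dataset(Generic[T]):
    pass


print(Child.__bases__)         # the substituted real base class
print(Dataset.__bases__)       # Generic, not the Generic[T] alias
print(Dataset.__orig_bases__)  # the bases as written in the class statement
```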

A subtle difference between two kinds of type hints:

def add_module(self, name: str, module: Optional['Module']) -> None: 
def add_module(self, name: str, module: Optional[Module]) -> None:

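The only difference is the pair of single quotes, and in most situations the two forms are interchangeable. The quoted form (a forward reference) has the advantage of avoiding some scoping problems, for example: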

from typing import Union, Optional


class Module:
    def __init__(self, name: str):
        self.name = name

    def test(self, other: Optional['Module']):
        if isinstance(other, Module):
            print(f"{self.name} and {other.name} are both modules.")


Module("module1").test(Module("module2"))

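If the quotes are removed here, the program fails with a NameError, because the annotation is evaluated while the class body is still executing and the name Module is not bound yet (unless annotation evaluation is deferred, for example with from __future__ import annotations).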
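A quoted annotation stays unresolved until something asks for it. The sketch below assumes a hypothetical Node class and uses typing.get_type_hints to resolve the forward reference once the class exists. The localns argument is passed explicitly so that the lookup does not depend on where the class happens to be defined.

```python
from typing import Optional, get_type_hints


class Node:
    def link(self, other: Optional["Node"]) -> None:
        self.next = other


# The stored annotation still contains an unresolved forward reference.
raw = Node.link.__annotations__["other"]

# Resolve the string "Node" now that the class object exists.
resolved = get_type_hints(Node.link, localns={"Node": Node})
print(raw)
print(resolved)
```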

posted @ LRJ313