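Background: our company's SaaS backend for mobile apps is basically stable, but it never feels concise enough, and next to mature open-source Python frameworks it lacks elegance. I keep wanting to refactor the backend code, but every attempt turns into a tangled mess and I don't know where to start.

Since I lack experience implementing frameworks, I plan to start from the Python frameworks I already use and study how other people designed theirs.

Noted here for the record, March 31, 2017.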

 

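  • pony: an ORM model implementation (the M in ORM)

pony's models are a little unusual: a model has to inherit from a class that is a member of a Database instance. Straight to the key code: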

class Database(object):

    @cut_traceback
    def __init__(self, *args, **kwargs):
        # argument 'self' cannot be named 'database', because 'database' can be in kwargs
        self.priority = 0
        self._insert_cache = {}

        # ER-diagram related stuff:
        self._translator_cache = {}
        self._constructed_sql_cache = {}
        self.entities = {}
        self.schema = None
        self.Entity = type.__new__(EntityMeta, 'Entity', (Entity,), {})
        self.Entity._database_ = self

        # Statistics-related stuff:
        self._global_stats = {}
        self._global_stats_lock = RLock()
        self._dblocal = DbLocal()

        self.provider = None
        if args or kwargs: self._bind(*args, **kwargs)

 

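User-defined models derive from the Entity attribute of Database. This attribute is a class, and it handles collecting and converting the user-defined attributes. The implementation is therefore coupled to Database: a model cannot exist on its own and must be attached to a Database instance.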

self.Entity = type.__new__(EntityMeta, 'Entity', (Entity,), {})
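The coupling described above can be sketched in a few lines. This is a minimal illustration, not pony's code: Registry, ModelMeta and the _registry_ attribute are hypothetical names, and the metaclass only records subclasses instead of building an ER diagram.

```python
class ModelMeta(type):
    """Hypothetical metaclass: records every subclass in its owning registry."""

    def __init__(cls, name, bases, cls_dict):
        super().__init__(name, bases, cls_dict)
        # Only classes derived from a per-registry base are registered;
        # the base class itself has no ModelMeta instance among its bases.
        if any(isinstance(base, ModelMeta) for base in bases):
            cls._registry_.models[name] = cls


class Registry:
    """Hypothetical stand-in for Database: it owns its own base class."""

    def __init__(self):
        self.models = {}
        # Each instance builds a fresh base class bound to itself,
        # in the spirit of self.Entity in the excerpt above.
        self.Model = ModelMeta('Model', (object,), {'_registry_': self})


db1 = Registry()
db2 = Registry()


class Customer(db1.Model):
    pass


print(sorted(db1.models))  # model names registered with db1
print(sorted(db2.models))  # db2 is a separate registry
```

Because the base class is created per instance, a model defined against db1 is invisible to db2, which is exactly why such a model cannot live without its owner.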

 

How a user-defined model inherits:

class Customer(db.Entity):
    id = PrimaryKey(int, auto=True)
    name = Required(str)
    email = Required(str, unique=True)
    orders = Set("Order")

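So the base class is simply the db.Entity member of a db instance; the only unusual thing is that this member is a class.

Next, let's see what EntityMeta and Entity actually are.

The code is long, so only the key parts are excerpted: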

class EntityMeta(type):
    def __new__(meta, name, bases, cls_dict):
        if 'Entity' in globals():
            if '__slots__' in cls_dict: throw(TypeError, 'Entity classes cannot contain __slots__ variable')
            cls_dict['__slots__'] = ()
        return super(EntityMeta, meta).__new__(meta, name, bases, cls_dict)
    @cut_traceback
    def __init__(entity, name, bases, cls_dict):
        super(EntityMeta, entity).__init__(name, bases, cls_dict)
        .......
        # Find the user-defined attributes on the model and convert them by attribute type
        # to fit the database; see the Attribute class for details.
        # (base_attrs_dict is set up in code omitted from this excerpt.)
        direct_bases = [ c for c in entity.__bases__ if issubclass(c, Entity) and c.__name__ != 'Entity' ]
        entity._direct_bases_ = direct_bases
        base_attrs = []
        for base in direct_bases:
            for a in base._attrs_:
                prev = base_attrs_dict.get(a.name)
                if prev is None:
                    base_attrs_dict[a.name] = a
                    base_attrs.append(a)
        entity._base_attrs_ = base_attrs
        new_attrs = []
        for name, attr in items_list(entity.__dict__):
            if name in base_attrs_dict: throw(ERDiagramError,
                "Name '%s' hides base attribute %s" % (name,base_attrs_dict[name]))
            if not isinstance(attr, Attribute): continue
            if name.startswith('_') and name.endswith('_'): throw(ERDiagramError,
                'Attribute name cannot both start and end with underscore. Got: %s' % name)
            if attr.entity is not None: throw(ERDiagramError,
                'Duplicate use of attribute %s in entity %s' % (attr, entity.__name__))
            attr._init_(entity, name)
            new_attrs.append(attr)
        # Sort in the order the attributes were defined
        new_attrs.sort(key=attrgetter('id'))
        # Attribute collection is complete
        entity._new_attrs_ = new_attrs
        entity._attrs_ = base_attrs + new_attrs
        entity._adict_ = {attr.name: attr for attr in entity._attrs_}

 

The user-facing interface:

    @cut_traceback
    def __getitem__(entity, key):
        if type(key) is not tuple: key = (key,)
        if len(key) != len(entity._pk_attrs_):
            throw(TypeError, 'Invalid count of attrs in %s primary key (%s instead of %s)'
                             % (entity.__name__, len(key), len(entity._pk_attrs_)))
        kwargs = {attr.name: value for attr, value in izip(entity._pk_attrs_, key)}
        return entity._find_one_(kwargs)

Entity is a class whose metaclass is EntityMeta; it mainly handles the complex relationships in the database:
class Entity(with_metaclass(EntityMeta)):
    .......
In Python 2, the definition above is equivalent to the following:
class Entity(object):
    __metaclass__ = EntityMeta

It is written this way to paper over the differences between Python 2 and Python 3.

The Python 3 syntax is:

class MyClass(metaclass=Meta):
    pass
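To make the compatibility trick concrete, here is a simplified sketch of a with_metaclass helper in the style of the one shipped with the six library. pony's own helper is not shown in the excerpt, so treat this as an illustration of the idea rather than pony's code; Meta and created_by are hypothetical names.

```python
def with_metaclass(meta, *bases):
    """Simplified helper in the style of six.with_metaclass."""

    class metaclass(type):
        def __new__(cls, name, this_bases, namespace):
            # Discard the temporary base and build the final class
            # with the requested metaclass and the real bases.
            return meta(name, bases, namespace)

    # A throwaway base class whose only job is to trigger metaclass.__new__
    # when it is subclassed.
    return type.__new__(metaclass, 'temporary_class', (), {})


class Meta(type):
    def __init__(cls, name, bases, namespace):
        super(Meta, cls).__init__(name, bases, namespace)
        cls.created_by = 'Meta'


class Entity(with_metaclass(Meta)):
    pass


print(type(Entity).__name__)  # the metaclass that built Entity
```

The same class statement parses on both Python 2 and Python 3, which is the whole point: neither the __metaclass__ attribute nor the metaclass= keyword appears in user code.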

 

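Because metaclasses are involved, implementation difficulty: 4 stars.

Key point: capture the user-defined attributes and encapsulate the underlying storage and conversion. This is commonly used to implement the M layer of an ORM.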

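Summary: collecting the members declared on subclasses takes the following 3 steps:

1. Implement your own metaclass;

2. Check the type of each attribute on the subclass and merge attributes of the same kind with those inherited from the bases;

3. Expose a public interface, e.g. __getitem__ and __setattr__.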

For an introduction to using metaclasses, see: http://blog.jobbole.com/21351/
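The three steps above can be sketched as follows. This is a toy illustration in Python 3 under stated assumptions, not pony's implementation: Attribute, CollectingMeta, Model and the in-memory _instances_ store are all hypothetical, and the first declared attribute stands in for the primary key.

```python
import itertools


class Attribute:
    """Hypothetical column declaration; ids preserve declaration order."""
    _ids = itertools.count()

    def __init__(self, py_type):
        self.py_type = py_type
        self.id = next(Attribute._ids)
        self.name = None


class CollectingMeta(type):
    # Step 1: a custom metaclass that runs for every model class.
    def __init__(cls, name, bases, cls_dict):
        super().__init__(name, bases, cls_dict)
        # Step 2: merge inherited attributes, then add the new ones.
        inherited = {}
        for base in bases:
            for attr in getattr(base, '_attrs_', []):
                inherited.setdefault(attr.name, attr)
        new_attrs = []
        for attr_name, value in cls_dict.items():
            if not isinstance(value, Attribute):
                continue
            if attr_name in inherited:
                raise TypeError('%r hides a base attribute' % attr_name)
            value.name = attr_name
            new_attrs.append(value)
        new_attrs.sort(key=lambda attr: attr.id)
        cls._attrs_ = list(inherited.values()) + new_attrs
        cls._instances_ = {}

    # Step 3: a public interface on the class itself, e.g. Model[key].
    def __getitem__(cls, key):
        return cls._instances_[key]


class Model(metaclass=CollectingMeta):
    def __init__(self, **kwargs):
        for attr in self._attrs_:
            setattr(self, attr.name, attr.py_type(kwargs[attr.name]))
        # The first declared attribute plays the role of the primary key.
        key = getattr(self, self._attrs_[0].name)
        type(self)._instances_[key] = self


class Person(Model):
    id = Attribute(int)
    name = Attribute(str)


class Employee(Person):
    salary = Attribute(float)


ann = Employee(id='7', name='Ann', salary=10)
print([attr.name for attr in Employee._attrs_])  # inherited first, then new
print(Employee[7] is ann)                        # lookup through the metaclass
```

Defining __getitem__ on the metaclass is what makes the subscript work on the class object itself, as pony does with Customer[1].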

 

For comparison, here is how infi.clickhouse_orm implements the M:

Step 1: create your own metaclass

class ModelBase(type):
    '''
    A metaclass for ORM models. It adds the _fields list to model classes.
    '''

    ad_hoc_model_cache = {}

    def __new__(cls, name, bases, attrs):
        new_cls = super(ModelBase, cls).__new__(cls, name, bases, attrs)
        # Collect fields from parent classes
        base_fields = []
        for base in bases:
            if isinstance(base, ModelBase):
                base_fields += base._fields
        # Build a list of fields, in the order they were listed in the class
        fields = base_fields + [item for item in attrs.items() if isinstance(item[1], Field)]
        fields.sort(key=lambda item: item[1].creation_counter)
        setattr(new_cls, '_fields', fields)
        return new_cls

 

Here, _fields stores the user-defined (class-level) attributes.

Step 2: implement the base class for the M, which provides the public interface

class Model(with_metaclass(ModelBase)):
    '''
    A base class for ORM models.
    '''

    engine = None
    readonly = False

    def __init__(self, **kwargs):
        '''
        Creates a model instance, using keyword arguments as field values.
        Since values are immediately converted to their Pythonic type,
        invalid values will cause a ValueError to be raised.
        Unrecognized field names will cause an AttributeError.
        '''
        super(Model, self).__init__()

        self._database = None

        # Assign field values from keyword arguments
        for name, value in kwargs.items():
            field = self.get_field(name)
            if field:
                setattr(self, name, value)
            else:
                raise AttributeError('%s does not have a field called %s' % (self.__class__.__name__, name))
        # Assign default values for fields not included in the keyword arguments
        for name, field in self._fields:
            if name not in kwargs:
                setattr(self, name, field.default)

    def __setattr__(self, name, value):
        '''
        When setting a field value, converts the value to its Pythonic type and validates it.
        This may raise a ValueError.
        '''
        field = self.get_field(name)
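        # name refers to a Field declared on the class
        if field:
            value = field.to_python(value, pytz.utc)
            field.validate(value)
        # The value is stored on the instance, where it shadows the class-level Field.
        # get_field looks the name up on the class, so the Field is still found afterwards
        # and every later assignment is converted and validated as well.
        super(Model, self).__setattr__(name, value)

    def get_field(self, name):
        '''
        Get a Field instance given its name, or None if not found.
        '''
        field = getattr(self.__class__, name, None)
        return field if isinstance(field, Field) else None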

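Notes:

__init__ supports construction in the form ModeSome(**kwargs). _fields serves two purposes: 1. supplying the default values at initialization; 2. keeping the Field objects at class level, because the values assigned through ModeSome(**kwargs) and __setattr__ are stored on the instance under the same names and shadow the Field attributes there.

__setattr__ provides a dictionary-style assignment interface: every value assigned to a field is converted to its Python type and validated.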
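The two steps can be condensed into a self-contained sketch. It is a simplified stand-in, not the infi.clickhouse_orm source: this Field takes a plain Python type and a default, to_python has no timezone argument, and validate is folded into the conversion. Visit is a hypothetical model.

```python
class Field:
    """Hypothetical simplified field: converts values with a Python type."""
    creation_counter = 0

    def __init__(self, py_type, default):
        self.py_type = py_type
        self.default = default
        # Remember declaration order across all fields.
        self.creation_counter = Field.creation_counter
        Field.creation_counter += 1

    def to_python(self, value):
        return self.py_type(value)  # raises ValueError on bad input


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        base_fields = []
        for base in bases:
            if isinstance(base, ModelBase):
                base_fields += base._fields
        fields = base_fields + [(n, f) for n, f in attrs.items()
                                if isinstance(f, Field)]
        fields.sort(key=lambda item: item[1].creation_counter)
        new_cls._fields = fields
        return new_cls


class Model(metaclass=ModelBase):
    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if self.get_field(name) is None:
                raise AttributeError('no field called %s' % name)
            setattr(self, name, value)
        for name, field in self._fields:
            if name not in kwargs:
                setattr(self, name, field.default)

    def __setattr__(self, name, value):
        field = self.get_field(name)
        if field is not None:
            value = field.to_python(value)
        super().__setattr__(name, value)

    def get_field(self, name):
        # Looked up on the class, so instance values never hide the Field.
        field = getattr(self.__class__, name, None)
        return field if isinstance(field, Field) else None


class Visit(Model):
    count = Field(int, 0)
    page = Field(str, '')


visit = Visit(count='3')
visit.count = '5'          # converted again on the second assignment
print(visit.count, repr(visit.page))
```

In this sketch a later assignment such as visit.count = 'abc' raises ValueError, because get_field always consults the class rather than the instance.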

 

Scrapy, a famous name in the web-scraping world, also applies the ORM model idea to its user-defined Items.

Here is how Scrapy implements its metaclass:

class ItemMeta(ABCMeta):

    def __new__(mcs, class_name, bases, attrs):
        classcell = attrs.pop('__classcell__', None)
        new_bases = tuple(base._class for base in bases if hasattr(base, '_class'))
        _class = super(ItemMeta, mcs).__new__(mcs, 'x_' + class_name, new_bases, attrs)

        fields = getattr(_class, 'fields', {})
        new_attrs = {}
        for n in dir(_class):
            v = getattr(_class, n)
            if isinstance(v, Field):
                fields[n] = v
            elif n in attrs:
                new_attrs[n] = attrs[n]

        new_attrs['fields'] = fields
        new_attrs['_class'] = _class
        if classcell is not None:
            new_attrs['__classcell__'] = classcell
        return super(ItemMeta, mcs).__new__(mcs, class_name, bases, new_attrs)

The metaclass inherits from ABCMeta, and it tells Item base classes apart by whether they carry a _class attribute.

The parent classes of Item in Scrapy:

live_refs = defaultdict(weakref.WeakKeyDictionary)


class object_ref(object):
    """Inherit from this class (instead of object) to a keep a record of live
    instances"""

    __slots__ = ()

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        live_refs[cls][obj] = time()
        return obj


class BaseItem(object_ref):
    """Base class for all scraped items."""
    pass

The implementation of Item:

class DictItem(MutableMapping, BaseItem):

    fields = {}

    def __init__(self, *args, **kwargs):
        self._values = {}
        if args or kwargs:  # avoid creating dict for most common case
            for k, v in six.iteritems(dict(*args, **kwargs)):
                self[k] = v
        
    ...........


@six.add_metaclass(ItemMeta)
class Item(DictItem):
    pass

 

Item reuses the MutableMapping class, so it behaves much like a native Python dict.
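The dict-like behaviour can be illustrated without Scrapy installed. The sketch below is an assumption-laden miniature, not Scrapy's code: SimpleItemMeta, SimpleItem and Product are hypothetical, and the metaclass collects fields in one pass instead of building the intermediate _class.

```python
from abc import ABCMeta
from collections.abc import MutableMapping


class Field(dict):
    """Marker for a declared field (holds optional metadata)."""


class SimpleItemMeta(ABCMeta):
    def __new__(mcs, name, bases, attrs):
        # Start from the fields declared on the bases, then add new ones.
        fields = {}
        for base in bases:
            fields.update(getattr(base, 'fields', {}))
        plain = {}
        for key, value in attrs.items():
            if isinstance(value, Field):
                fields[key] = value
            else:
                plain[key] = value
        plain['fields'] = fields
        return super().__new__(mcs, name, bases, plain)


class SimpleItem(MutableMapping, metaclass=SimpleItemMeta):
    def __init__(self, *args, **kwargs):
        self._values = {}
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        # Only declared fields may be stored.
        if key not in self.fields:
            raise KeyError('%s does not support field: %s'
                           % (type(self).__name__, key))
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


class Product(SimpleItem):
    name = Field()
    price = Field()


product = Product(name='pen')
product['price'] = 2
print(dict(product))
```

Implementing the five abstract methods is enough; MutableMapping then supplies get, update, items, pop and the rest, which is why the item feels like a native dict.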

  • How pony loads different provider implementations (creating instances dynamically)

 

def _bind(self, *args, **kwargs):

        if self.provider is not None:
            throw(TypeError, 'Database object was already bound to %s provider' % self.provider.dialect)
        if args: provider, args = args[0], args[1:]
        elif 'provider' not in kwargs: throw(TypeError, 'Database provider is not specified')
        else: provider = kwargs.pop('provider')
        if isinstance(provider, type) and issubclass(provider, DBAPIProvider):
            provider_cls = provider
        else:
            if not isinstance(provider, basestring): throw(TypeError)
            if provider == 'pygresql': throw(TypeError,
                'Pony no longer supports PyGreSQL module. Please use psycopg2 instead.')
            provider_module = import_module('pony.orm.dbproviders.' + provider)
            provider_cls = provider_module.provider_cls
        self.provider = provider = provider_cls(*args, **kwargs)

The key code:

provider_module = import_module('pony.orm.dbproviders.' + provider)
provider_cls = provider_module.provider_cls
self.provider = provider = provider_cls(*args, **kwargs)

provider is the identifier string supplied by the user, and every provider module exposes the same public name: provider_cls

Taking the SQLite module as an example:

provider_cls = SQLiteProvider

Example call:

db = Database("sqlite", "demo.sqlite", create_db=True)

 

Implementation difficulty: 1 star.

Use case: creating the required object from an identifier, i.e. the factory pattern.
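A self-contained sketch of this factory follows. Everything here is hypothetical: the demo_providers_ prefix, the in-memory module registered through sys.modules (so that import_module has something to find without files on disk), and the toy SQLiteProvider. Only the lookup pattern mirrors the excerpt.

```python
import sys
import types
from importlib import import_module

PREFIX = 'demo_providers_'  # hypothetical module-name prefix


class SQLiteProvider:
    """Toy provider that just remembers its constructor arguments."""
    dialect = 'SQLite'

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


def register_demo_provider(name, cls):
    # Stand-in for a real provider module on disk: build a module object
    # that exposes the agreed-upon name provider_cls.
    module = types.ModuleType(PREFIX + name)
    module.provider_cls = cls
    sys.modules[module.__name__] = module


register_demo_provider('sqlite', SQLiteProvider)


class Database:
    def __init__(self, provider, *args, **kwargs):
        if isinstance(provider, type):
            provider_cls = provider  # a class was passed directly
        else:
            # Resolve the identifier string to a module, then to its class.
            provider_module = import_module(PREFIX + provider)
            provider_cls = provider_module.provider_cls
        self.provider = provider_cls(*args, **kwargs)


db = Database('sqlite', 'demo.sqlite', create_db=True)
print(db.provider.dialect, db.provider.args, db.provider.kwargs)
```

Adding a new backend then means adding one module that defines provider_cls; the Database code does not change, and an unknown identifier fails with ImportError at bind time.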

 

posted on 2017-03-31 19:56  闪电战