Where the self of a Python object comes from

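I. The example

The example follows the approach from https://eli.thegreenplace.net/2012/06/15/under-the-hood-of-python-class-definitions. The neat part of that approach is that the class is defined inside a function, so the code that __build_class__ runs can be printed in full through __code__.co_consts. To save a trip to the link, the same steps are reproduced here.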

tsecer@harry: cat classself.py 
def tsecer():
    class harry():
        def fry():
            pass

tsecer@harry: ../../Python-3.6.0/python 
Python 3.6.0 (default, Nov 15 2018, 10:32:57) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis, classself
>>> dis.dis(classself) 
Disassembly of tsecer:
2 0 LOAD_BUILD_CLASS
2 LOAD_CONST 1 (<code object harry at 0x7f7b51e3ddc0, file "/data1/harry/work/python/classself/classself.py", line 2>)
4 LOAD_CONST 2 ('harry')
6 MAKE_FUNCTION 0
8 LOAD_CONST 2 ('harry')
10 CALL_FUNCTION 2
12 STORE_FAST 0 (harry)
14 LOAD_CONST 0 (None)
16 RETURN_VALUE
>>> dis.dis(classself.tsecer.__code__.co_consts[1])
2 0 LOAD_NAME 0 (__name__)
2 STORE_NAME 1 (__module__)
4 LOAD_CONST 0 ('tsecer.<locals>.harry')
6 STORE_NAME 2 (__qualname__)

3 8 LOAD_CONST 1 (<code object fry at 0x7f7b51eae640, file "/data1/harry/work/python/classself/classself.py", line 3>)
10 LOAD_CONST 2 ('tsecer.<locals>.harry.fry')
12 MAKE_FUNCTION 0
14 STORE_NAME 3 (fry)
16 LOAD_CONST 3 (None)
18 RETURN_VALUE
>>>
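The same inspection can be scripted without relying on a fixed co_consts index, which may differ between interpreter versions. The following is a minimal sketch; find_code is a hypothetical helper introduced only for illustration, and it simply scans co_consts for a nested code object by name.

```python
import types

def tsecer():
    class harry():
        def fry():
            pass

def find_code(code, name):
    """Return the nested code object called `name` in code.co_consts, or None.

    Hypothetical helper for this sketch, not part of CPython.
    """
    for const in code.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    return None

harry_code = find_code(tsecer.__code__, "harry")  # code object of the class body
fry_code = find_code(harry_code, "fry")           # code object of the method body

# The names the class body stores with STORE_NAME show up in co_names.
print(harry_code.co_names)
```

Passing harry_code to dis.dis should give a listing like the second disassembly above.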

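II. How the VM executes the LOAD_BUILD_CLASS instruction

Note that the third argument passed to PyEval_EvalCodeEx is the mapping that receives the locals of the executed code. In other words, the class definition is run like a function call and its locals end up in ns. Once the class body has run, the populated ns is passed to meta as an argument, and these locals become the attrs from which the new class type is created.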
/* AC: cannot convert yet, waiting for *args support */
static PyObject *
builtin___build_class__(PyObject *self, PyObject *args, PyObject *kwds)
{
……
cell = PyEval_EvalCodeEx(PyFunction_GET_CODE(func), PyFunction_GET_GLOBALS(func), ns,
NULL, 0, NULL, 0, NULL, 0, NULL,
PyFunction_GET_CLOSURE(func));
if (cell != NULL) {
PyObject *margs[3] = {name, bases, ns};
cls = _PyObject_FastCallDict(meta, margs, 3, mkw);
……
}
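The two steps in this excerpt (run the class body with ns as locals, then call the metaclass with name, bases and ns) can be imitated from Python. This is only a rough sketch under simplifying assumptions: the metaclass is fixed to type, there are no bases, and the class body is compiled from a string; the variable names are chosen for illustration.

```python
import types

source = "class harry:\n    def fry(self):\n        return 'fried'\n"
module_code = compile(source, "<sketch>", "exec")

# Pick the class body code object out of the module's constants.
body_code = next(
    c for c in module_code.co_consts
    if isinstance(c, types.CodeType) and c.co_name == "harry"
)

ns = {}                                      # plays the role of ns in the C code
exec(body_code, {"__name__": "sketch"}, ns)  # run the class body with ns as locals
cls = type("harry", (), ns)                  # corresponds to meta(name, bases, ns)

print(sorted(k for k in ns if not k.startswith("__")))
```

The class body never returns the class itself; everything the new type needs is carried by ns.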

III. From locals to the tp_dict of the newly created class

static PyObject *

type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
{
……
/* Check arguments: (name, bases, dict) */
if (!PyArg_ParseTuple(args, "UO!O!:type.__new__", &name, &PyTuple_Type,
&bases, &PyDict_Type, &orig_dict))
……
dict = PyDict_Copy(orig_dict);
if (dict == NULL)
goto error;
……
/* Initialize tp_dict from passed-in dict */
Py_INCREF(dict);
type->tp_dict = dict;
……
}
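The PyDict_Copy call above has a visible consequence at the Python level: the class gets its own copy of the namespace, so later changes to the original dict do not reach the class. A small sketch, with names chosen only for illustration:

```python
def fry(self):
    return "fried"

ns = {"fry": fry}
harry = type("harry", (), ns)   # type_new copies ns into the new tp_dict

ns["late"] = 1                  # mutate the original dict after class creation

print(hasattr(harry, "late"))          # the copy inside the class is unaffected
print(harry.__dict__["fry"] is fry)    # the values themselves are shared, not copied
```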

IV. Executing the STORE_NAME instruction inside a class definition

As the code shows, STORE_NAME stores the value into f->f_locals, which is the ns passed in from builtin___build_class__.
Python-3.6.0\Python\ceval.c
PyObject *
_PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
……
TARGET(STORE_NAME) {
PyObject *name = GETITEM(names, oparg);
PyObject *v = POP();
PyObject *ns = f->f_locals;
int err;
if (ns == NULL) {
PyErr_Format(PyExc_SystemError,
"no locals found when storing %R", name);
Py_DECREF(v);
goto error;
}
if (PyDict_CheckExact(ns))
err = PyDict_SetItem(ns, name, v);
else
err = PyObject_SetItem(ns, name, v);
Py_DECREF(v);
if (err != 0)
goto error;
DISPATCH();
}
……
}
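The PyObject_SetItem branch above is reachable from Python: when the namespace is not an exact dict, every STORE_NAME in the class body goes through its __setitem__. The sketch below relies on the standard __prepare__ metaclass hook, which the excerpts in this post do not show; RecordingDict and Meta are hypothetical names used only for illustration.

```python
class RecordingDict(dict):
    """A dict subclass that remembers every key stored into it."""
    def __init__(self):
        super().__init__()
        self.stored = []

    def __setitem__(self, key, value):
        self.stored.append(key)
        super().__setitem__(key, value)

class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The returned mapping becomes f_locals of the class body.
        return RecordingDict()

    def __new__(mcls, name, bases, ns, **kwds):
        cls = super().__new__(mcls, name, bases, dict(ns))
        cls.stored_names = list(ns.stored)
        return cls

class harry(metaclass=Meta):
    def fry(self):
        pass

print(harry.stored_names)
```

The recorded keys start with __module__ and __qualname__, matching the STORE_NAME instructions in the class body disassembly.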

V. Executing the MAKE_FUNCTION instruction for fry

In short, this creates an object of type PyFunction_Type.
Python-3.6.0\Python\ceval.c
PyObject *
_PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
……
TARGET(MAKE_FUNCTION) {
PyObject *qualname = POP();
PyObject *codeobj = POP();
PyFunctionObject *func = (PyFunctionObject *)
PyFunction_NewWithQualName(codeobj, f->f_globals, qualname);
……
}
Python-3.6.0\Objects\funcobject.c
PyObject *
PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname)
{
PyFunctionObject *op;
PyObject *doc, *consts, *module;
static PyObject *__name__ = NULL;

if (__name__ == NULL) {
__name__ = PyUnicode_InternFromString("__name__");
if (__name__ == NULL)
return NULL;
}

op = PyObject_GC_New(PyFunctionObject, &PyFunction_Type);
if (op == NULL)
return NULL;
……
}
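What MAKE_FUNCTION does can be approximated from Python with types.FunctionType, which also builds a function object from a code object and a globals dict. A minimal sketch; template and clone are illustrative names:

```python
import types

def template():
    return 42

# Build a second function object around the same code object,
# roughly what MAKE_FUNCTION does with the code object on the stack.
clone = types.FunctionType(template.__code__, globals(), "clone")

print(type(clone) is types.FunctionType)
print(clone.__name__, clone())
```

Nothing here ties the function to a class or an instance; at this stage fry is just an ordinary function.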

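VI. Calling a method through an object

1. A simple example

The disassembly shows that a.xyz() performs its attribute lookup with the LOAD_ATTR instruction. The file is only disassembled here, not run: xyz takes no self parameter, so actually executing a.xyz() would raise a TypeError.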
tsecer@harry: cat callmethod.py
class A():
    def xyz():
        pass
a = A()
a.xyz()

tsecer@harry: ../../Python-3.6.0/python -m dis callmethod.py 
1 0 LOAD_BUILD_CLASS
2 LOAD_CONST 0 (<code object A at 0x7f8e748a6040, file "callmethod.py", line 1>)
4 LOAD_CONST 1 ('A')
6 MAKE_FUNCTION 0
8 LOAD_CONST 1 ('A')
10 CALL_FUNCTION 2
12 STORE_NAME 0 (A)

4 14 LOAD_NAME 0 (A)
16 CALL_FUNCTION 0
18 STORE_NAME 1 (a)

5 20 LOAD_NAME 1 (a)
22 LOAD_ATTR 2 (xyz)
24 CALL_FUNCTION 0
26 POP_TOP
28 LOAD_CONST 2 (None)
30 RETURN_VALUE
tsecer@harry:
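The opcode used for the attribute lookup can also be checked programmatically. This is a small sketch using dis.get_instructions; on interpreters newer than the 3.6.0 used in this post the opcode name may differ (for example LOAD_METHOD), so the code only collects whichever instruction carries the name xyz.

```python
import dis

# Compile the call expression only; nothing is executed.
code = compile("a.xyz()", "<sketch>", "eval")

instructions = [(ins.opname, ins.argval) for ins in dis.get_instructions(code)]
attr_ops = [opname for opname, argval in instructions if argval == "xyz"]

print(attr_ops)
```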

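2. Executing the LOAD_ATTR instruction

The key step here is the call to _PyType_Lookup. The function quoted below, type_getattro, serves lookups on a type object such as A.xyz; for an instance such as a, LOAD_ATTR goes through PyObject_GenericGetAttr, which calls _PyType_Lookup in the same way and then passes the instance as the second argument of tp_descr_get.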
/* This is similar to PyObject_GenericGetAttr(),
but uses _PyType_Lookup() instead of just looking in type->tp_dict. */
static PyObject *
type_getattro(PyTypeObject *type, PyObject *name)
{
……
/* No data descriptor found on metatype. Look in tp_dict of this
* type and its bases */
attribute = _PyType_Lookup(type, name);
if (attribute != NULL) {
/* Implement descriptor functionality, if any */
descrgetfunc local_get = Py_TYPE(attribute)->tp_descr_get;

Py_XDECREF(meta_attribute);

if (local_get != NULL) {
/* NULL 2nd argument indicates the descriptor was
* found on the target object itself (or a base) */
return local_get(attribute, (PyObject *)NULL,
(PyObject *)type);
}

Py_INCREF(attribute);
return attribute;
}
……
}

In this function, because A is in the MRO of a's type, xyz is found in A's tp_dict. Like the fry seen earlier, it is an instance of PyFunction_Type.
/* Internal API to look for a name through the MRO.
This returns a borrowed reference, and doesn't set an exception! */
PyObject *
_PyType_Lookup(PyTypeObject *type, PyObject *name)
{
Py_ssize_t i, n;
PyObject *mro, *res, *base, *dict;
unsigned int h;

if (MCACHE_CACHEABLE_NAME(name) &&
PyType_HasFeature(type, Py_TPFLAGS_VALID_VERSION_TAG)) {
/* fast path */
h = MCACHE_HASH_METHOD(type, name);
if (method_cache[h].version == type->tp_version_tag &&
method_cache[h].name == name) {
#if MCACHE_STATS
method_cache_hits++;
#endif
return method_cache[h].value;
}
}

/* Look in tp_dict of types in MRO */
mro = type->tp_mro;
……
n = PyTuple_GET_SIZE(mro);
for (i = 0; i < n; i++) {
base = PyTuple_GET_ITEM(mro, i);
assert(PyType_Check(base));
dict = ((PyTypeObject *)base)->tp_dict;
assert(dict && PyDict_Check(dict));
res = PyDict_GetItem(dict, name);
if (res != NULL)
break;
}
……
}
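After _PyType_Lookup returns, type_getattro executes the statement
descrgetfunc local_get = Py_TYPE(attribute)->tp_descr_get;
For PyFunction_Type this slot is func_descr_get. It returns the function unchanged when obj is NULL or None (the type_getattro case) and builds a bound method when obj is an instance such as a: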

/* Bind a function to an object */
static PyObject *
func_descr_get(PyObject *func, PyObject *obj, PyObject *type)
{
if (obj == Py_None || obj == NULL) {
Py_INCREF(func);
return func;
}
return PyMethod_New(func, obj);
}
PyMethod_New then records the instance with im->im_self = self:
PyObject *
PyMethod_New(PyObject *func, PyObject *self)
{
PyMethodObject *im;
if (self == NULL) {
PyErr_BadInternalCall();
return NULL;
}
im = free_list;
if (im != NULL) {
free_list = (PyMethodObject *)(im->im_self);
(void)PyObject_INIT(im, &PyMethod_Type);
numfree--;
}
else {
im = PyObject_GC_New(PyMethodObject, &PyMethod_Type);
if (im == NULL)
return NULL;
}
im->im_weakreflist = NULL;
Py_INCREF(func);
im->im_func = func;
Py_XINCREF(self);
im->im_self = self;
_PyObject_GC_TRACK(im);
return (PyObject *)im;
}
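The effect of func_descr_get and PyMethod_New can be observed directly from Python by calling the function's __get__ by hand. A minimal sketch, with illustrative names:

```python
import types

class A:
    def show(self):
        return "shown"

a = A()
func = A.__dict__["show"]        # the plain function stored in the class dict

bound = func.__get__(a, A)       # obj is an instance: a bound method is created
unbound = func.__get__(None, A)  # obj is None: the function itself comes back

print(type(bound) is types.MethodType)
print(bound.__self__ is a, bound.__func__ is func)   # im_self and im_func
print(unbound is func)
```

Here __self__ and __func__ are the Python-level views of im_self and im_func.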

3. When the CALL_FUNCTION instruction executes

static PyObject *

call_function(PyObject ***pp_stack, Py_ssize_t oparg, PyObject *kwnames)
{
PyObject **pfunc = (*pp_stack) - oparg - 1;
PyObject *func = *pfunc;
PyObject *x, *w;
Py_ssize_t nkwargs = (kwnames == NULL) ? 0 : PyTuple_GET_SIZE(kwnames);
Py_ssize_t nargs = oparg - nkwargs;
PyObject **stack;

/* Always dispatch PyCFunction first, because these are
presumed to be the most frequent callable object.
*/
if (PyCFunction_Check(func)) {
PyThreadState *tstate = PyThreadState_GET();

PCALL(PCALL_CFUNCTION);

stack = (*pp_stack) - nargs - nkwargs;
C_TRACE(x, _PyCFunction_FastCallKeywords(func, stack, nargs, kwnames));
}
else {
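if (PyMethod_Check(func) && PyMethod_GET_SELF(func) != NULL) {// the object returned by PyMethod_New takes this branch: self is stored in the stack slot that held the callable and nargs is incremented; this is where the self parameter of a method comes from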
/* optimize access to bound methods */
PyObject *self = PyMethod_GET_SELF(func);
PCALL(PCALL_METHOD);
PCALL(PCALL_BOUND_METHOD);
Py_INCREF(self);
func = PyMethod_GET_FUNCTION(func);
Py_INCREF(func);
Py_SETREF(*pfunc, self);
nargs++;
}
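The unpacking done in this branch has a direct Python equivalent: calling a bound method gives the same result as calling its underlying function with the instance prepended to the arguments. A short sketch with illustrative names:

```python
class A:
    def add(self, x, y):
        return (self, x + y)

a = A()
method = a.add

via_method = method(1, 2)
# What call_function does internally: take im_func, put im_self in front.
via_function = method.__func__(method.__self__, 1, 2)

print(via_method == via_function)
```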

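4. The benefit of deferring creation of the PyMethod_Type object until LOAD_ATTR executes

This way a function bound to an object can be obtained at any time, for example: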
tsecer@harry: cat methodbind.py 
class A():
    def __init__(self):
        self.xx = "xxx"
    def show(self):
        print(self.xx)

a = A()
f = a.show
f()


tsecer@harry: ../../Python-3.6.0/python methodbind.py 
xxx
tsecer@harry:
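Because the bound method is built at lookup time, it is an ordinary object that can be stored and called later, for instance as a callback. The sketch below assumes CPython behaviour, where each lookup produces a fresh method object; Counter and the variable names are illustrative.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count

c = Counter()
callbacks = [c.bump, c.bump]      # two separate attribute lookups
first, second = callbacks

results = [cb() for cb in callbacks]

print(first is second)   # distinct method objects
print(first == second)   # but equal: same function, same instance
print(results)
```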

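5. Why LOAD_ATTR on a module object does not supply self
For a module object, the MRO searched by _PyType_Lookup contains only the module and object types. Neither of them holds the module's variables, so nothing is found there, no descrgetfunc is invoked, and no self is bound. The function is found in the module's own dict instead and returned as is.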
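A quick way to see this from Python is to put the same function on a module object and on an instance. In both cases it lives in the object's own dict rather than in its type, so it comes back unbound. A minimal sketch using types.ModuleType; the names demo, show and A are illustrative.

```python
import types

def show():
    return "module-level"

mod = types.ModuleType("demo")
mod.show = show            # stored in the module's own __dict__

class A:
    pass

a = A()
a.show = show              # stored in the instance dict, not in the class

print(mod.show is show)                  # no bound method is created
print("show" in type(mod).__dict__)      # the module type does not hold it
print(a.show is show)                    # same for an instance attribute
```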

VII. Looking at an example

1. A concrete example

tsecer@harry: cat methodbind.py 
class A():
    def __init__(self):
        self.xx = "xxx"
    def show(self):
        print(self.xx)

a = A()
f = a.show
f()


tsecer@harry: ../../Python-3.6.0/python 
Python 3.6.0 (default, Nov 15 2018, 10:32:57) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import methodbind
xxx
>>> print(methodbind.__class__.__mro__)
(<class 'module'>, <class 'object'>)
>>> print(methodbind.a.__class__.__mro__)
(<class 'methodbind.A'>, <class 'object'>)
>>> dir(methodbind.A)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'show']
>>> print(methodbind.__class__)
<class 'module'>
>>> dir(methodbind.__class__)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
>>>

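2. Explanation

The output above shows that for methodbind, which is a module, the MRO consists of the module and object types, and neither of them contains the function being looked up, because functions are stored in the module's dict. For the object a, the MRO contains A and object, and show lives in A's tp_dict, so the lookup finds it.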
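The MRO walk can be written out in a few lines of Python. This is only a rough sketch of the slow path of _PyType_Lookup: type_lookup is a hypothetical helper, and the method cache of the C implementation is ignored.

```python
import types

def type_lookup(tp, name):
    """Search the dict of every type in tp's MRO, like the loop in _PyType_Lookup."""
    for base in tp.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return None

class A:
    def show(self):
        return "xxx"

mod = types.ModuleType("demo")
mod.show = A.show   # lives in the module's own dict, not in its type

print(type_lookup(type(mod), "show"))                    # nothing in module or object
print(type_lookup(A, "show") is A.__dict__["show"])      # found in A's dict
```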

VIII. Postscript

When debugging with gdb, if Python was built with DEBUG mode enabled, _PyUnicode_utf8 can be used to display variables of type PyUnicodeObject:
Breakpoint 2, _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2162
2162 if (ns == NULL) {
(gdb) p _PyUnicode_utf8(name)
$1 = 0x7ffff7fa6f60 "__module__"

posted on 2018-11-28 18:15 by tsecer