读懂掌握 Python logging 模块源码 (附带一些 example)
搜了一下自己的 Blog 一直缺乏一篇 Python logging 模块的深度使用的文章。其实这个模块非常常用,也有非常多的滥用。所以看看源码来详细记录一篇属于 logging 模块的文章。
整个 logging 模块的主要部分 1700 来行代码,还是很简单的。我们从实际行为来带大家过下代码
当我们在写 logging.info 的时候我们在干啥?
def info(msg, *args, **kwargs): if len(root.handlers) == 0: basicConfig() root.info(msg, *args, **kwargs)
可以看到默认使用的 root.info 接口。
如果我们直接调用的话会默认调用 basicConfig 的初始配置:
_acquireLock() try: if len(root.handlers) == 0: filename = kwargs.get("filename") if filename: mode = kwargs.get("filemode", 'a') hdlr = FileHandler(filename, mode) else: stream = kwargs.get("stream") hdlr = StreamHandler(stream) fs = kwargs.get("format", BASIC_FORMAT) dfs = kwargs.get("datefmt", None) fmt = Formatter(fs, dfs) hdlr.setFormatter(fmt) root.addHandler(hdlr) level = kwargs.get("level") if level is not None: root.setLevel(level) finally: _releaseLock()
第一行就是拿一把锁,我们可以容易判断, logging 模块应该是考虑了线程安全的情况。
这里我们没有 filename 所以直接会初始化一个流 handler StreamHandler 在这里会传入一个 None 最后会得到一个标准输出 sys.stderr。
之后就是得到一些默认配置,然后设置一下输出的格式化方法最后 set 上 handler。
最后为 root 添加上这个流处理的 handler。
完成后释放锁。
这样我们就完成了 baseConfig 的初始化。 当然你在平时看到的其他 blog 中肯定有介绍初始化 config 的各种方法,其实他们在做的就是覆盖上面那些代码的一些参数。修改成自定义的参数。
那么可能大家会问了, root.info 这个 root 是什么定义呢?
我们来看代码:
root = RootLogger(WARNING) Logger.root = root Logger.manager = Manager(Logger.root)
由于上面代码是会在 logging 被 import 的时候就执行,所以 root 会直接被赋值。让我们来看看 RootLogger 初始化的时候发生了什么。
class RootLogger(Logger): def __init__(self, level): """ Initialize the logger with the name "root". """ Logger.__init__(self, "root", level)
没什么看点 让我们来看父类 logger.__init__ 实现了什么
class Logger(Filterer): def __init__(self, name, level=NOTSET): Filterer.__init__(self) self.name = name self.level = _checkLevel(level) self.parent = None self.propagate = 1 self.handlers = [] self.disabled = 0
这里默认 name 是 root。 然后会初始化这些参数最后返回这个 logger 对象。
logger 有非常多的方法其中就包括 .info .error .warn 之类的直接打印日志的方法。
def info(self, msg, *args, **kwargs): """ Log 'msg % args' with severity 'INFO'. To pass exception information, use the keyword argument exc_info with a true value, e.g. logger.info("Houston, we have a %s", "interesting problem", exc_info=1) """ if self.isEnabledFor(INFO): self._log(INFO, msg, args, **kwargs)
当我们再调用 logging.info 的时候就是在调用 logger('root').info
当我们知道这点时候 就继续来看一下 logger 是怎么打日志的吧。
调用 self._log 方法其实就是调用 logger._log 方法
def _log(self, level, msg, args, exc_info=None, extra=None): if _srcfile: try: fn, lno, func = self.findCaller() except ValueError: fn, lno, func = "(unknown file)", 0, "(unknown function)" else: fn, lno, func = "(unknown file)", 0, "(unknown function)" if exc_info: if not isinstance(exc_info, tuple): exc_info = sys.exc_info() record = self.makeRecord(self.name, level, fn, lno, msg, args, exc_info, func, extra) self.handle(record)
如果没有什么特别的参数配置
会直接初始化 record 调用 makeRecord 方法。最后调用自己的 handle 方法。我们可以先来看下 makeRecord 方法。
在看这个代码之前可以先为大家明确一下 logRecord 在 logging 日志模块里面意义。就是说我们每打印的一条日志 其实就是一个 logRecord 实例。好我们来看代码
def __init__(self, name, level, pathname, lineno, msg, args, exc_info, func=None): """ Initialize a logging record with interesting information. """ ct = time.time() self.name = name self.msg = msg # # The following statement allows passing of a dictionary as a sole # argument, so that you can do something like # logging.debug("a %(a)d b %(b)s", {'a':1, 'b':2}) # Suggested by Stefan Behnel. # Note that without the test for args[0], we get a problem because # during formatting, we test to see if the arg is present using # 'if self.args:'. If the event being logged is e.g. 'Value is %d' # and if the passed arg fails 'if self.args:' then no formatting # is done. For example, logger.warn('Value is %d', 0) would log # 'Value is %d' instead of 'Value is 0'. # For the use case of passing a dictionary, this should not be a # problem. # Issue #21172: a request was made to relax the isinstance check # to hasattr(args[0], '__getitem__'). However, the docs on string # formatting still seem to suggest a mapping object is required. # Thus, while not removing the isinstance check, it does now look # for collections.Mapping rather than, as before, dict. if (args and len(args) == 1 and isinstance(args[0], collections.Mapping) and args[0]): args = args[0] self.args = args self.levelname = getLevelName(level) self.levelno = level self.pathname = pathname try: self.filename = os.path.basename(pathname) self.module = os.path.splitext(self.filename)[0] except (TypeError, ValueError, AttributeError): self.filename = pathname self.module = "Unknown module" self.exc_info = exc_info self.exc_text = None # used to cache the traceback text self.lineno = lineno self.funcName = func self.created = ct self.msecs = (ct - long(ct)) * 1000 self.relativeCreated = (self.created - _startTime) * 1000 if logThreads and thread: self.thread = thread.get_ident() self.threadName = threading.current_thread().name else: self.thread = None self.threadName = None if not logMultiprocessing: self.processName = None else: self.processName = 'MainProcess' mp = sys.modules.get('multiprocessing') if mp is not None: # Errors may occur if multiprocessing has not finished loading # yet - e.g. if a custom import hook causes third-party code # to run when multiprocessing calls import. See issue 8200 # for an example try: self.processName = mp.current_process().name except StandardError: pass if logProcesses and hasattr(os, 'getpid'): self.process = os.getpid() else: self.process = None
可以看到各种参数,包括打印所需要的当前时间戳,名字要打印的内容。包括一些带进来的参数,需要打印时候的 level 等级。都会被赋到这个 logRecord 实例上面。
拿到这个 record 之后就是执行 logger.handle 方法了。并且传入 record 实例对象。
def handle(self, record): """ Call the handlers for the specified record. This method is used for unpickled records received from a socket, as well as those created locally. Logger-level filtering is applied. """ if (not self.disabled) and self.filter(record): self.callHandlers(record)
如果说没有被 self.filter 过滤的话。注意这里的 handle 方法是是 logger 的。filter 方法却是基类 filter 的。 filter 这里会去取之前 add 进 handler 的 filter 方法是一个数组, Logger 和 Handler 类的 父类都是 Filterer类 他会维护一个 self.filters 的数组。
然后我们调用 callHandlers 方法
def callHandlers(self, record): """ Pass a record to all relevant handlers. Loop through all handlers for this logger and its parents in the logger hierarchy. If no handler was found, output a one-off error message to sys.stderr. Stop searching up the hierarchy whenever a logger with the "propagate" attribute set to zero is found - that will be the last logger whose handlers are called. """ c = self found = 0 while c: for hdlr in c.handlers: found = found + 1 if record.levelno >= hdlr.level: hdlr.handle(record) if not c.propagate: c = None #break out else: c = c.parent if (found == 0) and raiseExceptions and not self.manager.emittedNoHandlerWarning: sys.stderr.write("No handlers could be found for logger" " \"%s\"\n" % self.name) self.manager.emittedNoHandlerWarning = 1
通过 logger 上挂的 handler 数组,依次拿出来调用 handle 方法处理 record 对象。
def handle(self, record): """ Conditionally emit the specified logging record. Emission depends on filters which may have been added to the handler. Wrap the actual emission of the record with acquisition/release of the I/O thread lock. Returns whether the filter passed the record for emission. """ rv = self.filter(record) if rv: self.acquire() try: self.emit(record) finally: self.release() return rv
这个方法是 Handler 类的方法。还记得吗 刚才我们初始化的是 StreamHandler 方法, Handler 类是它的父类。
然后这里会再次调用 handler 的 自身的 filter 方法。如果过滤之后依然有结果就加锁 调起 emit 方法开始打日志。
这里会使用 StreamHandler 方法的 emit 方法。
def emit(self, record): """ Emit a record. If a formatter is specified, it is used to format the record. The record is then written to the stream with a trailing newline. If exception information is present, it is formatted using traceback.print_exception and appended to the stream. If the stream has an 'encoding' attribute, it is used to determine how to do the output to the stream. """ try: msg = self.format(record) stream = self.stream fs = "%s\n" if not _unicode: #if no unicode support... stream.write(fs % msg) else: try: if (isinstance(msg, unicode) and getattr(stream, 'encoding', None)): ufs = u'%s\n' try: stream.write(ufs % msg) except UnicodeEncodeError: #Printing to terminals sometimes fails. For example, #with an encoding of 'cp1251', the above write will #work if written to a stream opened or wrapped by #the codecs module, but fail when writing to a #terminal even when the codepage is set to cp1251. #An extra encoding step seems to be needed. stream.write((ufs % msg).encode(stream.encoding)) else: stream.write(fs % msg) except UnicodeError: stream.write(fs % msg.encode("UTF-8")) self.flush() except (KeyboardInterrupt, SystemExit): raise except: self.handleError(record)
然后就愉快的写到 流里面去了,就结束了整个流程。
其实只要理解了这个流程,会发现打日志还是很简单的。我们平常的使用无法就是在这个流程上修改一些配置,或者初始化的时候不使用 root 而使用自定义的 名字然后增加一些自定义的配置 XD.
Reference:
https://juejin.im/post/5b13fdd0f265da6e0b6ff3dd Pycon2018 利用logging模块轻松地进行Python日志记录