How to fully support MathJax in the mistune parser



The mistune part (Markdown part)

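I use the mistune parser (it really is one of the fastest Markdown parsers available for Python). Its author did write a math-support extension, but the example is incomplete, so... I had to hunt for clues myself...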




# Modified from https://blog.depado.eu/post/mistune-parser-syntax-mathjax-centered-images
# Note: this uses the mistune 0.x API (BlockGrammar, BlockLexer, InlineLexer, Renderer).
import re, mistune

class MathBlockGrammar(mistune.BlockGrammar):
    block_math = re.compile(r"^\$\$(.*?)\$\$", re.DOTALL)
    latex_environment = re.compile(r"^\\begin\{([a-z]*\*?)\}(.*?)\\end\{\1\}", re.DOTALL)

class MathBlockLexer(mistune.BlockLexer):
    default_rules = ['block_math', 'latex_environment'] + mistune.BlockLexer.default_rules

    def __init__(self, rules=None, **kwargs):
        if rules is None:
            rules = MathBlockGrammar()
        super(MathBlockLexer, self).__init__(rules, **kwargs)

    def parse_block_math(self, m):
        """Parse a $$math$$ block"""
        self.tokens.append({
            'type': 'block_math',
            'text': m.group(1)
        })

    def parse_latex_environment(self, m):
        """Parse a LaTeX environment block"""
        self.tokens.append({
            'type': 'latex_environment',
            'name': m.group(1),
            'text': m.group(2)
        })

class MathInlineGrammar(mistune.InlineGrammar):
    math = re.compile(r"^\$(.+?)\$", re.DOTALL)
    block_math = re.compile(r"^\$\$(.+?)\$\$", re.DOTALL)
    text = re.compile(r'^[\s\S]+?(?=[\\<!\[_*`~\$]|https?://| {2,}\n|$)')

class MathInlineLexer(mistune.InlineLexer):
    default_rules = ['block_math', 'math'] + mistune.InlineLexer.default_rules

    def __init__(self, renderer, rules=None, **kwargs):
        if rules is None:
            rules = MathInlineGrammar()
        super(MathInlineLexer, self).__init__(renderer, rules, **kwargs)

    def output_math(self, m):
        return self.renderer.inline_math(m.group(1))

    def output_block_math(self, m):
        return self.renderer.block_math(m.group(1))

class MathRendererMixin(mistune.Renderer):
    def block_code(self, code, lang=None):
        code = code.rstrip('\n')
        if not lang:
            lang = 'text'
        code = mistune.escape(code, quote=True, smart_amp=False)
        return '<pre class="language-%s"><code class="language-%s">%s\n</code></pre>\n' % (lang, lang, code)

    def block_math(self, text):
        return '$$%s$$' % text

    def latex_environment(self, name, text):
        return r'\begin{%s}%s\end{%s}' % (name, text, name)

    def inline_math(self, text):
        return '$%s$' % text

class MarkdownWithMath(mistune.Markdown):
    def __init__(self, renderer, **kwargs):
        if 'inline' not in kwargs:
            kwargs['inline'] = MathInlineLexer
        if 'block' not in kwargs:
            kwargs['block'] = MathBlockLexer
        super(MarkdownWithMath, self).__init__(renderer, **kwargs)

    def output_block_math(self):
        return self.renderer.block_math(self.token['text'])

    def output_latex_environment(self):
        return self.renderer.latex_environment(self.token['name'], self.token['text'])


mk = MarkdownWithMath(renderer=MathRendererMixin())
content = mk(r"{}".format(content))
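
To make the block-level rules easier to follow, here is a minimal sketch that uses only the standard `re` module and the two patterns from MathBlockGrammar. The helper name `first_block_token` is hypothetical and not part of mistune. It only mimics the token dictionaries that the two parse methods are meant to push onto the lexer's token list.

```python
import re

# Same patterns as in MathBlockGrammar above.
block_math = re.compile(r"^\$\$(.*?)\$\$", re.DOTALL)
latex_environment = re.compile(r"^\\begin\{([a-z]*\*?)\}(.*?)\\end\{\1\}", re.DOTALL)


def first_block_token(src):
    """Return the token a math rule would produce at the start of src, or None.

    Hypothetical helper for illustration; mistune's real lexer also tries
    all of its default block rules after these two.
    """
    m = block_math.match(src)
    if m:
        return {'type': 'block_math', 'text': m.group(1)}
    m = latex_environment.match(src)
    if m:
        return {'type': 'latex_environment', 'name': m.group(1), 'text': m.group(2)}
    return None


print(first_block_token("$$\nE = mc^2\n$$\nrest of the document"))
print(first_block_token(r"\begin{align*}a &= b\end{align*}"))
print(first_block_token("a plain paragraph"))
```

Because both patterns are anchored with `^` and listed first in default_rules, math is claimed before the regular Markdown block rules get a chance to treat `_` or `*` inside a formula as emphasis.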



The entries of \(C\) are given by the exact formula:

\[C_{ik} = \sum_{j=1}^n A_{ij} B_{jk} \]

but there are many ways to implement this computation. \(\approx 2mnp\) flops


\[C = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & 0 & -c_0 \\ 0 & 0 & 0 & \cdots & 0 & 1 & -c_{m-1} \end{pmatrix} \]



\[ {\bf b}_{i}^{r}(t)=(1-t)\,{\bf b}_{i}^{r-1}(t)+t\,{\bf b}_{i+1}^{r-1}(t),\: i=\overline{0,n-r}, \]

i.e. the \(i^{th}\)

Source for the formulas above (these samples come from mistune's extension library):

The entries of $C$ are given by the exact formula:
$$
C_{ik} = \sum_{j=1}^n A_{ij} B_{jk}
$$
but there are many ways to _implement_ this computation.   $\approx 2mnp$ flops

$$
C = \begin{pmatrix}
          0 & 0 & 0 & \cdots & 0 & 0 & -c_0 \\
          0 & 0 & 0 & \cdots & 0 & 1 & -c_{m-1}
\end{pmatrix}
$$

$$ {\bf
b}_{i}^{r}(t)=(1-t)\,{\bf b}_{i}^{r-1}(t)+t\,{\bf b}_{i+1}^{r-1}(t),\:
 i=\overline{0,n-r}, $$
i.e. the $i^{th}$
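
As a rough illustration of how the inline rules carve up a line like the last one, the sketch below applies only the three patterns from MathInlineGrammar, in the same priority order (block_math, math, then text). The function `split_inline` is a hypothetical helper, and the real InlineLexer also runs its other default rules (escapes, emphasis, links and so on), so treat this as a simplification.

```python
import re

# Patterns copied from MathInlineGrammar above.
math = re.compile(r"^\$(.+?)\$", re.DOTALL)
block_math = re.compile(r"^\$\$(.+?)\$\$", re.DOTALL)
text = re.compile(r'^[\s\S]+?(?=[\\<!\[_*`~\$]|https?://| {2,}\n|$)')

RULES = (('block_math', block_math), ('math', math), ('text', text))


def split_inline(src):
    """Split src into (rule_name, value) pairs using the three rules above.

    The text rule always consumes at least one character of a non-empty
    string, so the loop is guaranteed to make progress.
    """
    pieces = []
    while src:
        for name, pattern in RULES:
            m = pattern.match(src)
            if m:
                value = m.group(0) if name == 'text' else m.group(1)
                pieces.append((name, value))
                src = src[m.end():]
                break
    return pieces


print(split_inline("i.e. the $i^{th}$ term"))
print(split_inline("cost is $5"))
```

The important detail is the `\$` added to the lookahead of the text pattern: without it, the text rule would swallow the dollar sign together with the surrounding words, and the math rule would never see a string that starts with `$`.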

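As for block_code: I use CodeBlock.js to draw the frame around code blocks and Prism.js for syntax highlighting, and they expect code blocks of the form <pre class="language-%s"><code class="language-%s">%s\n</code></pre>\n, so I simply re-implemented the Renderer's block_code function.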
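
The markup produced by that override can be checked without mistune installed. The sketch below is a stand-alone approximation: `render_block_code` is a hypothetical name, and the standard library's `html.escape` stands in for `mistune.escape` (an assumption; the two differ slightly in how they encode single quotes).

```python
from html import escape  # stdlib stand-in for mistune.escape (assumption)


def render_block_code(code, lang=None):
    """Build <pre>/<code> markup carrying the language-xxx classes Prism.js uses."""
    code = code.rstrip('\n')
    if not lang:
        lang = 'text'  # same fallback as in block_code above
    code = escape(code, quote=True)  # keep <, >, & and quotes from breaking the HTML
    return '<pre class="language-%s"><code class="language-%s">%s\n</code></pre>\n' % (lang, lang, code)


print(render_block_code('if a < b:\n    print("ok")\n', 'python'))
print(render_block_code('no language given'))
```

Escaping matters here because the code text is placed directly inside HTML; an unescaped `<` in a code sample would otherwise be read as the start of a tag.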




<!-- Mathjax JS dns prefetch-->
<link rel="dns-prefetch" href="//cdn.bootcss.com" />


<script type="text/x-mathjax-config">
	var articlemathId = document.getElementById("articleContent");
	var commentmathId = document.getElementById("commentlist-container");
	MathJax.Hub.Config({
		tex2jax: {
			inlineMath: [ ['$','$'] ], // inline formulas
			displayMath: [ ['$$','$$'] ], // display formulas
			skipTags: ['script', 'noscript', 'style', 'textarea', 'pre','code','a'], // HTML tags skipped during rendering
			ignoreClass: "summary", // class to ignore
		}
	});
	// HTML blocks to typeset; several elements are passed as one array
	MathJax.Hub.Queue(["Typeset", MathJax.Hub, [articlemathId, commentmathId]]);
</script>
<script src="//cdn.bootcss.com/mathjax/2.7.7/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>

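As for why the src of the script tag starts with //: this is a protocol-relative URL, a relative reference that RFC 3986 calls a network-path reference. The relevant standard is RFC 3986 Section 4.2 (I doubt many people will read it all the way through).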

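Simply put, for a URL like this the browser takes the scheme of the current page and puts it in front of the //. For example, this article is served over https, so //cdn.bootcss.com/ becomes https://cdn.bootcss.com/. The same applies to other schemes.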
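
The same resolution rule can be reproduced in Python with `urllib.parse.urljoin`, which implements RFC 3986 reference resolution. In this small sketch the page URLs on example.com are made up for illustration, and `resolve` is a hypothetical helper name.

```python
from urllib.parse import urljoin

# The protocol-relative src used in the script tag above.
SRC = "//cdn.bootcss.com/mathjax/2.7.7/MathJax.js?config=TeX-AMS-MML_HTMLorMML"


def resolve(page_url, reference=SRC):
    """Resolve a reference against the URL of the page that contains it."""
    return urljoin(page_url, reference)


# Hypothetical page URLs: only the scheme is inherited from the page.
print(resolve("https://example.com/post/mistune-mathjax"))
print(resolve("http://example.com/post/mistune-mathjax"))
```

Only the scheme comes from the page; the host, path and query string are taken from the reference itself.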

With all of the above in place, you can happily use a MathJax-enabled mistune parser in your Python web framework.

