How to bulk-import Markdown into cnblogs (博客园)

Purpose

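Sometimes we do not keep our notes on cnblogs. They may live in a note-taking app such as Youdao Note or Evernote, in a GitHub repository, or on a blog built with hexo. What if one day you want to publish those notes on cnblogs? Editing and uploading them one by one is far too much work, so a way to upload them in bulk is needed.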

Implementation

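  1. The MetaWeblog interface provided by cnblogs. Its address is shown on the last line of the cnblogs admin page, labeled "MetaWeblog访问地址" (MetaWeblog access address).
  2. Python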

In short: use Python to batch-convert the Markdown files to HTML, then upload the HTML to cnblogs through the MetaWeblog API.

Batch-converting Markdown to HTML

An earlier post of mine describes in detail how to convert Markdown to HTML; here that code is turned into a batch job.

import mistune
import sys
import codecs
import os
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import html


class HighlightRenderer(mistune.Renderer):

    def block_code(self, code, lang):
        if not lang:
            return '\n<pre><code>%s</code></pre>\n' % \
                mistune.escape(code)
        lexer = get_lexer_by_name(lang, stripall=True)
        formatter = html.HtmlFormatter()
        return highlight(code, lexer, formatter)


def md2html(s):
    """Convert the Markdown file at path s to an HTML file next to it."""
    with codecs.open(s, mode='r', encoding='utf-8') as mdfile:
        md_text = mdfile.read()

    # mistune.Renderer and mistune.Markdown are the mistune 0.x API
    renderer = HighlightRenderer()
    markdown = mistune.Markdown(renderer=renderer)
    html_text = markdown(md_text)
    html_text = html_text.replace('highlight', 'cnblogs_code ')

    html_name = '%s.html' % (s[:-3])  # strip the trailing ".md"
    with codecs.open(html_name, 'w', encoding='utf-8', errors='xmlcharrefreplace') as output_file:
        output_file.write(html_text)


def getAllFile(path, suffix='.'):
    """Return all files under path (searched recursively) whose names end with suffix."""
    fpath = []

    for root, dirs, fnames in os.walk(path):
        for name in fnames:
            if name.endswith(suffix):
                fpath.append(os.path.join(root, name))

    return fpath


def convertAll(path):
    flist = getAllFile(path, ".md")
    for fname in flist:
        md2html(fname)


if __name__ == "__main__":
    path = ''
    if len(sys.argv) == 1:
        path = os.getcwd()

    elif len(sys.argv) == 2:
        path = sys.argv[1]
    else:
        print("error: expected at most one argument, the directory to scan")
        sys.exit(1)

    convertAll(path)
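
The file discovery and output naming used by the script above can also be sketched with pathlib. This is only an illustration: find_markdown and html_path_for are hypothetical helper names, not part of the original script, and the demo runs on a throwaway directory.

```python
import tempfile
from pathlib import Path


def find_markdown(root):
    """Return every *.md file under root, including subdirectories."""
    return sorted(Path(root).rglob("*.md"))


def html_path_for(md_path):
    """Map notes/a.md to notes/a.html by swapping the suffix."""
    return Path(md_path).with_suffix(".html")


# Demo on a temporary directory tree, so nothing real is touched
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.md").write_text("# a", encoding="utf-8")
    (Path(tmp) / "sub" / "b.md").write_text("# b", encoding="utf-8")
    (Path(tmp) / "c.txt").write_text("not markdown", encoding="utf-8")
    found = [p.relative_to(tmp).as_posix() for p in find_markdown(tmp)]

print(found)
print(html_path_for("notes/a.md"))
```

Swapping the suffix avoids slicing the path by a fixed number of characters, which only works while the extension is exactly ".md".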

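The line html_text = html_text.replace('highlight', 'cnblogs_code ') in md2html may look odd. The reason is that code blocks generated by mistune + pygments carry the class name highlight, while cnblogs uses cnblogs_code by default. So two steps are needed for code highlighting to work: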

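  1. Replace highlight with cnblogs_code, which is what the line above does.
  2. In the cnblogs admin panel, go to Settings -> Page custom CSS (设置 -> 页面定制css代码) and add the styles below.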
/*.cnblogs_code { background-color: #f5f5f5;border:1px}
.cnblogs_code {border-right:gray 1px solid;border-top:gray 1px solid;border-left:gray 1px solid;border-bottom:gray 1px solid;background-color:#fff;padding:2px}*/
.cnblogs_code .hll { background-color: #ffffcc }
.cnblogs_code .c { color: #999988; font-style: italic } /* Comment */
.cnblogs_code .err { color: #a61717; background-color: #e3d2d2 } /* Error */
.cnblogs_code .k { color: #000000; font-weight: bold } /* Keyword */
.cnblogs_code .o { color: #000000; font-weight: bold } /* Operator */
.cnblogs_code .cm { color: #999988; font-style: italic } /* Comment.Multiline */
.cnblogs_code .cp { color: #999999; font-weight: bold; font-style: italic } /* Comment.Preproc */
.cnblogs_code .c1 { color: #999988; font-style: italic } /* Comment.Single */
.cnblogs_code .cs { color: #999999; font-weight: bold; font-style: italic } /* Comment.Special */
.cnblogs_code .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */
.cnblogs_code .ge { color: #000000; font-style: italic } /* Generic.Emph */
.cnblogs_code .gr { color: #aa0000 } /* Generic.Error */
.cnblogs_code .gh { color: #999999 } /* Generic.Heading */
.cnblogs_code .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */
.cnblogs_code .go { color: #888888 } /* Generic.Output */
.cnblogs_code .gp { color: #555555 } /* Generic.Prompt */
.cnblogs_code .gs { font-weight: bold } /* Generic.Strong */
.cnblogs_code .gu { color: #aaaaaa } /* Generic.Subheading */
.cnblogs_code .gt { color: #aa0000 } /* Generic.Traceback */
.cnblogs_code .kc { color: #000000; font-weight: bold } /* Keyword.Constant */
.cnblogs_code .kd { color: #000000; font-weight: bold } /* Keyword.Declaration */
.cnblogs_code .kn { color: #000000; font-weight: bold } /* Keyword.Namespace */
.cnblogs_code .kp { color: #000000; font-weight: bold } /* Keyword.Pseudo */
.cnblogs_code .kr { color: #000000; font-weight: bold } /* Keyword.Reserved */
.cnblogs_code .kt { color: #445588; font-weight: bold } /* Keyword.Type */
.cnblogs_code .m { color: #009999 } /* Literal.Number */
.cnblogs_code .s { color: #d01040 } /* Literal.String */
.cnblogs_code .na { color: #008080 } /* Name.Attribute */
.cnblogs_code .nb { color: #0086B3 } /* Name.Builtin */
.cnblogs_code .nc { color: #445588; font-weight: bold } /* Name.Class */
.cnblogs_code .no { color: #008080 } /* Name.Constant */
.cnblogs_code .nd { color: #3c5d5d; font-weight: bold } /* Name.Decorator */
.cnblogs_code .ni { color: #800080 } /* Name.Entity */
.cnblogs_code .ne { color: #990000; font-weight: bold } /* Name.Exception */
.cnblogs_code .nf { color: #990000; font-weight: bold } /* Name.Function */
.cnblogs_code .nl { color: #990000; font-weight: bold } /* Name.Label */
.cnblogs_code .nn { color: #555555 } /* Name.Namespace */
.cnblogs_code .nt { color: #000080 } /* Name.Tag */
.cnblogs_code .nv { color: #008080 } /* Name.Variable */
.cnblogs_code .ow { color: #000000; font-weight: bold } /* Operator.Word */
.cnblogs_code .w { color: #bbbbbb } /* Text.Whitespace */
.cnblogs_code .mf { color: #009999 } /* Literal.Number.Float */
.cnblogs_code .mh { color: #009999 } /* Literal.Number.Hex */
.cnblogs_code .mi { color: #009999 } /* Literal.Number.Integer */
.cnblogs_code .mo { color: #009999 } /* Literal.Number.Oct */
.cnblogs_code .sb { color: #d01040 } /* Literal.String.Backtick */
.cnblogs_code .sc { color: #d01040 } /* Literal.String.Char */
.cnblogs_code .sd { color: #d01040 } /* Literal.String.Doc */
.cnblogs_code .s2 { color: #d01040 } /* Literal.String.Double */
.cnblogs_code .se { color: #d01040 } /* Literal.String.Escape */
.cnblogs_code .sh { color: #d01040 } /* Literal.String.Heredoc */
.cnblogs_code .si { color: #d01040 } /* Literal.String.Interpol */
.cnblogs_code .sx { color: #d01040 } /* Literal.String.Other */
.cnblogs_code .sr { color: #009926 } /* Literal.String.Regex */
.cnblogs_code .s1 { color: #d01040 } /* Literal.String.Single */
.cnblogs_code .ss { color: #990073 } /* Literal.String.Symbol */
.cnblogs_code .bp { color: #999999 } /* Name.Builtin.Pseudo */
.cnblogs_code .vc { color: #008080 } /* Name.Variable.Class */
.cnblogs_code .vg { color: #008080 } /* Name.Variable.Global */
.cnblogs_code .vi { color: #008080 } /* Name.Variable.Instance */
.cnblogs_code .il { color: #009999 } /* Literal.Number.Integer.Long */

With that, code blocks get their styling.
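
One caveat with a plain string replace is that it also rewrites the word "highlight" wherever it appears in the article text. The sketch below compares it with a replacement that only touches the wrapper element. It assumes the wrapper is the `<div class="highlight">` that Pygments emits by default; retarget_code_blocks is a hypothetical helper and the sample HTML is made up.

```python
def retarget_code_blocks(html_text, css_class="cnblogs_code"):
    """Rename only the Pygments wrapper div, leaving ordinary prose untouched."""
    return html_text.replace('<div class="highlight">',
                             '<div class="%s">' % css_class)


# Made-up sample: one sentence that mentions "highlight" plus one code block
sample = ('<p>How to highlight code</p>\n'
          '<div class="highlight"><pre>print(1)</pre></div>\n')

naive = sample.replace('highlight', 'cnblogs_code ')  # same call as in md2html
targeted = retarget_code_blocks(sample)

print(naive)
print(targeted)
```

Pygments' HtmlFormatter also accepts a cssclass option, so passing cssclass='cnblogs_code' when constructing the formatter may remove the need for any replacement; that variant has not been checked against the script above.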

Using the MetaWeblog API

By this point you have presumably read up on the MetaWeblog API, so it is not introduced in detail here.

First, we need to fetch some information about our own blog.

import xmlrpc.client as xmlrpclib

# Replace UserName with your own blog name and fill in your credentials
serviceUrl, appkey = 'http://rpc.cnblogs.com/metaweblog/UserName', 'xxxx'
usr, passwd = 'xxxxx', 'xxxx'

server = xmlrpclib.ServerProxy(serviceUrl)
blogInfo = server.blogger.getUsersBlogs(appkey, usr, passwd)

print(blogInfo)

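This returns some information about your blog; the useful part is blogid. (Replace UserName above with your own name, and make sure the username and password are correct.)
blogger.getUsersBlogs is one of the methods exposed through the MetaWeblog interface.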
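
Pulling blogid out of the result might look like the sketch below. It assumes the response is a list of structs with blogid, url and blogName fields, as the Blogger API describes; the sample values are invented and pick_blogid is a hypothetical helper.

```python
def pick_blogid(blog_info, index=0):
    """Return the blogid of one entry in a getUsersBlogs-style result."""
    if not blog_info:
        raise ValueError("the server returned no blogs for this account")
    return str(blog_info[index]["blogid"])


# Invented sample; xmlrpc.client decodes an array of structs into a list of dicts
sample_info = [{
    "blogid": "123456",
    "url": "http://www.cnblogs.com/UserName/",
    "blogName": "UserName",
}]

blogid = pick_blogid(sample_info)
print(blogid)
```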

Next, upload the previously generated HTML files in bulk.
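
Before the full script, it may help to see what a metaWeblog.newPost call sends. The sketch below builds the XML-RPC request body locally with xmlrpc.client.dumps and makes no network call. The credentials are placeholders, and the dateCreated field name and datetime value are assumptions based on the MetaWeblog specification rather than something verified against cnblogs.

```python
import xmlrpc.client
from datetime import datetime

# Post struct; the keys follow the MetaWeblog spec as far as I know (assumption)
post = {
    "title": "hello",
    "description": "<p>converted HTML goes here</p>",
    "dateCreated": datetime(2017, 4, 5, 18, 3, 0),
    "categories": ["[随笔分类]python"],
}

# Placeholder blogid, username and password; True means "publish immediately"
params = ("blogid-placeholder", "user-placeholder", "password-placeholder",
          post, True)

# Serialize the call exactly as ServerProxy would, without sending it
payload = xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")
print(payload)
```

Printing the payload is a cheap way to check field names and types before publishing anything for real.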

import xmlrpc.client as xmlrpclib
import codecs
import os
import sys
import threading
from datetime import datetime
from time import strftime

serviceUrl, appkey = 'http://rpc.cnblogs.com/metaweblog/WeyneChen', 'xxxx'
blogid = 'xxxx'
usr, passwd = 'xxxx', 'xxxx'


def postfile(filepath):
    """Publish one HTML file as a new post, titled after the file name."""
    with codecs.open(filepath, 'r', encoding='utf-8') as f:
        des = f.read()

    filename = os.path.basename(filepath)[:-5]  # strip the trailing ".html"
    cate_list = ["[随笔分类]python"]  # "[随笔分类]" means "essay category"
    # The MetaWeblog struct names this field dateCreated;
    # xmlrpc.client sends a datetime as dateTime.iso8601
    post = dict(description=des, title=filename, dateCreated=datetime.now(),
                categories=cate_list)
    server = xmlrpclib.ServerProxy(serviceUrl)
    return server.metaWeblog.newPost(blogid, usr, passwd, post, True)

def getAllFile(path, suffix='.'):
    """Return all files under path (searched recursively) whose names end with suffix."""
    fpath = []

    for root, dirs, fnames in os.walk(path):
        for name in fnames:
            if name.endswith(suffix):
                fpath.append(os.path.join(root, name))

    return fpath


def transferAll(path):
    flist = getAllFile(path, ".html")

    def post():
        if len(flist) != 0:
            fname = flist.pop()
            postfile(fname)
            print("post %s" % fname)
            print(strftime("%Y%m%dT%H:%M:%S"))
            t = threading.Timer(60, post)
            t.start()
        else:
            exit()

    t = threading.Timer(3, post)
    t.start()


if __name__ == "__main__":
    path = ''
    if len(sys.argv) == 1:
        path = os.getcwd()

    elif len(sys.argv) == 2:
        path = sys.argv[1]
    else:
        print("error: expected at most one argument, the directory to scan")
        sys.exit(1)

    transferAll(path)

One thing to note: cnblogs does not allow posts to be published in rapid succession, so threading is used to publish one post every 60 seconds.
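
The same throttling can be sketched without timers, as a plain loop that sleeps between posts. This is an alternative under stated assumptions, not the method used in the script: post_all is a hypothetical helper, and the sleep function is injectable so the pacing can be checked in a dry run without waiting or uploading.

```python
import time


def post_all(paths, post_one, interval=60, sleep=time.sleep):
    """Call post_one(path) for each path, pausing interval seconds between calls."""
    posted = []
    for i, path in enumerate(paths):
        if i > 0:
            sleep(interval)  # wait between posts, not before the first one
        post_one(path)
        posted.append(path)
    return posted


# Dry run: record the calls instead of uploading or actually sleeping
calls, naps = [], []
post_all(["a.html", "b.html", "c.html"], calls.append,
         interval=60, sleep=naps.append)
print(calls)
print(naps)
```

In real use, post_one would be the postfile function and the default time.sleep would provide the 60-second gap.
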
This is a rough version; please point out any problems.

posted @ 2017-04-05 18:03  Weyne