Batch Compression of HTML, CSS, and JS
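When I built websites, I always kept the code deliberately tidy and added plenty of comments to make it easy to read. At release time, though, I usually wanted the code compressed as much as possible, with comments, line breaks, and extra whitespace removed, to cut unnecessary transfer overhead. I knew that compression tools existed, but I never found a convenient way to run them in batch, so I decided to write one myself.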
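The script is written in Python and uses htmlcompressor; for details on htmlcompressor, see its Google Code home page. htmlcompressor-1.5.3.jar and yuicompressor-2.4.7.jar must be in the same directory as BatchCompressor.py (the versions do not have to match these exactly, but if they differ, the jar names in the script must be changed to match). After BatchCompressor.py finishes, it produces a log file named log.txt. The code of BatchCompressor.py is as follows: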
# -*- coding: utf-8 -*-
'''
Created on 2013-01-08 19:43:11
Updated on 2013-01-12 16:49:56

@author: Neil
'''

import os
import sys
import datetime


class BatchCompressor:
    '''
    line_break: column after which the compressed output is wrapped; pass None to disable wrapping
    all_files: whether to process all files under input_dir, including subdirectories
    s_html, s_js, s_css: suffixes added to the names of the compressed files
    '''
    def __init__(self, input_dir, output_dir, line_break, all_files=True, s_html='.min', s_js='.min', s_css='.min'):
        if input_dir == output_dir:
            print('warning: output_dir is the same as input_dir')
            guess = input('Enter OK to continue: ')
            if guess != 'OK':
                sys.exit(0)

        self.input_dir = input_dir
        self.output_dir = output_dir
        self.s_html = s_html
        self.s_js = s_js
        self.s_css = s_css
        self.line_break = line_break
        self.all_files = all_files
        # Both command strings end with ' -o ' so the output path can be appended directly.
        self.cmdHtml = 'java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -o '
        self.cmdJSCSS = 'java -jar yuicompressor-2.4.7.jar --charset utf-8 --preserve-semi'
        if line_break is not None:
            self.cmdJSCSS += ' --line-break ' + str(line_break)
        self.cmdJSCSS += ' -o '

    # Return the subdirectories and files directly under path
    def getSubDirsFiles(self, path):
        dirList = []
        fileList = []
        files = os.listdir(path)
        for f in files:
            if os.path.isdir(path + '/' + f):
                if f[0] != '.':
                    dirList.append(f)  # keep non-hidden directories only
            if os.path.isfile(path + '/' + f):
                fileList.append(f)  # keep every file
        return dirList, fileList

    def run(self):
        print('BatchCompressor is running, please wait...')
        inDirList = [self.input_dir]
        outDirList = [self.output_dir]
        s_file_num = 0  # number of files compressed successfully
        f_file_num = 0  # number of files that failed
        file_log = open('log.txt', 'w', encoding='utf-8')
        file_log.write('Created on ' + datetime.datetime.now().strftime("%Y-%m-%d %X") + '\n')
        file_log.write('-------------------------------------\n')
        file_log.write('Input dir: ' + self.input_dir + '\n')
        file_log.write('Output dir: ' + self.output_dir + '\n')
        file_log.write('-------------------------------------\nFailed files\n\n')
        while len(inDirList) > 0:
            in_d = inDirList.pop(-1)
            curDirList, curFileList = self.getSubDirsFiles(in_d)
            out_d = outDirList.pop(-1)  # take the last directory off the list
            if not os.path.exists(out_d):
                os.mkdir(out_d)  # create the output directory if it does not exist
            for d in curDirList:
                inDirList.append(in_d + '/' + d)
                outDirList.append(out_d + '/' + d)

            for f in curFileList:
                if f[-5:].lower() == '.html':
                    new_f = f[:-5] + self.s_html + '.html'
                    cmdStr = self.cmdHtml
                elif f[-3:].lower() == '.js':
                    new_f = f[:-3] + self.s_js + '.js'
                    cmdStr = self.cmdJSCSS
                elif f[-4:].lower() == '.css':
                    new_f = f[:-4] + self.s_css + '.css'
                    cmdStr = self.cmdJSCSS
                else:
                    continue
                out_f = out_d + '/' + new_f
                os.system(cmdStr + out_f + ' ' + in_d + '/' + f)
                if os.path.isfile(out_f):  # the file counts as compressed if the output exists
                    s_file_num += 1
                else:
                    f_file_num += 1
                    file_log.write(in_d + '/' + f + '\n')

            if not self.all_files:
                break  # only the top-level directory is processed
        file_log.write('-------------------------------------\n')
        file_log.write(str(f_file_num) + ' files failed.' + '\n')
        file_log.write(str(s_file_num) + ' files were compressed.' + '\n')
        file_log.close()
        print('Done.')


if __name__ == '__main__':
    comp = BatchCompressor('d:/static', 'd:/static.min', 3000, True, '.min', '.min', '.min')
    comp.run()
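The script builds each command as a single string for os.system, so a path that contains spaces would be split into several arguments. A minimal sketch of one alternative, assuming the same jar names and flags as in BatchCompressor.py: build the command as an argument list and run it with subprocess.run. The helper names build_command and compress_file are hypothetical and are not part of the original script, and treating a non-zero exit code as failure is an assumption about how the two tools behave.

```python
import os
import subprocess

# Flags copied from BatchCompressor.py; the jars are assumed to be in the working directory.
HTML_ARGS = ['java', '-jar', 'htmlcompressor-1.5.3.jar', '--type', 'html',
             '--remove-quotes', '--compress-js', '--compress-css']
JSCSS_ARGS = ['java', '-jar', 'yuicompressor-2.4.7.jar', '--charset', 'utf-8',
              '--preserve-semi']


def build_command(in_path, out_path, line_break=None):
    """Hypothetical helper: argument list for one file, or None for unhandled types."""
    lower = in_path.lower()
    if lower.endswith('.html'):
        args = list(HTML_ARGS)
    elif lower.endswith(('.js', '.css')):
        args = list(JSCSS_ARGS)
        if line_break is not None:
            args += ['--line-break', str(line_break)]
    else:
        return None
    # Each path is its own list element, so spaces in paths need no quoting.
    return args + ['-o', out_path, in_path]


def compress_file(in_path, out_path, line_break=None):
    """Hypothetical helper: run the compressor and report whether it appears to have succeeded."""
    args = build_command(in_path, out_path, line_break)
    if args is None:
        return False
    result = subprocess.run(args)
    # Exit-code check is an assumption; the output-file check mirrors the original script.
    return result.returncode == 0 and os.path.isfile(out_path)
```

Inside run(), the os.system call and the following isfile check could then be replaced by a single call such as compress_file(in_d + '/' + f, out_f, self.line_break).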
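When compressing HTML files, htmlcompressor can also preserve custom blocks, such as {% ... %} and {{ ... }} in Django template files. This is done with custom regular expressions: write a file that contains one regular expression per line. For example, the file regexp.txt that I use to preserve Django's {% ... %} and {{ ... }} is as follows.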
\{%(.*)%\}
\{\{(.*)\}\}
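One thing worth checking in patterns like these is greediness. The sketch below uses Python's re module purely as an illustration; htmlcompressor uses Java regular expressions, where greedy and reluctant quantifiers follow the same rules, but how htmlcompressor compiles and applies the patterns is not verified here. The non-greedy variant with .*? is a suggested alternative and is not part of the original regexp.txt.

```python
import re

# A single template line with two Django tags and ordinary markup between them.
line = '<p>{% if user %}  <b>Hi</b>  {% endif %}</p>'

greedy = re.compile(r'\{%(.*)%\}')   # pattern from regexp.txt
lazy = re.compile(r'\{%(.*?)%\}')    # suggested non-greedy alternative

greedy_matches = [m.group(0) for m in greedy.finditer(line)]
lazy_matches = [m.group(0) for m in lazy.finditer(line)]

print(greedy_matches)  # one match running from the first {% to the last %}
print(lazy_matches)    # one match per tag
```

With the greedy pattern, the markup between two tags on the same line falls inside a single match, so it would presumably be preserved as-is rather than compressed. If that matters for your templates, the non-greedy form may be worth trying.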
Add the parameter -p regexp.txt when running htmlcompressor, for example:
java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -p regexp.txt -o to/output/path/test.min.html to/input/path/test.html
You can modify self.cmdHtml in BatchCompressor.py accordingly; regexp.txt must be in the same directory.
self.cmdHtml = 'java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -p regexp.txt -o '
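Because the jar and regexp.txt are given as relative paths, they are looked up relative to the directory the script is started from. A small sketch, assuming you would rather resolve both against one fixed tool directory; html_command and tool_dir are hypothetical names, not part of the original script.

```python
import os


def html_command(tool_dir, preserve_file='regexp.txt'):
    """Hypothetical helper: build the cmdHtml string with quoted paths under tool_dir."""
    jar = os.path.join(tool_dir, 'htmlcompressor-1.5.3.jar')
    cmd = ('java -jar "%s" --type html --remove-quotes '
           '--compress-js --compress-css' % jar)
    if preserve_file:
        # -p points htmlcompressor at the file of custom preservation patterns.
        cmd += ' -p "%s"' % os.path.join(tool_dir, preserve_file)
    # Trailing ' -o ' so the output path can be appended, as in BatchCompressor.py.
    return cmd + ' -o '
```

In the constructor this could be used as self.cmdHtml = html_command(os.path.dirname(os.path.abspath(__file__))), which lets the script be started from any directory.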
Note: htmlcompressor has some known issues:
Known Issues
1. When <script> tag contains some custom preserved block (for example <?php>), enabling inline javascript compression will fail. Such <script> tags could be skipped by wrapping them with <!-- {{{ -->...<!-- }}} --> comments (skip blocks).
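If the code between <script>...</script> contains a custom preserved block, you can wrap it as <!-- {{{ --><script>...</script><!-- }}} --> so that the code in this <script> tag is skipped during compression. This avoids the error, at the cost of leaving that JavaScript uncompressed. Note that an HTML comment is written as <!-- comment content -->.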
2. Removing intertag spaces might break text formatting, for example spaces between words surrounded with <b> will be removed. Such spaces might be preserved by replacing them with &nbsp; or &#160;.
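In my tests with Django template files, a <script> tag must not contain {% ... %}: <script>{% ... %}</script> causes an error, whereas <script>{{ ... }}</script> works. So if a <script> tag contains {% ... %}, consider changing the code or wrapping it as <!-- {{{ --><script>...</script><!-- }}} -->.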
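Adding the skip-block comments by hand gets tedious with many templates. The following is a rough sketch of how it might be automated, assuming a regular expression is good enough to find <script> blocks in your templates (it is not a real HTML parser). The function name wrap_template_scripts is hypothetical; it should be tried on a copy of the templates first.

```python
import re

# Matches a whole <script ...>...</script> element, across lines, case-insensitively.
SCRIPT_RE = re.compile(r'<script\b[^>]*>.*?</script>', re.IGNORECASE | re.DOTALL)
SKIP_OPEN = '<!-- {{{ -->'
SKIP_CLOSE = '<!-- }}} -->'


def wrap_template_scripts(html):
    """Hypothetical helper: wrap <script> blocks that contain {% ... %} in skip blocks."""
    def wrap(match):
        block = match.group(0)
        if '{%' not in block:
            return block  # plain scripts and {{ ... }} scripts stay compressible
        if html[:match.start()].rstrip().endswith(SKIP_OPEN):
            return block  # already wrapped, leave it alone
        return SKIP_OPEN + block + SKIP_CLOSE
    return SCRIPT_RE.sub(wrap, html)
```

The wrapped output can then be passed to htmlcompressor as usual; only the scripts that contain {% ... %} lose their JavaScript compression.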