Batch Compression of HTML, CSS, and JS

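When I built websites, I always deliberately kept the code neatly formatted and added plenty of comments to make it easy to read. At release time, though, I usually wanted the code compressed as much as possible, with comments, line breaks, and whitespace stripped out to cut unnecessary transfer overhead. I knew that compression tools existed, but I never found a convenient way to run them in batch, so I decided to write one myself.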

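The script is written in Python and uses htmlcompressor; for details, see the htmlcompressor home page on Google Code. htmlcompressor-1.5.3.jar and yuicompressor-2.4.7.jar must be in the same directory as BatchCompressor.py (the exact versions may differ, as long as the jar names in the script are changed to match). When BatchCompressor.py finishes, it leaves a log file named log.txt. The code of BatchCompressor.py is as follows: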

# -*- coding: utf-8 -*-
'''
Created on 2013-01-08 19:43:11
Updated on 2013-01-12 16:49:56

@author: Neil
'''

import datetime
import os
import sys


class BatchCompressor:
    '''
    line_break: column after which yuicompressor wraps its output; None disables wrapping
    all_files: if True, process input_dir and all of its subdirectories; if False, only the files directly in input_dir
    s_html, s_js, s_css: suffix inserted before the extension of each compressed file name
    '''
    def __init__(self, input_dir, output_dir, line_break, all_files=True, s_html='.min', s_js='.min', s_css='.min'):
        if input_dir == output_dir:
            print('warning: output_dir is the same as input_dir')
            guess = input('Enter OK to continue: ')
            if guess != 'OK':
                sys.exit(0)

        self.input_dir  = input_dir
        self.output_dir = output_dir
        self.s_html     = s_html
        self.s_js       = s_js
        self.s_css      = s_css
        self.line_break = line_break
        self.all_files  = all_files
        # Both command strings must end with ' -o ' because run() appends the output path directly
        self.cmdHtml    = 'java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -o '
        self.cmdJSCSS   = 'java -jar yuicompressor-2.4.7.jar --charset utf-8 --preserve-semi'
        if line_break is not None:
            self.cmdJSCSS += ' --line-break ' + str(line_break)
        self.cmdJSCSS += ' -o '

    # Return the subdirectories and files directly under path
    def getSubDirsFiles(self, path):
        dirList  = []
        fileList = []
        files = os.listdir(path)
        for f in files:
            if os.path.isdir(path + '/' + f):
                if f[0] != '.':
                    dirList.append(f)  # keep only non-hidden directories
            if os.path.isfile(path + '/' + f):
                fileList.append(f)  # keep every file

        return dirList, fileList

    def run(self):
        print('BatchCompressor is running, please wait...')
        inDirList  = [self.input_dir]
        outDirList = [self.output_dir]
        s_file_num = 0  # number of files compressed successfully
        f_file_num = 0  # number of files that failed to compress
        file_log = open('log.txt', 'w')
        file_log.write('Created on ' + datetime.datetime.now().strftime("%Y-%m-%d %X") + '\n')
        file_log.write('-------------------------------------\n')
        file_log.write('Input dir: ' + self.input_dir + '\n')
        file_log.write('Output dir: ' + self.output_dir + '\n')
        file_log.write('-------------------------------------\nFailed files\n\n')
        while len(inDirList) > 0:
            in_d = inDirList.pop(-1)
            curDirList, curFileList = self.getSubDirsFiles(in_d)
            out_d = outDirList.pop(-1)  # take the matching output directory off the stack
            if not os.path.exists(out_d):
                os.makedirs(out_d)  # create the output directory if it does not exist
            for d in curDirList:
                inDirList.append(in_d + '/' + d)
                outDirList.append(out_d + '/' + d)

            for f in curFileList:
                if f[-5:].lower() == '.html':
                    new_f = f[:-5] + self.s_html + '.html'
                    cmdStr = self.cmdHtml
                elif f[-3:].lower() == '.js':
                    new_f = f[:-3] + self.s_js + '.js'
                    cmdStr = self.cmdJSCSS
                elif f[-4:].lower() == '.css':
                    new_f = f[:-4] + self.s_css + '.css'
                    cmdStr = self.cmdJSCSS
                else:
                    continue
                out_f = out_d + '/' + new_f
                # Quote both paths so that names containing spaces survive the shell
                os.system(cmdStr + '"' + out_f + '" "' + in_d + '/' + f + '"')
                if os.path.isfile(out_f):  # treat the file as compressed if the output exists
                    s_file_num += 1
                else:
                    f_file_num += 1
                    file_log.write(in_d + '/' + f + '\n')

            if not self.all_files:
                break
        file_log.write('-------------------------------------\n')
        file_log.write(str(f_file_num) + ' files failed.' + '\n')
        file_log.write(str(s_file_num) + ' files were compressed.' + '\n')
        file_log.close()
        print('Done.')


if __name__ == '__main__':
    comp = BatchCompressor('d:/static', 'd:/static.min', 3000, True, '.min', '.min', '.min')
    comp.run()
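
The directory traversal in run() can also be written with os.walk. What follows is a minimal sketch rather than a replacement for the script: plan_jobs and its recurse parameter are hypothetical names introduced only for illustration, and the function merely computes (source, destination) pairs without calling any compressor. Unlike the script, it keeps the original case of the file extension.

```python
import os

# Extensions handled by BatchCompressor.py.
EXTENSIONS = ('.html', '.js', '.css')


def plan_jobs(input_dir, output_dir, suffix='.min', recurse=True):
    """Return a sorted list of (source, destination) path pairs.

    Hypothetical helper: mirrors the traversal of BatchCompressor.run()
    but does not invoke any compressor.
    """
    jobs = []
    for root, dirs, files in os.walk(input_dir):
        # Prune hidden directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        rel = os.path.relpath(root, input_dir)
        out_root = output_dir if rel == '.' else os.path.join(output_dir, rel)
        for name in files:
            stem, ext = os.path.splitext(name)
            if ext.lower() in EXTENSIONS:
                jobs.append((os.path.join(root, name),
                             os.path.join(out_root, stem + suffix + ext)))
        if not recurse:
            break  # top-level directory only, like all_files=False
    return sorted(jobs)
```

Printing the result of plan_jobs('d:/static', 'd:/static.min') before a real run is a cheap way to check which files would be touched.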

 

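When compressing HTML files, htmlcompressor can also preserve custom blocks, such as {% ... %} and {{ ... }} in Django template files. This is done with custom regular expressions: write a file containing one regular expression per line. For example, the file regexp.txt that I use to preserve Django's {% ... %} and {{ ... }} looks like this: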

\{%(.*)%\}
\{\{(.*)\}\}
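
To see what these two patterns capture, they can be tried out with Python's re module. This is only a rough check: it assumes that Python and the Java regex engine used by htmlcompressor treat these particular patterns the same way, and the sample template line is made up. Because .* is greedy, two tags on one line are captured as a single block together with everything between them, so that markup is presumably preserved as-is rather than compressed. The non-greedy variants with .*? are a suggested alternative for comparison, not part of the original regexp.txt.

```python
import re

# The two patterns from regexp.txt (greedy) and non-greedy variants for comparison.
GREEDY = [r'\{%(.*)%\}', r'\{\{(.*)\}\}']
NON_GREEDY = [r'\{%(.*?)%\}', r'\{\{(.*?)\}\}']


def preserved_blocks(patterns, text):
    """Return every substring of text matched by any of the patterns."""
    blocks = []
    for pattern in patterns:
        blocks.extend(m.group(0) for m in re.finditer(pattern, text))
    return blocks


# Made-up Django template line with two tags and one variable.
line = '{% if user %}<b>{{ user.name }}</b>{% endif %}'

# Greedy: the first pattern swallows the whole line as one block.
print(preserved_blocks(GREEDY, line))
# Non-greedy: each tag and variable is matched on its own.
print(preserved_blocks(NON_GREEDY, line))
```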

When running htmlcompressor, add the option -p regexp.txt, for example:

java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -p regexp.txt -o to/output/path/test.min.html to/input/path/test.html

To use this from the batch script, change self.cmdHtml in BatchCompressor.py as shown here, and make sure regexp.txt is in the same directory:

self.cmdHtml    = 'java -jar htmlcompressor-1.5.3.jar --type html --remove-quotes --compress-js --compress-css -p regexp.txt -o '
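
Instead of concatenating a command string by hand, the same command line can be assembled as an argument list, which sidesteps quoting problems with paths that contain spaces. This is a minimal sketch that assumes the jar name and flags shown above; build_html_command, compress_html, and the preserve_file parameter are hypothetical names, not part of BatchCompressor.py, and treating exit code 0 as success is an assumption about how htmlcompressor reports errors.

```python
import subprocess


def build_html_command(in_path, out_path, preserve_file=None,
                       jar='htmlcompressor-1.5.3.jar'):
    """Assemble the htmlcompressor command line as an argument list."""
    cmd = ['java', '-jar', jar, '--type', 'html', '--remove-quotes',
           '--compress-js', '--compress-css']
    if preserve_file is not None:
        cmd += ['-p', preserve_file]  # file with one regular expression per line
    cmd += ['-o', out_path, in_path]
    return cmd


def compress_html(in_path, out_path, preserve_file=None):
    """Run htmlcompressor; needs java and the jar in the working directory.

    Assumption: a zero exit code means the file was compressed.
    """
    result = subprocess.run(build_html_command(in_path, out_path, preserve_file))
    return result.returncode == 0
```

Because no shell is involved, the paths need no quoting or escaping.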

 

Note: htmlcompressor has some known issues:

Known Issues
1. When <script> tag contains some custom preserved block (for example <?php>), enabling inline javascript compression will fail. Such <script> tags could be skipped by wrapping them with <!-- {{{ -->...<!-- }}} --> comments (skip blocks).

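If the code between <script> and </script> contains a custom preserved block, you can wrap the element as <!-- {{{ --><script>...</script><!-- }}} --> so that the compressor skips the code inside this <script> tag. This avoids the error, at the cost of leaving that JavaScript uncompressed. Note that an HTML comment is written as <!-- comment text -->.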

2. Removing intertag spaces might break text formatting, for example spaces between words surrounded with <b> will be removed. Such spaces might be preserved by replacing them with &#20; or &nbsp;.

 

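In my tests with Django template files, a <script> tag must not contain {% ... %}: <script>{% ... %}</script> causes an error, whereas <script>{{ ... }}</script> works. So if a <script> tag contains {% ... %}, consider either changing the code or wrapping the element as <!-- {{{ --><script>...</script><!-- }}} -->.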
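
This wrapping could be automated as a preprocessing step before compression. The following is a rough sketch under simplifying assumptions: it uses a regular expression rather than an HTML parser, so it assumes that <script> elements do not appear inside comments or strings, and it does not check whether an element is already wrapped. wrap_template_scripts is a hypothetical helper name.

```python
import re

# Matches one <script ...>...</script> element, across lines, case-insensitively.
SCRIPT_RE = re.compile(r'<script\b[^>]*>.*?</script>', re.DOTALL | re.IGNORECASE)


def wrap_template_scripts(html):
    """Wrap <script> elements containing {% ... %} in htmlcompressor skip blocks."""
    def wrap(match):
        script = match.group(0)
        if '{%' in script:
            # Skip block: htmlcompressor leaves the enclosed markup untouched.
            return '<!-- {{{ -->' + script + '<!-- }}} -->'
        return script  # scripts with only {{ ... }} are left for compression
    return SCRIPT_RE.sub(wrap, html)
```

Scripts that contain only {{ ... }} are deliberately left alone, since those compressed without errors in the tests described above.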

posted @ 2013-01-08 22:21  NaN-Hax