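Some things I learned from merging files
The original goal of this small feature was to have the server merge and compress CSS and JS files and return the generated file to the client, thereby reducing the number of HTTP requests.
Pages usually reference CSS and JS like this:
<link href="style.css" rel="stylesheet" />
<script src="script.js"></script>
This sends two requests to the server. The more files a page references, the more requests it makes, and the more performance suffers.
Now look at another approach: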
<link href="CombineHandler.ashx?/static/css/s1.css,/static/css/s2.css" rel="stylesheet" />
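This approach separates the files to request with commas and sends them as a query-string parameter to a generic handler (.ashx) on the server, which processes them and returns the result to the client. Clearly this reduces the number of HTTP requests.
This approach was common in the early days, but it also puts extra load on the server, which has to process the requested files. That is why this is now usually done with automated build tools such as Grunt or Webpack.
First, here is what CombineHandler.ashx does: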
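public void ProcessRequest(HttpContext context)
{
    _context = context; // keep a reference so that Combine can use it

    // Get the query string, i.e. "/static...,/static..."
    string query = HttpUtility.UrlDecode(context.Request.QueryString.ToString());

    // Split the individual paths into an array
    string[] Files = query.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

    // Merge the files and get back the path of the merged file
    string outPath = Combine(Files);

    // Write the merged file (located by its physical path) to the response
    context.Response.WriteFile(outPath);
}
Next, the implementation details of the Combine method: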
/// <summary>
/// Merges files
/// </summary>
/// <param name="Files">Array holding the path of each file</param>
/// <returns>Path of the merged file</returns>
private string Combine(string[] Files)
{
    string curPath = _context.Server.MapPath("."); // physical path of the current directory
    int len = Files.Length;
    List<FileInfo> fileinfo = new List<FileInfo>();
    string filePath = string.Empty;    // path of the file being merged
    string contentType = string.Empty; // ContentType to return

    for (int i = 0; i < len; i++)
    {
        filePath = Path.GetFullPath(curPath + Files[i]);
        fileinfo.Add(new FileInfo()
        {
            FilePath = filePath,                                  // full path
            PostFix = Path.GetExtension(filePath),                // extension
            FileName = Path.GetFileNameWithoutExtension(filePath) // file name without extension
        });
    }

    // The extension of the first file decides the ContentType
    switch (fileinfo[0].PostFix)
    {
        case ".css":
            contentType = ContentType.css;
            break;
        case ".js":
            contentType = ContentType.js;
            break;
        default:
            contentType = ContentType.text;
            break;
    }
    _context.Response.ContentType = contentType;

    string res = string.Empty;
    string outFileName = string.Empty;
    foreach (FileInfo p in fileinfo)
    {
        res += File.ReadAllText(p.FilePath);
        outFileName += p.FileName + "-";
    }
    // Name of the merged file
    outFileName = outFileName.Substring(0, outFileName.Length - 1) + fileinfo[0].PostFix;

    string txt = string.Empty;
    if (contentType == ContentType.js)
    {
        // Sample string: ' sd " d g " s \' d \'" "d \' \' " " '
        // Finds quotes and the content between them
        Regex reDoubleQuote = new Regex("([\"']).*?\\1", RegexOptions.IgnoreCase);

        // Approach: first replace the escaped quotes \" and \' inside strings with the placeholders {0} and {1},
        // then use the regex to find every quote pair with its content and store the matches in a list,
        // strip whitespace from the whole file,
        // replace each matched quote pair and its content with the original item from the list,
        // replace the placeholders with the escape sequences,
        // and fix the space after certain keywords
        res = Regex.Replace(res, "\\\\'", "{0}");
        res = Regex.Replace(res, "(\\\\\")", "{1}");

        ArrayList list = new ArrayList();
        foreach (Match m in reDoubleQuote.Matches(res))
        {
            list.Add(m); // store each quoted string, quotes included
        }

        // Strip whitespace from the whole file
        res = Regex.Replace(res, @"\s", "");

        int i = 0;
        foreach (Match m in reDoubleQuote.Matches(res))
        {
            // Replace the whitespace-stripped quoted string with the original one
            res = res.Replace(m.ToString(), list[i++].ToString());
        }

        // Collect the pieces of code that lie outside the quoted strings
        var gpList = Regex.Matches(res, "(.*?)(?:([\"']).*?\\2|$)").OfType<Match>()
            .Select(t => t.Groups[1].Value)
            .Where(T => T != "").ToList();

        // Handle certain keywords. A plain replace would also touch keywords inside strings; solved with a balancing group
        //string str = "'aa'KKK'bb'HHH'HHHH";
        //var j = Regex.Matches(str, "(.*?)(?:'[^']+'|$)").OfType<Match>()
        //.Select(t => t.Groups[1].Value).Where(T => T != "").ToList();
        i = 0;
        res = string.Empty;
        foreach (var item in gpList)
        {
            // Apply the keyword handling to this piece only
            string tempStr = HandleKeyWords(item.ToString(), KeyWords);
            res += tempStr + (i < list.Count ? list[i] : "");
            i++;
        }

        // Turn the placeholders back into the original escape sequences
        res = Regex.Replace(res, @"\{0\}", "\\'");
        res = Regex.Replace(res, @"\{1\}", "\\\"");

        //txt = res;
        //res = Regex.Replace(res,@"function\s*","function ");
        //res = Regex.Replace(res, @"var\s*", "var ");
        //res = Regex.Replace(res, @"new\s*", "new ");
        //res = Regex.Replace(res, @"typeof\s*", "typeof ");
        //res = Regex.Replace(res, @"delete\s*", "delete ");
        //res = Regex.Replace(res, @"throw\s*", "throw ");
        //res = Regex.Replace(res, @"in\s*", "in ");
        //res = Regex.Replace(res, @"instanceof\s*", "instanceof ");
    }
    else
    {
        // Strip whitespace
        res = Regex.Replace(res, @"\s", "");
    }

    // Output path: the merged file goes next to the last source file.
    // (filePath + outFileName would glue the merged name onto the last file's name.)
    string outPath = Path.Combine(Path.GetDirectoryName(filePath), outFileName);
    using (StreamWriter sw = new StreamWriter(outPath, false, System.Text.Encoding.UTF8))
    {
        sw.Write(res);
    }
    return outPath;
}
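This is the overall approach for JS files:
- Replace the escaped quotes \' and \" inside strings with the placeholders {0} and {1}.
- Use a regex to find every quoted string and store the matches in a list.
- Strip whitespace from the whole file.
- Replace each whitespace-stripped quoted string with its original from the list.
- Put the space back after certain keywords, in the code outside the strings only.
- Replace the placeholders with the original escape sequences.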
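The string-protection part of this approach can be tried in isolation. The following is a minimal sketch, not the handler itself: the class and method names are made up for illustration, it skips the keyword step, and it puts the originals back with Regex.Replace and a MatchEvaluator instead of the string Replace used in the handler. It assumes the source does not already contain the text {0} or {1} and that no string literal spans several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuoteProtectDemo
{
    // An opening quote, lazily anything, then the same kind of quote again.
    private static readonly Regex Quoted = new Regex("([\"']).*?\\1");

    // Hypothetical helper: strips whitespace everywhere except inside string literals.
    public static string StripOutsideStrings(string source)
    {
        // 1. Hide escaped quotes so they cannot end a string early.
        string text = source.Replace("\\'", "{0}").Replace("\\\"", "{1}");

        // 2. Remember every quoted string in order of appearance.
        var originals = new List<string>();
        foreach (Match m in Quoted.Matches(text))
        {
            originals.Add(m.Value);
        }

        // 3. Strip all whitespace, including the whitespace inside the strings.
        text = Regex.Replace(text, @"\s", "");

        // 4. Put the original quoted strings back, one by one.
        int i = 0;
        text = Quoted.Replace(text, m => originals[i++]);

        // 5. Restore the escaped quotes.
        return text.Replace("{0}", "\\'").Replace("{1}", "\\\"");
    }

    public static void Main()
    {
        // prints: alert(' a \' b ');
        Console.WriteLine(StripOutsideStrings("alert( ' a \\' b ' ) ;"));
    }
}
```

The MatchEvaluator replaces each match at its own position, so two identical stripped strings cannot be confused with each other.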
Finally, some related members:
private HttpContext _context; // keep a reference to pass around

/// <summary>
/// Holds information about a file
/// </summary>
private class FileInfo
{
    public string FilePath { get; set; }
    public string PostFix { get; set; }
    public string FileName { get; set; }
}

/// <summary>
/// Constants
/// </summary>
static class ContentType
{
    public const string css = "text/css";
    public const string js = "application/x-javascript";
    public const string text = "text/plain";
}

/// <summary>
/// Some JS keywords
/// </summary>
private string[] KeyWords = new string[] { "function", "var", "new", "typeof", "delete", "throw", "in", "instanceof" };

// Makes sure each keyword is followed by exactly one space
private string HandleKeyWords(string sourceString, string[] keywords)
{
    for (int i = 0; i < keywords.Length; i++)
    {
        sourceString = Regex.Replace(sourceString, keywords[i] + @"\s*", keywords[i] + " ");
    }
    return sourceString;
}
A few things this project required me to learn:
Using the out and ref keywords
- The out keyword
string s; test(s);
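This code does not compile, because s is passed before it has been assigned. The out keyword handles this case:
string str;
test(out str);

void test(out string s)
{
    s = "This is a string";
    Console.WriteLine(s);
}
Often we do not know a variable's value in advance and only find out at run time what should be assigned to it (the s above could well be a dynamic string). That is when the out keyword is useful.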
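A common place where out appears in the framework is int.TryParse, which reports success through its return value and hands back the parsed number through an out parameter. The sketch below shows that call next to a small helper written in the same style; TryGetExtension is a made-up name for illustration and is not part of the handler.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical helper: reports the extension of a file name through an out parameter.
    public static bool TryGetExtension(string fileName, out string extension)
    {
        int dot = fileName.LastIndexOf('.');
        if (dot < 0)
        {
            extension = string.Empty; // an out parameter must be assigned on every path
            return false;
        }
        extension = fileName.Substring(dot);
        return true;
    }

    public static void Main()
    {
        int number; // no initial value needed before an out call
        if (int.TryParse("42", out number))
        {
            Console.WriteLine(number); // prints 42
        }

        string ext;
        if (TryGetExtension("s1.css", out ext))
        {
            Console.WriteLine(ext); // prints .css
        }
    }
}
```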
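- The ref keyword
Unlike out, a ref argument must be assigned before it is passed.
string s = string.Empty;
test(ref s);
Response.Write(s); // this is a string

void test(ref string s)
{
    s = "this is a string";
}
As you can see, although test returns nothing, s has a value after the call. This means a ref parameter can serve as an additional return value of a function:
string s = string.Empty;
string m = test("some value", ref s);
Response.Write(s); // this is a string

string test(string m, ref string s)
{
    s = "this is a string";
    return m;
}
The test method has two parameters. Returning both values at once would normally require wrapping them in something like an array; with the ref keyword one of them comes back through the parameter, so the method only has to return m.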
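To make the difference from out concrete, here is a small sketch with two made-up helpers (Swap and TrimAndCount are illustrative names, not part of the handler). Both read the caller's value before changing it, which is why the arguments have to be initialized first.

```csharp
using System;

public static class RefDemo
{
    // Swaps two variables that live in the caller's scope.
    public static void Swap(ref string a, ref string b)
    {
        string tmp = a;
        a = b;
        b = tmp;
    }

    // Hypothetical helper: returns the trimmed text and adds the number of
    // removed characters to the caller's counter.
    public static string TrimAndCount(string text, ref int removed)
    {
        string trimmed = text.Trim();
        removed += text.Length - trimmed.Length; // reads the incoming value, then updates it
        return trimmed;
    }

    public static void Main()
    {
        string first = "s1.css", second = "s2.css";
        Swap(ref first, ref second);
        Console.WriteLine(first + " " + second); // prints s2.css s1.css

        int removed = 0; // must be assigned before it is passed by ref
        string a = TrimAndCount("  a ", ref removed);
        string b = TrimAndCount(" b", ref removed);
        Console.WriteLine(a + b + " " + removed); // prints ab 4
    }
}
```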
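File I/O and file streams
- FileStream, StreamReader (not recommended)
// FileMode.Create cannot be combined with FileAccess.Read, so open the existing file for reading and writing
FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite);
fs.Read(buffer, offset, count);  // read up to count bytes into buffer (a byte[]), starting at offset
fs.ReadByte();                   // read one byte from the file and advance the read position by one byte
fs.Write(buffer, offset, count); // write count bytes from buffer, starting at offset
fs.WriteByte(value);             // write one byte at the current position of the file stream
using (StreamReader sr = new StreamReader(fs, Encoding.UTF8, true))
{
    int b;
    // sr.Read() returns one character at a time, or -1 at the end
    while ((b = sr.Read()) != -1)
    {
        // fsOut is a second FileStream opened for writing; the cast only round-trips ASCII characters
        fsOut.WriteByte((byte)b);
    }
}
As the example above shows, reading the content out piece by piece is tedious. It also had a drawback for me: I could not get it to save the file without a BOM, so the returned file started with the illegal character \ufeff, which made the file unreadable or misinterpreted (this is why the CSS I returned earlier did not work).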
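- File.ReadAllText
string s = File.ReadAllText(path, Encoding.UTF8);
This is much more convenient: the whole file comes back as one string.
using (StreamWriter sw = new StreamWriter(outPath, false, System.Text.Encoding.UTF8))
{
    sw.Write(res);
}
The file can also be saved in a specified encoding. Note that Encoding.UTF8 itself writes a BOM at the start of the file; passing new UTF8Encoding(false) instead produces BOM-free output.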
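Whether a BOM ends up in the output can be checked by looking at the bytes that were written. The sketch below writes the same three characters twice through StreamWriter, once with Encoding.UTF8 and once with new UTF8Encoding(false), and compares the file sizes. BomDemo and WriteAndReadBytes are illustrative names, and the temporary file is only there for the demonstration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes text to path with the given encoding and returns the raw bytes of the file.
    public static byte[] WriteAndReadBytes(string path, string text, Encoding encoding)
    {
        using (var sw = new StreamWriter(path, false, encoding))
        {
            sw.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Encoding.UTF8 emits the 3-byte preamble EF BB BF before the content.
            byte[] withBom = WriteAndReadBytes(path, "a{}", Encoding.UTF8);
            Console.WriteLine(withBom.Length); // prints 6

            // UTF8Encoding(false) writes the content only.
            byte[] noBom = WriteAndReadBytes(path, "a{}", new UTF8Encoding(false));
            Console.WriteLine(noBom.Length); // prints 3
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

A single BOM at the very start of a file is one thing; a BOM in the middle of merged content is what tends to cause trouble, and reading each source file with File.ReadAllText avoids that because the BOM is dropped on read.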
- Path, HttpUtility.UrlDecode, (HttpContext)context.Server.MapPath
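The Path class provides a set of methods for getting the file name, the extension, the full path and so on, so there is no need to pick the strings apart by hand.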
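As a quick illustration, the sketch below runs the Path calls used in the handler on the two sample paths from the link tag and builds the merged file name the same way Combine does. MergedName is a made-up helper for this example and assumes at least one file is passed in.

```csharp
using System;
using System.IO;

public static class PathDemo
{
    // Hypothetical helper: joins the bare file names with "-" and
    // appends the extension of the first file.
    public static string MergedName(string[] files)
    {
        var names = new string[files.Length];
        for (int i = 0; i < files.Length; i++)
        {
            names[i] = Path.GetFileNameWithoutExtension(files[i]);
        }
        return string.Join("-", names) + Path.GetExtension(files[0]);
    }

    public static void Main()
    {
        string[] files = { "/static/css/s1.css", "/static/css/s2.css" };

        Console.WriteLine(Path.GetExtension(files[0]));                // prints .css
        Console.WriteLine(Path.GetFileNameWithoutExtension(files[0])); // prints s1
        Console.WriteLine(Path.GetFileName(files[0]));                 // prints s1.css
        Console.WriteLine(MergedName(files));                          // prints s1-s2.css
    }
}
```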
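- Regex balancing groups
I still do not fully understand these. For now I only know that, combined with a LINQ query, they can be used to get the inverse of what a regex matched (that is, the parts that were not matched). When I have time to figure them out, I will write another post.
That is all for now.
Attached are the source of CombineHandler.ashx and the code used for testing.
Test code: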
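The "inverse match" trick from the handler can be tried on its own. As far as I can tell, the pattern relies on an ordinary capture group in front of each quoted string rather than on the dedicated balancing-group syntax (?<name1-name2>...), so the sketch below only covers that part. OutsideQuotes is an illustrative name, and the input is assumed to be a single line, because . does not match a newline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class OutsideQuotesDemo
{
    // Returns the pieces of source that are not inside a quoted string.
    // Group 1 lazily captures everything up to the next quoted string (or the end);
    // group 2 remembers which quote character opened the string.
    public static List<string> OutsideQuotes(string source)
    {
        return Regex.Matches(source, "(.*?)(?:([\"']).*?\\2|$)")
            .OfType<Match>()
            .Select(m => m.Groups[1].Value)
            .Where(s => s != "") // drop the empty captures between adjacent matches
            .ToList();
    }

    public static void Main()
    {
        List<string> parts = OutsideQuotes("a='x y';b=\"p q\";c");
        Console.WriteLine(string.Join(" | ", parts)); // prints a= | ;b= | ;c
    }
}
```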
;alert(1);
console.log(' "11111" sd " d g " s \' d \'" "d \' \' " "' + ' sdfsd啊啊 ');
alert(' 呵呵呵呵呵 \'呵\' 呵呵呵呵呵 " 呵 呵 " ');
alert(" 呵呵 \"呵 \" 正 则 好难 学 啊 ");
function f() {
    var t = setInterval(function () {
        var a = new t();
    }, 3000);
}
var t = function () { };
CombineHandler.ashx:
<%@ WebHandler Language="C#" Class="CombineHandler" %>

using System;
using System.Web;
using System.IO;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Collections;
using System.Linq;

public class CombineHandler : IHttpHandler
{
    private HttpContext _context; // keep a reference to pass around

    /// <summary>
    /// Holds information about a file
    /// </summary>
    private class FileInfo
    {
        public string FilePath { get; set; }
        public string PostFix { get; set; }
        public string FileName { get; set; }
    }

    /// <summary>
    /// Constants
    /// </summary>
    static class ContentType
    {
        public const string css = "text/css";
        public const string js = "application/x-javascript";
        public const string text = "text/plain";
    }

    public void ProcessRequest(HttpContext context)
    {
        _context = context; // keep the reference
        string query = HttpUtility.UrlDecode(context.Request.QueryString.ToString());
        string[] Files = query.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        string outPath = Combine(Files);
        context.Response.WriteFile(outPath);
        //context.Response.Write(outPath);
    }

    /// <summary>
    /// Merges files
    /// </summary>
    /// <param name="Files">Array holding the path of each file</param>
    /// <returns>Path of the merged file</returns>
    private string Combine(string[] Files)
    {
        string curPath = _context.Server.MapPath("."); // physical path of the current directory
        int len = Files.Length;
        List<FileInfo> fileinfo = new List<FileInfo>();
        string filePath = string.Empty;    // path of the file being merged
        string contentType = string.Empty; // ContentType to return

        for (int i = 0; i < len; i++)
        {
            filePath = Path.GetFullPath(curPath + Files[i]);
            fileinfo.Add(new FileInfo()
            {
                FilePath = filePath,                                  // full path
                PostFix = Path.GetExtension(filePath),                // extension
                FileName = Path.GetFileNameWithoutExtension(filePath) // file name without extension
            });
        }

        // The extension of the first file decides the ContentType
        switch (fileinfo[0].PostFix)
        {
            case ".css":
                contentType = ContentType.css;
                break;
            case ".js":
                contentType = ContentType.js;
                break;
            default:
                contentType = ContentType.text;
                break;
        }
        _context.Response.ContentType = contentType;

        string res = string.Empty;
        string outFileName = string.Empty;
        foreach (FileInfo p in fileinfo)
        {
            res += File.ReadAllText(p.FilePath);
            outFileName += p.FileName + "-";
        }
        // Name of the merged file
        outFileName = outFileName.Substring(0, outFileName.Length - 1) + fileinfo[0].PostFix;

        string txt = string.Empty;
        if (contentType == ContentType.js)
        {
            // Sample string: ' sd " d g " s \' d \'" "d \' \' " " '
            // Finds quotes and the content between them
            Regex reDoubleQuote = new Regex("([\"']).*?\\1", RegexOptions.IgnoreCase);

            // Approach: first replace the escaped quotes \" and \' inside strings with the placeholders {0} and {1},
            // then use the regex to find every quote pair with its content and store the matches in a list,
            // strip whitespace from the whole file,
            // replace each matched quote pair and its content with the original item from the list,
            // replace the placeholders with the escape sequences,
            // and fix the space after certain keywords
            res = Regex.Replace(res, "\\\\'", "{0}");
            res = Regex.Replace(res, "(\\\\\")", "{1}");

            ArrayList list = new ArrayList();
            foreach (Match m in reDoubleQuote.Matches(res))
            {
                list.Add(m); // store each quoted string, quotes included
            }

            // Strip whitespace from the whole file
            res = Regex.Replace(res, @"\s", "");

            int i = 0;
            foreach (Match m in reDoubleQuote.Matches(res))
            {
                // Replace the whitespace-stripped quoted string with the original one
                res = res.Replace(m.ToString(), list[i++].ToString());
            }

            // Collect the pieces of code that lie outside the quoted strings
            var gpList = Regex.Matches(res, "(.*?)(?:([\"']).*?\\2|$)").OfType<Match>()
                .Select(t => t.Groups[1].Value)
                .Where(T => T != "").ToList();

            // Handle certain keywords. A plain replace would also touch keywords inside strings; still to be solved (balancing group)
            //string str = "'aa'KKK'bb'HHH'HHHH";
            //var j = Regex.Matches(str, "(.*?)(?:'[^']+'|$)").OfType<Match>()
            //.Select(t => t.Groups[1].Value).Where(T => T != "").ToList();
            i = 0;
            res = string.Empty;
            foreach (var item in gpList)
            {
                // Apply the keyword handling to this piece only
                string tempStr = HandleKeyWords(item.ToString(), KeyWords);
                res += tempStr + (i < list.Count ? list[i] : "");
                i++;
            }

            // Turn the placeholders back into the original escape sequences
            res = Regex.Replace(res, @"\{0\}", "\\'");
            res = Regex.Replace(res, @"\{1\}", "\\\"");

            //txt = res;
            //res = Regex.Replace(res,@"function\s*","function ");
            //res = Regex.Replace(res, @"var\s*", "var ");
            //res = Regex.Replace(res, @"new\s*", "new ");
            //res = Regex.Replace(res, @"typeof\s*", "typeof ");
            //res = Regex.Replace(res, @"delete\s*", "delete ");
            //res = Regex.Replace(res, @"throw\s*", "throw ");
            //res = Regex.Replace(res, @"in\s*", "in ");
            //res = Regex.Replace(res, @"instanceof\s*", "instanceof ");
            //res = HandleKeyWords(res, KeyWords);
        }
        else
        {
            // Strip whitespace
            res = Regex.Replace(res, @"\s", "");
        }

        // Output path: the merged file goes next to the last source file.
        // (filePath + outFileName would glue the merged name onto the last file's name.)
        string outPath = Path.Combine(Path.GetDirectoryName(filePath), outFileName);
        using (StreamWriter sw = new StreamWriter(outPath, false, System.Text.Encoding.UTF8))
        {
            sw.Write(res);
        }
        return outPath;
        //return res;
    }

    /// <summary>
    /// Some JS keywords
    /// </summary>
    private string[] KeyWords = new string[] { "function", "var", "new", "typeof", "delete", "throw", "in", "instanceof" };

    // Makes sure each keyword is followed by exactly one space
    private string HandleKeyWords(string sourceString, string[] keywords)
    {
        for (int i = 0; i < keywords.Length; i++)
        {
            sourceString = Regex.Replace(sourceString, keywords[i] + @"\s*", keywords[i] + " ");
        }
        return sourceString;
    }

    public bool IsReusable
    {
        get
        {
            return false;
        }
    }
}