"Economizing" on Requests: MXHR Implementation Details and Applications
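The three most common resource types on a page are JS files, CSS files, and images. To reduce the number of HTTP requests, a deployment usually runs a tool that concatenates the JS files and then minifies them, like wringing the water out of a sponge. CSS files are usually concatenated (and minified) too; CSS minification only strips comments, spaces, and line breaks. But what about images?
If a page gets heavy traffic and contains 100 images, it issues 100 HTTP requests, and each request carries overhead on top of the image data itself. MXHR looks like it can solve this problem:
The overall MXHR flow is: base64-encode the 100 images on the back end, concatenate them into one long string, send that string to the client in a single HTTP request, then use JS to split the string and turn each piece into an image form the browser can recognize. MXHR can also carry JS or CSS files, but these are now usually deployed with the simpler concatenate-and-minify approach, so MXHR for JS and CSS is left aside here.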
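The flow above can be sketched end to end in a few lines. This is a simplified illustration, not the code studied in this article: `encodeImage`, `buildStream`, and `toDataUri` are hypothetical names, the base64 step assumes Node.js `Buffer`, and the delimiters are assumed to be the control characters \u0001 and \u0003. The version prefix and trailing boundary used by the real example are left out here.

```javascript
// Sketch only: names and stream layout are simplified assumptions.
const FIELD_SEP = '\u0001'; // separates the fields inside one payload
const BOUNDARY = '\u0003';  // separates one payload from the next

// Base64-encode raw image bytes (assumes Node.js Buffer is available).
function encodeImage(bytes) {
  return Buffer.from(bytes).toString('base64');
}

// Back end: join several images into one long string for a single response.
function buildStream(images) {
  const parts = images.map((img, i) =>
    [img.mime, String(i), encodeImage(img.bytes)].join(FIELD_SEP));
  return parts.join(BOUNDARY);
}

// Client: turn one part back into a value an <img> src accepts.
function toDataUri(part) {
  const fields = part.split(FIELD_SEP);
  return 'data:' + fields[0] + ';base64,' + fields[2];
}

const stream = buildStream([
  { mime: 'image/png', bytes: [137, 80, 78, 71] }, // first bytes of a PNG header
  { mime: 'image/gif', bytes: [71, 73, 70] },      // the ASCII letters "GIF"
]);
const uris = stream.split(BOUNDARY).map(toDataUri);
console.log(uris);
```

The point of the sketch is only that one string carries every image, and that splitting it needs nothing more than two delimiter characters that never occur in base64 text.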
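The original introduction to MXHR and its application is here, but the original small test example seems to have some problems; a corrected version is here. Let's study this example in detail, with one goal: understanding how MXHR is implemented.
First, mxhr_test.php. The original English comments are kept, with short notes added where they help: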
<?php
/**
 * Functions for combining payloads into a single stream that the
 * JS will unpack on the client-side, to reduce the number of HTTP requests.
 * Note: a payload is one unit of the stream, made up of data plus control characters.
 *
 * Takes an array of payloads and combines them into a single stream, which is then
 * sent to the browser.
 *
 * Each item in the input array should contain the following keys:
 *
 * data - the image or text data. image data should be base64 encoded.
 * content_type - the mime type of the data
 */
function mxhr_stream($payloads) {
    $stream = array();
    $version = 1;

    // Special control characters act as the field separator and the payload boundary.
    $sep = chr(1);     // control-char SOH/ASCII 1
    $newline = chr(3); // control-char ETX/ASCII 3

    foreach ($payloads as $payload) {
        $stream[] = $payload['content_type'] . $sep . (isset($payload['id']) ? $payload['id'] : '') . $sep . $payload['data'];
    }

    echo $version . $newline . implode($newline, $stream) . $newline;

    /* One element of $stream in this example (the SOH separators around the id "0" are invisible):
    image/png0iVBORw0KGgoAAAANSUhEUgAAABwAAAAWCAMAAADkSAzAAAAAQlBMVEWZmZmenp7m5ubKysq/v7+ysrLy8vLHx8fb29uvr6/v7++pqan39/empqbT09Pq6urX19dtkDObvkpLbSmLsDX///8MOm2bAAAAFnRSTlP///////////////////////////8AAdLA5AAAAIdJREFUKM910NkSwyAIBVDM0iVdUrnk/381wdLUBXlgRo7D6KXNLUA7+RYjeogoIvAxmSp1zcW/tZhZg7nVWFiFpX0RcC0h7IjIhSmCmXXQ2IEQVo22MrMT04XK0lrpmD3IN/uKuGYhQDz7JQTPzvjg2EbL8Bmn+REAOiqE132eruP7NqyX5w49di+cmF4NJgAAAABJRU5ErkJggg==
    */
}

// Package image data into a payload
function mxhr_assemble_image_payload($image_data, $id=null, $mime='image/jpeg') {
    return array(
        'data' => base64_encode($image_data),
        'content_type' => $mime,
        'id' => $id
    );
}

// Package html text into a payload (not used in this example)
function mxhr_assemble_html_payload($html_data, $id=null) {
    return array(
        'data' => $html_data,
        'content_type' => 'text/html',
        'id' => $id
    );
}

// Package javascript text into a payload (not used in this example)
function mxhr_assemble_javascript_payload($js_data, $id=null) {
    return array(
        'data' => $js_data,
        'content_type' => 'text/javascript',
        'id' => $id
    );
}

// Send the multipart stream
if (!empty($_GET['send_stream'])) {
    // Number of repetitions
    $repetitions = 300;
    $payloads = array();

    // JS files (can be ignored)
    $js_data = 'var a = "JS execution worked"; console.log(a, ';
    for ($n = 0; $n < $repetitions; $n++) {
        //$payloads[] = mxhr_assemble_javascript_payload($js_data . $n . ');', $n);
    }

    // HTML files (can be ignored)
    $html_data = '<!DOCTYPE HTML><html><head><title>Sample HTML Page</title></head><body></body></html>';
    for ($n = 0; $n < $repetitions; $n++) {
        //$payloads[] = mxhr_assemble_html_payload($html_data, $n);
    }

    // Images (a test image is used here)
    $image = 'icon_check.png';
    $image_fh = fopen($image, 'rb'); // binary-safe read
    // Read the image into $image_data
    $image_data = fread($image_fh, filesize($image));
    fclose($image_fh);
    for ($n = 0; $n < $repetitions; $n++) {
        // Build one image payload per repetition
        $payloads[] = mxhr_assemble_image_payload($image_data, $n, 'image/png');
    }

    // Send off the multipart stream
    mxhr_stream($payloads);
    exit;
}
?>
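This test sets the repetition count to 300. The PHP file serves as the back end and is used to show the difference between loading 300 test images through MXHR and loading 300 images the normal way, including a comparison of the time each takes.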
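Tip: the overall structure of the data returned by the PHP back end is:
[version][boundary][payload][boundary][payload][boundary][payload]........[payload][boundary]
From the PHP file we can see that [version] equals 1 and [boundary] is \u0003 (chr(3) in PHP), whose length on the client side is 1. The [payload] parts are what we need to extract.
The structure of a single [payload] is:
[mimetype][sep][id][sep][data]
[sep] is the separator between fields, \u0001 (chr(1) in PHP), and [data] is the content we ultimately want.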
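To make the layout concrete, here is a minimal sketch that parses a complete response string in one pass. `parseStream` and the shape of its return value are hypothetical and are not part of mxhr.js, which handles the response incrementally. The sample data ('AAAA', 'BBBB') is a placeholder, not real image data.

```javascript
// Sketch only: parseStream is an illustrative helper, not part of mxhr.js.
const BOUNDARY = '\u0003';  // chr(3) in mxhr_test.php
const FIELD_SEP = '\u0001'; // chr(1) in mxhr_test.php

function parseStream(text) {
  const chunks = text.split(BOUNDARY);
  const version = chunks.shift(); // the first chunk is the version number
  const payloads = chunks
    .filter((chunk) => chunk.length > 0) // the trailing boundary leaves an empty chunk
    .map((chunk) => {
      const fields = chunk.split(FIELD_SEP);
      return { mime: fields[0], id: fields[1], data: fields[2] };
    });
  return { version, payloads };
}

// [version][boundary][payload][boundary][payload][boundary]
const sample = '1' + BOUNDARY +
  'image/png' + FIELD_SEP + '0' + FIELD_SEP + 'AAAA' + BOUNDARY +
  'image/png' + FIELD_SEP + '1' + FIELD_SEP + 'BBBB' + BOUNDARY;
const parsed = parseStream(sample);
console.log(parsed.version, parsed.payloads.length);
```

Parsing everything at once like this only works after the whole response has arrived, which is exactly the limitation the incremental code in mxhr.js avoids.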
Now for the main event: the implementation details of mxhr.js. As before, the explanations are in the comments:
(function() {

    // ================================================================================
    // MXHR
    // --------------------------------------------------------------------------------
    // F.mxhr is a porting of DUI.Stream (git://github.com/digg/stream.git).
    //
    // We ripped out the jQuery specific code, and replaced it with normal for() loops.
    // Also worked around some of the brittleness in the string manipulations, and
    // refactored some of the rest of the code.
    //
    // Images don't work on IE yet, since we haven't found a way to get the base64
    // encoded image data into an actual image (RFC 822 looks promising, and terrifying:
    // http://www.hedgerwow.com/360/dhtml/base64-image/demo.php)
    //
    // Another possible approach uses "mhtml:",
    // http://www.stevesouders.com/blog/2009/10/05/aptimize-realtime-spriting-and-more/
    //
    // --------------------------------------------------------------------------------
    // GLOSSARY
    // packet:  the amount of data sent in one ping interval
    // payload: an entire piece of content, contained between control char boundaries
    // stream:  the data sent between opening and closing an XHR. depending on how you
    //          implement MXHR, that could be a while.
    // Note: a payload is one unit of the stream (its fields plus its data), and a
    // stream corresponds to one HTTP request, from opening to closing the XHR.
    // ================================================================================

    var F = window.F = window.F || {};
    F.mxhr = {

        // --------------------------------------------------------------------------------
        // Variables that must be global within this object.
        // --------------------------------------------------------------------------------
        getLatestPacketInterval: null,
        lastLength: 0,
        listeners: {}, // listeners are registered here through listen()

        // These match chr(3) and chr(1) in the PHP file.
        boundary: "\u0003", // IE jumps over empty entries if we use the regex version instead of the string.
        fieldDelimiter: "\u0001",

        // Note: the XHR objects of old IE versions (6, 7) do not support readyState == 3
        // (more on this at the end of the article).
        _msxml_progid: [
            'MSXML2.XMLHTTP.6.0',
            'MSXML3.XMLHTTP',
            'Microsoft.XMLHTTP', // Doesn't support readyState == 3 header requests.
            'MSXML2.XMLHTTP.3.0' // Doesn't support readyState == 3 header requests.
        ],

        // --------------------------------------------------------------------------------
        // load()
        // --------------------------------------------------------------------------------
        // Instantiate the XHR object and request data from url.
        // --------------------------------------------------------------------------------
        load: function(url) {
            this.req = this.createXhrObject();
            if (this.req) {
                this.req.open('GET', url, true);
                var that = this;
                this.req.onreadystatechange = function() {
                    that.readyStateHandler();
                };
                this.req.send(null);
            }
        },

        // --------------------------------------------------------------------------------
        // createXhrObject()
        // --------------------------------------------------------------------------------
        // Try different XHR objects until one works. Pulled from YUI Connection 2.6.0.
        // --------------------------------------------------------------------------------
        createXhrObject: function() {
            var req;
            try {
                req = new XMLHttpRequest();
            }
            catch(e) {
                for (var i = 0, len = this._msxml_progid.length; i < len; ++i) {
                    try {
                        req = new ActiveXObject(this._msxml_progid[i]);
                        break;
                    }
                    catch(e2) { }
                }
            }
            finally {
                return req;
            }
        },

        // --------------------------------------------------------------------------------
        // readyStateHandler()
        // --------------------------------------------------------------------------------
        // Start polling on state 3; stop polling and fire off oncomplete event on state 4.
        // This is a key function: it drives the whole response handling.
        // --------------------------------------------------------------------------------
        readyStateHandler: function() {
            if (this.req.readyState === 3 && this.getLatestPacketInterval === null) {
                // Start polling.
                var that = this;
                this.getLatestPacketInterval = window.setInterval(function() { that.getLatestPacket(); }, 15);
            }

            if (this.req.readyState == 4) {
                // Stop polling.
                clearInterval(this.getLatestPacketInterval);

                // Get the last packet.
                this.getLatestPacket();

                // Fire the oncomplete event.
                if (this.listeners.complete && this.listeners.complete.length) {
                    var that = this;
                    for (var n = 0, len = this.listeners.complete.length; n < len; n++) {
                        this.listeners.complete[n].apply(that);
                    }
                }
            }
        },

        // --------------------------------------------------------------------------------
        // getLatestPacket()
        // --------------------------------------------------------------------------------
        // Get all of the responseText downloaded since the last time this was executed.
        // --------------------------------------------------------------------------------
        getLatestPacket: function() {
            // Total length of the response text received so far
            var length = this.req.responseText.length;
            // The part of the response that arrived since the previous call
            var packet = this.req.responseText.substring(this.lastLength, length);

            this.processPacket(packet);
            this.lastLength = length;
        },

        // --------------------------------------------------------------------------------
        // processPacket()
        // --------------------------------------------------------------------------------
        // Keep track of incoming chunks of text; pass them on to processPayload() once
        // we have a complete payload.
        // A packet does not necessarily contain a whole number of payloads, and only a
        // complete payload can be parsed.
        // --------------------------------------------------------------------------------
        processPacket: function(packet) {
            if (packet.length < 1) return;

            // Find the beginning and the end of the payload.
            // The boundary, chr(3), marks the edge of each payload. The whole response looks like:
            // [version][boundary][payload][boundary][payload][boundary][payload]........[payload][boundary]
            // Keeping this layout in mind makes the logic below easier to follow.
            var startPos = packet.indexOf(this.boundary),
                endPos = -1;

            if (startPos > -1) {
                if (this.currentStream) {
                    // If there's an open stream, that's an end marker.
                    endPos = startPos;
                    startPos = -1;
                }
                else {
                    endPos = packet.indexOf(this.boundary, startPos + this.boundary.length);
                }
            }

            // Using the position markers, process the payload.
            if (!this.currentStream) {
                // Start a new stream.
                this.currentStream = '';

                if (startPos > -1) {
                    if (endPos > -1) {
                        // Use the end marker to grab the entire payload in one swoop
                        var payload = packet.substring(startPos, endPos);
                        this.currentStream += payload;

                        // Remove the payload from this chunk
                        packet = packet.slice(endPos);

                        this.processPayload();

                        // Start over on the remainder of this packet
                        try {
                            this.processPacket(packet);
                        }
                        catch(e) { }
                        // This catches the "Maximum call stack size reached" error in Safari (which has a
                        // really low call stack limit, either 100 or 500 depending on the version).
                        // The error can occur because this function is recursive.
                    }
                    else {
                        // Grab from the start of the start marker to the end of the chunk.
                        this.currentStream += packet.substr(startPos);

                        // Leave this.currentStream set and wait for another packet.
                    }
                }
            }
            else {
                // There is an open stream.
                if (endPos > -1) {
                    // Use the end marker to grab the rest of the payload.
                    var chunk = packet.substring(0, endPos);
                    this.currentStream += chunk;

                    // Remove the rest of the payload from this chunk.
                    packet = packet.slice(endPos);

                    this.processPayload();

                    // Start over on the remainder of this packet.
                    this.processPacket(packet);
                }
                else {
                    // Put this whole packet into this.currentStream.
                    this.currentStream += packet;

                    // Wait for another packet...
                }
            }
        },

        // --------------------------------------------------------------------------------
        // processPayload()
        // --------------------------------------------------------------------------------
        // Extract the mime-type and pass the payload on to its listeners.
        // --------------------------------------------------------------------------------
        processPayload: function() {
            // Get rid of the boundary.
            this.currentStream = this.currentStream.replace(this.boundary, '');

            // Perform some string acrobatics to separate the mime-type and id from the payload.
            // This could be customized to allow other pieces of data to be passed in as well,
            // such as image height & width.
            // Fields are separated by chr(1) ('\u0001').
            var pieces = this.currentStream.split(this.fieldDelimiter);
            var mime = pieces[0];
            var payloadId = pieces[1];
            // payload is the image data itself
            var payload = pieces[2];

            // Fire the listeners for this mime-type.
            var that = this;
            if (typeof this.listeners[mime] != 'undefined') {
                for (var n = 0, len = this.listeners[mime].length; n < len; n++) {
                    this.listeners[mime][n].call(that, payload, payloadId);
                }
            }

            // Discard the current stream now that it has been handled.
            delete this.currentStream;
        },

        // --------------------------------------------------------------------------------
        // listen()
        // --------------------------------------------------------------------------------
        // Registers mime-type listeners. Will probably rip this out and use YUI custom
        // events at some point. For now, it's good enough.
        // --------------------------------------------------------------------------------
        listen: function(mime, callback) {
            if (typeof this.listeners[mime] == 'undefined') {
                this.listeners[mime] = [];
            }

            if (typeof callback === 'function') {
                this.listeners[mime].push(callback);
            }
        }
    };

})();
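The trickiest part of mxhr.js is that a packet can end in the middle of a payload. The sketch below shows the same idea with a plain buffer and a loop instead of recursion. It is a deliberately different technique from `processPacket()`, and `createStreamParser` and `feed` are hypothetical names that do not exist in mxhr.js. The payload data ('AABBCC', 'DDDD') is a placeholder.

```javascript
// Sketch only: a buffer-and-loop alternative to the recursive processPacket().
function createStreamParser(onPayload) {
  const BOUNDARY = '\u0003';
  const FIELD_SEP = '\u0001';
  let buffer = '';        // text received so far that has not been emitted yet
  let sawVersion = false; // the first chunk of the stream is the version number

  return function feed(packet) {
    buffer += packet;
    let end;
    // Emit every complete chunk in the buffer; keep the incomplete tail for later.
    while ((end = buffer.indexOf(BOUNDARY)) !== -1) {
      const chunk = buffer.slice(0, end);
      buffer = buffer.slice(end + BOUNDARY.length);
      if (!sawVersion) { sawVersion = true; continue; }
      if (chunk.length === 0) continue;
      const fields = chunk.split(FIELD_SEP);
      onPayload(fields[0], fields[1], fields[2]);
    }
  };
}

const received = [];
const feed = createStreamParser((mime, id, data) => received.push({ mime, id, data }));
// The first payload arrives split across three packets; the second arrives whole.
feed('1\u0003image/png\u00010\u0001AA');
feed('BB');
feed('CC\u0003image/png\u00011\u0001DDDD\u0003');
console.log(received);
```

Because the loop never calls itself, this variant cannot hit the call stack limit that the try/catch in `processPacket()` guards against. It does keep unparsed text in memory until the next boundary arrives, as the original does.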
For brevity, only the main test code from index.html is shown:
<div id="bd">
    <!-- Display area for the MXHR output -->
    <div id="mxhr-output">
        <div id="mxhr-timing"></div>
    </div>
    <!-- Display area for the normal output -->
    <div id="normal-output">
        <div id="normal-timing"></div>
    </div>
    <script src="mxhr.js"></script>
    <script>
    // --------------------------------------
    // Test code
    // --------------------------------------
    var totalImages = 0;

    F.mxhr.listen('image/png', function(payload, payloadId) {
        var img = document.createElement('img');
        img.src = 'data:image/png;base64,' + payload;
        document.getElementById('mxhr-output').appendChild(img);
        totalImages++;
    });

    /*
    F.mxhr.listen('text/html', function(payload, payloadId) {
        console.log('Found text/html payload:', payload, payloadId);
    });

    F.mxhr.listen('text/javascript', function(payload, payloadId) {
        eval(payload);
    });*/

    F.mxhr.listen('complete', function() {
        var time = (new Date).getTime() - streamStart;
        document.getElementById('mxhr-timing').innerHTML = '<p>' + totalImages + ' images in a multipart stream took: <strong>' + time + 'ms</strong> (' + (Math.round(100 * (time / totalImages)) / 100) + 'ms per image)</p>';

        // Once the stream is done, load the same image the normal way for comparison.
        var normalStart = (new Date).getTime();
        var img;
        var count = 0; // shared by all onload handlers
        for (var i = 0, last = 300; i < last; i++) {
            img = document.createElement('img');
            // Attach the handler before setting src so no load event is missed.
            img.onload = function() {
                count++;
                if (count === last) {
                    var time = (new Date).getTime() - normalStart;
                    document.getElementById('normal-timing').innerHTML = '<p>' + last + ' normal, uncached images took: <strong>' + time + 'ms</strong> (' + (Math.round(100 * (time / count)) / 100) + 'ms per image)</p>';
                }
            };
            img.src = 'icon_check.png?nocache=' + (new Date).getTime() * Math.random();
            img.width = 28;
            img.height = 22;
            document.getElementById('normal-output').appendChild(img);
        }
    });

    var streamStart = (new Date).getTime();
    F.mxhr.load('mxhr_test.php?send_stream=1');
    </script>
</div>
Test results:
IE8:
300 images in a multipart stream took: 178ms (0.59ms per image)
300 normal, uncached images took: 3066ms (10.22ms per image)
IE9:
300 images in a multipart stream took: 78ms (0.26ms per image)
300 normal, uncached images took: 5822ms (19.41ms per image)
Firefox 9.0.1:
300 images in a multipart stream took: 129ms (0.43ms per image)
300 normal, uncached images took: 10278ms (34.26ms per image)
Chrome 16:
300 images in a multipart stream took: 499ms (1.66ms per image)
300 normal, uncached images took: 2593ms (8.64ms per image)
Safari 5.1.2:
300 images in a multipart stream took: 50ms (0.17ms per image)
300 normal, uncached images took: 2504ms (8.35ms per image)
Opera 11.60:
300 images in a multipart stream took: 75ms (0.25ms per image)
300 normal, uncached images took: 1060ms (3.53ms per image)
These measurements are not necessarily precise; they only indicate the rough size of the difference.
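To compare the browsers at a glance, the timings listed above can be reduced to a speedup ratio. The sketch below copies the millisecond figures from the results and uses the same rounding formula as the test page for the per-image time. `perImage` and `speedup` are hypothetical helper names introduced only for this illustration.

```javascript
// Timings in milliseconds, copied from the test results listed above.
const results = [
  { browser: 'IE8',           mxhr: 178, normal: 3066 },
  { browser: 'IE9',           mxhr: 78,  normal: 5822 },
  { browser: 'Firefox 9.0.1', mxhr: 129, normal: 10278 },
  { browser: 'Chrome 16',     mxhr: 499, normal: 2593 },
  { browser: 'Safari 5.1.2',  mxhr: 50,  normal: 2504 },
  { browser: 'Opera 11.60',   mxhr: 75,  normal: 1060 },
];

// Same rounding as the test page: two decimal places.
function perImage(timeMs, imageCount) {
  return Math.round(100 * (timeMs / imageCount)) / 100;
}

// How many times faster the multipart stream was, to one decimal place.
function speedup(result) {
  return Math.round((result.normal / result.mxhr) * 10) / 10;
}

for (const r of results) {
  console.log(r.browser + ': ' + perImage(r.mxhr, 300) + 'ms per image, ' + speedup(r) + 'x faster');
}
```

The ratios inherit all the noise of a single local run, so they are best read as orders of magnitude and not as a ranking of browsers.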
If MXHR interests you, click here to go to the official site: Multipart XHR. You can also download it directly and test it locally (a PHP environment is required).
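MXHR does reduce the number of HTTP requests, but it is subject to the browsers' own limitations. The xmlhttp request in IE6 and IE7 does not support readyState 3, and those browsers cannot parse an image given in the form:
img.src = 'data:image/png;base64,' + imageData;
so another approach has to be found for them. On the whole, though, MXHR can still improve overall page performance and achieve real "economizing" on requests.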