Economizing on Requests: Implementation Details and Application of MXHR

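The three most common kinds of resources in a page are JS files, CSS files, and image files. To reduce the number of HTTP requests, a deployment usually runs a tool that merges a pile of JS files and then minifies them, like wringing the water out of a sponge. CSS files are usually merged (and compressed) too, although CSS compression only strips comments, spaces, and line breaks. But what about image files?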

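If a page gets a lot of traffic and contains 100 images, every visit triggers 100 HTTP requests, each carrying its own overhead on top of the image data itself. MXHR appears to address this problem: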

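The overall flow of MXHR (multipart XHR) is this: the back end base64-encodes the 100 images and concatenates them into one long string, that string is returned to the client in a single HTTP request, and JavaScript then splits it and turns each piece into an image form the browser can display. MXHR can also carry JS or CSS files, but those are normally deployed with the simpler merge-and-minify approach, so JS and CSS are left out of scope here.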

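The original write-up of MXHR comes with a small test example, but that example seems to have some problems, so this article works from a modified version. Let's study the example in detail, with one goal: understanding how MXHR is actually implemented.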

Step by step

Start with the mxhr_test.php file. The original English comments are kept, with explanatory notes added where they help:

<?php
	/**
	 * Functions for combining payloads into a single stream that the
	 * JS will unpack on the client-side, to reduce the number of HTTP requests.
	 * Note: a payload here is one unit of the stream, made up of data plus control
	 * characters. mxhr_stream() merges every payload into a single stream.
	 *
	 * Takes an array of payloads and combines them into a single stream, which is then
	 * sent to the browser.
	 *
	 * Each item in the input array should contain the following keys:
	 * data         - the image or text data. image data should be base64 encoded.
	 * content_type - the mime type of the data
	 * id           - optional identifier, passed through to the client
	 */
	function mxhr_stream($payloads) {
		
		$stream = array();
		
		$version = 1;
		// Use control characters as the field separator and the payload boundary
		$sep = chr(1); // control-char SOH/ASCII 1
		$newline = chr(3); // control-char ETX/ASCII 3
		
		foreach ($payloads as $payload) {
			$stream[] = $payload['content_type'] . $sep . (isset($payload['id']) ? $payload['id'] : '') . $sep . $payload['data'];
		}
		echo $version . $newline . implode($newline, $stream) . $newline;
		/*
		One element of $stream in this example (the chr(1) separators between the mime type, the id 0, and the data are not visible): image/png0iVBORw0KGgoAAAANSUhEUgAAABwAAAAWCAMAAADkSAzAAAAAQlBMVEWZmZmenp7m5ubKysq/v7+ysrLy8vLHx8fb29uvr6/v7++pqan39/empqbT09Pq6urX19dtkDObvkpLbSmLsDX///8MOm2bAAAAFnRSTlP///////////////////////////8AAdLA5AAAAIdJREFUKM910NkSwyAIBVDM0iVdUrnk/381wdLUBXlgRo7D6KXNLUA7+RYjeogoIvAxmSp1zcW/tZhZg7nVWFiFpX0RcC0h7IjIhSmCmXXQ2IEQVo22MrMT04XK0lrpmD3IN/uKuGYhQDz7JQTPzvjg2EbL8Bmn+REAOiqE132eruP7NqyX5w49di+cmF4NJgAAAABJRU5ErkJggg==
		*/
	}
	
	// Package image data into a payload
	
	function mxhr_assemble_image_payload($image_data, $id=null, $mime='image/jpeg') {
		return array(
			'data' => base64_encode($image_data),
			'content_type' => $mime,
			'id' => $id
		);
	}
	
	// Package html text into a payload (not used in this example)

	function mxhr_assemble_html_payload($html_data, $id=null) {
		return array(
			'data' => $html_data,
			'content_type' => 'text/html',
			'id' => $id
		);
 	}

	// Package javascript text into a payload (not used in this example)

	function mxhr_assemble_javascript_payload($js_data, $id=null) {
		return array(
			'data' => $js_data,
			'content_type' => 'text/javascript',
			'id' => $id
		);
 	}

	// Send the multipart stream

	if (!empty($_GET['send_stream'])) {
		// Number of times each payload is repeated
		$repetitions = 300;
		$payloads = array();

		// JS files (optional, disabled in this test)

		$js_data = 'var a = "JS execution worked"; console.log(a, ';

		for ($n = 0; $n < $repetitions; $n++) {
			//$payloads[] = mxhr_assemble_javascript_payload($js_data . $n . ', $n);');
		}

		// HTML files (optional, disabled in this test)

		$html_data = '<!DOCTYPE HTML><html><head><title>Sample HTML Page</title></head><body></body></html>';

		for ($n = 0; $n < $repetitions; $n++) {
			//$payloads[] = mxhr_assemble_html_payload($html_data, $n);
		}

		// Images (a test image is used here)

		$image = 'icon_check.png';
		$image_fh = fopen($image, 'rb'); // binary-safe read
		// Read the whole image into $image_data
		$image_data = fread($image_fh, filesize($image));
		fclose($image_fh);

		for ($n = 0; $n < $repetitions; $n++) {
			// Build the payload array for this image
			$payloads[] = mxhr_assemble_image_payload($image_data, $n, 'image/png');
		}

		// Send off the multipart stream
		mxhr_stream($payloads);
		exit;
	}

?>

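This test uses 300 repetitions. The PHP file serves as the back end, and it is used to show the difference between loading 300 test images through MXHR and loading 300 images the normal way, including how long each approach takes.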
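For readers without a PHP environment, the assembly step performed by mxhr_stream() can be approximated in JavaScript under Node.js. This is only a sketch: buildStream and imagePayload are hypothetical names that mirror the PHP functions above, and the four sample bytes stand in for a real image file.

```javascript
// Hypothetical Node.js port of the PHP stream assembly, for experimentation only.
const SEP = '\u0001';      // field separator, same value as chr(1) in the PHP
const BOUNDARY = '\u0003'; // payload boundary, same value as chr(3) in the PHP
const VERSION = 1;

// Wrap raw image bytes into a payload object, base64-encoding the data.
function imagePayload(bytes, id, mime = 'image/jpeg') {
  return { data: Buffer.from(bytes).toString('base64'), contentType: mime, id: id };
}

// Join payloads into one string: version, boundary, payload, boundary, ...
function buildStream(payloads) {
  const parts = payloads.map(function (p) {
    const id = p.id === undefined || p.id === null ? '' : String(p.id);
    return p.contentType + SEP + id + SEP + p.data;
  });
  return VERSION + BOUNDARY + parts.join(BOUNDARY) + BOUNDARY;
}

// Sample bytes only (the first four bytes of a PNG signature), not a full image.
const stream = buildStream([
  imagePayload([0x89, 0x50, 0x4e, 0x47], 0, 'image/png'),
  imagePayload([0x89, 0x50, 0x4e, 0x47], 1, 'image/png')
]);

// JSON.stringify makes the otherwise invisible control characters visible.
console.log(JSON.stringify(stream));
```

Printing through JSON.stringify is a convenient way to see where the control characters fall, since they are invisible when the raw string is echoed.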

Tip: the overall structure of the data returned by the PHP back end is:

[version][boundary][payload][boundary][payload][boundary][payload]........[payload][boundary]

From the PHP file we can see that [version] is 1 and that [boundary] is \u0003 (chr(3) in the PHP), whose length on the client side is 1. [payload] is the part we mainly want to extract.

The structure of a single [payload] is:

[mimetype][sep][id][sep][data]

[sep] is the separator between the fields, \u0001 (chr(1) in the PHP), and [data] is the content we ultimately want to extract.
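The layout can be checked with a small parser for a complete stream. This is a sketch for illustration only: parseStream is a hypothetical helper that is not part of the demo, the control characters follow the chr(3) and chr(1) values in mxhr_test.php, and the code assumes the whole response has already arrived.

```javascript
// Illustrative parser for a complete stream; parseStream is a hypothetical name.
const BOUNDARY = '\u0003'; // chr(3): separates the version and the payloads
const SEP = '\u0001';      // chr(1): separates the fields inside a payload

function parseStream(stream) {
  const chunks = stream.split(BOUNDARY);
  const version = chunks.shift(); // the first chunk is the version
  const payloads = chunks
    .filter(function (c) { return c.length > 0; }) // the trailing boundary leaves an empty chunk
    .map(function (c) {
      const first = c.indexOf(SEP);
      const second = c.indexOf(SEP, first + 1);
      return {
        mime: c.slice(0, first),
        id: c.slice(first + 1, second),
        data: c.slice(second + 1) // everything after the second separator
      };
    });
  return { version: version, payloads: payloads };
}

const sample = '1' + BOUNDARY +
  'image/png' + SEP + '0' + SEP + 'iVBORw==' + BOUNDARY +
  'text/html' + SEP + '' + SEP + '<p>hi</p>' + BOUNDARY;

const parsed = parseStream(sample);
console.log(parsed.version, parsed.payloads.length);
```

Because base64 output and ordinary HTML or JS text do not normally contain the characters \u0001 and \u0003, splitting on them is unambiguous for this test data.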

Now for the main part: the implementation details of mxhr.js. As before, the explanations are in the comments:

(function() {
	
	// ================================================================================
	// MXHR
	// --------------------------------------------------------------------------------
	// F.mxhr is a porting of DUI.Stream (git://github.com/digg/stream.git).
	//
	// We ripped out the jQuery specific code, and replaced it with normal for() loops. 
	// Also worked around some of the brittleness in the string manipulations, and 
	// refactored some of the rest of the code.
	// 
	// Images don't work on IE yet, since we haven't found a way to get the base64
	// encoded image data into an actual image (RFC 822 looks promising, and terrifying:
	// http://www.hedgerwow.com/360/dhtml/base64-image/demo.php)
	//
	// Another possible approach uses "mhtml:", 
	// http://www.stevesouders.com/blog/2009/10/05/aptimize-realtime-spriting-and-more/ 
	//
	// --------------------------------------------------------------------------------
	// GLOSSARY
	// packet:  the amount of data sent in one ping interval
	// payload: an entire piece of content, contained between control char boundaries
	// stream:  the data sent between opening and closing an XHR. depending on how you 
	//          implement MXHR, that could be a while.
	// Notes on the terms:
	// payload: one unit of the whole stream, made up of control characters
	//          (boundary and field separators) plus the data itself
	// stream:  corresponds to a single HTTP request
	// ================================================================================

	F = window.F || {};
	F.mxhr = {
		
		// --------------------------------------------------------------------------------
		// Variables that must be global within this object.
		// --------------------------------------------------------------------------------

		getLatestPacketInterval: null,
		lastLength: 0,
		listeners: {}, // registered listeners, keyed by mime type
		// These two correspond to chr(3) and chr(1) in the PHP
		boundary: "\u0003", 		// IE jumps over empty entries if we use the regex version instead of the string.
		fieldDelimiter: "\u0001",

		// Note: older versions of IE (6 and 7) do not support readyState == 3 on the
		// XHR object created here (see the end of this article).
		_msxml_progid: [
			'MSXML2.XMLHTTP.6.0',
			'MSXML3.XMLHTTP',
			'Microsoft.XMLHTTP', // Doesn't support readyState == 3 header requests.
			'MSXML2.XMLHTTP.3.0' // Doesn't support readyState == 3 header requests.
		],

		// --------------------------------------------------------------------------------
		// load()
		// --------------------------------------------------------------------------------
		// Instantiate the XHR object and request data from url.
		// --------------------------------------------------------------------------------

		load: function(url) {
			this.req = this.createXhrObject();
			if (this.req) {
				this.req.open('GET', url, true);

				var that = this;
				this.req.onreadystatechange = function() {
					that.readyStateHandler();
				}

				this.req.send(null);
			}
		},

		// --------------------------------------------------------------------------------
		// createXhrObject()
		// --------------------------------------------------------------------------------
		// Try different XHR objects until one works. Pulled from YUI Connection 2.6.0.
		// --------------------------------------------------------------------------------
		
		createXhrObject: function() {
			var req;
			try {
				req = new XMLHttpRequest();
			}
			catch(e) {
				for (var i = 0, len = this._msxml_progid.length; i < len; ++i) {
					try {
						req = new ActiveXObject(this._msxml_progid[i]);
						break;
					}
					catch(e2) {  }
				}
			}
			finally {
				return req;
			}
		},		
    
		// --------------------------------------------------------------------------------
		// readyStateHandler()
		// --------------------------------------------------------------------------------
		// Start polling on state 3; stop polling and fire off oncomplete event on state 4.
		// This is the key handler: it reacts to every change of the request's readyState.
		// --------------------------------------------------------------------------------

		readyStateHandler: function() {

			if (this.req.readyState === 3 && this.getLatestPacketInterval === null) {
					
				// Start polling.

				var that = this;					
				this.getLatestPacketInterval = window.setInterval(function() { that.getLatestPacket(); }, 15);
			}

			if (this.req.readyState == 4) {

				// Stop polling.

				clearInterval(this.getLatestPacketInterval);

				// Get the last packet.

				this.getLatestPacket();

				// Fire the oncomplete event.
				if (this.listeners.complete && this.listeners.complete.length) {
					var that = this;
					for (var n = 0, len = this.listeners.complete.length; n < len; n++) {
						this.listeners.complete[n].apply(that);
					}
				}
			}
		},
		
		// --------------------------------------------------------------------------------
		// getLatestPacket()
		// --------------------------------------------------------------------------------
		// Get all of the responseText downloaded since the last time this was executed.
		// --------------------------------------------------------------------------------		
    
		getLatestPacket: function() {
			// Total length of the response text received so far
			var length = this.req.responseText.length;
			// The part of the response that arrived since the previous call
			var packet = this.req.responseText.substring(this.lastLength, length);

			this.processPacket(packet);
			this.lastLength = length;
		},
   
		// --------------------------------------------------------------------------------
		// processPacket()
		// --------------------------------------------------------------------------------
		// Keep track of incoming chunks of text; pass them on to processPayload() once
		// we have a complete payload.
		// Note: a packet does not necessarily contain a whole number of payloads, and a
		// payload is the smallest unit that can be parsed.
		// --------------------------------------------------------------------------------
 
		processPacket: function(packet) {

			if (packet.length < 1) return;

			// Find the beginning and the end of the payload.
			// The boundary, chr(3), marks the edge of each payload.
			// The whole response can be pictured as:
			// [version][boundary][payload][boundary][payload][boundary][payload]........[payload][boundary]
			// Keeping this layout in mind helps when following the logic below.
			var startPos = packet.indexOf(this.boundary),
			    endPos = -1;

			if (startPos > -1) {
				if (this.currentStream) {

					// If there's an open stream, that's an end marker.

					endPos = startPos;
					startPos = -1;
				} 
				else {
					endPos = packet.indexOf(this.boundary, startPos + this.boundary.length);
				}
			}

			// Using the position markers, process the payload.

			if (!this.currentStream) {

				// Start a new stream.

				this.currentStream = '';

				if (startPos > -1) {

					if (endPos > -1) {

						// Use the end marker to grab the entire payload in one swoop
						// Once both the start and the end of a payload are known, cut it out

						var payload = packet.substring(startPos, endPos);
						this.currentStream += payload;

						// Remove the payload from this chunk

						packet = packet.slice(endPos);

						this.processPayload();

						// Start over on the remainder of this packet

						try {
							this.processPacket(packet);
						}
						catch(e) {  } 
						// This catches the "Maximum call stack size reached" error in Safari (which has a 
						// really low call stack limit, either 100 or 500 depending on the version).
						// The error can occur because processPacket() calls itself recursively.
					} 
					else {
						// Grab from the start of the start marker to the end of the chunk.

						this.currentStream += packet.substr(startPos);

						// Leave this.currentStream set and wait for another packet.
					}
				} 
			} 
			else {
				// There is an open stream.

				if (endPos > -1) {

					// Use the end marker to grab the rest of the payload.

					var chunk = packet.substring(0, endPos);
					this.currentStream += chunk;

					// Remove the rest of the payload from this chunk.
					packet = packet.slice(endPos);

					this.processPayload();

					//Start over on the remainder of this packet.

					this.processPacket(packet);
				} 
				else {
					// Put this whole packet into this.currentStream.

					this.currentStream += packet;

					// Wait for another packet...
				}
			}
		},

		// --------------------------------------------------------------------------------
		// processPayload()
		// --------------------------------------------------------------------------------
		// Extract the mime-type and pass the payload on to its listeners.
		// --------------------------------------------------------------------------------
    
		processPayload: function() {

			// Get rid of the boundary.
			
			this.currentStream = this.currentStream.replace(this.boundary, '');

			// Perform some string acrobatics to separate the mime-type and id from the payload.
			// This could be customized to allow other pieces of data to be passed in as well,
			// such as image height & width.
			// Fields inside a payload are separated by chr(1) ('\u0001').

			var pieces = this.currentStream.split(this.fieldDelimiter);
			var mime = pieces[0];
			var payloadId = pieces[1];
			// payload is the data itself (base64 image data in this test)
			var payload = pieces[2];

			// Fire the listeners for this mime-type.

			var that = this;
			if (typeof this.listeners[mime] != 'undefined') {
				for (var n = 0, len = this.listeners[mime].length; n < len; n++) {
					this.listeners[mime][n].call(that, payload, payloadId);
				}
			}
			// Discard the current stream so that the next payload starts fresh
			delete this.currentStream;
		},
		
		// --------------------------------------------------------------------------------
		// listen()
		// --------------------------------------------------------------------------------
		// Registers mime-type listeners. Will probably rip this out and use YUI custom
		// events at some point. For now, it's good enough.
		// --------------------------------------------------------------------------------		
    
		listen: function(mime, callback) {
			if (typeof this.listeners[mime] == 'undefined') {
				this.listeners[mime] = [];
			}

			if (typeof callback === 'function') {
				this.listeners[mime].push(callback);
			}
		}
	};

})();
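The recursive processPacket() above is the part that needs a try/catch for Safari's call stack limit. As an alternative (this is not the approach mxhr.js takes), the same splitting can be written as a loop over a buffer. A minimal sketch, assuming the stream layout produced by mxhr_test.php; createSplitter and onPayload are hypothetical names:

```javascript
// Hypothetical iterative alternative to the recursive processPacket().
function createSplitter(onPayload) {
  const BOUNDARY = '\u0003'; // chr(3) in the PHP
  const SEP = '\u0001';      // chr(1) in the PHP
  let buffer = '';
  let sawVersion = false;

  return function push(packet) {
    buffer += packet;
    let end;
    // Consume every complete chunk currently in the buffer, without recursion.
    while ((end = buffer.indexOf(BOUNDARY)) > -1) {
      const chunk = buffer.slice(0, end);
      buffer = buffer.slice(end + BOUNDARY.length);
      if (!sawVersion) { sawVersion = true; continue; } // the first chunk is the version
      if (chunk.length === 0) continue;
      const pieces = chunk.split(SEP);
      onPayload(pieces[0], pieces[1], pieces[2]); // mime, id, data
    }
    // Whatever remains in the buffer is a partial payload; wait for more data.
  };
}

const seen = [];
const push = createSplitter(function (mime, id, data) {
  seen.push([mime, id, data]);
});

// Packets may cut a payload at any position, as partial responseText reads do.
push('1\u0003image/png\u00010\u0001iVB');
push('ORw==\u0003image/png\u00011\u0001iVBO');
push('Rw==\u0003');
console.log(seen.length);
```

In the browser, push would be called with each new slice of responseText, in the place where getLatestPacket() currently calls processPacket().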

For brevity, only the main test code from index.html is shown:

	<div id="bd">
    	<!-- Output area for the MXHR test -->
		<div id="mxhr-output">
			<div id="mxhr-timing"></div>
		</div>

		<!-- Output area for the normal (one request per image) test -->
		<div id="normal-output">
			<div id="normal-timing"></div>
		</div>

		<script src="mxhr.js"></script>
		<script>
			// --------------------------------------
			// Test code
			// --------------------------------------

			var totalImages = 0;

			F.mxhr.listen('image/png', function(payload, payloadId) {
				var img = document.createElement('img');
				img.src = 'data:image/png;base64,' + payload;
				document.getElementById('mxhr-output').appendChild(img);

				totalImages++;
			});

/*			F.mxhr.listen('text/html', function(payload, payloadId) {
				console.log('Found text/html payload:', payload, payloadId);
			});

			F.mxhr.listen('text/javascript', function(payload, payloadId) {
				eval(payload);
			});*/

			F.mxhr.listen('complete', function() {

				var time = (new Date).getTime() - streamStart;
				document.getElementById('mxhr-timing').innerHTML = '<p>' + totalImages + ' images in a multipart stream took: <strong>' + time + 'ms</strong> (' + (Math.round(100 * (time / totalImages)) / 100) + 'ms per image)</p>';
		
				var normalStart = (new Date).getTime();
				var img;
				var count = 0; // declared once, outside the loop
				for (var i = 0, last = 300; i < last; i++) {
					img = document.createElement('img');
					img.src = 'icon_check.png?nocache=' + (new Date).getTime() * Math.random();
					img.width = 28;
					img.height = 22;
					document.getElementById('normal-output').appendChild(img);
					img.onload = function() {
						count++;
						if (count === last) {
							var time = (new Date).getTime() - normalStart;
							document.getElementById('normal-timing').innerHTML = '<p>' + last + ' normal, uncached images took: <strong>' + time + 'ms</strong> (' + (Math.round(100 * (time / count)) / 100) + 'ms per image)</p>';
						}
					};
				}
			});

			var streamStart = (new Date).getTime();
			F.mxhr.load('mxhr_test.php?send_stream=1');
		</script>
	</div>

Test results:

IE8:

300 images in a multipart stream took: 178ms (0.59ms per image)

300 normal, uncached images took: 3066ms (10.22ms per image)

IE9:

300 images in a multipart stream took: 78ms (0.26ms per image)

300 normal, uncached images took: 5822ms (19.41ms per image)

Firefox 9.0.1:

300 images in a multipart stream took: 129ms (0.43ms per image)

300 normal, uncached images took: 10278ms (34.26ms per image)

Chrome 16:

300 images in a multipart stream took: 499ms (1.66ms per image)

300 normal, uncached images took: 2593ms (8.64ms per image)

Safari 5.1.2:

300 images in a multipart stream took: 50ms (0.17ms per image)

300 normal, uncached images took: 2504ms (8.35ms per image)

Opera 11.60:

300 images in a multipart stream took: 75ms (0.25ms per image)

300 normal, uncached images took: 1060ms (3.53ms per image)

These measurements are not necessarily precise; they only indicate the rough size of the difference.
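To compare the browsers more easily, the timings listed above can be reduced to a speedup factor (normal time divided by MXHR time). The numbers are copied from the results; only the ratio calculation is new, and given how rough the measurements are, the factors should be read as orders of magnitude rather than exact values.

```javascript
// Timings in milliseconds for 300 images, copied from the results above.
const results = [
  { browser: 'IE8',           mxhr: 178, normal: 3066 },
  { browser: 'IE9',           mxhr: 78,  normal: 5822 },
  { browser: 'Firefox 9.0.1', mxhr: 129, normal: 10278 },
  { browser: 'Chrome 16',     mxhr: 499, normal: 2593 },
  { browser: 'Safari 5.1.2',  mxhr: 50,  normal: 2504 },
  { browser: 'Opera 11.60',   mxhr: 75,  normal: 1060 }
];

// Speedup = normal time / MXHR time, rounded to one decimal place.
const speedups = results.map(function (r) {
  return { browser: r.browser, speedup: Math.round((r.normal / r.mxhr) * 10) / 10 };
});

speedups.forEach(function (s) {
  console.log(s.browser + ': ' + s.speedup + 'x');
});
```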

If you are interested in MXHR, see the official Multipart XHR site. You can also download the demo and test it locally (a PHP environment is required).

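MXHR does reduce the number of HTTP requests, but it is subject to browser limitations. XMLHTTP requests in IE 6 and 7 do not support readyState 3, and these browsers also cannot parse images supplied in the form:

img.src = 'data:image/png;base64,' + imageData;

so another approach is needed for them. Overall, though, MXHR can still improve a page's performance and achieve real economy in requests.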

posted @ 2012-01-07 15:50  无墨来点睛