The Network Byte Order Problem - draeag

http://blog.163.com/lyzaily@126/blog/static/42438837200910151422229/

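In the previous article, <<Stripping the Wrapper off MFC's Socket Classes>>, I described how the two socket classes that MFC provides relate to the raw socket API, and touched on byte order along the way. In this article I sort that topic out and summarize it, both for my own later review and as a reference for beginners. If you spot anything wrong, please point it out so we can all learn and improve together.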

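As I said in <<Stripping the Wrapper off MFC's Socket Classes>>, if the socket program you are writing must talk to a peer that was written with MFC's CSocket class, it is best to write your own program with CSocket too. The reason is byte order: with CSocket we do not have to think about byte order, because the class already does that work for us.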

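Still, we cannot rely on CSocket for everything at work. We should be in control of our own fate rather than hand all of it to Microsoft. To understand byte order, we first have to understand big-endian and little-endian.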

What is big-endian?

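For a multi-byte variable (an int, short, or long), big-endian means that the byte holding the most significant bits is stored at the lowest address of the memory the variable occupies. In other words, on a big-endian architecture the address of the most significant byte is the address of the variable itself. This holds only for big-endian machines, not for every architecture.

The English documentation states it as: The most significant byte is on the left end of a word.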

What is little-endian?

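For a multi-byte variable (an int, short, or long), little-endian means that the most significant byte is stored in the rightmost byte of the memory the variable occupies, that is, at its highest address, while the least significant byte sits at the lowest address. The English documentation states it as: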

The most significant byte is on the right end of a word.

To sum up:

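When a multi-byte variable is drawn in memory, its bytes are laid out from left to right in order of increasing address. So "the left end of a word" in the statements above is the byte at the lowest address of the word, and "the right end of a word" is the byte at the highest address of the word.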
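The two layouts can be observed directly. The following is a minimal sketch, not taken from the original article; the helper names `bytes_in_memory` and `host_is_little_endian` are made up for illustration. It copies the bytes of a 32-bit value out in increasing address order, so the leftmost byte of the resulting text is the one at the lowest address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: returns the bytes of `value` as hex text, listed in
// increasing address order (leftmost = lowest address).
std::string bytes_in_memory(std::uint32_t value) {
    unsigned char raw[sizeof value];
    std::memcpy(raw, &value, sizeof value);  // copy the object representation

    std::string text;
    char hex[3];
    for (std::size_t i = 0; i < sizeof value; ++i) {
        std::snprintf(hex, sizeof hex, "%02X", static_cast<unsigned>(raw[i]));
        if (i != 0) text += ' ';
        text += hex;
    }
    assert(text.size() == 3 * sizeof value - 1);  // "XX XX XX XX"
    return text;
}

// Hypothetical helper: true when the least significant byte is stored at the
// lowest address, i.e. the host is little-endian.
bool host_is_little_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 0x02;
}
```

For the value 0x12345678, a big-endian host yields "12 34 56 78" and a little-endian host yields "78 56 34 12".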

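Now that byte order and the difference between big-endian and little-endian are clear, the next questions are why byte order has to be defined at all and when it has to be taken into account.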

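In most cases you do not need to worry about byte-order conversion when sending and receiving data over the network, but in the following situations you must consider it: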

(I) When must multi-byte variables be converted?

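(1) When you pass data that the network itself has to interpret, such as port numbers and network addresses, byte-order conversion is required.

To avoid misinterpretation, here is the original wording: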

You are passing information that needs to be interpreted by the network, as opposed to the data you are sending to another machine. For example, you might pass ports and addresses, which the network must understand.
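As an illustration of this case, here is a minimal sketch of filling in an IPv4 socket address. The function name `make_endpoint` is hypothetical; `sockaddr_in`, `htons`, and `htonl` are the standard socket API names. The sketch assumes Winsock headers on Windows (linking against Ws2_32.lib) and the POSIX headers elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#endif

// Hypothetical helper: builds an IPv4 endpoint from an address and a port
// that are both given in host byte order.
sockaddr_in make_endpoint(std::uint32_t host_order_ip, unsigned port) {
    assert(port <= 0xFFFF);  // a port must fit in 16 bits

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    // The network stack reads these two fields, so they must be big-endian.
    addr.sin_port = htons(static_cast<std::uint16_t>(port));
    addr.sin_addr.s_addr = htonl(host_order_ip);
    return addr;
}
```

Calling `make_endpoint(0x7F000001, 8080)` describes 127.0.0.1:8080. Whatever the host byte order, the two bytes of `sin_port` in memory are 0x1F followed by 0x90.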

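(2) Your client talks to a server that was not written with MFC's CSocket class (and you have no server source code to modify), and the two machines have different architectures (one stores data big-endian, the other little-endian). In that case the multi-byte variables must be converted before they are sent over the network.

Again, the original wording for this case: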

The server application with which you are communicating is not an MFC application (and you do not have source code for it). This calls for byte order conversions if the two machines do not share the same byte ordering.

Since there are cases that require conversion, there are also cases that do not. So when can a multi-byte value be sent over the network directly, without conversion?

(II) In the following situations you do not need to convert the multi-byte variables you transmit:

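(1) Both machines store data with the same byte order, and the programs on both ends have agreed not to swap bytes. The sender then does not convert data to network byte order (big-endian) before transmitting it, and the receiver does not convert the multi-byte data it reads from the network back to host byte order.

Likewise, the original wording: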

The machines on both ends can agree not to swap bytes, and both machines use the same byte order.

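(2) The server your client talks to is implemented with MFC's CSocket class. In this case the client does not need to care which byte order the server uses to store ordinary multi-byte variables, because CSocket, through the CArchive it is used with, takes care of the conversion for us.

The original wording: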

The server you are communicating with is an MFC application.

(3) You have the server's source code, so you can define the protocol yourself and decide whether byte-order conversion is required.

The original wording:

You have source code for the server you're communicating with, so you can tell explicitly whether you must convert byte orders or not.

Working with CAsyncSocket, you must manage any necessary byte-order conversions yourself. Windows Sockets standardizes the "big-Endian" byte-order model and provides functions to convert between this order and others. CArchive, however, which you use with CSocket, uses the opposite ("little-Endian") order, but CArchive takes care of the details of byte-order conversions for you. By using this standard ordering in your applications, or using Windows Sockets byte-order conversion functions, you can make your code more portable.

The ideal case for using MFC sockets is when you are writing both ends of the communication: using MFC at both ends. If you are writing an application that will communicate with non-MFC applications, such as an FTP server, you will probably need to manage byte-swapping yourself before you pass data to the archive object, using the Windows Sockets conversion routines ntohs, ntohl, htons, and htonl. An example of these functions used in communicating with a non-MFC application appears later in this article.

NOTE:

When the other end of the communication is not an MFC application, you also must avoid streaming C++ objects derived from CObject into your archive because the receiver will not be able to handle them. See the note in Windows Sockets: Using Sockets with Archives.

For more information about byte orders, see the Windows Sockets specification, available in the Platform SDK.

A Byte-Order Conversion Example

The following example shows a serialization function for a CSocket object that uses an archive. It also illustrates using the byte-order conversion functions in the Windows Sockets API.

This example presents a scenario in which you are writing a client that communicates with a non-MFC server application for which you have no access to the source code. In this scenario, you must assume that the non-MFC server uses standard network byte order. In contrast, your MFC client application uses a CArchive object with a CSocket object, and CArchive uses "little-Endian" byte order, the opposite of the network standard. Suppose the non-MFC server with which you plan to communicate has an established protocol for a message packet like the following:

    struct Message
    {
        long MagicNumber;
        unsigned short Command;
        short Param1;
        long Param2;
    };

In MFC terms, this would be expressed as follows:

    struct Message
    {
        long m_lMagicNumber;
        short m_nCommand;
        short m_nParam1;
        long m_lParam2;

        void Serialize( CArchive& ar );
    };

In C++, a struct is essentially the same thing as a class. The Message structure can have member functions, such as the Serialize member function declared above. The Serialize member function might look like this:

    void Message::Serialize(CArchive& ar)
    {
        if (ar.IsStoring())
        {
            // Sending: convert each field from host to network byte order.
            ar << (DWORD)htonl(m_lMagicNumber);
            ar << (WORD)htons(m_nCommand);
            ar << (WORD)htons(m_nParam1);
            ar << (DWORD)htonl(m_lParam2);
        }
        else
        {
            // Receiving: read each field, then convert it to host byte order.
            WORD w;
            DWORD dw;

            ar >> dw;
            m_lMagicNumber = ntohl((long)dw);
            ar >> w;
            m_nCommand = ntohs((short)w);
            ar >> w;
            m_nParam1 = ntohs((short)w);
            ar >> dw;
            m_lParam2 = ntohl((long)dw);
        }
    }

This example calls for byte-order conversions of data because there is a clear mismatch between the byte ordering of the non-MFC server application on one end and the CArchive used in your MFC client application on the other end. The example illustrates several of the byte-order conversion functions that Windows Sockets supplies. The following table describes these functions.

Windows Sockets Byte-Order Conversion Functions:

ntohs    Convert a 16-bit quantity from network byte order to host byte order (big-Endian to little-Endian).

ntohl    Convert a 32-bit quantity from network byte order to host byte order (big-Endian to little-Endian).

htons    Convert a 16-bit quantity from host byte order to network byte order (little-Endian to big-Endian).

htonl    Convert a 32-bit quantity from host byte order to network byte order (little-Endian to big-Endian).

Another point of this example is that when the socket application on the other end of the communication is a non-MFC application, you must avoid doing something like the following:

ar << pMsg;

where pMsg is a pointer to a C++ object derived from class CObject. This will send extra MFC information associated with objects and the server will not understand it, as it would if it were an MFC application.
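The same conversions can be tried without MFC. The sketch below is not part of the quoted documentation: it packs the four fields of the Message packet into a 12-byte buffer in network byte order and unpacks them again. The names `WireMessage`, `pack`, and `unpack` are hypothetical, and the fixed-width field types and the 12-byte packet with no padding are assumptions about the protocol.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>
#ifdef _WIN32
#include <winsock2.h>   // htonl, htons, ntohl, ntohs (link with Ws2_32.lib)
#else
#include <arpa/inet.h>  // htonl, htons, ntohl, ntohs
#endif

// Hypothetical fixed-width stand-in for the Message packet.
struct WireMessage {
    std::int32_t  magic;
    std::uint16_t command;
    std::int16_t  param1;
    std::int32_t  param2;
};

// Converts every field to network byte order and copies it into a buffer.
std::vector<unsigned char> pack(const WireMessage& m) {
    const std::uint32_t magic   = htonl(static_cast<std::uint32_t>(m.magic));
    const std::uint16_t command = htons(m.command);
    const std::uint16_t param1  = htons(static_cast<std::uint16_t>(m.param1));
    const std::uint32_t param2  = htonl(static_cast<std::uint32_t>(m.param2));

    std::vector<unsigned char> out(12);
    std::memcpy(out.data(),     &magic,   4);
    std::memcpy(out.data() + 4, &command, 2);
    std::memcpy(out.data() + 6, &param1,  2);
    std::memcpy(out.data() + 8, &param2,  4);
    return out;
}

// Reads the fields back and converts them to host byte order.
WireMessage unpack(const std::vector<unsigned char>& in) {
    assert(in.size() == 12);  // this sketch only accepts whole packets
    std::uint32_t magic, param2;
    std::uint16_t command, param1;
    std::memcpy(&magic,   in.data(),     4);
    std::memcpy(&command, in.data() + 4, 2);
    std::memcpy(&param1,  in.data() + 6, 2);
    std::memcpy(&param2,  in.data() + 8, 4);

    WireMessage m;
    m.magic   = static_cast<std::int32_t>(ntohl(magic));
    m.command = ntohs(command);
    m.param1  = static_cast<std::int16_t>(ntohs(param1));
    m.param2  = static_cast<std::int32_t>(ntohl(param2));
    return m;
}
```

Because the fields are copied one by one, the buffer layout does not depend on how the compiler pads the struct, which is one reason to avoid sending a struct's raw memory to a peer built with a different compiler.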

[Summary]

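1. Multi-byte data that the network protocol itself needs (for example, port numbers and network addresses) must be converted to network byte order (big-endian).

2. Everything else is ordinary multi-byte data, such as an int, short, or long inside a file you are sending. Whether it has to be converted to network byte order before transmission depends on what the two endpoints agree on:

(1) If both machines are little-endian, or both are big-endian, neither end needs to convert; each can send, receive, and store the data directly.

(2) If the two machines have different byte orders, the sender must convert multi-byte data to network byte order when sending, and the receiver converts what it receives to its own host byte order.

(3) If the two machines have different byte orders but the programs on both ends are implemented with the CSocket class, ordinary multi-byte data can also be sent and received directly, without conversion.

(4) If the two ends have different byte orders and one of the programs is not implemented with the CSocket class, byte-order conversion must be handled during the data exchange.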
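One way to follow point 2 without checking the byte order of either machine is to build the wire bytes with shifts, so the result is big-endian on any host. This is a sketch of a technique the article does not itself use; the names `store_be32` and `load_be32` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: writes `value` to out[0..3], most significant byte
// first (network byte order), whatever the host byte order is.
void store_be32(unsigned char* out, std::uint32_t value) {
    assert(out != nullptr);
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(value & 0xFF);
}

// Hypothetical helper: reads four bytes stored most significant byte first
// and returns them as a host-order value.
std::uint32_t load_be32(const unsigned char* in) {
    assert(in != nullptr);
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Storing 0x12345678 always produces the bytes 12 34 56 78, and `load_be32` returns the original value on either kind of machine.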

posted on 2012-02-11 11:29 draeag
