WAB File Format Analysis
Quoted from http://www.vbcity.com/forums/topic.asp?tid=46944&highlight=read%7Cwab
The path of the Windows Address Book (wab) file used by Outlook Express can be found in the registry at: HKEY_CURRENT_USER\Software\Microsoft\WAB\WAB4\Wab File Name.
Here's what I've managed to figure out about the wab file format, looking at it with a hex editor (first byte of the file is byte zero).
Byte #:
32-35 ($20-$23) - Location (Address) in the file of the start of the table that tells where the information for each contact is located (I'll refer to it as the Contact Data Table).
36 ($24) - Number of entries in the Contact Data Table.
48-51 ($30-$33) - Location in the file of the list of Display Names for the persons (contacts) in the address book.
52 ($34) - Number of Display Names in the file.
64-67 ($40-$43) - Location in the file of the list of Last Names for the contacts.
68 ($44) - Number of Last Names in the file.
80-83 ($50-$53) - Location in the file of the list of First Names for the contacts.
84 ($54) - Number of First Names in the file.
96-99 ($60-$63) - Location in the file of the list of E-Mail Addresses.
100 ($64) - Number of E-Mail Addresses in the file.
If you're trying to read the file in binary mode using VB, you'll need to add one to each of the values above, since VB considers the first byte of a file to be byte one, not byte zero.
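As an illustration of the header layout listed above, here is a minimal Python sketch that pulls out each location and count. The function name `read_header` and the field names are my own, and the code assumes the 4-byte locations are little-endian unsigned integers, which the notes above do not state. Python offsets are zero-based, so the byte numbers are used as given.

```python
import struct

# Header layout as described in the notes above (zero-based byte offsets).
# Each entry: name -> (offset of 4-byte location, offset of 1-byte count).
HEADER_FIELDS = {
    "contact_data_table": (0x20, 0x24),
    "display_names": (0x30, 0x34),
    "last_names": (0x40, 0x44),
    "first_names": (0x50, 0x54),
    "email_addresses": (0x60, 0x64),
}


def read_header(data: bytes) -> dict:
    """Return {name: (location, count)} for each list named in the notes."""
    result = {}
    for name, (loc_off, count_off) in HEADER_FIELDS.items():
        # Assumption: locations are 32-bit little-endian unsigned integers.
        (location,) = struct.unpack_from("<I", data, loc_off)
        # The notes list a single byte for each count, so read one byte.
        count = data[count_off]
        result[name] = (location, count)
    return result
```

To try it, read the whole file with `open(path, "rb").read()` and pass the bytes to `read_header`. If the counts look too small for a large address book, the count field may be wider than the single byte listed above.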
Format of the Contact Data Table (location as specified above):
Each entry in this table occupies 8 bytes. The first four bytes of each entry are what we'll call the Contact Index, a value that uniquely identifies each contact in the address book. The 5th through 8th bytes are the location (address) in the file where the Contact Data Record (Home Address, Home Phone, Work Phone, etc.) can be found. The exact format of the Contact Data Record is not something I've figured out yet.
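The 8-byte entries described above could be read roughly as follows. This is only a sketch: `read_contact_data_table` is a made-up name, and it assumes both 4-byte fields are little-endian unsigned integers. The `location` and `count` arguments are the values found at bytes 32-35 and 36 of the file.

```python
import struct


def read_contact_data_table(data: bytes, location: int, count: int) -> dict:
    """Map each Contact Index to the file offset of its Contact Data Record."""
    table = {}
    for i in range(count):
        # Assumption: both fields are 32-bit little-endian unsigned integers.
        contact_index, record_offset = struct.unpack_from(
            "<II", data, location + 8 * i
        )
        table[contact_index] = record_offset
    return table
```

The result only tells you where each Contact Data Record starts. Decoding the record itself is still an open question.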
Format of the other lists above (Display Name, Last Name, First Name, and E-Mail Address):
Each entry in one of these lists occupies 68 bytes. The text is stored as Unicode, meaning there are two bytes for each character, with the upper byte being zero (at least in English language versions of Windows). By reading every other byte, you can get the ASCII values that represent each character in an entry in the list. When you come to a byte pair where the low-order byte is zero, you've reached the end of the text for that entry. The 65th to 68th bytes of each entry are the Contact Index. By matching up Contact Index values between lists, you can tell which entry in one list goes with an entry in another (e.g., which Display Name belongs with which E-Mail Address). You can also match the Contact Index in these lists to the one in the Contact Data Table to determine where the rest of the contact's information is located in the file.
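Here is a rough Python sketch of reading one of these lists and matching entries by Contact Index. The names `read_text_list` and `pair_names_with_emails` are hypothetical. The code assumes the text is little-endian UTF-16 and that the Contact Index is a 32-bit little-endian integer. Decoding as UTF-16 instead of taking every other byte also copes with non-English names.

```python
import struct

ENTRY_SIZE = 68  # bytes per list entry, per the notes above
TEXT_SIZE = 64   # first 64 bytes hold the text; the last 4 hold the Contact Index


def read_text_list(data: bytes, location: int, count: int) -> list:
    """Return a list of (contact_index, text) tuples for one name/e-mail list."""
    entries = []
    for i in range(count):
        start = location + ENTRY_SIZE * i
        raw = data[start:start + TEXT_SIZE]
        # Assumption: text is UTF-16LE; stop at the first NUL character.
        text = raw.decode("utf-16-le", errors="replace").split("\x00", 1)[0]
        # Assumption: the Contact Index is a 32-bit little-endian integer.
        (contact_index,) = struct.unpack_from("<I", data, start + TEXT_SIZE)
        entries.append((contact_index, text))
    return entries


def pair_names_with_emails(names: list, emails: list) -> list:
    """Match Display Names to E-Mail Addresses that share a Contact Index."""
    name_by_index = dict(names)
    return [
        (name_by_index[idx], address)
        for idx, address in emails
        if idx in name_by_index
    ]
```

Keeping the entries as a list of tuples means a Contact Index that shows up more than once in the e-mail list is not silently dropped.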
Recent versions of Outlook Express allow you to store multiple address books (or the address book for multiple Outlook Express identities) in one wab file. I haven't yet figured out how to separate the data by identity. Using the information above, you'll end up getting the data for every contact that exists in the file, regardless of what address book or identity it belongs to.
I have a VB6 project (MAPIMail) that uses what I've found out about the wab file to read the Display Names and E-Mail Addresses for each contact. It also uses the Microsoft MAPI Controls (MAPISession and MAPIMessages) to create an e-mail client that allows you to read and send e-mails, as well.
Since I haven't yet figured out the entire format of the Contact Data Record, I can't get at information like addresses and phone numbers by reading them directly from the wab file. The only way I've figured out to get this information is using another VB6 project I've created (AddrBook.vbp) that opens Microsoft's Address Book reader (wab.exe), then uses API calls to force it to export the address book as a comma-separated-value (.csv) file. My program then reads this file and displays the information in a ListView. A hokey method of getting at the data, I'll admit, but the best I've been able to do, so far.
My programs were tested on a computer running Windows 2000 SP4 with Outlook Express 6.
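Sigh. In fact, all of this data can be obtained through Microsoft's API. The other information, though, such as company, phone, and home details, is stored in a rather messy format. It basically all sits at the end of the file, but I haven't found a pattern for where exactly it starts.
In addition, the address book makes a backup every time it is modified. That mechanism, at least, I have worked out.
Offset 0x8A8 records the starting address of the valid data at the end of the file.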