Waver's blog

A heart as calm as autumn

This code sample shows how to convert UTF-8 byte sequences (code page 65001) into Unicode strings and back again in Visual Basic 6, rather than in .NET like most of the examples I could find. What's more, this sample does not use API calls; instead, it relies on the Stream object provided by the ADODB library. This may not be the most efficient way of doing it, but these functions can easily be ported to Classic ASP by dropping all the variable type declarations.

For this code to work, you will need to add a reference to the Microsoft ActiveX Data Objects 2.5 Library. Later versions of this library will also work.

The first function converts a Unicode string to a byte array:
' accept a string
' and convert it to a byte array containing utf-8 data
Public Function ConvertStringToUtf8Bytes(ByRef strText As String) As Byte()
    Dim objStream As ADODB.Stream
    Dim data() As Byte
    ' init stream
    Set objStream = New ADODB.Stream
    objStream.Charset = "utf-8"
    objStream.Mode = adModeReadWrite
    objStream.Type = adTypeText
    objStream.Open
    ' write text into stream
    objStream.WriteText strText
    objStream.Flush
    ' rewind stream and read bytes
    objStream.Position = 0
    objStream.Type = adTypeBinary
    objStream.Read 3 ' skip first 3 bytes as this is the utf-8 marker
    data = objStream.Read()
    ' close up and return
    objStream.Close
    ConvertStringToUtf8Bytes = data
End Function
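The Read 3 call above assumes that the stream always starts its output with the three-byte UTF-8 marker (EF BB BF). As a rough sketch, a small helper like the one below could check a byte array for that marker, for example on data that comes from somewhere else. HasUtf8Bom is a hypothetical name and is not part of the original project.

```vb
' Hypothetical helper (not from the original article):
' returns True if the array starts with the UTF-8 marker EF BB BF
Public Function HasUtf8Bom(ByRef data() As Byte) As Boolean
    Dim lngFirst As Long
    ' UBound raises an error on an unallocated array, so treat that as "no marker"
    On Error GoTo NoData
    lngFirst = LBound(data)
    If UBound(data) - lngFirst >= 2 Then
        HasUtf8Bom = (data(lngFirst) = &HEF And _
                      data(lngFirst + 1) = &HBB And _
                      data(lngFirst + 2) = &HBF)
    End If
    Exit Function
NoData:
    HasUtf8Bom = False
End Function
```

Since ConvertStringToUtf8Bytes has already dropped the marker, its return value should make this helper return False.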
This second function does the opposite, converting a byte array into a Unicode string:
' accept a byte array containing utf-8 data
' and convert it to a string
Public Function ConvertUtf8BytesToString(ByRef data() As Byte) As String
    Dim objStream As ADODB.Stream
    Dim strTmp As String
    ' init stream
    Set objStream = New ADODB.Stream
    objStream.Charset = "utf-8"
    objStream.Mode = adModeReadWrite
    objStream.Type = adTypeBinary
    objStream.Open
    ' write bytes into stream
    objStream.Write data
    objStream.Flush
    ' rewind stream and read text
    objStream.Position = 0
    objStream.Type = adTypeText
    strTmp = objStream.ReadText
    ' close up and return
    objStream.Close
    ConvertUtf8BytesToString = strTmp
End Function
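As a rough illustration of the Classic ASP port mentioned at the start, here is the same function with the variable types dropped. This is an untested sketch. Classic ASP has no project references, so it assumes the stream is created late-bound with Server.CreateObject and that the ADO constants are declared by hand, using the values from the ADO type library.

```vb
' Classic ASP (VBScript) sketch of ConvertUtf8BytesToString
' Assumption: constants are declared manually because there is no project reference
Const adTypeBinary = 1
Const adTypeText = 2
Const adModeReadWrite = 3

Function ConvertUtf8BytesToString(data)
    Dim objStream, strTmp
    ' init stream (late-bound instead of New ADODB.Stream)
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Charset = "utf-8"
    objStream.Mode = adModeReadWrite
    objStream.Type = adTypeBinary
    objStream.Open
    ' write bytes into stream
    objStream.Write data
    objStream.Flush
    ' rewind stream and read text
    objStream.Position = 0
    objStream.Type = adTypeText
    strTmp = objStream.ReadText
    ' close up and return
    objStream.Close
    Set objStream = Nothing
    ConvertUtf8BytesToString = strTmp
End Function
```

In VBScript the data argument has to be a Variant holding a byte array, such as the value returned by Request.BinaryRead.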
This test method uses a function called DecodeBase64 which is defined in this article: Free, Easy and Quick Base64 Encoding and Decoding in Visual Basic.
Public Sub Main()
    Dim strB64 As String
    Dim data() As Byte
    Dim strTmp As String
    ' define test data as base64 and decode to array of bytes
    strB64 = "R3JlZXRpbmdzIGFuZCBTYWx1dGF0aW9uISAo4oKsKSBhbmQgc29"
    strB64 = strB64 & "tZSBVcmR1OiDaqdix2KfahtuMINm+2Kfaqdiz2KrYp9mG24w="
    data = DecodeBase64(strB64)
    ' convert from utf-8 to string
    strTmp = ConvertUtf8BytesToString(data)
    ' convert back to bytes
    data = ConvertStringToUtf8Bytes(strTmp)
End Sub
Please note that the VB6 IDE and the standard VB6 form controls have difficulty showing Unicode characters, and will display characters outside the system's ANSI code page as '?????'.
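Because the IDE cannot display these characters, one way to check a converted string is to print its character codes instead. The helper below is a minimal sketch of that idea. DumpCodeUnits is a hypothetical name and is not part of the original project.

```vb
' Hypothetical helper (not from the original article):
' list the UTF-16 code units of a string as 4-digit hex values
Public Function DumpCodeUnits(ByVal strText As String) As String
    Dim i As Long
    Dim lngCode As Long
    Dim strOut As String
    For i = 1 To Len(strText)
        ' AscW returns a signed Integer, so mask it into the range 0..65535
        lngCode = AscW(Mid$(strText, i, 1)) And &HFFFF&
        strOut = strOut & Right$("000" & Hex$(lngCode), 4) & " "
    Next i
    DumpCodeUnits = RTrim$(strOut)
End Function
```

Calling Debug.Print DumpCodeUnits(strTmp) at the end of Main lists the codes in the Immediate window. For the test data above, the euro sign should appear as 20AC even though the IDE would render the character itself as '?'.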

The code from this article can be downloaded as a VB6 project here: NonHostile_VB6_Convert_UTF8.zip

Hope this helps :-)
posted on 2008-08-28 22:43  Waver