Write a JavaScript function that computes the byte length of a string
// Returns the number of bytes `str` occupies when encoded as UTF-8.
function getByteLength(str) {
  let byteLength = 0;
  for (let i = 0; i < str.length; i++) {
    // codePointAt combines a surrogate pair into one code point;
    // charCodeAt would return each half separately and never exceed 0xffff.
    const codePoint = str.codePointAt(i);
    if (codePoint <= 0x007f) {
      byteLength += 1;
    } else if (codePoint <= 0x07ff) {
      byteLength += 2;
    } else if (codePoint <= 0xffff) {
      byteLength += 3;
    } else {
      byteLength += 4; // For characters beyond BMP (Basic Multilingual Plane)
      i++; // Skip the low surrogate of the pair just counted
    }
  }
  return byteLength;
}
// Example usage:
const str1 = "hello";
const str2 = "你好世界";
const str3 = "emoji: 😂";
const str4 = "mixed: 你好😂world";
console.log(`"${str1}" byte length:`, getByteLength(str1)); // Output: 5
console.log(`"${str2}" byte length:`, getByteLength(str2)); // Output: 12
console.log(`"${str3}" byte length:`, getByteLength(str3)); // Output: 11
console.log(`"${str4}" byte length:`, getByteLength(str4)); // Output: 22
// --- Blob method for larger strings (no manual loop; the runtime encodes to UTF-8) ---
function getByteLengthBlob(str) {
  const blob = new Blob([str]);
  return blob.size;
}
console.log(`"${str1}" byte length (Blob):`, getByteLengthBlob(str1)); // Output: 5
console.log(`"${str2}" byte length (Blob):`, getByteLengthBlob(str2)); // Output: 12
console.log(`"${str3}" byte length (Blob):`, getByteLengthBlob(str3)); // Output: 11
console.log(`"${str4}" byte length (Blob):`, getByteLengthBlob(str4)); // Output: 22
// --- TextEncoder API (most modern and generally recommended) ---
function getByteLengthEncoder(str) {
  const encoder = new TextEncoder(); // TextEncoder always encodes to UTF-8
  const encoded = encoder.encode(str); // Uint8Array of the encoded bytes
  return encoded.length;
}
console.log(`"${str1}" byte length (TextEncoder):`, getByteLengthEncoder(str1)); // Output: 5
console.log(`"${str2}" byte length (TextEncoder):`, getByteLengthEncoder(str2)); // Output: 12
console.log(`"${str3}" byte length (TextEncoder):`, getByteLengthEncoder(str3)); // Output: 11
console.log(`"${str4}" byte length (TextEncoder):`, getByteLengthEncoder(str4)); // Output: 22
Explanation and Improvements:
- UTF-8 Handling: UTF-8 encodes each character in 1 to 4 bytes, and getByteLength() picks the length from the character's numeric value. JavaScript strings are UTF-16, so a character beyond the BMP (such as an emoji) is stored as a surrogate pair; it must be counted once as 4 bytes, not as two separate 3-byte units.
- Blob Method: Creating a Blob and reading its size property lets the runtime do the UTF-8 encoding instead of a hand-written loop, which can be faster for larger strings. I've included this as getByteLengthBlob().
- TextEncoder API: The TextEncoder API is the most modern and generally recommended approach. It's efficient and specifically designed for encoding text to UTF-8 bytes. I've added this as getByteLengthEncoder(). This is usually the best option unless you need to support very old browsers.
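The surrogate-pair point above can be checked directly. The following is a minimal sketch, assuming an environment where TextEncoder is available as a global (modern browsers and recent Node.js); the variable names are only illustrative.

```javascript
// U+1F602 (the "face with tears of joy" emoji), written as an escape
// so the example does not depend on the file's encoding.
const emoji = "\u{1F602}";

// JavaScript strings are sequences of UTF-16 code units, so this is 2, not 1.
const unitCount = emoji.length;

// charCodeAt sees each half of the surrogate pair separately.
const high = emoji.charCodeAt(0); // 0xD83D (high surrogate)
const low = emoji.charCodeAt(1);  // 0xDE02 (low surrogate)

// codePointAt(0) combines the pair into the actual code point.
const codePoint = emoji.codePointAt(0); // 0x1F602

// The UTF-8 encoding of that single code point is 4 bytes.
const utf8Bytes = new TextEncoder().encode(emoji).length;

console.log(unitCount, high.toString(16), low.toString(16), codePoint.toString(16), utf8Bytes);
```

A loop that looks only at charCodeAt values would treat each half as a separate 3-byte character and report 6 bytes for this emoji instead of 4.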
Which method to use:
- For simple use cases and maximum compatibility, the first getByteLength() function is fine.
- For larger strings where you want to avoid a manual loop, use the getByteLengthBlob() method.
- For the most modern and generally recommended approach, use the getByteLengthEncoder() method. It's efficient and well-supported in modern browsers.
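One way to combine these recommendations is to prefer TextEncoder when it exists and fall back to a manual count otherwise. This is only a sketch: utf8ByteLength and countUtf8BytesManually are hypothetical names introduced here, and the fallback uses charCodeAt with explicit surrogate checks on the assumption that an environment without TextEncoder may also lack codePointAt.

```javascript
// Hypothetical fallback: counts UTF-8 bytes using only charCodeAt.
function countUtf8BytesManually(str) {
  let bytes = 0;
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit <= 0x7f) {
      bytes += 1;
    } else if (unit <= 0x7ff) {
      bytes += 2;
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // Valid surrogate pair: one code point beyond the BMP
        i++;        // Skip the low surrogate
      } else {
        bytes += 3; // Lone high surrogate: encoders emit U+FFFD (3 bytes)
      }
    } else {
      bytes += 3; // Rest of the BMP, including other lone surrogates
    }
  }
  return bytes;
}

// Hypothetical wrapper: use TextEncoder when the environment provides it.
function utf8ByteLength(str) {
  if (typeof TextEncoder !== "undefined") {
    return new TextEncoder().encode(str).length;
  }
  return countUtf8BytesManually(str);
}

console.log(utf8ByteLength("hello"));
console.log(countUtf8BytesManually("hello"));
```

Both paths are meant to return the same number for any input, so the wrapper can be swapped in wherever one of the individual functions was used.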