String vs &str in Rust
Most likely, soon after you’ve started your Rust journey, you ran into a scenario where you tried to work with string types (or should I say, you thought you were?), and the compiler refused to compile your code because something that looks like a string actually isn’t a string.
For example, let’s take a look at this super simple function `greet(name: String)`, which takes something of type `String` and prints it to the screen using the `println!()` macro:
    fn main() {
        let my_name = "Pascal";
        greet(my_name);
    }

    fn greet(name: String) {
        println!("Hello, {}!", name);
    }
Compiling this code will result in a compile error that looks something like this:
    error[E0308]: mismatched types
     --> src/main.rs:3:11
      |
    3 |     greet(my_name);
      |           ^^^^^^^
      |           |
      |           expected struct `std::string::String`, found `&str`
      |           help: try using a conversion method: `my_name.to_string()`

    error: aborting due to previous error

    For more information about this error, try `rustc --explain E0308`.
You can see this behaviour for yourself: run the code (for example in the Rust Playground, by hitting the “Run” button) and look at the compiler output.
Luckily, Rust’s compiler is very good at telling us what the problem is. Clearly, we’re dealing with two different types here: `std::string::String`, or `String` for short, and `&str`. While `greet()` expects a `String`, what we’re actually passing to the function is something of type `&str`. The compiler even provides a hint on how to fix it. Changing line 2 (the `let` binding) to `let my_name = "Pascal".to_string();` fixes the issue.
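As a small sketch of that fix, here are the conversions from `&str` to `String` that the standard library offers. Note that this `greet` is a variation of the function above that returns the greeting instead of printing it, purely so the result can be checked; that change is an assumption made for illustration.

```rust
// Variation of the article's `greet`: returns the text instead of printing it.
fn greet(name: String) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let my_name: &str = "Pascal";

    // Four equivalent ways to get an owned `String` from a `&str`.
    let a = my_name.to_string();    // via the `ToString` trait
    let b = String::from(my_name);  // via `From<&str>`
    let c = my_name.to_owned();     // via `ToOwned`
    let d: String = my_name.into(); // via `Into<String>`

    assert_eq!(a, b);
    assert_eq!(c, d);

    println!("{}", greet(a));
}
```

All four allocate a new heap buffer and copy the text into it; which one to use is mostly a matter of taste.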
What’s going on here? What is a `&str`? And why do we have to perform an explicit conversion using `to_string()`?
Understanding the `String` type
To answer these questions, it’s beneficial to have a good understanding of how Rust stores data in memory. If you haven’t read our article Taking a closer look at Ownership in Rust yet, I highly recommend checking it out first.
Let’s take the example from above and look at how `my_name` is stored in memory, assuming that it’s of type `String` (i.e. we’ve used `.to_string()` as the compiler suggested):
                         buffer
                       /   capacity
                     /   /  length
                   /   /   /
                +–––+–––+–––+
    stack frame │ • │ 8 │ 6 │ <- my_name: String
                +–│–+–––+–––+
                  │
                [–│–––––––– capacity –––––––––––]
                  │
                +–V–+–––+–––+–––+–––+–––+–––+–––+
           heap │ P │ a │ s │ c │ a │ l │   │   │
                +–––+–––+–––+–––+–––+–––+–––+–––+

                [––––––– length ––––––––]
Rust will store the `String` object for `my_name` on the stack. The object holds a pointer to a heap-allocated buffer containing the actual data, the buffer’s capacity, and the length of the data currently stored. Given this, the size of the `String` object itself is always fixed and three words long.
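The three-word layout can be checked with a minimal sketch like the one below. The helper `len_and_capacity` is a hypothetical name introduced here, and the exact capacity an allocation ends up with is an implementation detail, so the code only prints it rather than assuming a value.

```rust
use std::mem::size_of;

/// Returns (length in bytes, capacity in bytes) of a `String`.
/// Takes `&String` rather than `&str` because only `String` has a capacity.
fn len_and_capacity(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    // Pointer + capacity + length: three machine words with current Rust.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());

    let my_name = "Pascal".to_string();
    let (len, cap) = len_and_capacity(&my_name);
    println!("len = {}, capacity = {}", len, cap);

    // The buffer is always at least as large as the data stored in it.
    assert!(cap >= len);
}
```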
One of the things that makes a `String` a `String` is the capability of resizing its buffer if needed. For example, we could use its `.push_str()` method to append more text, which potentially causes the underlying buffer to grow (notice that `my_name` needs to be mutable to make this work):
    let mut my_name = "Pascal".to_string();
    my_name.push_str(" Precht");
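To watch the buffer grow, a sketch along these lines prints the capacity before and after appending. The helper `full_name` is hypothetical, and the concrete capacities are chosen by the allocator strategy, so no specific numbers are assumed.

```rust
/// Builds "first last" by appending to an owned, growable buffer.
fn full_name(first: &str, last: &str) -> String {
    let mut name = first.to_string();
    name.push(' ');
    name.push_str(last);
    name
}

fn main() {
    let mut my_name = "Pascal".to_string();
    let before = my_name.capacity();

    my_name.push_str(" Precht");
    let after = my_name.capacity();

    println!("capacity before: {}, after: {}", before, after);

    // 13 bytes of text now live in the buffer, so it must hold at least that.
    assert_eq!(my_name.len(), 13);
    assert!(after >= my_name.len());
    assert_eq!(full_name("Pascal", "Precht"), my_name);
}
```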
In fact, if you’re familiar with Rust’s `Vec<T>` type, you already know what a `String` is, because it’s essentially a `Vec<u8>` in behaviour and characteristics, the only difference being that it guarantees to hold only well-formed UTF-8 text.
Understanding string slices
String slices (or `str`) are what we work with when we either reference a range of UTF-8 text that is “owned” by someone else, or when we create them using string literals.
If we were only interested in the last name stored in `my_name`, we could get a reference to that part of the string like this:
    let mut my_name = "Pascal".to_string();
    my_name.push_str(" Precht");

    let last_name = &my_name[7..];
By specifying the range from byte index 7 (index 6 is the whitespace) until the end of the buffer (`..`), `last_name` is now a string slice referencing text owned by `my_name`. It borrows it. Here’s what it looks like in memory:
                my_name: String   last_name: &str
                [––––––––––––]    [–––––––]
                +–––+––––+––––+–––+–––+–––+
    stack frame │ • │ 16 │ 13 │   │ • │ 6 │
                +–│–+––––+––––+–––+–│–+–––+
                  │                 │
                  │                 +–––––––––+
                  │                           │
                  │                           │
                  │                         [–│––––––– str –––––––––]
                +–V–+–––+–––+–––+–––+–––+–––+–V–+–––+–––+–––+–––+–––+–––+–––+–––+
           heap │ P │ a │ s │ c │ a │ l │   │ P │ r │ e │ c │ h │ t │   │   │   │
                +–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+
Notice that `last_name` does not store capacity information on the stack. This is because it’s just a reference to a slice of another `String` that manages its capacity. The string slice, or `str` itself, is what’s considered “unsized”. Also, in practice string slices are almost always used behind a reference, so their type will be `&str` instead of `str`.
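The two-word shape of a `&str`, and the fact that it points into the owner’s buffer, can be sketched as follows. The helpers `offset_within` and `last_name` are hypothetical names introduced for illustration. The sketch also uses `str::get`, which returns `None` instead of panicking when a range does not fall on a character boundary, something that matters because slice indices count bytes, not characters.

```rust
use std::mem::size_of;

/// Byte offset at which `slice` starts inside `owner`'s buffer.
/// Assumes `slice` really was borrowed from `owner`.
fn offset_within(owner: &str, slice: &str) -> usize {
    slice.as_ptr() as usize - owner.as_ptr() as usize
}

/// Returns everything after the first space, or `None` if there is no space.
fn last_name(full: &str) -> Option<&str> {
    let idx = full.find(' ')?; // byte index of the space
    full.get(idx + 1..)        // non-panicking version of `&full[idx + 1..]`
}

fn main() {
    // A `&str` is a pointer plus a length: two machine words, no capacity.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    let my_name = "Pascal Precht".to_string();
    let last = &my_name[7..];

    assert_eq!(last, "Precht");
    assert_eq!(last.len(), 6);
    // The slice points 7 bytes into the buffer owned by `my_name`.
    assert_eq!(offset_within(&my_name, last), 7);
    assert_eq!(last_name(&my_name), Some("Precht"));

    // "ë" takes two bytes, so byte index 3 is in the middle of a character.
    assert_eq!("Zoë".len(), 4);
    assert_eq!("Zoë".get(0..3), None);
    assert_eq!("Zoë".get(0..2), Some("Zo"));
}
```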
Okay, this explains the difference between `String`, `&String`, `str` and `&str`, but we haven’t actually created such a reference in our original example, have we?
Understanding string literals
As mentioned earlier, there are two cases in which we’re working with string slices: we either create a reference to a substring, or we use string literals.
A string literal is created by surrounding text with double quotes, just like we did earlier:
    let my_name = "Pascal Precht"; // This is a `&str`, not a `String`
The next question is: if a `&str` is a slice reference to a `String` owned by someone else, who is the owner of that value, given that the text is created in place?

It turns out that string literals are a bit special. They are string slices that refer to “preallocated text” that is stored in read-only memory as part of the executable. In other words, it’s memory that ships with our program and doesn’t rely on buffers allocated in the heap.

That said, there’s still an entry on the stack that points to that preallocated memory when the program is executed:
                my_name: &str
                [–––––––––––]
                +–––+–––+
    stack frame │ • │ 6 │
                +–│–+–––+
                  │
                  +––+
                     │
     preallocated  +–V–+–––+–––+–––+–––+–––+
     read-only     │ P │ a │ s │ c │ a │ l │
     memory        +–––+–––+–––+–––+–––+–––+
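Because the text of a literal lives for the whole run of the program, its type is more precisely `&'static str`. A small sketch of what that allows, using a hypothetical function `default_name`:

```rust
/// A string literal has the `'static` lifetime, so a function can return one
/// without borrowing from any of its arguments.
fn default_name() -> &'static str {
    "Pascal"
}

fn main() {
    let name: &'static str = default_name();

    // Converting to `String` copies the text into a fresh heap buffer,
    // so the two values point at different memory.
    let owned: String = name.to_string();
    assert_eq!(name, owned);
    assert_ne!(name.as_ptr(), owned.as_ptr());

    println!("{} / {}", name, owned);
}
```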
With a better understanding of the difference between `String` and `&str`, there’s probably another question that comes up.
Which one should be used?
Obviously, this depends on a number of factors, but generally it’s safe to say that if the API we’re building doesn’t need to own or mutate the text it’s working with, it should take a `&str` instead of a `String`. This means an improved version of the original `greet()` function would look like this:
    fn greet(name: &str) {
        println!("Hello, {}!", name);
    }
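For contrast, here is a sketch of the opposite case, where an API does need to own the text. The `Person` type is hypothetical and not part of the original example; it only illustrates the rule of thumb above.

```rust
// Hypothetical type for illustration.
struct Person {
    name: String, // the struct owns its text, so it stores a `String`
}

impl Person {
    /// Needs to keep the text around, so it takes ownership of a `String`.
    fn new(name: String) -> Self {
        Person { name }
    }

    /// Only reads the text, so handing out a `&str` is enough.
    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    // A caller holding a literal converts explicitly...
    let from_literal = Person::new("Pascal".to_string());

    // ...while a caller holding a `String` just moves it in, without copying.
    let owned = String::from("Precht");
    let from_owned = Person::new(owned);

    println!("{} {}", from_literal.name(), from_owned.name());
}
```

A function like `greet(name: &str)` that only reads the text needs none of this.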
Wait, but what if the caller of this API really only has a `String` and can’t convert it to a `&str` for unknown reasons? No problem at all. Rust has this super powerful feature called deref coercion, which allows it to turn any reference to a `String` created with the borrow operator, i.e. a `&String`, into a `&str` before the API is executed. This will be covered in more detail in another article.
Our `greet()` function therefore will work with the following code:
    fn main() {
        let first_name = "Pascal";
        let last_name = "Precht".to_string();

        greet(first_name);
        greet(&last_name); // `last_name` is passed by reference
    }

    fn greet(name: &str) {
        println!("Hello, {}!", name);
    }
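To make the coercion a bit more tangible, the sketch below spells out the explicit alternatives next to the implicit one. The function `greeting` is a hypothetical variation of `greet()` that returns the text instead of printing it, so the results can be compared.

```rust
/// Variation of `greet` that returns the greeting instead of printing it.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let last_name = "Precht".to_string();
    let boxed = Box::new("Pascal".to_string());

    let a = greeting(&last_name);         // &String -> &str via deref coercion
    let b = greeting(last_name.as_str()); // explicit conversion
    let c = greeting(&last_name[..]);     // explicit full-range slice
    let d = greeting(&boxed);             // &Box<String> -> &String -> &str

    assert_eq!(a, b);
    assert_eq!(b, c);
    assert_eq!(d, "Hello, Pascal!");
}
```

The coercion is applied as many times as needed, which is why a reference to a `Box<String>` works as well.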
Run it yourself to see it in action!
That’s it! I hope this article was useful. There’s an interesting discussion on Reddit about this content as well! Let me know what you think or what you would like to learn about next on Twitter, or sign up for the Rust For JavaScript Developers mailing list!