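C++ lvalues, rvalues, lvalue references, rvalue references, and move semantics

Lvalues and rvalues

C++ expressions now fall into several value categories: lvalue, xvalue, glvalue, rvalue, and prvalue. For the precise definitions, see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf, which describes them as follows: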

An lvalue (so-called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue.]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references. [Example: The result of calling a function whose return type is an rvalue reference is an xvalue.]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so-called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a temporary object or subobject thereof, or a value that is not associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [Example: The result of calling a function whose return type is not a reference is a prvalue]

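For a more detailed explanation, see https://en.cppreference.com/w/cpp/language/value_category. In practice the two categories that matter most are lvalues and rvalues. As a simple (but imprecise) rule of thumb: an lvalue is a variable whose memory address can be taken with the address-of operator &; an rvalue has no name, is just a value that lives in memory (or in a register), and cannot have its address taken with &. For example:

int a = 1 + 2; // a is a variable, so it is an lvalue; 1 and 2 have no names, so they are rvalues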
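
The rule of thumb above can be checked at compile time. The following is only a sketch: the helper category() and the three wrapper functions are hypothetical names introduced here, and the approach relies on the fact that decltype((expr)) yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: maps the type produced by decltype((expr)) to a category name.
// decltype((expr)) is T& for an lvalue, T&& for an xvalue, and plain T for a prvalue.
template <typename T>
std::string category() {
    if (std::is_lvalue_reference<T>::value) return "lvalue";
    if (std::is_rvalue_reference<T>::value) return "xvalue";
    return "prvalue";
}

// A named variable is an lvalue.
std::string category_of_variable() {
    int a = 0;
    (void)a;
    return category<decltype((a))>();
}

// The result of 1 + 2 has no name and no identity: a prvalue.
std::string category_of_literal_sum() {
    return category<decltype((1 + 2))>();
}

// std::move(a) returns an rvalue reference, so the call expression is an xvalue.
std::string category_of_moved() {
    int a = 0;
    (void)a;
    return category<decltype((std::move(a)))>();
}
```

Both xvalues and prvalues are rvalues, which is why the simplified lvalue/rvalue split is usually enough in everyday code.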

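References

Put simply, a reference is a variable name bound to a value. There are two kinds: lvalue references and rvalue references. For details, see https://en.cppreference.com/w/cpp/language/reference_initialization. For example: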

int a = 0;

int &b = a; // binds b to a: an lvalue reference
int &&c = 5; // binds c to the value 5: an rvalue reference

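Move semantics

std::move converts a variable to an rvalue reference type. It is nothing more than a static_cast to an rvalue reference type, resolved at compile time. See https://en.cppreference.com/w/cpp/utility/move. For example: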

int a = 0;

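// This is an error: a is an lvalue of type int, and an rvalue reference (int &&) cannot bind to an lvalue
// cannot bind rvalue reference of type ‘int&&’ to lvalue of type ‘int’
// int &&b = a;

int &&b = std::move(a); // std::move casts a to the rvalue reference type int&&, so the binding succeeds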
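
To make the "just a cast" point concrete, here is a minimal sketch (the function name move_is_just_a_cast is made up for illustration). It checks that std::move(a) has type int&& and that references obtained through std::move and through a plain static_cast both refer to the very same object.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Returns true if references obtained via std::move and via static_cast
// both alias the original variable.
bool move_is_just_a_cast() {
    int a = 0;
    static_assert(std::is_same<decltype(std::move(a)), int &&>::value,
                  "std::move(a) has type int&&");

    int &&b = std::move(a);            // cast via std::move
    int &&c = static_cast<int &&>(a);  // the equivalent explicit cast

    b = 42;  // writes through the reference into a
    return &b == &a && &c == &a && a == 42;
}
```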

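Common mistakes

The concepts above are worth spelling out because they are easy to get wrong in day-to-day use.

1. Rvalue references are faster than lvalue references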

class Test {};

Test t;

Test &t1 = t;
Test &&t2 = std::move(t);

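In the code above, t2 is no faster than t1. As noted earlier, std::move is really a static_cast that is fully resolved at compile time. t1 and t2 are both references; they differ only in type, not in efficiency.

2. After std::move, the original variable can no longer be used

Some people read std::move as "transfer" and assume that Test &&t2 = std::move(t); moves the contents of t into t2, so that t is no longer usable. It does not: t2 merely refers to t. std::move is a compile-time cast that does nothing at run time and moves nothing. What can leave the original variable unusable is the move constructor, as explained below.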
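
The difference can be observed directly. The sketch below uses hypothetical function names; std::unique_ptr is chosen for the second half only because its move constructor is guaranteed to leave the source null, which makes the effect easy to see.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Binding an rvalue reference via std::move does not touch the source object.
std::string source_after_reference_binding() {
    std::string s = "hello";
    std::string &&r = std::move(s);  // r is just another name for s
    (void)r;
    return s;  // s still holds "hello"
}

// Constructing a new object from std::move(p) runs the move constructor,
// and that constructor is what empties the source.
bool source_is_null_after_move_construction() {
    std::unique_ptr<int> p(new int(7));
    std::unique_ptr<int> q(std::move(p));  // move constructor takes ownership
    return p == nullptr && *q == 7;
}
```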

3. Rvalue references, lvalues of rvalue reference type, and binding to rvalue references

void set(int &&i)
{
}

int t = 0;
int &&t1 = std::move(t);

// error: cannot bind rvalue reference of type ‘int&&’ to lvalue of type ‘int’
set(t1);

set(1);

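Looking at the code above, it is natural to reason that t1 has type rvalue reference to int (int&&), that the parameter i of set is also an rvalue reference to int, and that set(t1) is therefore valid. Yet the compiler reports: error: cannot bind rvalue reference of type ‘int&&’ to lvalue of type ‘int’. The reasoning about the types is correct, but it overlooks how rvalue references are bound. In C++ only an rvalue can be bound to an rvalue reference, for example: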

int t = 0;
int &&t1 = t; // ERROR

int &&t2 = 0; // OK

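t1 above triggers the same error: cannot bind rvalue reference of type ‘int&&’ to lvalue of type ‘int’. An lvalue cannot be bound to an rvalue reference; this is a language rule that simply has to be memorized. Back to the set(t1) call: t1 does have type rvalue reference to int (int&&), but it is a named variable whose address can be taken with &, so it is an lvalue whose type is int&&. Being an lvalue, it cannot be bound to an rvalue reference (the parameter i of set), hence the error. std::move converts an lvalue to an rvalue, so set(std::move(t1)) is the correct call. For further discussion, see:

https://stackoverflow.com/questions/38604070/passing-rvalue-raises-cannot-bind-to-lvalue and https://isocpp.org/blog/2012/11/universal-references-in-c11-scott-meyers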
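
One way to see that a named rvalue reference is an lvalue is to let overload resolution decide. This is an illustrative sketch; which() and the three wrapper functions are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads that report which kind of argument they received.
std::string which(int &)  { return "lvalue overload"; }
std::string which(int &&) { return "rvalue overload"; }

// t1 has type int&&, but as a named variable it is an lvalue expression.
std::string call_with_named_rvalue_reference() {
    int t = 0;
    int &&t1 = std::move(t);
    return which(t1);
}

// std::move turns the named reference back into an rvalue.
std::string call_with_moved_reference() {
    int t = 0;
    int &&t1 = std::move(t);
    return which(std::move(t1));
}

// A literal is a prvalue and binds to int&& directly.
std::string call_with_literal() {
    return which(1);
}
```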

4. Rvalue references versus creating a new object with the move constructor

Test t;

Test &t1 = t;
Test &&t2 = std::move(t);

Test t3 = t;
Test t4 = std::move(t);

Test t5(t);
Test t6(std::move(t));

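In the code above, t2 creates an rvalue reference. This form is rarely seen in practice because it serves little purpose: an ordinary reference does the same job, and casting t to an rvalue only to bind an rvalue reference to it is redundant. t4, by contrast, constructs a new object through the move constructor, after which the original object t should no longer be used.

5. Move assignment operator and move constructor

The move assignment operator has the form class_name & class_name :: operator= ( class_name && ), and the move constructor has the form class_name ( class_name && ). Creating an object from an rvalue calls the move constructor; assigning an rvalue to an existing object calls the move assignment operator. For example: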

class Test {};

Test t;

Test t1 = std::move(t); // creates a new object: move constructor
t1 = std::move(t); // assigns to an existing object: move assignment operator

https://en.cppreference.com/w/cpp/language/move_assignment

https://en.cppreference.com/w/cpp/language/move_constructor
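
A small instrumented type shows which special member function is selected in each case. Tracker and the helper functions below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that records which special member function produced its state.
struct Tracker {
    std::string last = "default constructor";

    Tracker() = default;
    Tracker(const Tracker &) : last("copy constructor") {}
    Tracker(Tracker &&) noexcept : last("move constructor") {}
    Tracker &operator=(const Tracker &) { last = "copy assignment"; return *this; }
    Tracker &operator=(Tracker &&) noexcept { last = "move assignment"; return *this; }
};

// Creating a new object from an rvalue selects the move constructor.
std::string initialize_from_rvalue() {
    Tracker t;
    Tracker t1 = std::move(t);
    return t1.last;
}

// Assigning an rvalue to an existing object selects the move assignment operator.
std::string assign_from_rvalue() {
    Tracker t;
    Tracker t1;
    t1 = std::move(t);
    return t1.last;
}

// For comparison: creating a new object from an lvalue selects the copy constructor.
std::string initialize_from_lvalue() {
    Tracker t;
    Tracker t1 = t;
    return t1.last;
}
```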

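6. What happens to the source object after move assignment or move construction

Move assignment and move construction exist to reuse the source object's resources directly and avoid copies, which improves efficiency: Move constructors typically "steal" the resources held by the argument (e.g. pointers to dynamically-allocated objects, file descriptors, TCP sockets, I/O streams, running threads, etc.) rather than make copies of them. This raises a question: once its resources have been taken, what happens to the source object? Note that the copy constructor takes its parameter by const reference while the move constructor does not, which signals that the move constructor needs to modify the source object. Having taken over the source's resources, it must leave the source in a suitable state so that the source (in particular its destructor) does not operate on resources it no longer owns. For example, the move constructor of std::string in GCC's libstdc++ sets the length of the source to 0 (__str._M_set_length(0);).

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/basic_string.h (around line 545; the exact line varies between versions)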

      /**
       *  @brief  Move construct string.
       *  @param  __str  Source string.
       *
       *  The newly-created string contains the exact contents of @a __str.
       *  @a __str is a valid, but unspecified string.
       **/
      basic_string(basic_string&& __str) noexcept
      : _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator()))
      {
        if (__str._M_is_local())
          {
            traits_type::copy(_M_local_buf, __str._M_local_buf,
                              _S_local_capacity + 1);
          }
        else
          {
            _M_data(__str._M_data());
            _M_capacity(__str._M_allocated_capacity);
          }

        // Must use _M_length() here not _M_set_length() because
        // basic_stringbuf relies on writing into unallocated capacity so
        // we mess up the contents if we put a '\0' in the string.
        _M_length(__str.length());
        __str._M_data(__str._M_local_data());
        __str._M_set_length(0);
      }
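
The same pattern applies to user-defined types. The following is a minimal sketch of an owning buffer, not taken from any library; the class Buffer and the function move_transfers_ownership are hypothetical. The move constructor steals the pointer and then resets the source so that the source's destructor does not free the memory a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer used only for illustration.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }  // delete[] on nullptr is a no-op

    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    // Take over the allocation, then leave the source empty but destructible.
    Buffer(Buffer &&other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    std::size_t size() const { return size_; }
    const int *data() const { return data_; }

private:
    int *data_;
    std::size_t size_;
};

// Returns true if the allocation moved from a to b without being copied.
bool move_transfers_ownership() {
    Buffer a(1024);
    const int *original = a.data();
    Buffer b(std::move(a));
    return b.data() == original && b.size() == 1024 &&
           a.data() == nullptr && a.size() == 0;
}
```

Without the two assignments to other, both destructors would call delete[] on the same pointer.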

7. Move constructors for fundamental types

Fundamental types such as int have no move constructor, so int a = std::move(i) and int a = i produce the same result. Reference: "fundamental types don't have move constructors. Moves degrade to copies", https://stackoverflow.com/questions/14679605/do-built-in-types-have-move-semantics
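
A quick check of this behaviour, as a sketch with a hypothetical function name: after initializing a from std::move(i), the source i still holds its value.

```cpp
#include <cassert>
#include <utility>

// "Moving" an int is just a copy, so the source keeps its value.
int source_value_after_move() {
    int i = 5;
    int a = std::move(i);
    (void)a;
    return i;
}
```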

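8. Is it worth implementing a move constructor, and does it improve efficiency?

Not necessarily; it depends. Because fundamental types have no move constructor, a class whose members are all fundamental types gains nothing from a hand-written move constructor: moving each member is the same as copying it.

9. Does move solve the inefficiency of returning temporary objects?

Very often we need to return a temporary object, for example: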
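
As an illustration, consider a hypothetical struct Point with two int members. Its implicitly generated move constructor is trivial, and "moving" from it leaves the source unchanged, exactly like a copy.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type whose members are all fundamental types.
struct Point {
    int x;
    int y;
};

// The implicit move constructor is trivial: it copies the members bitwise.
static_assert(std::is_trivially_move_constructible<Point>::value,
              "Point's move constructor is trivial");

// Returns true if the source is unchanged after being moved from.
bool source_unchanged_after_move() {
    Point p{1, 2};
    Point q = std::move(p);
    return p.x == 1 && p.y == 2 && q.x == 1 && q.y == 2;
}
```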

class Test get()
{
    class Test t;
    // ...
    return t;      
}

class Test t1 = get();

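First, get constructs the object t. On return, a temporary object is constructed in memory, and then t1 is constructed from that temporary when it is initialized. That round trip costs three constructions and three destructions, where other languages typically need only one of each, so this mechanism is hardly efficient. Now that move can "move" the resources of one variable into another, is the problem solved? No: std::move and the move constructor do not solve it: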

class Test get()
{
    class Test t;
    // ...
    return t; // the usual form; not efficient
    return std::move(t); // alternative: casting t to an rvalue is pointless, and it also prevents copy elision
}

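// Trying to return an rvalue reference for efficiency does not work: t is a local object that is destroyed on return, so the reference dangles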
class Test &&get()
{
    class Test t;

    return std::move(t);
}

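class Test t1 = get();  // the usual form; not efficient
class Test &&t1 = get(); // binds a reference to the temporary instead of creating a t1 object; saves one construction, but two remain
class Test t1 = std::move(get()); // get() already returns an rvalue, so std::move is redundant
class Test &&t1 = std::move(get()); // std::move is redundant, and here it also leaves t1 dangling: the temporary's lifetime is not extended through the call

As you can see, no combination of std::move and the move constructor yields the expected single construction and single destruction. A move constructor can only take over those resources of another object that can be taken over (fundamental-type members cannot), so only objects holding a lot of such resources (a large allocated buffer, for instance) benefit. Move optimizes object copying, but it does not eliminate the temporary objects themselves. That does not mean the temporary-object problem is unsolvable; see Copy Elision below.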

10. Copy Elision

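Copy elision is closely related to RVO (Return Value Optimization). Many compilers implemented this optimization under the name RVO well before C++11, and the standard describes it as copy elision; since C++17 it is mandatory in some cases, such as initializing an object from a returned prvalue. Consider this example: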

#include <iostream>

class Test
{
public:
    Test()
    {
        std::cout << "construct" << std::endl;
    }

    ~Test()
    {
        std::cout << "destruct" << std::endl;
    }

    Test(const Test &other)
    {
        std::cout << "copy construct" << std::endl;
    }

    Test(Test &&other)
    {
        std::cout << "move construct" << std::endl;
    }
};

class Test get()
{
    class Test t;

    return t;
}

int main()
{
    class Test t1 = get();

    return 0;
}

$ g++ -fno-elide-constructors -O0 -g test.cpp 
$ ./a.out 
construct
move construct
destruct
move construct
destruct
destruct

$ g++ -O0 -g test.cpp 
$ ./a.out 
construct
destruct

$ gdb ./a.out 
get () at test.cpp:31
31	    return t;
(gdb) p &t
$1 = (Test *) 0x7fffffffde27
(gdb) s
32	}
(gdb) s
main () at test.cpp:38
38	    return 0;
(gdb) p &t1
$2 = (Test *) 0x7fffffffde27

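Without copy elision (-fno-elide-constructors), the program behaves as described earlier: three constructions and three destructions. With a normal build there is only one construction and one destruction, and gdb shows that the object t inside get and t1 in main have the same address. The compiler has optimized the code by constructing t directly in the storage of t1 in main, so no extra object is needed on return. Copy elision requires the compiler to know in advance which object will end up in the return slot so that it can construct it there directly; if it cannot determine that, the optimization is not performed, for example: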

class Test get(int i)
{
    if (0 == i)
    {
        class Test t0;
        return t0;
    }
    class Test t;
    return t;
}

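At compile time the compiler cannot tell whether t0 or t should be constructed at the address of the return value, so copy elision is not applied here. Reference: https://en.cppreference.com/w/cpp/language/copy_elision

These are conclusions I drew from problems I ran into in day-to-day work and from reading various sources. They are not necessarily correct, and corrections are welcome.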
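
Even when elision is not applied, returning a local variable by name does not fall back to a copy if a move constructor is available: the returned local is treated as an rvalue first. The sketch below counts constructor calls to show this. Counted, the global counters, and the helper functions are hypothetical; the exact number of moves depends on the compiler, the language standard, and the flags, so only an upper bound is assumed.

```cpp
#include <cassert>

// Hypothetical instrumented type; the counters are globals for brevity.
static int g_copies = 0;
static int g_moves = 0;

struct Counted {
    Counted() = default;
    Counted(const Counted &) { ++g_copies; }
    Counted(Counted &&) noexcept { ++g_moves; }
};

// Two different locals may be returned, so the compiler cannot simply
// construct a single object in the caller's storage.
Counted get(int i) {
    if (i == 0) {
        Counted t0;
        return t0;
    }
    Counted t;
    return t;
}

// Number of copy constructions triggered by one call to get().
int copies_for_one_call(int i) {
    g_copies = 0;
    g_moves = 0;
    Counted result = get(i);
    (void)result;
    return g_copies;
}

// Number of move constructions triggered by one call to get().
int moves_for_one_call(int i) {
    g_copies = 0;
    g_moves = 0;
    Counted result = get(i);
    (void)result;
    return g_moves;
}
```

At most two moves can occur per call (local to return value, return value to result), and each of them may be elided.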

posted on 2020-05-04 15:50 coding my life