emplace_back and std::move

What are the differences between push_back and emplace_back?

Intro

Let's see an example in C++98.

push_back

Suppose there is a class A, and we want to use a vector to store some instances of class A.

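#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <vector>
using namespace std;

class A {
  protected:
    int *ptr;
    int size;

  public:
    A(int n = 16) {
        ptr = new int[n];
        size = n;
        puts("A(int)");
    }
    A(const A &a) {
        size = a.size;
        ptr = new int[size];
        memcpy(ptr, a.ptr, size * sizeof(int));
        puts("A(const A&)");
    }
    virtual ~A() {
        // nullptr is a C++11 keyword, so it is avoided here;
        // delete[] on a null pointer is a no-op anyway
        delete[] ptr;
    }
};
int main() {
    vector<A> vec;
    vec.push_back(A(32));
}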

Compile this code with the command:

clang++ -std=c++98 push_back.cpp

And it will output:

A(int)
A(const A&)

We can see that storing one instance of A in the vector constructs at least two instances. One is the temporary A(32); the other is a copy of it, stored in the vector's heap-allocated buffer.

If A were a very heavy class, constructing and copying the temporary would slow down our program. This is what emplace_back is meant to optimize: it avoids the temporary and the copy.

emplace_back

In template class vector, push_back is defined as:

void push_back (const value_type& val); // C++98
void push_back (value_type&& val);      // since C++11, where && denotes an rvalue reference

However, emplace_back is defined as:

template <class... Args>
  void emplace_back (Args&&... args);       // C++11 and C++14; here && denotes a universal reference, explained later
template< class... Args >
  reference emplace_back( Args&&... args ); // since C++17; returns a reference to the new element

The parameter list of emplace_back is variadic, similar in spirit to printf: it accepts any number of arguments and passes them to a constructor of the element type.

Each parameter of emplace_back is a universal reference; we will explain what a universal reference is in a later section.
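
To make the variadic part concrete, here is a minimal sketch. The Employee struct and the make_staff function are hypothetical names introduced only for this illustration; the point is that emplace_back receives the constructor arguments themselves, while push_back receives a finished object.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type, used only for this illustration.
struct Employee {
    std::string name;
    int age;
    Employee(std::string n, int a) : name(std::move(n)), age(a) {}
};

// Fills a vector in two ways.
std::vector<Employee> make_staff() {
    std::vector<Employee> staff;
    // emplace_back takes the constructor arguments and builds the element in place.
    staff.emplace_back("Alice", 30);
    // push_back takes an already constructed Employee (here a temporary).
    staff.push_back(Employee("Bob", 25));
    assert(staff.size() == 2);
    return staff;
}
```

Calling make_staff() from main returns a vector with two elements; both calls end up with the same contents, they only differ in where the Employee object is constructed.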

Since C++11, the C++ standard has provided "move semantics" and "perfect forwarding", together with a new kind of constructor called the "move constructor".

#include <cstdio>   // puts
#include <cstring>  // memcpy
#include <vector>
using namespace std;
class A {
  protected:
    int *ptr;
    int size;

  public:
    A(int n = 16) {
        ptr = new int[n];
        size = n;
        puts("A(int)");
    }
    A(const A &a) {
        size = a.size;
        ptr = new int[size];
        memcpy(ptr, a.ptr, size * sizeof(int));
        puts("A(const A&)");
    }
    // move constructor: steal the buffer and leave the source empty
    A(A &&a) noexcept {
        size = a.size;
        ptr = a.ptr;
        a.ptr = nullptr;
        a.size = 0;
        puts("A(A&&)");
    }
    virtual ~A() {
        delete[] ptr;  // safe even when ptr is nullptr
    }
};

int main() {
    vector<A> vec;
    vec.emplace_back(10);
}

Compile it with clang++ -std=c++17 push.cpp. The program will output:

A(int)

Now we can see the difference between push_back and emplace_back: emplace_back(10) forwards 10 to A(int) and constructs the element directly in the vector's storage, so no temporary A is created.

What will happen if we call emplace_back(A(10))? It will output:

A(int)
A(A&&)

Here a temporary A(10) is still constructed, but it is moved into the vector instead of being deep-copied, so the buffer is allocated only once. Since C++11, push_back(A(10)) behaves the same way.
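
The comparison can be sketched with counters instead of printed messages. Tracer, Counts and the helper functions below are hypothetical names for this illustration, and the vector reserves capacity up front so that reallocation does not add extra moves to the counts.

```cpp
#include <cassert>
#include <vector>

// How many times each kind of constructor ran.
struct Counts {
    int constructed;
    int copied;
    int moved;
};

// Hypothetical element type that records which constructor is used.
struct Tracer {
    static Counts counts;
    int value;
    Tracer(int v) : value(v) { ++counts.constructed; }
    Tracer(const Tracer& other) : value(other.value) { ++counts.copied; }
    Tracer(Tracer&& other) noexcept : value(other.value) { ++counts.moved; }
};
Counts Tracer::counts = {0, 0, 0};

// Runs one insertion on a fresh vector and returns the constructor counts.
template <class Action>
Counts count_calls(Action action) {
    std::vector<Tracer> vec;
    vec.reserve(4);  // no reallocation, so only the insertion itself is counted
    Tracer::counts = Counts{0, 0, 0};
    action(vec);
    return Tracer::counts;
}

// 1 construction, 0 copies, 0 moves: built in place from the argument.
Counts emplace_from_args() {
    return count_calls([](std::vector<Tracer>& v) { v.emplace_back(10); });
}
// 1 construction, 0 copies, 1 move: the temporary is moved into the vector.
Counts emplace_from_temporary() {
    return count_calls([](std::vector<Tracer>& v) { v.emplace_back(Tracer(10)); });
}
// 1 construction, 0 copies, 1 move: push_back(T&&) is selected.
Counts push_from_temporary() {
    return count_calls([](std::vector<Tracer>& v) { v.push_back(Tracer(10)); });
}
// 1 construction, 1 copy, 0 moves: push_back(const T&) is selected.
Counts push_from_lvalue() {
    return count_calls([](std::vector<Tracer>& v) { Tracer t(10); v.push_back(t); });
}
```

Only emplace_from_args avoids the extra object entirely; the other three differ in whether that extra object is copied or moved.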

In the following sections, we will explain what a "universal reference" is and how lvalue references, rvalue references and universal references differ.

lvalue, rvalue and xvalue

Please refer to

for more details.

Generally speaking,

lvalue - Historically, a value that can appear on the left-hand side of an assignment expression. An lvalue has an identity: it designates an object whose address can be taken, typically a named variable.
- Please note that "assignment" is not the same as declaration and initialization.
- For example, int x = 1; declares an lvalue x and initializes it with 1.
- int &y = x; declares an lvalue reference y and initializes it with the lvalue x.
rvalue - Historically, a value that can only appear on the right-hand side of an assignment expression. An rvalue is usually a temporary object.
- e.g. string s = string("hello"), where string("hello") is an rvalue.
- A temporary like this has no name.
xvalue - "eXpiring value". It refers to an object, usually near the end of its lifetime, whose resources may be moved from. An xvalue has an identity, but it is still an rvalue.
- e.g. if str is a string, then std::move(str) is an xvalue. By contrast, if we have a function auto f() { return string("hello"); } and write str += f(), then f() is a prvalue ("pure" rvalue), the other kind of rvalue.
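
The three categories can be checked at compile time. The sketch below relies on a documented property of decltype: for a parenthesized expression, decltype((expr)) is T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The category template, the CATEGORY_OF macro and the helper functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Maps the type produced by decltype((expr)) to a category name.
template <class T> struct category      { static const char* name() { return "prvalue"; } };
template <class T> struct category<T&>  { static const char* name() { return "lvalue"; } };
template <class T> struct category<T&&> { static const char* name() { return "xvalue"; } };

// The expression is never evaluated, only its type and category are inspected.
#define CATEGORY_OF(expr) category<decltype((expr))>::name()

std::string make_greeting() { return std::string("hello"); }

// Collects the category names of three typical expressions.
std::string describe_all() {
    std::string s = "hi";
    std::string out;
    out += CATEGORY_OF(s);                // a named variable
    out += ' ';
    out += CATEGORY_OF(make_greeting());  // a function call returning by value
    out += ' ';
    out += CATEGORY_OF(std::move(s));     // a cast to an rvalue reference (nothing is moved here)
    return out;
}
```

describe_all() returns "lvalue prvalue xvalue", matching the definitions above: both the prvalue and the xvalue count as rvalues.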

Universal Reference

In C++, there are two common reference types: lvalue references and rvalue references. In addition,

a non-const lvalue reference must be bound to an lvalue,
a const lvalue reference can be bound to either an lvalue (const or not) or an rvalue,
- e.g. if we have a function void f(const string &str);, then f(string("ABC")) is valid.
an rvalue reference must be bound to an rvalue.

void f1(vector<int>& vec) {}
void f2(vector<int>&& vec) {}

In the code above, vector<int>& means vec is an lvalue reference, and vector<int>&& means vec is an rvalue reference.
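
As a small sketch of these binding rules, the functions below (hypothetical names) each take one kind of reference. The calls that would be rejected by the compiler are left as comments, and two static_asserts use std::is_convertible to confirm the rejections.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

int by_lvalue_ref(Vec&)             { return 1; }
int by_const_lvalue_ref(const Vec&) { return 2; }
int by_rvalue_ref(Vec&&)            { return 3; }

// An rvalue cannot bind to a non-const lvalue reference.
static_assert(!std::is_convertible<Vec, Vec&>::value, "rvalue -> Vec& is rejected");
// An lvalue cannot bind to an rvalue reference.
static_assert(!std::is_convertible<Vec&, Vec&&>::value, "lvalue -> Vec&& is rejected");

// Sums the return values of all the calls that are allowed.
int demo_binding() {
    Vec named{1, 2, 3};
    int sum = 0;
    sum += by_lvalue_ref(named);             // lvalue -> Vec&
    sum += by_const_lvalue_ref(named);       // lvalue -> const Vec&
    sum += by_const_lvalue_ref(Vec{4, 5});   // rvalue -> const Vec&
    sum += by_rvalue_ref(Vec{6});            // rvalue -> Vec&&
    sum += by_rvalue_ref(std::move(named));  // std::move turns the lvalue into an rvalue
    // by_lvalue_ref(Vec{7});  // error: rvalue to non-const lvalue reference
    // by_rvalue_ref(named);   // error: lvalue to rvalue reference
    return sum;
}
```

demo_binding() returns 11 (1 + 2 + 2 + 3 + 3), one term per accepted call.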

Actually, there is a third kind of reference, called a "universal reference" (the C++ standard calls it a "forwarding reference"). A universal reference may resolve to either an lvalue reference or an rvalue reference, depending on the argument it is bound to.

Now, let us see another example, this time with templates.

template<class T> void f1(T &val);          // lvalue reference
template<class T> void f2(T &&val);         // universal reference
template<class T> void f3(vector<T> &&val); // rvalue reference
template<class T> void f4(const T&& param); // rvalue reference

T & is the most common reference type, an lvalue reference, which must be bound to an lvalue.
T && is actually a universal reference.
vector<T> && and const T && are rvalue references.

So lvalue references are easy to recognize: they have only one &.

But how can we distinguish an rvalue reference from a universal reference, when both have two &?

Refer to this blog: Universal References in C++11

"Universal references can only occur in the form T&&!"

More specifically, universal references always have the form T&& for some deduced type T.
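
A short sketch of what "deduced" means in practice: when an lvalue of type U is passed to a T&& parameter, T is deduced as U& and T&& collapses to U&; when an rvalue is passed, T is deduced as U and T&& is U&&. The function names below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// T is deduced, so T&& is a universal reference.
// Returns true when the argument was an lvalue (T deduced as U&).
template <class T>
bool binds_as_lvalue(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// Only T inside vector<T> is deduced; the parameter is a plain rvalue reference.
template <class T>
bool takes_rvalue_vector(std::vector<T>&&) {
    return true;
}

bool demo_deduction() {
    int x = 5;
    std::vector<int> named{1, 2};
    bool ok = true;
    ok = ok && binds_as_lvalue(x);                // T = int&, T&& collapses to int&
    ok = ok && !binds_as_lvalue(5);               // T = int,  T&& is int&&
    ok = ok && !binds_as_lvalue(std::move(x));    // T = int,  T&& is int&&
    ok = ok && takes_rvalue_vector(std::move(named));
    // takes_rvalue_vector(named);  // error: an lvalue cannot bind to vector<T>&&
    return ok;
}
```

demo_deduction() returns true; the commented-out call is the one that separates an rvalue reference from a universal reference.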

Let's revisit push_back and emplace_back.

template <class T, class Allocator = allocator<T> >
class vector {
public:
    ...
    void push_back(T&& x);       // fully specified parameter type => no type deduction;
    ...                          // && is rvalue reference
};

Written outside the class, the declaration of push_back is:

template <class T>
void vector<T>::push_back(T&& x);

push_back can't exist without the class std::vector<T> that contains it. But once we have a class std::vector<T>, we already know what T is, so there's no need to deduce it. Hence T is not a deduced type here, and T&& is a plain rvalue reference.

The case is different for emplace_back.

template <class T, class Allocator = allocator<T> >
class vector {
public:
    ...
    template <class... Args>
    void emplace_back(Args&&... args); // deduced parameter types => type deduction;
    ...                                // && is a universal reference
};

And the declaration of emplace_back is:

template<class T>
template<class... Args>
void std::vector<T>::emplace_back(Args&&... args);

Here Args is obviously a deduced type. Hence Args&& is a universal reference.

move

std::move is used to "cast an lvalue to an rvalue".

std::move is used to indicate that an object t may be "moved from", i.e. allowing the efficient transfer of resources from t to another object.

In particular, std::move produces an xvalue expression that identifies its argument t. It is exactly equivalent to a static_cast to an rvalue reference type.
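
A minimal sketch of what this means in practice: std::move itself moves nothing, it only produces an rvalue so that a move constructor can be selected. The helper function names are hypothetical. The second function shows a common pitfall: a const object cannot be moved from, so the copy constructor is selected instead.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// std::move applied to an lvalue has an rvalue reference type.
static_assert(std::is_same<decltype(std::move(std::declval<std::vector<int>&>())),
                           std::vector<int>&&>::value,
              "std::move on an lvalue yields an rvalue reference");

// Moving from a non-const vector hands the buffer over and leaves the source empty.
bool source_empty_after_move() {
    std::vector<int> src{1, 2, 3};
    std::vector<int> dst = std::move(src);  // selects vector(vector&&)
    return src.empty() && dst.size() == 3;
}

// "Moving" from a const vector copies: const vector&& only binds to const vector&.
bool const_source_is_copied() {
    const std::vector<int> src{1, 2, 3};
    std::vector<int> dst = std::move(src);  // selects vector(const vector&)
    return src.size() == 3 && dst.size() == 3;
}
```

Both functions return true. The first relies on the guarantee that a move-constructed-from std::vector is empty; for other types the moved-from state may only be "valid but unspecified".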

std::move is defined as:

template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept;    // since C++14

Here T&& t is a universal reference, since T is a deduced type.

The implementation of move is very simple: it only performs a type cast with static_cast.

template<class T>
constexpr std::remove_reference_t<T>&& move( T&& t ) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

remove_reference removes the reference qualifier from a type T.

template<class T> struct remove_reference      {typedef T type;};
template<class T> struct remove_reference<T&>  {typedef T type;};
template<class T> struct remove_reference<T&&> {typedef T type;};

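It is tempting to make this code simpler by dropping remove_reference, but the result is not equivalent:

template<class T>
constexpr T&& move(T&& t) noexcept {
    return static_cast<T &&>(t);   // for an lvalue argument, T is deduced as U&, so T&& collapses to U&
}

With an lvalue argument this version returns an lvalue reference, which is what std::forward<T> does. remove_reference is needed so that the result is always an rvalue reference.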
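
The role of remove_reference can be checked at compile time. In the sketch below, naive_move and my_move are hypothetical names: the first casts to T&& directly, the second follows the shape of the std::move definition. std::unique_ptr is used for the runtime check because it cannot be copied and is guaranteed to be null after being moved from.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Cast without remove_reference.
template <class T>
constexpr T&& naive_move(T&& t) noexcept {
    return static_cast<T&&>(t);
}

// Cast with remove_reference, shaped like std::move.
template <class T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// For an lvalue argument, T is int&: int& && collapses to int&.
static_assert(std::is_same<decltype(naive_move(std::declval<int&>())), int&>::value,
              "naive_move keeps an lvalue an lvalue");
static_assert(std::is_same<decltype(my_move(std::declval<int&>())), int&&>::value,
              "my_move always yields an rvalue reference");

// Transfers ownership with my_move and reports whether the source became null.
bool transfer_with_my_move() {
    std::unique_ptr<int> source(new int(7));
    std::unique_ptr<int> target = my_move(source);
    // std::unique_ptr<int> bad = naive_move(source);  // error: would need the deleted copy constructor
    return source == nullptr && *target == 7;
}
```

transfer_with_my_move() returns true, while the commented-out line does not compile because naive_move(source) is still an lvalue.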

forward

std::forward is defined as:

template< class T >
constexpr T&& forward( std::remove_reference_t<T>& t ) noexcept {
    return static_cast<T&&>(t);
}

template< class T >
constexpr T&& forward(std::remove_reference_t<T>&& t) noexcept {
    static_assert(!std::is_lvalue_reference<T>::value,
                  "can not forward an rvalue as an lvalue");
    return static_cast<T&&>(t);
}

The 1st overload forwards an lvalue t either as an lvalue or as an rvalue, depending on T.
- std::forward<string &>(str) will produce an lvalue reference. (Actually, it does nothing here.)
- std::forward<string &&>(str) will produce an rvalue reference. It forwards str (an lvalue) as an rvalue. Here we can see that this version of forward can do the job of move, although move states the intent more clearly. See Usage of std::forward vs std::move.
The 2nd overload forwards an rvalue t as an rvalue and prohibits forwarding an rvalue as an lvalue.
- e.g. std::forward<string &>(string("")) will cause a compile error, since it attempts to forward the rvalue string("") as an lvalue.

std::forward makes it possible to forward a result of an expression (such as function call), which may be rvalue or lvalue, as the original value category of a forwarding reference argument.

The forward operation preserves the value category (lvalue or rvalue) of the original argument while forwarding t, hence it is called "perfect forwarding".
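
The effect is easiest to see by comparing a wrapper that uses std::forward with one that does not. In this sketch, which, relay_forward and relay_plain are hypothetical names. Inside a function, a named parameter is always an lvalue, so without std::forward the rvalue overload is never reached.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads that report which one was selected.
std::string which(const std::string&) { return "lvalue overload"; }
std::string which(std::string&&)      { return "rvalue overload"; }

// Forwards the argument with its original value category.
template <class T>
std::string relay_forward(T&& arg) {
    return which(std::forward<T>(arg));
}

// Passes the named parameter as is; arg is an lvalue here.
template <class T>
std::string relay_plain(T&& arg) {
    return which(arg);
}

bool demo_forwarding() {
    std::string s = "abc";
    return relay_forward(s) == "lvalue overload"
        && relay_forward(std::string("tmp")) == "rvalue overload"
        && relay_plain(std::string("tmp")) == "lvalue overload";
}
```

demo_forwarding() returns true: only relay_forward lets a temporary reach the rvalue overload, which is why emplace_back uses std::forward on its arguments.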

Implementation of emplace_back

Based on std::forward<>() and std::move(), one possible implementation of push_back and emplace_back (since C++11) is:

template<class T>
class Vector {
protected:
    using value_type = T;
    using pointer_type = T*;
    using reference_type = T&;
    pointer_type start;      // raw storage with room for capacity elements
    std::size_t size;
    std::size_t capacity;
    // ...
public:
    // val is a named parameter and therefore an lvalue; std::move casts it back to an rvalue
    void push_back(value_type &&val) { this->emplace_back(std::move(val)); }
    void push_back(const value_type &val) {
        if (size == capacity) {
            // ...
        }
        new (start + size) T(val);  // placement new; this will call the copy constructor
        ++size;
    }

    template <class... Args>
    reference_type emplace_back (Args&&... args) {
        if (size == capacity) {
            // make vector grow via some strategies
        }
        // placement new: construct T in place from the forwarded arguments
        return *new(start + (size++)) T(std::forward<Args>(args)...);
    }
};

In C++98 (before C++11), the implementation of push_back may be:

void push_back(const value_type &val) {
    if (size == capacity) {
        // ...
    }
    new (start + size) T(val);  // placement new; this will call the copy constructor
    ++size;
}
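
For readers who want something that compiles, here is a self-contained sketch along the same lines. MiniVector is a hypothetical teaching container, not how any particular standard library implements std::vector. It assumes a doubling growth strategy, ignores exception safety and over-aligned types, and does not handle arguments that refer to elements already stored in the container.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

template <class T>
class MiniVector {
    T* start_ = nullptr;        // raw storage for capacity_ elements
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;

    // Doubles the capacity and moves the existing elements into the new block.
    void grow() {
        std::size_t new_cap = capacity_ == 0 ? 1 : capacity_ * 2;
        T* fresh = static_cast<T*>(::operator new(new_cap * sizeof(T)));
        for (std::size_t i = 0; i < size_; ++i) {
            new (fresh + i) T(std::move(start_[i]));
            start_[i].~T();
        }
        ::operator delete(start_);
        start_ = fresh;
        capacity_ = new_cap;
    }

public:
    MiniVector() = default;
    MiniVector(const MiniVector&) = delete;
    MiniVector& operator=(const MiniVector&) = delete;
    ~MiniVector() {
        for (std::size_t i = 0; i < size_; ++i) start_[i].~T();
        ::operator delete(start_);
    }

    // Constructs the new element in place from the forwarded arguments.
    template <class... Args>
    T& emplace_back(Args&&... args) {
        if (size_ == capacity_) grow();
        T* slot = new (start_ + size_) T(std::forward<Args>(args)...);
        ++size_;
        return *slot;
    }
    void push_back(const T& val) { emplace_back(val); }        // copies val
    void push_back(T&& val) { emplace_back(std::move(val)); }  // moves from val

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { return start_[i]; }
};

// Exercises the three insertion paths on a MiniVector of strings.
bool demo_mini_vector() {
    MiniVector<std::string> v;
    std::string named = "abc";
    v.emplace_back(3, 'x');             // std::string(3, 'x') built in place
    v.push_back(named);                 // copy: named keeps its value
    v.push_back(std::string("tmp"));    // move from a temporary
    return v.size() == 3 && v[0] == "xxx" && v[1] == "abc"
        && v[2] == "tmp" && named == "abc";
}
```

demo_mini_vector() returns true. Both push_back overloads delegate to emplace_back, so the only difference between them is whether the argument is forwarded as an lvalue or as an rvalue.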

posted @ 2022-02-16 19:14 sinkinben