Two Ways to Implement String Concatenation

There are two main ways to implement string concatenation.
Python and Go are used here to illustrate them; in both languages, string objects are immutable.

Go code:

package main
import "fmt"

func main() {
    t := ""
    s := "123"
    for i := 0; i < 100000; i++ {
        t += s
    }
    fmt.Println("len:", len(t))
}

Output of time:

real	0m1.706s
user	0m1.703s
sys	    0m0.108s

Python code (Python 2):

t = ""
s = "123"
for _ in range(100000):
    t += s
 
print len(t)

Output of time:

real	0m0.053s
user	0m0.033s
sys	    0m0.014s

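Isn't Go supposed to be nearly as fast as C? Isn't Python the snail-paced scripting language?
Let's look at how Go and Python each implement string concatenation.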

Python source code (CPython 2, Python/ceval.c):

static PyObject *
string_concatenate(PyObject *v, PyObject *w,
                   PyFrameObject *f, unsigned char *next_instr)
{
    /* This function implements 'variable += expr' when both arguments
       are strings. */
    Py_ssize_t v_len = PyString_GET_SIZE(v);
    Py_ssize_t w_len = PyString_GET_SIZE(w);
    Py_ssize_t new_len = v_len + w_len;
    if (new_len < 0) {
        PyErr_SetString(PyExc_OverflowError,
                        "strings are too large to concat");
        return NULL;
    }
 
    if (v->ob_refcnt == 2) {
        /* In the common case, there are 2 references to the value
         * stored in 'variable' when the += is performed: one on the
         * value stack (in 'v') and one still stored in the
         * 'variable'.  We try to delete the variable now to reduce
         * the refcnt to 1.
         */
        switch (*next_instr) {
        case STORE_FAST:
        {
            int oparg = PEEKARG();
            PyObject **fastlocals = f->f_localsplus;
            if (GETLOCAL(oparg) == v)
                SETLOCAL(oparg, NULL);
            break;
        }
        case STORE_DEREF:
        {
            PyObject **freevars = (f->f_localsplus +
                                   f->f_code->co_nlocals);
            PyObject *c = freevars[PEEKARG()];
            if (PyCell_GET(c) == v)
                PyCell_Set(c, NULL);
            break;
        }
        case STORE_NAME:
        {
            PyObject *names = f->f_code->co_names;
            PyObject *name = GETITEM(names, PEEKARG());
            PyObject *locals = f->f_locals;
            if (PyDict_CheckExact(locals) &&
                PyDict_GetItem(locals, name) == v) {
                if (PyDict_DelItem(locals, name) != 0) {
                    PyErr_Clear();
                }
            }
            break;
        }
        }
    }
 
    if (v->ob_refcnt == 1 && !PyString_CHECK_INTERNED(v)) {
        /* Now we own the last reference to 'v', so we can resize it
         * in-place.
         */
        if (_PyString_Resize(&v, new_len) != 0) {
            /* XXX if _PyString_Resize() fails, 'v' has been
             * deallocated so it cannot be put back into
             * 'variable'.  The MemoryError is raised when there
             * is no value in 'variable', which might (very
             * remotely) be a cause of incompatibilities.
             */
            return NULL;
        }
        /* copy 'w' into the newly allocated area of 'v' */
        memcpy(PyString_AS_STRING(v) + v_len,
               PyString_AS_STRING(w), w_len);
        return v;
    }
    else {
        /* When in-place resizing is not an option. */
        PyString_Concat(&v, w);
        return v;
    }
}

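As the code shows, Python optimizes string concatenation written as +=. When the left-hand string has no other references and is not interned, Python enlarges it in place and copies the other string's contents into the newly extended memory. Each += then copies only the appended string, so the whole loop runs in O(n) time (assuming the resize can usually extend the buffer without moving it).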

Go source code (runtime):

String
runtime·catstring(String s1, String s2)
{
    String s3;
 
    if(s1.len == 0)
    	return s2;
    if(s2.len == 0)
    	return s1;
 
    s3 = gostringsize(s1.len + s2.len);
    runtime·memmove(s3.str, s1.str, s1.len);
    runtime·memmove(s3.str+s1.len, s2.str, s2.len);
    return s3;
}

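Go does not optimize string concatenation at all. Every concatenation allocates a new block of memory and copies the contents of both strings into the new object. Each step therefore copies everything accumulated so far, and the whole loop runs in O(n^2) time.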

posted on 2015-05-17 14:18  richmonkey