字符串拼接

String

在Java中，String是一个不可变类，所以String对象一旦在堆中被创建出来就不能修改。

 package java.lang;
//import ...
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
}

Java字符串其实是基于字符数组实现的，该数组被关键字final标注，一经赋值就不可修改。

既然字符串是不可变的，那么字符串拼接又是怎么回事呢？

字符串不变性与字符串拼接

其实所谓的字符串拼接，都是重新生成了一个新的字符串（JDK7开始，substring() 操作也是重新生成一个新的字符串）。下面一段字符串拼接代码：

 String s = "hello ";
s = s.concat("world!");

其实生成了一个新字符串，s最终保存的是一个新字符串的引用，如下图所示：

Java字符串拼接方式

+ 语法糖

在Java中，拼接字符串最简单的方式就是直接使用符号+来拼接，如：

 public class Main2 {
    public static void main(String[] args) {
        String s1 = "hello " + "world " + "!";
        String s2 = "xzy ";
        String s3 = s2 + s1;
    }
 
    private void concat(String s1) {
        String s2 = "xzy" + s1;
    }
}

这里要特别说明一点，有人把Java中使用+拼接字符串的功能理解为运算符重载。其实并不是，Java是不支持运算符重载的，这其实只是Java提供的一个语法糖。

编译，反编译上面的代码：

 public class Main2 {
    public Main2() {
    }
 
    public static void main(String[] var0) {
        String var1 = "hello world !";
        String var2 = "xzy ";
        (new StringBuilder()).append(var2).append(var1).toString();
    }
 
    private void concat(String var1) {
        (new StringBuilder()).append("xzy").append(var1).toString();
    }
}

通过查看反编译后的代码，我们发现，使用 + 进行字符串拼接，最终是通过StringBuilder，创建一个新的String对象。

concat

除了使用+拼接字符串之外，还可以使用String类中的方法concat方法来拼接字符串，如：

     public static void main(String[] args) {
        String s1 = "hello " + "world " + "!";
        String s2 = "xzy ";
        String s3 = s2.concat(s1);
    }

concat方法的源码如下：

 public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    
    /** The value is used for character storage. */
    private final char value[];
    
   /**
     * Concatenates the specified string to the end of this string.
     * <p>
     * If the length of the argument string is {@code 0}, then this
     * {@code String} object is returned. Otherwise, a
     * {@code String} object is returned that represents a character
     * sequence that is the concatenation of the character sequence
     * represented by this {@code String} object and the character
     * sequence represented by the argument string.<p>
     * Examples:
     * <blockquote><pre>
     * "cares".concat("s") returns "caress"
     * "to".concat("get").concat("her") returns "together"
     * </pre></blockquote>
     *
     * @param   str   the {@code String} that is concatenated to the end
     *                of this {@code String}.
     * @return  a string that represents the concatenation of this object's
     *          characters followed by the string argument's characters.
     */
    public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
    }
    
   /**
     * Copy characters from this string into dst starting at dstBegin.
     * This method doesn't perform any range checking.
     */
    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }
}

Arrays.copyOf()方法源码：

 //创建一个长度为newLength的字符数组，然后将original字符数组中的字符拷贝过去。
public static char[] copyOf(char[] original, int newLength) {
    char[] copy = new char[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

从上面的源码看出，使用a.concat(b)拼接字符串a b，创建了一个长度为a.length + b.length的字符数组，a和b先后被拷贝进字符数组，最后使用这个字符数组创建了一个新的String对象。

StringBuffer 和StringBuilder

关于字符串，Java中除了定义了一个可以用来定义字符串常量的String类以外，还提供了可以用来定义字符串变量的StringBuffer类、StringBuilder，它的对象是可以扩充和修改的，如：

 public static void main(String[] args) {
    StringBuffer stringBuffer = new StringBuffer();
    String s1 = "hello " + "world " + "!";
    String s2 = "xzy ";
    String s3;
    stringBuffer.append(s1).append(s2);
    s3 = stringBuffer.toString();
}

 public static void main(String[] args) {
    StringBuilder stringBuilder = new StringBuilder();
    String s1 = "hello " + "world " + "!";
    String s2 = "xzy ";
    String s3;
    stringBuilder.append(s1).append(s2);
    s3 = stringBuilder.toString();
}

接下来看看StringBuffer和StringBuilder的实现原理。

StringBuffer和StringBuilder都继承自AbstractStringBuilder，下面是AbstractStringBuilder的部分源码：

 abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;
 
    /**
     * The count is the number of characters used.
     */
    int count;
}

与String类似，AbstractStringBuilder也封装了一个字符数组，不同的是，这个字符数组没有使用final关键字修改，也就是所，这个字符数组是可以修改的。还要一个差异就是，这个字符数组不一定所有位置都要被占满，AbstractStringBuilder中有一个count变量同来记录字符数组中存在的字符个数。

试着看看StringBuffer、StringBuilder、AbstractStringBuilder中append方法的源码：

  public final class StringBuffer 
     extends AbstractStringBuilder 
     implements java.io.Serializable, CharSequence{
    
   /**
     * A cache of the last value returned by toString. Cleared
     * whenever the StringBuffer is modified.
     */
    private transient char[] toStringCache;
     
    @Override
    public synchronized StringBuffer append(String str) {
        toStringCache = null;
        super.append(str);
        return this;
    }
 }

 public final class StringBuilder
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence{
        
    @Override
    public StringBuilder append(String str) {
        super.append(str);
        return this;
    }
}

 abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;
 
    /**
     * The count is the number of characters used.
     */
    int count;
        
   /**
     * Appends the specified string to this character sequence.
     * <p>
     * The characters of the {@code String} argument are appended, in
     * order, increasing the length of this sequence by the length of the
     * argument. If {@code str} is {@code null}, then the four
     * characters {@code "null"} are appended.
     * <p>
     * Let <i>n</i> be the length of this character sequence just prior to
     * execution of the {@code append} method. Then the character at
     * index <i>k</i> in the new character sequence is equal to the character
     * at index <i>k</i> in the old character sequence, if <i>k</i> is less
     * than <i>n</i>; otherwise, it is equal to the character at index
     * <i>k-n</i> in the argument {@code str}.
     *
     * @param   str   a string.
     * @return  a reference to this object.
     */
    public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len);
        //拷贝字符到内部的字符数组中，如果字符数组长度不够，进行扩展。
        str.getChars(0, len, value, count);
        count += len;
        return this;
    }
}

可以观察到一个比较明显的差异，StringBuffer类的append方法使用synchronized关键字修饰，说明StringBuffer的append方法是线程安全的，为了实现线程安全，StringBuffer牺牲了部分性能。

效率比较

既然有这么多种字符串拼接的方法，那么到底哪一种效率最高呢？我们来简单对比一下。

 long t1 = System.currentTimeMillis();
//这里是初始字符串定义
for (int i = 0; i < 50000; i++) {
    //这里是字符串拼接代码
}
long t2 = System.currentTimeMillis();
System.out.println("cost:" + (t2 - t1));

 public class Main2 {
    public static void main(String[] args) {
        test1();
        test2();
        test3();
        test4();
    }
 
    public static void test1() {
        long t1 = System.currentTimeMillis();
        String str = "";
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            str += s;
        }
        long t2 = System.currentTimeMillis();
        System.out.println("+ cost:" + (t2 - t1));
    }
 
    public static void test2() {
        long t1 = System.currentTimeMillis();
        String str = "";
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            str = str.concat("hello");
        }
        long t2 = System.currentTimeMillis();
        System.out.println("concat cost:" + (t2 - t1));
    }
 
    public static void test3() {
        long t1 = System.currentTimeMillis();
        String str;
        StringBuffer stringBuffer = new StringBuffer();
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            stringBuffer.append(s);
        }
        str = stringBuffer.toString();
        long t2 = System.currentTimeMillis();
        System.out.println("stringBuffer cost:" + (t2 - t1));
    }
 
    public static void test4() {
        long t1 = System.currentTimeMillis();
        String str;
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            String s = String.valueOf(i);
            stringBuilder.append(s);
        }
        str = stringBuilder.toString();
        long t2 = System.currentTimeMillis();
        System.out.println("stringBuilder cost:" + (t2 - t1));
    }
}

我们使用形如以上形式的代码，分别测试下五种字符串拼接代码的运行时间。得到结果如下：

 + cost:7135
concat cost:1759
stringBuffer cost:5
stringBuilder cost:5

从结果可以看出，用时从短到长的对比是：

StringBuilder < StringBuffer < concat < +

那么问题来了，前面我们分析过，其实使用+拼接字符串的实现原理也是使用的StringBuilder，那为什么结果相差这么多，高达1000多倍呢？

反编译上面的代码：

 /*
 * Decompiled with CFR 0.149.
 */
package com.learn.java;
 
public class Main2 {
    public static void main(String[] arrstring) {
        Main2.test1();
        Main2.test2();
        Main2.test3();
        Main2.test4();
    }
 
    public static void test1() {
        long l = System.currentTimeMillis();
        String string = "";
        for (int i = 0; i < 50000; ++i) {
            String string2 = String.valueOf(i);
            string = new StringBuilder().append(string).append(string2).toString();
        }
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("+ cost:").append(l2 - l).toString());
    }
 
    public static void test2() {
        long l = System.currentTimeMillis();
        String string = "";
        for (int i = 0; i < 50000; ++i) {
            String string2 = String.valueOf(i);
            string = string.concat("hello");
        }
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("concat cost:").append(l2 - l).toString());
    }
 
    public static void test3() {
        long l = System.currentTimeMillis();
        StringBuffer stringBuffer = new StringBuffer();
        for (int i = 0; i < 50000; ++i) {
            String string = String.valueOf(i);
            stringBuffer.append(string);
        }
        String string = stringBuffer.toString();
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("stringBuffer cost:").append(l2 - l).toString());
    }
 
    public static void test4() {
        long l = System.currentTimeMillis();
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < 50000; ++i) {
            String string = String.valueOf(i);
            stringBuilder.append(string);
        }
        String string = stringBuilder.toString();
        long l2 = System.currentTimeMillis();
        System.out.println(new StringBuilder().append("stringBuilder cost:").append(l2 - l).toString());
    }
}

我们可以看到，反编译后的代码，在for循环中，每次都是new了一个StringBuilder，然后再把String转成StringBuilder，再进行append。

而频繁的新建对象当然要耗费很多时间了，不仅仅会耗费时间，频繁的创建对象，还会造成内存资源的浪费。

所以，阿里巴巴Java开发手册建议：循环体内，字符串的连接方式，使用 StringBuilder 的 append 方法进行扩展。而不要使用+。

总结

常用的字符串拼接方式有：+、使用concat、使用StringBuilder、使用StringBuffer

由于字符串拼接过程中会创建新的对象，所以如果要在一个循环体中进行字符串拼接，就要考虑内存问题和效率问题。

经过对比，我们发现，直接使用StringBuilder的方式是效率最高的。因为StringBuilder天生就是设计来定义可变字符串和字符串的变化操作的。

但是，还要强调的是：

1、如果不是在循环体中进行字符串拼接的话，直接使用+就好了。

2、如果在并发场景中进行字符串拼接的话，要使用StringBuffer来代替StringBuilder。

参考文献：https://hollischuang.github.io/toBeTopJavaer/#/basics/java-basic/string-concat

发表于 2020-04-28 23:00 Hello_xzy_World 阅读(8111) 评论(6) 编辑收藏举报

Java字符串拼接

字符串拼接

String

字符串不变性与字符串拼接

Java字符串拼接方式

+ 语法糖

concat

StringBuffer 和StringBuilder

效率比较

总结

公告

搜索

常用链接

积分与排名

随笔分类 (156)

随笔档案 (155)

阅读排行榜

评论排行榜

推荐排行榜

最新评论