Interview talking points on Java strings and the string constant pool

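Strings and the constant pool come up a lot in interviews. While doing campus-recruiting interviews for my company a couple of days ago, I found that many candidates were fuzzy on the details, so here is a walkthrough: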

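First, a quick review of how strings are designed in Java. As everyone knows, the JVM has a "string constant pool". When String s = "xxx" is executed, the pool is checked first. If the string is not there, it is added to the pool (cached). The next time the same assignment String s2 = "xxx" is encountered, the string is taken straight from the pool instead of being created again.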

 

A whole series of questions can be built around this (everything below assumes JDK 8):

Question 1:

String s1 = "123";
String s2 = "123";
System.out.println(s1==s2);

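The == operator compares the reference addresses of the two strings, so with the review above this one is easy. The first assignment puts the string in the constant pool and returns the pooled reference. The second finds the string already in the pool and returns the same reference. s1 and s2 therefore have the same address, and the output is true.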

 

Question 2:

String s1 = "123";
String s2 = new String("123");
System.out.println(s1==s2);

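Note that s2 is created with the String constructor. Here is that constructor (the listing comes from a JDK 9+ source tree, as the coder field shows; the JDK 8 version copies only value and hash, and the conclusion is the same):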

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    @HotSpotIntrinsicCandidate
    public String(String original) {
        this.value = original.value;
        this.coder = original.coder;
        this.hash = original.hash;
    }

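The Javadoc already says it: a new instance is created, and that new instance is a copy of the argument string. Since it is a new copy, its address naturally differs from the one in the constant pool, so the output is false.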
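To make the difference between reference comparison and content comparison concrete, here is a minimal sketch. The class name NewStringDemo and its method names are made up for illustration; only the String constructor and equals come from the JDK.

```java
public class NewStringDemo {
    // Compares the pooled literal with a freshly constructed copy by reference.
    static boolean sameReference() {
        String s1 = "123";
        String s2 = new String("123"); // always a distinct object on the heap
        return s1 == s2;
    }

    // Compares the same two strings by content.
    static boolean sameContent() {
        String s1 = "123";
        String s2 = new String("123");
        return s1.equals(s2);
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false
        System.out.println(sameContent());   // true
    }
}
```

This is why application code should compare strings with equals rather than ==.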

 

Question 3:

String s1 = "123";
String s2 = String.valueOf(123);
System.out.println(s1 == s2);

Here s2 uses a new method, valueOf, and the argument is an integer. Let's trace into it:

    /**
     * Returns the string representation of the {@code int} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Integer.toString} method of one argument.
     *
     * @param   i   an {@code int}.
     * @return  a string representation of the {@code int} argument.
     * @see     java.lang.Integer#toString(int, int)
     */
    public static String valueOf(int i) {
        return Integer.toString(i);
    }

It calls Integer.toString(), so let's step into that as well:

    /**
     * Returns a {@code String} object representing the
     * specified integer. The argument is converted to signed decimal
     * representation and returned as a string, exactly as if the
     * argument and radix 10 were given as arguments to the {@link
     * #toString(int, int)} method.
     *
     * @param   i   an integer to be converted.
     * @return  a string representation of the argument in base 10.
     */
    @HotSpotIntrinsicCandidate
    public static String toString(int i) {
        int size = stringSize(i);
        if (COMPACT_STRINGS) {
            byte[] buf = new byte[size];
            getChars(i, size, buf);
            return new String(buf, LATIN1);
        } else {
            byte[] buf = new byte[size * 2];
            StringUTF16.getChars(i, size, buf);
            return new String(buf, UTF16);
        }
    }

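It is new String() again. By the analysis of the previous question, s2 ends up as a new instance, effectively a fresh copy of "123", so s1 and s2 have different addresses and the output is false.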
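A small sketch of the same point: every call to String.valueOf(int) builds a new String, so even two calls with the same argument do not share a reference. The class and method names below are hypothetical, chosen only for this illustration.

```java
public class ValueOfDemo {
    // Two separate conversions of the same int yield two distinct objects.
    static boolean twoCallsShareReference() {
        String a = String.valueOf(123);
        String b = String.valueOf(123);
        return a == b;
    }

    // The contents are still equal, and equal to the literal.
    static boolean contentsMatchLiteral() {
        return String.valueOf(123).equals("123");
    }

    public static void main(String[] args) {
        System.out.println(twoCallsShareReference()); // false
        System.out.println(contentsMatchLiteral());   // true
    }
}
```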

 

Question 4:

String s1 = "123".intern();
String s2 = "123".intern();
System.out.println(s1 == s2);

Another new method appears: intern. If you are not sure what it does, reading the Javadoc in the source is the fastest way to learn:

    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java™ Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     * @jls 3.10.5 String Literals
     */
    public native String intern();

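First of all, this is a native method, which means beginners do not need to worry about the implementation and can focus on the Javadoc. The key part is the middle section: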

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

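In plain terms:

When intern is called, if the pool already contains a string with the same content (as determined by equals), the object in the pool is returned directly (note: String is a reference type, so what comes back is the pooled reference). Otherwise this string is added to the pool and a reference to it is returned.

Back to the question. The literal "123" is itself interned, as the Javadoc states ("All literal strings and string-valued constant expressions are interned"), so by the time intern is called the string is already in the pool. Both calls find it there and return the same pooled reference, so s1 and s2 have the same address and the output is true.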
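The Javadoc's "if and only if" rule can be checked with a short sketch. The class name InternDemo and the sample strings are assumptions made for this example; intern, equals and StringBuilder are the real JDK APIs.

```java
public class InternDemo {
    // Two distinct objects with equal content intern to the same pooled reference.
    static boolean equalStringsInternToSameReference() {
        String s = new String("hello");
        String t = new StringBuilder("hel").append("lo").toString();
        return s != t && s.equals(t) && s.intern() == t.intern();
    }

    // Strings with different content intern to different references.
    static boolean differentStringsInternToDifferentReferences() {
        String s = "hello";
        String t = "world";
        return !s.equals(t) && s.intern() != t.intern();
    }

    public static void main(String[] args) {
        System.out.println(equalStringsInternToSameReference());           // true
        System.out.println(differentStringsInternToDifferentReferences()); // true
    }
}
```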

 

Question 5:

String s1 = new String("123").intern();
String s2 = "123";
System.out.println(s1 == s2);

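If you understood the previous question and know what intern does, this one is just a distraction. The s1 line first creates a copy of "123" and then returns the reference from the constant pool. Next, s2 finds a string with content "123" in the pool and gets the pooled address directly. s1 and s2 therefore have the same address, and the output is true.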

 

Question 6:

String s1 = new String("123");
String s2 = s1.intern();
String s3 = "123";
String s4 = String.valueOf(123);

System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s1 == s4);
System.out.println(s2 == s3);
System.out.println(s2 == s4);
System.out.println(s3 == s4);

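This one combines everything so far. By the analysis above, s1 and s4 are both brand-new instances (unrelated to the constant pool), and only s2 and s3 come from the constant pool. So apart from s2 == s3, which is true, every other comparison is false.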
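For readers who want to verify all six comparisons at once, here is a runnable sketch. The class name PoolComparisonDemo and the helper compareAll are made up for this example; the four string declarations are copied from the question.

```java
import java.util.Arrays;

public class PoolComparisonDemo {
    // Returns the six comparisons in the same order as the question prints them.
    static boolean[] compareAll() {
        String s1 = new String("123");   // new heap instance
        String s2 = s1.intern();         // pooled reference
        String s3 = "123";               // pooled reference
        String s4 = String.valueOf(123); // new heap instance
        return new boolean[] {
            s1 == s2, s1 == s3, s1 == s4,
            s2 == s3, s2 == s4, s3 == s4
        };
    }

    public static void main(String[] args) {
        // prints [false, false, false, true, false, false]
        System.out.println(Arrays.toString(compareAll()));
    }
}
```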

 

Question 7:

String s1 = "123";
String s2 = "12" + "3";
System.out.println(s1 == s2);

The answer first: the output is true. Now change the code slightly:

String s1 = "123";
String s2 = "12" + String.valueOf(3);
System.out.println(s1 == s2);

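This time the output becomes false. If that is puzzling, bring out the big gun and look at the bytecode with javap.

For the "12" + "3" version, the bytecode is: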

{
  public SimpleTest();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=3, args_size=1
         0: ldc           #2                  // String 123
         2: astore_1
         3: ldc           #2                  // String 123
         5: astore_2
         6: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
         9: aload_1
        10: aload_2
        11: if_acmpne     18
        14: iconst_1
        15: goto          19
        18: iconst_0
        19: invokevirtual #4                  // Method java/io/PrintStream.println:(Z)V
        22: return
      LineNumberTable:
        line 4: 0
        line 5: 3
        line 6: 6
        line 7: 22
      StackMapTable: number_of_entries = 2
        frame_type = 255 /* full_frame */
          offset_delta = 18
          locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String ]
          stack = [ class java/io/PrintStream ]
        frame_type = 255 /* full_frame */
          offset_delta = 0
          locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String ]
          stack = [ class java/io/PrintStream, int ]
}

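At bytecode offsets 0 and 3, both ldc instructions carry the comment // String 123. This shows that "12" + "3" was folded into "123" at compile time, so s1 and s2 are both effectively s = "123" and the output is true.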
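The folding applies to any compile-time constant expression, not just two adjacent literals. The sketch below (class and method names are hypothetical) contrasts a literal concatenation, a concatenation through a final local initialized with a literal, and one through an ordinary local variable. Only the last is evaluated at run time.

```java
public class ConstantFoldingDemo {
    // Both operands are literals: folded to "123" by the compiler.
    static boolean literalConcat() {
        String s1 = "123";
        String s2 = "12" + "3";
        return s1 == s2;
    }

    // A final local initialized with a literal is a constant variable,
    // so the expression is still a compile-time constant.
    static boolean finalLocalConcat() {
        final String prefix = "12";
        String s1 = "123";
        String s2 = prefix + "3";
        return s1 == s2;
    }

    // A non-final local is not a constant: concatenation happens at run time
    // and produces a new String instance.
    static boolean nonFinalLocalConcat() {
        String prefix = "12";
        String s1 = "123";
        String s2 = prefix + "3";
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(literalConcat());       // true
        System.out.println(finalLocalConcat());    // true
        System.out.println(nonFinalLocalConcat()); // false
    }
}
```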

For the "12" + String.valueOf(3) version, the bytecode is:

 public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=3, args_size=1
         0: ldc           #2                  // String 123
         2: astore_1
         3: new           #3                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
        10: ldc           #5                  // String 12
        12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: iconst_3
        16: invokestatic  #7                  // Method java/lang/String.valueOf:(I)Ljava/lang/String;
        19: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        22: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        25: astore_2
        26: getstatic     #9                  // Field java/lang/System.out:Ljava/io/PrintStream;
        29: aload_1
        30: aload_2
        31: if_acmpne     38
        34: iconst_1
        35: goto          39
        38: iconst_0
        39: invokevirtual #10                 // Method java/io/PrintStream.println:(Z)V
        42: return
      LineNumberTable:
        line 4: 0
        line 5: 3
        line 6: 26
        line 7: 42
      StackMapTable: number_of_entries = 2
        frame_type = 255 /* full_frame */
          offset_delta = 38
          locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String ]
          stack = [ class java/io/PrintStream ]
        frame_type = 255 /* full_frame */
          offset_delta = 0
          locals = [ class "[Ljava/lang/String;", class java/lang/String, class java/lang/String ]
          stack = [ class java/io/PrintStream, int ]
}

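At offsets 0 and 10, one ldc loads // String 123 and the other loads // String 12, so the two constants already differ. After that a StringBuilder shows up together with its append method: "12" is appended first, then the result of String.valueOf(3). The net effect is similar to: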

StringBuilder sb = new StringBuilder("12");
sb.append("3");
s2 = sb.toString();

Then look at StringBuilder's toString method:

    @Override
    @HotSpotIntrinsicCandidate
    public String toString() {
        // Create a copy, don't share the array
        return isLatin1() ? StringLatin1.newString(value, 0, count)
                          : StringUTF16.newString(value, 0, count);
    }

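In the end newString creates a new instance. So back to the question: s2 is a new string instance, and the output is false. The bytecode also reveals another point:

Why is s2 += "abc" discouraged, especially inside loops? In the compiler output shown above it is implemented with StringBuilder, so each execution not only creates a StringBuilder instance but also has StringBuilder.toString() produce a new string instance.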
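As an illustration of the usual alternative, the sketch below builds the same string two ways: with += in a loop, and with one StringBuilder that is reused across iterations. The class and method names are invented for this example, and no timing claims are made here; the point is only that the second version allocates a single builder and calls toString once.

```java
public class LoopConcatDemo {
    // Each iteration creates a new String (and, in the javac output shown
    // above, a temporary StringBuilder as well).
    static String joinWithPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One builder for the whole loop, one String created at the end.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithPlus(5));    // 01234
        System.out.println(joinWithBuilder(5)); // 01234
    }
}
```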

 

Question 8:

This one is a bit of a troll question. Honestly, you may never run into this requirement in real development.

String s1 = "123";
System.out.println(s1 + "/" + s1.hashCode());
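// Insert some code here. Requirement: the content of s1 becomes "1234", but the reference held by s1 must not change!
// s1 = s1.concat("4"); // This obviously does not work: it creates a new string instance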
System.out.println(s1 + "/" + s1.hashCode());

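Everyone knows that String is a final class and immutable: it cannot be modified, and changing the content necessarily produces a new string instance (that is, the reference address changes). You can check the String source (JDK 8), where the characters are held in a private final char[] field named value.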
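Before reaching for tricks, it helps to see the normal behavior the question is fighting against. In this sketch (class and method names are hypothetical), concat returns a different object and the original string is left untouched.

```java
public class ImmutabilityDemo {
    // concat never modifies the receiver; it returns a new String.
    static boolean concatReturnsSameReference() {
        String original = "123";
        String changed = original.concat("4");
        return original == changed;
    }

    // The original content is unchanged after concat.
    static String originalAfterConcat() {
        String original = "123";
        original.concat("4"); // result deliberately discarded
        return original;
    }

    public static void main(String[] args) {
        System.out.println(concatReturnsSameReference()); // false
        System.out.println(originalAfterConcat());        // 123
    }
}
```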

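In newer versions such as JDK 11, the type of the value field has changed to byte[].

This question is really about reflection. On JDK 8, the following approach works: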

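// Needs: import java.lang.reflect.Field;
// The enclosing method must declare or handle the reflective exceptions.
String s1 = "123";
System.out.println(s1 + "/" + s1.hashCode());
Field value = s1.getClass().getDeclaredField("value");
value.setAccessible(true);
// Swap the backing array in place; the s1 reference itself is untouched.
value.set(s1, "1234".toCharArray());
// Note: the hash was cached by the first hashCode() call, so the printed hash code does not change.
System.out.println(s1 + "/" + s1.hashCode());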

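On JDK 11, change it to the following (the bytes of "1234" are plain ASCII, so they match the single-byte encoding the string already uses):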

String s1 = "123";
System.out.println(s1 + "/" + s1.hashCode());
Field value = s1.getClass().getDeclaredField("value");
value.setAccessible(true);
value.set(s1, "1234".getBytes());
System.out.println(s1 + "/" + s1.hashCode());

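By default this may print a pile of warnings, but the program still runs to completion.

If the warnings are too noisy, you can add a JVM option at run time:

Run it again and only one warning line remains.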

posted @ 2022-05-22 09:10  菩提树下的杨过