Java offers two ways to create a string object. One is the literal form, such as String str = "droid";. The other uses new, the standard way of constructing objects, such as String str = new String("droid");. Both appear often in code, the literal form especially. The two differ in performance and memory usage, however, because the JVM maintains a special memory area to reduce the repeated creation of string objects. This area is called the string constant pool or string literal pool.
Working principle
When code creates a string object from a literal, the JVM first checks the string constant pool. If the pool already holds a reference to a string object with the same content, that reference is returned. Otherwise a new string object is created, its reference is put into the string constant pool, and the reference is returned.
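The check-then-add behaviour described above can be modelled with an ordinary hash map. This is only a toy sketch: ToyStringPool and its methods are hypothetical names invented for illustration, and the real pool is implemented inside the JVM, not in Java code like this.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup described above; NOT the JVM's real implementation.
// The class and method names are hypothetical.
class ToyStringPool {
    // Maps string content to the one canonical reference kept for that content.
    private final Map<String, String> pool = new HashMap<String, String>();

    // Returns the pooled reference for this content, registering it on first sight.
    public String lookupOrAdd(String candidate) {
        String existing = pool.get(candidate);
        if (existing != null) {
            return existing;            // same content already pooled: reuse that reference
        }
        pool.put(candidate, candidate); // first occurrence: remember this reference
        return candidate;
    }

    public int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ToyStringPool pool = new ToyStringPool();
        // new String(...) forces two distinct objects with equal content.
        String first = pool.lookupOrAdd(new String("droid"));
        String second = pool.lookupOrAdd(new String("droid"));
        System.out.println(first == second); // true
        System.out.println(pool.size());     // 1
    }
}
```

The second call finds an entry with equal content and hands back the first object, which mirrors how a second identical literal reuses the pooled reference.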
Give an example
Literal creation form
String str1 = "droid";
The JVM checks this literal. Assume that no string object with the content droid exists yet. The JVM finds no such object through the string constant pool, so it creates the string object, puts a reference to the new object into the pool, and returns that reference to the variable str1.
Suppose this code comes next:
String str2 = "droid";
Again the JVM checks the literal. It searches the string constant pool, finds that a string object with the content "droid" already exists, and returns the reference to that existing object to the variable str2. Note that no new string object is created here.
To verify that str1 and str2 point to the same object, we can use this code:
System.out.println(str1 == str2);
The result is true.
Create using new
String str3 = new String("droid");
When we use new to construct a string object, a new string object is created regardless of whether the string constant pool already holds a reference to an object with the same content. We can test this with the following code:
String str3 = new String("droid");
System.out.println(str1 == str3);
The result is false, as expected, which shows that the two variables point to different objects.
intern
To add a reference to a string object created with new to the string constant pool, use the intern method.
A call to intern first checks whether the string constant pool already holds a reference to a string with the same content. If it does, that reference is returned. Otherwise a reference to this object is added to the pool and returned.
String str4 = str3.intern();
System.out.println(str4 == str1);
The output is true.
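The snippets above can be gathered into one runnable class. This is a minimal sketch: the class name StringPoolDemo and its helper method names are made up for illustration, while the variable names follow the examples above.

```java
// Collects the literal, new, and intern examples into one runnable class.
// The class and method names are hypothetical; the variables match the text.
class StringPoolDemo {
    // Two literals with the same content resolve to one pooled object.
    public static boolean literalsShareObject() {
        String str1 = "droid";
        String str2 = "droid";
        return str1 == str2;
    }

    // new always builds a separate object, even though the content is equal.
    public static boolean newCreatesSeparateObject() {
        String str1 = "droid";
        String str3 = new String("droid");
        return str1 != str3 && str1.equals(str3);
    }

    // intern returns the reference already held by the pool.
    public static boolean internReturnsPooledReference() {
        String str1 = "droid";
        String str3 = new String("droid");
        String str4 = str3.intern();
        return str4 == str1;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());          // true
        System.out.println(newCreatesSeparateObject());     // true
        System.out.println(internReturnsPooledReference()); // true
    }
}
```

Note that == compares references here on purpose; to compare content, ordinary code should use equals.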
Difficult questions
Prerequisite?
The string constant pool can only work because String objects in Java are immutable, which makes it safe for multiple variables to share the same object. If String objects were mutable and an operation through one reference changed the object's value, every other variable sharing that object would be affected as well, which is clearly unacceptable.
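A small sketch can make this concrete. The class name ImmutableSharingDemo and its method are hypothetical. It shows that an operation that appears to change a string actually produces a new object, so a second variable sharing the pooled object is unaffected.

```java
// Shows that "changing" a string yields a new object and leaves the shared one intact.
// The class and method names are hypothetical.
class ImmutableSharingDemo {
    // Returns what 'second' refers to after 'first' has been "modified".
    public static String sharedValueAfterChange() {
        String first = "droid";
        String second = "droid";   // same pooled object as first
        first = first.concat("!"); // builds a new String; the pooled "droid" is untouched
        return second;
    }

    public static void main(String[] args) {
        System.out.println(sharedValueAfterChange()); // droid
    }
}
```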
Reference or object?
The most common question is whether the string constant pool stores references or objects. It stores object references, not the objects themselves. In Java, objects are created in heap memory.
Update: many of the comments I received discuss this question, so I ran a simple verification. Verification environment:
22:18:54-androidyue~/Videos$ cat /etc/os-release
NAME=Fedora
VERSION="17 (Beefy Miracle)"
ID=fedora
VERSION_ID=17
PRETTY_NAME="Fedora 17 (Beefy Miracle)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:17"
22:19:04-androidyue~/Videos$ java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (fedora-2.3.12.1.fc17-x86_64)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
Verification idea: the following Java program reads an 82M video file into a string and calls intern on it.
22:01:17-androidyue~/Videos$ ll -lh | grep why_to_learn.mp4
-rw-rw-r--. 1 androidyue androidyue 82M Oct 20 2013 why_to_learn.mp4
Verification code
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class TestMain {
    private static String fileContent;

    public static void main(String[] args) {
        fileContent = readFileToString(args[0]);
        if (null != fileContent) {
            fileContent = fileContent.intern();
            System.out.println("Not Null");
        }
    }

    private static String readFileToString(String file) {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(file));
            StringBuffer buff = new StringBuffer();
            String line;
            while ((line = reader.readLine()) != null) {
                buff.append(line);
            }
            return buff.toString();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != reader) {
                try {
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return null;
    }
}
Before Java 8, HotSpot had a permanent generation, and in Java 6 and earlier the string constant pool lived there. Java 7 moved interned strings to the main heap, so the Java 7 run shown here is only a weak test. We tried to verify the question by setting the permanent generation to a very small value: if the string objects themselves were stored in the string constant pool, a java.lang.OutOfMemoryError: PermGen space error would be thrown.
java -XX:PermSize=6m TestMain ~/Videos/why_to_learn.mp4
Running the verification program did not throw an OOM error. In fact, this does not prove very well whether objects or references are stored, not least because -XX:PermSize only sets the initial size of the permanent generation; the upper limit is set with -XX:MaxPermSize.
But it at least suggests that the string's actual content, the char[], is not stored in the string constant pool. In that case it matters little whether the pool stores string objects or references to string objects. Personally, I still lean toward references.
Advantages and Disadvantages
The advantage of the string constant pool is to reduce the creation of strings with the same content and save memory space.
If we must name a disadvantage, it is that CPU time is traded for space. That CPU time is spent mainly on looking up whether the string constant pool already holds a reference to an object with the same content. However, the pool is implemented internally as a hash table, so the lookup cost is low.
GC recycling?
Because the string constant pool holds references to shared string objects, does this mean that these objects cannot be recycled?
First of all, the shared objects in question are generally quite small. As far as I know, this problem did exist in earlier versions, but with the introduction of weak references it should no longer occur.
For more on this issue, see the article "interned Strings: Java Glossary".
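To illustrate how weak references let a pool share strings without pinning them in memory, here is a minimal sketch of a user-level interner built on WeakHashMap. The class WeakInterner is hypothetical and does not reproduce the JVM's internal table. It only shows the general idea: the pool holds its entries weakly, so a string that nothing else references can be garbage collected.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical user-level interner; a sketch of the idea, not the JVM's implementation.
class WeakInterner {
    // Keys are held weakly by WeakHashMap. The value is also a WeakReference so that
    // the map's value does not keep its own key strongly reachable.
    private final Map<String, WeakReference<String>> pool =
            new WeakHashMap<String, WeakReference<String>>();

    // Returns the canonical instance for this content, registering s if none is alive.
    public synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref != null) ? ref.get() : null;
        if (canonical != null) {
            return canonical;                   // an equal string is still alive: share it
        }
        pool.put(s, new WeakReference<String>(s)); // s becomes the canonical instance
        return s;
    }

    public static void main(String[] args) {
        WeakInterner interner = new WeakInterner();
        String a = new String("droid");
        String b = new String("droid");
        System.out.println(interner.intern(a) == a); // true
        System.out.println(interner.intern(b) == a); // true
    }
}
```

When the collector actually reclaims an unreferenced entry depends on the JVM, so the sketch demonstrates only the sharing behaviour, not the collection itself.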
intern use?
Use intern only when you know you really need it. For example, suppose we have millions of records, and one field holds the value California, USA in many of them. We do not want to create millions of identical string objects, so we can use intern to keep only one copy in memory. For a deeper understanding of intern, see "In-depth Analysis of String#intern".
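The record scenario can be sketched as follows. The class InternRecordsDemo, its methods, and the record count are all assumptions made for illustration. The new String(...) call stands in for a value parsed from input, since each parsed record would normally produce its own string object.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: compares how many distinct String objects survive with and without intern.
class InternRecordsDemo {
    // Counts distinct objects by identity (==), not by equals.
    public static int distinctObjects(List<String> values) {
        Map<String, Boolean> seen = new IdentityHashMap<String, Boolean>();
        for (String value : values) {
            seen.put(value, Boolean.TRUE);
        }
        return seen.size();
    }

    // Simulates loading 'count' records whose region field has the same content.
    public static List<String> loadRegions(int count, boolean useIntern) {
        List<String> regions = new ArrayList<String>(count);
        for (int i = 0; i < count; i++) {
            String region = new String("California, USA"); // stands in for a parsed value
            regions.add(useIntern ? region.intern() : region);
        }
        return regions;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(loadRegions(1000, false))); // 1000
        System.out.println(distinctObjects(loadRegions(1000, true)));  // 1
    }
}
```

With intern, the temporary objects become unreachable right after each iteration, so only the single pooled copy stays referenced by the list.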
Are there always exceptions?
Do you know how many string objects the following code creates, and how many references it saves in the string constant pool?
String test = "a" + "b" + "c";
The answer is that only one object is created and only one reference is saved in the constant pool. We can confirm this by disassembling the class with javap.
17:02 $ javap -c TestInternedPoolGC
Compiled from "TestInternedPoolGC.java"
public class TestInternedPoolGC extends java.lang.Object{
public TestInternedPoolGC();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]) throws java.lang.Exception;
Code:
0: ldc #2; //String abc
2: astore_1
3: return
As you can see, the three literals were already merged into one at compile time. This optimization avoids creating redundant string objects and sidesteps the cost of string concatenation. For more on string concatenation, see "Java details: String splicing".
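The same effect can be observed at run time without javap. In this minimal sketch (the class ConstantFoldingDemo and its method names are hypothetical), a concatenation of literals is a compile-time constant and therefore resolves to the pooled "abc", while a concatenation involving a non-final variable is evaluated at run time and produces a new object.

```java
// Hypothetical demo contrasting compile-time constant folding with run-time concatenation.
class ConstantFoldingDemo {
    // "a" + "b" + "c" is a constant expression, so the compiler emits the single literal "abc".
    public static boolean foldedAtCompileTime() {
        String test = "a" + "b" + "c";
        return test == "abc";
    }

    // ab is not final, so ab + "c" is computed at run time and yields a new String object.
    public static boolean concatenatedAtRuntime() {
        String ab = "ab";
        String test = ab + "c";
        return test == "abc";
    }

    // A final variable initialized with a literal counts as a constant, so folding applies again.
    public static boolean finalVariableIsConstant() {
        final String ab = "ab";
        String test = ab + "c";
        return test == "abc";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());     // true
        System.out.println(concatenatedAtRuntime());   // false
        System.out.println(finalVariableIsConstant()); // true
    }
}
```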