Analysis of Java Serialization Algorithm
Serialization is the process of describing an object as a series of bytes; deserialization is the process of reconstructing an object from those bytes. The Java Serialization API provides a standard mechanism for both. This article covers how to serialize an object, when serialization is needed, and how the Java serialization algorithm works, using an example to show how the serialized bytes describe an object.
The need for serialization
In Java, almost everything is an object. In a distributed environment, objects often need to be transferred from one machine or process to another. This requires a format that both ends agree on, so that the sender can write an object as bytes and the receiver can rebuild it. The Java serialization mechanism was created to solve this problem.
How to serialize an object
To be serializable, an object's class must implement the Serializable interface. Serializable declares no methods; it is a marker interface. Only classes that carry this marker can be processed by the serialization mechanism.
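As a minimal sketch of what this looks like in practice, the following writes an object to temp.out and reads it back. The two byte fields and the file name come from this article; the class names TestSerial and SerializeDemo and the helper methods save and load are assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class name; the two byte fields are the ones discussed in the article.
class TestSerial implements Serializable {
    public byte version = 100;
    public byte count = 0;
}

public class SerializeDemo {
    // Writes one object to the given file using default serialization.
    static void save(Object obj, String path) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(path))) {
            oos.writeObject(obj);
        }
    }

    // Reads one object back from the given file.
    static Object load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(path))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        save(new TestSerial(), "temp.out");
        TestSerial restored = (TestSerial) load("temp.out");
        System.out.println("version=" + restored.version); // prints "version=100"
    }
}
```

The try-with-resources blocks close the streams even if writing or reading fails, which also flushes any buffered output to the file.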
The serialization format of the object
What does an object look like after it has been serialized? Open the temp.out file that the object was just written to and display it in hexadecimal. The content should look like this:
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64
The object itself has only two fields:
public byte version = 100;
public byte count = 0;
Both are of type byte. In theory, 2 bytes are enough to store these two fields, yet temp.out occupies 51 bytes. Besides the field data, the file also contains a description of the serialized object's class.
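To inspect the bytes without an external hex viewer, a sketch like the following serializes the object in memory and prints it 16 bytes per line. The names ByteHolder, HexDumpDemo, toBytes and toHex are hypothetical; only the two byte fields are taken from the article.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class with the two byte fields from the article.
class ByteHolder implements Serializable {
    public byte version = 100;
    public byte count = 0;
}

public class HexDumpDemo {
    // Serializes an object into a byte array instead of a file.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Formats bytes as upper-case hex pairs, 16 per line.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
            sb.append((i % 16 == 15 || i == data.length - 1) ? "\n" : " ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new ByteHolder());
        System.out.print(toHex(data));
        System.out.println(data.length + " bytes in total");
    }
}
```

For a class with exactly these two byte fields, the stream should come to 41 bytes plus the length of the class name, so a 10-character class name accounts for the 51 bytes mentioned above. The serialVersionUID bytes will differ from the dump in this article unless the class is identical to the one the author serialized.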
Java serialization algorithm
The serialization algorithm generally performs the following steps:
◆Output the metadata of the class that the object is an instance of.
◆Recursively output the description of each superclass until there are no more serializable superclasses.
◆Once the class metadata is complete, output the actual data values of the instance, starting from the topmost superclass.
◆Recursively output the instance data from top to bottom.
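The ordering described in these steps can be explored with java.io.ObjectStreamClass, which exposes the class description that serialization uses. The sketch below only reports the order in which descriptions and instance data would appear; it does not write any bytes. The classes Base and Derived and the class DescriptorWalk are hypothetical names chosen for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical two-level hierarchy used only for illustration.
class Base implements Serializable {
    int baseValue = 1;
}

class Derived extends Base {
    int derivedValue = 2;
}

public class DescriptorWalk {
    // Order of class descriptions: the object's own class first,
    // then each serializable superclass.
    static List<String> descriptionOrder(Class<?> cls) {
        List<String> order = new ArrayList<>();
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        while (desc != null) {
            order.add(desc.getName());
            Class<?> parent = desc.forClass().getSuperclass();
            // lookup returns null for classes that are not serializable,
            // such as java.lang.Object, which ends the walk.
            desc = (parent == null) ? null : ObjectStreamClass.lookup(parent);
        }
        return order;
    }

    // Order of instance data: topmost serializable superclass first.
    static List<String> dataOrder(Class<?> cls) {
        List<String> order = descriptionOrder(cls);
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println("descriptions:  " + descriptionOrder(Derived.class));
        System.out.println("instance data: " + dataOrder(Derived.class));
    }
}
```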
Let’s illustrate this with another example that covers these cases more completely:
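import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Superclass; its field name and value (10) match the serialized output below.
class parent implements Serializable {
    int parentVersion = 10;
}

// Class of the object referenced by SerialTest.con.
class contain implements Serializable {
    int containVersion = 11;
}

public class SerialTest extends parent implements Serializable {
    int version = 66;
    contain con = new contain();

    public int getVersion() {
        return version;
    }

    public static void main(String args[]) throws IOException {
        FileOutputStream fos = new FileOutputStream("temp.out");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        SerialTest st = new SerialTest();
        oos.writeObject(st); // writes the class descriptions, then the field values
        oos.flush();
        oos.close();
    }
}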
The serialized format is as follows:
AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B
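Let's take a closer look at what these bytes represent. The stream begins as follows:
◆AC ED: STREAM_MAGIC, which identifies the serialization protocol.
◆00 05: STREAM_VERSION, the version of the protocol.
◆73: TC_OBJECT, which marks the start of a new object.
The first step in the serialization algorithm is to output a description of the object's class. The object in the example is an instance of SerialTest, so the description of the SerialTest class comes next:
◆72: TC_CLASSDESC, which marks the start of a new class description.
◆00 0A: the length of the class name (10).
◆53 65 72 69 61 6C 54 65 73 74: the class name, SerialTest.
◆05 52 81 5A AC 66 02 F6: the serialVersionUID of the class.
◆02: flags; 0x02 (SC_SERIALIZABLE) means the class supports serialization.
◆00 02: the number of fields in the class.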
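The start of the dump can also be decoded by hand with a DataInputStream, as in the sketch below. It reads only the first 29 bytes of the SerialTest dump shown above (stream header plus the fixed part of the first class description) and checks them against the constants in java.io.ObjectStreamConstants. The class HeaderDecoder and its methods are hypothetical helpers, not part of the serialization API.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

public class HeaderDecoder {
    // First 29 bytes of the SerialTest dump shown above.
    static final String HEAD =
        "AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65 "
        + "73 74 05 52 81 5A AC 66 02 F6 02 00 02";

    // Converts space-separated hex pairs into a byte array.
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Reads the stream header and the fixed part of the first class description.
    static String describe(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            if (in.readShort() != ObjectStreamConstants.STREAM_MAGIC
                    || in.readShort() != ObjectStreamConstants.STREAM_VERSION) {
                throw new IllegalArgumentException("not a serialization stream");
            }
            if (in.readByte() != ObjectStreamConstants.TC_OBJECT
                    || in.readByte() != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IllegalArgumentException("expected a new object and class description");
            }
            String className = in.readUTF(); // 2-byte length followed by the name
            long suid = in.readLong();       // serialVersionUID
            int flags = in.readByte();
            int fieldCount = in.readShort();
            boolean serializable = (flags & ObjectStreamConstants.SC_SERIALIZABLE) != 0;
            return String.format("class=%s suid=%016X serializable=%b fields=%d",
                    className, suid, serializable, fieldCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "class=SerialTest suid=0552815AAC6602F6 serializable=true fields=2"
        System.out.println(describe(parseHex(HEAD)));
    }
}
```

readUTF fits here because class names in the stream are stored as a 2-byte length followed by the characters, which is the layout that method expects.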
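Next, the algorithm outputs the description of the first field, int version = 66:
◆49: the field type code; 0x49 is the character I, which stands for int.
◆00 07: the length of the field name (7).
◆76 65 72 73 69 6F 6E: the field name, version.
Then the algorithm outputs the next field, contain con = new contain(). This one is special because it is an object. Object-typed fields are described using the JVM's standard type signature notation:
◆4C: the field type code; 0x4C is the character L, which stands for an object.
◆00 03: the length of the field name (3).
◆63 6F 6E: the field name, con.
◆74: TC_STRING, which marks a new string.
◆00 09: the length of the string (9).
◆4C 63 6F 6E 74 61 69 6E 3B: the type signature, Lcontain;.
◆78: TC_ENDBLOCKDATA, which ends the description of the SerialTest class itself.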
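The type codes and signatures that serialization records for each field can be listed with java.io.ObjectStreamField. In the sketch below, contain is the class from this article, while Holder (a stand-in with the same two fields as SerialTest) and FieldSignatures are hypothetical names.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

class contain implements Serializable {
    int containVersion = 11;
}

// Hypothetical holder with the same two fields as SerialTest in the article.
class Holder implements Serializable {
    int version = 66;
    contain con = new contain();
}

public class FieldSignatures {
    // Lists each serializable field as "<type code> <name> [<type signature>]".
    static String describeFields(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            throw new IllegalArgumentException(cls + " is not serializable");
        }
        StringBuilder sb = new StringBuilder();
        for (ObjectStreamField field : desc.getFields()) {
            sb.append(field.getTypeCode()).append(' ').append(field.getName());
            if (!field.isPrimitive()) {
                // Only object and array fields carry a signature string.
                sb.append(' ').append(field.getTypeString());
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describeFields(Holder.class));
    }
}
```

Primitive fields are fully identified by their one-character type code, which is why only the object field needs the extra signature string in the stream.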
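Then the algorithm outputs the description of the superclass, parent:
◆72: TC_CLASSDESC, which marks the start of a new class description.
◆00 06: the length of the class name (6).
◆70 61 72 65 6E 74: the class name, parent.
◆0E DB D2 BD 85 EE 63 7A: the serialVersionUID of the class.
◆02: flags; the class supports serialization.
◆00 01: the number of fields in the class.
Next comes the description of the parent class's only field, int parentVersion = 10:
◆49: the field type code; I stands for int.
◆00 0D: the length of the field name (13).
◆70 61 72 65 6E 74 56 65 72 73 69 6F 6E: the field name, parentVersion.
◆78: TC_ENDBLOCKDATA, which ends the description of the parent class itself.
◆70: TC_NULL, which means there are no more superclasses to describe.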
At this point the algorithm has output the descriptions of all the classes involved. The next step is to output the actual values of the instance, starting with the fields of the parent class:
◆00 00 00 0A: 10, the value of parentVersion.
Then come the fields of the SerialTest class:
◆00 00 00 42: 66, the value of version.
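The bytes that follow are more interesting. The algorithm now needs to write the contain object referenced by the con field. The contain class has not been described yet, so its description comes first:
◆73: TC_OBJECT, which marks the start of a new object.
◆72: TC_CLASSDESC, which marks the start of a new class description.
◆00 07: the length of the class name (7).
◆63 6F 6E 74 61 69 6E: the class name, contain.
◆FC BB E6 0E FB CB 60 C7: the serialVersionUID of the class.
◆02: flags; the class supports serialization.
◆00 01: the number of fields in the class.
Next comes the description of contain's only field, int containVersion = 11:
◆49: the field type code; I stands for int.
◆00 0E: the length of the field name (14).
◆63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: the field name, containVersion.
◆78: TC_ENDBLOCKDATA, which ends the description of the contain class itself.
The serialization algorithm then checks whether contain has a serializable superclass and, if so, outputs its description. Here it has none:
◆70: TC_NULL.
Finally, the actual field value of the contain object is output:
◆00 00 00 0B: 11, the value of containVersion.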
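As a final check, the sketch below serializes a SerialTest object in memory and reads it back. The field values are changed before writing so that the restored values can only have come from the stream, not from the field initializers. The classes parent, contain and SerialTest follow the article (parent is assumed to hold a single int parentVersion, as the dump indicates); RoundTripCheck and its helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class parent implements Serializable {
    int parentVersion = 10;
}

class contain implements Serializable {
    int containVersion = 11;
}

class SerialTest extends parent implements Serializable {
    int version = 66;
    contain con = new contain();
}

public class RoundTripCheck {
    // Serializes an object into a byte array.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bos.toByteArray();
    }

    // Rebuilds an object from a byte array produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerialTest original = new SerialTest();
        original.parentVersion = 20;      // superclass field
        original.version = 77;            // own field
        original.con.containVersion = 33; // field of the referenced object

        SerialTest copy = (SerialTest) fromBytes(toBytes(original));
        System.out.println(copy.parentVersion + " " + copy.version + " "
                + copy.con.containVersion); // prints "20 77 33"
    }
}
```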
That covers the mechanism and principles of Java serialization. We hope it is helpful to readers.