Not long ago, I interviewed some candidates for a senior Java development engineer position. I often ask, "Can you tell me a bit about weak references in Java?" If the candidate says, "Well, do they have something to do with garbage collection?", I am basically satisfied; I don't expect an answer that reads like a paper getting to the bottom of things.
However, contrary to my expectations, among nearly 20 candidates with an average of 5 years of development experience and strong educational backgrounds, only two knew that weak references existed, and only one of those two really understood them. During the interviews I also tried dropping hints to see whether anyone would suddenly say "Oh, so that's what it is", but the results were very disappointing. I began to wonder why this piece of knowledge is so neglected. After all, weak references are a very useful feature, and they were introduced with Java 1.2, released 7 years ago.
Well, I don't expect you to become an expert on weak references after reading this article, but I think you should at least understand what weak references are, how to use them, and in which scenarios they are useful. Since these concepts seem to be little known, I will briefly go through these three questions.
Strong Reference
A strong reference is the ordinary reference we use every day. It is written as follows:
StringBuffer buffer = new StringBuffer();
The code above creates a StringBuffer object and stores a (strong) reference to it in the variable buffer. Yes, this is kindergarten stuff (please forgive me for saying so). What matters about a strong reference, and what makes it "strong", is how it interacts with the garbage collector. Specifically, if an object is reachable through a chain of strong references (strongly reachable), it will not be collected. This is exactly what you want as long as you are still working with the object.
When strong references are too strong
It is not uncommon for a program to use classes that it cannot extend. A class may simply be marked final, or the situation may be more complicated, such as a factory method that returns an interface backed by an unknown number of concrete implementations. For example, suppose we want to use a class called Widget, but this class cannot be subclassed, so we cannot add new functionality to it.
But what should we do if we want to track additional information about Widget objects? Suppose we need to record a serial number for each object, but the Widget class has no such field, and because it cannot be extended we cannot add one. This is actually no problem at all: a HashMap solves it.
serialNumberMap.put(widget, widgetSerialNumber);
This may look fine on the surface, but the strong reference to the widget object will almost certainly cause problems. We have to know with certainty when a particular widget's serial number is no longer needed so that we can remove its entry from the map. If we fail to remove it, we have a memory leak; if we remove the entry of a widget that is still in use, we lose valid data. These problems should sound familiar: they are exactly what programmers face when managing memory in languages without garbage collection. In a garbage-collected language like Java, we are not supposed to have to worry about them.
Another common problem with strong references shows up in caching, especially for large objects such as images. Suppose you have a program that needs to process images provided by users. A common approach is to cache the image data, because loading images from disk is very expensive, and because we also want to avoid having two copies of the same image data in memory at the same time.
The purpose of a cache is to keep us from reloading files unnecessarily. But you will quickly find that the cache always holds a reference to the image data in memory. Using strong references forces the image data to stay in memory, which requires you to decide when the image data is no longer needed and to remove it from the cache manually so that the garbage collector can reclaim it. So you are once again forced to do the garbage collector's job and decide by hand which objects to clean up.
Weak Reference
A weak reference is, simply put, a reference that is not strong enough to force an object to stay in memory. With a WeakReference, you let the garbage collector decide when the referenced object is no longer reachable and can be removed from memory. Create a weak reference as follows:
WeakReference<Widget> weakWidget = new WeakReference<Widget>(widget);
You can get the actual Widget object by calling weakWidget.get(). Because a weak reference cannot prevent the garbage collector from reclaiming the object it points to, you may find that get suddenly starts returning null once no strong references to the widget remain.
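As a minimal sketch of how get() is typically used, the example below copies the result into a local variable and checks it for null before use. The Widget class here is an empty stand-in and describe is a hypothetical helper, neither comes from a real library. Because garbage collection timing cannot be predicted, the sketch calls clear() to empty the reference by hand instead of waiting for the collector.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    // Empty stand-in for the article's Widget class (hypothetical).
    static final class Widget {}

    // Safe access pattern: call get() once, hold the result in a local
    // (strong) variable, and check it for null before using it.
    static String describe(WeakReference<Widget> weakWidget) {
        Widget widget = weakWidget.get();
        return (widget != null) ? "still available" : "already cleared";
    }

    public static void main(String[] args) {
        Widget widget = new Widget();
        WeakReference<Widget> weakWidget = new WeakReference<>(widget);

        // 'widget' is a strong reference, so the object cannot be collected yet.
        System.out.println(describe(weakWidget));

        // clear() empties the reference by hand. The collector does the same
        // at a time of its own choosing once no strong references remain.
        weakWidget.clear();
        System.out.println(describe(weakWidget));

        // Last use of the strong reference.
        System.out.println("widget in use: " + widget);
    }
}
```

Calling get() twice (once for the null check, once for the actual use) would be a bug, because the object could be collected between the two calls.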
The easiest way to solve the widget serial number problem above is to use Java's built-in WeakHashMap class. WeakHashMap works almost exactly like HashMap; the difference is that its keys (not its values!) are held through WeakReference. When a WeakHashMap key becomes garbage, the entry for that key is removed automatically. This avoids the problem of manually removing entries for Widget objects that are no longer needed. Because WeakHashMap implements Map, code written against the Map interface can switch from HashMap simply by changing the constructor call.
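A rough sketch of the serial number map built on WeakHashMap might look like this. SerialNumberRegistry, register and serialNumberOf are names invented for illustration, and Widget is again an empty stand-in class.

```java
import java.util.Map;
import java.util.WeakHashMap;

class SerialNumberRegistry {
    // Empty stand-in for the article's Widget class (hypothetical).
    static final class Widget {}

    // Keys are held weakly: once a Widget is no longer strongly reachable
    // elsewhere, its entry becomes eligible for automatic removal.
    private final Map<Widget, Integer> serialNumberMap = new WeakHashMap<>();
    private int nextSerialNumber = 1;

    // Assigns a serial number on first sight and returns the same one afterwards.
    int register(Widget widget) {
        Integer existing = serialNumberMap.get(widget);
        if (existing != null) {
            return existing;
        }
        int serial = nextSerialNumber++;
        serialNumberMap.put(widget, serial);
        return serial;
    }

    // Returns the serial number, or null if the widget was never registered.
    Integer serialNumberOf(Widget widget) {
        return serialNumberMap.get(widget);
    }

    public static void main(String[] args) {
        SerialNumberRegistry registry = new SerialNumberRegistry();
        Widget widget = new Widget();
        System.out.println("serial: " + registry.register(widget));
        System.out.println("lookup: " + registry.serialNumberOf(widget));
    }
}
```

Note that the values of a WeakHashMap are held strongly, so a value that refers back to its own key would keep the entry alive forever.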
Reference Queue
Once a WeakReference starts returning null, the object it pointed to has become garbage, and the WeakReference object itself (not the object it pointed to) is of no further use. Usually some cleanup is needed at this point. For example, WeakHashMap removes such dead entries so that it does not accumulate an ever-growing number of meaningless weak references.
A reference queue makes it easy to keep track of dead references. If you pass a ReferenceQueue to the WeakReference constructor, the reference object is automatically added to the queue when the object it points to becomes garbage. You can then process the queue at regular intervals and perform whatever cleanup the dead references require.
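The cleanup loop described above could be sketched as follows. QueueCleanupDemo, track and drain are hypothetical names, and the labels map only stands in for whatever per-reference bookkeeping an application keeps. Since the collector enqueues references at unpredictable times, main calls enqueue() by hand to simulate what the collector would do.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class QueueCleanupDemo {
    // Empty stand-in for the article's Widget class (hypothetical).
    static final class Widget {}

    private final ReferenceQueue<Widget> queue = new ReferenceQueue<>();
    // Bookkeeping keyed by the reference object itself (identity-based).
    private final Map<Reference<? extends Widget>, String> labels = new HashMap<>();

    // Creates a weak reference registered with the queue and remembers a label for it.
    WeakReference<Widget> track(Widget widget, String label) {
        WeakReference<Widget> ref = new WeakReference<>(widget, queue);
        labels.put(ref, label);
        return ref;
    }

    // Polls the queue until it is empty and drops the bookkeeping of dead references.
    int drain() {
        int removed = 0;
        Reference<? extends Widget> dead;
        while ((dead = queue.poll()) != null) {
            labels.remove(dead);
            removed++;
        }
        return removed;
    }

    int trackedCount() {
        return labels.size();
    }

    public static void main(String[] args) {
        QueueCleanupDemo demo = new QueueCleanupDemo();
        Widget first = new Widget();
        Widget second = new Widget();
        WeakReference<Widget> firstRef = demo.track(first, "first");
        demo.track(second, "second");

        // Simulate the collector: put the reference on its queue by hand.
        firstRef.enqueue();

        System.out.println("removed: " + demo.drain());
        System.out.println("still tracked: " + demo.trackedCount());
        // Last use of the strong references.
        System.out.println(first + " " + second);
    }
}
```

drain() uses the non-blocking poll(), which suits periodic cleanup; a dedicated cleanup thread could instead block on remove().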
Four kinds of references
There are actually four kinds of references of different strength in Java. From strongest to weakest they are strong references, soft references, weak references and phantom references. The sections above covered strong and weak references; the following sections describe the remaining two, soft references and phantom references.
Soft Reference
A soft reference is basically the same as a weak reference, except that it is more reluctant to let go of the object it points to. An object that is only weakly reachable will typically be discarded at the next garbage collection cycle, whereas an object that is softly reachable will generally stay in memory for a while longer. The garbage collector usually reclaims softly reachable objects only when memory is running low.
Since softly reachable objects stay in memory longer than weakly reachable ones, soft references are a good basis for a cache. This saves you a lot of work: the garbage collector takes care of both how reachable each object currently is and how much memory is being consumed.
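One possible shape for such a cache, applied to the image example from earlier, is sketched below. SoftImageCache and its loader parameter are assumptions made for illustration; the loader stands in for whatever code actually reads image bytes from disk.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftImageCache {
    private final Map<String, SoftReference<byte[]>> cache = new HashMap<>();
    // Hypothetical loader: stands in for the expensive read from disk.
    private final Function<String, byte[]> loader;
    private int loadCount = 0;

    SoftImageCache(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    // Returns cached data if the soft reference still holds it, otherwise reloads.
    byte[] get(String path) {
        SoftReference<byte[]> ref = cache.get(path);
        byte[] data = (ref != null) ? ref.get() : null;
        if (data == null) {
            data = loader.apply(path);
            loadCount++;
            cache.put(path, new SoftReference<>(data));
        }
        return data;
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        SoftImageCache images = new SoftImageCache(path -> new byte[1024]);
        byte[] first = images.get("photo.png");
        byte[] second = images.get("photo.png");
        // 'first' keeps the data strongly reachable, so the second call is a cache hit.
        System.out.println("same data: " + (first == second));
        System.out.println("loads: " + images.loadCount());
    }
}
```

This sketch never removes map entries whose soft references have been cleared; a fuller version could combine it with a reference queue to drop them.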
Phantom Reference
Unlike soft and weak references, a phantom reference has such a tenuous hold on its object that you cannot even retrieve the object through it: its get method always returns null. Its only use is to be added to a reference queue once the object it points to has been reclaimed, which records that the object is dead.
How does that differ from a weak reference? A weak reference is added to its reference queue as soon as the object it points to becomes weakly reachable. This happens before finalization or garbage collection has actually taken place, so in theory the object could still be resurrected by an unorthodox finalize method, while the weak reference itself would remain cleared. A phantom reference is added to the reference queue only after the object it points to has been removed from memory. Its get method always returns null precisely to prevent the nearly destroyed object from being resurrected.
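A tiny sketch makes the behavior of get concrete. PhantomGetDemo and canRetrieveReferent are hypothetical names; the only library facts relied on are that PhantomReference takes a ReferenceQueue in its constructor and that its get method always returns null.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

class PhantomGetDemo {
    // Reports whether a phantom reference hands back the object it points to.
    static boolean canRetrieveReferent(Object target) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);
        // Even though 'target' is still strongly reachable here, get() returns null.
        return phantom.get() != null;
    }

    public static void main(String[] args) {
        System.out.println(canRetrieveReferent(new Object())); // prints "false"
    }
}
```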
Phantom references have two main uses. First, they let you determine exactly when an object has been removed from memory; in fact, this is the only way to do so in Java. This is mainly useful in special situations such as handling very large images: if you know that an image should be garbage collected, you can use a phantom reference to wait until it actually has been before loading the next image, which makes the dreaded OutOfMemoryError less likely.
Second, phantom references avoid a fundamental problem with finalization. A finalize method can resurrect an object that is about to be destroyed by creating a new strong reference to it. Because of this, an object that overrides finalize must be found to be garbage in at least two separate garbage collection cycles before it can be reclaimed. In the first cycle the object is determined to be garbage and becomes eligible for finalization. Since there is a (slim) chance that the object is resurrected during finalization, the garbage collector has to run again before the object can actually be removed. And because finalization may not happen promptly, an arbitrary number of garbage collection cycles may pass while the object waits to be finalized. This can mean a long delay before the object is actually cleaned up, and it is why you can get annoying out-of-memory errors even when most of the heap is garbage.
Phantom references make this situation impossible. By the time a phantom reference has been added to the reference queue, there is absolutely no way to get hold of the now-dead object, because it has already been removed from memory. And since a phantom reference cannot be used to resurrect the object it points to, the object can be cleaned up in the first garbage collection cycle in which it is found to be phantom reachable.
Arguably, overriding the finalize method is not recommended. Phantom references are safer and more efficient, and doing away with finalize would make the virtual machine considerably simpler. Of course, you can still override finalize if it gets the job done for you; it comes down to personal choice.
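The "wait before loading the next image" idea might be sketched roughly like this. ReclaimWaiter and awaitEnqueued are hypothetical names, the one-second timeout is an arbitrary choice, and System.gc() is only a hint to the virtual machine, so the outcome printed by main is not guaranteed on any given run.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

class ReclaimWaiter {
    // Waits up to timeoutMillis (must be greater than 0) for a reference
    // to arrive on the queue. Returns false on timeout or interruption.
    static boolean awaitEnqueued(ReferenceQueue<?> queue, long timeoutMillis) {
        try {
            return queue.remove(timeoutMillis) != null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
        byte[] image = new byte[1024 * 1024]; // stands in for large image data
        PhantomReference<byte[]> phantom = new PhantomReference<>(image, queue);

        image = null;   // drop the only strong reference
        System.gc();    // a hint only; the VM may ignore it

        if (awaitEnqueued(queue, 1000)) {
            System.out.println("old image reclaimed, safe to load the next one");
        } else {
            System.out.println("old image not reclaimed yet");
        }
        // The phantom reference itself must stay reachable to be enqueued.
        System.out.println("phantom.get() is " + phantom.get());
    }
}
```

The PhantomReference object has to be kept strongly reachable by the program; if the reference object itself becomes garbage, it is never enqueued.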
Summary
I suspect that by this point many readers are grumbling: why talk about an API that has been around for ten years? Well, in my experience many Java programmers do not know this material well. I think a deeper understanding of it is necessary, and I hope this article has been of some use to you.