Garbage collection is a hidden mechanism of JavaScript. We usually don't need to worry about it and can focus on implementing features. But this does not mean that we can sit back and relax when writing JavaScript. As the features we implement become more complex and the code accumulates, performance problems become more prominent. Writing code that runs faster and uses less memory is a never-ending pursuit for programmers. An excellent programmer can achieve impressive results with extremely limited resources. That is what separates mere mortals from the gods.
When code runs, all the variables, objects, and functions we define occupy a certain amount of space in the computer's memory. Memory is a scarce resource, so we must always pay attention to how much of it we use. After all, memory modules are expensive! A variable, function, or object can be called garbage if, after it is created, it is no longer needed by any code that runs later.
Although this definition of garbage is easy to understand intuitively, it is hard for a program to conclude at any given moment that an existing variable, function, or object will never be used again. To keep memory usage down while still guaranteeing that programs run correctly, we usually stipulate that an object or variable is garbage if it meets either of the following conditions: it is not referenced by anything, or it cannot be accessed from the outside even though other inaccessible objects still reference it.
A variable or object that is not referenced is like a house without a door: we can never enter it, so it is impossible to use. Inaccessible objects may be connected to each other, but they still cannot be reached from the outside and therefore cannot be used again. Objects or variables that meet these conditions will never be used during the rest of the program's execution, so they can safely be collected as garbage.
Once we have identified the objects to discard using the definition above, does that mean there is no garbage among the remaining variables and objects?
No! The garbage identified this way is only part of all garbage. Other objects may not meet the conditions above and yet will never be used again.
Can it be said that garbage that meets the above definition is "absolute garbage" and other garbage hidden in the program is "relative garbage"?
The garbage collection mechanism (GC, Garbage Collection) is responsible for reclaiming the memory occupied by variables that are no longer useful during program execution. When an object remains in memory even though it can never be used again, we call it a memory leak. Memory leaks are dangerous, especially in long-running programs: a leaking program occupies more and more memory until it runs out.
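As a minimal sketch of how such a leak can arise in JavaScript, the hypothetical cache below stays reachable for the whole run and is never pruned, so every entry remains in memory even if nobody reads it again. The names `leakyCache`, `boundedCache`, `MAX_ENTRIES`, and both handler functions are invented for illustration, and capping the cache is only one possible mitigation.

```javascript
// Hypothetical long-lived cache: reachable for the whole program run.
const leakyCache = new Map();

// Leaky version: every request adds an entry that is never removed.
function handleRequestLeaky(id) {
  leakyCache.set(id, { id, payload: new Array(1000).fill(id) });
}

// One possible mitigation: cap the cache and evict the oldest entry.
const MAX_ENTRIES = 100; // arbitrary limit chosen for this sketch
const boundedCache = new Map();

function handleRequestBounded(id) {
  boundedCache.set(id, { id, payload: new Array(1000).fill(id) });
  if (boundedCache.size > MAX_ENTRIES) {
    // A Map iterates in insertion order, so the first key is the oldest.
    const oldestKey = boundedCache.keys().next().value;
    boundedCache.delete(oldestKey);
  }
}

for (let i = 0; i < 1000; i++) {
  handleRequestLeaky(i);
  handleRequestBounded(i);
}

console.log(leakyCache.size);   // 1000: grows with every request
console.log(boundedCache.size); // 100: evicted entries can be collected
```

Entries evicted from `boundedCache` are no longer referenced, so the garbage collector is free to reclaim them; entries in `leakyCache` never become garbage.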
Strings, objects, and arrays do not have a fixed size, so memory for them has to be allocated dynamically, once their size is known. Every time a JavaScript program creates a string, array, or object, the interpreter allocates memory to store it. Memory allocated dynamically like this must eventually be freed so that it can be reused; otherwise, the JavaScript interpreter would consume all available memory and cause the system to crash.
JavaScript's garbage collection mechanism periodically checks for useless variables and objects (garbage) and releases the space they occupy.
Different programming languages adopt different garbage collection strategies. For example, C++ does not have a garbage collection mechanism; all memory management relies on the programmer's own skill, which makes C++ harder to master. JavaScript uses reachability to manage memory. A value is reachable if the program can access and use it in some way, and the memory occupied by reachable values cannot be released.
JavaScript specifies a set of values that are inherently reachable, such as global variables and the local variables and parameters of the currently executing function.
These values are called roots; they are the top nodes of the reachability tree.
A variable or object is considered reachable if it is directly or indirectly referenced by a root.
In other words, a value is reachable if it can be reached from a root by following a chain of references (for example, A.b.c.d.e).
let people = {
  boys: {
    boys1: { name: 'xiaoming' },
    boys2: { name: 'xiaojun' },
  },
  girls: {
    girls1: { name: 'xiaohong' },
    girls2: { name: 'huahua' },
  },
};
The above code creates an object and assigns it to the variable people. The object contains two objects, boys and girls, each of which contains two sub-objects. This creates a data structure with 3 levels of references (ignoring primitive values), as shown below:
Here, the people node is inherently reachable because it is a global variable. The boys and girls nodes are reachable because they are directly referenced by a global variable. boys1, boys2, girls1, and girls2 are also reachable because they are indirectly referenced by a global variable and can be accessed through paths such as people.boys.boys1.
If we add the following code after the above code:
people.girls.girls2 = null; people.girls.girls1 = people.boys.boys2;
Then, the above reference hierarchy diagram will become as follows:
The objects originally stored in girls1 and girls2 are now unreachable because they have been disconnected from the girls node, which means they will be reclaimed by the garbage collector.
And if at this time, we execute the following code:
people.boys.boys2 = null;
then the reference hierarchy diagram will become as follows:
Now, although the boys node no longer references boys2, the girls node still does, so boys2 remains reachable and will not be reclaimed by the garbage collector.
The diagrams above show why values such as global variables are called roots: in the reference graph, they usually appear as the root node of the tree.
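The steps above can be replayed with a small helper. This is only a sketch: `isReachable` is a made-up function that follows object properties from a chosen root, which imitates the idea of reachability but is not how an engine inspects memory. The constants `girls1`, `girls2`, and `boys2` are extra handles added so that we can still ask about the original objects; in a real program those handles would themselves keep the objects alive.

```javascript
// Returns true if `target` can be reached from `root` by following
// object properties (depth-first, skipping already visited objects).
function isReachable(root, target) {
  const visited = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const current = stack.pop();
    if (current === target) return true;
    if (current === null || typeof current !== 'object') continue;
    if (visited.has(current)) continue;
    visited.add(current);
    for (const value of Object.values(current)) {
      stack.push(value);
    }
  }
  return false;
}

const people = {
  boys: { boys1: { name: 'xiaoming' }, boys2: { name: 'xiaojun' } },
  girls: { girls1: { name: 'xiaohong' }, girls2: { name: 'huahua' } },
};

// Handles to the original objects, kept only for the checks below.
const girls1 = people.girls.girls1;
const girls2 = people.girls.girls2;
const boys2 = people.boys.boys2;

// Step 1: detach the original girls1 and girls2 objects.
people.girls.girls2 = null;
people.girls.girls1 = people.boys.boys2;
const afterStep1 = [isReachable(people, girls1), isReachable(people, girls2)];

// Step 2: boys no longer references boys2, but girls still does.
people.boys.boys2 = null;
const afterStep2 = isReachable(people, boys2);

console.log(afterStep1); // [ false, false ]
console.log(afterStep2); // true
```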
let people = {
  boys: {
    boys1: { name: 'xiaoming' },
    boys2: { name: 'xiaojun' },
  },
  girls: {
    girls1: { name: 'xiaohong' },
    girls2: { name: 'huahua' },
  },
};
people.boys.boys2.girlfriend = people.girls.girls1; // boys2 refers to girls1
people.girls.girls1.boyfriend = people.boys.boys2; // girls1 refers to boys2
The above code creates a mutual reference between boys2 and girls1. The relationship structure diagram is as follows:
At this point, if we cut off the association between boys and boys2:
delete people.boys.boys2;
the association diagram between objects is as follows:
Obviously, there are no unreachable nodes.
At this point, if we also cut off the boyfriend relationship:
delete people.girls.girls1.boyfriend;
the relationship diagram becomes:
Now, although the girlfriend relationship between boys2 and girls1 still exists, boys2 can no longer be reached from the root, so it becomes an unreachable node and will be reclaimed by the garbage collector.
let people = {
  boys: {
    boys1: { name: 'xiaoming' },
    boys2: { name: 'xiaojun' },
  },
  girls: {
    girls1: { name: 'xiaohong' },
    girls2: { name: 'huahua' },
  },
};
delete people.boys;
delete people.girls;
The reference hierarchy diagram formed by the above code is as follows:
Although the objects inside the dotted box still reference each other, they have lost their connection to the root, so they are unreachable and will be reclaimed by the garbage collector.
Reference counting, as the name suggests, counts the references to each object: adding a reference increases the count by one, and removing a reference decreases it by one. When the count drops to 0, the object is considered garbage and is deleted to reclaim its memory.
For example:
let user = { username: 'xiaoming' }; // the object is referenced by the variable user, count +1
let user2 = user; // the object is referenced by a new variable, count +1
user = null; // the variable no longer refers to the object, count -1
user2 = null; // the variable no longer refers to the object, count -1
// at this point the object's reference count is 0, so it will be deleted
Although reference counting seems very reasonable, a memory reclamation mechanism based on it has an obvious flaw.
For example:
let boy = {};
let girl = {};
boy.girlfriend = girl;
girl.boyfriend = boy;
boy = null;
girl = null;
In the above code, boy and girl reference each other. Even after the references held by the variables boy and girl are removed, reference counting will not reclaim the two objects. Because of the circular reference, the reference counts of the two anonymous objects never return to zero, which results in a memory leak.
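To make the bookkeeping concrete, here is a toy simulation of reference counting. It is only a sketch under stated assumptions: the `RefCountHeap` class and its methods are invented for this example, and the counts are updated by hand, whereas a real reference-counting runtime would update them automatically.

```javascript
// Toy reference-counting heap (hypothetical, for illustration only).
class RefCountHeap {
  constructor() {
    this.counts = new Map(); // simulated object -> number of references
    this.freed = [];         // names of objects that were reclaimed
  }
  allocate(name) {
    const obj = { name, refs: new Set() };
    this.counts.set(obj, 0);
    return obj;
  }
  addRef(obj) {
    this.counts.set(obj, this.counts.get(obj) + 1);
  }
  removeRef(obj) {
    const count = this.counts.get(obj) - 1;
    this.counts.set(obj, count);
    if (count === 0) this.free(obj);
  }
  // Simulates `from.someProperty = to`.
  link(from, to) {
    from.refs.add(to);
    this.addRef(to);
  }
  free(obj) {
    this.counts.delete(obj);
    this.freed.push(obj.name);
    // Releasing an object also releases the references it held.
    for (const child of obj.refs) this.removeRef(child);
  }
}

const heap = new RefCountHeap();

// Simple case: one object, two variables, both later set to null.
const user = heap.allocate('user');
heap.addRef(user);    // let user = {...}
heap.addRef(user);    // let user2 = user
heap.removeRef(user); // user = null
heap.removeRef(user); // user2 = null, count reaches 0

// Circular case: boy and girl reference each other.
const boy = heap.allocate('boy');
const girl = heap.allocate('girl');
heap.addRef(boy);     // variable boy
heap.addRef(girl);    // variable girl
heap.link(boy, girl); // boy.girlfriend = girl
heap.link(girl, boy); // girl.boyfriend = boy
heap.removeRef(boy);  // boy = null
heap.removeRef(girl); // girl = null

console.log(heap.freed);            // [ 'user' ]
console.log(heap.counts.get(boy));  // 1, never reaches 0
console.log(heap.counts.get(girl)); // 1, never reaches 0
```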
C++ has the concept of a smart pointer (shared_ptr), which applies reference counting: the smart pointer's destructor decrements the count, and the object is released when the count reaches zero. However, memory leaks still occur in the case of circular references.
Fortunately, JavaScript has adopted a safer strategy that largely avoids this risk of memory leaks.
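Mark-and-sweep is a garbage collection algorithm adopted by JavaScript engines. Its basic principle is to start from the roots, traverse the reference relationships between variables breadth-first, and put a mark (an "outstanding employee" badge) on each variable visited. Objects left unmarked at the end are deleted.
The basic process of the algorithm is as follows:
1. Mark the roots as outstanding employees;
2. Find all the nodes referenced by outstanding employees and mark them as outstanding employees as well;
3. Repeat step 2 until no new outstanding employees join;
4. Delete all objects that have not been marked.
For example: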
Suppose the objects in our program reference each other as shown below:
We can clearly see an "unreachable island" on the right side of the picture: starting from the root, the island can never be reached. But the garbage collector does not have a God's-eye view like ours. Following the algorithm, it first marks only the root node as an outstanding employee.
Then, starting from the outstanding employees, it finds all the nodes they reference, such as the three nodes in the dotted box in the figure above, and marks the newly found nodes as outstanding employees.
This process of finding and marking is repeated until every node that can be found has been marked.
Finally, the effect shown in the figure below is achieved:
Since the island on the right is still unmarked when the algorithm finishes, the garbage collector treats those nodes as unreachable and eventually clears them.
Readers who have studied data structures and algorithms may notice that this is graph traversal, similar to the algorithms used to find the connected parts of a graph.
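The marking and sweeping steps described above can be sketched on a simulated heap. This is a minimal illustration, not the engine's implementation: the heap is modeled as a `Map` from node name to the names it references, and the node names and the `markAndSweep` function are made up for the example.

```javascript
// Marks everything reachable from `roots` (breadth-first), then removes
// every unmarked node from the simulated heap. Returns the removed names.
function markAndSweep(heap, roots) {
  const marked = new Set(roots);
  const queue = [...roots];

  // Mark phase: keep visiting nodes referenced by already marked nodes.
  while (queue.length > 0) {
    const node = queue.shift();
    for (const ref of heap.get(node) || []) {
      if (!marked.has(ref)) {
        marked.add(ref);
        queue.push(ref);
      }
    }
  }

  // Sweep phase: anything left unmarked is garbage.
  const swept = [...heap.keys()].filter((node) => !marked.has(node));
  for (const node of swept) heap.delete(node);
  return swept;
}

const heap = new Map([
  ['root', ['a', 'b']],
  ['a', ['c']],
  ['b', []],
  ['c', ['a']],      // a cycle that is still reachable from the root
  ['island1', ['island2']],
  ['island2', ['island1']], // mutual references, but no path from the root
]);

const swept = markAndSweep(heap, ['root']);
console.log(swept);            // [ 'island1', 'island2' ]
console.log([...heap.keys()]); // [ 'root', 'a', 'b', 'c' ]
```

Note that the cycle between `a` and `c` survives because it is reachable, while the island is removed even though its nodes reference each other, which is exactly the case reference counting cannot handle.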
Garbage collection is an expensive task. Especially when the amount of code is very large, running the garbage collection algorithm frequently can noticeably slow down the program. JavaScript engines apply many optimizations to garbage collection so that programs run efficiently while collection still does its job.
The strategies used for performance optimization usually include the following:
JavaScript programs maintain a considerable number of variables during execution, and scanning all of them frequently causes significant overhead. However, variables differ in their life cycles: local variables are created frequently, used briefly, and then discarded, while global variables occupy memory for a long time. The engine manages the two kinds of objects separately. Short-lived variables are scanned frequently so that they are cleaned up soon after they stop being used, while variables that hold memory for a long time are checked less often, which saves some overhead. This strategy is usually called generational collection.
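The idea of treating short-lived and long-lived objects differently, often called generational collection, can be sketched with a toy heap. Everything here is an assumption made for illustration: the `ToyGenerationalHeap` class, the promotion threshold, and especially the shortcut of treating every old object as a root during a minor pass, which real collectors replace with more precise bookkeeping.

```javascript
class ToyGenerationalHeap {
  constructor(promoteAfter = 2) {
    this.young = new Set(); // recently allocated objects
    this.old = new Set();   // objects that survived several minor passes
    this.roots = new Set();
    this.promoteAfter = promoteAfter;
  }

  allocate(name) {
    const obj = { name, refs: [], age: 0 };
    this.young.add(obj);
    return obj;
  }

  // Returns the set of objects reachable from the given starting objects.
  mark(starts) {
    const marked = new Set();
    const stack = [...starts];
    while (stack.length > 0) {
      const obj = stack.pop();
      if (marked.has(obj)) continue;
      marked.add(obj);
      stack.push(...obj.refs);
    }
    return marked;
  }

  // Frequent pass: only young objects can be reclaimed.
  // Simplification: every old object is treated as a root.
  minorCollect() {
    const marked = this.mark([...this.roots, ...this.old]);
    for (const obj of [...this.young]) {
      if (!marked.has(obj)) {
        this.young.delete(obj);
      } else if (++obj.age >= this.promoteAfter) {
        this.young.delete(obj);
        this.old.add(obj); // survivor is promoted to the old generation
      }
    }
  }

  // Rare pass: both generations are checked against the real roots.
  majorCollect() {
    const marked = this.mark(this.roots);
    for (const generation of [this.young, this.old]) {
      for (const obj of [...generation]) {
        if (!marked.has(obj)) generation.delete(obj);
      }
    }
  }
}

const heap = new ToyGenerationalHeap();
const config = heap.allocate('config');
heap.roots.add(config);
heap.allocate('temp'); // never referenced: short-lived garbage

heap.minorCollect(); // temp is reclaimed, config survives (age 1)
const youngAfterFirstPass = heap.young.size;
heap.minorCollect(); // config survives again and is promoted
heap.roots.delete(config); // config is now garbage
heap.minorCollect(); // a minor pass does not reclaim old objects
const survivedMinorPass = heap.old.has(config);
heap.majorCollect(); // the major pass finally reclaims it
console.log(youngAfterFirstPass, survivedMinorPass, heap.old.size); // 1 true 0
```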
Incremental processing is very common in performance optimization and can also be used for garbage collection. When the number of variables is very large, traversing all of them at once and handing out outstanding employee marks is very time-consuming and makes the program stutter. Therefore, the engine divides the garbage collection work into multiple subtasks and executes them one small piece at a time while the program runs. This delays reclamation somewhat, but usually does not cause noticeable stutter.
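Splitting the marking work into small slices can be sketched with a generator. This is a simplified illustration under stated assumptions: `incrementalMark`, the graph, and the `budget` parameter are invented for the example, and the sketch ignores that the object graph can change between slices, which a real incremental collector has to account for.

```javascript
// Marks nodes reachable from `roots`, pausing after every `budget`
// processed nodes so that other work could run in between.
function* incrementalMark(graph, roots, budget) {
  const marked = new Set(roots);
  const queue = [...roots];
  let processed = 0;
  while (queue.length > 0) {
    const node = queue.shift();
    for (const ref of graph.get(node) || []) {
      if (!marked.has(ref)) {
        marked.add(ref);
        queue.push(ref);
      }
    }
    processed += 1;
    if (processed % budget === 0 && queue.length > 0) {
      yield marked.size; // pause here; the program could continue running
    }
  }
  return marked;
}

const graph = new Map([
  ['root', ['a', 'b']],
  ['a', ['c']],
  ['b', ['d']],
  ['c', []],
  ['d', []],
  ['island', []],
]);

const marker = incrementalMark(graph, ['root'], 2);
let pauses = 0;
let step = marker.next();
while (!step.done) {
  pauses += 1; // application code would run during this pause
  step = marker.next();
}
const marked = step.value;

console.log(pauses);               // 2
console.log(marked.size);          // 5
console.log(marked.has('island')); // false
```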
The CPU is not always busy, even in complex programs, mainly because the CPU is very fast while peripheral IO is often several orders of magnitude slower. It is therefore a good idea to schedule garbage collection for when the CPU is idle. This is a very effective optimization that has almost no adverse effect on the program itself. It resembles a system installing updates during idle time: users do not notice the background work at all.
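Application code cannot schedule the engine's collector, but the idle-time idea itself can be illustrated with ordinary tasks. The sketch below uses the browser's `requestIdleCallback` when it is available and otherwise falls back to a `setTimeout` stand-in, which is an assumption of this example. The names `getIdleScheduler`, `runWhenIdle`, and `fakeSchedule` are hypothetical, and the fake scheduler exists only to make the demonstration deterministic.

```javascript
// Picks an idle scheduler: requestIdleCallback where the environment
// provides it, otherwise a setTimeout-based stand-in for this sketch.
function getIdleScheduler() {
  if (typeof requestIdleCallback === 'function') {
    return (callback) => requestIdleCallback(callback);
  }
  return (callback) => setTimeout(() => callback({ timeRemaining: () => 5 }), 0);
}

// Runs queued tasks while the current idle period has time left,
// then reschedules itself if tasks remain.
function runWhenIdle(tasks, schedule = getIdleScheduler()) {
  schedule(function work(deadline) {
    while (tasks.length > 0 && deadline.timeRemaining() > 0) {
      const task = tasks.shift();
      task();
    }
    if (tasks.length > 0) schedule(work);
  });
}

// Deterministic demonstration with a fake scheduler in which every
// "idle period" has room for exactly two tasks.
let idlePeriods = 0;
function fakeSchedule(callback) {
  idlePeriods += 1;
  let remaining = 2;
  callback({ timeRemaining: () => remaining-- });
}

const log = [];
const tasks = [
  () => log.push('cleanup 1'),
  () => log.push('cleanup 2'),
  () => log.push('cleanup 3'),
];
runWhenIdle(tasks, fakeSchedule);

console.log(log);         // [ 'cleanup 1', 'cleanup 2', 'cleanup 3' ]
console.log(idlePeriods); // 2
```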
The main goal of this article is to briefly introduce the garbage collection mechanism, its commonly used strategies, and its optimization methods. It is not intended to give everyone an in-depth understanding of how the engine works behind the scenes.
Through this article, you should understand:
Garbage collection is a mechanism of JavaScript that runs in the background and does not require us to worry about it;