We all know that Node.js uses a single-threaded, event-driven asynchronous I/O model. This design means it cannot take full advantage of multi-core CPUs and is not well suited to CPU-bound, non-I/O work (such as executing external scripts, AI computation, or image processing). To solve such problems, Node.js provides conventional multi-process and multi-thread solutions (for a discussion of processes and threads, see the author's other article, Node.js and Concurrency Model). This article introduces the multi-process and multi-thread mechanisms of Node.js.
We can use the child_process module to create Node.js child processes to complete special tasks (such as executing scripts). This module mainly provides the exec, execFile, fork, and spawn methods. Below we briefly introduce how to use them.
const { exec } = require('child_process');

exec('ls -al', (error, stdout, stderr) => {
  console.log(stdout);
});
This method processes the command string with the executable specified by options.shell, buffers the command's output while it runs, and then, once the command completes, passes the result to the callback function.
The parameters of this method are explained as follows:
command: the command to be executed (such as ls -al);
options: optional settings, with the following properties:
  cwd: the current working directory of the child process; defaults to the value of process.cwd();
  env: environment variables (a key-value object); defaults to the value of process.env;
  encoding: character encoding; defaults to utf8;
  shell: the executable used to process the command string; defaults to /bin/sh on Unix and to the value of process.env.ComSpec on Windows (falling back to cmd.exe if that is empty). For example:
const { exec } = require('child_process');

exec("print('Hello World!')", { shell: 'python' }, (error, stdout, stderr) => {
  console.log(stdout);
});
Running the above example outputs Hello World!, which is equivalent to the child process executing the command python -c "print('Hello World!')". Therefore, when using this property, note that the specified executable must support running statements via the -c option.
Note: Node.js itself also supports a -c option, but there it is equivalent to the --check option: it only checks the specified script for syntax errors and does not execute it.
  signal: terminates the child process with the specified AbortSignal; available since v14.17.0. For example:
const { exec } = require('child_process');

const ac = new AbortController();
exec('ls -al', { signal: ac.signal }, (error, stdout, stderr) => {});
In the above example, we can terminate the child process early by calling ac.abort().
  timeout: the timeout of the child process in milliseconds (if the value is greater than 0, the signal specified by killSignal is sent to the child process once it runs longer than this value); defaults to 0;
  maxBuffer: the maximum amount of data (in bytes) allowed on stdout or stderr; if exceeded, the child process is killed and any output is truncated; defaults to 1024 * 1024;
  killSignal: the signal used to terminate the child process; defaults to SIGTERM;
  uid: the uid under which the child process runs;
  gid: the gid under which the child process runs;
  windowsHide: whether to hide the console window of the child process (only relevant on Windows); defaults to false;
callback: the callback function, with the parameters error, stdout, and stderr:
  error: if the command runs successfully the value is null, otherwise it is an instance of Error, where error.code is the exit code of the child process and error.signal is the signal that terminated it;
  stdout and stderr: the stdout and stderr of the child process, decoded according to the value of the encoding property; if encoding is buffer, or if the value of stdout or stderr cannot be decoded as a string, they are passed as Buffer instances.

const { execFile } = require('child_process');

execFile('ls', ['-al'], (error, stdout, stderr) => {
  console.log(stdout);
});
This method works much like exec. The only difference is that execFile processes the command directly with the specified executable (the value of the file parameter) by default, which makes it slightly more efficient than exec (though, considering the shell's processing logic, the difference is negligible).
The parameters of this method are explained as follows:
file: the name or path of the executable;
args: the list of arguments passed to the executable;
options: optional settings, with the following properties:
  shell: when false, the command is processed directly by the specified executable (the value of the file parameter); when true or a string, it behaves like shell in exec; defaults to false;
  windowsVerbatimArguments: whether arguments are quoted or escaped on Windows; ignored on Unix; defaults to false;
  cwd, env, encoding, timeout, maxBuffer, killSignal, uid, gid, windowsHide, and signal have been introduced above and will not be repeated here;
callback: the callback function, equivalent to callback in exec and not explained again here.
const { fork } = require('child_process');

const echo = fork('./echo.js', { silent: true });

echo.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

echo.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

echo.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});
This method is used to create a new Node.js instance to execute the specified Node.js script and communicate with the parent process through IPC.
The parameters of this method are explained as follows:
modulePath: the path of the Node.js script to run;
args: the list of arguments passed to the Node.js script;
options: optional settings, with the following properties:
  detached: see the description of options.detached for spawn below;
  execPath: the executable used to create the child process;
  execArgv: the list of string arguments passed to the executable; defaults to the value of process.execArgv;
  serialization: the serialization type used for messages between processes; the available values are json and advanced, and the default is json;
  silent: if true, the stdin, stdout, and stderr of the child process are piped to the parent process; otherwise they are inherited from the parent process; defaults to false;
  stdio: see the description of options.stdio for spawn below. Note that when this property is provided, silent is ignored, and the array must contain an 'ipc' entry (such as [0, 1, 2, 'ipc']), otherwise an exception is thrown;
  cwd, env, uid, gid, windowsVerbatimArguments, signal, timeout, and killSignal have been introduced above and will not be repeated here.
const { spawn } = require('child_process');

const ls = spawn('ls', ['-al']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});
This method is the fundamental method of the child_process module: exec, execFile, and fork all ultimately call spawn to create a child process.
The parameters of this method are explained as follows:
command: the name or path of the executable;
args: the list of arguments passed to the executable;
options: optional settings, with the following properties:
  argv0: the value of argv[0] sent to the child process; defaults to the value of the command parameter;
  detached: whether the child process is allowed to run independently of the parent process (that is, to keep running after the parent exits); defaults to false. When the value is true, the effect on each platform is as follows:
    on Windows, the child process can keep running after the parent exits, and the child process gets its own console window (once this feature is enabled, it cannot be changed while the process is running);
    on platforms other than Windows, the child process becomes the leader of a new process group and session; in this case the child process can keep running after the parent exits whether or not it is detached from it.
  Note that if the child process is to perform a long-running task and the parent process is to exit early, all of the following must hold at the same time:
    the unref method of the child process is called, removing the child process from the parent's event loop;
    detached is set to true;
    stdio is set to ignore.
  For example:
// hello.js
const fs = require('fs');

let index = 0;
function run() {
  setTimeout(() => {
    fs.writeFileSync('./hello', `index: ${index}`);
    if (index < 10) {
      index += 1;
      run();
    }
  }, 1000);
}
run();

// main.js
const { spawn } = require('child_process');

const child = spawn('node', ['./hello.js'], {
  detached: true,
  stdio: 'ignore'
});
child.unref();
  stdio: the standard I/O configuration of the child process; the value is a string or an array, and the default is pipe:
    when it is a string, it is expanded into an array (for example, pipe is converted into ['pipe', 'pipe', 'pipe']); the available values are pipe, overlapped, ignore, and inherit;
    when it is an array, the first three items configure stdin, stdout, and stderr respectively; the available values for each item are pipe, overlapped, ignore, inherit, ipc, a Stream object, a positive integer (a file descriptor open in the parent process), null (equivalent to pipe in the first three items, otherwise to ignore), and undefined (equivalent to pipe in the first three items, otherwise to ignore).
  cwd, env, uid, gid, serialization, shell (a boolean or a string), windowsVerbatimArguments, windowsHide, signal, timeout, and killSignal have been introduced above and will not be repeated here.
The above briefly introduces the main methods of the child_process module. The execSync, execFileSync, and spawnSync methods are synchronous versions of exec, execFile, and spawn with the same parameters, so they are not covered again. (Note that fork has no synchronous counterpart, since its whole purpose is to run a script in a separate, concurrently running Node.js process.)
Through the cluster module, we can create a cluster of Node.js processes. By adding Node.js processes to the cluster, we can take fuller advantage of multiple cores, distributing program tasks across different processes to improve execution efficiency. Below we introduce the use of the cluster module with an example:
const http = require('http');
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`${process.pid}\n`);
  }).listen(8000);
}
The above example is divided into two parts by checking the cluster.isPrimary property (that is, whether the current process is the primary process):
  the primary process creates one child process per CPU core via cluster.fork;
  each child process creates an HTTP server listening on port 8000.
Run the above example and access http://localhost:8000/ in a browser: the pid returned differs from request to request, which shows that requests are indeed distributed across the child processes. The default load-balancing strategy of Node.js is round-robin scheduling; it can be changed through the NODE_CLUSTER_SCHED_POLICY environment variable or the cluster.schedulingPolicy property:
NODE_CLUSTER_SCHED_POLICY=rr // or none
cluster.schedulingPolicy = cluster.SCHED_RR; // or cluster.SCHED_NONE
Another thing to note: although each child process creates an HTTP server and listens on the same port, this does not mean the child processes compete freely for user requests, since that could not guarantee a balanced load across them. Instead, the correct flow is for the primary process to listen on the port and forward user requests to a specific child process according to the distribution policy.
Since processes are isolated from each other, they generally communicate through mechanisms such as shared memory, message passing, and pipes. Node.js uses message passing for communication between parent and child processes, as in the following example:
const http = require('http');
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    worker.on('message', (message) => {
      console.log(`I am primary(${process.pid}), I got message from worker: "${message}"`);
      worker.send(`Send message to worker`);
    });
  }
} else {
  process.on('message', (message) => {
    console.log(`I am worker(${process.pid}), I got message from primary: "${message}"`);
  });
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`${process.pid}\n`);
    process.send('Send message to primary');
  }).listen(8000);
}
Run the above example, visit http://localhost:8000/, and then check the terminal; we will see output similar to the following:
I am primary(44460), I got message from worker: "Send message to primary"
I am worker(44461), I got message from primary: "Send message to worker"
I am primary(44460), I got message from worker: "Send message to primary"
I am worker(44462), I got message from primary: "Send message to worker"
Using this mechanism, we can monitor the status of each child process, so that when something goes wrong in a child process we can intervene in time to ensure the availability of the service.
The interface of the cluster module is very simple. To save space, here we only make a few special notes about the cluster.setupPrimary method; for the other methods, please check the official documentation:
  when cluster.setupPrimary is called, its settings are synchronized to the cluster.settings property, and each call is based on the current value of cluster.settings;
  a call to cluster.setupPrimary has no impact on child processes that are already running; only subsequent cluster.fork calls are affected;
  a call to cluster.setupPrimary does not affect the env parameter passed to subsequent cluster.fork calls;
  cluster.setupPrimary can only be used in the primary process.
We introduced the cluster module above; through it we can create a cluster of Node.js processes to improve the running efficiency of a program. However, cluster is based on a multi-process model: switching between processes is costly, resources are isolated between processes, and as the number of child processes grows, the program can easily become unresponsive due to system resource constraints. To solve such problems, Node.js provides worker_threads. Below we briefly introduce the use of this module through concrete examples:
// server.js
const http = require('http');
const { Worker } = require('worker_threads');

http.createServer((req, res) => {
  const httpWorker = new Worker('./http_worker.js');
  httpWorker.on('message', (result) => {
    res.writeHead(200);
    res.end(`${result}\n`);
  });
  httpWorker.postMessage('Tom');
}).listen(8000);

// http_worker.js
const { parentPort } = require('worker_threads');

parentPort.on('message', (name) => {
  parentPort.postMessage(`Welcome ${name}!`);
});
The above example shows a simple use of worker_threads. When using worker_threads, pay attention to the following points:
A Worker instance is created through worker_threads.Worker, and the Worker script can be either an independent JavaScript file or a string; for example, the above example can be modified as:
const { Worker } = require('worker_threads');

const code = "const { parentPort } = require('worker_threads'); parentPort.on('message', (name) => { parentPort.postMessage(`Welcome ${name}!`); });";
const httpWorker = new Worker(code, { eval: true });
When creating a Worker instance through worker_threads.Worker, you can set the initial metadata of the Worker thread by specifying the value of workerData, for example:
// server.js
const { Worker } = require('worker_threads');

const httpWorker = new Worker('./http_worker.js', {
  workerData: { name: 'Tom' }
});

// http_worker.js
const { workerData } = require('worker_threads');

console.log(workerData);
When creating a Worker instance through worker_threads.Worker, you can pass SHARE_ENV as the env option to share environment variables between the Worker thread and the main thread, for example:
const { Worker, SHARE_ENV } = require('worker_threads');

const worker = new Worker('process.env.SET_IN_WORKER = "foo"', {
  eval: true,
  env: SHARE_ENV
});
worker.on('exit', () => {
  console.log(process.env.SET_IN_WORKER); // "foo"
});
Unlike the inter-process communication mechanism of cluster, worker_threads uses a MessageChannel for communication between threads:
  the Worker thread sends messages to the main thread through the parentPort.postMessage method, and handles messages from the main thread by listening for the message event of parentPort;
  the main thread sends messages to the Worker thread through the postMessage method of the Worker instance (httpWorker in the example above), and handles messages from the Worker thread by listening for the message event of httpWorker.
In Node.js, both a child process created via cluster and a Worker thread created via worker_threads have their own V8 instance and event loop. The difference is that Worker threads run inside the same process: they are cheaper to create and to switch between, and they can share memory with the main thread (for example via SharedArrayBuffer), whereas child processes each have an independent memory space and can only exchange data by message passing.
Although Worker threads appear more efficient than child processes, they have a shortcoming of their own: cluster provides load balancing out of the box, while with worker_threads we have to design and implement load balancing ourselves.
This article introduced the use of the three Node.js modules child_process, cluster, and worker_threads. Through these three modules, we can take full advantage of multi-core CPUs and efficiently handle special tasks (such as AI computation and image processing) in a multi-process or multi-thread manner. Each module has its own applicable scenarios; this article only explains their basic use, and how to apply them effectively to your own problems is still up to you to explore. Finally, if there are any mistakes in this article, I hope you will point them out. I wish you all happy coding every day.