Understand the event loop and process.nextTick() in Node

Author：Eve Cole Update Time：2022-07-28 14:57:20

This article explains the event loop in Node.js: how its phases work, where process.nextTick() fits in, and more. I hope you find it helpful!

What is an event loop?

An event loop is Node.js's mechanism for handling non-blocking I/O operations - even though JavaScript is single-threaded - by offloading operations to the system kernel when possible.

Since most modern kernels are multi-threaded, they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback can be added to the poll queue to eventually be executed. This is explained in more detail later in this article.

Event loop mechanism analysis

When Node.js starts, it initializes the event loop and processes the provided input script (or drops into the REPL, which is not covered in this article). The script may make asynchronous API calls, schedule timers, or call process.nextTick(). Node.js then begins processing the event loop.

The diagram below shows a simplified overview of the event loop's sequence of operations.

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

Note: Each box is referred to as a "phase" of the event loop.

Each phase has a FIFO queue of callbacks to execute. While each phase is special in its own way, generally, when the event loop enters a given phase, it performs any operations specific to that phase and then executes callbacks in that phase's queue until the queue has been exhausted or the maximum number of callbacks has been executed. When the queue has been exhausted or the callback limit is reached, the event loop moves to the next phase, and so on.

Since any of these operations may schedule more operations, and new events processed in the poll phase are queued by the kernel, poll events can be queued while other poll events are being processed. As a result, a long-running callback can let the poll phase run much longer than a timer's threshold. See the sections on timers and the poll phase below for more details.

Note: There is a slight discrepancy between the Windows and Unix/Linux implementations, but that is not important for this demonstration. The most important parts are shown here. There are actually seven or eight steps, but the ones we care about, the ones Node.js actually uses, are those above.

Phase overview

timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
pending callbacks: executes I/O callbacks deferred to the next loop iteration.
idle, prepare: only used internally.
poll: retrieves new I/O events and executes I/O-related callbacks (almost all callbacks, with the exception of close callbacks, those scheduled by timers, and setImmediate()); Node.js will block here when appropriate.
check: setImmediate() callbacks are invoked here.
close callbacks: some close callbacks, e.g. socket.on('close', ...).

Between each run of the event loop, Node.js checks whether it is waiting for any asynchronous I/O or timers and shuts down cleanly if there are none.

Detailed Overview of Phases

Timers

A timer specifies the threshold after which a provided callback may be executed, rather than the exact time at which a person wants it to be executed. Timer callbacks run as early as they can be scheduled after the specified amount of time has passed; however, operating system scheduling or the running of other callbacks may delay them.

Note: Technically, the poll phase controls when timers are executed.

For example, suppose you schedule a timer that times out after 100 milliseconds, and then your script starts asynchronously reading a file that takes 95 milliseconds:

const fs = require('fs');

function someAsyncOperation(callback) {
  // Assume this takes 95ms to complete
  fs.readFile('/path/to/file', callback);
}

const timeoutScheduled = Date.now();

setTimeout(() => {
  const delay = Date.now() - timeoutScheduled;

  console.log(`${delay}ms have passed since I was scheduled`);
}, 100);

// do someAsyncOperation which takes 95 ms to complete
someAsyncOperation(() => {
  const startCallback = Date.now();

  // do something that will take 10ms...
  while (Date.now() - startCallback < 10) {
    // do nothing
  }
});

When the event loop enters the poll phase, it has an empty queue (fs.readFile() has not completed), so it waits for the number of milliseconds remaining until the soonest timer's threshold is reached. While it is waiting, 95 ms pass, fs.readFile() finishes reading the file, and its callback, which takes 10 ms to complete, is added to the poll queue and executed. When the callback finishes, there are no more callbacks in the queue, so the event loop sees that the threshold of the soonest timer has been reached and wraps back to the timers phase to execute the timer's callback. In this example, the total delay between the timer being scheduled and its callback being executed is 105 ms.
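The same effect can be reproduced without file I/O. The following is a minimal sketch rather than a benchmark: the 20 ms threshold and the 50 ms busy-wait are arbitrary values chosen for illustration, and busyWait and observedDelay are hypothetical names that do not come from Node.js. Because the synchronous work keeps the event loop from reaching the timers phase, the callback runs well after its threshold.

```javascript
// Hypothetical helper: block the thread for roughly `ms` milliseconds.
function busyWait(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // spin on purpose; nothing else can run in the meantime
  }
}

const scheduledAt = Date.now();
let observedDelay = null;

// Ask for a 20 ms threshold (arbitrary value for this sketch).
setTimeout(() => {
  observedDelay = Date.now() - scheduledAt;
  console.log(`timer ran after ${observedDelay}ms (threshold was 20ms)`);
}, 20);

// Block for 50 ms; the timer callback cannot run until this returns.
busyWait(50);
```

The reported delay should be at least the length of the busy-wait, which illustrates that the value passed to setTimeout() is a lower bound and not a guarantee.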

Note: To prevent the poll phase from starving the event loop, libuv (the C library that implements the Node.js event loop and all of the asynchronous behaviors of the platform) also has a hard maximum (system dependent) before it stops polling for more events.

Pending callbacks

This phase executes callbacks for some system operations, such as certain types of TCP errors. For example, if a TCP socket receives ECONNREFUSED when attempting to connect, some *nix systems want to wait before reporting the error. The callback is queued to execute in the pending callbacks phase.

Poll

The poll phase has two main functions:

Calculating how long it should block and poll for I/O, then
Processing events in the poll queue.

When the event loop enters the poll phase and there are no timers scheduled, one of two things will happen:

If the poll queue is not empty, the event loop iterates through its queue of callbacks, executing them synchronously until either the queue has been exhausted or the system-dependent hard limit is reached.
If the poll queue is empty, one of two more things will happen:
- If scripts have been scheduled by setImmediate(), the event loop ends the poll phase and continues to the check phase to execute those scheduled scripts.
- If scripts have not been scheduled by setImmediate(), the event loop waits for callbacks to be added to the queue, then executes them immediately.

Once the poll queue is empty, the event loop checks for timers whose time thresholds have been reached. If one or more timers are ready, the event loop wraps back to the timers phase to execute those timers' callbacks.

Check

This phase allows callbacks to be executed immediately after the poll phase has completed. If the poll phase becomes idle and scripts have been queued with setImmediate(), the event loop may continue to the check phase rather than waiting.

setImmediate() is actually a special timer that runs in a separate phase of the event loop. It uses a libuv API that schedules callbacks to execute after the poll phase has completed.

Generally, as the code is executed, the event loop eventually hits the poll phase, where it waits for an incoming connection, request, etc. However, if a callback has been scheduled with setImmediate() and the poll phase becomes idle, it ends and continues to the check phase rather than waiting for poll events.

Close callbacks

If a socket or handle is closed abruptly (e.g. socket.destroy()), the 'close' event will be emitted in this phase. Otherwise it will be emitted via process.nextTick().

setImmediate() vs. setTimeout()

setImmediate() and setTimeout() are similar, but behave differently depending on when they are called.

setImmediate() is designed to execute a script once the current poll phase completes.
setTimeout() schedules a script to be run after a minimum threshold in ms has elapsed.

The order in which the timers are executed varies depending on the context in which they are called. If both are called from within the main module, the timing is bound by the performance of the process (which can be affected by other applications running on the machine).

For example, if you run the following script that is not inside an I/O cycle (i.e., the main module), the order in which the two timers are executed is non-deterministic because it is bounded by the performance of the process:

// timeout_vs_immediate.js
setTimeout(() => {
  console.log('timeout');
}, 0);

setImmediate(() => {
  console.log('immediate');
});


$ node timeout_vs_immediate.js
timeout
immediate

$ node timeout_vs_immediate.js
immediate
timeout

However, if you move the two calls within an I/O cycle, the immediate callback is always executed first:

// timeout_vs_immediate.js
const fs = require('fs');

fs.readFile(__filename, () => {
  setTimeout(() => {
    console.log('timeout');
  }, 0);
  setImmediate(() => {
    console.log('immediate');
  });
});


$ node timeout_vs_immediate.js
immediate
timeout

$ node timeout_vs_immediate.js
immediate
timeout

The main advantage of using setImmediate() over setTimeout() is that setImmediate() will always be executed before any timers if it is scheduled within an I/O cycle, regardless of how many timers are present.
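To illustrate that the number of timers does not matter, here is a small sketch. It uses fs.stat('.') only as a convenient way to get a callback that runs inside an I/O cycle; the order array and the record helper are hypothetical names introduced for this example.

```javascript
const fs = require('fs');

const order = [];

// Hypothetical helper: remember and print the label of each callback.
function record(label) {
  order.push(label);
  console.log(label);
}

// fs.stat() is used here only to get into an I/O callback.
fs.stat('.', () => {
  // Schedule several zero-delay timers first...
  for (let i = 1; i <= 3; i++) {
    setTimeout(() => record(`timeout ${i}`), 0);
  }
  // ...and a single immediate last.
  setImmediate(() => record('immediate'));
});
```

Even though the immediate is scheduled after all three timers, it is recorded first: the check phase follows the poll phase directly, while the timers have to wait for the next loop iteration.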

process.nextTick()

Understanding process.nextTick()

You may have noticed that process.nextTick() is not shown in the diagram, even though it is part of the asynchronous API. This is because process.nextTick() is not technically part of the event loop. Instead, the nextTickQueue is processed after the current operation completes, regardless of the current phase of the event loop. Here, an operation is defined as a transition from the underlying C/C++ handler, together with the handling of the JavaScript that needs to be executed.

Looking back at our diagram, any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues. This can create some bad situations because it allows you to "starve" your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase.
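The starvation effect can be observed safely with a bounded recursion. This is a sketch under stated assumptions: MAX_TICKS is an arbitrary limit added so the script terminates, and spin, ticks and ticksSeenByImmediate are hypothetical names. Without the limit, the setImmediate() callback would never run.

```javascript
const MAX_TICKS = 1000; // arbitrary bound so this sketch terminates
let ticks = 0;
let ticksSeenByImmediate = null;

// Scheduled first, but it has to wait for the check phase.
setImmediate(() => {
  ticksSeenByImmediate = ticks;
  console.log(`setImmediate ran after ${ticks} nextTick callbacks`);
});

// Each callback re-queues itself, so the nextTickQueue does not drain
// and the event loop cannot advance until the bound is reached.
function spin() {
  ticks += 1;
  if (ticks < MAX_TICKS) {
    process.nextTick(spin);
  }
}
process.nextTick(spin);
```

The immediate only gets its turn once all 1000 queued callbacks have run, which is why unbounded recursive process.nextTick() calls block I/O entirely.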

Why is this allowed?

Why would something like this be included in Node.js? Part of it is a design philosophy where an API should always be asynchronous, even where it doesn't have to be. Take this code snippet as an example:

function apiCall(arg, callback) {
  if (typeof arg !== 'string')
    return process.nextTick(
      callback,
      new TypeError('argument should be string')
    );
}

The snippet checks the argument and, if it is not correct, passes the error to the callback. The API was updated to allow passing arguments to process.nextTick(): any arguments passed after the callback are propagated as the arguments to the callback, so you don't have to nest functions.

What we are doing is passing an error back to the user, but only after we have allowed the rest of the user's code to execute. By using process.nextTick(), we guarantee that apiCall() always runs its callback after the rest of the user's code and before the event loop is allowed to proceed. To achieve this, the JS call stack is allowed to unwind and then immediately execute the provided callback, which allows a person to make recursive calls to process.nextTick() without reaching a RangeError: Maximum call stack size exceeded from v8.
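As a rough illustration of that guarantee, the sketch below extends apiCall() with a success path. The success branch (upper-casing the string) and the order array are assumptions made up for this example; they are not part of the original snippet.

```javascript
const order = [];

// Same shape as the snippet above, with a hypothetical success path added.
function apiCall(arg, callback) {
  if (typeof arg !== 'string') {
    // Extra arguments after the callback are forwarded to it.
    return process.nextTick(callback, new TypeError('argument should be string'));
  }
  // Hypothetical result, also deferred so the API is always asynchronous.
  process.nextTick(callback, null, arg.toUpperCase());
}

apiCall(42, (err) => {
  order.push(`callback: ${err.message}`);
});
apiCall('ok', (err, result) => {
  order.push(`callback: ${result}`);
});

// This line runs first, even though the callbacks were registered above.
order.push('rest of user code');

// Queued last, so both callbacks have already run when this prints.
process.nextTick(() => console.log(order));
```

Both callbacks run only after 'rest of user code' has been recorded, whether the call failed or succeeded, so callers never have to handle a mix of synchronous and asynchronous behavior.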

This design principle can lead to some potential problems. Take this code snippet as an example:

let bar;

// this has an asynchronous signature, but calls callback synchronously
function someAsyncApiCall(callback) {
  callback();
}

// the callback is called before `someAsyncApiCall` completes.
someAsyncApiCall(() => {
  // since someAsyncApiCall hasn't completed, bar hasn't been assigned any value
  console.log('bar', bar); // undefined
});

bar = 1;

The user defines someAsyncApiCall() to have an asynchronous signature, but it actually operates synchronously. When it is called, the callback provided to someAsyncApiCall() is called in the same phase of the event loop because someAsyncApiCall() doesn't actually do anything asynchronously. As a result, the callback tries to read bar before it has been assigned a value, because the script has not been able to run to completion.

By placing the callback in a process.nextTick(), the script still has the ability to run to completion, allowing all the variables, functions, etc. to be initialized before the callback is called. It also has the advantage of not allowing the event loop to continue. It may be useful for the user to be alerted to an error before the event loop is allowed to continue. Here is the previous example using process.nextTick():

let bar;

function someAsyncApiCall(callback) {
  process.nextTick(callback);
}

someAsyncApiCall(() => {
  console.log('bar', bar); // 1
});

bar = 1;

Here's another real-world example:

const net = require('net');

const server = net.createServer(() => {}).listen(8080);

server.on('listening', () => {});

When only a port is passed, the port is bound immediately. So, the 'listening' callback could be called immediately. The problem is that the .on('listening') callback will not have been set by that time.

To get around this problem, the 'listening' event is queued within nextTick() to allow the script to run to completion. This lets the user set whatever event handlers they want.

process.nextTick() vs. setImmediate()

We have two calls that are similar as far as users are concerned, but their names are confusing.

process.nextTick() fires immediately on the same phase.
setImmediate() fires on the following iteration or 'tick' of the event loop.

In essence, the names should be swapped. process.nextTick() fires more immediately than setImmediate(), but this is an artifact of the past that is unlikely to change. Making this switch would break a large percentage of the packages on npm. Every day more new modules are being added, which means every day we wait, more potential breakages occur. While they are confusing, the names themselves won't change.

We recommend that developers use setImmediate() in all cases because it is easier to reason about.
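A short sketch makes the difference in ordering concrete. The log array is a hypothetical name used only for this example.

```javascript
const log = [];

// Scheduled first, but it waits for the check phase of the event loop.
setImmediate(() => log.push('setImmediate'));

// Scheduled second, but it runs as soon as the main script finishes.
process.nextTick(() => log.push('nextTick'));

log.push('main script');

// A second immediate runs after the first, so the log is complete here.
setImmediate(() => console.log(log.join(' -> ')));
// prints: main script -> nextTick -> setImmediate
```

Despite its name, the process.nextTick() callback runs before the event loop moves on at all, while the setImmediate() callback waits for the check phase.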

Why use process.nextTick()?

There are two main reasons:

To allow users to handle errors, clean up any then-unneeded resources, or perhaps try the request again before the event loop continues.
At times it is necessary to allow a callback to run after the call stack has unwound but before the event loop continues.

Here is a simple example that matches user expectations:

const net = require('net');

const server = net.createServer();
server.on('connection', (conn) => {});

server.listen(8080);
server.on('listening', () => {});

Say that listen() is run at the beginning of the event loop, but the listening callback is placed in a setImmediate(). Unless a hostname is passed, binding to the port happens immediately. For the event loop to proceed, it must hit the poll phase, which means there is a non-zero chance that a connection could have been received, allowing the connection event to be fired before the listening event.

Another example is a function constructor that inherits from EventEmitter and wants to emit an event within the constructor:

const EventEmitter = require('events');
const util = require('util');

function MyEmitter() {
  EventEmitter.call(this);
  this.emit('event');
}
util.inherits(MyEmitter, EventEmitter);

const myEmitter = new MyEmitter();
myEmitter.on('event', () => {
  console.log('an event occurred!');
});

You can't emit an event from the constructor immediately because the script will not have processed to the point where the user assigns a callback to that event. So, within the constructor itself, you can use process.nextTick() to set a callback to emit the event after the constructor has finished, which provides the expected results:

const EventEmitter = require('events');
const util = require('util');

function MyEmitter() {
  EventEmitter.call(this);

  // use nextTick to emit the event once a handler is assigned
  process.nextTick(() => {
    this.emit('event');
  });
}
util.inherits(MyEmitter, EventEmitter);

const myEmitter = new MyEmitter();
myEmitter.on('event', () => {
  console.log('an event occurred!');
});
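The same pattern can also be written with class syntax. This is an alternative sketch and not the form used in the example above; the received counter is a hypothetical name added so the effect can be checked.

```javascript
const EventEmitter = require('events');

class MyEmitter extends EventEmitter {
  constructor() {
    super();
    // Defer the emit until the caller has had a chance to attach handlers.
    process.nextTick(() => {
      this.emit('event');
    });
  }
}

let received = 0;

const myEmitter = new MyEmitter();
myEmitter.on('event', () => {
  received += 1;
  console.log('an event occurred!');
});
```

Because the emit is deferred with process.nextTick(), the handler registered after construction still receives the event exactly once.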

Source: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/