2019-06-09

Programming

21 minutes read (About 3216 words)

The Javascript Event Loop

After writing in Node.js for 3 years, I felt that the majority of the posts about it were lacking deeper knowledge, especially about the event loops internals, so, I decided to dig deep and give you my insights. Node.js seems like a simple beast to handle but in reality, there are a lot of layers of abstraction behind it. From one perspective, this abstraction is great, but, like everything else in software, abstraction comes at a cost.
The purpose of this post is to show you a broader picture of how the Node core handles the event loop, what it really is and how it coexists with the JS runtime.

Lets start with asking a few simple questions and then answer them with detailed examples.

What is the event loop?
What are the phases of the event loop?
What are the differences between the phases of the event loop and Node.js?
How is it implemented in practice?
Is Node.js really single threaded?
Why do we have a thread pool if we already have the event loop that does all the async operations?

What is the event loop?

From the Node.js website, summarized:

Node.js is an event-based platform. This means that everything that happens in Node is the reaction to an event. Abstracted away from the developer, the reactions to events are all handled by a library called libuv.

libuv is cross-platform support library which was originally written for Node.js. It is designed around the event-driven asynchronous I/O model.

libuv provides 2 mechanisms. One is a called an event loop, which runs on one thread and the other mechanism is called a thread pool.

It is important to note that the event loop and the JS runtime run on the same thread and the thread pool is initiated with 4 threads by default. Also, libuv can be used not only in Node.js, that is why I will show the differences between the event loop itself and the event loop in Node.js context.

What are the phases in libuv?

The phases of libuv are:

update loop time - it is the starting point to know time from now on
check loop alive - Loop is considered alive if it has active and ref’d handles, active requests or closing handles.
Run due timers - the timers we pass to libuv, that are past the definition of now, we defined in the first phase, are getting called. The timers are held in a min heap data structure, for efficiency. If the first timer is not expired than we know that all others below it were created later in time.
Call pending callbacks - There are cases in which the callbacks are deffered to next iteration, this is where they are called.
Run idle handles - internal
Run prepare handles - internal
Poll for IO - this is where the loop handles any incoming TCP connections. The time for how long the loop is waiting in each iteration will be explained later on.
Run check handles - All check handles are run here, for example setImmediate if we are in the context of Node.js
Run close callbacks - all .on('close') events are called here, if something happened we want that the timers+poll+check callbacks will complete and afterwards the close callbacks are called, for example, the socket.destroy() in Node.js.
Repeat

The visual diagram of the phases looks like this, do not worry if you still do not get the full picture, we still have things to cover!

What are the differences between the phases of libuv and Node.js

Node.js uses all of the above mentioned phases and adds 2 more: the process.nextTick and microTaskQueue. the differences in Node.js are:

Process.nextTick() is not technically part of the event loop. Instead, the nextTickQueue will be processed after the current operation is completed, regardless of the current phase of the event loop. Here, an operation is defined as a transition from the underlying C/C++ handler, and handling the JavaScript that needs to be executed.
Any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues. This can create some bad situations because it allows you to “starve” your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase.
In the context of native promises, a promise callback is considered as a microtask and queued in a microtask queue which will be processed right after the next tick queue. It is also dangerous as recursive calls to promises in theory will never reach the next tick.
When the event loop enters the poll phase and there are no timers scheduled, one of two things will happen:
If the poll queue is not empty, the event loop will iterate through its queue of callbacks executing them synchronously until either the queue has been exhausted, or the system-dependent hard limit is reached.
If the poll queue is empty, one of two more things will happen:
If scripts have been scheduled by setImmediate(), the event loop will end the poll phase and continue to the check phase to execute those scheduled scripts.
If scripts have not been scheduled by setImmediate(), the event loop will wait for callbacks to be added to the queue, then execute them immediately.
Once the poll queue is empty the event loop will check for timers whose time thresholds have been reached. If one or more timers are ready, the event loop will wrap back to the timers phase to execute those timers’ callbacks

Before we continue, it is important to note that all phases we mentioned(timers, pending, poll, close) always call 1 callback which is a macro operation that is sent to JS land and afterwards the nextTickQueue + microTaskQueue are run, no matter where we are in the event loop. Each macro operation combined with the nextTickQueue and microTaskQueue is called an event loop tick.

Here is the diagram for Node.js event loop, notice there is not much difference except the 2 queues we talked about(nextTickQueue and microTaskQueue).

Did you notice something interesting we learned about the callback queues? Every one talks about a callback queue but in reality each phase has its own callback queue.

How is the event loop implemented in practice?

Here is the c++ code from the libuv github, I am sure you can figure the phases out by yourself and if you look carefully, you will see that the timeout for the poll phase is calculated. Also, we have a special mode of RUN_ONCE which dictates that if no callbacks were called in current poll phase, it means we have no work to do except the timers, so just update time and run timers phase before exiting.

int uv_run(uv_loop_t* loop, uv_run_mode mode) {
  int timeout;
  int r;
  int ran_pending;

  r = uv__loop_alive(loop);
  if (!r)
    uv__update_time(loop);

  while (r != 0 && loop->stop_flag == 0) {
    uv__update_time(loop);
    uv__run_timers(loop);
    ran_pending = uv__run_pending(loop);
    uv__run_idle(loop);
    uv__run_prepare(loop);

    timeout = 0;
    if ((mode == UV_RUN_ONCE && !ran_pending) || mode == UV_RUN_DEFAULT)
      timeout = uv_backend_timeout(loop);

    uv__io_poll(loop, timeout);
    uv__run_check(loop);
    uv__run_closing_handles(loop);

    if (mode == UV_RUN_ONCE) {
      /* UV_RUN_ONCE implies forward progress: at least one callback must have
       * been invoked when it returns. uv__io_poll() can return without doing
       * I/O (meaning: no callbacks) when its timeout expires - which means we
       * have pending timers that satisfy the forward progress constraint.
       *
       * UV_RUN_NOWAIT makes no guarantees about progress so it's omitted from
       * the check.
       */
      uv__update_time(loop);
      uv__run_timers(loop);
    }

    r = uv__loop_alive(loop);
    if (mode == UV_RUN_ONCE || mode == UV_RUN_NOWAIT)
      break;
  }

  /* The if statement lets gcc compile it to a conditional store. Avoids
   * dirtying a cache line.
   */
  if (loop->stop_flag != 0)
    loop->stop_flag = 0;

  return r;
}

Let’s see how the timeout for the poll phase(waiting on TCP connections) is calculated.
To break it down for you it goes like this:

If the stop flag is not 0, return 0
If we dont have active handles or requests, return 0
If the idle and pending queues are not empty return 0
If we have closing handles return 0
Otherwise return the next timeout from the loop

What this means is that the loop will not wait in poll phase if not neccessary, but if we have a timer, then it will wait in poll phase for the time of that timer in order to wrap to the timers phase instead of just doing meaningless iterations.

int uv_backend_timeout(const uv_loop_t* loop) {
  if (loop->stop_flag != 0)
    return 0;

  if (!uv__has_active_handles(loop) && !uv__has_active_reqs(loop))
    return 0;

  if (!QUEUE_EMPTY(&loop->idle_handles))
    return 0;

  if (!QUEUE_EMPTY(&loop->pending_queue))
    return 0;

  if (loop->closing_handles)
    return 0;

  return uv__next_timeout(loop);
}

Is Node.js single threaded?

Well, as we said previously, the event loop runs on the same thread as the JS runtime, which means, that we have one execution thread, so by definition, Node.js is single threaded.

Essentially, when the JS stack is empty of user code(all sync code we defined in our scripts), the event loop runs a tick. Important thing to note - slow sync code can halt the system to the ground at scale, due to the shared thread of the JS runtime with the event loop. More concrete, the context switch to c++ land cannot happen until the JS stack is empty.

If you try to do recursive operations in different queues(different stages of the event loop), even the setImmediate, it will always run on later ticks, remember the 1 macro per tick we talked about? This is it. But, microTaskQueue and nextTickQueue that are defined in JS land, will run recursively on the same tick, because we learned that the event loop will only run when all JS code is executed and the 2 queues above are simply javascript.

Why do we have a thread pool if we already have the event loop that does all the async operations?

This is an interesting part of Node.js land, because it is single threaded, like we already answered, but its also not! It all depends from what perspective you look at it. We do have one execution thread, but we also have the thread pool that comes along with the event loop.

Taken from the libuv website:

Whenever possible, libuv will use asynchronous interfaces provided by the OS, avoiding usage of the thread pool. The same applies to third party subsystems like databases. The event loop follows the rather usual single threaded asynchronous I/O approach: all (network) I/O is performed on non-blocking sockets which are polled using the best mechanism available on the given platform: epoll on Linux, kqueue on OSX and other BSDs, event ports on SunOS and IOCP on Windows.

What this basically means is that all operating systems have some sort of non-blocking IO handling we can leverage. We can subscribe to all events regarding new TCP connections, incoming data from TCP connection and etc. But what about crypto stuff, DNS resolving, file system operations? They are supposed to be non blocking as well, right?

Well, Not all the types of I/O can be performed using async implementations. Even on the same OS platform, there are complexities in supporting different types of I/O. Typically, network I/O can be performed in a non-blocking way using epoll, kqueue, event ports and IOCP, but the file I/O is much more complex. Certain systems, such as Linux does not support complete asynchrony for file system access. And there are limitations in file system event notifications/signaling with kqueue in MacOS.

Ok, we figured out that fs operations are problematic and must be done sync, but why DNS resolving must be sync too? To resolve DNS you must read the hosts file first on the host machine. We must use the fs for that and you get it by now…
Regarding the crypto, Cryptography on the most basic level is performing calculations, and there is no way to to calculations asynchronously on the same thread. For all the reasons above, we have a thread pool. When you send an async fs operation or an async crypto/dns resolve callbacks to the event loop, it gets picked up by the thread pool and resolved on a different thread synchronously.

Examples

Now, after we answered all the questions, lets look at some examples:
We will call the crypto.pbkdf2 async implementation.

I added a comment of how the results look in the previous example with default 4 threads that took 9xx ms completion time

UV_THREADPOOL_SIZE

const { pbkdf2 } = require('crypto');
const start = Date.now();
const doExpensiveHashing = () => {
  pbkdf2('pwd', 'salt', 100000, 512, 'sha512', () =>
    console.log(`Done in ${Date.now() - start}ms`)
  );
};
doExpensiveHashing(); // Done in 937ms
doExpensiveHashing(); // Done in 942ms
doExpensiveHashing(); // Done in 943ms
doExpensiveHashing(); // Done in 943ms

now set the environment variable UV_THREADPOOL_SIZE=1 and run the code again,

const { pbkdf2 } = require('crypto');
const start = Date.now();
const doExpensiveHashing = () => {
  pbkdf2('pwd', 'salt', 100000, 512, 'sha512', () =>
    console.log(`Done in ${Date.now() - start}ms`)
  );
};
doExpensiveHashing(); // Done in 658ms
doExpensiveHashing(); // Done in 1321ms
doExpensiveHashing(); // Done in 1982ms
doExpensiveHashing(); // Done in 2637ms

We saw that we get sequential execution because we have only 1 thread. If you try the sync implementation we will simply block our thread, but the thread pool allows us to do all the non trivial crypto work on another thread and keep our business rolling! Isn’t that cool?

Timers

Before we start, important note, if we set setTimeout to 0, it will be 1ms.

Who will execute first? According to what we saw, it will be setImmediate because the timer has not expired yet, but try and run it several times:

1 2	setTimeout(()=>console.log('setTimeout'),0) setImmediate(()=>console.log('setImmediate'))

First time:

1 2	setImmediate setTimeout

Second time:

1 2	setTimeout setImmediate

The interesting part here is that we know that when the event loop runs, it checks if the timer expired, which in our case is 1ms, which is a long time for a cpu, and the setImmediate will run only on next tick after we defined it and only after poll phase!. If the timer expired on start of next tick it will be called first, if not, the setImmediate will be called first and setTimeout afterwards.

Timers + IO

Now, lets add an IO cycle

const fs = require('fs')

fs.readFile('test.txt', ()=>{
  setTimeout(()=>console.log('setTimeout'),0)
  setImmediate(()=>console.log('setImmediate'))
})

The fs sends the readFile to our thread pool, when the operation finishes in poll phase, it registers the setTimeout and setImmediate and as we saw, the check phase always comes after the poll phase, so our setImmediate is guaranteed to execute first.

Recursion + Timers

Recursive operations in the event loop:

take this example:

const express = require('express')
const app = express()

app.use((req,res,next)=>{
  res.json('ok')
})

function compute(){
  for(i=0;i<1e3;i++){
    console.log(`computing i=${i}...`)
  }
  setImmediate(compute)
  // Promise.resolve().then(compute)
  // process.nextTick(compute)
}

app.listen(3000, ()=>console.log('listening on port 3000'))
compute()

Will our web server answer our requests?
Remember we said that recursive operations are dangerous on microTaskQueue and nextTickQueue in JS land? Well, thats true, which means, that a recursive call to setImmediate will not be immediate(1 macro per tick, remember?), but split to next tick after each invocation and that means that we pass iterations through poll phase which is responsible for our IO. To answer the question - Yes, we will answer the web requests!

Promise.resolve

Now, lets see with Promise.resolve().then(compute)

const express = require('express')
const app = express()

app.use((req,res,next)=>{
  res.json('ok')
})

function compute(){
  for(i=0;i<1e3;i++){
    console.log(`computing i=${i}...`)
  }
  // setImmediate(compute)
  Promise.resolve().then(compute)
  // process.nextTick(compute)
}

app.listen(3000, ()=>console.log('listening on port 3000'))
compute()

The native implementation of promises work with the microTaskQueue, which will run in recursion indefinitely and our web server will never answer!

Fun fact - With native promises we could implement long running promises that do recursions, and would never continue to next ticks - so fs operations and other network IO and phases from c++ land will never get execution time! For these reasons, Bluebird and other promise libraries implemented the resolve and reject to be in setImmediate which will never starve the IO!

The process.nextTick should be obvious by now.

Promise.resolve + nextTick + Timers

Last one before we wrap up:

Promise.resolve().then(() => console.log('promise1 resolved'));
Promise.resolve().then(() => console.log('promise2 resolved'));
Promise.resolve().then(() => {
    console.log('promise3 resolved');
    process.nextTick(() => console.log('next tick inside promise resolve handler'));
});
Promise.resolve().then(() => console.log('promise4 resolved'));
Promise.resolve().then(() => console.log('promise5 resolved'));
setImmediate(() => console.log('set immediate1'));
setImmediate(() => console.log('set immediate2'));

process.nextTick(() => console.log('next tick1'));
process.nextTick(() => console.log('next tick2'));
process.nextTick(() => console.log('next tick3'));

setTimeout(() => console.log('set timeout'), 0);
setImmediate(() => console.log('set immediate3'));
setImmediate(() => console.log('set immediate4'));

Do not be startled! Lets do what we always do in software engineering - break it down

Promise.resolve - native implementation and on js land, it will run last after nextTickQueue
setImmediate runs after poll phase
setTimeout runs on timers phase
process.nextTick must be exhausted before continuing to next tick

Now, nextTick is run on current tick and must be exhausted like we said, so it has to go first, because it runs on JS land regardless of where we are on the event loop! After the nextTick we have the microTaskQueue which like we said previously, is run last after nextTickQueue so it runs.
We have a process.nextTick inside the callback of a micro tick. That one we did not cover on purpose! What do you think will happen? Like we said, the nextTickQueue must be exhasted on each iteration, and we added a callback to the queue, so it wraps back to nextTickQueue and runs the callbacks again.
After all those steps we finally can start a new tick and by this time our timer has elapsed, so our setTimeout gets executed, our setImmediate is last and the result is:

next tick1
next tick2
next tick3
promise1 resolved
promise2 resolved
promise3 resolved
promise4 resolved
promise5 resolved
next tick inside promise resolve handler
set timeout
set immediate1
set immediate2
set immediate3
set immediate4

Summary

We started by asking a few basic questions in Node.js, like: how does its internals look like and how does it operate? Later on, we tried to answer each of those questions as deep as possible. We learned that after all the abstractions, it is a (not so simple) while loop which enables us to build advanced systems and build web servers without worrying about concurrency control or multi threading. Finally, we saw some examples that seem trivial at first, but after our dive to Node core, we were able to answer them.

# Event Loop, Javascript, Node js, Node.js, libuv