Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
177 views
in Technique[技术] by (71.8m points)

javascript - Why setImmediate() execute before fs.readFile() in Nodejs Event Loop's works?

I have read a lot of related documents.But I still can't understand how it works.

const fs = require('fs')
const now = Date.now();

setTimeout(() => console.log('timer'), 10);
fs.readFile(__filename, () => console.log('readfile'));
setImmediate(() => console.log('immediate'));
while(Date.now() - now < 1000) {
}

const now = Date.now();

setImmediate(() => console.log('immediate'));
setTimeout(() => console.log('timer'), 10);
while(Date.now() - now < 1000) {
}

I think the first piece of code should log:

readfile
immediate

And the second piece of code logs.

timer
immediate

I think it is ok.

Problem: I don't understand why the first piece of code logs

immediate
readfile

I think that the file has been read completely and its callback function enqueue I/O callbacks phase's queque after 1 second.

And then I think event loop will move to timers(none),I/O callbacks(fs.readFile's callback),idle/prepare(none),poll(none),check(setImmediate's callback)and finally close callbacks(none) in order, but result is that setImmediate() still run first.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The behavior you see is because there are multiple types of queues in the event loop and the system runs events in an order according to their type. It is not just one giant event queue where everything runs in FIFO order based on when it was added to the event queue. Instead, it likes to run all the events of one type (up to a limit), advance to the next type, runs all those and so on.

And, I/O events are only added to their queue at one specific point in the cycle so they are forced into a particular order. This is the reason that setImmediate() callback executes before the readFile() callback even though both are ready to go when the while loop is done.

And then I think event loop will move to timers(none),I/O callbacks(fs.readFile's callback),idle/prepare(none),poll(none),check(setImmediate's callback)and finally close callbacks(none) in order, but result is that setImmediate() still run first.

The issue is that the I/O callbacks stage of the event loop runs I/O callbacks that are already in the event queue, but they don't get put into the event queue automatically when they are done. Instead, they are only put into the event queue later in the process in the I/O poll step (see diagram below). So, the first time through the I/O callbacks stage, there are no I/O callbacks to run yet, thus you don't get the readfile output when you think you would.

But, the setImmediate() callback is ready the first time through the event loop so it gets to run before the readFile() callback.

This delayed addition of I/O callbacks is likely the explanation for why you are surprised that the readFile() callback happens last instead of before the setImmediate() callback.

Here's what happens when the while loop finishes:

  1. When the while loop finishes, it starts with timer callbacks and sees that the timer is ready to run so it runs that.
  2. Then, it goes to run any I/O callbacks that are already there, but there are none yet. The I/O callback from the readFile() has not been collected yet. It will get collected later in this cycle.
  3. Then, it goes through several other stages and gets to I/O poll. There is collects the readFile() callback event and puts it in the I/O queue (but does not run it yet).
  4. Then, it goes to the checkHandlers phase where it runs the setImmediate() callback.
  5. Then, it starts the event loop over again. There are no timers so it goes to I/O callbacks and it finally finds and runs the readFile() callback.

So let's document what actually happens in your code in a little more detail for those that are not as familiar with the event loop process. When you run this code (with timing added to the output):

const fs = require('fs')

let begin = 0;
function log(msg) {
    if (!begin) {
        begin = Date.now();
    }
    let t = ((Date.now() - begin) / 1000).toFixed(3);
    console.log("" + t + ": " + msg);
}

log('start program');

setTimeout(() => log('timer'), 10);
setImmediate(() => log('immediate'));
fs.readFile(__filename, () => log('readfile'));

const now = Date.now();
log('start loop');
while(Date.now() - now < 1000) {}
log('done loop');

You get this output:

0.000: start program
0.004: start loop
1.004: done loop
1.005: timer
1.006: immediate
1.008: readfile

I've added timing in seconds relative to when the program started so you can see when things are executed.

Here's what happens:

  1. Timer is started and set for 10ms from now, other code continues to run
  2. fs.readFile() operation is started, other code continues to run
  3. setImmediate() is registered into the event system and its event is in the appropriate event queue, other code continues to run
  4. while loop starts looping
  5. During the while loop, the fs.readFile() finishes its work (running in the background). It's event is ready, but not yet in the appropriate event queue (more on this later)
  6. while loop finishes after 1 second of looping and this initial sequence of Javascript is done and returns back to the system
  7. The interpreter now needs to get the "next" event from the event loop. But, all types of events are not treated equally. The event system has a particular order that it processes different types of events in the queue. In our case, here, the timer event gets processed first (I will explain this in the following text). The system checks to see if any timers have "expired" and are ready to call their callback. In this case it finds that our timer has "expired" and is ready to go.
  8. Timer callback is called and we see the console message timer.
  9. There are no more timers so the event loop advances to its next stage. The next stage of the event loop is to run any pending I/O callbacks. But, there are no pending I/O callbacks in the event queue yet. Even though the readFile() is done by now, it is not yet in the queue (explanation coming).
  10. Then, the next step is to gather any I/O events that have finished and get them ready to run. Here, the readFile() event will be collected (though not run yet) and put into the I/O event queue.
  11. Then the next step is to run any setImmediate() handlers that are pending. When it does that, we get the output immediate.
  12. Then, the next step in the event process is to run any close handlers (there are none here to run).
  13. Then, the event loop starts over again by checking for timers. There are no pending timers to run.
  14. Then, the event loop runs any pending I/O callbacks. Here the readFile() callback runs and we see readfile in the console.
  15. The program has no more events to wait for so it executes.

The event loop itself is a series of queues for different types of events and (with some exceptions), each queue is processed before moving onto the next type of queue. This causes groupings of events (timers in one group, pending I/O callbacks in another group, setImmediate() in another group and so on). It is not a strict FIFO queue among all types. Events are FIFO within a group. But, all pending timer callbacks (up to some limit to keep one type of event from hogging the event loop indefinitely) are processed before other types of callbacks.

You can see the basic structure in this diagram:

enter image description here

which comes from this very excellent article. If you really want to understand all this stuff, then read this referenced article several times over.

What initially surprised me is why the readFile always comes at the end. This is because even though the readFile() operation is done, it is not immediately put in the queue. Instead, there's a step in the event loop where completed I/O events are collected (to be processed in the next cycle through the event loop) and setImmediate() events are processed at the end of the current cycle before the I/O events that were just collected. This makes the readFile() callback go after the setImmediate() callback, even though they both are ready to go during the while loop.

And, further, it doesn't matter which order you execute the readFile() and the setImmediate(). Because they both are ready to go before the while loop is done, their execution order is determined by the sequencing though the event loop as it runs different types of events, not by exactly when they finished .


In your second code block, you remove the readFile() and put the setImmediate() before the setTimeout(). Using my timed version, that would be this:

const fs = require('fs')

let begin = 0;
function log(msg) {
    if (!begin) {
        begin = Date.now();
    }
    let t = ((Date.now() - begin) / 1000).toFixed(3);
    console.log("" + t + ": " + msg);
}

log('start program');

setImmediate(() => log('immediate'));
setTimeout(() => log('timer'), 10);

const now = Date.now();
log('start loop');
while(Date.now() - now < 1000) {}
log('done loop');

And, it generates this output:

0.000: start program
0.003: start loop
1.003: done loop
1.005: timer
1.008: immediate

The explanation is similar (shortened a bit this time since lots of detail was explained earlier).

  1. setImmediate() is registered into the appropriate queue.
  2. setTimeout() is registered into the timer queue.
  3. The while loop runs for its 1000ms
  4. The code finishes executing and returns control back to the system
  5. The system starts at the top of the event logic which starts with timer events. The timer we started earlier is done now so it runs its callback and logs timer.
  6. With no more timers, the event loop runs through several other types of event queues until it gets to where it runs setImmediate() handlers and it logs immediate.

If, you have multiple items schedule to start within an I/O callback such as this:

// timeout_vs_immediate.js
const fs = require('fs');

fs.readFile(__filename, () => {
  setTimeout(() => {
    console.log('timeout');
  }, 0);
  setImmediate(() => {
    console.log('immediate');
  });
});

Then, you get slightly different behavior because the setTimeout() and setImmediate() will be scheduled when the event loop is in a different part of its cycle. In this particular example, the setImmediate() will always execute BEFORE the timer so the output will be:

 immediate
 timeout

In the flow chart above, you can see where the "run completed I/O handlers" step is. Because the setTimeout() and setImmediate() calls will be scheduled from within an I/O handler, they are going to be scheduled at the "Run completed I/O handlers" phase of the event loop. Follow the flow of the event loop, the setImmediate() will get serviced in the "check handlers" phase before the event loop gets back around to service timers.

If the setImmediate() and setTimeout() are scheduled at a different point in the event loop, then the timer may fire before setImmediate() which is what happens in the earlier example. So, the relative timing of the two depends upon what phase the event loop is in when the functions are called.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...