This is the third post of a series about asynchronous programming. The whole series tries to answer the simple question: "What is asynchrony?".
At first, when I had just started digging into the question, I thought I knew what it was. It turned out that I didn't have a clue what asynchrony is all about. So, let's find out!
- Asynchronous programming. Blocking I/O and non-blocking I/O
- Asynchronous programming. Cooperative multitasking
- Asynchronous programming. Await the Future
- Asynchronous programming. Python3.5+
Some applications implement parallelism using multiple processes instead of multiple threads (see the first post). Although the programming details differ, conceptually it is the same model, so in this post I talk in terms of threads, but you can easily substitute processes.
Also, here we will talk only in terms of explicit cooperative multitasking, that is, callbacks, because it is the most common and widely used option in asynchronous framework implementations.
The most common activity of modern applications is dealing with input and output rather than heavy number-crunching. The problem with input/output (I/O) functions is that they are blocking. An actual write to a hard disk or a read from the network takes an extremely long time compared to the speed of the CPU. These functions don't return until the operation is done, so in the meantime your application is doing nothing. For applications that require high performance, this is a major roadblock, because other activities and other I/O operations are kept waiting.
One of the standard solutions is to use threads. Each blocking I/O operation is started in a separate thread. When the blocking function gets invoked in the thread, the processor can schedule another thread to run, which actually needs the CPU.
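A minimal sketch of this thread-per-operation approach, assuming `time.sleep` as a stand-in for a real blocking call. The names `blocking_io` and `run_in_threads` are made up for illustration and do not come from any framework:

```python
import threading
import time


def blocking_io(name, seconds, results):
    """Stand-in for a blocking call such as a disk write or a network read."""
    time.sleep(seconds)  # this thread is parked here; other threads may run
    results[name] = "done"


def run_in_threads(jobs):
    """Start one thread per blocking job and wait for all of them to finish."""
    results = {}
    threads = [
        threading.Thread(target=blocking_io, args=(name, seconds, results))
        for name, seconds in jobs
    ]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.monotonic() - start


results, elapsed = run_in_threads([("disk", 0.2), ("network", 0.2), ("db", 0.2)])
print(sorted(results), f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single wait rather than the sum of all three.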
In this post, we will be talking about the concepts of synchrony and asynchrony in general.
In the synchronous model, a thread is assigned to a single task and starts working on it. When the task is completed, the thread takes the next task and does the same: it performs all its commands one after the other to complete one specified task. In this system, a thread cannot leave a task halfway and move on to the next. Because of this, we know for sure that whenever and wherever a function is executed, it cannot be put on hold and will run to completion before another one starts (one that could change the data the current function works with).
If the system runs in a single thread and has several tasks to handle, they are executed in that one thread sequentially, one after the other.
And if the tasks are always performed in a definite order, the implementation of a later task can assume that all earlier tasks have finished without errors, with all their output available for use — a definite simplification in logic.
If one of the commands is slow, then the whole system waits for that command to finish. There is no way around it.
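The single-threaded synchronous model can be sketched in a few lines. This is only an illustration: `run_sequentially` is a hypothetical helper, and `time.sleep` stands in for a slow command.

```python
import time


def run_sequentially(tasks):
    """Run (name, seconds) tasks one after the other in the calling thread."""
    finished = []
    start = time.monotonic()
    for name, seconds in tasks:
        time.sleep(seconds)    # a slow command holds up everything behind it
        finished.append(name)  # earlier tasks are guaranteed to be complete here
    return finished, time.monotonic() - start


finished, elapsed = run_sequentially([("slow", 0.2), ("fast", 0.01), ("fast2", 0.01)])
print(finished, f"{elapsed:.2f}s")
```

The two fast tasks cannot start until the slow one has returned, so the total time is at least the sum of all the waits.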
In a multi-threaded system, the principle is preserved — one thread is assigned to one task and works on it until it is completed.
But in this system, each task is performed in a separate thread of control. The threads are managed by the operating system and may run in parallel on a system with multiple processors or multiple cores, or may be interleaved on a single processor.
Only now we have more than one thread, so several different tasks can be executed in parallel. Tasks usually differ in how long they take to process, and a thread that has finished one of its tasks can move on to the next one.
Multithreaded programs are more complicated and typically more error-prone. Common troublesome issues include race conditions, deadlocks, and resource starvation.
The other approach uses a different style: the asynchronous, non-blocking style. Asynchrony is a style of concurrent programming, but it is not parallelism.
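As a small illustration of the extra care threads require, here is a sketch of a shared counter protected by a lock. The names `increment_many` and `run_counter` are invented for this example. Without the lock, the read-modify-write on the counter could interleave between threads and updates could be lost.

```python
import threading


def increment_many(counter, lock, times):
    """Increment a shared counter; the lock makes each update atomic."""
    for _ in range(times):
        with lock:
            counter["value"] += 1


def run_counter(num_threads=4, times=10_000):
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=increment_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]


total = run_counter()
print(total)
```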
Most modern operating systems provide event notification subsystems. For example, a normal read call on a socket would block until the sender actually sent something. Instead, the application can ask the operating system to watch the socket and put an event notification in a queue. The application can inspect the events at its convenience (perhaps doing some number crunching first, to use the processor to the maximum) and grab the data. It is asynchronous because the application expressed interest at one point, then used the data at another point (in time and space). It is non-blocking because the application thread was free to do other tasks.
Asynchronous code takes the blocking operation out of the main application thread: the operation still gets executed, but sometime later (or maybe somewhere else), and the handler can move on in the meantime. Simply put, the main thread submits the task and defers its execution to some later time (or to another independent thread).
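In Python, the standard `selectors` module wraps these notification subsystems. The sketch below uses a local `socket.socketpair()` in place of a real network connection. The `demo` function and the `on_readable` callback are names chosen for this example.

```python
import selectors
import socket


def demo():
    selector = selectors.DefaultSelector()
    reader, writer = socket.socketpair()  # stand-in for a network connection
    reader.setblocking(False)
    received = []

    def on_readable(sock):
        received.append(sock.recv(1024))

    # Express interest: ask the OS to tell us when `reader` has data.
    selector.register(reader, selectors.EVENT_READ, data=on_readable)

    # Nothing was sent yet, so polling returns immediately with no events
    # and the thread is free to do other work.
    events_before = selector.select(timeout=0)

    writer.sendall(b"hello")

    # Later, at our convenience, collect the notifications and run callbacks.
    for key, _mask in selector.select(timeout=1):
        key.data(key.fileobj)

    selector.unregister(reader)
    reader.close()
    writer.close()
    selector.close()
    return events_before, received


events_before, received = demo()
print(events_before, received)
```

The callback is stored in the `data` slot of the registration, which is one common way to attach a handler to a watched socket.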
Asynchrony and context switching
While asynchronous programming can prevent the multithreading issues listed above, it was actually designed for an entirely different problem: CPU context switching. When you have multiple threads running, each CPU core can still only run one thread at a time. To let all threads and processes share resources, the operating system makes the CPU switch context very often. To oversimplify things, at unpredictable intervals the state of the running thread is saved and the CPU switches to another thread. The CPU is constantly switching between your threads at non-deterministic intervals. Threads are also resources, and they are not free.
Asynchronous programming is essentially cooperative multitasking with user-space threading, where the application, rather than the operating system, manages the threads and the context switching. Basically, in an asynchronous world, context is switched only at defined switch points rather than at non-deterministic intervals.
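The idea of defined switch points can be sketched with plain generators, where every `yield` is a place at which a task voluntarily hands control back. This is a toy round-robin scheduler written for illustration, not how any particular framework is implemented. The names `task` and `run` are hypothetical.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one step of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # the only point at which this task can be switched out


def run(tasks):
    """Round-robin scheduler living entirely in user space."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # resume the task until its next yield
        except StopIteration:
            continue               # the task finished; drop it
        queue.append(current)      # otherwise reschedule it


log = []
run([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

The interleaving is fully deterministic, because a switch can only happen at a `yield`.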
Compared to the synchronous model, the asynchronous model performs best when:
- There are a large number of tasks so there is likely always at least one task that can make progress;
- The tasks perform lots of I/O, causing the synchronous program to waste lots of time blocking when other tasks could be running;
- The tasks are largely independent of one another so there is little need for inter-task communication (and thus for one task to wait upon another).
These conditions almost perfectly characterize a typical busy server (like a web server) in a client-server environment. Each task represents one client request, with I/O in the form of receiving the request and sending the reply. The server implementation is a prime candidate for the asynchronous model, which is why Twisted and Node.js, among other asynchronous server libraries, have gained so much popularity in recent years.
Why not just use more threads? If one thread is blocking on an I/O operation, another thread can make progress, right? However, as the number of threads increases, your server may start to experience performance problems. With each new thread, there is some memory overhead associated with creating and maintaining the thread state. Another performance gain from the asynchronous model is that it avoids context switching: every time the OS transfers control from one thread to another, it has to save all the relevant registers, memory map, stack pointers, CPU context and so on, so that the other thread can resume execution where it left off. The overhead of doing this can be quite significant.
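As a rough illustration of many I/O-bound tasks sharing a single thread, here is a sketch using `asyncio` from the standard library. `asyncio.sleep` stands in for waiting on a client socket, and `handle_request` and `serve` are hypothetical names. `asyncio.run` requires Python 3.7 or newer.

```python
import asyncio
import threading


async def handle_request(request_id, thread_ids):
    """Pretend to serve one client: wait for I/O, then build a reply."""
    thread_ids.add(threading.get_ident())  # record which thread ran us
    await asyncio.sleep(0.05)              # stand-in for waiting on the network
    return f"reply {request_id}"


async def serve(num_requests):
    thread_ids = set()
    replies = await asyncio.gather(
        *(handle_request(i, thread_ids) for i in range(num_requests))
    )
    return replies, thread_ids


replies, thread_ids = asyncio.run(serve(1000))
print(len(replies), "replies served by", len(thread_ids), "thread")
```

All one thousand waits overlap on one thread, so no per-request thread state has to be created and the OS has no threads to switch between here.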
How can the event of a new task arriving reach the application if the execution thread is busy processing another task? The fact is that the operating system has many threads, and the code that actually interacts with the user is executed separately from our application and only sends messages to it.
And how is all this event and thread management done? In the event loop.
The event loop is exactly what it sounds like: there is a queue of events (where all the events that have happened are stored; it is called the task queue in the picture above) and a loop that constantly pulls these events off the queue and executes callbacks for them (all execution happens on the call stack). The API block represents the API for asynchronous function calls, such as waiting for a response from the client or the database.
So all operations first go onto the call stack, then asynchronous commands go into the API, and once they are done, the required callback goes into the task queue and then back onto the call stack for execution.
Coordination of this process takes place in the event loop.
Do you see how this is different from the reactor pattern we talked about in the last post? Right, it isn't.
When the event loop forms the central control flow construct of a program, as it often does, it may be termed the main loop or main event loop. This title is appropriate because such an event loop is at the highest level of control within the application.
In event-driven programming, an application expresses interest in certain events and responds to them when they occur. The responsibility for gathering events from the operating system or monitoring other sources of events lies with the event loop, and the user can register callbacks to be invoked when an event occurs. The event loop usually keeps running forever.
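A toy version of such a loop fits in a few lines. This is a sketch for illustration only: the `EventLoop` class and its `on`, `emit` and `run` methods are invented here and do not mirror the API of any real framework. Unlike a real event loop, it stops once the queue is empty instead of running forever.

```python
from collections import deque


class EventLoop:
    def __init__(self):
        self._handlers = {}    # event name -> list of registered callbacks
        self._queue = deque()  # the task queue of events that have happened

    def on(self, event, callback):
        """Express interest in an event by registering a callback."""
        self._handlers.setdefault(event, []).append(callback)

    def emit(self, event, payload=None):
        """Record that an event happened; callbacks run later, in run()."""
        self._queue.append((event, payload))

    def run(self):
        """Pull events off the queue and invoke their callbacks in order."""
        while self._queue:
            event, payload = self._queue.popleft()
            for callback in self._handlers.get(event, []):
                callback(payload)


log = []
loop = EventLoop()


def on_request(payload):
    log.append(f"request {payload}")
    loop.emit("response", payload.upper())  # schedule follow-up work


loop.on("request", on_request)
loop.on("response", lambda payload: log.append(f"response {payload}"))
loop.emit("request", "ping")
loop.emit("request", "pong")
loop.run()
print(log)
```

Both requests are handled before either response, because the response events are queued behind the events that were already waiting.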
JS event loop concept explained: What the heck is the event loop anyway? | Philip Roberts | JSConf EU
Summarizing the whole theoretical series:
- Asynchronous operations can make an application more efficient and, most importantly, fast for the user.
- Resource savings. OS threads are cheaper than processes, but dedicating one thread to each task is still very expensive. It is more efficient to reuse a thread, and that is what asynchronous programming gives us.
- This is one of the most important techniques for optimizing and scaling I/O-bound applications (and yes, it will not help with CPU-bound tasks).
- Asynchronous programs are difficult for the programmer to write and debug