This is the third post in a series on asynchronous programming. The whole series tries to answer a simple question: "What is asynchrony?". In the beginning, when I first started digging into the question, I thought I knew what it was. It turned out that I didn't know the slightest thing about asynchrony. So let's find out!
- Asynchronous programming. Blocking I/O and non-blocking I/O
- Asynchronous programming. Cooperative multitasking
- Asynchronous programming. Await the Future
- Asynchronous programming. Python3.5+
Some applications implement parallelism using several processes instead of several threads. Although the implementation details are different, conceptually it is the same model and in this post, I use the terms threads, but you can easily change it into processes.
Also here we will speak only in terms of explicit cooperative multitasking — callbacks, since this is the most common and widely used variant for implementing asynchronous frameworks. But I think it's also interchangeable with cooperative threads.
The most common activity of modern applications is working with input and output operations(I/O) rather than with a lot of number crunching. The problem with using I/O functions is that they are blocking. Actual writing to a hard disk or reading from a network takes a lot of time compared to the speed of CPU. Functions cannot be completed until the task is done, meanwhile your application does nothing. For the applications that require high performance, this is a major obstacle because other actions and other I/O operations keep waiting.
One of the standard solutions is to use threads. Each blocking I/O operation starts in a separate thread. When a thread calls the blocking function, the processor schedules another thread that actually needs CPU.
In this execution model, thread is assigned to one task and starts the execution of its commands. When the task is complete, the thread takes the next task and does the same: it executes all its commands one after another to perform one specified task. In such a system, a thread cannot leave a task halfway done and go to the next one. So we can be sure that when a function is executed — it can't be put on hold and will be completely finished before another one starts (and can change the data the current function is working with).
If a system is executed in a single thread and there are several tasks associated with it, they will be executed in this one thread one after another sequentially.
And if tasks are always executed in a certain order, the execution of a later task can assume that all earlier tasks ended without errors, with all their results available for use — a certain simplification in logic.
And if one of the commands is slow, the whole system will wait for the completion of this command — it is impossible to bypass it.
In a multi-threaded system, the principle is preserved — one thread is assigned to one task and works on it until it is completed.
But in the multi-threaded system, each task is executed in a separate control thread. Threads are controlled by the operating system and can be executed, in a system with several processors or several cores in parallel or can be multiplexed on one processor.
Now we have more than one thread and tasks (not one task, but several different tasks) to be executed at the same time. In fact, the thread that has completed work on one of its tasks can proceed to the next one.
Synchronizing access to data is one of the first things a person will encounter when taking on multithreaded programming. Low-level languages like C doesn't have any built-in primitives for access synchronization at all, and you will at least need to use POSIX semaphores or write your own solutions.
The problem is that in most non-isolated languages, two threads can read or write to the same variable without warning. If you do not solve such situations, you can easily get undefined behavior of your program, but most likely it will crash.
Overall multithreaded programs are more complex and tend to be more prone to errors, and include common issues: race conditions, deadlocks, and resource exhaustion. Additional concepts and entities are introduced to resolve these problems(locks, semaphores, timeouts, etc).
Another execution model uses a different style, asynchronous style.
A small note. Do not confuse non-blocking and asynchronous I/O. Asynchronous is the opposite of synchronous, while non-blocking I/O is the opposite of blocking. Both are quite similar, but also differ in that asynchronous is used with a wider range of operations, while non-blocking is mainly used with I/O.
Most modern operating systems provide subsystems of event notification. For example, a usual
read call to a socket will be blocked until the sender really sends something. Instead, an application can ask the operating system to watch the socket and queue the notification event until the data will be ready. The application can check the events whenever it likes (perhaps by doing some scripting before using the processor to the maximum) and process other things in the meantime. This is asynchronous because application at one point expresses interest and at another point uses data (in time and space).
The asynchronous code removes the blocking operation from the main thread of the application, so it continues to run, but after some time (or maybe somewhere else). Simply put, the main thread saves the task and passes its execution to a later time.
Asynchrony and context switching
Although asynchronous programming can speed up I/O tasks and solve thread synchronization problems it was actually designed to deal with a completely different problem — frequent switching of processor context.
When several threads are launched, each processor core can still execute only one thread at a time. To allow all the threads/processes to share resources, CPU context switching is very frequent. To simplify the situation, the processor constantly saves all the information about the thread at random intervals and switches to another thread. Threads are also resources, they are not free. This leads to frequent dumps and loads of thread' data and therefore to frequent misses in the low-level CPU cache and therefore possible performance problems.
Asynchronous programming is essentially cooperative multitasking with the user-space threading where the application manages the threads and switches the context, not the OS. Basically, in the asynchronous world, context switching only occurs at certain switching points, not at undetermined intervals.
Compared to the synchronous model, the asynchronous model works best when:
- There is a large number of tasks, so there is probably always at least one task that can make progress;
- Tasks perform many I/O operations, which causes the synchronous program to spend a lot of time in blocking mode when other tasks can be performed;
- Tasks are largely independent of each other, so there is no need for intertask interaction (and therefore, no need to wait for one task from another).
These conditions almost perfectly characterize a typical busy server (for example, a web server) in a client-server system. Each task is a single client request with input/output in the form of receiving the request and sending the response. The server implementation is the main candidate for an asynchronous model, so Twisted and Node.js along with other asynchronous libraries, have become so popular in recent years.
Why not just use more threads? If one thread is blocked by I/O operation, the other thread can make progress, right?
Threads are still a resource and they are not free and limited by the OS. As the number of threads increases, your server may begin to experience performance problems. With each new thread, there are some memory overheads associated with creating and maintaining a thread's state.
Another performance gain from an asynchronous model is that it avoids context switching — every time the operating system transfers control from one thread to another it must store all the relevant registers, memory map, stack pointers, processor context, etc. so that another thread can resume execution where it stopped. The cost of this can be quite significant.
How can an event of a new task arrival reach the application if the execution thread is busy processing another task?
Everything depends on the implementation. Different libraries use different approaches. Some of them use cooperative multitasking and simply produce coroutines and some of them separate event receipts and event processing into different threads (OS threads or user-level threads).
But as we understood from the previous post reactor/proactor patterns have their advantages.
And how do you manage all the events? In the analog of our reactor/proactor patterns — the scheduler or the event loop.
An event loop is exactly how it sounds, there is a queue of events (where all the events that have happened are stored — in the figure above it's called a "task queue") and a loop that simply constantly pulls those events out of the queue and calls callbacks from those events (all execution goes through the call stack). The API is an API for calls to asynchronous functions such as waiting for a response from a client or a database.
In this flow, all function calls go first to the call stack, then asynchronous commands are executed through API and after they are completed, the callback goes to the task queue and then to the call stack again.
The coordination of this process takes place in the event loop.
You see how this differs from the reactor pattern we talked about [in the last post] (https://luminousmen.com/post/asynchronous-programming-cooperative-multitasking)? That's right — they are the same.
When the event loop forms the central control flow of the program, as it often happens, it can be called the main loop or main event loop. This name is appropriate, because such an event loop is at the highest level of control inside the application.
In event-driven programming, the application expresses interest in certain events and responds to them when they occur. It is the responsibility of the event loop to collect events from the operating system or monitor other event sources, and the user can register callbacks that will be called when an event occurs.
The event loop usually runs forever.
JS event loop concept explained: What the heck is the event loop anyway? | Philip Roberts | JSConf EU
Summing up the whole theoretical series:
Asynchronous operations in applications can make them more efficient, and most importantly, faster for the user.
OS threads are cheaper than processes, but it is still very expensive to use one thread for each task. Reuse will be more efficient and this is what asynchronous programming provides us with.
The asynchronous model is one of the most important approaches to optimizing and scaling applications related to input/output (yes — it will not help in the case of tasks related to CPU).
Asynchronous programs may be difficult to write and debug.
Check out my book on asynchronous concepts: Link