This is the fourth post in a series on asynchronous programming. The whole series explores a single question: What is asynchrony? When I first started digging into this, I thought I had a solid grasp of it. Turns out, I didn't know the first thing about asynchrony. So, let’s dive in together!
Whole series:
- Asynchronous Programming. Blocking I/O and non-blocking I/O
- Asynchronous Programming. Threads and processes
- Asynchronous Programming. Cooperative multitasking
- Asynchronous Programming. Await the Future
- Asynchronous Programming. Python3.5+
Some applications implement parallelism using several processes instead of several threads. Although the implementation details differ, conceptually it is the same model, so in this post I use the term threads; you can easily substitute processes.
Also, here we will speak only in terms of explicit cooperative multitasking with callbacks, since this is the most common and widely used variant for implementing asynchronous frameworks. The same reasoning applies to cooperative threads as well.
The most common activity in modern applications is working with input and output operations (I/O) rather than intensive number crunching. The issue with I/O functions is that they are blocking. Writing to a disk or reading from a network takes much longer than the CPU takes to process data, so functions cannot complete until these tasks finish, leaving the application idle in the meantime. For high-performance applications, this is a significant bottleneck, as other tasks and I/O operations end up waiting.
One standard solution is to use threads, where each blocking I/O operation runs in a separate thread. When one thread calls a blocking function, the CPU scheduler can allocate resources to another thread that needs processing time.
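To make this concrete, here is a minimal sketch of the thread-per-blocking-call idea in Python. The URLs are placeholders chosen for the example: each download blocks its own thread, and the OS scheduler keeps the other threads running in the meantime.

```python
import threading
import urllib.request

def fetch(url: str) -> None:
    # urlopen() blocks this thread until the response arrives;
    # the OS can schedule the other threads while we wait.
    with urllib.request.urlopen(url) as response:
        print(url, "->", len(response.read()), "bytes")

urls = ["https://example.com", "https://example.org"]
threads = [threading.Thread(target=fetch, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until every download has finished
```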
Synchrony
In a synchronous execution model, each thread is assigned one task and runs through its commands. When the task completes, the thread picks up the next task, executing commands sequentially. Here, a thread can’t leave a task halfway to start another one, so we know that when a function begins, it will run to completion without interruption.
Single Thread
If the system runs in a single thread and has several tasks to perform, they are executed in that one thread sequentially, one after another.
With tasks always executing in a fixed order, later tasks can assume earlier ones finished successfully, simplifying logic. However, if one task is slow, the entire system waits for its completion—there’s no way to bypass it.
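A tiny sketch of this model, with time.sleep standing in for blocking work and invented task names:

```python
import time

def task(name: str, seconds: float) -> None:
    time.sleep(seconds)  # stand-in for a blocking operation
    print(name, "done")

# "fast again" cannot start until "slow" has finished:
for name, seconds in [("fast", 0.1), ("slow", 2.0), ("fast again", 0.1)]:
    task(name, seconds)
```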
Multiple Threads
In a multi-threaded system, the principle remains the same — each thread handles one task from start to finish. However, with multiple threads, each task runs in a separate thread, which the OS manages. On multi-core processors, these tasks can run in parallel or be multiplexed on a single core.
With multiple threads, several tasks can execute simultaneously. Once a thread finishes one task, it can proceed to the next.
Multithreaded programming introduces the need to synchronize data access. Low-level languages like C have no built-in synchronization, so you rely on libraries such as POSIX threads (mutexes, semaphores) or custom solutions. Without proper synchronization, threads may read or write the same variable simultaneously, leading to undefined behavior or crashes.
Multithreaded programs are generally more complex and error-prone, with common issues like race conditions, deadlocks, and resource exhaustion. Various tools — such as locks, semaphores, and timeouts — help manage these problems.
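As a concrete illustration, here is a minimal sketch of the classic race condition and its lock-based fix, using Python's threading module (the iteration counts are arbitrary). Without the lock, two threads can read the same value of counter before either writes back, and increments get lost.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:        # remove this line to risk lost updates
            counter += 1  # the read-modify-write must not be interleaved

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # with the lock, this is always 400000
```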
Asynchrony
Another execution model is the asynchronous style.
Most modern OSs provide event notification subsystems. For example, a typical read call on a socket blocks until the sender sends data. But an application can instead ask the OS to watch the socket and queue a notification event once data is available. The application can then check for events whenever it chooses, running other tasks in the meantime to keep the CPU busy. This is asynchronous because the application expresses interest at one point and consumes the data at another.
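In Python, the standard selectors module wraps these OS subsystems (epoll, kqueue, select). Here is a hedged sketch of an echo-style server that registers interest in sockets and only touches the ones the OS reports as ready; the address and port are arbitrary choices for the example.

```python
import selectors
import socket

selector = selectors.DefaultSelector()

server = socket.socket()
server.bind(("localhost", 8080))  # address/port chosen for illustration
server.listen()
server.setblocking(False)
selector.register(server, selectors.EVENT_READ)

while True:
    # select() waits until some registered socket is ready, then reports which
    for key, _ in selector.select():
        if key.fileobj is server:
            conn, _ = server.accept()       # ready, so this returns at once
            conn.setblocking(False)
            selector.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(1024)   # ready, so this returns at once
            if data:
                key.fileobj.send(data)      # echo back (may send partially; fine for a sketch)
            else:
                selector.unregister(key.fileobj)
                key.fileobj.close()
```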
Asynchronous code removes blocking from the application’s main thread, allowing it to keep running while the task completes in the background. Essentially, the main thread saves the task for later execution.
Asynchrony and Context Switching
Asynchronous programming can speed up I/O tasks and ease thread synchronization issues but was originally designed to minimize frequent processor context switching.
When multiple threads are running, each core still executes one thread at a time, requiring frequent context switches to share CPU resources. The CPU must save all the thread’s data and switch to another at regular intervals. Threads themselves are resources, and this frequent switching can cause cache misses and degrade performance.
Asynchronous programming is essentially cooperative multitasking with user-space threading — the application, not the OS, manages thread switching. Context switches occur only at specific points, rather than random intervals.
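Python generators make this easy to demonstrate. In the toy scheduler below (an illustration, not any real framework), each task yields at points it chooses, and a tiny user-space scheduler switches between them; no OS threads are involved, and no switch ever happens at a random instant.

```python
from collections import deque

def task(name: str, steps: int):
    for i in range(steps):
        print(name, "step", i)
        yield  # explicit switch point: control returns to the scheduler

ready = deque([task("A", 3), task("B", 2)])
while ready:
    current = ready.popleft()
    try:
        next(current)           # resume the task until its next yield
        ready.append(current)   # still running, put it back in line
    except StopIteration:
        pass                    # the task finished, drop it
```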
Comparison
Compared to the synchronous model, the asynchronous model works best when:
- There are numerous tasks, meaning there is likely always one ready to progress;
- Tasks involve a lot of I/O, where synchronous programs would otherwise spend much of their time waiting;
- Tasks are largely independent, minimizing the need for inter-task synchronization.
These conditions are typical of a busy server (e.g., a web server) in a client-server setup. Each task represents a client request involving I/O, making servers ideal candidates for asynchronous models. Frameworks and runtimes like Twisted and Node.js have become popular for exactly these cases.
Why not just add more threads? If one thread is blocked on I/O, another can proceed, right?
Threads, however, are resources and are neither free nor unlimited. As the thread count grows, performance may degrade. Each new thread introduces memory overhead for managing its state, and context switching becomes more frequent.
Another advantage of asynchronous programming is avoiding context switching. When the OS switches threads, it must store and reload the thread’s state, including registers, memory map, stack pointers, and more. The performance cost of this can be substantial.
Event loop
How can notification of a new event reach the application if its execution thread is busy processing another task?
Implementation varies by library. Some use cooperative multitasking with coroutines, while others separate event receipt and processing into different threads (OS or user-level threads).
The event loop is the central control flow for handling events, analogous to the reactor/proactor patterns we discussed earlier.
An event loop is exactly what it sounds like: there is a queue of events (where everything that has happened is stored, often called a "task queue") and a loop that continually pulls events off that queue and invokes their callbacks (all execution goes through the call stack). The APIs are the entry points for asynchronous operations such as waiting for a response from a client or a database.
In this flow, every function call first lands on the call stack; asynchronous commands are executed through those APIs, and once they complete, their callbacks go to the task queue and then back onto the call stack.
This coordination is managed by the event loop.
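To ground the description, here is a bare-bones event loop sketch in Python. It illustrates the queue-plus-loop idea, not any particular library's implementation; the names call_soon and call_later are borrowed from asyncio's loop for familiarity.

```python
import heapq
import itertools
import time
from collections import deque

class EventLoop:
    def __init__(self):
        self.ready = deque()   # the "task queue": callbacks ready to run now
        self.timers = []       # heap of (deadline, tie-breaker, callback, args)
        self._seq = itertools.count()  # tie-breaker so callbacks are never compared

    def call_soon(self, callback, *args):
        self.ready.append((callback, args))

    def call_later(self, delay, callback, *args):
        deadline = time.monotonic() + delay
        heapq.heappush(self.timers, (deadline, next(self._seq), callback, args))

    def run(self):
        while self.ready or self.timers:
            now = time.monotonic()
            # move expired timers onto the ready queue
            while self.timers and self.timers[0][0] <= now:
                _, _, callback, args = heapq.heappop(self.timers)
                self.ready.append((callback, args))
            if self.ready:
                callback, args = self.ready.popleft()
                callback(*args)  # each callback runs to completion, one at a time
            elif self.timers:
                time.sleep(self.timers[0][0] - now)  # idle until the next deadline

loop = EventLoop()
loop.call_soon(print, "runs first")
loop.call_later(0.5, print, "runs about half a second later")
loop.run()
```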
You’ll notice the similarity to the reactor pattern discussed in the previous post — they’re essentially the same.
When the event loop forms the central control flow of the program, as it often does, it is called the main loop or main event loop. The name is apt, because such an event loop sits at the highest level of control inside the application.
In event-driven programming, the application expresses interest in certain events and responds to them when they occur. It is the responsibility of the event loop to collect events from the operating system or monitor other event sources, and the user can register callbacks that will be called when an event occurs.
The event loop typically runs indefinitely.
For more on JavaScript event loops: What the heck is the event loop anyway? | Philip Roberts | JSConf EU
Conclusion
- Asynchronous operations improve application efficiency and user experience: the application stays responsive while slow I/O completes in the background.
- While OS threads are cheaper than processes, using one thread per task is costly. Reusing threads efficiently is where asynchronous programming shines.
- The asynchronous model is a crucial approach for optimizing and scaling I/O-bound applications (it won’t help with CPU-bound tasks).
- Asynchronous programming can be complex to write and debug but is valuable for performance in the right contexts.