This is the second post in a series about asynchronous programming. The whole series tries to answer one simple question: "What is asynchrony?".
When I first started digging into the question, I thought I knew what it was. It turned out I didn't have a clue what asynchrony is all about. So, let's find out!
- Asynchronous programming. Blocking I/O and non-blocking I/O
- Asynchronous programming. Cooperative multitasking
- Asynchronous programming. Await the Future
- Asynchronous programming. Python3.5+
In the last post, we talked about how we can ensure simultaneous processing of multiple requests and saw that it can be implemented using threads or processes. But there is one more option: cooperative multitasking.
This option is the most difficult. Here we say that the OS is, of course, great: it has schedulers, it can handle processes and threads, organize switches between them, handle locks, and so on. But it doesn't know how the application works, and that is what we, as developers, do know. We know that we have brief moments when some computation runs on the CPU, but most of the time we are waiting for network I/O, so we know better than the OS when to switch between processing individual requests.
From the OS point of view, cooperative multitasking is just one thread of execution, but inside it the application switches between processing individual requests/commands. In terms of the networking example from the previous post: as soon as some data arrives, the application reads it, parses the request, and sends a query to the database, for example. The query is a blocking operation, but instead of waiting for the response from the database, the application can start processing another request. It is called "cooperative" because all tasks/commands must cooperate for the entire scheduling scheme to work. They are interleaved with one another, but within a single thread of control, managed by a cooperative scheduler whose role is reduced to starting the tasks and letting them return control back to it voluntarily.
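The idea of switching between requests at blocking points can be sketched in a few lines of Python, using generators as logical tasks and a round-robin scheduler. All the names here are illustrative, not any real library's API:

```python
from collections import deque

log = []

def handle_request(name, steps):
    # Each yield marks a point where the task would otherwise block
    # on I/O; instead, it hands control back to the scheduler.
    for step in range(steps):
        log.append(f"{name}: step {step}")
        yield  # "waiting on I/O" -- let another task run

def run(tasks):
    # A minimal round-robin cooperative scheduler: one OS thread,
    # many logical tasks interleaved at their yield points.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)  # the task yielded; re-schedule it
        except StopIteration:
            pass  # the task finished

run([handle_request("req-1", 2), handle_request("req-2", 2)])
print(log)
# ['req-1: step 0', 'req-2: step 0', 'req-1: step 1', 'req-2: step 1']
```

Note how the two requests make progress in turns, even though everything runs in a single thread: that interleaving is exactly what the OS cannot arrange for us, because only the application knows where its blocking points are.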
This is simpler than threaded multitasking because the programmer always knows that while one task is executing, no other task is. Although on a single-processor system a threaded application will also execute in an interleaved pattern, a programmer using threads must still guard against the pitfalls of that approach, lest the application work incorrectly when moved to a multi-processor system. A single-threaded asynchronous system, however, will always execute with interleaving, even on a multi-processor system.
The difficulty of writing such programs is that the burden of switching, maintaining the context, and organizing each task as a sequence of smaller steps that execute intermittently falls on the developers. On the other hand, we gain in efficiency: there is no unnecessary switching, and none of the overhead, such as saving and restoring the processor context, that comes with switching between threads and processes.
There are two ways to implement cooperative multitasking — callbacks and green threads.
All blocking operations lead to the fact that the action will complete at some point in the future, and our thread of execution should get the result when it is ready. So, in order to get the result, we register callbacks: when the request/operation succeeds, one callback is called; when it fails, another one is. Callbacks are the explicit option: the developer has to write the program around the fact that they never really know when a callback function will be called.
This is the most widely used option, because it is explicit and it is supported by the majority of modern languages.
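Here is what the callback style looks like in miniature. The `fetch_user` function is hypothetical, and a dict stands in for a real database; the point is the shape of the API: instead of returning a value, the operation invokes one of two registered callbacks once the result (or the failure) is known:

```python
def fetch_user(user_id, on_success, on_error):
    # A hypothetical non-blocking operation: it never "returns" the
    # result to the caller; it delivers it through a callback instead.
    fake_db = {1: "alice"}  # stand-in for a real database
    if user_id in fake_db:
        on_success(fake_db[user_id])
    else:
        on_error(KeyError(user_id))

results = []
fetch_user(1, on_success=results.append,
           on_error=lambda exc: results.append(f"error: {exc!r}"))
fetch_user(2, on_success=results.append,
           on_error=lambda exc: results.append(f"error: {exc!r}"))
print(results)  # ['alice', "error: KeyError(2)"]
```

Even in this toy example you can see the cost: the control flow is inverted, and any follow-up work has to live inside yet another callback, which is where "callback after callback" confusion begins.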
Pros and cons:
- Differ from threaded programs, so they don't share their problems;
- Threads/coroutines are invisible to the programmer;
- Callbacks swallow exceptions;
- Callback after callback gets confusing and hard to debug.
The second option is implicit: developers write the program as if there were no cooperative multitasking at all. We perform a blocking operation, as we did before, and we expect the result right there, as if there were just one process or thread. But there is black magic "under the hood": the framework or programming language makes the blocking operation non-blocking and transfers control to some other thread of execution, not an OS thread, but a logical (user-level) thread. Such threads are scheduled by an ordinary user-level process, not by the kernel. This option is called green threads.
Pros and cons:
- They are controlled at the application level, rather than by the OS;
- They feel like threads;
- They come with all the problems of normal thread-based programming, except for CPU context switching.
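Real green threads need runtime support (in Python, for example, the gevent library provides them), but the flavor can be approximated with stdlib generators: the handler below reads almost like sequential, blocking code, while the "blocking" helper suspends the whole logical thread so a scheduler can run another one. All names here are illustrative:

```python
from collections import deque

log = []

def db_query(sql):
    # Looks like a blocking call from the caller's point of view; under
    # the hood it suspends the whole logical thread at the yield.
    log.append(f"start: {sql}")
    yield  # pretend we are waiting for the database here
    log.append(f"done: {sql}")
    return "3 rows"

def handle(name):
    # The handler reads almost like ordinary sequential code: "query
    # the database, then use the result".
    rows = yield from db_query(f"q-{name}")
    log.append(f"{name} got {rows}")

def run(tasks):
    # The same round-robin scheduler idea: switch tasks whenever one
    # of them suspends at a "blocking" point.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)
        except StopIteration:
            pass

run([handle("a"), handle("b")])
print(log)
```

The two handlers interleave: both queries are "started" before either result is consumed, yet each handler's own code looks strictly sequential. A real green-thread runtime hides even the `yield from`, so the code looks fully synchronous.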
Inside any cooperative multitasking implementation there is always a processing core that is responsible for all I/O. It is called a reactor, after the design pattern of the same name. The reactor interface says: "Give me a bunch of your sockets and your callbacks, and when a socket is ready for I/O, I will call the corresponding callback function."
There is a second interface provided by the reactor, called the timer: "Call me in X milliseconds; here is the callback that needs to be called." These two interfaces show up everywhere cooperative multitasking is used, whether explicit or implicit.
"Under the hood", the reactor is quite simple. It keeps a list of timers sorted by deadline. It takes the list of sockets it was given and passes them to a readiness-polling mechanism (such as select, epoll, or kqueue). That polling mechanism always takes one more argument: how long to block if there is no network activity. As the blocking time, the reactor passes the time until the nearest timer's deadline. So either there will be some network activity and some of the sockets will become ready for I/O, or the nearest timer will fire; either way, we unblock and transfer control to one callback or another, essentially to a logical thread of execution.
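That polling loop can be sketched with Python's stdlib `selectors` module. The class and method names below are my own, not any real framework's API, but the structure is exactly as described: block in the readiness poll for at most the time until the nearest timer:

```python
import heapq
import itertools
import selectors
import socket
import time

class Reactor:
    # A toy reactor: register sockets with callbacks, schedule timers,
    # and let the readiness poll block no longer than the nearest timer.
    def __init__(self):
        self.sel = selectors.DefaultSelector()
        self.timers = []               # heap of (deadline, seq, callback)
        self._seq = itertools.count()  # tie-breaker for equal deadlines

    def register(self, sock, callback):
        # "Give me your socket and your callback": called when readable.
        self.sel.register(sock, selectors.EVENT_READ, callback)

    def call_later(self, delay, callback):
        # "Call me in `delay` seconds": the reactor's timer interface.
        heapq.heappush(self.timers,
                       (time.monotonic() + delay, next(self._seq), callback))

    def run_once(self):
        # Block in the poll for at most the nearest timer's deadline.
        timeout = None
        if self.timers:
            timeout = max(0.0, self.timers[0][0] - time.monotonic())
        for key, _ in self.sel.select(timeout):
            key.data(key.fileobj)      # socket ready: run its callback
        while self.timers and self.timers[0][0] <= time.monotonic():
            _, _, cb = heapq.heappop(self.timers)
            cb()                       # timer expired: run its callback

# Demo: one readable socket and one immediate timer.
events = []
a, b = socket.socketpair()
reactor = Reactor()
reactor.register(b, lambda s: events.append(s.recv(100)))
reactor.call_later(0, lambda: events.append("timer"))
a.send(b"ping")
reactor.run_once()
a.close(); b.close()
print(events)  # [b'ping', 'timer']
```

Production event loops (libevent, libuv, asyncio's loop) are far more elaborate, but they are built around this same poll-with-timeout core.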
But in fact, neither of these options is ideal on its own. A combined version works best, because cooperative multitasking pays off especially when your connections hang around for a long time. For example, a WebSocket is a long-lived connection: if you allocate one process or one thread to handle a single WebSocket, you significantly limit how many connections one backend server can hold at the same time. And since connections live for a long time, it is important to keep many simultaneous connections, even though there will be little work on each one.
The drawback of cooperative multitasking is that such a program can use only one processor core. You can, of course, run multiple instances of the application on the same machine (though this is not always convenient and has its drawbacks), so it makes sense to run several processes and use cooperative multitasking with a reactor inside each process.
This combination makes it possible, on the one hand, to use all the processor cores available in our system, and on the other, to work efficiently inside each core, without allocating large resources to process each individual connection.
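A sketch of that combination, with a `serve` function standing in for "run a reactor in this worker". To keep the sketch runnable anywhere, it uses `multiprocessing.dummy`, which backs the process-pool API with threads; a real prefork server would use `multiprocessing` itself (or `os.fork`) to get true OS processes, one per core:

```python
# multiprocessing.dummy mirrors the multiprocessing.Pool API using
# threads; it is used here only so the sketch is portable and testable.
from multiprocessing.dummy import Pool

def serve(worker_id):
    # Placeholder for "run a reactor here": each worker would poll its
    # own sockets and timers, handling many connections cooperatively.
    return f"worker {worker_id}: event loop would run here"

with Pool(processes=4) as pool:
    # One worker per core in a real setup; cooperative multitasking
    # handles the many connections inside each worker.
    results = pool.map(serve, range(4))
print(results)
```

In practice this is exactly the architecture of servers like nginx or gunicorn with async workers: a handful of processes, each running an event loop over thousands of connections.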
To repeat: the difficulty of writing applications that use cooperative multitasking is that the burden of switching and of maintaining each task's context lands on the shoulders of the developers. On the other hand, this approach gains in efficiency: there is no unnecessary switching, and none of the overhead of switching between threads and processes.
In the next post, we will talk about asynchronous programming itself and how it differs from synchronous programming: the same old concepts, but on a new level and with new terms.