asyncio and threads: friends or enemies?

asyncio was one of the most exciting additions in Python's history: it changed how developers approach concurrency and gave them a standard, more conventional asynchronous programming model to build libraries and frameworks on. It took what libraries like Twisted and Tornado were already doing well to a whole new level, and it made concurrency available with a very simple syntax.

But we already had multithreading in Python, which also allows concurrency, at least while the GIL is released, as it is during I/O handled by the stdlib (sockets or the filesystem, for example).

Both asyncio and multithreading allow concurrency, so they can be used together just fine, right?

Whoa, not so fast! Yes, they can, but there's a catch: you can easily end up blocking the asyncio event loop with threads. Using threads on their own already demands that you know what you're doing (you should avoid them unless you really need them); using them together with asyncio requires even more attention and knowledge. This article helps you with the knowledge part.

Crash course on asyncio

Here's a very rough crash course on how asyncio works (when used correctly): you have a main event loop, which is the "backbone" of the asyncio model. It's the central piece that switches between coroutines, which are like "special functions" that can yield control back to the event loop so that it can give this control to another coroutine. Whenever a coroutine reaches an await on something that can't complete right away (a network read or asyncio.sleep(), for example), it is suspended, and the event loop resumes another coroutine that's ready to run. This way, multiple operations can be done concurrently, like HTTP requests for example.

However, if you trigger an operation that doesn't yield this control back to the event loop, then this coroutine will block the whole event loop, and no other coroutine will be run until this operation finishes. This is the case when there's data processing happening on the Python level, for example. We'll see more about this later.
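
To make this concrete, here's a minimal sketch (the names crunch_numbers and ticker are made up for illustration). One coroutine spins in a pure-Python loop without ever awaiting, while another tries to wake up every 0.05 seconds and records how long it actually had to wait:

```python
import asyncio
import time


async def crunch_numbers(seconds: float) -> int:
    """Pure-Python busy loop: it never awaits, so it never yields to the loop."""
    deadline = time.perf_counter() + seconds
    count = 0
    while time.perf_counter() < deadline:
        count += 1
    return count


async def ticker(ticks: int, interval: float) -> list[float]:
    """Record the gap between wake-ups; ideally each gap is close to `interval`."""
    gaps = []
    last = time.perf_counter()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now
    return gaps


async def main() -> float:
    gaps, _ = await asyncio.gather(ticker(5, 0.05), crunch_numbers(0.3))
    return max(gaps)


if __name__ == '__main__':
    print(f'Longest gap between ticks: {asyncio.run(main()):.2f}s')
```

On a typical run the longest gap should come out at around 0.3 seconds instead of 0.05, because the ticker can't be resumed until the busy loop is done.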

asyncio and threads

There's no fundamental conflict between asyncio coroutines and traditional threads. They can be mixed together just fine.

The problem, however, is when the interpreter has to wait for threads to finish. When this is done in the traditional way (with thread.join(), for instance), control is not yielded back to the event loop, so waiting for the thread blocks every other coroutine.

Let's see how this happens, and some other code examples, next.

Examples

Here are some examples showing how these things work when put together.

All examples here have been run with Python 3.11 on my computer, but they should work on Python 3.7 and later too, since asyncio.run() and asyncio.get_running_loop() were both added in 3.7.

Basic asyncio

Let's say you have a file named async_only.py, with the following content:

import asyncio

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_async_for(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

If you run this on your terminal, here's what you'll get as the output:

$ python async_only.py 
Coroutine 1 waiting for 0.1 seconds...
Coroutine 2 waiting for 0.3 seconds...
Coroutine 1 waited for 0.1 seconds.
Coroutine 2 waited for 0.3 seconds.

As you probably noticed, these coroutines are sleeping concurrently (their waits overlap). No blocking operation is involved.
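
If you want to double-check this, a quick sketch like the following times the whole gather() call (timed_gather is a hypothetical helper, and the prints are dropped for brevity):

```python
import asyncio
import time


async def wait_async_for(id: int, seconds: float) -> None:
    await asyncio.sleep(seconds)


async def timed_gather() -> float:
    """Return how long it takes to run both coroutines together."""
    start = time.perf_counter()
    await asyncio.gather(wait_async_for(1, 0.1), wait_async_for(2, 0.3))
    return time.perf_counter() - start


if __name__ == '__main__':
    elapsed = asyncio.run(timed_gather())
    print(f'Both coroutines finished in {elapsed:.2f}s')
```

The total should land near 0.3 seconds, the longer of the two sleeps, rather than the 0.4 seconds you'd get by running them one after the other.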

asyncio and blocking function

Now here's some code that mixes a simple blocking function with a coroutine. Let's call it async_plus_blocking.py:

import asyncio
import time

async def wait_sync_for(id: int, seconds: float) -> None:
    print(f'Function {id} waiting for {seconds} seconds...')
    time.sleep(seconds)
    print(f'Function {id} waited for {seconds} seconds.')

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_sync_for(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

Notice that, even though wait_sync_for is defined as a coroutine, it calls time.sleep(). That call releases the GIL (which is helpful with multi-threading), but it doesn't yield control back to the event loop. Let's see the output, then:

$ python async_plus_blocking.py 
Function 1 waiting for 0.1 seconds...
Function 1 waited for 0.1 seconds.
Coroutine 2 waiting for 0.3 seconds...
Coroutine 2 waited for 0.3 seconds.

Notice how it ran the first function until completion, and only then was it able to run the coroutine, so the code didn't run concurrently at all. In other words, the first function blocked the event loop.
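
By the way, asyncio can help you spot this kind of problem. As a rough sketch: in debug mode, the event loop logs a warning on the asyncio logger whenever a single step takes longer than loop.slow_callback_duration (0.1 seconds by default). The ListHandler class and the run_with_debug function below are just illustrative names, used here to capture those warnings instead of letting them go to stderr:

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages so they can be inspected afterwards."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


async def blocking_coroutine() -> None:
    # Blocks the event loop for longer than the 0.1 s default threshold.
    time.sleep(0.2)


def run_with_debug() -> list[str]:
    """Run the coroutine in asyncio debug mode and return the logged messages."""
    handler = ListHandler()
    logger = logging.getLogger('asyncio')
    logger.addHandler(handler)
    try:
        asyncio.run(blocking_coroutine(), debug=True)
    finally:
        logger.removeHandler(handler)
    return handler.messages


if __name__ == '__main__':
    for message in run_with_debug():
        print(message)
```

Among the captured messages there should be one saying that executing the task took roughly 0.2 seconds, which points you at the coroutine that hogged the loop.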

OK, this is probably something that everybody knows already, but how do asyncio and threads work together? Let's see this next.

asyncio and started thread

What happens if you start a thread in a coroutine, and try to run another coroutine in parallel? Let's see with async_plus_non_blocking_thread.py:

import asyncio
import time
from threading import Thread

def wait_sync_for(id: int, seconds: float) -> None:
    print(f'Function {id} waiting for {seconds} seconds...')
    time.sleep(seconds)
    print(f'Function {id} waited for {seconds} seconds.')

async def wait_within_non_blocking_thread(id: int, seconds: float) -> None:
    Thread(target=wait_sync_for, args=(id, seconds)).start()
    print('Started thread.')

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_within_non_blocking_thread(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

In wait_within_non_blocking_thread, we only start the thread, but don't wait for it. And here's the output:

$ python async_plus_non_blocking_thread.py
Function 1 waiting for 0.1 seconds...
Started thread.
Coroutine 2 waiting for 0.3 seconds...
Function 1 waited for 0.1 seconds.
Coroutine 2 waited for 0.3 seconds.

Notice two things here:

1. "Started thread" is printed out before "Function 1 waited" is printed.
2. The threaded function and the coroutine ran in parallel just fine.

This is because starting a thread, per se, doesn't block the interpreter, so it's almost as if there was nothing happening there from the main thread's perspective, so other operations can move on.

The problem with this approach, though, is that nothing keeps track of the running thread: the coroutine returns, and the event loop might even finish, before the thread does, and nobody ever checks how the thread ended, which could cause problems. But it would also be problematic to keep the function blocking the code while waiting for a thread, which is the next example.

asyncio and joined thread

Now, this is probably the most important example, as it's a common problem in code using asyncio. Let's call it async_plus_blocking_thread.py:

import asyncio
import time
from threading import Thread

def wait_sync_for(id: int, seconds: float) -> None:
    print(f'Function {id} waiting for {seconds} seconds...')
    time.sleep(seconds)
    print(f'Function {id} waited for {seconds} seconds.')

async def wait_within_blocking_thread(id: int, seconds: float) -> None:
    thread = Thread(target=wait_sync_for, args=(id, seconds))
    thread.start()
    thread.join()
    print('Joined thread.')

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_within_blocking_thread(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

Notice how, this time, besides starting the thread, we also join it right afterward. And here's the output:

$ python async_plus_blocking_thread.py
Function 1 waiting for 0.1 seconds...
Function 1 waited for 0.1 seconds.
Joined thread.
Coroutine 2 waiting for 0.3 seconds...
Coroutine 2 waited for 0.3 seconds.

Sure enough, it killed concurrency when the join() happened. Just to recap: this happens because thread.join() is an ordinary blocking call that never yields control back to the event loop, so the whole loop just waits there until the thread finishes.

There's a fix for this, though, which we'll see next.

asyncio with thread running in event loop default executor

Instead of joining threads directly, you can use the event loop's default thread executor. Let's call this async_plus_thread_in_default_executor.py:

import asyncio
import time
from threading import Thread

def wait_sync_for(id: int, seconds: float) -> None:
    print(f'Function {id} waiting for {seconds} seconds...')
    time.sleep(seconds)
    print(f'Function {id} waited for {seconds} seconds.')

async def wait_within_blocking_thread(id: int, seconds: float) -> None:
    thread = Thread(target=wait_sync_for, args=(id, seconds))
    thread.start()
    await asyncio.get_running_loop().run_in_executor(None, thread.join)
    print('Awaited for thread in default executor.')

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_within_blocking_thread(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

The main difference, here, is that instead of:

thread.join()

we do:

await asyncio.get_running_loop().run_in_executor(None, thread.join)

which asks the event loop to run thread.join in one of the default executor's worker threads and, thanks to the await, yields control back to the event loop in the meantime. What's the output of this then? Here it is:

$ python async_plus_thread_in_default_executor.py
Function 1 waiting for 0.1 seconds...
Coroutine 2 waiting for 0.3 seconds...
Function 1 waited for 0.1 seconds.
Awaited for thread in default executor.
Coroutine 2 waited for 0.3 seconds.

Thus, it makes the code non-blocking for asyncio! Neat, isn't it? And you can do pretty much the same thing with any blocking function that releases the GIL at some point (I/O or time.sleep(), for instance): while the GIL is released in the worker thread, the event loop is free to run other coroutines. You don't need to start threads yourself for this; just pass the function directly to the run_in_executor method, and you're good to go. The point, here, is that threads can cooperate with asyncio if the right tool is used, which, in this case, is loop.run_in_executor().
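
Here's roughly what passing the function directly could look like. This is a sketch that reuses wait_sync_for, changed to return its message instead of printing it (that change is mine, just to make the result easy to check). It also shows asyncio.to_thread(), available since Python 3.9, which runs the call in the default executor for you:

```python
import asyncio
import time


def wait_sync_for(id: int, seconds: float) -> str:
    time.sleep(seconds)
    return f'Function {id} waited for {seconds} seconds.'


async def wait_async_for(id: int, seconds: float) -> str:
    await asyncio.sleep(seconds)
    return f'Coroutine {id} waited for {seconds} seconds.'


async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    return await asyncio.gather(
        # Positional arguments for the function go after the callable.
        loop.run_in_executor(None, wait_sync_for, 1, 0.1),
        # Python 3.9+ shortcut that also runs the call in a worker thread.
        asyncio.to_thread(wait_sync_for, 2, 0.1),
        wait_async_for(3, 0.3),
    )


if __name__ == '__main__':
    for line in asyncio.run(main()):
        print(line)
```

Note that run_in_executor() only takes positional arguments for the function; if you need keyword arguments, wrapping the call with functools.partial is the usual workaround.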

asyncio and new executor

You don't need to unblock your stuff in the event loop's default executor, though. You can spin up your own executor instead, which keeps your blocking calls off the default executor's threads and so avoids possible contention with other code that uses them. Let's call this one async_plus_thread_in_new_executor.py:

import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Thread

def wait_sync_for(id: int, seconds: float) -> None:
    print(f'Function {id} waiting for {seconds} seconds...')
    time.sleep(seconds)
    print(f'Function {id} waited for {seconds} seconds.')

async def wait_within_blocking_thread(id: int, seconds: float) -> None:
    thread = Thread(target=wait_sync_for, args=(id, seconds))
    thread.start()
    with ThreadPoolExecutor() as executor:
        await asyncio.get_running_loop().run_in_executor(
            executor, thread.join)
    print('Awaited for thread in new thread pool executor.')

async def wait_async_for(id: int, seconds: float) -> None:
    print(f'Coroutine {id} waiting for {seconds} seconds...')
    await asyncio.sleep(seconds)
    print(f'Coroutine {id} waited for {seconds} seconds.')

async def main() -> None:
    await asyncio.gather(*[
        wait_within_blocking_thread(1, 0.1),
        wait_async_for(2, 0.3),
    ])

if __name__ == '__main__':
    asyncio.run(main())

This runs as well as the previous example:

$ python async_plus_thread_in_new_executor.py
Function 1 waiting for 0.1 seconds...
Coroutine 2 waiting for 0.3 seconds...
Function 1 waited for 0.1 seconds.
Awaited for thread in new thread pool executor.
Coroutine 2 waited for 0.3 seconds.

with the difference that it runs in a new executor, not the default one.
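
If you have several blocking calls, you probably don't want a brand new pool for each of them. As a sketch of one alternative (the max_workers=2 value and the 'blocking' thread name prefix are arbitrary choices of mine), you could create a single pool and hand the blocking functions to it directly:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def wait_sync_for(id: int, seconds: float) -> int:
    time.sleep(seconds)
    return id


async def main() -> list[int]:
    loop = asyncio.get_running_loop()
    # One pool shared by all blocking calls in this coroutine.
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix='blocking') as pool:
        return await asyncio.gather(
            loop.run_in_executor(pool, wait_sync_for, 1, 0.1),
            loop.run_in_executor(pool, wait_sync_for, 2, 0.1),
        )


if __name__ == '__main__':
    print(asyncio.run(main()))
```

Keep in mind that leaving the with block waits for the pool to shut down, which is itself a blocking operation. Here that's harmless, because every submitted call has already been awaited by the time the block exits.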

Conclusion

If used incorrectly, threads can render your asyncio usage pretty much useless, making it as efficient as any normal synchronous code.

If used correctly, however, they can be a powerful tool, especially if you need to call blocking functions that are part of libraries that your code depends on.
