ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.1 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://emptysqua.re/blog/why-should-async-get-all-the-love/ |
| Last Crawled | 2026-04-05 19:57:21 (4 days ago) |
| First Indexed | 2022-05-16 22:12:58 (3 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Why Should Async Get All The Love?: Advanced Control Flow With Threads |
| Meta Description | Writeup of my PyCon 2022 talk. How to write safe, elegant concurrent Python with threads. |
| Meta Canonical | null |
**Boilerpipe Text:**

I spoke at PyCon 2022 about writing safe, elegant concurrent Python with threads. Here's the video. Sorry about the choppy audio, [the A/V at PyCon this year was a shitshow](https://pycon.blogspot.com/2022/05/pycon-us-2022-recordings-update.html). Below is a written version of the talk.

asyncio.

Asyncio is really hip. And not just asyncio: the older async frameworks like Twisted and Tornado, and more recent ones like Trio and Curio, are hip too. I think they deserve to be! I'm a big fan. I spent a lot of time contributing to Tornado and asyncio some years ago. My very first PyCon talk, in 2014, was called "[What Is Async, How Does It Work, And When Should I Use It?](https://www.youtube.com/watch?v=9WV7juNmyE8)" I was an early async booster.

Asyncio introduced a lot of Pythonistas to advanced control flows with Tasks, Futures, chaining, `asyncio.gather`, and so on. All this was really exciting! But there's something a lot of Python programmers didn't notice at the time: all of this was already possible with threads, too.
Threads.

Compared to asyncio, threads seem hopelessly outdated. The cool kids will laugh at you if you use threads.

# Concurrency and parallelism

Threads and asyncio are two ways to achieve **concurrency**.

Let's avoid any confusion at the start: concurrency is not parallelism. Parallelism is when your program executes code on multiple CPUs at once. Python mostly can't do parallelism, due to the Global Interpreter Lock. You can understand the GIL with a phrase short enough to fit on your hand: one thread runs Python, while N others sleep or await I/O. [Learn more about the GIL from my PyCon talk a few years ago](https://emptysqua.re/blog/series/grok-the-gil/).
So threads and asyncio have the same limitation: Neither threads nor asyncio Tasks can use multiple CPUs.
(An aside about multiprocessing, just so you know I know what you're thinking: if you really need parallelism, use multiprocessing. That's the only way to run Python code on multiple CPUs at once with standard Python. But coordinating and exchanging data among Python processes is much harder than with threads, so only do this if you really have to.)
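To make the aside concrete, here is a minimal sketch of spreading CPU-bound work across processes with the standard library's `multiprocessing.Pool`. The prime-counting function and the pool size are my own illustration, not from the talk:

```python
from multiprocessing import Pool

def count_primes(n):
    # Deliberately CPU-bound work: count primes below n by trial division.
    count = 0
    for candidate in range(2, n):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Each argument is handled in a separate process, so this can use
    # multiple CPUs at once, unlike threads or asyncio Tasks.
    with Pool(processes=4) as pool:
        print(pool.map(count_primes, [10_000, 20_000, 30_000, 40_000]))
```

Note that the arguments and results here travel between processes by pickling, which is exactly the coordination overhead the aside warns about.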
But in this article I'm not talking about parallelism, I'm talking about concurrency. Concurrency is dealing with events in *partial order*: your program is waiting for several things to happen, and they could occur in one of several sequences. By far the most important example is waiting for data on many network connections at once. Some network clients and most network servers have to support concurrency, sometimes very high concurrency. We can use threads or an async framework, such as asyncio, as our method of supporting concurrency.
# Threads vs. asyncio

## Memory

Which one should you use, threads or asyncio? Let's start with asyncio's main advantage: very high-concurrency programs are more memory-efficient with asyncio.

Here's a chart of [two simple programs](https://github.com/ajdavis/python-paxos-jepsen) spawning lots of threads (blue) and asyncio Tasks (red). Just importing asyncio means the red program starts with a higher memory footprint, but that doesn't matter. What matters is that, as concurrency increases, the red asyncio program's memory grows more slowly.

A Python thread costs about 10k of memory. That's not much memory! More than a few hundred threads is impractical in Python, and the operating system imposes limits that prevent huge numbers of threads. But if you have low hundreds, you don't need asyncio: threads work great. If you remember [the problems David Beazley pointed out in Python 2](https://archive.org/details/pyvideo_588___mindblowing-python-gil), they were solved in Python 3.
## Speed

Is asyncio faster than threads? No. As [Cal Paterson wrote](https://calpaterson.com/async-python-is-not-faster.html):

> Sadly async is not go-faster-stripes for the Python interpreter.
>
> Under realistic conditions asynchronous web frameworks are slightly worse throughput and much worse latency variance.

Standard-library asyncio is definitely slower than most multi-threaded frameworks, because asyncio executes a lot of Python for each event. Generally, frameworks are faster the more they're implemented in C or another compiled language. Even with the fastest async frameworks, like [those based on uvloop](https://github.com/MagicStack/uvloop), tail latency seems to be worse than with multi-threading.

I'm not going to say all async frameworks are definitely slower than threads. What I can say confidently is that asyncio isn't faster, and it's more efficient only for huge numbers of mostly idle connections. And only for that.
## Compatibility

What about compatibility? Here are the most popular Python web frameworks ([source](https://www.jetbrains.com/lp/devecosystem-2021/python/)).

The sum is more than 100% because respondents could choose multiple frameworks. Flask, Django, and most of the others are multi-threaded frameworks. Only three (FastAPI, Falcon, and Tornado) are asyncio-compatible. (We don't know about the "other" category, but it's only 4%.)

So your web application is probably multi-threaded now, not async. If you want to use asyncio, that means rewriting a large portion of your app, whereas multi-threaded code is compatible with most of the apps, libraries, and frameworks already written.
## Trickiness

How tricky is it to write correct concurrent code with threads or asyncio?

Let's make a function called `do_something` which adds one to a global counter, and run it on two threads at once.
```
import threading

counter = 0

def do_something():
    global counter
    print("doing something....")
    counter += 1  # Not atomic!

t0 = threading.Thread(target=do_something)
t1 = threading.Thread(target=do_something)
t0.start()
t1.start()
t0.join()
t1.join()
print(f"Counter: {counter}")
```
Will `counter` always eventually equal 2? No! [Plus-equals isn't atomic](https://emptysqua.re/blog/python-increment-is-weird/). It first loads the value from the global, then adds 1, then stores the value back to the global. If the two threads interleave during this process, one of their updates could be lost, and we end up with `counter` equal to 1, not 2.
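You can see the load/add/store steps directly in the bytecode; here is a quick check with the standard library's `dis` module (my own illustration, not from the talk):

```python
import dis

counter = 0

def do_something():
    global counter
    counter += 1

# Shows LOAD_GLOBAL (counter), an in-place add (BINARY_OP or
# INPLACE_ADD depending on Python version), then STORE_GLOBAL
# (counter): a thread can be preempted between any of these steps.
dis.dis(do_something)
```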
We need to protect the plus-equals with a lock:
```
import threading

counter = 0
lock = threading.Lock()

def do_something():
    global counter
    print("doing something....")
    with lock:
        counter += 1
```
This is tricky! [In a 2014 blog post](https://glyph.twistedmatrix.com/2014/02/unyielding.html), Glyph Lefkowitz, the author of Twisted, talks about this trickiness. It's still my favorite argument on the topic.

> As we know, threads are a bad idea (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.

Glyph says the main reason to write async code isn't that it's faster. It's not even memory efficiency. It's that it's less prone to concurrency bugs and requires less tricky programming. (But threads don't have to be that bad, as you'll see below.)

Let's rewrite our counter-incrementing example with asyncio.
```
import asyncio

counter = 0

async def do_something():
    global counter
    print("doing something....")
    await call_some_coroutine()
    counter += 1  # Atomic! No "await" in +=.

async def main():
    t0 = asyncio.Task(do_something())
    t1 = asyncio.Task(do_something())
```
Now `do_something` is a coroutine. It calls another coroutine for the sake of illustration, and then increments the counter. We run it on two Tasks at once. Just by looking at the code we know where interleaving is possible: if it has an `await` expression, a coroutine can interleave there. Otherwise it's atomic. That's "local reasoning". Plus-equals has no `await` expression, so it's atomic. We don't need a lock here.

Therefore asyncio is better than multi-threading, because it's less tricky, right? We shall see….

In summary:

| Threads | asyncio |
|---|---|
| **Speed**: Threads are at least as fast. | **Memory**: asyncio efficiently waits on huge numbers of mostly-idle network connections. |
| **Compatibility**: Threads work with Flask, Django, etc., without rewriting your app for asyncio. | **Trickiness**: asyncio is less error-prone than threads. |

Must multi-threaded code be so tricky?
# It's Time To Take Another Look At Threads

All along, it's been possible to write elegant, correct code with threads. To begin, let's look at how to use threads with Futures. Threads had Futures first, before asyncio. Futures let us express control flows you'd struggle to write with mutexes and condition variables.

(Confusingly, [asyncio introduced a new Future class](https://docs.python.org/3/library/asyncio-future.html) that's different from the one we use with threads. I've never had to use both in the same program, so it's fine.)

Future.

Let's rewrite our previous counter-incrementing example with Futures.
```
import threading
from concurrent.futures import Future

future0 = Future()
future1 = Future()

def do_something(future):
    print("doing something....")
    future.set_result(1)  # How much to increment the counter.

t0 = threading.Thread(target=do_something, args=(future0,))
t1 = threading.Thread(target=do_something, args=(future1,))
t0.start()
t1.start()
# Blocks until another thread calls Future.set_result().
counter = future0.result() + future1.result()
print(f"Counter: {counter}")
```
The `concurrent.futures` module is where all the cool threads stuff lives. It was introduced back in Python 3.2.

Now `do_something` takes a Future and sets its result to 1. This is called "resolving the Future".

We run the function on two threads and pass in the two Futures as arguments. Then we wait for the threads to call `set_result`, and sum the two results. Calling `Future.result()` blocks until the future is resolved. Note that we no longer need to call `Thread.join()`.

This code isn't much of an improvement. I'm just showing how Futures work. In reality you'd write something more like this:
```
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

# Takes a dummy argument.
def do_something(_):
    print("doing something....")
    return 1

# Like builtin "map" but concurrent.
counter = sum(executor.map(do_something, range(2)))
print(f"Counter: {counter}")
```
We create a `ThreadPoolExecutor`, which runs code on threads and reuses threads efficiently. `executor.map` is like the builtin `map` function, but it calls the function concurrently on all the arguments at once. In this case `do_something` doesn't need an argument, so we use a dummy argument list, `range(2)`.

There are no more explicit Futures or threads here; they're hidden inside the implementation of `map`. I think this looks really nice, and not error-prone at all.
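Roughly speaking, `executor.map` is shorthand for submitting one task per argument and collecting the results in order. A sketch of the explicit equivalent, just to show what is hidden:

```python
from concurrent.futures import ThreadPoolExecutor

def do_something(_):
    return 1

with ThreadPoolExecutor() as executor:
    # One Future per argument, like map, but spelled out.
    futures = [executor.submit(do_something, arg) for arg in range(2)]
    counter = sum(future.result() for future in futures)

print(f"Counter: {counter}")
```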
# Workflows

What about more complex workflows?

The morning before I gave this talk in Salt Lake City, I made French press coffee in my Airbnb. I brought a hand grinder, so grinding the coffee took some time. Then I heated water, combined the water and grounds, waited for the coffee to brew, and drank it.

Obviously that's not efficient. I should start the water heating and grind the coffee concurrently.

How can we code this with threads?
```
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

def heat_water():
    ...

def grind_coffee():
    ...

def brew(future1, future2):
    future1.result()
    future2.result()
    time.sleep(4 * 60)  # Brew for 4 minutes.

heated_future = executor.submit(heat_water)
ground_future = executor.submit(grind_coffee)
brew(heated_future, ground_future)
print("Drinking coffee")
```
The `brew` function takes two Futures and waits until both have completed, then waits for the coffee to brew. We use the `ThreadPoolExecutor` to start heating and grinding concurrently. We call `brew`, and when it's done, we can drink.

So far so good. Let's add more steps to this workflow and see how this technique handles the added complexity.

There's a quick step right after heating the water: I pour it into the French press. And after I grind the coffee, I add the grounds to the press. These events can happen in either order, but I always want to do each follow-up step as soon as the step it depends on is completed.
```
import time
from concurrent.futures import as_completed

def heat_water():
    return "heated water"

def grind_coffee():
    return "ground coffee"

def brew(future1, future2):
    for future in as_completed([future1, future2]):
        print(f"Adding {future.result()} to French press")
    time.sleep(4 * 60)  # Brew for 4 minutes.
```
Now the `heat_water` and `grind_coffee` functions have return values; they produce something. The new `brew` function uses `as_completed`, which is also in the `concurrent.futures` module. If the water is heated first, then we add it to the press, or if the coffee is ground first, we add the grounds first. Once both steps are done, then we wait 4 minutes. The rest of the code is like before.

Imagine if you had to use old-fashioned thread code, with locks and condition variables to signal when each step was done. It would be a nightmare. But with `concurrent.futures` the code is just as clean and easy as with asyncio.

# Futures and Typing

These code examples aren't really modern Python yet, because they don't have any types.
```
from concurrent.futures import Future, as_completed

def heat_water() -> str:
    return "heated water"

def grind_coffee() -> str:
    return "ground coffee"

def brew(future1: Future[str], future2: Future[str]):
    for future in as_completed([future1, future2]):
        print(f"Adding {future.result()} to French press")
        # ^ type system knows result() returns a string.
```

To use types with Futures, just subscript the Future type with whatever the Future resolves to, in this case a string. Then the type system knows that `result()` returns a string.
# Workflows, Part 2

What if the "coffee" workflow is one component of a much larger workflow, encompassing a whole afternoon?

First I make and drink coffee, then I have the motivation to do chores, which is a separate complex workflow. Of course I'm listening to a podcast the whole time.
```
with ThreadPoolExecutor() as main_exec:
    main_exec.submit(listen_to_podcast)
    with ThreadPoolExecutor() as coffee_exec:
        heated_future = coffee_exec.submit(heat_water)
        ground_future = coffee_exec.submit(grind_coffee)
        brew(heated_future, ground_future)
        print("Drinking coffee")
    # Join and shut down coffee_exec.
    with ThreadPoolExecutor() as chores_exec:
        ...
    # Join and shut down chores_exec.
# Join and shut down main_exec.
```
A nice way to structure nested workflows is with a `with` statement. I start a block like `with ThreadPoolExecutor()` and run a function on that executor. I can start an inner executor using another `with` statement. When we leave a block, either normally or with an exception, we automatically join and shut down the executor, so all threads started within the block must finish.

This style is called "structured concurrency". It's been popularized in several languages and frameworks; Nathaniel Smith's Trio framework introduced it to a lot of Pythonistas, and it will be included in asyncio as "task groups" in Python 3.11.
Unfortunately we can't do full structured concurrency with Python threads. Ideally, if one thread dies with an exception, other threads started in the same block would be quickly cancelled, and all exceptions thrown in the block would be grouped together and bubble up. But exceptions in `ThreadPoolExecutor` blocks don't work well today, and cancellation with Python threads is Stone-Aged.
# Cancellation

Threads are not nearly as good at cancellation as asyncio, Trio, or other async frameworks. Here's a handwritten solution; you'll need something like this in your program if you want cancellation.
```
import time
from concurrent.futures import ThreadPoolExecutor

class ThreadCancelledError(BaseException):
    pass

class CancellationToken:
    is_cancelled = False

    def check_cancelled(self):
        if self.is_cancelled:
            raise ThreadCancelledError()

def do_something(token):
    while True:
        # Don't forget to call check_cancelled!
        token.check_cancelled()

token = CancellationToken()
executor = ThreadPoolExecutor()
future = executor.submit(do_something, token)
time.sleep(1)
token.is_cancelled = True
try:
    future.result()  # Wait for do_something to notice that it's cancelled.
except ThreadCancelledError:
    print("Thread cancelled")
```
The custom `ThreadCancelledError` inherits from `BaseException` rather than `Exception`, so that it bypasses most `except` blocks. Now in `do_something` we must add frequent calls to `check_cancelled`.

Python doesn't control the thread scheduler the way it controls the asyncio event loop, so it's not possible for thread cancellation to be as good. But it could be improved. See Nathaniel Smith's 2018 article for superior ideas. I'm curious if anyone has a PEP for improving thread cancellation.
# A Real Life Example

Let's get back to the good news about threads.

Here's a real life example I coded a few months ago. I implemented Paxos in Python. Paxos is a way for a group of servers to cooperate for fault tolerance. Here's a group of three servers which all communicate with each other.

How does each server know all its peers' names? Let's give them all a config file.
```
{
    "servers": [
        "host0.example.com",
        "host1.example.com",
        "host2.example.com"
    ]
}
```
But how does any server know which one it is? This is surprisingly hard. In a data center or cloud deployment, each server usually has several IPs and several DNS names, such as its internal and external names. Calling `gethostname()` usually doesn't give you the information you need. There's no easy way to know whether a DNS query for `host0`, for example, resolves to this server or another server.

The solution is sort of amazing. First, each server generates a random unique id for itself when it starts up. Next, each server sends a request to all the servers in the list, which includes itself, but it doesn't know which one is self. Here I show `host0` sending out three requests; the others do the same.

`host0` gets replies from `host1` and `host2` with different ids, and it gets a reply from `host0` with its *own* id! So it knows that it is `host0`.

This is actually how MongoDB and lots of other distributed systems solve this problem.

Servers can't process any requests until they find themselves, and they can start up in any order, so this creates a complex control flow. Sounds like a job for Futures!
Here's the server code. We'll start by generating a unique id for this server. I want to use Flask for the server, of course: Flask is the most popular web framework. The server makes a Future which it will resolve when it finds itself.
```
import json
import time
import uuid

import flask
import requests
from concurrent.futures import Future, ThreadPoolExecutor

server_id = uuid.uuid4().hex
app = flask.Flask('PyPaxos')
self_future = Future()

@app.route('/server_id', methods=['GET'])
def get_server_id():
    return flask.jsonify(server_id)

@app.route('/client-request', methods=['POST'])
def client_request():
    # Can't handle requests until I find myself, block here.
    self_future.result()
    ...

config = json.load(open("paxos-config.json"))
executor = ThreadPoolExecutor()
# Run Flask server on a thread, so main thread can search for self.
app_done = executor.submit(app.run)
start = time.monotonic()
while time.monotonic() - start < 60 and not self_future.done():
    for server_name in config["servers"]:
        try:
            # Use Requests to query /server_id handler, above.
            reply = requests.get(f"http://{server_name}/server_id")
            if reply.json() == server_id:
                # Found self. Unblock threads in client_request()
                # above, and exit loop.
                self_future.set_result(server_name)
                break
        except requests.RequestException as exc:
            # See explanation below.
            pass
    time.sleep(1)
app_done.result()  # Let app run in background.
```
The server can't process any client requests until it's found itself, so `client_request` waits for `self_future` to be resolved by calling `self_future.result()`. Once the future has been resolved, calling `result()` always returns immediately.
The search loop tries repeatedly, for a minute, to find self by querying for each server's id. It might catch an exception when querying, either because it's trying to reach another server that hasn't started yet, or because it's trying to reach *itself* but Flask hasn't initialized on its background thread.

After the search loop completes, we wait on `app_done.result()`: the main thread sleeps until the server thread exits, maybe because of a Control-C or some other signal.

Clean and clear, right? If I had written this with asyncio I couldn't use Flask, the most popular web framework, and I couldn't use Requests, the most popular client library. (Requests doesn't support asyncio.) I would've had to rewrite everything for asyncio. But with threads, I can implement this advanced control flow in a straightforward and legible manner, and I can still use Flask and Requests.
# Cool Threads

Threads are cool. Don't let the asyncio kids make you feel like a nerd. Threads are a better choice than asyncio for most concurrent programs. They're at least as fast as asyncio, they're compatible with the popular frameworks, and with the techniques we looked at, using Futures and ThreadPoolExecutors, multi-threaded code can be safe and elegant.
| Markdown | # [A. Jesse Jiryu Davis](https://emptysqua.re/blog/)
- [All Articles](https://emptysqua.re/blog/all-posts/)
- [Feed](https://emptysqua.re/blog/index.xml)
- [About](https://emptysqua.re/blog/about/)
- [Photography](https://portfolio.emptysqua.re/rock-climbing)
May 16, 2022
ā [A. Jesse Jiryu Davis](https://twitter.com/jessejiryudavis)
# [Why Should Async Get All The Love?: Advanced Control Flow With Threads](https://emptysqua.re/blog/why-should-async-get-all-the-love/)
I spoke at PyCon 2022 about writing safe, elegant concurrent Python with threads. Hereās the video. Sorry about the choppy audio, [the A/V at PyCon this year was a shitshow](https://pycon.blogspot.com/2022/05/pycon-us-2022-recordings-update.html). Below is a written version of the talk.
***

asyncio.
Asyncio is really hip. And not just asyncioāthe older async frameworks like Twisted and Tornado, and more recent ones like Trio and Curio are hip, too. I think they deserve to be! Iām a big fan. I spent a lot of time contributing to Tornado and asyncio some years ago. My very first PyCon talk, in 2014, was called ā[What Is Async, How Does It Work, And When Should I Use It?](https://www.youtube.com/watch?v=9WV7juNmyE8)ā I was an early async booster.
Asyncio introduced a lot of Pythonistas to advanced control flows with Tasks, Futures, chaining, `asyncio.gather`, and so on. All this was really exciting! But thereās something a lot of Python programmers didnāt notice at the time: All this was already possible with threads, too.

Threads.
Compared to asyncio, threads seem hopelessly outdated. The cool kids will laugh at you if you use threads.
# Concurrency and parallelism [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#concurrency-and-parallelism)
Threads and asyncio are two ways to achieve **concurrency**.
Letās avoid any confusion at the start: Concurrency is not parallelism. Parallelism is when your program executes code on multiple CPUs at once. Python mostly canāt do parallelism due to the Global Interpreter Lock. You can understand the GIL with a phrase short enough to fit on your hand: One thread runs Python, while N others sleep or await I/O. [Learn more about the GIL from my PyCon talk a few years ago](https://emptysqua.re/blog/series/grok-the-gil/).

So threads and asyncio have the same limitation: Neither threads nor asyncio Tasks can use multiple CPUs.
(An aside about multiprocessing, just so you know I know what youāre thinking: If you really need parallelism, use multiprocessing. Thatās the only way to run Python code using multiple CPUs at once with standard Python. But coordinating and exchanging data among Python processes is much harder than with threads, only do this if you really have to.)

But in this article Iām not talking about parallelism, Iām talking about concurrency. Concurrency is dealing with events in *partial order*: your program is waiting for several things to happen, and they could occur in one of several sequences. By far the most important example is waiting for data on many network connections at once. Some network clients and most network servers have to support concurrency, sometimes very high concurrency. We can use threads or an async framework, such as asyncio, as our method of supporting concurrency.
# Threads vs. asyncio [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#threads-vs-asyncio)
## Memory [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#memory)
Which one should you use, threads or asyncio? Letās start with asyncioās main advantage: Very very high concurrency programs are more memory efficient with asyncio.

Hereās a chart of [two simple programs](https://github.com/ajdavis/python-paxos-jepsen) spawning lots of threads (blue) and asyncio Tasks (red). Just importing asyncio means the red program starts with a higher memory footprint, but that doesnāt matter. What matters is, as concurrency increases, the red asyncio programās memory grows slower.
A Python thread costs about 10k of memory. Thatās not much memory! More than a few hundred threads is impractical in Python, and the operating system imposes limits that prevent huge numbers of threads. But if you have low hundreds, you donāt need asyncio. Threads work great. If you remember [the problems David Beazley pointed out in Python 2](https://archive.org/details/pyvideo_588___mindblowing-python-gil), they were solved in Python 3.
With asyncio, each Task costs about 2k of memory, and thereās effectively no upper bound. So asyncio is more memory-efficient for very high concurrency, e.g. waiting for network events on a huge number of mostly idle connections.
## Speed [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#speed)
Is asyncio faster than threads? No. As [Cal Peterson wrote](https://calpaterson.com/async-python-is-not-faster.html):
> Sadly async is not go-faster-stripes for the Python interpreter.
>
> Under realistic conditions asynchronous web frameworks are slightly worse throughput and much worse latency variance.
Standard library asyncio is definitely slower than most multi-threaded frameworks, because asyncio executes a lot of Python for each event. Generally frameworks are faster the more that theyāre implemented in C or another compiled language. Even with the fastest async frameworks, like [those based on uvloop](https://github.com/MagicStack/uvloop), tail latency seems to be worse than with multi-threading.
Iām not going to say all async frameworks are definitely slower than threads. What I can say confidently is that asyncio isnāt faster, and itās more efficient only for huge numbers of mostly idle connections. And only for that.
## Compatibility [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#compatibility)
What about compatibility? Here are the most popular Python web frameworks ([source](https://www.jetbrains.com/lp/devecosystem-2021/python/)).

The sum is more than 100% because respondents could choose multiple. Flask, Django, and most of the others are multi-threaded frameworks. Only three (FastAPI, Falcon, and Tornado) are asyncio-compatible. (We donāt know about the āotherā category, but itās only 4%.)
So your web application is probably multi-threaded now, not async. If you want to use asyncio, that means rewriting a large portion of your app. Whereas multi-threaded code is compatible with most of the apps, libraries, and frameworks already written.
## Trickiness [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#trickiness)
How tricky is it to write correct concurrent code with threads or asyncio?
Letās make a function called `do_something` which adds one to a global counter, and run it on two threads at once.
```
counter = 0
def do_something():
global counter
print("doing something....")
counter += 1 # Not atomic!
t0 = threading.Thread(target=do_something)
t1 = threading.Thread(target=do_something)
t0.start()
t1.start()
t0.join()
t1.join()
print(f"Counter: {counter}")
```
Will `counter` always eventually equal 2? No! [Plus-equals isnāt atomic](https://emptysqua.re/blog/python-increment-is-weird/). It first loads the value from the global, then adds 1, then stores the value to the global. If the two threads interleave during this process, one of their updates could be lost, and we end up with `counter` equal to 1, not 2.
We need to protect the plus-equals with a lock:
```
counter = 0
lock = threading.Lock()
def do_something():
global counter
print("doing something....")
with lock:
counter += 1
```
This is tricky! [In a 2014 blog post](https://glyph.twistedmatrix.com/2014/02/unyielding.html) Glyph Lefkowitz, the author of Twisted, talks about this trickiness. Itās still my favorite argument on the topic.
> As we know, threads are a bad idea, (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.
Glyph says the main reason to write async code isnāt that itās faster. Itās not even memory efficiency. Itās that itās less prone to concurrency bugs and it requires less tricky programming. (But it doesnāt have to be that bad, as youāll see below.)
Letās rewrite our counter-incrementing example with asyncio.
```
counter = 0
async def do_something():
global counter
print("doing something....")
await call_some_coroutine()
counter += 1 # Atomic! No "await" in +=.
async def main():
t0 = asyncio.Task(do_something())
t1 = asyncio.Task(do_something())
```
Now `do_something` is a coroutine. It calls another coroutine for the sake of illustration, and then increments the counter. We run it on two Tasks at once. Just by looking at the code we know where interleaving is possible. If it has an `await` expression, a coroutine can interleave there. Otherwise itās atomic. Thatās ālocal reasoningā. Plus-equals has no `await` expression, so itās atomic. We donāt need a lock here.
Therefore asyncio is better than multi-threading, because itās less tricky, right? We shall seeā¦.
In summary:
| | |
|---|---|
| Threads | asyncio |
| **Speed**: Threads are at least as fast. | **Memory**: asyncio efficiently waits on huge numbers of mostly-idle network connections. |
| **Compatibility**: Threads work with Flask, Django, etc., without rewriting your app for asyncio. | **Trickiness**: asyncio is less error-prone than threads. |
Must multi-threaded code be so tricky?
# Itās Time To Take Another Look At Threads [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#its-time-to-take-another-look-at-threads)

All along, itās been possible to write elegant, correct code with threads. To begin, letās look at how to use threads with Futures. Threads had Futures first, before asyncio. Futures let us express control flows youād struggle to write with mutexes and condition variables.
(Confusingly, [asyncio introduced a new Future class](https://docs.python.org/3/library/asyncio-future.html) thatās different from the one we use with threads. Iāve never had to use both in the same program, so itās fine.)

Future.
Letās rewrite our previous counter-incrementing example with Futures.
```
import threading
from concurrent.futures import Future

future0 = Future()
future1 = Future()

def do_something(future):
    print("doing something....")
    future.set_result(1)  # How much to increment the counter.

t0 = threading.Thread(target=do_something, args=(future0,))
t1 = threading.Thread(target=do_something, args=(future1,))
t0.start()
t1.start()

# Blocks until another thread calls Future.set_result().
counter = future0.result() + future1.result()
print(f"Counter: {counter}")
```
The `concurrent.futures` module is where all the cool threads stuff lives. It was introduced back in Python 3.2. Now `do_something` takes a Future and sets its result to 1. This is called "resolving the Future". We run the function on two threads and pass in the two Futures as arguments. Then we wait for the threads to call `set_result`, and sum the two results. Calling `Future.result()` blocks until the future is resolved. Note that we no longer need to call `Thread.join()`.
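`result()` also accepts a timeout, in case no thread ever resolves the future. A small sketch:

```python
from concurrent.futures import Future, TimeoutError

future = Future()
try:
    # Nobody calls set_result(), so this raises after 0.1 seconds.
    future.result(timeout=0.1)
except TimeoutError:
    print("Timed out waiting for a result")
```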
The Futures code above isn't much of an improvement. I'm just showing how Futures work. In reality you'd write something more like this:
```
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

# Takes a dummy argument.
def do_something(_):
    print("doing something....")
    return 1

# Like builtin "map" but concurrent.
counter = sum(executor.map(do_something, range(2)))
print(f"Counter: {counter}")
```
We create a `ThreadPoolExecutor`, which runs code on threads and reuses threads efficiently. `executor.map` is like the builtin `map` function, but it calls the function concurrently over all the arguments at once. In this case `do_something` doesn't need an argument, so we pass a dummy argument list, `range(2)`.
There are no explicit Futures or threads anymore; they're hidden inside the implementation of `map`. I think this looks really nice, and not error-prone at all.
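One detail worth knowing (my note, not from the talk): `executor.map` yields results in argument order, even when the underlying calls finish out of order. A sketch with a made-up `slow_identity` helper:

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

def slow_identity(n):
    time.sleep(0.2 - 0.05 * n)  # Later arguments finish sooner...
    return n

# ...but map() still yields results in argument order.
print(list(executor.map(slow_identity, range(3))))
```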
# Workflows
What about more complex workflows?
The morning before I gave this talk in Salt Lake City, I made French press coffee in my Airbnb. I brought a hand grinder, so grinding the coffee took some time. Then I heated water, combined them and waited for it to brew, and drank it.

Obviously that's not efficient. I should start the water heating and grind the coffee concurrently.

How can we code this with threads?
```
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

def heat_water():
    ...

def grind_coffee():
    ...

def brew(future1, future2):
    future1.result()
    future2.result()
    time.sleep(4 * 60)  # Brew for 4 minutes.

heated_future = executor.submit(heat_water)
ground_future = executor.submit(grind_coffee)
brew(heated_future, ground_future)
print("Drinking coffee")
```
The `brew` function takes two Futures and waits until both have completed, then waits for the coffee to brew. We use the `ThreadPoolExecutor` to start heating and grinding concurrently. We call `brew`, and when it's done, we can drink.
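Calling `result()` on each future works; `concurrent.futures` also offers a `wait` function that blocks until a whole set of futures is done. A sketch of the same idea, reusing the toy functions:

```python
from concurrent.futures import ThreadPoolExecutor, wait

executor = ThreadPoolExecutor()

def heat_water():
    return "heated water"

def grind_coffee():
    return "ground coffee"

# wait() blocks until every future in the list finishes,
# then returns (done, not_done) sets of futures.
done, not_done = wait([executor.submit(heat_water),
                       executor.submit(grind_coffee)])
print(sorted(f.result() for f in done))
```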
So far so good. Let's add more steps to this workflow and see how this technique handles the added complexity.

There's a quick step right after heating the water: I pour it into the French press. And after I grind the coffee I add the grounds to the press. These events can happen in either order, but I always want to do the red step as soon as its blue step is completed.
```
from concurrent.futures import as_completed

def heat_water():
    return "heated water"

def grind_coffee():
    return "ground coffee"

def brew(future1, future2):
    for future in as_completed([future1, future2]):
        print(f"Adding {future.result()} to French press")
    time.sleep(4 * 60)  # Brew for 4 minutes.
```
Now the `heat_water` and `grind_coffee` functions have return values; they produce something. The new `brew` function uses `as_completed`, which is also in the `concurrent.futures` module. If the water is heated first, we add it to the press first; if the coffee is ground first, we add the grounds first. Once both steps are done, we wait 4 minutes. The rest of the code is like before.
Imagine if you had to use old-fashioned thread code, with locks and condition variables to signal when each step was done. It would be a nightmare. But with `concurrent.futures` the code is just as clean and easy as with asyncio.
# Futures and Typing

These code examples aren't really modern Python yet, because they don't have any types.
```
def heat_water() -> str:
    return "heated water"

def grind_coffee() -> str:
    return "ground coffee"

def brew(future1: Future[str], future2: Future[str]):
    for future in as_completed([future1, future2]):
        print(f"Adding {future.result()} to French press")
        # ^ type system knows result() returns a string.
```
To use types with Futures, just subscript the Future type with whatever the Future resolves to, in this case a string. Then the type system knows that `result()` returns a string.
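Since `concurrent.futures.Future` became subscriptable at runtime in Python 3.9, you can write these annotations directly, and type checkers also infer `Future[str]` from `submit()`. A small sketch (assuming Python 3.9+):

```python
from concurrent.futures import Future, ThreadPoolExecutor

executor = ThreadPoolExecutor()

def heat_water() -> str:
    return "heated water"

# A checker infers Future[str] here; the annotation just makes it explicit.
future: Future[str] = executor.submit(heat_water)
word: str = future.result()  # The checker knows this is a str.
print(word)
```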
# Workflows, Part 2
What if the "coffee" workflow is one component of a much larger workflow, encompassing a whole afternoon?

First I make and drink coffee, then I have the motivation to do chores, which is a separate complex workflow. Of course I'm listening to a podcast the whole time.
```
with ThreadPoolExecutor() as main_exec:
    main_exec.submit(listen_to_podcast)
    with ThreadPoolExecutor() as coffee_exec:
        heated_future = coffee_exec.submit(heat_water)
        ground_future = coffee_exec.submit(grind_coffee)
        brew(heated_future, ground_future)
        print("Drinking coffee")
    # Join and shut down coffee_exec.
    with ThreadPoolExecutor() as chores_exec:
        ...
    # Join and shut down chores_exec.
# Join and shut down main_exec.
```
A nice way to structure nested workflows is with `with` statements. I start a block like `with ThreadPoolExecutor()` and run a function on that executor. I can start an inner executor using another `with` statement. When we leave a block, either normally or with an exception, we automatically join and shut down its executor, so all threads started within the block must finish.
This style is called "structured concurrency". It's been popularized in several languages and frameworks; Nathaniel Smith's Trio framework [introduced it to a lot of Pythonistas](https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/), and it will be [included in asyncio as "task groups" in Python 3.11](https://github.com/python/cpython/issues/90908).
Unfortunately we can't do full structured concurrency with Python threads. Ideally, if one thread died with an exception, the other threads started in the same block would be quickly cancelled, and all exceptions thrown in the block would be grouped together and bubble up. But exceptions in `ThreadPoolExecutor` blocks don't work well today, and cancellation with Python threads is Stone Age.
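Concretely, part of why exceptions "don't work well": an exception raised in a worker is stored on its Future and re-raised only when someone calls `result()`; leaving the `with` block joins the threads without raising anything. A sketch:

```python
from concurrent.futures import ThreadPoolExecutor

def fail():
    raise ValueError("worker died")

with ThreadPoolExecutor() as executor:
    future = executor.submit(fail)
    # Exiting the block joins the thread but does NOT re-raise.

try:
    future.result()  # The exception surfaces only here.
except ValueError as exc:
    print(exc)
```

If you never call `result()`, the error can pass silently; there is no automatic cancellation of the sibling tasks, either.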
# Cancellation
Threads are not nearly as good at cancellation as asyncio, Trio, or other async frameworks. Here's a handwritten solution; you'll need something like this in your program if you want cancellation.
```
import time
from concurrent.futures import ThreadPoolExecutor

class ThreadCancelledError(BaseException):
    pass

class CancellationToken:
    is_cancelled = False

    def check_cancelled(self):
        if self.is_cancelled:
            raise ThreadCancelledError()

def do_something(token):
    while True:
        # Don't forget to call check_cancelled!
        token.check_cancelled()

token = CancellationToken()
executor = ThreadPoolExecutor()
future = executor.submit(do_something, token)
time.sleep(1)
token.is_cancelled = True
try:
    future.result()  # Wait for do_something to notice that it's cancelled.
except ThreadCancelledError:
    print("Thread cancelled")
```
The custom `ThreadCancelledError` inherits from `BaseException` rather than `Exception`, so that it bypasses most `except` blocks. Now in `do_something` we must add frequent calls to `check_cancelled`.
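Here's the bypass in action: a broad `except Exception` handler, which would swallow ordinary errors, lets the cancellation exception through to the handler that actually names it:

```python
class ThreadCancelledError(BaseException):
    pass

try:
    raise ThreadCancelledError()
except Exception:
    print("caught as ordinary error")  # Not reached: BaseException skips this.
except ThreadCancelledError:
    print("cancelled")
```

This is the same trick `KeyboardInterrupt` and `SystemExit` use to escape overly broad `except` clauses.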
Python doesn't control the thread scheduler the way it controls the asyncio event loop, so it's not possible for thread cancellation to be as good. But it could be improved. See [Nathaniel Smith's 2018 article](https://vorpus.org/blog/timeouts-and-cancellation-for-humans) for superior ideas. I'm curious whether anyone has a PEP for improving thread cancellation.
# A Real Life Example
Let's get back to the good news about threads.
Here's a real-life example I coded a few months ago. [I implemented Paxos in Python](https://github.com/ajdavis/python-paxos-jepsen). Paxos is a way for a group of servers to cooperate for fault-tolerance. Here's a group of three servers which all communicate with each other.

How does each server know all its peers' names? Let's give them all a config file.
```
{
    "servers": [
        "host0.example.com",
        "host1.example.com",
        "host2.example.com"
    ]
}
```
But how does any server know which one it is? This is surprisingly hard. In a data center or cloud deployment, each server usually has several IPs and several DNS names, such as its internal and external names. Calling `gethostname()` usually doesn't give you the information you need. There's no easy way to know whether a DNS query for `host0`, for example, resolves to this server or another server.
The solution is sort of amazing. First, each server generates a random unique id for itself when it starts up. Next, each server sends a request to all the servers in the list, which includes itself, though it doesn't know which one is itself. Here I show `host0` sending out three requests; the others do the same. `host0` gets replies from `host1` and `host2` with different ids, and it gets a reply from `host0` with its **own** id! So it knows that it is `host0`.

This is actually how MongoDB and lots of other distributed systems solve this problem.
Servers can't process any requests until they find themselves, and they can start up in any order, so this creates a complex control flow. Sounds like a job for Futures!
Here's the server code. We'll start by generating a unique id for this server. I want to use Flask for the server, of course: Flask is the most popular web framework. The server makes a Future which it will resolve when it finds itself.
```
import json
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

import flask
import requests

server_id = uuid.uuid4().hex
app = flask.Flask('PyPaxos')
self_future = Future()

@app.route('/server_id', methods=['GET'])
def get_server_id():
    return flask.jsonify(server_id)

@app.route('/client-request', methods=['POST'])
def client_request():
    # Can't handle requests until I find myself, block here.
    self_future.result()
    ...

config = json.load(open("paxos-config.json"))
executor = ThreadPoolExecutor()
# Run Flask server on a thread, so main thread can search for self.
app_done = executor.submit(app.run)
start = time.monotonic()
while time.monotonic() - start < 60 and not self_future.done():
    for server_name in config["servers"]:
        try:
            # Use Requests to query /server_id handler, above.
            reply = requests.get(f"http://{server_name}/server_id")
            if reply.json() == server_id:
                # Found self. Unblock threads in client_request()
                # above, and exit loop.
                self_future.set_result(server_name)
                break
        except requests.RequestException:
            # See explanation below.
            pass
    time.sleep(1)
app_done.result()  # Block until the server thread exits.
```
The server can't process any client requests until it's found itself, so `client_request` waits for `self_future` to be resolved by calling `self_future.result()`. Once the future has been resolved, calling `result()` always returns immediately.
The search loop tries repeatedly for a minute to find self, by querying for each server's id. It might catch an exception when querying, either because it's trying to reach another server that hasn't started yet, or because it's trying to reach **itself** but Flask hasn't initialized on its background thread.
After the search loop completes we wait on `app_done.result()`: that means the main thread sleeps until the server thread exits, maybe because of a Control-C or some other signal.
Clean and clear, right? If I had written this with asyncio I couldn't use Flask, the most popular web framework, and I couldn't use Requests, the most popular client library. (Requests doesn't support asyncio.) I would've had to rewrite everything for asyncio. But with threads, I can implement this advanced control flow in a straightforward and legible manner, and I can still use Flask and Requests.
# Cool Threads

Threads are cool. Don't let the asyncio kids make you feel like a nerd.
Threads are a better choice than asyncio for most concurrent programs. They're at least as fast as asyncio, they're compatible with the popular frameworks, and with the techniques we looked at, using Futures and ThreadPoolExecutors, multi-threaded code can be safe and elegant.
Categories: [Python](https://emptysqua.re/blog/category/python/ "All posts in Python")
Tags: [pycon](https://emptysqua.re/blog/tag/pycon/ "All posts tagged pycon")
X |
| Readable Markdown | I spoke at PyCon 2022 about writing safe, elegant concurrent Python with threads. Hereās the video. Sorry about the choppy audio, [the A/V at PyCon this year was a shitshow](https://pycon.blogspot.com/2022/05/pycon-us-2022-recordings-update.html). Below is a written version of the talk.
***

asyncio.
Asyncio is really hip. And not just asyncioāthe older async frameworks like Twisted and Tornado, and more recent ones like Trio and Curio are hip, too. I think they deserve to be! Iām a big fan. I spent a lot of time contributing to Tornado and asyncio some years ago. My very first PyCon talk, in 2014, was called ā[What Is Async, How Does It Work, And When Should I Use It?](https://www.youtube.com/watch?v=9WV7juNmyE8)ā I was an early async booster.
Asyncio introduced a lot of Pythonistas to advanced control flows with Tasks, Futures, chaining, `asyncio.gather`, and so on. All this was really exciting! But thereās something a lot of Python programmers didnāt notice at the time: All this was already possible with threads, too.

Threads.
Compared to asyncio, threads seem hopelessly outdated. The cool kids will laugh at you if you use threads.
## Concurrency and parallelism [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#concurrency-and-parallelism)
Threads and asyncio are two ways to achieve **concurrency**.
Letās avoid any confusion at the start: Concurrency is not parallelism. Parallelism is when your program executes code on multiple CPUs at once. Python mostly canāt do parallelism due to the Global Interpreter Lock. You can understand the GIL with a phrase short enough to fit on your hand: One thread runs Python, while N others sleep or await I/O. [Learn more about the GIL from my PyCon talk a few years ago](https://emptysqua.re/blog/series/grok-the-gil/).

So threads and asyncio have the same limitation: Neither threads nor asyncio Tasks can use multiple CPUs.
(An aside about multiprocessing, just so you know I know what youāre thinking: If you really need parallelism, use multiprocessing. Thatās the only way to run Python code using multiple CPUs at once with standard Python. But coordinating and exchanging data among Python processes is much harder than with threads, only do this if you really have to.)

But in this article Iām not talking about parallelism, Iām talking about concurrency. Concurrency is dealing with events in *partial order*: your program is waiting for several things to happen, and they could occur in one of several sequences. By far the most important example is waiting for data on many network connections at once. Some network clients and most network servers have to support concurrency, sometimes very high concurrency. We can use threads or an async framework, such as asyncio, as our method of supporting concurrency.
## Threads vs. asyncio [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#threads-vs-asyncio)
## Memory [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#memory)
Which one should you use, threads or asyncio? Letās start with asyncioās main advantage: Very very high concurrency programs are more memory efficient with asyncio.

Hereās a chart of [two simple programs](https://github.com/ajdavis/python-paxos-jepsen) spawning lots of threads (blue) and asyncio Tasks (red). Just importing asyncio means the red program starts with a higher memory footprint, but that doesnāt matter. What matters is, as concurrency increases, the red asyncio programās memory grows slower.
A Python thread costs about 10k of memory. Thatās not much memory! More than a few hundred threads is impractical in Python, and the operating system imposes limits that prevent huge numbers of threads. But if you have low hundreds, you donāt need asyncio. Threads work great. If you remember [the problems David Beazley pointed out in Python 2](https://archive.org/details/pyvideo_588___mindblowing-python-gil), they were solved in Python 3.
With asyncio, each Task costs about 2k of memory, and thereās effectively no upper bound. So asyncio is more memory-efficient for very high concurrency, e.g. waiting for network events on a huge number of mostly idle connections.
## Speed [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#speed)
Is asyncio faster than threads? No. As [Cal Peterson wrote](https://calpaterson.com/async-python-is-not-faster.html):
> Sadly async is not go-faster-stripes for the Python interpreter.
>
> Under realistic conditions asynchronous web frameworks are slightly worse throughput and much worse latency variance.
Standard library asyncio is definitely slower than most multi-threaded frameworks, because asyncio executes a lot of Python for each event. Generally frameworks are faster the more that theyāre implemented in C or another compiled language. Even with the fastest async frameworks, like [those based on uvloop](https://github.com/MagicStack/uvloop), tail latency seems to be worse than with multi-threading.
Iām not going to say all async frameworks are definitely slower than threads. What I can say confidently is that asyncio isnāt faster, and itās more efficient only for huge numbers of mostly idle connections. And only for that.
## Compatibility [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#compatibility)
What about compatibility? Here are the most popular Python web frameworks ([source](https://www.jetbrains.com/lp/devecosystem-2021/python/)).

The sum is more than 100% because respondents could choose multiple. Flask, Django, and most of the others are multi-threaded frameworks. Only three (FastAPI, Falcon, and Tornado) are asyncio-compatible. (We donāt know about the āotherā category, but itās only 4%.)
So your web application is probably multi-threaded now, not async. If you want to use asyncio, that means rewriting a large portion of your app. Whereas multi-threaded code is compatible with most of the apps, libraries, and frameworks already written.
## Trickiness [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#trickiness)
How tricky is it to write correct concurrent code with threads or asyncio?
Letās make a function called `do_something` which adds one to a global counter, and run it on two threads at once.
```
counter = 0
def do_something():
global counter
print("doing something....")
counter += 1 # Not atomic!
t0 = threading.Thread(target=do_something)
t1 = threading.Thread(target=do_something)
t0.start()
t1.start()
t0.join()
t1.join()
print(f"Counter: {counter}")
```
Will `counter` always eventually equal 2? No! [Plus-equals isnāt atomic](https://emptysqua.re/blog/python-increment-is-weird/). It first loads the value from the global, then adds 1, then stores the value to the global. If the two threads interleave during this process, one of their updates could be lost, and we end up with `counter` equal to 1, not 2.
We need to protect the plus-equals with a lock:
```
counter = 0
lock = threading.Lock()
def do_something():
global counter
print("doing something....")
with lock:
counter += 1
```
This is tricky! [In a 2014 blog post](https://glyph.twistedmatrix.com/2014/02/unyielding.html) Glyph Lefkowitz, the author of Twisted, talks about this trickiness. Itās still my favorite argument on the topic.
> As we know, threads are a bad idea, (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.
Glyph says the main reason to write async code isnāt that itās faster. Itās not even memory efficiency. Itās that itās less prone to concurrency bugs and it requires less tricky programming. (But it doesnāt have to be that bad, as youāll see below.)
Letās rewrite our counter-incrementing example with asyncio.
```
counter = 0
async def do_something():
global counter
print("doing something....")
await call_some_coroutine()
counter += 1 # Atomic! No "await" in +=.
async def main():
t0 = asyncio.Task(do_something())
t1 = asyncio.Task(do_something())
```
Now `do_something` is a coroutine. It calls another coroutine for the sake of illustration, and then increments the counter. We run it on two Tasks at once. Just by looking at the code we know where interleaving is possible. If it has an `await` expression, a coroutine can interleave there. Otherwise itās atomic. Thatās ālocal reasoningā. Plus-equals has no `await` expression, so itās atomic. We donāt need a lock here.
Therefore asyncio is better than multi-threading, because itās less tricky, right? We shall seeā¦.
In summary:
| | |
|---|---|
| Threads | asyncio |
| **Speed**: Threads are at least as fast. | **Memory**: asyncio efficiently waits on huge numbers of mostly-idle network connections. |
| **Compatibility**: Threads work with Flask, Django, etc., without rewriting your app for asyncio. | **Trickiness**: asyncio is less error-prone than threads. |
Must multi-threaded code be so tricky?
## Itās Time To Take Another Look At Threads [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#its-time-to-take-another-look-at-threads)

All along, itās been possible to write elegant, correct code with threads. To begin, letās look at how to use threads with Futures. Threads had Futures first, before asyncio. Futures let us express control flows youād struggle to write with mutexes and condition variables.
(Confusingly, [asyncio introduced a new Future class](https://docs.python.org/3/library/asyncio-future.html) thatās different from the one we use with threads. Iāve never had to use both in the same program, so itās fine.)

Future.
Letās rewrite our previous counter-incrementing example with Futures.
```
from concurrent.futures import Future
future0 = Future()
future1 = Future()
def do_something(future):
print("doing something....")
future.set_result(1) # How much to increment the counter.
t0 = threading.Thread(target=do_something, args=(future0,))
t1 = threading.Thread(target=do_something, args=(future1,))
t0.start()
t1.start()
# Blocks until another thread calls Future.set_result().
counter = future0.result() + future1.result()
print(f"Counter: {counter}")
```
The `concurrent.futures` module is where all the cool threads stuff lives. It was introduced back in Python 3.2. Now `do_something` takes a Future and sets its result to 1. This is called āresolving the Futureā. We run the function on two threads and pass in the two Futures as arguments. Then we wait for the threads to call `set_result`, and sum the two results. Calling `Future.result()` blocks until the future is resolved. Note that we no longer need to call `Thread.join()`.
This code isnāt much of an improvement. Iām just showing how Futures work. In reality youād write something more like this:
```
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor()
# Takes a dummy argument.
def do_something(_):
print("doing something....")
return 1
# Like builtin "map" but concurrent.
counter = sum(executor.map(do_something, range(2)))
print(f"Counter: {counter}")
```
We create a `ThreadPoolExecutor`, which runs code on threads and reuses threads efficiently. `executor.map` is like the builtin `map` function, but it calls the function concurrently over all the arguments at once. In this case `do_something` doesnāt need an argument, so we use a dummy argument list, `range(2)`.
Thereās no more explicit Futures or threads here, theyāre hidden inside the implementation of `map`. I think this looks really nice, and not error-prone at all.
## Workflows [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#workflows)
What about more complex workflows?
The morning before I gave this talk in Salt Lake City, I made French press coffee in my Airbnb. I brought a hand grinder, so grinding the coffee took some time. Then I heated water, combined them and waited for it to brew, and drank it.

Obviously thatās not efficient. I should start the water heating and grind the coffee concurrently.

How can we code this with threads?
```
executor = ThreadPoolExecutor()
def heat_water():
...
def grind_coffee():
...
def brew(future1, future2):
future1.result()
future2.result()
time.sleep(4 * 60) # Brew for 4 minutes.
heated_future = executor.submit(heat_water)
ground_future = executor.submit(grind_coffee)
brew(heated_future, ground_future)
print("Drinking coffee")
```
The `brew` function takes two Futures and waits until both have completed, then waits for the coffee to brew. We use the `ThreadPoolExecutor` to start heating and grinding concurrently. We call `brew` and when itās done, we can drink.
So far so good. Letās add more steps to this workflow and see how this technique handles the added complexity.

Thereās a quick step right after heating the water: I pour it into the French press. And after I grind the coffee I add the grounds to the press. These events can happen in either order, but I always want to do the red step as soon as its blue step is completed.
```
def heat_water():
return "heated water"
def grind_coffee():
return "ground coffee"
def brew(future1, future2):
for future in as_completed([future1, future2]):
print(f"Adding {future.result()} to French press")
time.sleep(4 * 60) # Brew for 4 minutes.
```
Now the `heat_water` and `grind_coffee` functions have return values; they produce something. The new `brew` function uses `as_completed`, which is also in the `concurrent.futures` module. If the water is heated first, then we add it to the press, or if the coffee is ground first, we add the grounds first. Once both steps are done, then we wait 4 minutes. The rest of the code is like before.
Imagine if you had to use old-fashioned thread code, with locks and condition variables to signal when each step was done. It would be a nightmare. But with `concurrent.futures` the code is just as clean and easy as with asyncio.
## Futures and Typing [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#futures-and-typing)

These code examples arenāt really modern Python yet, because they donāt have any types.
```
def heat_water() -> str:
return "heated water"
def grind_coffee() -> str:
return "ground coffee"
def brew(future1: Future[str], future2: Future[str]):
for future in as_completed([future1, future2]):
print(f"Adding {future.result()} to French press")
# ^ type system knows result() returns a string.
```
To use types with Futures, just subscript the Future type with whatever the Future resolves to, in this case a string. Then the type system knows that `result()` returns a string.
## Workflows, Part 2 [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#workflows-part-2)
What if the ācoffeeā workflow is one component of a much larger workflow, encompassing a whole afternoon?

First I make and drink coffee, then I have the motivation to do chores, which is a separate complex workflow. Of course Iām listening to a podcast the whole time.
```
with ThreadPoolExecutor() as main_exec:
main_exec.submit(listen_to_podcast)
with ThreadPoolExecutor() as coffee_exec:
heated_future = coffee_exec.submit(heat_water)
ground_future = coffee_exec.submit(grind_coffee)
brew(heated_future, ground_future)
print("Drinking coffee")
# Join and shut down coffee_exec.
with ThreadPoolExecutor() as chores_exec:
...
# Join and shut down chores_exec.
# Join and shut down main_exec.
```
A nice way to structure nested workflows is using a `with` statement. I start a block like `with ThreadPoolExecutor` and run a function on that executor. I can start an inner executor using another `with` statement. When we leave the block, either normally with an exception, we automatically join and shut down the executor, so all threads started within the block must finish.
This style is called āstructured concurrencyā. Itās been popularized in several languages and frameworks; Nathaniel Smithās Trio framework [introduced it to a lot of Pythonistas](https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/), and it will be [included in asyncio as ātask groupsā in Python 3.11](https://github.com/python/cpython/issues/90908).
Unfortunately we canāt do full structured concurrency with Python threads. Ideally, if one thread dies with an exception, other threads started in the same block would be quickly cancelled, and all exceptions thrown in the block would be grouped together and bubble up. But exceptions in `ThreadPoolExecutor` blocks donāt work well today, and cancellation with Python threads is Stone-Aged.
## Cancellation [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#cancellation)
Threads are not nearly as good at cancellation as asyncio, Trio, or other async frameworks. Hereās a handwritten solution; youāll need something like this in your program if you want cancellation.
```
class ThreadCancelledError(BaseException):
pass
class CancellationToken:
is_cancelled = False
def check_cancelled(self):
if self.is_cancelled: raise ThreadCancelledError()
def do_something(token):
while True:
# Don't forget to call check_cancelled!
token.check_cancelled()
token = CancellationToken()
executor = ThreadPoolExecutor()
future = executor.submit(do_something, token)
time.sleep(1)
token.is_cancelled = True
try:
future.result() # Wait for do_something to notice that it's cancelled.
except ThreadCancelledError:
print("Thread cancelled")
```
The custom `ThreadCancelledError` inherits from `BaseException` rather than `Exception`, so that it bypasses most `except` blocks. Now in `do_something` we must add frequent calls to `check_cancelled`.
Python doesnāt control the thread scheduler the way it controls the asyncio event loop, so itās not possible for thread cancellation to be as good. But it could be improved. See [Nathaniel Smithās 2018 article](https://vorpus.org/blog/timeouts-and-cancellation-for-humans) for superior ideas. Iām curious if anyone has a PEP for improving thread cancellation.
## A Real Life Example [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#a-real-life-example)
Letās get back to the good news about threads.
Hereās a real life example I coded a few months ago. [I implemented Paxos in Python](https://github.com/ajdavis/python-paxos-jepsen). Paxos is a way for a group of servers to cooperate for fault-tolerance. Hereās a group of three servers which all communicate with each other.

How does each server know all its peersā names? Letās give them all a config file.
```
{
"servers": [
"host0.example.com",
"host1.example.com",
"host2.example.com"
]
}
```
But how does any server know which one it is? This is surprisingly hard. In a data center or cloud deployment, each server usually has several IPs and several DNS names, such as its internal and external names. Calling `gethostname()` usually doesnāt give you the information you need. Thereās no easy way to know if a DNS query for `host0`, for example, resolves to this server or another server.
The solution is sort of amazing. First, each server generates a random unique id for itself when it starts up. Next, each server sends a request to all the servers in the list, which includes itself, but it doesnāt know which one is self. Here I show `host0` sending out three requests; the others do the same. `host0` gets replies from `host1` and `host2` with different ids, and it gets a reply from `host0` with its **own** id! So it knows that it is `host0`.

This is actually how MongoDB and lots of other distributed systems solve this problem.
Servers can't process any requests until they find themselves, and they can start up in any order, so this creates a complex control flow. Sounds like a job for Futures\!
Here's the server code. We'll start by generating a unique id for this server. I want to use Flask for the server, of course: Flask is the most popular web framework. The server makes a Future which it will resolve when it finds itself.
```
import json
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

import flask
import requests

server_id = uuid.uuid4().hex
app = flask.Flask('PyPaxos')
self_future = Future()

@app.route('/server_id', methods=['GET'])
def get_server_id():
    return flask.jsonify(server_id)

@app.route('/client-request', methods=['POST'])
def client_request():
    # Can't handle requests until I find myself, block here.
    self_future.result()
    ...

config = json.load(open("paxos-config.json"))
executor = ThreadPoolExecutor()
# Run Flask server on a thread, so main thread can search for self.
app_done = executor.submit(app.run)
start = time.monotonic()
while time.monotonic() - start < 60 and not self_future.done():
    for server_name in config["servers"]:
        try:
            # Use Requests to query /server_id handler, above.
            reply = requests.get(f"http://{server_name}/server_id")
            if reply.json() == server_id:
                # Found self. Unblock threads in client_request()
                # above, and exit loop.
                self_future.set_result(server_name)
                break
        except requests.RequestException:
            # See explanation below.
            pass
    time.sleep(1)

app_done.result()  # Let app run in background.
```
The server can't process any client requests until it's found itself, so `client_request` waits for `self_future` to be resolved by calling `self_future.result()`. Once the future has been resolved, calling `result()` always returns immediately.
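This blocking-then-immediate behavior of `Future.result()` is easy to demonstrate in isolation. A standalone sketch, separate from the server code above:

```
import threading
from concurrent.futures import Future

f = Future()
results = []

def waiter():
    # Blocks here until another thread calls f.set_result().
    results.append(f.result())

t = threading.Thread(target=waiter)
t.start()
f.set_result("host0.example.com")
t.join()
# After resolution, every call to f.result() returns immediately.
print(results)  # ['host0.example.com']
```

Any number of threads can block in `result()` at once; a single `set_result()` unblocks them all.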
The search loop tries repeatedly for a minute to find self, by querying for each server's id. It might catch an exception when querying, either because it's trying to reach another server that hasn't started yet, or because it's trying to reach **itself** but Flask hasn't finished initializing on its background thread.
After the search loop completes, we wait on `app_done.result()`: the main thread sleeps until the server thread exits, perhaps because of a Control-C or some other signal.
Clean and clear, right? If I had rewritten this with asyncio I couldn't use Flask, the most popular web framework, and I couldn't use Requests, the most popular client library. (Requests doesn't support asyncio.) I would've had to rewrite everything to use asyncio. But with threads, I can implement this advanced control flow in a straightforward and legible manner, and I can still use Flask and Requests.
## Cool Threads [\#](https://emptysqua.re/blog/why-should-async-get-all-the-love/#cool-threads)

Threads are cool. Don't let the asyncio kids make you feel like a nerd.
Threads are a better choice than asyncio for most concurrent programs. They're at least as fast as asyncio, they're compatible with the popular frameworks, and with the techniques we looked at, using Futures and ThreadPoolExecutors, multi-threaded code can be safe and elegant.