Formatter (#51)

Enforce consistent formatting using `dprint`
This commit is contained in:
Luca Palmieri
2024-05-24 17:00:03 +02:00
committed by GitHub
parent 537118574b
commit 99591a715e
157 changed files with 1057 additions and 1044 deletions

@@ -1,11 +1,11 @@
# Async Rust
Threads are not the only way to write concurrent programs in Rust.\
In this chapter we'll explore another approach: **asynchronous programming**.
In particular, you'll get an introduction to:
- The `async`/`.await` keywords, to write asynchronous code effortlessly
- The `Future` trait, to represent computations that may not be complete yet
- `tokio`, the most popular runtime for running asynchronous code
- The cooperative nature of Rust's asynchronous model, and how this affects your code

@@ -1,16 +1,16 @@
# Asynchronous functions
All the functions and methods you've written so far were eager.\
Nothing happened until you invoked them. But once you did, they ran to
completion: they did **all** their work, and then returned their output.
Sometimes that's undesirable.\
For example, if you're writing an HTTP server, there might be a lot of
**waiting**: waiting for the request body to arrive, waiting for the
database to respond, waiting for a downstream service to reply, etc.
What if you could do something else while you're waiting?\
What if you could choose to give up midway through a computation?\
What if you could choose to prioritise another task over the current one?
That's where **asynchronous functions** come in.
@@ -38,7 +38,7 @@ fn run() {
}
```
Nothing happens!\
Rust doesn't start executing `bind_random` when you call it,
not even as a background task (as you might expect based on your experience
with other languages).
@@ -68,18 +68,18 @@ async fn run() {
}
```
`.await` doesn't return control to the caller until the asynchronous function
has run to completion—e.g. until the `TcpListener` has been created in the example above.
## Runtimes
If you're puzzled, you're right to be!\
We've just said that the perk of asynchronous functions
is that they don't do **all** their work at once. We then introduced `.await`, which
doesn't return until the asynchronous function has run to completion. Haven't we
just re-introduced the problem we were trying to solve? What's the point?
Not quite! A lot happens behind the scenes when you call `.await`!\
You're yielding control to an **async runtime**, also known as an **async executor**.
Executors are where the magic happens: they are in charge of managing all your
ongoing asynchronous **tasks**. In particular, they balance two different goals:
@@ -95,8 +95,8 @@ no default runtime. The standard library doesn't ship with one. You need to
bring your own!
In most cases, you'll choose one of the options available in the ecosystem.
Some runtimes are designed to be broadly applicable, a solid option for most applications.
`tokio` and `async-std` belong to this category. Other runtimes are optimised for
specific use cases—e.g. `embassy` for embedded systems.
Throughout this course we'll rely on `tokio`, the most popular runtime for general-purpose
@@ -130,10 +130,10 @@ fn main() {
### `#[tokio::test]`
The same goes for tests: they must be synchronous functions.\
Each test function is run in its own thread, and you're responsible for
setting up and launching an async runtime if you need to run async code
in your tests.\
`tokio` provides a `#[tokio::test]` macro to make this easier:
```rust
@@ -141,4 +141,4 @@ in your tests.
async fn my_test() {
// Your async test code goes here
}
```

@@ -12,12 +12,12 @@ pub async fn echo(listener: TcpListener) -> Result<(), anyhow::Error> {
}
```
This is not bad!\
If a long time passes between two incoming connections, the `echo` function will be idle
(since `TcpListener::accept` is an asynchronous function), thus allowing the executor
to run other tasks in the meantime.
But how can we actually have multiple tasks running concurrently?\
If we always run our asynchronous functions until completion (by using `.await`), we'll never
have more than one task running at a time.
@@ -25,7 +25,7 @@ This is where the `tokio::spawn` function comes in.
## `tokio::spawn`
`tokio::spawn` allows you to hand off a task to the executor, **without waiting for it to complete**.\
Whenever you invoke `tokio::spawn`, you're telling `tokio` to continue running
the spawned task, in the background, **concurrently** with the task that spawned it.
@@ -51,12 +51,12 @@ pub async fn echo(listener: TcpListener) -> Result<(), anyhow::Error> {
### Asynchronous blocks
In this example, we've passed an **asynchronous block** to `tokio::spawn`: `async move { /* */ }`.
Asynchronous blocks are a quick way to mark a region of code as asynchronous without having
to define a separate async function.
### `JoinHandle`
`tokio::spawn` returns a `JoinHandle`.\
You can use `JoinHandle` to `.await` the background task, in the same way
we used `join` for spawned threads.
@@ -83,10 +83,10 @@ pub async fn do_work() {
### Panic boundary
If a task spawned with `tokio::spawn` panics, the panic will be caught by the executor.\
If you don't `.await` the corresponding `JoinHandle`, the panic won't be propagated to the spawner.
Even if you do `.await` the `JoinHandle`, the panic won't be propagated automatically.
Awaiting a `JoinHandle` returns a `Result`, with [`JoinError`](https://docs.rs/tokio/latest/tokio/task/struct.JoinError.html)
as its error type. You can then check if the task panicked by calling `JoinError::is_panic` and
choose what to do with the panic—either log it, ignore it, or propagate it.
@@ -112,11 +112,11 @@ pub async fn work() {
### `std::thread::spawn` vs `tokio::spawn`
You can think of `tokio::spawn` as the asynchronous sibling of `std::thread::spawn`.
Notice a key difference: with `std::thread::spawn`, you're delegating control to the OS scheduler.
You're not in control of how threads are scheduled.
With `tokio::spawn`, you're delegating to an async executor that runs entirely in
user space. The underlying OS scheduler is not involved in the decision of which task
to run next. We're in charge of that decision now, via the executor we chose to use.

@@ -6,9 +6,9 @@ it has an impact on our code.
## Flavors
`tokio` ships two different runtime _flavors_.
You can configure your runtime via `tokio::runtime::Builder`:
- `Builder::new_multi_thread` gives you a **multithreaded `tokio` runtime**
- `Builder::new_current_thread` will instead rely on the **current thread** for execution.
@@ -19,29 +19,29 @@ You can configure your runtime via `tokio::runtime::Builder`:
### Current thread runtime
The current-thread runtime, as the name implies, relies exclusively on the OS thread
it was launched on to schedule and execute tasks.\
When using the current-thread runtime, you have **concurrency** but no **parallelism**:
asynchronous tasks will be interleaved, but there will always be at most one task running
at any given time.
### Multithreaded runtime
When using the multithreaded runtime, instead, there can be up to `N` tasks running
_in parallel_ at any given time, where `N` is the number of threads used by the
runtime. By default, `N` matches the number of available CPU cores.
There's more: `tokio` performs **work-stealing**.\
If a thread is idle, it won't wait around: it'll try to find a new task that's ready for
execution, either from a global queue or by stealing it from the local queue of another
thread.\
Work-stealing can have significant performance benefits, especially on tail latencies,
whenever your application is dealing with workloads that are not perfectly balanced
across threads.
## Implications
`tokio::spawn` is flavor-agnostic: it'll work whether you're running on the multithreaded
or current-thread runtime. The downside is that the signature assumes the worst case
(i.e. multithreaded) and is constrained accordingly:
```rust
@@ -52,7 +52,7 @@ where
{ /* */ }
```
Let's ignore the `Future` trait for now to focus on the rest.\
`spawn` is asking all its inputs to be `Send` and have a `'static` lifetime.
The `'static` constraint follows the same rationale as the `'static` constraint
@@ -85,4 +85,4 @@ fn spawner(input: Rc<u64>) {
println!("{}", input);
})
}
```

@@ -12,11 +12,11 @@ pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
{ /* */ }
```
What does it _actually_ mean for `F` to be `Send`?\
It implies, as we saw in the previous section, that whatever value it captures from the
spawning environment has to be `Send`. But it goes further than that.
Any value that's _held across a .await point_ has to be `Send`.\
Let's look at an example:
```rust
@@ -65,13 +65,13 @@ note: required by a bound in `tokio::spawn`
| ^^^^ required by this bound in `spawn`
```
To understand why that's the case, we need to refine our understanding of
Rust's asynchronous model.
## The `Future` trait
We stated early on that `async` functions return **futures**, types that implement
the `Future` trait. You can think of a future as a **state machine**.
It's in one of two states:
- **pending**: the computation has not finished yet.
@@ -90,27 +90,27 @@ trait Future {
### `poll`
The `poll` method is the heart of the `Future` trait.\
A future on its own doesn't do anything. It needs to be **polled** to make progress.\
When you call `poll`, you're asking the future to do some work.
`poll` tries to make progress, and then returns one of the following:
- `Poll::Pending`: the future is not ready yet. You need to call `poll` again later.
- `Poll::Ready(value)`: the future has finished. `value` is the result of the computation,
of type `Self::Output`.
Once `Future::poll` returns `Poll::Ready`, it should not be polled again: the future has
completed, there's nothing left to do.
### The role of the runtime
You'll rarely, if ever, be calling `poll` directly.\
That's the job of your async runtime: it has all the required information (the `Context`
in `poll`'s signature) to ensure that your futures are making progress whenever they can.
## `async fn` and futures
We've worked with the high-level interface, asynchronous functions.\
We've now looked at the low-level primitive, the `Future` trait.
How are they related?
@@ -143,23 +143,23 @@ pub enum ExampleFuture {
```
When `example` is called, it returns `ExampleFuture::NotStarted`. The future hasn't
been polled yet, so nothing has happened.\
When the runtime polls it the first time, `ExampleFuture` will advance until the next
`.await` point: it'll stop at the `ExampleFuture::YieldNow(Rc<i32>)` stage of the state
machine, returning `Poll::Pending`.\
When it's polled again, it'll execute the remaining code (`println!`) and
return `Poll::Ready(())`.
When you look at its state machine representation, `ExampleFuture`,
it is now clear why `example` is not `Send`: it holds an `Rc`, therefore
it cannot be `Send`.
## Yield points
As you've just seen with `example`, every `.await` point creates a new intermediate
state in the lifecycle of a future.\
That's why `.await` points are also known as **yield points**: your future _yields control_
back to the runtime that was polling it, allowing the runtime to pause it and (if necessary)
schedule another task for execution, thus making progress on multiple fronts concurrently.
We'll come back to the importance of yielding in a later section.

@@ -1,6 +1,6 @@
# Don't block the runtime
Let's circle back to yield points.\
Unlike threads, **Rust tasks cannot be preempted**.
`tokio` cannot, on its own, decide to pause a task and run another one in its place.
@@ -11,13 +11,13 @@ you `.await` a future.
This exposes the runtime to a risk: if a task never yields, the runtime will never
be able to run another task. This is called **blocking the runtime**.
## What is blocking?
How long is too long? How much time can a task spend without yielding before it
becomes a problem?
It depends on the runtime, the application, the number of in-flight tasks, and
many other factors. But, as a general rule of thumb, try to spend less than 100
microseconds between yield points.
## Consequences
@@ -27,7 +27,7 @@ Blocking the runtime can lead to:
- **Deadlocks**: if the task that's not yielding is waiting for another task to
complete, and that task is waiting for the first one to yield, you have a deadlock.
No progress can be made, unless the runtime is able to schedule the other task on
a different thread.
- **Starvation**: other tasks might not be able to run, or might run after a long
delay, which can lead to poor performance (e.g. high tail latencies).
@@ -46,12 +46,12 @@ of entries.
## How to avoid blocking
OK, so how do you avoid blocking the runtime assuming you _must_ perform an operation
that qualifies or risks qualifying as blocking?\
You need to move the work to a different thread. You don't want to use the so-called
runtime threads, the ones used by `tokio` to run tasks.
`tokio` provides a dedicated threadpool for this purpose, called the **blocking pool**.
You can spawn a synchronous operation on the blocking pool using the
`tokio::task::spawn_blocking` function. `spawn_blocking` returns a future that resolves
to the result of the operation when it completes.
@@ -76,4 +76,4 @@ because the cost of thread initialization is amortized over multiple calls.
## Further reading
- Check out [Alice Ryhl's blog post](https://ryhl.io/blog/async-what-is-blocking/)
on the topic.

@@ -32,9 +32,9 @@ async fn http_call(v: &[u64]) {
### `std::sync::MutexGuard` and yield points
This code will compile, but it's dangerous.\
We try to acquire a lock over a `Mutex` from `std` in an asynchronous context.
We then hold on to the resulting `MutexGuard` across a yield point (the `.await` on
`http_call`).
@@ -42,18 +42,18 @@ Let's imagine that there are two tasks executing `run`, concurrently, on a singl
runtime. We observe the following sequence of scheduling events:
```text
     Task A                    Task B
        |
   Acquire lock
 Yields to runtime
        |
        +--------------+
                       |
             Tries to acquire lock
```
We have a deadlock. Task B will never manage to acquire the lock, because the lock
is currently held by task A, which has yielded to the runtime before releasing the
lock and won't be scheduled again because the runtime cannot preempt task B.
### `tokio::sync::Mutex`
@@ -73,32 +73,32 @@ async fn run(m: Arc<Mutex<Vec<u64>>>) {
```
Acquiring the lock is now an asynchronous operation, which yields back to the runtime
if it can't make progress.\
Going back to the previous scenario, the following would happen:
```text
     Task A                    Task B
        |
 Acquires the lock
 Starts `http_call`
 Yields to runtime
        |
        +--------------+
                       |
             Tries to acquire the lock
             Cannot acquire the lock
                Yields to runtime
                       |
        +--------------+
        |
 `http_call` completes
  Releases the lock
  Yields to runtime
        |
        +--------------+
                       |
              Acquires the lock
                    [...]
```
All good!
@@ -107,14 +107,14 @@ All good!
We've used a single-threaded runtime as the execution context in our
previous example, but the same risk persists even when using a multithreaded
runtime.\
The only difference is in the number of concurrent tasks required to create the deadlock:
in a single-threaded runtime, 2 are enough; in a multithreaded runtime, we
would need `N+1` tasks, where `N` is the number of runtime threads.
### Downsides
Having an async-aware `Mutex` comes with a performance penalty.\
If you're confident that the lock isn't under significant contention
_and_ you're careful to never hold it across a yield point, you can
still use `std::sync::Mutex` in an asynchronous context.
@@ -124,6 +124,6 @@ will incur.
## Other primitives
We used `Mutex` as an example, but the same applies to `RwLock`, semaphores, etc.\
Prefer async-aware versions when working in an asynchronous context to minimise
the risk of issues.

@@ -1,6 +1,6 @@
# Cancellation
What happens when a pending future is dropped?\
The runtime will no longer poll it, therefore it won't make any further progress.
In other words, its execution has been **cancelled**.
@@ -38,9 +38,9 @@ async fn http_call() {
}
```
Each yield point becomes a **cancellation point**.\
`http_call` can't be preempted by the runtime, so it can only be discarded after
it has yielded control back to the executor via `.await`.
This applies recursively—e.g. `stream.write_all(&request)` is likely to have multiple
yield points in its implementation. It is perfectly possible to see `http_call` pushing
a _partial_ request before being cancelled, thus dropping the connection and never
@@ -49,7 +49,7 @@ finishing transmitting the body.
## Clean up
Rust's cancellation mechanism is quite powerful—it allows the caller to cancel an ongoing task
without needing any form of cooperation from the task itself.\
At the same time, this can be quite dangerous. It may be desirable to perform a
**graceful cancellation**, to ensure that some clean-up tasks are performed
before aborting the operation.
@@ -71,7 +71,7 @@ async fn transfer_money(
```
On cancellation, it'd be ideal to explicitly abort the pending transaction rather
than leaving it hanging.
Rust, unfortunately, doesn't provide a bullet-proof mechanism for this kind of
**asynchronous** clean up operations.
@@ -86,8 +86,8 @@ The optimal choice is contextual.
## Cancelling spawned tasks
When you spawn a task using `tokio::spawn`, you can no longer drop it;
it belongs to the runtime.\
Nonetheless, you can use its `JoinHandle` to cancel it if needed:
```rust
@@ -102,8 +102,8 @@ async fn run() {
- Be extremely careful when using `tokio`'s `select!` macro to "race" two different futures.
Retrying the same task in a loop is dangerous unless you can ensure **cancellation safety**.
Check out [`select!`'s documentation](https://tokio.rs/tokio/tutorial/select) for more details.\
If you need to interleave two asynchronous streams of data (e.g. a socket and a channel), prefer using
[`StreamExt::merge`](https://docs.rs/tokio-stream/latest/tokio_stream/trait.StreamExt.html#method.merge) instead.
- Rather than "abrupt" cancellation, it can be preferable to rely
on [`CancellationToken`](https://docs.rs/tokio-util/latest/tokio_util/sync/struct.CancellationToken.html).

@@ -1,7 +1,7 @@
# Outro
Rust's asynchronous model is quite powerful, but it does introduce additional
complexity. Take time to know your tools: dive deep into `tokio`'s documentation
and get familiar with its primitives to make the most out of it.
Keep in mind, as well, that there is ongoing work at the language and `std` level
@@ -10,25 +10,25 @@ rough edges in your day-to-day work due to some of these missing pieces.
A few recommendations for a mostly-pain-free async experience:
- **Pick a runtime and stick to it.**\
Some primitives (e.g. timers, I/O) are not portable across runtimes. Trying to
mix runtimes is likely to cause you pain. Trying to write code that's runtime
agnostic can significantly increase the complexity of your codebase. Avoid it
if you can.
- **There is no stable `Stream`/`AsyncIterator` interface yet.**\
An `AsyncIterator` is, conceptually, an iterator that yields new items
asynchronously. There is ongoing design work, but no consensus (yet).
If you're using `tokio`, refer to [`tokio_stream`](https://docs.rs/tokio-stream/latest/tokio_stream/)
as your go-to interface.
- **Be careful with buffering.**\
It is often the cause of subtle bugs. Check out
["Barbara battles buffered streams"](https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_battles_buffered_streams.html)
for more details.
- **There is no equivalent of scoped threads for asynchronous tasks**.\
Check out ["The scoped task trilemma"](https://without.boats/blog/the-scoped-task-trilemma/)
for more details.
Don't let these caveats scare you: asynchronous Rust is being used effectively
at _massive_ scale (e.g. AWS, Meta) to power foundational services.\
You will have to master it if you're planning on building networked applications
in Rust.