9.1 KiB
Locks, Send and
Arc
The patching strategy you just implemented has a major drawback: it’s
racy.
If two clients send patches for the same ticket roughly at same time,
the server will apply them in an arbitrary order. Whoever enqueues their
patch last will overwrite the changes made by the other client.
Version numbers
We could try to fix this by using a version
number.
Each ticket gets assigned a version number upon creation, set to
0.
Whenever a client sends a patch, they must include the current version
number of the ticket alongside the desired changes. The server will only
apply the patch if the version number matches the one it has stored.
In the scenario described above, the server would reject the second patch, because the version number would have been incremented by the first patch and thus wouldn’t match the one sent by the second client.
This approach is fairly common in distributed systems (e.g. when
client and servers don’t share memory), and it is known as
optimistic concurrency control.
The idea is that most of the time, conflicts won’t happen, so we can
optimize for the common case. You know enough about Rust by now to
implement this strategy on your own as a bonus exercise, if you want
to.
Locking
We can also fix the race condition by introducing a
lock.
Whenever a client wants to update a ticket, they must first acquire a
lock on it. While the lock is active, no other client can modify the
ticket.
Rust’s standard library provides two different locking primitives:
Mutex<T> and RwLock<T>.
Let’s start with Mutex<T>. It stands for
mutual exclusion, and it’s the
simplest kind of lock: it allows only one thread to access the data, no
matter if it’s for reading or writing.
Mutex<T> wraps the data it protects, and it’s
therefore generic over the type of the data.
You can’t access the data directly: the type system forces you to
acquire a lock first using either Mutex::lock or
Mutex::try_lock. The former blocks until the lock is
acquired, the latter returns immediately with an error if the lock can’t
be acquired.
Both methods return a guard object that dereferences to the data,
allowing you to modify it. The lock is released when the guard is
dropped.
use std::sync::Mutex;
// An integer protected by a mutex lock
let lock = Mutex::new(0);
// Acquire a lock on the mutex
let mut guard = lock.lock().unwrap();
// Modify the data through the guard,
// leveraging its `Deref` implementation
*guard += 1;
// The lock is released when `data` goes out of scope
// This can be done explicitly by dropping the guard
// or happen implicitly when the guard goes out of scope
drop(guard)Locking granularity
What should our Mutex wrap?
The simplest option would be the wrap the entire
TicketStore in a single Mutex.
This would work, but it would severely limit the system’s performance:
you wouldn’t be able to read tickets in parallel, because every read
would have to wait for the lock to be released.
This is known as coarse-grained locking.
It would be better to use fine-grained locking, where each ticket is protected by its own lock. This way, clients can keep working with tickets in parallel, as long as they aren’t trying to access the same ticket.
// The new structure, with a lock for each ticket
struct TicketStore {
tickets: BTreeMap<TicketId, Mutex<Ticket>>,
}This approach is more efficient, but it has a downside:
TicketStore has to become aware of the
multithreaded nature of the system; up until now,
TicketStore has been blissfully ignoring the existence of
threads.
Let’s go for it anyway.
Who holds the lock?
For the whole scheme to work, the lock must be passed to the client
that wants to modify the ticket.
The client can then directly modify the ticket (as if they had a
&mut Ticket) and release the lock when they’re
done.
This is a bit tricky.
We can’t send a Mutex<Ticket> over a channel, because
Mutex is not Clone and we can’t move it out of
the TicketStore. Could we send the MutexGuard
instead?
Let’s test the idea with a small example:
use std::thread::spawn;
use std::sync::Mutex;
use std::sync::mpsc::sync_channel;
fn main() {
let lock = Mutex::new(0);
let (sender, receiver) = sync_channel(1);
let guard = lock.lock().unwrap();
spawn(move || {
receiver.recv().unwrap();
});
// Try to send the guard over the channel
// to another thread
sender.send(guard);
}The compiler is not happy with this code:
error[E0277]: `MutexGuard<'_, i32>` cannot be sent between
threads safely
--> src/main.rs:10:7
|
10 | spawn(move || {
| _-----_^
| | |
| | required by a bound introduced by this call
11 | | receiver.recv().unwrap();
12 | | });
| |_^ `MutexGuard<'_, i32>` cannot be sent between threads safely
|
= help: the trait `Send` is not implemented for
`MutexGuard<'_, i32>`, which is required by
`{closure@src/main.rs:10:7: 10:14}: Send`
= note: required for `Receiver<MutexGuard<'_, i32>>`
to implement `Send`
note: required because it's used within this closure
MutexGuard<'_, i32> is not Send: what
does it mean?
Send
Send is a marker trait that indicates that a type can be
safely transferred from one thread to another.
Send is also an auto-trait, just like Sized;
it’s automatically implemented (or not implemented) for your type by the
compiler, based on its definition.
You can also implement Send manually for your types, but it
requires unsafe since you have to guarantee that the type
is indeed safe to send between threads for reasons that the compiler
can’t automatically verify.
Channel requirements
Sender<T>, SyncSender<T> and
Receiver<T> are Send if and only if
T is Send.
That’s because they are used to send values between threads, and if the
value itself is not Send, it would be unsafe to send it
between threads.
MutexGuard
MutexGuard is not Send because the
underlying operating system primitives that Mutex uses to
implement the lock require (on some platforms) that the lock must be
released by the same thread that acquired it.
If we were to send a MutexGuard to another thread, the lock
would be released by a different thread, which would lead to undefined
behavior.
Our challenges
Summing it up:
- We can’t send a
MutexGuardover a channel. So we can’t lock on the server-side and then modify the ticket on the client-side. - We can send a
Mutexover a channel because it’sSendas long as the data it protects isSend, which is the case forTicket. At the same time, we can’t move theMutexout of theTicketStorenor clone it.
How can we solve this conundrum?
We need to look at the problem from a different angle. To lock a
Mutex, we don’t need an owned value. A shared reference is
enough, since Mutex uses internal mutability:
impl<T> Mutex<T> {
// `&self`, not `self`!
pub fn lock(&self) -> LockResult<MutexGuard<'_, T>> {
// Implementation details
}
}It is therefore enough to send a shared reference to the
client.
We can’t do that directly, though, because the reference would have to
be 'static and that’s not the case.
In a way, we need an “owned shared reference”. It turns out that Rust
has a type that fits the bill: Arc.
Arc to the rescue
Arc stands for atomic reference
counting.
Arc wraps around a value and keeps track of how many
references to the value exist. When the last reference is dropped, the
value is deallocated.
The value wrapped in an Arc is immutable: you can only get
shared references to it.
use std::sync::Arc;
let data: Arc<u32> = Arc::new(0);
let data_clone = Arc::clone(&data);
// `Arc<T>` implements `Deref<T>`, so can convert
// a `&Arc<T>` to a `&T` using deref coercion
let data_ref: &u32 = &data;If you’re having a déjà vu moment, you’re right: Arc
sounds very similar to Rc, the reference-counted pointer we
introduced when talking about interior mutability. The difference is
thread-safety: Rc is not Send, while
Arc is. It boils down to the way the reference count is
implemented: Rc uses a “normal” integer, while
Arc uses an atomic integer, which can be
safely shared and modified across threads.
Arc<Mutex<T>>
If we pair Arc with Mutex, we finally get a
type that:
- Can be sent between threads, because:
ArcisSendifTisSend, andMutexisSendifTisSend.TisTicket, which isSend.
- Can be cloned, because
ArcisCloneno matter whatTis. Cloning anArcincrements the reference count, the data is not copied. - Can be used to modify the data it wraps, because
Arclets you get a shared reference toMutex<T>which can in turn be used to acquire a lock.
We have all the pieces we need to implement the locking strategy for our ticket store.
Further reading
- We won’t be covering the details of atomic operations in this
course, but you can find more information in the
stddocumentation as well as in the “Rust atomics and locks” book.