Files

Luca Palmieri 99591a715e Formatter (#51 )

Enforce consistent formatting use `dprint`

2024-05-24 17:00:03 +02:00

3.5 KiB

Raw Blame History

`HashMap`

Our implementation of Index/IndexMut is not ideal: we need to iterate over the entire Vec to retrieve a ticket by id; the algorithmic complexity is O(n), where n is the number of tickets in the store.

We can do better by using a different data structure for storing tickets: a HashMap<K, V>.

use std::collections::HashMap;

// Type inference lets us omit an explicit type signature (which
// would be `HashMap<String, String>` in this example).
let mut book_reviews = HashMap::new();

book_reviews.insert(
    "Adventures of Huckleberry Finn".to_string(),
    "My favorite book.".to_string(),
);

HashMap works with key-value pairs. It’s generic over both: K is the generic parameter for the key type, while V is the one for the value type.

The expected cost of insertions, retrievals and removals is constant, O(1). That sounds perfect for our usecase, doesn’t it?

Key requirements

There are no trait bounds on HashMap’s struct definition, but you’ll find some on its methods. Let’s look at insert, for example:

// Slightly simplified
impl<K, V> HashMap<K, V>
where
    K: Eq + Hash,
{
    pub fn insert(&mut self, k: K, v: V) -> Option<V> {
        // [...]
    }
}

The key type must implement the Eq and Hash traits.
Let’s dig into those two.

`Hash`

A hashing function (or hasher) maps a potentially infinite set of a values (e.g. all possible strings) to a bounded range (e.g. a u64 value).
There are many different hashing functions around, each with different properties (speed, collision risk, reversibility, etc.).

A HashMap, as the name suggests, uses a hashing function behind the scene. It hashes your key and then uses that hash to store/retrieve the associated value. This strategy requires the key type must be hashable, hence the Hash trait bound on K.

You can find the Hash trait in the std::hash module:

pub trait Hash {
    // Required method
    fn hash<H>(&self, state: &mut H)
       where H: Hasher;
}

You will rarely implement Hash manually. Most of the times you’ll derive it:

#[derive(Hash)]
struct Person {
    id: u32,
    name: String,
}

`Eq`

HashMap must be able to compare keys for equality. This is particularly important when dealing with hash collisions—i.e. when two different keys hash to the same value.

You may wonder: isn’t that what the PartialEq trait is for? Almost!
PartialEq is not enough for HashMap because it doesn’t guarantee reflexivity, i.e. a == a is always true.
For example, floating point numbers (f32 and f64) implement PartialEq, but they don’t satisfy the reflexivity property: f32::NAN == f32::NAN is false.
Reflexivity is crucial for HashMap to work correctly: without it, you wouldn’t be able to retrieve a value from the map using the same key you used to insert it.

The Eq trait extends PartialEq with the reflexivity property:

pub trait Eq: PartialEq {
    // No additional methods
}

It’s a marker trait: it doesn’t add any new methods, it’s just a way for you to say to the compiler that the equality logic implemented in PartialEq is reflexive.

You can derive Eq automatically when you derive PartialEq:

#[derive(PartialEq, Eq)]
struct Person {
    id: u32,
    name: String,
}

`Eq` and `Hash` are linked

There is an implicit contract between Eq and Hash: if two keys are equal, their hashes must be equal too. This is crucial for HashMap to work correctly. If you break this contract, you’ll get nonsensical results when using HashMap.

3.5 KiB Raw Blame History Unescape Escape

HashMap