C++ std::unordered_map: Essential Methods and Practical Uses

C++ std::unordered_map is a hash table–based associative container that stores key–value pairs and provides average-case O(1) lookup, insertion, and deletion (insertion is amortized, since the occasional rehash costs O(n)). It’s a go-to choice when you need fast access by key and don’t care about ordering.

This post covers the most useful methods, when to use them, and common patterns you’ll reach for in real code.

Prerequisites / C++ version notes

  • contains() is C++20 (compile with e.g. -std=c++20).
  • try_emplace() and insert_or_assign() are C++17.
  • The container and most methods shown are C++11, but the snippets use structured bindings and if-with-initializer, which require C++17.

References (cppreference):

  • std::unordered_map: https://en.cppreference.com/w/cpp/container/unordered_map
  • operator[]: https://en.cppreference.com/w/cpp/container/unordered_map/operator_at
  • try_emplace: https://en.cppreference.com/w/cpp/container/unordered_map/try_emplace
  • insert_or_assign: https://en.cppreference.com/w/cpp/container/unordered_map/insert_or_assign
  • reserve / rehash: https://en.cppreference.com/w/cpp/container/unordered_map/reserve

A quick mental model (why “average O(1)” isn’t guaranteed)

An unordered_map keeps buckets (an array) and uses a hash function to map each key to a bucket index. When multiple keys land in the same bucket (collisions), the container must resolve them (implementation-dependent, but conceptually it means more work within that bucket).

  • With a good hash and a controlled load factor (roughly “elements per bucket”), operations are typically O(1) on average.
  • In the worst case, many keys collide into one bucket (e.g., poor hash or adversarial inputs), and lookup/insert/erase can degrade toward O(n).

If you accept untrusted keys (e.g., network/user input), hash quality and collision behavior matter more.
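To make the degradation concrete, here’s a sketch with a deliberately terrible hash (`BadHash` and `worst_bucket_size` are invented for illustration) that sends every key to the same bucket. On typical chained implementations, lookups in that bucket become linear scans:

```cpp
#include <cstddef>
#include <unordered_map>

// Deliberately terrible hash: every key maps to the same bucket.
struct BadHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Insert n keys, then report how many elements share the bucket holding `key`.
std::size_t worst_bucket_size(int n, int key) {
    std::unordered_map<int, int, BadHash> m;
    for (int i = 0; i < n; ++i) m[i] = i;
    // All keys hash to 0, so they all land in one bucket: lookups walk
    // a chain of up to n nodes, i.e. O(n) rather than O(1).
    return m.bucket_size(m.bucket(key));
}
```

This is exactly the shape of a hash-flooding attack: if an attacker controls the keys and can predict the hash, they can manufacture this worst case on purpose.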

When to choose unordered_map

Use std::unordered_map<Key, T> when:

  • You need fast key-based access and updates.
  • The order of keys doesn’t matter.
  • You can provide a hash (built-in or custom) for the key type.

Prefer std::map when you need ordered traversal, range queries, or stable iteration order.

Basic anatomy

#include <unordered_map>
#include <string>

std::unordered_map<std::string, int> counts;

Key points:

  • Keys are unique.
  • Iteration order is unspecified and can change after rehashing.
  • Performance depends on hash quality and load factor.

Core methods you’ll use constantly

operator[]: insert-or-access (with caveats)

counts["apple"] += 1;  // inserts "apple" with value 0 if missing, then increments

operator[] is convenient, but:

  • It inserts when the key is absent (value-initializing the mapped value, so an int starts at 0).
  • It requires mapped_type to be default-insertable. If T has no default constructor, m[key] won’t compile.

If you only want to read without inserting, prefer find() / contains() / at().

at(): bounds-checked access

int n = counts.at("apple"); // throws std::out_of_range if missing

Use at() when absence is exceptional and you want an explicit failure.

find(): safe lookup without insertion

auto it = counts.find("apple");
if (it != counts.end()) {
    // it->first is key, it->second is value
    it->second += 1;
}

find() is the standard way to check existence and get the value without modifying the map.

contains() (C++20): existence check

if (counts.contains("apple")) {
    // key exists
}

contains() is a cleaner alternative to find() != end() when you don’t need the iterator.

insert() vs emplace(): add elements

Both return a pair (iterator, bool) indicating where the element is and whether insertion happened. If the key already exists, no insertion occurs and the bool is false.

insert()

auto [it, inserted] = counts.insert({"banana", 3});
if (!inserted) {
    // key already existed; it points to the existing element
}

emplace()

auto [it, inserted] = counts.emplace("banana", 3);
if (inserted) {
    // inserted a new element
}

emplace() constructs the element in-place and can avoid copies for complex types.

try_emplace() and insert_or_assign(): modern update patterns (C++17)

try_emplace(): insert if missing, don’t overwrite

std::unordered_map<std::string, std::string> cache;
cache.try_emplace("key", "expensive-to-build");

If the key exists, the mapped value is never constructed (the arguments are still evaluated, but construction of T from them is skipped)—useful when building T from its arguments is the expensive part.

insert_or_assign(): upsert

cache.insert_or_assign("key", "new-value");

This inserts a new key or overwrites the existing value.
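Like insert() and emplace(), insert_or_assign() returns an (iterator, bool) pair; the bool tells you whether a fresh insertion happened (true) or an existing value was overwritten (false). A small sketch (`upsert` is a name I’m inventing here):

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Returns true if the call inserted a new key, false if it assigned
// over an existing one.
bool upsert(std::unordered_map<std::string, std::string>& m,
            const std::string& key, std::string value) {
    auto [it, inserted] = m.insert_or_assign(key, std::move(value));
    return inserted;
}
```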

erase(): remove by key or iterator

counts.erase("banana");

auto it = counts.find("apple");
if (it != counts.end()) counts.erase(it);

A small transition: from “API” to “using it”

Once you can safely look up, insert, and update, the next questions are (1) how to iterate, and (2) how to keep performance predictable as the table grows.

Iteration patterns

for (const auto& [key, value] : counts) {
    // key and value
}

Remember: ordering is unspecified.

Hash table tuning: buckets, load factor, reserve, rehash

unordered_map stores elements in buckets. As the container grows, it may rehash (allocate a new bucket array and redistribute elements), which changes bucket assignment and typically changes iteration order.

reserve() semantics (it’s about elements, not buckets)

std::unordered_map<int, int> m;
m.reserve(100000); // plan for ~100k elements

reserve(n) requests enough buckets so that n elements can be inserted without exceeding max_load_factor(). It may still rehash immediately to satisfy that constraint.

Relationship (conceptually):

  • target bucket count ≈ ceil(n / max_load_factor())

So if you lower max_load_factor(), the same reserve(n) will typically result in more buckets (more memory, fewer collisions).
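A small sketch of that relationship. Exact bucket counts are implementation-defined; the only guarantee is the inequality in the comment (`reserve_honored` is a made-up helper):

```cpp
#include <cstddef>
#include <unordered_map>

// After reserve(n), the table must have enough buckets that n elements
// fit without exceeding max_load_factor(), i.e.
//   bucket_count() >= n / max_load_factor()
bool reserve_honored(std::size_t n, float mlf) {
    std::unordered_map<int, int> m;
    m.max_load_factor(mlf);
    m.reserve(n);
    return static_cast<float>(m.bucket_count()) >= static_cast<float>(n) / mlf;
}
```

Halving max_load_factor() roughly doubles the bucket count reserve() must provide for the same n.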

max_load_factor(): control density

m.max_load_factor(0.7f);

Lower load factor can improve lookup speed at the cost of memory.

bucket_count() and load_factor(): inspect

auto buckets = m.bucket_count();
auto lf = m.load_factor();

Rule of thumb (avoid premature tuning)

  • Call reserve() when you can reasonably estimate size (big win, low risk).
  • Only tune max_load_factor() after profiling and measuring memory/speed tradeoffs.

Practical use cases (with idiomatic snippets)

Frequency counting

std::unordered_map<std::string, int> freq;
for (const auto& w : words) {
    ++freq[w];
}

Grouping values by key

std::unordered_map<std::string, std::vector<int>> groups;
for (const auto& [category, value] : items) {
    groups[category].push_back(value);
}

Deduplication / membership tests

Idiomatic solution: use std::unordered_set when you only need membership.

#include <unordered_set>

std::unordered_set<int> seen;
if (seen.insert(id).second) {
    // first time seeing id
}

You can emulate a set with a map (e.g., mapping every key to a dummy value), but it’s heavier and easier to misuse.

Caching computed results (memoization)

std::unordered_map<long long, long long> memo;

long long fib(long long n) {
    if (n <= 1) return n;

    if (auto it = memo.find(n); it != memo.end())
        return it->second;

    auto v = fib(n - 1) + fib(n - 2);
    memo.try_emplace(n, v); // avoids overwriting if another path inserted
    return v;
}

Notes:

  • Recursion has overhead and can hit stack limits for large n; iterative DP is often better.
  • unordered_map is not thread-safe for concurrent writes; synchronize if used across threads.

Fast lookup tables (ID → object)

struct User { /* ... */ };
std::unordered_map<int, User> users;

users.try_emplace(42, User{/*...*/}); // avoids constructing/moving if key already exists

if (auto it = users.find(42); it != users.end()) {
    User& u = it->second;
    // update u
}

Even with emplace(42, User{...}), the User{...} temporary may be constructed (and then moved) before insertion is known to succeed. try_emplace is the tool when “don’t build it if it already exists” matters.

If objects are large or non-movable, store std::unique_ptr<User> instead.
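To observe that difference, here’s a sketch using a construction counter (`Probe` and `emplace_vs_try_emplace` are invented for illustration; only the int-taking constructor counts, so moves and copies don’t inflate the tally):

```cpp
#include <unordered_map>

// Probe counts how many times it is constructed from an int, so we can
// see whether a call actually built a mapped value.
struct Probe {
    static inline int constructions = 0;  // C++17 inline static
    int id;
    explicit Probe(int i) : id(i) { ++constructions; }
};

int emplace_vs_try_emplace() {
    std::unordered_map<int, Probe> m;
    m.try_emplace(42, 1);     // key absent: Probe constructed once
    m.try_emplace(42, 2);     // key present: Probe NOT constructed
    m.emplace(42, Probe{3});  // key present, but the temporary Probe{3}
                              // is built at the call site regardless
    return Probe::constructions;
}
```

The count comes out to 2: one construction for the first try_emplace, none for the second, and one for the temporary that emplace() forces you to build before the key check can happen.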

End-to-end mini-example (reserve + count + safe lookup + erase)

#include <unordered_map>
#include <string>
#include <vector>

void process(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> freq;
    freq.reserve(words.size());

    for (const auto& w : words) {
        ++freq[w];
    }

    // Safe read without inserting:
    if (auto it = freq.find("the"); it != freq.end()) {
        // ... use it->second
    }

    // Remove rare words:
    for (auto it = freq.begin(); it != freq.end(); ) {
        if (it->second == 1) it = freq.erase(it);
        else ++it;
    }
}

Transition: from patterns to hazards

The patterns above are why unordered_map is popular. The next section covers the issues that most often cause surprising bugs or performance cliffs.

Common pitfalls and gotchas

Accidental insertion with operator[]

std::unordered_map<std::string, int> m;
int x = m["missing"]; // inserts "missing" with value 0

Use find(), contains(), or at() when you don’t want to mutate the map.

Invalidation rules (rehash vs erase)

Be precise about what invalidates what:

  • erase(it) / erase(key): invalidates iterators/references/pointers to the erased element(s). Other iterators remain valid (no rehash implied).
  • Rehashing (triggered by insertions that push load_factor() past max_load_factor(), or by explicit rehash() / reserve()):
    • Iterators are invalidated.
    • References and pointers to elements stay valid: the standard guarantees that rehashing does not invalidate pointers or references to elements (the containers are node-based). Rehashing does change bucket assignment and iteration order, so don’t rely on element order across a rehash.

If you keep iterators around, consider:

  • Calling reserve() up front.
  • Avoiding storing iterators across insertions that might rehash.

(See cppreference notes on iterator invalidation under unordered_map.)
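A small sketch of the distinction: an iterator taken before a rehash is dead afterward, but a pointer to an element still refers to the same object (`stable_across_rehash` is a made-up helper):

```cpp
#include <unordered_map>

// Per the standard's unordered-container guarantees, rehashing
// invalidates iterators but not pointers/references to elements.
bool stable_across_rehash() {
    std::unordered_map<int, int> m;
    m[1] = 10;
    int* before = &m[1];  // pointer to the mapped value
    m.rehash(4096);       // force a rehash: iterators are now invalid...
    return before == &m[1] && *before == 10;  // ...the pointer still works
}
```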

Custom keys: provide hash and equality

For user-defined key types, you need a hash functor and equality comparison.

#include <cstddef>
#include <functional>
#include <unordered_map>

struct Point {
    int x, y;
};

struct PointHash {
    std::size_t operator()(const Point& p) const noexcept {
        // Simple combiner for demonstration.
        // For production, prefer a stronger mix (e.g., boost::hash_combine-style).
        std::size_t h1 = std::hash<int>{}(p.x);
        std::size_t h2 = std::hash<int>{}(p.y);
        return h1 ^ (h2 + 0x9e3779b97f4a7c15ull + (h1 << 6) + (h1 >> 2));
    }
};

struct PointEq {
    bool operator()(const Point& a, const Point& b) const noexcept {
        return a.x == b.x && a.y == b.y;
    }
};

std::unordered_map<Point, int, PointHash, PointEq> grid;

Notes:

  • Naive XOR/shift combinations can cause unnecessary collisions.
  • Mark hash/equality noexcept where possible; it can enable optimizations and matches common expectations.

Heterogeneous lookup (e.g., std::string vs std::string_view)

By default, unordered_map<std::string, T>::find() and friends take the key as const std::string&, so looking up with a std::string_view or const char* first builds a temporary std::string (often an allocation).

In C++20+, heterogeneous lookup is possible with transparent hashing/equality (is_transparent), allowing lookups by std::string_view without allocating a std::string. Support depends on providing compatible hash/equality types.

If you care about string_view lookups, search for “transparent hashing is_transparent unordered_map” and verify your standard library’s support.

Choosing between unordered_map and unordered_set

  • Use unordered_map when you need key → value.
  • Use unordered_set when you only need membership.

If the “value” is just a flag, a set is often simpler and more memory-efficient.

Allocators / std::pmr::unordered_map (advanced)

For allocator-sensitive or high-performance scenarios (lots of allocations), consider std::pmr::unordered_map with a std::pmr::memory_resource to reduce allocation overhead and improve locality.
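A hedged sketch of the pmr variant, using a monotonic_buffer_resource backed by a stack buffer (the buffer size and `pmr_demo` helper are arbitrary choices for illustration):

```cpp
#include <cstddef>
#include <memory_resource>
#include <unordered_map>

// All node allocations come out of one stack buffer (falling back to the
// default resource if it fills) instead of many individual heap calls.
int pmr_demo() {
    std::byte buffer[64 * 1024];
    std::pmr::monotonic_buffer_resource arena{buffer, sizeof(buffer)};
    std::pmr::unordered_map<int, int> m{&arena};
    for (int i = 0; i < 100; ++i) m[i] = i * i;
    return m.at(7);
}
```

A monotonic resource never frees individual nodes (everything is released when the arena is destroyed), which is a good fit for build-then-discard maps and a poor fit for long-lived, churny ones.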

Quick method cheat sheet

  • Lookup: find(), contains() (C++20), at()
  • Insert: emplace(), insert(), try_emplace() (C++17)
  • Update: insert_or_assign() (C++17), operator[]
  • Remove: erase(key), erase(it), clear()
  • Capacity/tuning: reserve(), rehash(), load_factor(), max_load_factor(), bucket_count()

Closing thoughts

std::unordered_map excels as a fast key–value store for counting, caching, grouping, and lookup tables—on average. The key is understanding when that average-case behavior can degrade (poor hashes, high collision rates, adversarial inputs) and choosing the right API (operator[] vs find()/at(), emplace vs try_emplace).

For performance-sensitive code, reserve() is usually the first and best knob to turn; tune load factor only after measuring. For deeper details and edge cases, cppreference is an excellent next stop.