HashMap Internal Working in Java: Step-by-Step Guide

Related Courses

HashMap Internal Working in Java (Step-by-Step Breakdown)

Whenever you need to store data as key–value pairs with fast lookup, HashMap is usually the first choice. It powers configurations, caches, session storage, dictionaries, mappings between IDs and objects, and much more.

On the surface, using a HashMap looks simple: you put a key and a value, and later you get the value back using the key. But internally, HashMap is doing a lot of intelligent work to make these operations fast, even when the map holds thousands or millions of entries.

In this detailed guide, we will explore the internal working of HashMap in Java step by step, in human-friendly language and without using code, so you can build a solid mental model of what's happening behind the scenes.

1. What Is a HashMap, Conceptually?

Conceptually, a HashMap is:

  • A collection of key–value pairs

  • Where each key is unique

  • That allows very fast operations like:

    • Insert (put)

    • Retrieve (get)

    • Remove (delete)

  • With average-case time complexity close to O(1) for these operations

The core idea behind HashMap is to use a hashing function to convert a key into a numeric hash, and then use that hash to decide where to store the key value pair in memory.

2. Core Building Blocks of a HashMap

To understand the internal working, it helps to break the structure into components.

2.1 The Bucket Array

At its core, a HashMap maintains an array internally.
Each slot in this array is often called a bucket.

  • This bucket array is where all key–value entries are stored (directly or indirectly).

  • The array has a capacity (for example, 16, 32, 64…).

  • Each bucket can hold zero, one, or multiple entries.

The position of a key–value pair in this array is decided by its hash value.

2.2 Nodes / Entries

Each stored key–value pair is represented as a node (often called an "entry").
A node logically contains:

  • The key

  • The value

  • The hash of the key

  • A reference to the next node in the same bucket (for handling collisions)

You can imagine a bucket as a small linked list or tree of entries that share the same bucket index.

2.3 Hash Function

The hash function is what converts a key (which could be a string, number, object, etc.) into an integer hash value.
The process is roughly:

  1. The key provides its own hashCode (from the key's class).

  2. HashMap further processes this hashCode (for better distribution).

  3. That processed value is then mapped to a bucket index using the current array size.

The goal of this processing is to spread keys evenly across all buckets to minimize collisions.

2.4 Load Factor and Capacity

Two important parameters control HashMap's growth:

  • Capacity: Size of the internal bucket array.

  • Load factor: A value (like 0.75) that determines when to resize.

The threshold is calculated as:
threshold = capacity × loadFactor
When the number of stored entries exceeds this threshold, HashMap resizes (expands) and rehashes elements into a larger array.

3. Step-by-Step: What Happens When You Call put(key, value)?

Let's walk through the internal steps when you insert a key–value pair into a HashMap.

Step 1: Compute the Key's Hash

  • The key's own hashCode is obtained.

  • HashMap applies a hash spreading step (for example, mixing high and low bits) to reduce clustering.

Result: A well-distributed integer hash value.

Step 2: Calculate the Bucket Index

The hash value is converted into a bucket index.
Conceptually:

  • Index = some function of hash and array length

  • The result is always within the range 0 to capacity – 1

This decides which bucket the entry should go into.

Step 3: Check the Target Bucket

HashMap looks at the bucket at that index:

  • If the bucket is empty, a new node is created and stored there. Done.

  • If the bucket is not empty, there is already at least one node stored there. This is a collision.

Step 4: Handle Collisions

A collision happens when two different keys map to the same bucket index.
In that case:

  1. HashMap traverses the existing nodes in that bucket.

  2. For each node:

    • It first compares the stored hash.

    • If the hashes match, it compares the keys using equals.

  3. If a matching key is found:

    • The value is updated for that key.

  4. If no matching key is found:

    • The new key–value node is appended to the chain (or inserted into a tree, as we'll see shortly).

So collisions are not fatal; HashMap simply stores multiple entries in the same bucket but links them together.

Step 5: Treeification (Converting Bucket to Tree)

In Java 8 and later, if:

  • Many nodes accumulate in the same bucket (for example, more than a certain threshold), and

  • The overall array is large enough,

then HashMap may convert that bucket's linked list into a balanced tree (typically a Red-Black Tree).
Why?

  • To improve worst-case performance from O(n) to O(log n) for that bucket.

  • In the event of many collisions, lookup will still be reasonably fast.

So a bucket can internally be:

  • Empty

  • A single node

  • A small linked list

  • A balanced tree of nodes

Step 6: Update Size and Possibly Resize

After a successful insertion:

  • HashMap increments its size.

  • It checks whether size > threshold.

  • If yes, it triggers a resize (expansion).

Resizing is one of the most important internal behaviors, so let's look at that next.

4. Resizing and Rehashing: How HashMap Grows

When the HashMap reaches a certain fullness based on the load factor, it grows the internal array to reduce collisions and maintain performance.

4.1 When Does Resizing Happen?

Resizing happens when:
number of entries > capacity × loadFactor
For example:

  • Capacity = 16

  • Load factor = 0.75

  • Threshold = 12

When the 13th entry is inserted, HashMap decides to enlarge the array.

4.2 How Does Resizing Work?

Conceptually:

  1. A new array with double capacity is created.

  2. All existing entries are reassigned to new buckets in the new array.

  3. The index of each key may change because the capacity used in the index calculation has changed.

This process is called rehashing.
It is a relatively heavy operation (O(n)), so it is not done often only when needed.

4.3 Split Optimization (Java 8 Behavior)

Later Java versions include optimizations where:

  • When resizing from capacity N to 2N, elements in a bucket often either:

    • Stay in the same index, or

    • Move to index oldIndex + oldCapacity

This avoids recomputing full hashes, making resizing more efficient.

5. Step-by-Step: What Happens When You Call get(key)?

Retrieving a value by its key is where HashMap shines.

Step 1: Compute Hash

HashMap computes the hash of the key (same logic as in put).

Step 2: Find the Bucket Index

Using the hash and current array length, it calculates the bucket index.

Step 3: Locate the Correct Entry in the Bucket

In that bucket:

  • If it is empty, the key is not present → return null.

  • If it has entries:

    • It traverses the linked list or tree in that bucket.

    • It compares the hash and then the key using equals.

    • Once a match is found, the value associated with that key is returned.

Thanks to good hash distribution and structure optimization, this is usually very fast average O(1).

6. Step-by-Step: What Happens When You Call remove(key)?

Removal is similar to retrieval, but also adjusts the bucket structure.

Step 1: Compute Hash and Bucket Index

Same as get.

Step 2: Traverse Bucket

HashMap searches through the nodes (list or tree) in that bucket.

Step 3: Remove Matching Node

Once it finds a node whose key matches:

  • It adjusts links (or tree pointers) so the node is no longer part of the structure.

  • It reduces the size counter.

In linked list buckets, this means connecting the previous node directly to the next node.

7. The Importance of equals and hashCode

HashMap depends heavily on two methods of the key object:

  • The key's hashCode

  • The key's equals method

7.1 hashCode

  • Must be consistent: always return the same value for the same key while it is in the map.

  • Should be well-distributed: to avoid many collisions.

  • Must follow the contract with equals:

    • If two keys are equal, they must have the same hashCode.

7.2 equals

  • Used to determine key equality.

  • When two keys land in the same bucket, equals decides whether they represent the same key or different keys.

7.3 Why This Matters

If you:

  • Implement an incorrect hashCode or equals,

  • Or use mutable objects as keys and then change their fields used in hashCode/equals,

you can break the map's behavior, leading to:

  • Missing entries during get

  • Duplicated keys

  • Unexpected collisions

Good hashCode and equals are critical for correct HashMap behavior.

8. Time Complexity of HashMap Operations

Let's look at the performance characteristics.

8.1 Average Case

For well-distributed keys:

  • put: O(1)

  • get: O(1)

  • remove: O(1)

This is why HashMap is preferred when fast lookup is needed.

8.2 Worst Case

If many keys end up in the same bucket (due to poor hashing or deliberate attacks):

  • In older Java versions: bucket is a linked list → O(n)

  • In Java 8+ with tree bins: bucket becomes a balanced tree → O(log n)

So the worst case is significantly improved in modern Java versions.

9. Special Behaviors and Features of HashMap

9.1 Null Keys and Null Values

HashMap:

  • Allows one null key

  • Allows multiple null values

Internally, null key is treated as a special case and usually stored in a specific bucket.

9.2 Iteration Order

HashMap does not guarantee any order when iterating over keys or entries.

  • The order may seem stable in small tests, but it is not guaranteed.

  • After resizing or different insertions, the order may change.

If you need predictable iteration order, you should use a LinkedHashMap instead.

9.3 Fail-Fast Iterators

HashMap's iterators are fail-fast:

  • If the map is structurally modified while iterating (other than through the iterator's own remove method), the iteration will throw an exception.

  • This is a safety feature to avoid unpredictable behavior during concurrent modification.

9.4 Not Thread-Safe

HashMap is not synchronized:

  • It is not safe to use concurrently from multiple threads without external synchronization.

  • For thread-safe behavior, one may use ConcurrentHashMap or synchronize externally.

10. Real-World Use Cases of HashMap

HashMap fits naturally into many application scenarios:

10.1 Caches and Lookups

  • Caching database results

  • Mapping user IDs to user profiles

  • Storing configuration properties at runtime

10.2 Indexing and Grouping

  • Grouping items by category

  • Mapping error codes to messages

  • Building small in-memory indexes

10.3 Frequency Counting

  • Counting occurrences of words

  • Counting events per user

  • Aggregating data per key

10.4 Graphs and Complex Data Structures

  • Adjacency lists (mapping nodes to lists of neighbors)

  • Metadata maps for objects

  • Inverting relationships (value → list of keys)

Because HashMap offers a flexible mapping from keys to values with fast access, it appears everywhere in real Java applications.

11. Common Pitfalls and Best Practices

11.1 Avoid Mutable Keys

Using objects as keys and then mutating their key fields after insertion can break the map.
Best practice:
Use immutable keys (like strings, wrapper types, or properly designed immutable objects).

11.2 Implement hashCode and equals Correctly

For custom key classes:

  • Include all relevant fields in both methods.

  • Ensure consistency:

    • If equals returns true for two objects, they must have identical hashCode values.

11.3 Choose a Reasonable Initial Capacity

If you know roughly how many entries you will store:

  • Specify an initial capacity to avoid early resizing.

  • This can significantly improve performance.

11.4 Be Aware of Load Factor

The default load factor (like 0.75) is a good general choice:

  • Lower load factor → fewer collisions but more memory.

  • Higher load factor → more dense map but more collisions.

11.5 Use the Right Map Type

  • HashMap: Fast, unordered

  • LinkedHashMap: Predictable insertion order

  • TreeMap: Sorted keys with log-time operations

  • ConcurrentHashMap: Thread-safe without full locking

Pick one based on needs.

12. Conclusion

HashMap is a fundamental part of Java development. Behind a simple API lies a sophisticated internal design combining:

  • A bucket array

  • Hashing and index calculation

  • Collision handling with linked lists and balanced trees

  • Dynamic resizing and rehashing

  • Careful use of hashCode and equals

By understanding this internal working step by step, you gain:

  • More control over performance

  • Better debugging skills

  • Ability to design correct key classes

  • Confidence when using maps in real applications

HashMap's strength is in its ability to give near-constant time performance for key–value operations in most practical scenarios, making it an essential tool in every Java developer's toolkit. For comprehensive learning, consider enrolling in a structured Java–DSA training program.

FAQs

1. What is the main advantage of HashMap in Java?

HashMap provides very fast insertion, retrieval, and deletion of key–value pairs in average constant time, making it ideal for lookup-heavy operations.

2. How does HashMap find the bucket for a key?

It uses the key's hashCode, processes it to improve distribution, and then maps it to a bucket index based on the current array size.

3. What happens when two keys have the same bucket index?

This is a collision. HashMap stores both entries in the same bucket using a linked list or a balanced tree and uses equals to distinguish between keys.

4. Why did Java 8 introduce trees inside HashMap buckets?

To improve worst-case performance. When many collisions occur in one bucket, using a balanced tree reduces lookup from O(n) to O(log n).

5. Can HashMap store null keys and values?

Yes. It allows one null key and multiple null values.

6. Is HashMap thread-safe?

No. HashMap is not synchronized. For concurrent use, you should use ConcurrentHashMap or other thread-safe mechanisms.

7. Why are hashCode and equals so important for HashMap keys?

Because HashMap relies on hashCode to find the bucket and equals to confirm key equality. Poor implementations can lead to incorrect behavior and bad performance. For comprehensive learning, consider a Java full stack developer course in Hyderabad to master these concepts.