Wednesday, 14 September 2011

How does HashMap work in Java

We all know and must have used HashMap, its a Map interface implementation. It stores keys and its corresponding value. Keys cannot contain duplicates and can contain at the most one null key (HashTable does not allow null as key, FYI). It has non synchronized methods unlike HashTable. It does not guarantee the order of retrieval (it changes every time the HashMap is modified) etc etc. Enough of this now..

So how does HashMap work?
Basically when this question is asked it generally means how is the object stored and retrieved from the     HashMap. But of course with the get and put methods in the Map API. Easy isn't it, but that is not what
meets the eye. We need to understand what goes inside to understand the answer to the title of this topic.

So what happens when you put a key/value pair in the HashMap, Firstly, hashCode of the key is retrieved and supplied to hash function to defend against poorly constructed hashCode implementation; to get a new hash value. This value is then used to figure out which bucket the entry (Entry is an static inner class within the HashMap structure, which stores the key, its associated value and the reference to the next entry, it is created whenever you try to put an new mapping in the HashMap; Buckets are nothing but an array of Entry objects) should belong to, using another method named indexFor which calculates the index for the bucket. (Initially, the number of buckets within the HashMap is equal to in the initial capacity of the Map,FYI).
Once the index is found, if there is an entry at that location in the bucket; then that entry's hash value and key is checked for equality with the new key if they are same the old value is replaced, else the old entry is marked as successor to the new entry forming a singly linked list. If there is no entry at the calculated index location; the new value is stored at that location.

What happens when you retrieve the key from the Map using get? If you understand how the above put logic works, retrieval logic is easy. We first, get the hash value by applying the hash function to the hashCode of the key. This value is then used to calculate the index value of the bucket. Using this index we get the entry, if the hash value  and the key of the retrieved entry and key passed is the same then the value is returned, else we traverse the linked list till we get the value. If not we return null.

This above understanding helps us to understand various questions
  • What is HashMap? Why do we use it?
  • How does HashMap work? How does the get method work?
  • What will happen when two different objects has same hashCode, (during both insertion and retrieval)
Another interesting thing to note is what happens when the capacity of the HashMap is reached.  As said in the java docs, when the number of entries exceeds the product of the load factor and the current capacity, the capacity is roughly doubled by calling the rehash method. Note, when rehashing happens, issues like race conditions can occur when the HashMap is used concurrently.
Also, the iterator returned by some of the methods in the class are fail-fast; if the Map is structurally modified after the creation of the iterator it will throw ConcurrentModificationException. This fact is recorded via the modCount (modification count) whenever an mapping is added or removed.

Hope this helps you understand why a correct implementation of hashCode and equals methods are so important to a class when its objects are used in a collections (based on principle of hashing)  and explains the strange behavior of these collections sometimes.


Tuesday, 13 September 2011

Immutable classes

An object is said to be mutable when the state of the object can be changed E.x. through the setter methods mostly (of course there are other alternatives to change the state, which we will understand once we step through the guidelines of creating an immutable class). Immutable objects are objects whose state cannot be changed after they are created. Once created their state remains the same till the lifetime of the application (rather every life of the application).

Why would one create an immutable objects?  

  • Immutability objects find their use in concurrent applications/ multi-threaded applications, such an object is always thread safe which means threads wont see an inconsistent state of such an object,so you don't have to synchronize access to them across threads.
  • They also are good candidate in Hash based collections like HashSet  (they need to override equals and hashCode methods). 
  • You can freely share and cache references to immutable objects without having to copy or clone them; you can cache their fields or the results of their methods without worrying about the values becoming stale or inconsistent with the rest of the object's state. 
  • Wrapper classes in java language like Integer, Short, String etc are immutable. 

    So how would you create an Immutable class or what are steps to create one?
    1. Don't provide "setter" methods — methods that modify fields or objects referred to by fields.
    2. Make all fields final and private.
    3. Don't allow subclasses to override methods. The simplest way to do this is to declare the class as final. A more sophisticated approach is to make the constructor private and construct instances in factory methods.
    4. If the instance fields include references to mutable objects, don't allow those objects to be changed:
      • Don't provide methods that modify the mutable objects.
      • Don't share references to the mutable objects. Never store references to external, mutable objects passed to the constructor; if necessary, create copies, and store references to the copies. Similarly, create copies of your internal mutable objects when necessary to avoid returning the originals in your methods.