Monday, June 20, 2016

Collections





        

25.          What is difference between Array and ArrayList? When will you use Array over ArrayList?

Arrays can contain primitive or Objects whereas ArrayList can contain only Objects.
Arrays are fixed size whereas ArrayList size is dynamic.
Arrays doesn’t provide a lot of features like ArrayList, such as addAll, removeAll, iterator etc.
using ArrayList over Array is support of Generics. Array doesn't support Generics,

1. What is difference between ArrayList and LinkedList?

ArrayList and LinkedList both implement List interface but there are some differences between them.
A.    ArrayList is an index based data structure backed by Array, so it provides random access to it’s elements with performance as O(1) but LinkedList stores data as list of nodes where every node is linked to it’s previous and next node. So even though there is a method to get the element using index, internally it traverse from start to reach at the index node and then return the element, so performance is O(n) that is slower than ArrayList.
B.    Insertion, addition or removal of an element is faster in LinkedList compared to ArrayList because there is no concept of resizing array or updating index when element is added in middle.
C.    LinkedList consumes more memory than ArrayList because every node in LinkedList stores reference of previous and next elements.

20) When does ConcurrentModificationException occur on iteration?
When you remove object using Collection's or List's remove method e.g. remove(Object element) or remove(int index), instead of Iterator's remove() method than ConcurrentModificationException occur. As per Iterator's contract, if it detect any structural change in Collection e.g. adding or removing of element, once Iterator begins, it can throw ConcurrentModificationException. 

Difference between Set, List and Map in Java - Interview question



Set, List and Map are three important interface of Java collection framework and Difference between Set, List and Map in Java is one of the most frequently asked Java Collection interview question. Some time this question is asked as When to use List, Set and Map in Java. Clearly, interviewer is looking to know that whether you are familiar with fundamentals of Java collection framework or not. In order to decide when to use List, Set or Map , you need to know what are these interfaces and what functionality they provide. List in Java provides ordered and indexed collection which may contain duplicates. Set provides an unordered collection of unique objects, i.e. Set doesn't allow duplicates, while Map provides a data structure based on key value pair and hashing. All three List, Set and Map are interfaces in Java and there are many concrete implementation of them are available in Collection API. ArrayList and LinkedList are two most popular used List implementation while LinkedHashSet, TreeSet and HashSet are frequently used Set implementation. In this Java article we will see difference between Map, Set and List in Java and learn when to use List, Set or Map.

Duplicate Objects
Main difference between List and Set interface in Java is that List allows duplicates while Setdoesn't allow duplicates. All implementation of Set honor this contract. Map  holds two object per Entry e.g. key and value and It may contain duplicate values but keys are always unique.

Order
Another key difference between List and Set is that List is an ordered collection, List's contract maintains insertion order or element. Set is an unordered collection, you get no guarantee on which order element will be stored. Though some of the Set implementation e.g.LinkedHashSet maintains order. Also SortedSet and SortedMap e.g. TreeSet and TreeMapmaintains a sorting order, imposed by using Comparator or Comparable.

Null elements
List allows null elements and you can have many null objects in a List, because it also allowed duplicates. Set just allow one null element as there is no duplicate permitted while in Map you can have null values and at most one null key. worth noting is that Hashtable doesn't allow null key or values but HashMap allows null values and one null keys.  This is also the main difference between these two popular implementation of Map interface, aka HashMap vs Hashtable

Popular implementation
List - ArrayList, LinkedList and Vector
Set - HashSet, TreeSet and LinkedHashSet
Map - HashMap, Hashtable and TreeMap

When to use List, Set and Map in Java

Based upon our understanding of difference between Set, List and Map we can now decide when to use List, Set or Map in Java.
1) If you need to access elements frequently by using index, than List is a way to go. Its implementation e.g. ArrayList provides faster access if you know index.

2) If you want to store elements and want them to maintain an order on which they are inserted into collection then go for List again, as 
List is an ordered collection and maintain insertion order.
3) If you want to create collection of unique elements and don't want any duplicate than choose any Set implementation e.g. HashSet, LinkedHashSet or TreeSet. All Set implementation follow there general contract e.g. uniqueness but also add addition feature e.g. TreeSet is aSortedSet and elements stored on TreeSet can be sorted by using Comparator or Comparable in Java. LinkedHashSet also maintains insertion order.

4) If you store data in form of key and value than Map is the way to go. You can choose from
Hashtable, HashMap, TreeMap based upon your subsequent need. In order to choose between first two see difference between HashSet and HashMap in Java.


is a concurrent Collection class introduced in Java 5 Concurrency API along with its popular cousin ConcurrentHashMap in Java.
CopyOnWriteArrayList implements List interface like ArrayList, Vector and LinkedList but its a thread-safe collection and it achieves its thread-safety in a slightly different way than Vector or other thread-safe collection class.

As name suggest CopyOnWriteArrayList creates copy of underlying ArrayList with every mutation operation e.g. add or set. Normally CopyOnWriteArrayList is very expensive because it involves costly Array copy with every write operation but its very efficient if you have a List where Iteration outnumber mutation e.g. you mostly need to iterate the ArrayList and don't modify it too often.
Iterator of CopyOnWriteArrayList is fail-safe and doesn't throw ConcurrentModificationException even if underlying CopyOnWriteArrayList is modified once Iteration begins because Iterator is operating on separate copy of ArrayList. Consequently all the updates made on CopyOnWriteArrayList is not available to Iterator


Synchronized Collection Methods of Collections class
In JavaSW, normally collections aren't synchronized, which leads to fast performance. However, in multi-threaded situations, it can be very useful for collections to be synchronized. The Java Collections class has several static methods on it that provide synchronized collections. These methods are:
Synchronized Collection Methods of Collections class
Collections.synchronizedCollection(Collection<T> c)
Collections.synchronizedList(List<T> list)
Collections.synchronizedMap(Map<K,V> m)
Collections.synchronizedSet(Set<T> s)
Collections.synchronizedSortedMap(SortedMap<K,V> m)
Collections.synchronizedSortedSet(SortedSet<T> s)
As an example, let's take a normal list (implemented by the ArrayList class) and make it synchronized. This is shown in the SynchronizedListExample class. We pass the Collections.synchronizedList method a new ArrayList of Strings. The method returns a synchronized List of Strings.

SynchronizedListExample.java

package com.cakes;  import java.util.ArrayList; import java.util.Collections; import java.util.Iterator; import java.util.List;  public class SynchronizedListExample {   public static void main(String[] args) {               
List<String> syncList = Collections.synchronizedList(new ArrayList<String>()); 
         syncList.add("one");       
         syncList.add("two");       
         syncList.add("three");               
 
// when iterating over a synchronized list, we need to synchronize access to the synchronized list            
synchronized (syncList) {                     
 
         Iterator<String> iterator = syncList.iterator(); 
                           while (iterator.hasNext()) {
                                     System.out.println("item: " + iterator.next());
                           }        
         }        
} 
Notice that when iterating over the list, this access is still done using a synchronized block that locks on the syncList object. In general, iterating over a synchronized collection should be done in a synchronized block. See the javadocs for the Collections class for more information.
Also, it can be good practice to synchronize your code in other ways when accessing a synchronized collection to ensure that you obtain the behavior you expect.

What are the pros and cons of CopyOnWriteArrayList vs Collections.SynchronizedList? When do we choose one over another?

First lets look at the difference between the two :

CopyOnWriteArrayList implements all mutative operations (add, set, and so on) by making a fresh copy of the underlying array. So there is no additional overhead during a read operation but massive overhead during a write operation

On the other hand, in case of Collections.synchronizedList(), every operation is synchronized even read operations. For example, the code snippet below shows the definition of get(int i) method in the list implementation returned in Collections.synchronizedList() :
1 2 3
public E get(int index) {
            synchronized (mutex) {return list.get(index);}
        }

Also CopyOnWriteArrayList uses ReentrantLock in addition to copying the underlying collection. So, write operations are slower than Collections.synchronizedList()
The iterator in CopyOnWriteArrayList is a fail-safe iterator (i.e just a snapshot iterator)- it doesnt throw ConcurrentModifficationException nor can it be used to update the data

So clearly,
1. CopyOnWriteArrayList has no performance overhead for read opeartions unlike Collections.synchronizedList().

2. For write operations, CopyOnWriteArrayList uses ReentrantLock and creates a backup copy of the data.. Costly operation on writes
.. while Collections.synchronizedList() uses synchronization. 3. CopyOnWriteArrayList provides shapshot-style fail-safe iterator

So,
1. CopyOnWriteArrayList is a good choice when number of reads is significantly higher than number of writes.
2. Collections.synchronizedList() is a good choice when  writes outnumber reads
3. CopyOnWriteArrayList  cannot be used if you want your iterators to detect concurrent modification or if you want to update the collection through the iterator, Use Collections.synchronizedList() in such cases


What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?


There seem to be three different synchronized Map implementations in the Java API:
·       Hashtable
·       Collections.synchronizedMap(Map)
·       ConcurrentHashMap

ConcurrentHashMap

·       You should use ConcurrentHashMap when you need very high concurrency in your project.
·       It is thread safe without synchronizing the whole map.
·       Reads can happen very fast while write is done with a lock.
·       There is no locking at the object level.
·       The locking is at a much finer granularity at a hashmap bucket level.
·       ConcurrentHashMap doesn’t throw a ConcurrentModificationException if one thread tries to modify it while another is iterating over it.
·       ConcurrentHashMap uses multitude of locks.

SynchronizedHashMap

·       Synchronization at Object level.
·       Every read/write operation needs to acquire lock.
·       Locking the entire collection is a performance overhead.
·       This essentially gives access to only one thread to the entire map & blocks all the other threads.
·       It may cause contention.
·       SynchronizedHashMap returns Iterator, which fails-fast on concurrent modification.
·       Use SynchronizedHashMap if you need to ensure data consistency, and each thread needs to have an up-to-date view of the map
·        


1) ConcurrentHashMap locks only portion of Map but SynchronizedMap locks whole MAp.
2) ConcurrentHashMap has better performance over SynchronizedMap and more scalable.
3) In case of multiple reader and Single writer ConcurrentHashMap is best choice.
 4. choose ConcurrentHashMap  - if many update operations and relative small amount of read operations
5. ConcurrentHashMap can guarantee that there is no ConcurrentModificationException (Fail-Safe)
6. Collections.synchronizedMap() is not guaranteed. Throws ConcurrentModificationException  (Fail-Fast).
7. Use the ConcurrentHashMap if performance is critical, and each thread only inserts data to the map, with reads happening less frequently.


1. What are best practices related to Java Collections Framework?

·       Chosing the right type of collection based on the need, for example if size is fixed, we might want to use Array over ArrayList. If we have to iterate over the Map in order of insertion, we need to use TreeMap. If we don’t want duplicates, we should use Set.
·       Some collection classes allows to specify the initial capacity, so if we have an estimate of number of elements we will store, we can use it to avoid rehashing or resizing.
·       Write program in terms of interfaces not implementations, it allows us to change the implementation easily at later point of time.
·       Always use Generics for type-safety and avoid ClassCastException at runtime.
·       Use immutable classes provided by JDK as key in Map to avoid implementation of hashCode() and equals() for our custom class.
·       Use Collections utility class as much as possible for algorithms or to get read-only, synchronized or empty collections rather than writing own implementation. It will enhance code-reuse with greater stability and low maintainability.

Why start an ArrayList with an initial capacity?

Default size of Arraylist is 10.
ArrayDeque() - 16

Because ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed size. When this is filled up, the array will be extended to a double sized one. This operation is costful, so you want as few as possible.
So, if you know your upper bound is 20 items, then creating the array with initial length of 20 is better than using a default of, say, 15 and then resize it to 15*2 = 30 and use only 20.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)

Resizing an ArrayList once or twice isn't going to cost you that much, but if you're executing a query that you know for certain (by analyzing the data or knowing the domain well) will return at least 300 elements, on average 500, and at most 1000, you may as well presize it at 500 at least to avoid ten resizes.

The key to remember is that wasting 500 empty reference slots is only 2k. On today's hardware that doesn't even count as lint. I would prefer that to 10 extra allocations and 10 + 15 + 23 + 35 + 53 + 80 + ... + ~500 unnecessary copying of references.


Iterator : Enables you to traverse through a collection in the forward direction only, for obtaining or removing elements ListIterator : extends Iterator, and allows bidirectional traversal of list and also allows the modification of elements.
·       ListIterator inherits from Iterator interface and comes with extra functionalities like adding an element, replacing an element, getting index position for previous and next elements.

2.     What is the difference between Enumeration and Iterator?
Enumeration
Iterator
Enumeration doesn't have a remove() method
Iterator has a remove() method
Enumeration acts as Read-only interface, because it has the methods only to traverse and fetch the objects
Can be abstract, final, native, static, or synchronized
Note: So Enumeration is used whenever we want to make Collection objects as Read-only.

3.     What are the main implementations of the List interface ?
The main implementations of the List interface are as follows :
  • ArrayList : Resizable-array implementation of the List interface. The best all-around implementation of the List interface.
  • Vector : Synchronized resizable-array implementation of the List interface with additional "legacy methods."
  • LinkedList : Doubly-linked list implementation of the List interface. May provide better performance than the ArrayList implementation if elements are frequently inserted or deleted within the list. Useful for queues and double-ended queues (deques).
4.     What are the advantages of ArrayList over arrays ?
Some of the advantages ArrayList has over arrays are:
  • It can grow dynamically
  • It provides more powerful insertion and search mechanisms than arrays.
5.     Difference between ArrayList and Vector ?
ArrayList
Vector
ArrayList is NOT synchronized by default.
Vector List is synchronized by default.
ArrayList can use only Iterator to access the elements.
Vector list can use Iterator and Enumeration Interface
To access elements.
The ArrayList increases its array size by 50 percent if it runs out of room.
A Vector defaults to doubling the size of its array if it runs out of room
ArrayList has default size 10.
Out of room While vector has a default size of 10.

6.     How to obtain Array from an ArrayList ?
Array can be obtained from an ArrayList using toArray() method on ArrayList.
        List arrayList = new ArrayList();
        arrayList.add(…

        Object  a[] = arrayList.toArray();

7.     Why insertion and deletion in ArrayList is slow compared to LinkedList ?
  • ArrayList internally uses and array to store the elements, when that array gets filled by inserting elements a new array of roughly 1.5 times the size of the original array is created and all the data of old array is copied to new array.
  • During deletion, all elements present in the array after the deleted elements have to be moved one step back to fill the space created by deletion. In linked list data is stored in nodes that have reference to the previous node and the next node so adding element is simple as creating the node an updating the next pointer on the last node and the previous pointer on the new node. Deletion in linked list is fast because it involves only updating the next pointer in the node before the deleted node and updating the previous pointer in the node after the deleted node.
·       LinkedList by nature does not have "capacity", since it does not allocate memory to the items before the items are added to the list. Each item in a LinkedList holds a pointer to the next in the list.
·      
·       There would be no point in allocating memory to the list beforehand, since LinkedList does not have capacity.

8.     Why are Iterators returned by ArrayList called Fail Fast ?

·       Because, if list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

9.     How do you decide when to use ArrayList and When to use LinkedList?
If you need to support random access, without inserting or removing elements from any place other than the end, then ArrayList offers the optimal collection. If, however, you need to frequently add and remove elements from the middle of the list and only access the list elements sequentially, then LinkedList offers the better implementation.

10. What is the Set interface ?

  • The Set interface provides methods for accessing the elements of a finite mathematical set
  • Sets do not allow duplicate elements
  • Contains no methods other than those inherited from Collection
  • It adds the restriction that duplicate elements are prohibited
  • Two Set objects are equal if they contain the same elements
11.What are the main Implementations of the Set interface ?
The main implementations of the List interface are as follows:
  • HashSet
  • TreeSet
  • LinkedHashSet
  • EnumSet
12.       What is a HashSet ?
  • A HashSet is an unsorted, unordered Set.
  • It uses the hashcode of the object being inserted (so the more efficient your hashcode() implementation the better access performance you’ll get).
  • Use this class when you want a collection with no duplicates and you don’t care about order when you iterate through it.
13.       What is a TreeSet ?

TreeSet is a Set implementation that keeps the elements in sorted order. The elements are sorted according to the natural order of elements or by the comparator provided at creation time.

14. Difference between HashSet and TreeSet ?
HashSet
TreeSet
HashSet is under set interface i.e. it  does not guarantee for either sorted order or sequence order.
TreeSet is under set i.e. it provides elements in a sorted
Assending order).
We can add any type of elements to hash set.
We can add only similar types
of elements to tree set.

15. What is a Map ?

  • A map is an object that stores associations between keys and values (key/value pairs).
  • Given a key, you can find its value. Both keys  and  values are objects.
  • The keys must be unique, but the values may be duplicated.
  • Some maps can accept a null key and null values, others cannot.

16.What are the main Implementations of the Map interface ?
The main implementations of the Map interface are as follows:
  • HashMap
  • HashTable
  • TreeMap
  • EnumMap
17. What is a TreeMap ?

TreeMap actually implements the SortedMap interface which extends the Map interface. In a TreeMap the data will be sorted in ascending order of keys according to the natural order for the key's class, or by the comparator provided at creation time. TreeMap is based on the Red-Black tree data structure.

18.How do you decide when to use HashMap and when to use TreeMap ?

For inserting, deleting, and locating elements in a Map, the HashMap offers the best alternative. If, however, you need to traverse the keys in a sorted order, then TreeMap is your better alternative. Depending upon the size of your collection, it may be faster to add elements to a HashMap, then convert the map to a TreeMap for sorted key traversal.

19.Difference between HashMap and Hashtable ?
HashMap
Hashtable
HashMap lets you have null values as well as one null key.
HashTable  does not allows null values as key and value.
The iterator in the HashMap is fail-safe (If you change the map while iterating, you’ll get concurrent modification exception(fail-fast).
The enumerator for the Hashtable is not fail-safe.
HashMap is unsynchronized.not thread safe.
Hashtable is synchronized. thread safe
Note: Only one NULL is allowed as a key in HashMap. HashMap does not allow multiple keys to be NULL. Nevertheless, it can have multiple NULL values.

20.How does a Hashtable internally maintain the key-value pairs?

TreeMap actually implements the SortedMap interface which extends the Map interface. In a TreeMap the data will be sorted in ascending order of keys according to the natural order for the key's class, or by the comparator provided at creation time. TreeMap is based on the Red-Black tree data structure.
21. Q7) What is difference between List and a Set?
Ans)
1) List can contain duplicate values but Set doesnt allow. Set allows only to unique elements. 
2) List allows retrieval of data to be in same order in the way it is inserted but Set doesnt ensures the sequence in which data can be retrieved.(Except HashSet)
22. Q10) Consider a scenario. If an ArrayList has to be iterate to read data only, what are the possible ways and which is the fastest?
Ans) It can be done in two ways, using for loop or using iterator of ArrayList. The first option is faster than using iterator. Because value stored in arraylist is indexed access. So while accessing the value is accessed directly as per the index.
23. Q11) Now another question with respect to above question is if accessing through iterator is slow then why do we need it and when to use it.
Ans) For loop does not allow the updation in the array(add or remove operation) inside the loop whereas Iterator does. Also Iterator can be used where there is no clue what type of collections will be used because all collections have iterator.
22. Q13) Why is it preferred to declare: List<String> list = new ArrayList<String>(); instead of ArrayList<String> = new ArrayList<String>();
Ans) It is preferred because:
1.      If later on code needs to be changed from ArrayList to Vector then only at the declaration place we can do that.
2.     The most important one – If a function is declared such that it takes list. E.g void showDetails(List list);
When the parameter is declared as List to the function it can be called by passing any subclass of List like ArrayList,Vector,LinkedList making the function more flexible
23. Q15) How to sort list in reverse order?
Ans) To sort the elements of the List in the reverse natural order of the strings, get a reverse Comparator from the Collections class with reverseOrder(). Then, pass the reverse Comparator to the sort() method.
List list = new ArrayList();
Comparator comp = Collections.reverseOrder();
Collections.sort(list, comp)

24. Q18) How to make a List (ArrayList,Vector,LinkedList) read only?
Ans) A list implemenation can be made read only using Collections.unmodifiableList(list). This method returns a new list. If a user tries to perform add operation on the new list; UnSupportedOperationException is thrown.
25. Q19) What is ConcurrentHashMap?
Ans) A concurrentHashMap is thread-safe implementation of Map interface. In this class put and remove method are synchronized but not get method. This class is different from Hashtable in terms of locking; it means that hashtable use object level lock but this class uses bucket level lock thus having better performance.
26. Q22) Arrange in the order of speed - HashMap,HashTable, Collections.synchronizedMap,concurrentHashmap
Ans) HashMap is fastest, ConcurrentHashMap,Collections.synchronizedMap,HashTable.

25.          What is difference between fail-fast and fail-safe?

Iterator fail-safe property work with the clone of underlying collection, hence it’s not affected by any modification in the collection. By design, all the collection classes in java.util package are fail-fast whereas collection classes in java.util.concurrent are fail-safe.
Fail-fast iterators throw ConcurrentModificationException whereas fail-safe iterator never throws ConcurrentModificationException.
Check this post for 
CopyOnWriteArrayList Example.

26.          How to avoid ConcurrentModificationException while iterating a collection?

We can use concurrent collection classes to avoid ConcurrentModificationException while iterating over a collection, for example CopyOnWriteArrayList instead of ArrayList.



HashMap in Java works on hashing principle. It is a data structure which allows us to store object and retrieve it in constant time O(1) provided we know the key. In hashing, hash functions are used to link key and value in HashMap.

Objects are stored by calling put(key, value) method of HashMap and retrieved by calling get(key) method.

When we call put method, hashcode() method of key object is called so that hash function of map can find a bucket location to store value object, which is actually index of internal array, known as table. HashMap internally store mapping in form of Map.Entry object which contains both key and value object. When you want to retrieve the object, you call get() method and again pass key object. This time again key object generate same hash code (it's mandatory for it to do so to retrieve object and that's why HashMap keys are immutable e.g. String) and we end up at same bucket location.

If there is only one object then it is returned and that's your value object which you have stored earlier. Things get little tricky when collisions occurs. Since internal array of HashMap is of fixed size, and if you keep storing objects, at some point of time hash function will return same bucket location for two different keys, this is called collision in HashMap. In this case, a linked list is formed at that bucket location and new entry is stored as next node. If we try to retrieve object from this linked list, we need an extra check to search correct value, this is done by equals() method. Since each node contains an entry, HashMap keep comparing entry's key object with passed key using equals() and when it return true, Map returns corresponding value. Since searching in lined list is O(n) operation, in worst case hash collision reduce a map to linked list. This issue is recently addressed in Java 8 by replacing linked list to tree to search in O(logN) time. By the way, you can easily verify how HashMap work by looking at code of HashMap.java in your Eclipse IDE, if you know how to attach source code of JDK in Eclipse.

The contract between equals() and hasCode() is that:
1. If two objects are equal, then they must have the same hash code.
2. If two objects have the same hashcode, they may or may not be equal.

The idea behind a Map is to be able to find an object faster than a linear search. Using hashed keys to locate objects is a two-step process. Internally the Map stores objects as an array of arrays. The index for the first array is the hashcode() value of the key. This locates the second array which is searched linearly by using equals() to determine if the object is found.



How HashMap Internally Works in Java
Questions start with simple statement :

Have you used HashMap before or  What is HashMap? Why do you use it 
Almost everybody answers this with yes and then interviewee keep talking about common facts about HashMap like HashMap accept null while Hashtable doesn't, HashMap is not synchronized, HashMap is fast and so on along with basics like its stores key and value pairs etc. This shows that person has used HashMap and quite familiar with the functionality it offers, but interview takes a sharp turn from here and next set of follow-up questions gets more detailed about fundamentals involved with HashMap in Java . Interviewer strike back with questions like :



Do you Know how HashMap works in Java or How does get () method of HashMap works in Java
And then you get answers like,  I don't bother its standard Java API, you better look code on Java source or Open JDK; I can find it out in Google at any time etc. But some interviewee definitely answer this and will say HashMap works on principle of hashing, we have put(key, value) and get(key) method for storing and retrieving Objects from HashMap.

When we pass Key and Value object  to put() method on Java HashMap, HashMap implementation calls hashCode method on Key object and applies returned hashcode into its own hashing function to find a bucket location for storing Entry object, important point to mention is that HashMap in Java stores both key and value object as Map.Entry in bucket which is essential to understand the retrieving logic.

 If people fails to recognize this and say it only stores Value in the bucket they will fail to explain the retrieving logic of any object stored in Java HashMap . This answer is very much acceptable and does make sense that interviewee has fair bit of knowledge on how hashing works and how HashMap  works in Java. But this is just start of story and confusion increases when you put interviewee on scenarios faced by Java developers on day by day basis. Next question could be about collision detection and collision resolution in Java HashMap  e.g. 



What will happen if two different objects have same hashcode?
Now from here onwards real confusion starts, Some time candidate will say that since hashcode is equal, both objects are equal and HashMap  will throw exception or not store them again etc, Then you might want to remind them about equals() and hashCode() contract  that two unequal object in Java can have same hash code. Some will give up at this point and few will move ahead and say "Since hashcode is same, bucket location would be same and collision will occur in HashMap, Since HashMap use LinkedList to store object, this entry (object of Map.Entry comprise key and value )  will be stored in LinkedList. Great this answer make sense though there are many collision resolution methods available  like linear probing and chaining, this is simplest and HashMap in Java does follow this. But story does not end here and interviewer asks



How will you retrieve Value object  if two Keys will have same hashcode?

Interviewee will say we will call get() method and then HashMap uses Key Object's hashcode to find out bucket location and retrieves Value object but then you need to remind him that there are two Value objects are stored in same bucket , so they will say about traversal in LinkedList until we find the value object , then you ask how do you identify value object because you don't  have value object to compare ,Until they know that HashMap  stores both Key and Value in LinkedList node or as Map.Entry they won't be able to resolve this issue and will try and fail.


But those bunch of people who remember this key information will say that after finding bucket location , we will call keys.equals() method to identify correct node in LinkedList and return associated value object for that key in Java HashMap . Perfect this is the correct answer.


In many cases interviewee fails at this stage because they get confused between hashCode() and equals() or keys and values object in Java HashMap  which is pretty obvious because they are dealing with the hashcode() in all previous questions and equals() come in picture only in case of retrieving value object from HashMap in Java. Some good developer point out here that using immutable, final object with proper equals() and hashcode() implementation would act as perfect Java HashMap  keys and improve performance of Java HashMap  by reducing collision. Immutability also allows caching there hashcode of different keys which makes overall retrieval process very fast and suggest that String and various wrapper classes e.g. Integer very good keys in Java HashMap.

Now if you clear this entire Java HashMap interview,  You will be surprised by this very interesting question "What happens On HashMap in Java if the size of the HashMap  exceeds a given threshold defined by load factor ?". Until you know how HashMap  works exactly you won't be able to answer this question. If the size of the Map exceeds a given threshold defined by load-factor e.g. if load factor is .75 it will act to re-size the map once it filled 75%. Similar to other collection classes like ArrayList,  Java HashMap re-size itself by creating a new bucket array of size twice of previous size of HashMap , and then start putting every old element into that new bucket array. This process is called rehashing because it also applies hash function to find new bucket location. 


If you manage to answer this question on HashMap in Java you will be greeted by "do you see any problem with resizing of HashMap  in Java" , you might not be able to pick the context and then he will try to give you hint about multiple thread accessing the Java HashMap and potentially looking for race condition on HashMap  in Java


So the answer is Yes there is potential race condition exists while resizing HashMap in Java, if two thread at the same time found that now HashMap needs resizing and they both try to resizing. on the process of resizing of HashMap in Java , the element in bucket which is stored in linked list get reversed in order during there migration to new bucket because Java HashMap  doesn't append the new element at tail instead it append new element at head to avoid tail traversing. If race condition happens then you will end up with an infinite loop. Though this point you can potentially argue that what the hell makes you think to use HashMap  in multi-threaded environment to interviewer :)

Some more Hashtable and HashMap Questions 
Few more question on HashMap in Java which is contributed by readers of Javarevisited blog  :


1) Why String, Integer and other wrapper classes are considered good keys ?
String, Integer and other wrapper classes are natural candidates of HashMap key, and String is most frequently used key as well because String is immutable and final,and overrides equals and hashcode() method. Other wrapper class also shares similar property. Immutabiility is required, in order to prevent changes on fields used to calculate hashCode() because if key object return different hashCode during insertion and retrieval than it won't be possible to get object from HashMap. Immutability is best as it offers other advantages as well like thread-safety, If you can  keep your hashCode same by only making certain fields final, then you go for that as well. Since equals() and hashCode() method is used during reterival of value object from HashMap, its important that key object correctly override these methods and follow contact. If unequal object return different hashcode than chances of collision will be less which subsequently improve performance of HashMap.



2) Can we use any custom object as key in HashMap ?
This is an extension of previous questions. Ofcourse you can use any Object as key in Java HashMap provided it follows equals and hashCode contract and its hashCode should not vary once the object is inserted into Map. If custom object is Immutable than this will be already taken care because you can not change it once created.

3) Can we use ConcurrentHashMap in place of Hashtable ?
This is another question which getting popular due to increasing popularity of ConcurrentHashMap. Since we know Hashtable is synchronized but ConcurrentHashMap provides better concurrency by only locking portion of map determined by concurrency level. ConcurrentHashMap is certainly introduced as Hashtable and can be used in place of it but Hashtable provide stronger thread-safety than ConcurrentHashMap. See my post difference between Hashtable and ConcurrentHashMap for more details.

Personally, I like this question because of its depth and number of concept it touches indirectly, if you look at questions asked during interview this HashMap  questions has verified
    Concept of hashing
    Collision resolution in HashMap
    Use of equals () and hashCode () and there importance in HashMap?
    Benefit of immutable object?
    Race condition on HashMap  in Java
    Resizing of Java HashMap

Just to summarize here are the answers which does makes sense for above questions

How HashMap  works in Java
HashMap  works on principle of hashing, we have put() and get() method for storing and retrieving object form HashMap .When we pass an both key and value to put() method to store on HashMap , it uses key object hashcode() method to calculate hashcode and they by applying hashing on that hashcode it identifies bucket location for storing value object. While retrieving it uses key object equals method to find out correct key value pair and return value object associated with that key. HashMap  uses linked list in case of collision and object will be stored in next node of linked list. Also HashMap stores both key and value tuple in every node of linked list in form of Map.Entry object. 


What will happen if two different HashMap  key objects have same hashcode?
They will be stored in same bucket but no next node of linked list. And keys equals () method will be used to identify correct key value pair in HashMap .


How null key is handled in HashMap? Since equals() and hashCode() are used to store and retrieve values, how does it work in case of null key?
Null key is handled specially in HashMap, there are two separate method for that putForNullKey(V value) and getForNullKey(). Later is offloaded version of get() to look up null keys.  Null keys always map to index 0.  This null case is split out into separate methods for the sake of performance in the two most commonly used operations (get and put), but incorporated with conditionals in others. In short, equals() and hashcode() method are not used in case of null keys in HashMap.

here is how nulls are retreived from HashMap

   private V getForNullKey() {
        if (size == 0) {
            return null;
        }
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }


In terms of usage Java HashMap is very versatile and I have mostly used HashMap as cache in electronic trading application I have worked . Since finance domain used Java heavily and due to performance reason we need caching HashMap and ConcurrentHashMap  comes as very handy there. You can also check following articles form Javarevisited to learn more about HashMap and Hashtable in Java :


HashMap Changes in JDK 1.7 and JDK 1.8
There is some performance improvement done on HashMap and ArrayList from JDK 1.7, which reduce memory consumption. Due to this empty Map are lazily initialized and will cost you less memory. Earlier, when you create HashMap e.g. new HashMap() it automatically creates array of default length e.g. 16. After some research, Java team founds that most of this Map are temporary and never use that many elements, and only end up wasting memory. Also, From JDK 1.8 onwards HashMap has introduced an improved strategy to deal with high collision rate. Since a poor hash function e.g. which always return location of same bucket, can turn a HashMap into linked list, i.e. converting get() method to perform in O(n) instead of O(1) and someone can take advantage of this fact, Java now internally replace linked list to a binary true once certain threshold is breached. This ensures performance or order O(log(n)) even in worst case where hash function is not distributing keys properly.




Comparator vs Comparable in Java

Here are some of the common differences, which is worth remembering to answer this question if asked during a telephonic or face to face interview:

1) Comparator in Java is defined in java.util package while Comparable interface in Java is defined in java.lang package, which very much says that Comparator should be used as an utility to sort objects which Comparable should be provided by default.

2) Comparator interface in Java has method public int compare (Object o1, Object o2) which returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second.

Comparator  - compare two objects-- sort on some other attribute of object
public int compare (Object o1, Object o2)


While Comparable interface has method public int compareTo(Object o) which returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.

Comparable - sort objects based on natural order

Comparable interface compares "this" reference with the object specified.

public int compareTo(Object o)


3) If you see then logical difference between these two is Comparator in Java compare two objects provided to him, while Comparable interface compares "this" reference with the object specified. I have shared lot of tips on how to override compareTo() method and avoid some common mistakes programmer makes while implementing Comparable interface.

4) Comparable in Java is used to implement natural ordering of object. In Java API String, Date and wrapper classes implements Comparable interface. Its always good practice to override compareTo() for value objects.

5) If any class implement Comparable interface in Java then collection of that object either List or Array can be sorted automatically by using  Collections.sort() or Arrays.sort() method and object will be sorted based on there natural order defined by CompareTo method.

6)Objects which implement Comparable in Java  can be used as keys in a SortedMap like TreeMap or elements in a SortedSet  for example TreeSet, without specifying any Comparator.


Example of using Comparator and Comparable in Java
So in Summary if you want to sort objects based on natural order then use Comparable in Java and if you want to sort on some other attribute of object then use Comparator in Java. Now to understand these concepts lets see an example or real life coding:


1) There is class called Person, sort the Person based on person_id, which is primary key in database
2) Sort the Person based on there name.

For a Person class, sorting based on person_id can be treated as natural order sorting and sorting based on name field can be implemented using Comparator interface. To sort based on person_id we need to implement compareTo() method.


public class Person implements Comparable {
    private int person_id;
    private String name;
   
    /**
     * Compare current person with specified person
     * return zero if person_id for both person is same
     * return negative if current person_id is less than specified one
     * return positive if specified person_id is greater than specified one
     */
    @Override
    public int compareTo(Object o) {
        Person p = (Person) o;
        return this.person_id - o.person_id ;
    }
    ….
}

Generally you should not use difference of integers to decide output of compareTo method as result of integer subtraction can overflow but if you are sure that both operands are positive then its one of the quickest way to compare two objects. See my post things to remember while overriding compareTo in Java for more tips on compareTo.

And for sorting based on person name we can implement compare(Object o1, Object o2) method of Java Comparator class.

/**
 * Comparator implementation which sorts Person objects on person_id field
 */
public class SortByPerson_ID implements Comparator{

    public int compare(Object o1, Object o2) {
        Person p1 = (Person) o;
        Person p2 = (Person) o;
        return p1.getPersonId() - p2.getPersonId();
    }
}

Similar guidelines applies while implementing compare() method as well and instead of using subtraction operator, its better to use logical operator to compare whether two integers are equal to, less than or greater than. You can write several types of Java Comparator based upon your need for example  reverseComparator , ANDComparator , ORComparator etc which will return negative or positive number based upon logical results. String in Java even provides an special comparator called CASE_INSENSITIVE_ORDER, to perform case insensitive comparison of String objects.



How to Compare String in Java
String is immutable in Java and one of the most used value class. For comparing String in Java we should not be worrying because String implements Comparable interface and provides a lexicographic implementation for CompareTo method which compare two strings based on contents of characters or you can say in lexical order. You just need to call String.compareTo(AnotherString) and Java will determine whether specified String is greater than , equal to or less than current object. See my post 4 example to compare String in Java for alternatives ways of comparing String.


How to Compare Dates in Java
Dates are represented by java.util.Date class in Java and like String,  Date also implements Comparable in Java so they will be automatically sorted based on there natural ordering if they got stored in any sorted collection like TreeSet or TreeMap. If you explicitly wants to compare two dates in Java you can call Date.compareTo(AnotherDate) method in Java and it will tell whether specified date is greater than , equal to or less than current String. See my post 3 ways to compare Dates in Java for more alternatives of comparing two dates.

When to use Comparator and Comparable in Java
At last let’s see some best practices and recommendation on when to use Comparator or Comparable in Java:

1) If there is a natural or default way of sorting Object already exist during development of Class than use Comparable. This is intuitive and you given the class name people should be able to guess it correctly like Strings are sorted chronically, Employee can be sorted by there Id etc. On the other hand if an Object can be sorted on multiple ways and client is specifying on which parameter sorting should take place than use Comparator interface. for example Employee can again be sorted on name, salary or department and clients needs an API to do that. Comparator implementation can sort out this problem.

2) Some time you write code to sort object of a class for which you are not the original author, or you don't have access to code. In these cases you can not implement Comparable and Comparator is only way to sort those objects.

3) Beware with the fact that How those object will behave if stored in SorteSet or SortedMap like TreeSet and TreeMap. If an object doesn't implement Comparable than while putting them into SortedMap, always provided corresponding Comparator which can provide sorting logic.

4) Order of comparison is very important while implementing Comparable or Comparator interface. for example if you are sorting object based upon name than you can compare first name or last name on any order, so decide it judiciously. I have shared more detailed tips on compareTo on my post how to implement CompareTo in Java.

5) Comparator has a distinct advantage of being self descriptive  for example if you are writing Comparator to compare two Employees based upon there salary than name that comparator as SalaryComparator, on the other hand compareTo()




Top 25 Java Collection Framework Interview Questions Answers for Freshers and Experienced Programmers

     


1. How HashMap works in Java?
This is Classical Java Collection interview questions which I have also discussed in How HashMap works in Java. This collection interview questions is mostly asked during AVP Role interviews on Investment-Banks and has lot of follow-up questions based on response of interviewee e.g. Why HashMap keys needs to be immutable, what is race conditions on HashMap and how HashMap resize in Java. For explanation and answers of these questions Please see earlier link.


2. What is difference between poll() and remove() method of Queue interface?
Though both poll() and remove() method from Queue is used to remove object and returns head of the queue, there is subtle difference between them. If Queue is empty() then a call to remove() method will throw Exception, while a call to poll() method returns null. By the way, exactly which element is removed from the queue depends upon queue's ordering policy and varies between different implementation, for example PriorityQueue keeps lowest element as per Comparator or Comparable at head position.

3. What is difference between fail-fast and fail-safe Iterators?
This is relatively new collection interview questions and can become trick if you hear the term fail-fast and fail-safe first time. Fail-fast Iterators throws ConcurrentModificationException when one Thread is iterating over collection object and other thread structurally modify Collection either by adding, removing or modifying objects on underlying collection. They are called fail-fast because they try to immediately throw Exception when they encounter failure. On the other hand fail-safe Iterators works on copy of collection instead of original collection


4. How do you remove an entry from a Collection? and subsequently what is difference between remove() method of Collection and remove() method of Iterator, which one you will use, while removing elements during iteration?

Collection interface defines remove(Object obj) method to remove objects from Collection. List interface adds another method remove(int index), which is used to remove object at specific index. You can use any of these method to remove an entry from Collection, while not iterating. Things change, when you iterate. Suppose you are traversing a List and removing only certain elements based on logic, then you need to use Iterator's remove() method. This method removes current element from Iterator's perspective. If you use Collection's or List's remove() method during iteration then your code will throw ConcurrentModificationException. That's why it's advised to use Iterator remove() method to remove objects from Collection.

5. What is difference between Synchronized Collection and Concurrent Collection?
Java 5 has added several new Concurrent Collection classes e.g. ConcurrentHashMap, CopyOnWriteArrayList, BlockingQueue etc, which has made Interview questions on Java Collection even trickier. Java Also provided way to get Synchronized copy of collection e.g. ArrayList, HashMap by using Collections.synchronizedMap() Utility function.One Significant difference is that Concurrent Collections has better performance than synchronized Collection because they lock only a portion of Map to achieve concurrency and Synchronization. See Difference between Synchronized Collection and Concurrent Collection in Java for more details.


6. What is difference between Iterator and Enumeration?
This is a beginner level collection interview questions and mostly asked during interviews of Junior Java developer up to experience of 2 to 3 years Iterator duplicate functionality of Enumeration with one addition of remove() method and both provide navigation functionally on objects of Collection.Another difference is that Iterator is more safe than Enumeration and doesn't allow another thread to modify collection object during iteration except remove() method and throws ConcurrentModificaitonException. See Iterator vs Enumeration in Java for more differences.


7. How does HashSet is implemented in Java, How does it uses Hashing ?
This is a tricky question in Java, because for hashing you need both key and value and there is no key for store it in a bucket, then how exactly HashSet store element internally. Well, HashSet is built on top of HashMap. If you look at source code of java.util.HashSet class, you will find that that it uses a HashMap with same values for all keys, as shown below :

private transient HashMap map;

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

When you call add() method of HashSet, it put entry in HashMap :

public boolean add(E e) {
  return map.put(e, PRESENT)==null;
}

Since keys are unique in a HashMap, it provides uniqueness guarantee of Set interface.


8. What do you need to do to use a custom object as key in Collection classes like Map or Set?
Answer is : If you are using any custom object in Map as key, you need to override equals() and hashCode() method, and make sure they follow there contract. On the other hand if you are storing a custom object in Sorted Collection e.g. SortedSet or SortedMap, you also need to make sure that your equals() method is consistent to compareTo() method, otherwise those collection will not follow there contacts e.g. Set may allow duplicates.



12. How do you Sort objects on collection?
Sorting is implemented using Comparable and Comparator in Java and when you call Collections.sort() it gets sorted based on natural order specified in compareTo() method while Collections.sort(Comparator) will sort objects based on compare() method of Comparator. See Sorting in Java using Comparator and Comparable for more details.

19) Why ListIterator has add() method but Iterator doesn't or Why add() method is declared in ListIterator and not on Iterator.
Answer : ListIterator has add() method because of its ability to traverse or iterate in both direction of collection. it maintains two pointers in terms of previous and next call and in position to add new element without affecting current iteration.


22) What is BlockingQueue, how it is different than other collection classes?
BlockingQueue is a Queue implementation available in java.util.concurrent package. It's one of the concurrent Collection class added on Java 1.5, main difference between BlockingQueue and other collection classes is that apart from storage, it also provides flow control. It can be used in inter thread communication and also provides built-in thread-safety by using happens-before guarantee. You can use BlockingQueue to solve Producer Consumer problem, which is what is needed in most of concurrent applications.

Top 5 Concurrent Collections from JDK 5 and 6 Java Programmer Should Know

Read more: 
http://javarevisited.blogspot.com/2013/02/concurrent-collections-from-jdk-56-java-example-tutorial.html#ixzz3XzO6So1C


Several new Collection classes are added in Java 5 and Java 6 specially concurrent alternatives of standard synchronized ArrayListHashtable and  synchronized HashMap collection classes. Many Java programmer still not familiar with these new collection classes from java.util.concurrent package and misses a whole new set of functionality which can be utilized to build more scalable and high performance Java application. In this Java tutorialwe will some of useful collection classes e.g. ConcurrentHashMapBlockingQueue which provides some of the very useful functionalities to build concurrent Java application. By the way this is not a comprehensive article explaining each feature of all these concurrent collections, Instead I will just try to list out why they are there, which Collection class they replace or provides alternative for. Idea is to keep it short and simple while highlighting key points of those useful java.util.concurrent collections.

1. ConcurrentHashMap
Description: ava Concurrent Collections from JDK 5 and 6 Example TutorialConcurrentHashMap is undoubtedly most popular collection class introduced in Java 5 and most of us are already using it. ConcurrentHashMap provides a concurrent alternative of Hashtable or Synchronized Map classes with aim to support higher level of concurrency by implementing fined grained locking. Multiple reader can access the Map concurrently  while a portion of Map gets locked for write operation depends upon concurrency level of Map. ConcurrentHashMap provides better scalability than there synchronized counter part. Iterator of ConcurrentHashMap are fail-safe iterators which doesn't throw ConcurrencModificationException thus eliminates another requirement of locking during iteration which result in further scalability and performance.


2. CopyOnWriteArrayList and CopyOnWriteArraySet
CopyOnWriteArrayList is a concurrent alternative of synchronized List. CopyOnWriteArrayList provides better concurrency than synchronized List by allowing multiple concurrent reader and replacing the whole list on write operation. Yes, write operation is costly on CopyOnWriteArrayList but it performs better when there are multiple reader and requirement of iteration is more than writing. Since CopyOnWriteArrayList Iterator also don't throw ConcurrencModificationException it eliminates need to lock the collection during iteration. Remember both ConcurrentHashMap and CopyOnWriteArrayList doesn't provides same level of locking as Synchronized Collection and achieves thread-safety by there locking and mutability strategy. So they perform better if requirements suits there nature. Similarly, CopyOnWriteArraySet is a concurrent replacement to Synchronized Set. See What is CopyOnWriteArrayList in Java for more details


3. BlockingQueue
BlockingQueue is also one of better known collection class in Java 5. BlockingQueue makes it easy to implement producer-consumer design pattern by providing inbuilt blocking support for put() and take() method. put() method will block if Queue is full while take() method will block if Queue is empty. Java 5 API provides two concrete implementation of BlockingQueue in form of ArrayBlockingQueue and LinkedBlockingQueue, both of them implement FIFO ordering of element. ArrayBlockingQueue is backed by Array and its bounded in nature while LinkedBlockingQueue is optionally bounded. Consider using BlockingQueue to solve producer Consumer problem in Java instead of writing your won wait-notify code. Java 5 also provides PriorityBlockingQueue, another implementation of BlockingQueue which is ordered on priority and useful if you want to process elements on order other than FIFO.


4. Deque and BlockingDeque
Deque interface is added in Java 6 and it extends Queue interface to support insertion and removal from both end of Queue referred as head and tail. Java6 also provides concurrent implementation of Deque like ArrayDeque and LinkedBlockingDeque. Deque Can be used efficiently to increase parallelism in program by allowing set of worker thread to help each other by taking some of work load from other thread by utilizing Deque double end consumption property. So if all Thread has there own set of task Queue and they are consuming from head; helper thread can also share some work load via consumption from tail.

                           ~~~
5. ConcurrentSkipListMap and ConcurrentSkipListSet
Just like ConcurrentHashMap provides a concurrent alternative of synchronized HashMapConcurrentSkipListMap and ConcurrentSkipListSet provide concurrent alternative for synchronized version of SortedMap and SortedSet. For example instead of using TreeMap or TreeSet wrapped inside synchronized Collection, You can consider using ConcurrentSkipListMap or ConcurrentSkipListSet from java.util.concurrent package. They also implement NavigableMap and NavigableSet to add additional navigation method we have seen in our post How to use NavigableMap in Java.

That’s all on this list of concurrent Collection classes from Java 5 and 6. They are added on java.util.concurrent package as concurrent alternative of there synchronized counterpart. It’s good idea to learn these Collection classes along with other popular classes from Java Collection Framework.

No comments:

Post a Comment