Sunday, 6 November 2011

Java Collections – Performance (Time Complexity)


Many developers I came across in my career as a software developer are only familiar with the most basic data structures, typically, Array, Map and Linked List. These are fundamental data structures and one could argue that they are generic enough to fit most of the commercial software requirements. But what worries me most is that even seasoned developers are not familiar with the vast repertoire of available data structures and their time complexity. In this post the ADTs (Abstract Data Types) present in the Java Collections (JDK 1.6) are enlisted and the performance of the various data structures, in terms of time, is assessed.

Before we start it is helpful to understand the so-called “Big O” notation. This notation approximately describes how the time to do a given task grows with the size of the input. Roughly speaking, on one end we have O(1) which is “constant time” and on the opposite end we have O(xn) which is “exponential time”. The following chart summarizes the growth in complexity due to growth of input (n). In our data structure walk-through we sometimes use the symbol h to signify the Hash Table capacity.


List

A list is an ordered collection of elements.


Add
Remove
Get
Contains
Data  Structure
ArrayList
 O(1)
O(n)
O(1)
O(n)
Array
LinkedList
 O(1)
O(1)
O(n)
O(n)
Linked List
CopyonWriteArrayList
 O(n)
O(n)
O(1)
O(n)
Array

Set

A collection that contains no duplicate elements.


Add
Contains
Next
Data Structure
HashSet
O(1)
O(1)
O(h/n)
Hash Table
LinkedHashSet
O(1)
O(1)
O(1)
Hash Table + Linked List
EnumSet
O(1)
O(1)
O(1)
Bit Vector
TreeSet
O(log n)
O(log n)
O(log n)
Red-black tree
CopyonWriteArraySet
O(n)
O(n)
O(1)
Array
ConcurrentSkipList
O(log n)
O(log n)
O(1)
Skip List

Queue

A collection designed for holding elements prior to processing.


Offer
Peak
Poll
Size
Data Structure
PriorityQueue
O(log n )
O(1)
O(log n)
O(1)
Priority Heap
LinkedList
 O(1)
O(1)
O(1)
O(1)
Array
ArrayDequeue
 O(1)
O(1)
O(1)
O(1)
Linked List
ConcurrentLinkedQueue
 O(1)
O(1)
O(1)
O(n)
Linked List
ArrayBlockingQueue
 O(1)
O(1)
O(1)
O(1)
Array
PriorirityBlockingQueue
O(log n)
O(1)
O(log n)
O(1)
Priority Heap
SynchronousQueue
O(1)
O(1)
O(1)
O(1)
None!
DelayQueue
O(log n)
O(1)
O(log n)
O(1)
Priority Heap
LinkedBlockingQueue
O(1)
O(1)
O(1)
O(1)
Linked List

Map

An object that maps keys to values. A map cannot duplicate keys; each key can map to at most one value.


Get
ContainsKey
Next
Data Structure
HashMap
O(1)
O(1)
O(h / n)
Hash Table
LinkedHashMap
O(1)
O(1)
O(1)
Hash Table + Linked List
IdentityHashMap
O(1)
O(1)
O(h / n)
Array
WeakHashMap
O(1)
O(1)
O(h / n)
Hash Table
EnumMap
O(1)
O(1)
O(1)
Array
TreeMap
O(log n)
O(log n)
O(log n)
Red-black tree
ConcurrentHashMap
O(1)
O(1)
O(h / n)
Hash Tables
ConcurrentSkipListMap
O(log n)
O(log n)
O(1)
Skip List

30 comments:

  1. Please, correct the description for LinkedList remove operation. It is two-step operation and the complexity is O(n), not O(1).

    ReplyDelete
    Replies
    1. I second that. The element should be found in the list before it can be removed by changing the pointers so it is O(n).

      Delete
    2. It is common to just write how long removal itself will take without actual search. You have to take it as: if you have pointer to object o in linked list then removal will take O(1).
      It is done this way, so you can see difference between different collections:
      ArrayList has remove O(n) + search, while LinkedList have O(1)+ search. Removal would have O(n) complexity for both even though LinkedList removal is way faster.

      Delete
  2. every queue implementation has O(N) on contains() method?

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. Array and LinkedList has O(n) on contains() method for sure. If element is not in array or list you have to traverse all elements just to be sure. In Priority Heap as name suggest use some kind of heap (couldn't find exactly which, so I will assume it is binary heap), which is data structure similar to binary tree, with special rules. Contains() on binary heap takes O(log n).

      Delete
  3. Linked List not use Array Data Structure

    ReplyDelete
  4. How come LinkedList use Arrays ????

    Big question. Please correct...

    ReplyDelete
  5. Thanks for providing this informative information. it is very useful you may also refer- http://www.s4techno.com/blog/2016/07/12/exception-handling/

    ReplyDelete
  6. perfect explanation about java programming .its very useful.thanks for your valuable information.java training in chennai | java training in velachery

    ReplyDelete
  7. For niit projects, assignments, cycle tests, lab@homes, c#, html, java, java script, sql, oracle and much more visit http://gniithelp.blogspot.in or https://mkniit.blogspot.in

    ReplyDelete
  8. Being new to the blogging world I feel like there is still so much to learn. Your tips helped to clarify a few things for me as well as giving..
    Android App Development Company
    Android App Development Company

    ReplyDelete
  9. Really useful. Thanks for sharing.

    ReplyDelete
  10. Really useful. Thanks for sharing.

    ReplyDelete
  11. I just want to say that all the information you have given here is awesome...great and nice blog thanks sharing..Thank you very much for this one. And i hope this will be useful for many people.. and i am waiting for your next post keep on updating these kinds of knowledgeable things...
    Web Design Development Company
    Web design Company in Chennai
    Web development Company in Chennai

    ReplyDelete
  12. Free easy & simple way to learn java online and much more.. go to =>> http://foundjava.blogspot.in

    ReplyDelete
  13. it is really amazing...thanks for sharing....provide more useful information...
    Mobile app development company

    ReplyDelete
  14. Thank you for nice article but I just want some more explanation because I am prepring this for my interview, like reason for each data structure complexity.good explained for arraylist and linkedlist but I need this type of explanation for each is it possible here??

    ReplyDelete
  15. This article is very much helpful and i hope this will be an useful information for the needed one.Keep on updating these kinds of informative things...
    iOS App Development Company

    ReplyDelete
  16. Many of this runtimes are not correct. One should distinguish between per-operation, amortized and (stochastic) expected worst-case runtimes. The runtimes here are mixed up and always choose the beset of the three.

    E.g. ArrayList#add has a worst case complexity of O(n) (array size doubling), but the amortized complexity over a series of operations is in O(1). HashSet#contains has a worst case complexity of O(n) (<= Java 7) and O(log n) otherwise, but the expected complexity is in O(1).

    ReplyDelete
  17. Thanks a lot! You made a new blog entry to answer my question; I really appreciate your time and effort.
    java training in chennai |
    java training institute in chennai

    ReplyDelete
  18. Thanks for sharing with us the information on Java collections and I have learned a lot of new programming information from the article that has helped me to improve my basic programming skills. I Will be recommending this site to clients who access our Papers Reviewing Services so that they can read the article.

    ReplyDelete
  19. A easy and exciting blog about java learning. Please comment your opinions and share..
    http://foundjava.blogspot.in

    ReplyDelete