Tuesday, October 15, 2013

Hash Lookup vs Binary Search

So I was browsing through our code base, and saw that in several places, we have some binary searches. Binary search is known to be pretty effective when we are finding things in a sorted array. It has O(lg(n)) running time. Hashtables, on the other hand, have O(1) running time. For small tables, though, binary search can be faster than hashtables since a comparison is cheaper than computing a hash function. At some point, though, we can expect the hashtable to out perform binary search. Let's try and find out what the break-even point is:
And here's the results:
Well, it doesn't look like binary search is ever better than the hashtable. Binary search is logarithmic at first, but then it starts growing faster. The hashtable performs pretty consistently at the beginning, and then slowly grows at the end.

The abnormalities are probably due to caching. There's two places where the slope changes. Those probably correspond to different layers of memory. Maybe first an L2 cache miss, and then an L3 cache miss.

I also implemented an iterative binary search:

It did not perform significantly better. Hashtable was still always faster.

No comments:

Post a Comment