Our intern arrived at the beginning of June and promptly began putting bugs into my pristine and utterly bug free search engine code.

One of the bugs he planted/discovered was that the recent change to use a BST for part of the dictionary lookup resulted in a BST search that was a bit, shall we say, overzealous in searching the tree. I fixed the bug and committed the change. Today I had a bit of time, so I decided to re-run the multi threading tests to see how the change affected our dictionary lookup times.

I was expecting the times to be worse, because I thought that perhaps we were not finding the right entries (but the dictionary code checks for that), but here's what I found:

# of threadsTotal Number of LookupsAverage Lookup Time (ms)
12385001.26
23010001.99
45165002.32
85885004.08
1610080004.77
3245200021.66
6456650034.48
128116350033.62
200167950036.30

That's a huge speedup at the low end: somewhere between 3 to 5 times! Something weird happens in the middle where the results are worse, but then at the top end is settles down to be more than twice as fast. Also notice that I added a result for 200 threads (hey, why not, right?)

Here's the graph:

I have to tell you: it's pretty cool to see a machine with 256 hardware threads looking like this in prstat:

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
  1529 stgreen   100M   86M cpu169  30    0   4:00:48  71% java/223

Onward to query performance...

Comments:

Post a Comment:
Comments are closed for this entry.

This blog copyright 2009 by searchguy