Trond Norbye's Weblog

« Debugging memcached... | Main | Pluggable hashing... »

http://blogs.sun.com/trond/date/20090328 Saturday March 28, 2009

How well is your hash table working for you?

If you look in the internals of memcached you will find a large hash table where all of the items are stored in (we hash to a bucket before we do a linary search in a linked list to try to locate the item). Memcached is supposed to grow the hash table automatically to avoid having too long lists in each bucket (that would kill the performance), but wouldn't it be cool to know how well the hash table works for you?? Well that's really easy to find out with dtrace!

So let's look at the user-supplied-bugs in libmemcached as an example. I already have memcached running on my server, so I just open up a new terminal and start the following dtrace one-liner:

trond@opensolaris> pfexec dtrace -n ':::assoc-find { @a = quantize(arg2);}'
dtrace: description ':::assoc-find ' matched 1 probe

In another terminal I just type the following commands:

trond@opensolaris> export MEMCACHED_SERVERS=localhost:11211
trond@opensolaris> ./testapp user
servers localhost:11211
        localhost : 11211


user

Testing user_supplied_bug1                                       0.993 [ ok ]
Testing user_supplied_bug2                                       0.009 [ ok ]
Testing user_supplied_bug3                                       0.041 [ ok ]
Testing user_supplied_bug4                                       0.000 [ ok ]
Testing user_supplied_bug5                                       0.192 [ ok ]
Testing user_supplied_bug6                                       0.053 [ ok ]
Testing user_supplied_bug7                                       0.031 [ ok ]
Testing user_supplied_bug8                                       0.000 [ ok ]
Testing user_supplied_bug9                                       0.004 [ ok ]
Testing user_supplied_bug10                                      2.572 [ ok ]
Testing user_supplied_bug11                                      2.555 [ ok ]
Testing user_supplied_bug12                                      0.000 [ ok ]
Testing user_supplied_bug13                                      0.008 [ ok ]
Testing user_supplied_bug14                                      2.607 [ ok ]
Testing user_supplied_bug15                                      0.000 [ ok ]
Testing user_supplied_bug16                                      0.008 [ ok ]
Testing user_supplied_bug18                                      0.003 [ ok ]
Testing user_supplied_bug19                                      0.000 [ ok ]
Testing user_supplied_bug20                                      0.010 [ ok ]

All tests completed successfully

So let's terminate dtrace and look at the results:

^C


           value  ------------- Distribution ------------- count    
              -1 |                                         0        
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 245648   
               1 |                                         589      
               2 |                                         35       
               4 |                                         0

So what does this tell us? In almost all of our requests the interesting item is the first item in the bucket :-)

Comments:

That's only half of the story, you also need to check how much time you spend increasing the size of the hash table, how much memory you are using, etc. Otherwise I know of some trivial implementations where the interesting item will _always_ be first in its bucket :-)

Still, good sign, and nice use of dtrace.

Posted by Marc on March 28, 2009 at 10:31 AM CET #

Yes, this is just one easily available trigger in the system. The hashing on the client is probably more important than the hashing inside the server, so that you don't end up with just one server being used ;-) The application memstat in libmemcached provides an easy access to this information :)

Posted by Trond Norbye on April 01, 2009 at 08:24 AM CEST #

Post a Comment:
  • HTML Syntax: NOT allowed

Valid HTML! Valid CSS!

This is a personal weblog, I do not speak for my employer.