Trond Norbye's Weblog

http://blogs.sun.com/trond/date/20090423 Thursday April 23, 2009

Presentation at the MySQL Users Conference

Earlier today I gave the presentation Memcached Meet Flash, the pluggable engine interface, and if you missed it you can download the slides. It is kind of fun to think back on the hackathon at last year's users conference, when Toru shared his ideas about a storage interface, followed by the interesting discussion I had with Matt during the OpenSolaris summit down in Santa Clara. I didn't know back then that I would present this at the users conference this year :-)

My brother came down for my presentation and took the following picture with his iPhone during the session:

If you have any questions regarding the slides, come look me up at the hackathon tonight :)

http://blogs.sun.com/trond/date/20090419 Sunday April 19, 2009

Using CVS with pserver access with OpenGrok

Today I pushed two fixes into OpenGrok so that you may use OpenGrok on sources you checked out via the pserver protocol in CVS. From a performance perspective I would not recommend that you use this configuration, but it might be good enough for you if you just want to search your own projects.

With the latest development build of OpenGrok installed into /var/opengrok/bin and /var/tomcat6/webapps/source.war I was able to index and browse the PostgreSQL sources. I checked out PostgreSQL into /var/opengrok/source/pgsql, and executed the following commands:

trond@opensolaris> cd /var/opengrok
trond@opensolaris> java -jar bin/opengrok.jar -c /var/opengrok/bin/ctags \
                              -v -s /var/opengrok/source -d /var/opengrok/data -S -P \
                              -p /opengrok -n -r on -W /etc/opengrok/configuration.xml
trond@opensolaris> java -Xmx2g -jar bin/opengrok.jar -R /etc/opengrok/configuration.xml -U localhost:2424
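Since pserver is the configuration with the performance caveat, it can be handy to see which of the projects under the source root are actually checked out that way. The following is only a minimal sketch: the function name list_pserver_projects is my own invention and not part of OpenGrok. It relies on the CVS/Root file that cvs writes into every working directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenGrok): print the name of every
# project directly under the given source root whose CVS checkout
# uses the pserver protocol.
list_pserver_projects() {
    src_root=$1
    for dir in "$src_root"/*/; do
        # cvs records the repository location in CVS/Root
        root_file="${dir}CVS/Root"
        [ -f "$root_file" ] || continue
        case $(cat "$root_file") in
            :pserver:*) basename "$dir" ;;
        esac
    done
}
```

With the layout used in this entry you would call it as list_pserver_projects /var/opengrok/source.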

Please let me know if you have problems getting this to work (or even better, admit that it is 2009 and move on to a more modern SCM system ;-) )

http://blogs.sun.com/trond/date/20090415 Wednesday April 15, 2009

Using Subversion with OpenGrok

In my previous blog entry Using CVS with OpenGrok I showed the steps needed to configure OpenGrok with CVS, and in this entry I will extend that example to include a project using Subversion.

The first thing we need to do is to install a Subversion client and check out the source code. I don't use Subversion for any of my projects, but Knut Anders is working on Apache Derby (hosted in a Subversion repository), so let's use that in this example.

trond@opensolaris> pfexec pkg install SUNWsvn
trond@opensolaris> cd /var/opengrok/source
trond@opensolaris> svn co https://svn.apache.org/repos/asf/db/derby/code/trunk derby

If you use the browser to navigate to http://localhost:8080/source/xref you will see a new directory named derby. However, the history links and the selection box at http://localhost:8080/source/ do not work for Derby yet, so let's go ahead and update the configuration:

trond@opensolaris> cd /var/opengrok
trond@opensolaris> java -jar /var/opengrok/bin/opengrok.jar -c /var/opengrok/bin/ctags \
                              -v -s /var/opengrok/source -d /var/opengrok/data -S -P \
                              -p /opengrok -n -r on -W /etc/opengrok/configuration.xml

(Look at the man page for a description of the different options.)

With the new configuration in place, we can start the index generation:

trond@opensolaris> cd /var/opengrok
trond@opensolaris> java -Xmx2g -jar /var/opengrok/bin/opengrok.jar -R /etc/opengrok/configuration.xml -H

With the new index database in place it is time to update the web application to use the new configuration:

trond@opensolaris> java -Xmx2g -jar /var/opengrok/bin/opengrok.jar -R /etc/opengrok/configuration.xml -n\
                                     -U localhost:2424

Or you could just restart the Tomcat web server:

trond@opensolaris> svcadm restart tomcat6

If you navigate to http://localhost:8080/source/history/derby/README you should get the history for the README file and the annotate link should be available. Subversion supports changesets so you should be able to request history for directories, but the directory information is not cached so this is a potentially slow operation (if you have remote SCM repositories).

http://blogs.sun.com/trond/date/20090410 Friday April 10, 2009

Using CVS with OpenGrok

If you look at the mail archives for OpenGrok it seems that the most popular question out there right now is how to configure OpenGrok with CVS. Personally I have extremely limited experience with cvs, but I guess there are some old projects out there that haven't converted to a distributed SCM system yet (check out http://www.selenic.com/mercurial/wiki/index.cgi/RepositoryConversion ;-)). In this blog entry I'll show you how to configure a project using cvs in OpenGrok.

I don't use cvs, so the first thing we need to do is to install cvs and create a cvs repository for our source. An empty cvs repository doesn't help us, so let's import the OpenGrok sources and use them in the example:

trond@opensolaris> pfexec pkg install SUNWcvs
trond@opensolaris> pfexec zfs create -o mountpoint=/cvsroot rpool/cvsroot
trond@opensolaris> pfexec chown trond:staff /cvsroot
trond@opensolaris> cd /cvsroot
trond@opensolaris> export CVSROOT=`pwd`
trond@opensolaris> cvs init
trond@opensolaris> cd /tmp
trond@opensolaris> hg clone ssh://anon@hg.opensolaris.org/hg/opengrok/trunk opengrok
trond@opensolaris> cd opengrok
trond@opensolaris> rm -rf .hg
trond@opensolaris> cvs import -m "Initial import of OpenGrok" opengrok opengrok-trunk start
trond@opensolaris> cd /tmp
trond@opensolaris> rm -rf opengrok

I have my OpenGrok installation in /var/opengrok with the sources in /var/opengrok/source, so let's check out the sources:

trond@opensolaris> cd /var/opengrok/source
trond@opensolaris> cvs co opengrok

The next thing we need to do is to update the configuration with the knowledge of the new project (and its repository):

trond@opensolaris> cd /var/opengrok
trond@opensolaris> java -jar /var/opengrok/bin/opengrok.jar -c /var/opengrok/bin/ctags \
                              -v -s /var/opengrok/source -d /var/opengrok/data -S -P \
                              -p /opengrok -n -W /etc/opengrok/configuration.xml

(Look at the man page for a description of the different options.)

With the new configuration in place, we can start the index generation:

trond@opensolaris> cd /var/opengrok
trond@opensolaris> java -Xmx2g -jar /var/opengrok/bin/opengrok.jar -R /etc/opengrok/configuration.xml

So let's install tomcat and try it out:

trond@opensolaris> pfexec pkg install SUNWtcat
trond@opensolaris> pfexec cp /var/opengrok/source.war /var/tomcat6/webapps
trond@opensolaris> svcadm enable tomcat6

If you navigate to http://localhost:8080/source/history/opengrok/LICENSE.txt you should get the history for the LICENSE file and the annotate link should be available. You should be able to navigate around and look at the change history for the files in your repository. Please note that cvs operates on a per-file basis, so you cannot request history for a directory.

http://blogs.sun.com/trond/date/20090402 Thursday April 02, 2009

Pluggable hashing algorithm in memcached?

In my blog post How well is your hash table working for you?, I pointed out that your keys could give you a bad distribution in the internal hash table inside memcached. Right now there is not much you can do apart from using another algorithm to generate your keys, but that may not be the easiest thing to do. Wouldn't it be cooler if you could just use another hashing algorithm instead?

I talked with Brian Aker on IRC the other day, and he pointed out that libmemcached contains a handful of different algorithms (and is covered by the same license as the memcached server), so we could actually use the hashing routines from libmemcached in the server. The first thing we should do is probably to create a "hashing benchmark tool" in libmemcached that reads an input file of keys and determines the best hashing algorithm to use based upon speed and distribution. With the benchmark in place we could add a new configure option to memcached, --with-hashing-algorithm=algorithm (this would of course require that we have libmemcached installed).
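To give a rough idea of the "distribution" half of such a benchmark, here is a minimal sketch in plain shell. Everything in it is an assumption for illustration: the function names bucket_histogram and bucket_summary are hypothetical, and cksum (the POSIX CRC) is only a stand-in for a real hashing routine from libmemcached. It does not measure speed at all.

```shell
#!/bin/sh
# Hypothetical helper: print "bucket count" pairs for the keys in a
# file (one key per line), using a table with the given bucket count.
# cksum is only a stand-in for the hash function you want to evaluate.
bucket_histogram() {
    keyfile=$1
    buckets=$2
    while IFS= read -r key; do
        # cksum prints "<crc> <length>"; keep the CRC only
        crc=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
        echo $((crc % buckets))
    done < "$keyfile" | sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Summarise the histogram: number of buckets in use and the size of
# the most crowded bucket (the longest chain a lookup could walk).
bucket_summary() {
    bucket_histogram "$1" "$2" |
        awk '{ used++; if ($2 > max) max = $2 }
             END { printf "used=%d max=%d\n", used, max }'
}
```

Running bucket_summary keys.txt 1024 (where keys.txt is a file of your own keys) and comparing the result across hash functions gives a first impression of how evenly each one spreads your key set.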

With the possibility to change the hashing algorithm in the server, I would love to take this one step further (I hate compile-time settings, because they make life hard for people shipping binaries). What if we could dynamically change the hashing algorithm on the server without invalidating the existing cache? Wouldn't that be cool? Since memcached supports dynamic hash expansion, it shouldn't be hard to change the hash function as well. If you take a quick look at the function assoc_find located in assoc.c you will see that if expanding is set, we need to search in the old hash table instead of the new one. This is the place where we should add our logic: if the hash function changed (and we haven't repopulated the complete hash table yet), we need to recompute the hash with the old hash function.

Anyone up for the challenge of implementing:

  • A key hashing benchmark program (in libmemcached)
  • A configure option for memcached that detects and links with libmemcached, and overrides the default hashing algorithm
  • A new command to set the hashing algorithm at runtime


This is a personal weblog, I do not speak for my employer.