Trond Norbye's Weblog

http://blogs.sun.com/trond/date/20090127 Tuesday January 27, 2009

Noreply support for binary protocol in libmemcached

Just a short notice that I pushed a patch that implements support for noreply in the binary protocol implementation in libmemcached. In order to try it out you need to get a recent build of memcached:

trond@razor> hg clone http://hg.tangent.org/libmemcached 
destination directory: libmemcached
requesting all changes
adding changesets
adding manifests
adding file changes
added 623 changesets with 2375 changes to 210 files
updating working directory
144 files updated, 0 files merged, 0 files removed, 0 files unresolved
trond@razor> cd libmemcached
trond@razor> ./config/bootstrap
./config/bootstrap: running `aclocal-1.10' 
./config/bootstrap: running `autoheader' 
./config/bootstrap: running `libtoolize --automake --copy --force' 
./config/bootstrap: running `automake-1.10 --add-missing --copy --force' 
configure.ac:42: installing `config/missing'
configure.ac:42: installing `config/install-sh'
clients/Makefile.am: installing `config/depcomp'
Makefile.am: installing `./INSTALL'
./config/bootstrap: running `autoconf' 
trond@razor> ./configure --enable-64bit --prefix=/opt/memcached
[ a lot of output ]
trond@razor> make all install
[ a lot of output ]
trond@razor> cd ..
trond@razor> git clone git://github.com/dustin/memcached.git
Initialized empty Git repository in /export/home/trond/khepri38/foobar/memcached/.git/
remote: Counting objects: 3451, done.
remote: Compressing objects: 100% (1162/1162), done.
remote: Total 3451 (delta 2377), reused 3286 (delta 2261)
Receiving objects: 100% (3451/3451), 1.04 MiB | 163 KiB/s, done.
Resolving deltas: 100% (2377/2377), done.
trond@razor> cd memcached
trond@razor> git checkout -b rewritten-bin origin/rewritten-bin
Branch rewritten-bin set up to track remote branch refs/remotes/origin/rewritten-bin.
Switched to a new branch "rewritten-bin"
trond@razor> ./autogen.sh
aclocal...
autoheader...
automake...
configure.ac:10: installing `./compile'
configure.ac:3: installing `./config.sub'
configure.ac:5: installing `./missing'
configure.ac:5: installing `./install-sh'
configure.ac:3: installing `./config.guess'
Makefile.am: installing `./depcomp'
autoconf...
trond@razor> ./configure --prefix=/opt/memcached --enable-64bit --enable-dtrace
[ a lot of output ]
trond@razor> make all test install
[ a lot of output ]

So unless you got a compilation error on your platform, you should now be able to start /opt/memcached/bin/memcached, and link your application with the libmemcached installed in /opt/memcached/lib.
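If you are curious what "noreply" looks like on the wire in the binary protocol, here is a minimal sketch. It assumes the quiet variant of set (opcode 0x11, versus 0x01 for a normal set) and the standard 24-byte request header. The helper name build_set_header is hypothetical and is not part of libmemcached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BIN_MAGIC_REQUEST = 0x80, BIN_OP_SET = 0x01, BIN_OP_SETQ = 0x11 };
enum { BIN_HEADER_LEN = 24, BIN_SET_EXTRAS_LEN = 8 };

/* Hypothetical helper (not libmemcached code): write the 24-byte request
 * header for a set, or for its quiet variant when quiet is non-zero.
 * Multi-byte fields are big-endian on the wire. */
void build_set_header(uint8_t *out, int quiet, uint16_t keylen,
                      uint32_t vallen, uint32_t opaque)
{
    /* The body is the extras (flags + expiry), then the key, then the value */
    uint32_t body = (uint32_t)BIN_SET_EXTRAS_LEN + keylen + vallen;

    assert(out != NULL);
    memset(out, 0, BIN_HEADER_LEN);  /* data type, reserved and CAS stay 0 */
    out[0] = BIN_MAGIC_REQUEST;
    out[1] = quiet ? BIN_OP_SETQ : BIN_OP_SET;
    out[2] = (uint8_t)(keylen >> 8);
    out[3] = (uint8_t)(keylen & 0xff);
    out[4] = BIN_SET_EXTRAS_LEN;
    out[8] = (uint8_t)(body >> 24);
    out[9] = (uint8_t)(body >> 16);
    out[10] = (uint8_t)(body >> 8);
    out[11] = (uint8_t)(body & 0xff);
    out[12] = (uint8_t)(opaque >> 24);
    out[13] = (uint8_t)(opaque >> 16);
    out[14] = (uint8_t)(opaque >> 8);
    out[15] = (uint8_t)(opaque & 0xff);
}
```

The only difference between the two requests is the opcode byte: the server stays silent after a successful quiet set.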

Happy hacking :-)

http://blogs.sun.com/trond/date/20090123 Friday January 23, 2009

libmemcached cleanup and enhancements

I have been pretty busy this morning applying patches and cleaning up the code in libmemcached. I started out with a patch from Toru Maesaka adding stat analysis from your memcached cluster:

trond@razor> memstat -a --server=server1,server2,server3
Memcached Cluster Analysis Report

	Number of Servers Analyzed         : 3
	Average Item Size (incl/overhead)  : 567 bytes

	Node with most memory consumption  : server1:11211 (7572852 bytes)
	Node with least free space         : server1:11211 (59536012 bytes remaining)
	Node with longest uptime           : server1:11211 (667s)
	Pool-wide Hit Ratio                : 0%

I have also pushed a small optimization for parsing the buffer returned from the client (in the text protocol) that operates directly on the receive buffer instead of calling memcached_io_read to read out a single byte.
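To give an idea of what "operating directly on the receive buffer" can look like, here is a small sketch. It is not the code from the patch. The function find_line is a hypothetical helper that locates the end of a text-protocol response line with a single memchr call, instead of pulling bytes out one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not libmemcached code: return the length of the
 * first response line in buf (without the trailing "\r\n"), or -1 if the
 * first '\n' is missing or is not preceded by '\r'. */
long find_line(const char *buf, size_t len)
{
    const char *nl;

    assert(buf != NULL);
    nl = memchr(buf, '\n', len);
    if (nl == NULL || nl == buf || nl[-1] != '\r')
        return -1;
    return (long)(nl - buf - 1);
}
```

The caller can then look at the line in place and only copy the bytes it actually needs.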

When the source has compiler warnings it is hard to spot a newly introduced one, so I went through the source and removed all of them. With a warning-free code base I then changed the compiler flags so that warnings are treated as errors. So the next time someone tries to push a patch introducing a warning, the Solaris build bots will scream out on #memcached on irc.freenode.net.

I also improved the test I had written for the noreply support, and discovered that there was a bug in the implementation. Luckily it was pretty easy to fix, so I have pushed a patch for that as well.

I feel that we have made a lot of progress on libmemcached lately, so I'm looking forward to the next release!!!

http://blogs.sun.com/trond/date/20090121 Wednesday January 21, 2009

noreply support in libmemcached - part two

In my previous post I described the implementation of noreply support in libmemcached, but Brian Aker commented that we should buffer the noreply packets as well to avoid sending a lot of packets over the wire.

Earlier today I pushed a patch that introduces a new function: memcached_return memcached_flush_buffers(memcached_st *).

So what does this mean for you as a developer? If we look at the following example:

  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);
  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_NOREPLY, 1);
  for (int x= 0; x < 100; ++x) {
    char key[10];
    size_t len= sprintf(key, "%d", x);
    memcached_return ret;
    ret= memcached_set(memc, key, len, key, len, 0, 0);
    if (ret != MEMCACHED_SUCCESS && ret != MEMCACHED_BUFFERED)
      abort();
  }
  [ ... Perform application logic ... ]
  
  char* value= memcached_get(memc, key, strlen(key),
                             &value_length, &flags, &rc);

Without the patch I added today, libmemcached would send out the command to the server for each invocation of memcached_set. With the patch, all of the commands are sent to the server in chunks when the user calls memcached_get. Now this doesn't buy us much compared to what we had before I added the noreply support (just that we don't have to parse the reply codes from the server). So let's talk about the new function I added today: memcached_flush_buffers. Let's look at some source:

  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);
  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_NOREPLY, 1);
  for (int x= 0; x < 100; ++x) {
    char key[10];
    size_t len= sprintf(key, "%d", x);
    memcached_return ret;
    ret= memcached_set(memc, key, len, key, len, 0, 0);
    if (ret != MEMCACHED_SUCCESS && ret != MEMCACHED_BUFFERED)
      abort();
  }
  
  if (memcached_flush_buffers(memc) != MEMCACHED_SUCCESS)
    abort();
  

  [ ... Perform application logic ... ]
  
  char* value= memcached_get(memc, key, strlen(key),
                             &value_length, &flags, &rc);

Now memcached_flush_buffers will send all of the commands in the buffers to the memcached servers, so that the servers may start processing them while we perform application logic. When we invoke memcached_get, the memcached server may already have finished executing the buffered commands, and can process the get request immediately.
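The buffering idea can be sketched in a few lines of plain C. This is a toy model and not how libmemcached implements it. The names sendbuf, sendbuf_add, sendbuf_flush and count_sink are hypothetical, the capacity is tiny on purpose, and a callback stands in for the socket write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SENDBUF_CAP 64   /* tiny on purpose; a real buffer is much larger */

typedef void (*sink_fn)(const char *data, size_t len, void *ctx);

struct sendbuf {
    char    data[SENDBUF_CAP];
    size_t  used;
    sink_fn sink;        /* stands in for the write to the socket */
    void   *ctx;
};

/* Example sink that only counts what it is handed */
struct sink_stats { int writes; size_t bytes; };

void count_sink(const char *data, size_t len, void *ctx)
{
    struct sink_stats *st = ctx;
    (void)data;
    st->writes++;
    st->bytes += len;
}

void sendbuf_init(struct sendbuf *b, sink_fn sink, void *ctx)
{
    b->used = 0;
    b->sink = sink;
    b->ctx = ctx;
}

/* Push everything queued so far to the sink (the explicit flush step) */
void sendbuf_flush(struct sendbuf *b)
{
    if (b->used > 0) {
        b->sink(b->data, b->used, b->ctx);
        b->used = 0;
    }
}

/* Queue one command; flush first if it would not fit */
void sendbuf_add(struct sendbuf *b, const char *cmd, size_t len)
{
    assert(len <= SENDBUF_CAP);
    if (b->used + len > SENDBUF_CAP)
        sendbuf_flush(b);
    memcpy(b->data + b->used, cmd, len);
    b->used += len;
}
```

Queueing ten 10-byte commands in this model causes one write of 60 bytes when the seventh command no longer fits. The remaining 40 bytes sit in the buffer until sendbuf_flush is called, which is the situation an explicit flush function addresses.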

http://blogs.sun.com/trond/date/20090120 Tuesday January 20, 2009

noreply support in libmemcached

The memcached textual protocol allows you to execute commands on the memcached server without sending the result back to you. Why would you want to do this? One case could be that you just want to store the item in the cache, and you don't really care if it is successful or not. Up until today you had two modes you could use to insert items in the cache: synchronous (default) and asynchronous.

If we look at the following code snippet:

  for (int x= 0; x < 100; ++x) {
    char key[10];
    size_t len= sprintf(key, "%d", x);
    if (memcached_set(memc, key, len, key, len, 0, 0) != MEMCACHED_SUCCESS)
      abort();
  }
  memcached_quit(memc);

During the execution of the above loop, libmemcached will send the set command to the server and wait for the response from the server before it sends the next command to the server etc. This is not very efficient, so you may use the asynchronous mode to queue up commands. To do that, we modify the code:

  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);
  for (int x= 0; x < 100; ++x) {
    char key[10];
    size_t len= sprintf(key, "%d", x);
    memcached_return ret;
    ret= memcached_set(memc, key, len, key, len, 0, 0);
    if (ret != MEMCACHED_SUCCESS && ret != MEMCACHED_BUFFERED)
      abort();
  }
  memcached_quit(memc);

With this modification, libmemcached will buffer multiple requests to the server and send them to the server when the buffer exceeds a configurable threshold.

Now this sounds pretty neat, doesn't it? Well, there is one problem with the model, and that is if we try to issue a get command. We then need to flush the send buffer and let the server execute all of the commands in it before we can send the get request. If we are lucky the buffer is empty (or just contains a few commands), but it could potentially contain a lot of set commands, and the memcached server could take some time processing all of them.

To solve this problem I implemented use of the noreply command (if you are interested in the gory details, take a look at the patch). So let me modify the source code once more:

  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);
  memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_NOREPLY, 1);
  for (int x= 0; x < 100; ++x) {
    char key[10];
    size_t len= sprintf(key, "%d", x);
    memcached_return ret;
    ret= memcached_set(memc, key, len, key, len, 0, 0);
    if (ret != MEMCACHED_SUCCESS && ret != MEMCACHED_BUFFERED)
      abort();
  }
  memcached_quit(memc);

With this modification libmemcached will send the command to the memcached server immediately, but it also tells the server that it doesn't want a reply from the server so it doesn't wait for a response before it sends the next command. When I implemented this patch I noticed that the binary protocol didn't have support for all of the "noreply" variants of the commands, so I filed a bug report on memcached (and implemented it as well ;-)).
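For reference, in the textual protocol a noreply request is just the normal command line with the word noreply appended. The sketch below formats such a set request. The helper format_set is hypothetical and is not how libmemcached builds its requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a text-protocol set request into out.
 * Returns the number of bytes written, or -1 if out is too small. */
int format_set(char *out, size_t cap, const char *key, unsigned flags,
               long exptime, const char *data, size_t datalen, int noreply)
{
    int n;

    assert(out != NULL && key != NULL && data != NULL);
    /* Command line: set <key> <flags> <exptime> <bytes> [noreply]\r\n */
    n = snprintf(out, cap, "set %s %u %ld %zu%s\r\n",
                 key, flags, exptime, datalen, noreply ? " noreply" : "");
    if (n < 0 || (size_t)n + datalen + 2 > cap)
        return -1;
    /* Data block followed by its own terminating \r\n */
    memcpy(out + n, data, datalen);
    memcpy(out + n + datalen, "\r\n", 2);
    return n + (int)datalen + 2;
}
```

With noreply set, the server does not answer with STORED, so the client has nothing to wait for before sending the next command.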

http://blogs.sun.com/trond/date/20090107 Wednesday January 07, 2009

DTrace probes in libmemcached on Mac OSX?

I have just pushed a patch to libmemcached that allows you to build libmemcached with the static DTrace probes. Please note that I haven't had the time yet to look at the probes it contains (and whether they are useful), but I am going to look into that :-)

So why don't you just go ahead and grab the latest source and try it out?

trond@mac > hg clone http://hg.tangent.org/libmemcached
trond@mac > cd libmemcached
trond@mac > ./config/bootstrap && ./configure --enable-dtrace --enable-64bit

http://blogs.sun.com/trond/date/20081217 Wednesday December 17, 2008

Adding a backdoor to memcached

I have been using libumem and LD_PRELOAD to track down memory allocation problems in a lot of applications over the years, and I just love the runtime linker on Solaris (AFAIK you will find some of the features on Linux as well). The fact that I can load other libraries that replace or add functionality of the program is just great. If you haven't read it already I would encourage you to read the man page for ld.so.1. If you are a developer using Solaris and haven't used libumem to hunt down memory bugs, you should read this blog by Adam Leventhal.

Last week I spent an evening trying to track down yet another memory allocation problem, so my head was spinning with all these crazy ideas when I got to bed. While I was lying there I got this cool idea for how you could use the runtime linker to add a “backdoor” to memcached providing the functionality you've been missing all these years. And by backdoor I actually mean a Solaris door as described in libdoor(3LIB). So how are we going to do this? Well, we are going to create a shared object, add some code to the init section, and let the runtime linker do the rest of the work. Sounds easy, doesn't it? The best part is that it is easy as well :-)

Since I have no idea what all of you are missing from memcached (and I want the blog entry to be simple enough to describe how it works, and not describe internal details of memcached), I'll just create a small example that lets you "bulk-load" data stored in your MySQL database. It should be fairly simple to modify the code to do other things, like dumping all the data stored in the cache to disk and reading it back in.

Enough talk, let's look at the source!!!

backdoor.c:
     1	#include <sys/types.h>
     2	#include <sys/stat.h>
     3	#include <stdio.h>
     4	#include <pthread.h>
     5	#include <door.h>
     6	#include <fcntl.h>
     7	#include <errno.h>
     8	#include <string.h>
     9	#include <stdlib.h>
       
    10	static const char* doorfile = "/var/run/memcached_backdoor";
       
    11	#include "config.h"
    12	#include "memcached.h"
       
    13	static void door_server(void *cookie, char *argp, size_t arg_size, door_desc_t *dp, uint_t n_desc) {
    14	    if (n_desc == 1) {
    15	        /* I prefer to use fgets() instead of read(), so lets open a stream
    16	        ** for the descriptor passed in the dp argument
    17	        */
    18	        FILE *fp = fdopen(dp->d_data.d_desc.d_descriptor, "rb");
    19	        if (fp == NULL) {
    20	            char buffer[1024];
    21	            int len = sprintf(buffer, "Failed to reopen stream: %s", strerror(errno));
    22	            /* Return to the client with an error message */
    23	            door_return(buffer, len, NULL, 0);
    24	        } else {
    25	            char buffer[1024];
    26	            while (fgets(buffer, sizeof (buffer), fp) != NULL) {
    27	                /* buffer contains one line of input with the following format
    28	                ** "key tab value". I don't do any error checking here to avoid
    29	                ** cluttering the code with extra tests to see that the key is
    30	                ** valid, the tab is there etc etc..
    31	                */
    32	                char *key = buffer;
    33	                char *value = strchr(buffer, '\t');
    34	                *value = '\0';
    35	                ++value;
       
    36	                int len = strlen(value);
    37	                value[len - 1] = '\0';
       
    38	                /* Allocate memory to store the item */
    39	                item* it = item_alloc(key, strlen(key), 0, 0, strlen(value) + 2);
    40	                if (it != NULL) {
    41	                    conn c;
    42	                    if (settings.verbose) {
    43	                        printf("Key: [%s] value [%s]\n", key, value);
    44	                    }
    45	                    /* Insert the value into the item. The memcached server
    46	                    ** stores the data with a terminating \r\n so we need
    47	                    ** to add those as well
    48	                    */
    49	                    memcpy(ITEM_data(it), value, len);
    50	                    *(ITEM_data(it) + it->nbytes - 2) = '\r';
    51	                    *(ITEM_data(it) + it->nbytes - 1) = '\n';
    52	                    if (store_item(it, NREAD_SET, &c) != 1) {
    53	                        char msg[1024];
    54	                        sprintf(msg, "Failed to store %s\n", key);
    55	                        door_return(msg, strlen(msg), NULL, 0);
    56	                    }
    57	                    /* Release our reference */
    58	                    item_remove(it);
    59	                }
    60	            }
    61	            (void) fclose(fp);
    62	        }
    63	    }
    64	    /* Return the control back to the client */
    65	    door_return(NULL, 0, NULL, 0);
    66	}
       
    67	void init(void) {
    68	    /* Create a filesystem entry for our door so that clients may find us */
    69	    mode_t mask = umask(0);
    70	    int fd = open(doorfile, O_CREAT | O_TRUNC, 0444);
    71	    (void) umask(mask);
    72	    if (fd < 0) {
    73	        perror("Failed to open door");
    74	    } else {
    75	        (void) close(fd);
    76	        /* Detach any existing services from the file */
    77	        (void) fdetach(doorfile);
    78	        /* Create a door id for our door function */
    79	        int did = door_create(door_server, NULL, DOOR_NO_CANCEL);
    80	        if (did > 0) {
    81	            /* Associate our door with our door id */
    82	            if (fattach(did, doorfile) < 0) {
    83	                perror("fattach door failed");
    84	            }
    85	        } else {
    86	            (void) perror("door_create failed");
    87	        }
    88	    }
    89	}
       
    90	#pragma init (init)
        

So how does this work? First, look at line 90. The #pragma init(init) instructs the compiler to add the function named init to the init section. This means that the function init is called during the initialization of the object. We need a filesystem entry for our door for "clients" to be able to use it, so in lines 68-76 we create one. Since there might be other services already attached to the file we want to use as a door, we call fdetach in line 77 to remove all such associations. Line 79 creates a door identifier associated with the function named door_server, and line 82 attaches that door to the door file.
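The #pragma init directive is specific to the Sun Studio compiler. If you want to experiment with the same trick using GCC or Clang, the closest equivalent I know of is the constructor function attribute. The sketch below only sets a flag where the real init() would create the door. The names backdoor_ready and backdoor_is_ready are hypothetical.

```c
#include <assert.h>

static int backdoor_ready = 0;

/* GCC and Clang run functions marked with this attribute when the object
 * is loaded, before main() starts. It plays the role that
 * #pragma init(init) plays in backdoor.c. */
__attribute__((constructor))
static void init(void)
{
    backdoor_ready = 1;   /* the real init() would create the door here */
}

/* Code that runs after load can rely on the constructor having run */
int backdoor_is_ready(void)
{
    assert(backdoor_ready == 1);
    return backdoor_ready;
}
```

The function runs without anybody calling it explicitly, which is exactly what makes the LD_PRELOAD approach work.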

So what does the door_server function do? This function is called whenever someone invokes door_call on our door. It expects dp->d_data.d_desc.d_descriptor to contain a file descriptor from which it reads input data and stores it as items in our cache. The server expects the data on the input stream to be in the following format:

key tab value
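The door_server function above skips all error checking on this format to keep the listing short. If you want to be a bit more careful, a sketch of the splitting step with the missing checks could look like this. The helper split_line is hypothetical and is not part of backdoor.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "key<TAB>value\n" in place.
 * Returns 0 and sets *key and *value on success, or -1 if the tab is
 * missing or the key is empty. */
int split_line(char *line, char **key, char **value)
{
    char *tab;

    assert(line != NULL && key != NULL && value != NULL);
    tab = strchr(line, '\t');
    if (tab == NULL || tab == line)
        return -1;
    *tab = '\0';                                /* terminate the key */
    *key = line;
    *value = tab + 1;
    (*value)[strcspn(*value, "\r\n")] = '\0';   /* strip the line ending */
    return 0;
}
```

A line without a tab is then rejected instead of making strchr return NULL and crashing the server on the next dereference.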
        

Well, we should be ready to compile and start our memcached server:

trond@razor:> cc -o backdoor.so -mt -m64 -G -g -KPIC backdoor.c
trond@razor:> pfexec ksh -c "LD_PRELOAD=./backdoor.so ./memcached -u noaccess" &
trond@razor:> ls -l /var/run/memcached_backdoor
Dr--r--r--   1 root     root           0 Dec 17 15:21 /var/run/memcached_backdoor
        

The capital D in the filesystem listing identifies this as a door. See man ls for more details.

You can telnet to port 11211 and try sending commands to the server if you like, or you could execute the following command:

trond@razor:> echo stats | nc localhost 11211
STAT pid 15396
STAT uptime 4
STAT time 1229524715
STAT version 1.3.1
STAT pointer_size 32
STAT rusage_user 0.009059
STAT rusage_system 0.029021
STAT curr_connections 6
STAT total_connections 7
STAT connection_structures 7
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT bytes_read 6
STAT bytes_written 0
STAT limit_maxbytes 67108864
STAT threads 5
STAT bytes 0
STAT curr_items 0
STAT total_items 0
STAT evictions 0
END
        

Now that we have the server set up, let's create a client application that uses the door. I want to get my data from a MySQL database, so I want the client program to process the data from standard input:

client.c
     1	#include <stdio.h>
     2	#include <door.h>
     3	#include <sys/types.h>
     4	#include <fcntl.h>
     5	#include <unistd.h>
     6	#include <errno.h>
     7	#include <stdlib.h>
     8	#include <sys/mman.h>

     9	int main(int argc, char** argv) {
    10	    int doorfd = open("/var/run/memcached_backdoor", O_RDONLY);
    11	    if (doorfd == -1) {
    12	        perror("Failed to open door file");
    13	        return EXIT_FAILURE;
    14	    }

    15	    door_desc_t descr;
    16	    descr.d_data.d_desc.d_descriptor = STDIN_FILENO;
    17	    descr.d_attributes = DOOR_DESCRIPTOR;

    18	    door_arg_t door_args = {
    19	        .desc_ptr = &descr,
    20	        .desc_num = 1
    21	    };

    22	    if (door_call(doorfd, &door_args) == -1) {
    23	        perror("door_call failed");
    24	    } else if (door_args.data_size > 0) {
    25	        write(STDOUT_FILENO, door_args.data_ptr, door_args.data_size);
    26	        if (munmap(door_args.rbuf, door_args.rsize) == -1) {
    27	            perror("Failed to unmap memory");
    28	        }
    29	    }

    30	    return (EXIT_SUCCESS);
    31	}
        

This program should be easy to understand without any comments, but I would like to point out a few lines. Line 16 stores the standard input file descriptor of this process in the door descriptor, which is passed through the door during the door_call in line 22. The kernel makes sure that the descriptor is available as a valid file descriptor in memcached when it invokes my door_server function.

Now it's time for us to compile the client:

trond@razor:> cc -o client client.c
        

Let's use the data stored in our database to test the thing:

trond@razor:>  /usr/mysql/bin/mysql -u root -D memcached \
   -e "SELECT CONCAT('user_', id), bio FROM user" --skip-column-names \
   |  ./client
        

Let's look at the stats and try to get one of the objects to verify that it works:

trond@razor:> echo stats | nc localhost 11211
STAT pid 15396
STAT uptime 190
STAT time 1229524901
STAT version 1.3.1
STAT pointer_size 32
STAT rusage_user 0.014419
STAT rusage_system 0.038423
STAT curr_connections 6
STAT total_connections 8
STAT connection_structures 7
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT bytes_read 12
STAT bytes_written 463
STAT limit_maxbytes 67108864
STAT threads 5
STAT bytes 1019
STAT curr_items 14
STAT total_items 15
STAT evictions 0
END
trond@razor:> echo get user_1 | nc localhost 11211
VALUE user_1 0 61
Trond spends his evenings in front of the computer... bla bla
END
        

Please note that I don't think this is something you should do in a real-world scenario, but rather something you could do in the development phase of your application. (Or it can be used to preload "mocking" objects for testing various parts of memcached ;)

http://blogs.sun.com/trond/date/20081204 Thursday December 04, 2008

Alignment bug in the binary protocol

When we implemented support for the binary protocol in the server we assigned the protocol header to a struct with the following code snippet:

protocol_binary_request_header* req;
req = (protocol_binary_request_header*)c->rcurr;
c->binary_header = *req;

The problem is that some hardware requires a certain alignment for various datatypes, and SPARC is one such platform. If you don't pipeline commands, c->rcurr points to the beginning of a buffer allocated with malloc(), and malloc() always returns pointers to properly aligned memory. The problem occurs if you pipeline multiple commands to the server and the size of one of them isn't a multiple of the structure alignment (size % 8 != 0): the header of the next command then starts at an address that isn't properly aligned. I discovered this bug the other day when I experimented with the Spy memcached client by Dustin Sallings.
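One common way to avoid this class of bug is to copy the bytes with memcpy instead of casting the pointer and dereferencing it. The sketch below illustrates the idea with a stand-in structure. It is not necessarily what the actual fix does, and demo_header and read_header are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for protocol_binary_request_header, for illustration only */
struct demo_header {
    uint8_t  magic;
    uint8_t  opcode;
    uint16_t keylen;
    uint32_t bodylen;
    uint64_t cas;
};

/* Copy a header out of a byte stream at any offset. memcpy makes no
 * alignment assumptions about src, unlike *(struct demo_header *)src. */
void read_header(const char *src, struct demo_header *dst)
{
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, sizeof(*dst));
}
```

Reading from an odd offset in the buffer is then safe on SPARC as well as on platforms that tolerate misaligned loads.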

Luckily the bug was pretty easy to track down and fix: http://github.com/trondn/memcached/commit/4c3797b1d4fe9088ef96c9f7d571b1221cb40fef

Now I just need to figure out how to write a Perl test for the fix :)

http://blogs.sun.com/trond/date/20081201 Monday December 01, 2008

Displaying the 10 most popular keys in your memcached instance?

If you run your memcached server on Solaris you can easily display the 10 most popular keys in your memcached server by using dtrace. The following little script will display the 10 most popular keys every 10 seconds.

#! /usr/bin/ksh 
pid=`pgrep -x memcached` 
if [ -z "${pid}" ] 
then 
   echo memcached not running 
   exit 1 
fi 

file /proc/${pid}/path/a.out | grep "ELF 64" > /dev/null 2>&1 
if [ $? -ne 0 ] 
then 
  mode="-32" 
else 
  mode="-64" 
fi 

/usr/sbin/dtrace ${mode} -n ' 
#pragma D option quiet 

memcached'${pid}':::assoc-find 
{ 
    @assK[copyinstr(arg0)] = count(); 
} 

tick-10s 
{ 
    printf("Top 10 keys\n"); 
    trunc(@assK, 10); 
    printa(@assK); 
    trunc(@assK, 0); 
} 

END { 
    trunc(@assK, 10); 
    printa(@assK); 
    trunc(@assK, 0); 
} 
'

You may find the probe name assoc-find strange, but that is a function that is called for all commands when the memcached server tries to locate an item in the cache. By using that probe, we can count all get/add/set/replace/delete with one single probe :-)

http://blogs.sun.com/trond/date/20081126 Wednesday November 26, 2008

Not using CAS? Disable it and save 8 bytes per item in your cache

The memcached server stores an 8-byte unique id for each item that is stored (or updated), so that you can ask the server to perform an operation on an item if and only if the item in the cache still has the id you provide. If your application doesn't use this feature, you are wasting 8 bytes per object. You may think that eight bytes is not much on modern hardware (even my laptop has 4GB of memory), but the more items you put in the cache the more memory you "waste".

Earlier today I created a patch that allows you to remove the allocation of these bytes if you start memcached with -C. You might ask yourself why I didn't just make this a compile-time setting, and there are two reasons for that:

  1. Compile-time settings are a pain if you would like to ship a binary, because then you need to ship two versions and the users have to select the correct one.
  2. It would cause problems for the work we are doing with the storage engines. If we have two different item structures depending on whether you compile with or without CAS support, you also need to create two different engine implementations.

If you look at the diffs in my github repository you will see that it is actually a pretty simple change, but if you don't want to read diffs I'll just highlight the important changes here. The old item structure looked like:

typedef struct _stritem {
    struct _stritem *next;
    struct _stritem *prev;
    struct _stritem *h_next; /* hash chain next */
    rel_time_t time; /* least recent access */
    rel_time_t exptime; /* expire time */
    int nbytes; /* size of data */
    unsigned short refcount;
    uint8_t nsuffix; /* length of flags-and-length string */
    uint8_t it_flags; /* ITEM_* above */
    uint8_t slabs_clsid;/* which slab class we're in */
    uint8_t nkey; /* key length, w/terminating null and padding */
    uint64_t cas_id; /* the CAS identifier */
    void * end[];
    /* then null-terminated key */
    /* then " flags length\r\n" (no terminating null) */
    /* then data with terminating \r\n (no terminating null; it's binary!) */
} item;

What I did was to move the cas_id into the variable part of the structure, so that the struct now looks like:

typedef struct _stritem {
    struct _stritem *next;
    struct _stritem *prev;
    struct _stritem *h_next; /* hash chain next */
    rel_time_t time; /* least recent access */
    rel_time_t exptime; /* expire time */
    int nbytes; /* size of data */
    unsigned short refcount;
    uint8_t nsuffix; /* length of flags-and-length string */
    uint8_t it_flags; /* ITEM_* above */
    uint8_t slabs_clsid;/* which slab class we're in */
    uint8_t nkey; /* key length, w/terminating null and padding */
    void * end[];
    /* if it_flags & ITEM_CAS we have 8 bytes CAS */
    /* then null-terminated key */
    /* then " flags length\r\n" (no terminating null) */
    /* then data with terminating \r\n (no terminating null; it's binary!) */
} item;

This means that we don't have a cas_id member anymore, so I created two "functions" to get and set the value:

#define ITEM_get_cas(i) ((uint64_t)(((i)->it_flags & ITEM_CAS) ? \
                                    *(uint64_t*)&((i)->end[0]) : 0x0))
#define ITEM_set_cas(i,v) { if ((i)->it_flags & ITEM_CAS) { \
                          *(uint64_t*)&((i)->end[0]) = v; } }

The next thing to do was to replace all usage of the cas_id member with these accessor "functions". Since the old offset of the variable data in the structure is no longer valid, I had to update the macros we use to get the other variable parts of the structure:

#define ITEM_key(item) (((char*)&((item)->end[0])) \
         + (((item)->it_flags & ITEM_CAS) ? sizeof(uint64_t) : 0))
 
#define ITEM_suffix(item) ((char*) &((item)->end[0]) + (item)->nkey + 1 \
         + (((item)->it_flags & ITEM_CAS) ? sizeof(uint64_t) : 0))
 
#define ITEM_data(item) ((char*) &((item)->end[0]) + (item)->nkey + 1 \
         + (item)->nsuffix \
         + (((item)->it_flags & ITEM_CAS) ? sizeof(uint64_t) : 0))
 
#define ITEM_ntotal(item) (sizeof(struct _stritem) + (item)->nkey + 1 \
         + (item)->nsuffix + (item)->nbytes \
         + (((item)->it_flags & ITEM_CAS) ? sizeof(uint64_t) : 0))

And that was basically it :-)
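If you want to play with the layout outside of memcached, here is a stripped-down sketch of the same idea. It is not the memcached code. The structure mini_item and its helpers are hypothetical, and the accessors use memcpy so that they work regardless of how the variable part happens to be aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MINI_CAS 0x01   /* stand-in for ITEM_CAS */

struct mini_item {
    uint8_t flags;
    uint8_t nkey;
    char    end[];      /* [8-byte CAS if MINI_CAS] then key + '\0' */
};

/* Number of bytes the optional CAS occupies in front of the key */
size_t mini_cas_len(const struct mini_item *it)
{
    return (it->flags & MINI_CAS) ? sizeof(uint64_t) : 0;
}

/* Total allocation size for an item with the given key length */
size_t mini_total(uint8_t nkey, int use_cas)
{
    return sizeof(struct mini_item)
        + (use_cas ? sizeof(uint64_t) : 0) + nkey + 1;
}

struct mini_item *mini_alloc(const char *key, int use_cas)
{
    struct mini_item *it;
    size_t nkey;

    assert(key != NULL);
    nkey = strlen(key);
    if (nkey > 250)     /* memcached limits keys to 250 bytes */
        return NULL;
    it = calloc(1, mini_total((uint8_t)nkey, use_cas));
    if (it == NULL)
        return NULL;
    it->flags = use_cas ? MINI_CAS : 0;
    it->nkey = (uint8_t)nkey;
    memcpy(it->end + mini_cas_len(it), key, nkey + 1);
    return it;
}

char *mini_key(struct mini_item *it)
{
    return it->end + mini_cas_len(it);
}

uint64_t mini_get_cas(const struct mini_item *it)
{
    uint64_t v = 0;
    if (it->flags & MINI_CAS)
        memcpy(&v, it->end, sizeof v);
    return v;
}

void mini_set_cas(struct mini_item *it, uint64_t v)
{
    if (it->flags & MINI_CAS)
        memcpy(it->end, &v, sizeof v);
}
```

An item allocated without the flag is exactly 8 bytes smaller, reads back a CAS of 0, and silently ignores attempts to set one, which mirrors the behavior of the two macros above.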

http://blogs.sun.com/trond/date/20081110 Monday November 10, 2008

Engine interface

I have been working on the engine interface lately, and I am happy to say that I have just uploaded the document to the memcached wiki at http://code.google.com/p/memcached/wiki/EngineInterface.

Please go ahead and start sending me (or the memcached mailing list) comments on our proposal. In the meantime I will go ahead and create another example engine as a proof of concept.

http://blogs.sun.com/trond/date/20080911 Thursday September 11, 2008

Memcached UDF for Java DB on OpenSolaris

I have been talking about the UDFs for MySQL during lunch, and Knut Anders bragged about being able to do that in Java DB as well.

So what could I do? I just had to challenge him ;-)

As always, Knut Anders delivers! Just check out the blog entry he wrote!

Knut Anders, I'm impressed! This looks like a clean solution! Keep up the good work! (By the way, you should use SMF to control the memcached server. Check out my blog post for the details.)

http://blogs.sun.com/trond/date/20080902 Tuesday September 02, 2008

Memcached UDF for MySQL on OpenSolaris

I have been hearing about the Memcached UDFs for MySQL for a while now, so I decided to spend some time playing with them. Being the geek I am, playing means getting my hands dirty with the code, so I cloned the source repository from: http://hg.tangent.org/memcached_functions_mysql/.

With the source code and a cup of coffee available I ran:

trond@opensolaris:compile> ./config/bootstrap
trond@opensolaris:compile> ./configure CC=cc --with-mysql=/usr/mysql/bin/mysql_config
[... cut ...]
checking for DEPS... configure: error: Package requirements (libmemcached >= 0.17) were not met:
[... cut ...]
    

I know that there is a version of libmemcached in OpenSolaris (I was involved during the integration), so the version is either too old or configure isn't picking it up.

trond@opensolaris:compile> pkginfo -l SUNWlibmemcached | grep DESC
      DESC:  memcached C API 0.16
    

It turns out that the version of libmemcached integrated in OpenSolaris is too old, so I filed 6743510 to get it upgraded.

I have been contributing to libmemcached (adding support for the binary protocol), so I have a "fresh-from-the-oven" version installed in /opt/memcached. All I needed to do was to get the configure script to pick it up... There were at least two different roads I could take:

  • I could just comment out the test in configure.ac, regenerate the configure script and run it as:
    trond@opensolaris:compile> ./configure CC=cc CFLAGS=-I/opt/memcached/include \
           LDFLAGS="-L/opt/memcached/lib -R/opt/memcached/lib" \
           --with-mysql=/usr/mysql/bin/mysql_config
          
  • I could do it the "clean" way and be nice to the community and add a: --with-libmemcached option

For unknown reasons I decided to do it the clean way and submitted a patch back to the project (use hg import if you would like to use it before it is included in the upstream repository).

With the lib installed as /opt/memcached/lib/libmemcached_functions_mysql.so, I tried to figure out how to load it in MySQL. The documentation told me to copy the library to /usr/local/mysql/lib/mysql/plugins/, but my OpenSolaris delivers MySQL in /usr/mysql so I tried to copy the file to /usr/mysql/lib/mysql/plugins/ without success.

A quick truss on mysqld revealed that it did not search any other directories than ld does. Being a MySQL novice I didn't know if my setup was mis-configured or not, so I returned to the MySQL documentation and found the following page. I don't want to expose the library to everything on my system, so creating a link from /usr/lib or using crle was out of the question. Instead I modified /lib/svc/method/mysql and exported LD_LIBRARY_PATH.

I am now able to use the Memcached UDF for MySQL on my OpenSolaris box, but I don't like the LD_LIBRARY_PATH hack, so I am looking forward to the MySQL 5.1 release. Perhaps I should try to compile a version myself and test it out.

http://blogs.sun.com/trond/date/20080616 Monday June 16, 2008

Open position in the Memcached team

The last few months have been an interesting period for me. I used to work in the Database Technology Group, and as you may guess that group was affected by the acquisition of MySQL. The merge of the two teams is now complete, and we are now all located in the Database Group.

During this merge it was natural to look at the staffing in the different projects, and I am extremely glad to see that the project I am working on gets additional headcount.

I am therefore extremely excited that we have the following open positions in my team:

http://blogs.sun.com/trond/date/20080609 Monday June 09, 2008

Memcached source code

Finding the source code for Memcached can be a challenge. Right now there are multiple source repositories, so you might want to look into several of them. You will find a link to a Subversion repository from the download section on the project homepage, but the ongoing development is performed in various git repositories.

I have created my own git repository at http://github.com/trondn/memcached/commits/binary where I do my development. Other repositories you might find useful are:

http://blogs.sun.com/trond/date/20080529 Thursday May 29, 2008

Support for Memcached

During the webinar "Highly scalable solutions with MySQL and Memcached" Ivan Zoratti announced that Sun will support Memcached as a part of MySQL Enterprise Support.

You may want to check out the white paper "Designing and Implementing Scalable Applications with Memcached and MySQL".

There is another webinar later on today (Designing and Implementing Scalable Applications with Memcached and MySQL), so if you missed out on yesterday's webinar (or liked it) you should sign up for the event!



This is a personal weblog, I do not speak for my employer.