keith thompson

tales from the darkstar


Monday Jul 27, 2009

Of protocols and transports

As mentioned in my previous post I worked on the Darkchat server which was going to be demoed and used at JavaOne 2009.  The server would be on the public network so that attendees could access it during the convention. For this to work we needed to to be able to limit logins to authorized clients (JavaOne attendees). The details of creating and distributing account usernames and passwords were being worked on and early-on it looked like we would have a way for attendees to register for an account, and for them to create a username and password (the final solution was way cooler than this scheme, perhaps a story for later). The default protocol and transport that comes with Darkstar provides very little security for client logins. The Simple SGS protocol sends usernames and passwords in the clear over TCP. I was concerned that passwords could be snooped (humm.. a conference full of computer geeks, what were the chances? - for answer see below1) and though this was only a demo, what if someone used an important password thinking that it was being protected. In looking at the options and the time that was available two solutions seemed possible, to create an SSL transport to replace TCP, or create a new protocol that included some level encryption. To me, creating a new protocol seemed the easier of the two.

In the Darkstar server are two services that handle the bulk of the communication duties with the client, the client session service and the channel service. These two services implement the most of the ClientSession and Channel semantics that the application sees. They are built on top of a protocol layer. This layer determines the format and content of the messages exchanged between the client and the server. The protocol layer is, in turn, built on top of a transport layer. The transport layer deals with the network connection with the client.

The protocol layer is defined by the interfaces in the com.sun.sgs.protocol package in the Darkstar Server Internal API sub-project.  The protocol implementation is configurable at server runtime through the session service property:

com.sun.sgs.impl.service.session.protocol.acceptor

The property must refer to a class which implements the ProtocolAcceptor interface. The implementation class must also have a public constructor which takes the following arguments:

java.util.Properties
com.sun.sgs.kernel.ComponentRegistry
com.sun.sgs.service.TransactionProxy

The ComponentRegistry and TransactionProxy objects provide access to the components within the Darkstar kernel such as the data service or the task scheduler.

The ProtocolAcceptor object is responsible for initial client connection and login, including authentication. The client session service registers its interest with the acceptor by calling accept(), passing in a ProtocolListener object. When a successful client connection and login is made the acceptor should call ProtocolListener.newLogin() passing in the authenticated Identity of the client, a SessionProtocol object, and a completion handler. It is the SessionProtocol object which implements the outgoing messages needed by the session and channel services. The implementation must be able to handle  sending a session and channel message, a channel join and leave message, and be able to disconnect the client. Exactly how these messages are sent is up to the SessionProtocol implementation and the client. When done processing the new login, the ProtocolListener will call the completion handler with a SessionProtocolHandler object. This object handles the incoming session and channel messages as well as logout and disconnect. It is the responsibility of the protocol implementation to call the SessionProtocolHandler object as needed. Piece of cake.

Now the protocol layer needs only be responsible for the when and what of the messages exchanged between the server and the client. It does not need to worry about the underlying network transport. It can, but it does not need to. There is a set of interfaces in the internal APIs that define a network transport layer. Through these APIs the protocol layer can use existing implementations of network transports and transport providers can code against the API to create new transports. But I will leave the discussion of transports for another post.

There is one item that deserves extra explanation. One of the methods that the protocol acceptor must implement is getDescriptor(). This method returns a ProtocolDescriptor object. The information in this object is used when a client connection must be redirected to another node. In particular, the getConnectionData() returns an array of data that contains the protocol and transport specific information needed by the client to connect to the node described by the descriptor. For example the byte array might contain the IP address and port number needed for a TCP connection. The server does not look at the data, but the client protocol must be able to interpret it. In general, if using the TCP transport supplied with Darkstar, getConnectionData() needs simply to return the data from the transport descriptor's getConnectionData() method. (see either the SGS simple protocol, or the challenge-response descriptor implementations).

In summary, to build a custom protocol you need to do the following:
  • Create an implementation of ProtocolAcceptor to handle client logins
  • Call ProtocolListener.newLogin() for each successful login
  • Create an implementation of SessionProtocol which forms outgoing messages
  • Call SessionProtocolHandler methods for incoming messages
  • Set the ...protocol.acceptor property to the ProtocolAcceptor implementation
  • Create the client-side protocol handling code

For what Darkchat needed, the above steps were rendered trivial by simply copying the SGS simple protocol sources (found in the com.sun.sgs.impl.protocol.simple package) and modifying them to add in a couple of new protocol messages. The new protocol implements a challenge-response login exchange where the server sends a randomly generated challenge to the client and the client responds with their password encrypted using the challenge. The new messages were the client's request for challenge, and the server's message containing the challenge. The new protocol can be found in the Darkchat server sub-project in the com.sun.sgs.impl.protocol.challengeresponse package.

A similar tact was used to create the client-side protocol handler. The client challenge-response code was based on the sources from the sgs-java-client project. Unfortunately the client code in Darkchat is a little hard to follow for two reasons. First, the sgs-java-client code is hard to follow in the first place. Second, there is an extra two layers in the client side stack, one to allow replacement of the underlying Java SE client with one for Java ME, and the second to handle the conversion from Java to JavaFX. The Java ME client was never completed and does not exist in the source tree. Maybe something for the future!


1Shortly after the JavaOne Keynote the Darkchat gang scurried back to our hotel to get things going. The first step was to send out the email which contained the JNLP script to launch the Darkchat client to all of the attendees. Ant had built a nifty little program which sent an email out every 20-30 seconds or so. We quickly realized it would take a couple of hours to send roughly 8,000 emails so we all sat back and waited for our first login. It was like watching grass grow on a soccer field where a game is being played. In only a few minutes we had our first guinea pig, GOAL! There was much rejoicing. We hung around chatting with the folks entering Darkchat and after about a half hour in, a person noted that he could see everyones messages and was surprised that nothing was encrypted. When queried, he said that he was using (some program I don't remember the name of) to snoop the packets in and out of the server! I informed him that only the login passwords were encrypted and smiled to myself....

Friday Jul 17, 2009

I'm baaack.....

It has been awhile since my last note but I've been busy. In fact, earlier this year I found myself in a very familiar situation... working on a demo. We had three months to do it, but this time it was not going to be a demo shown at a booth under a nice controlled environment like Snowman at GDC, nooooo... this time the server was going to be on the internet and all 8,000 (or whatever) attendees of JavaOne 2009 would be handed a client. The origin of this insanity came from the JavaFX team. We (the Project Darkstar team) had innocently contacted them, inquiring about the possibility of using JavaFX clients with the Darkstar stack. When we got on the phone they started talking about an idea they had about doing a demo for JavaOne using Darkstar, what timing! (I often wonder what would have happened if we hadn't called them?) Anyway, Anthony Rogers and Moises Chicharro (the JavaFX dudes) were obviously crazy enough to be convinced it was possible and apparently so was I. I talked myself and by bosses into temporarily shelving my current work and have at it.

The next couple of months was... interesting, and for the most part fun. I built the server, some client libraries, and tools while Ant and Mo built the JavaFX based client. The result was Darkchat, a multiplayer visual chat environment. The visual client was brilliant (kudos to Ant and Mo) and fun to use. It, along with the server did a great job at JavaOne, in spite of the last code change coming in at 2 am the morning of the Keynote where Darkchat was to be announced and demoed in front of thousands. But I'm getting ahead of myself here. (BTW if you watch the video of the Keynote you can see (starting at 2:40) our own Chris Melissinos give a live demo of Darkchat! Ant, Mo, and I were in the second row holding our collective breaths.)

As with building anything you learn things along the way, tricks, patterns, traps to avoid, etc.. like always point the nail-gun away from your body. It was no different building the Darkchat server (except for the nail-gun thing). Since I already had experience working on a server with Snowman I thought I would have a good handle on what was needed for Darkchat. lol! A common answer, that I myself often give to the question "what is the best way to do x with Darkstar?" is, "it depends on the application". Well, Darkchat was different enough from Snowman that a whole new set of tricks had to be learned. I'm planning on sharing some on those in forthcoming posts so stay tuned.

Monday Oct 20, 2008

I can write that service in two lines...

Folks on the forums have requested facilities to support iterating over objects, primarily for debugging, and for obtaining objects by object ID. This capability is not part of the standard API available to a Darkstar application, however it is possible to write a manager to provide this functionally. It turns out that such a manager is quite simple, and can provide a straightforward example of how to create a manager. This note describes how to write, install, and access a new manager and its backing service. For a more rich (read complex) example, check out Seth's blog on writing services.


In Darkstar, managers provide many of the application level APIs. By creating a new manager one can extend the API set with functions that application could not otherwise implement. Underneath the managers, not directly visible to the application, are corresponding services. Services provide the functions needed by their managers and have access to the other services in Darkstar that applications do not have access to.

The example manager described below will be built on a service which will use the DataService to provide the desired functionality.

A Darkstar service must implement the com.sun.sgs.service.Service interface and have a specific constructor. Ignoring Javadoc, import and package statements, the class declaration and constructor look like this:

public class ObjectAccessService implements Service {
    private final DataService dataService;

    public ObjectAccessService(Properties properties,
                               ComponentRegistry systemRegistry,
                               TransactionProxy proxy) {
        dataService = proxy.getService(DataService.class);

    }

The three arguments to the constructor are the properties, system component registry, and the transaction proxy. The example service does not require any properties or system components but it does require the data service which is accessed through the transaction proxy. The two new service methods, createReferenceForId() and nextObjectId() call through to the data service and provide the functions needed by the manager:

    public ManagedReference<?> createReferenceForId(BigInteger id) {
        return dataService.createReferenceForId(id);
    }

    public BigInteger nextObjectId(BigInteger objectId) {
        return dataService.nextObjectId(objectId);
    }


The three methods defined by the Service interface, getName(), ready(), and shutdown() are described in Service's documentation and for this service require fairly straightforward implementations:

    public String getName() {
        return ObjectAccessService.class.getName();
    }

    public void ready() throws Exception {
        // ignore
    }

    public boolean shutdown() {
        return true;
    }

}

The manager does not need to implement any particular interface and in this case the implementation is even more compact than the service:

public class ObjectAccessManager {
    private final ObjectAccessService service;

    public ObjectAccessManager(ObjectAccessService service) {
        this.service = service;
    }


When a manager is created it is passed a reference to its backing service, in this case the ObjectAccessService. The two new calls available to the application are createReferenceForId() and nextObjectId(). Both methods call directly to the underlying service (and then onto the data service):

    public ManagedReference<?> createReferenceForId(BigInteger id) {
        return service.createReferenceForId(id);
   }

   public BigInteger nextObjectId(BigInteger objectId) {
       return service.nextObjectId(objectId);
   }

}

Note that since both methods follow a direct path to the data service, their definitions can taken from the DataService interface.

In order to make the new manager available to the application, both it and its service needs to be installed in the Darkstar server. This is done by adding the following two items to the server's configuration file:

com.sun.sgs.services=mypackage.ObjectAccessService
com.sun.sgs.managers=mypackage.ObjectAccessManager


Obviously the manager and server classes need to be in the class-path of the server.

With the manager installed it is a simple matter for the application to access it:

ObjectAccessManager accessManger = AppContext.getManager(ObjectAccessManager.class);

Though this is an extremely simple example, it does provide a means to iterate over the objects in the system, by their identifier, and given an identifier, get the object. It also demonstrates the basic manager-service operation in Darkstar.

Friday Oct 03, 2008

Two wrongs don't make a right, ...

...but five rights will get you back on the highway...1 One of the givens of Project Darkstar application development that I've picked up is that if you know you are going to modify a ManagedObject you should inform the server as early as possible. A related belief is that turning off modification detection should be a good thing. (See Seth's blog for additional comments) With these two beliefs in mind we wanted to verify them using the Project Snowman game server. Seth has posted a good description of the ins and outs of managed objects and serialization, and why the two claims above should be true. What I want to cover here is how the theory was applied to a "real", OK "demo" application, how it was tested, and what were the results.

With the goal of being able to turn off Darkstar's modification detection, the first order of business is to make sure all modified managed objects (MOs) are marked for update before the task ends. This is done in one of two ways, if the MO is in hand, you can make a call to DataManaged.markForUpdate(). If you are getting a MO from a ManagedReference you can call ManageReference.getForUpdate()instead of get(). Both of these calls will inform the server that these obects will be modified and shoulw be made persistent when the task ends. Currently the best way (only way) to go about adding these calls to your application is via code inspection.  With the Snowman server this appeared easy because of the small number of classes. I suppose there are many ways to approach the source but what made some sense was to take a top down look, starting with the AppListener and ClientSessionListener classes. After a generous sprinkling of markForUpdate() calls, the next step is to test whether everything has been properly marked. Luckily, Darkstar provides a little help with this. The modification detection code makes a log entry each time the check finds an object that has been modified (but not marked). You can enable this logging by adding the following to the server's logging properties:

com.sun.sgs.impl.service.data.DataServiceImpl.detect.modifications.level = ALL

The logging messages includes the class of the object that was found to be modified buy not much more. Also, the logging is made in the modification detection code (as you would expect) which is happening well after your task has ended. So the logs tell you that an object has been modified, but not where or by whom. This made finding that one last pesky state modification a challenge.2 You also have to be careful that all code paths are executed. The snowman server is fairly simple, but in a much more complex application this could be quite difficult to verify.

With the code properly marked for update it was time to test it with the detection turned off. This is done by setting the following server property:

com.sun.sgs.impl.service.data.DataServiceImpl.detect.modifications=false

In my previous post I mentioned using the profile listener SnapshotProfileListener. Though useful, the data form this profiler is fairly course grained. To get a more detailed view into the server I used the AggregateProfileListener which produces data on task execution over the life of the server. As before the listener can is enabled by adding this to the server's properties:

com.sun.sgs.impl.kernel.profile.listeners=\
com.sun.sgs.impl.profile.listener.AggregateProfileListener:\
com.sun.sgs.impl.profile.listener.SnapshotProfileListener

Note that this property enables two listeners. The client simulator displays output from the SnapshotProfileListener in its control window. The output from AggregateProfileListener can be viewed by connecting to port 43005 via telnet:

telnet server.host 43005

A portion of its output looks like this:

TaskCounts:
TotalTasks=4000162  AvgLength=0.37ms
Transactional=2194463  AvgLength=0.60ms
Successful=3875770  AvgLength=0.38ms  AvgStartDelay=12.47ms  AvgRetries=1.03

The interesting number here are the task lengths. My test procedure was to run 1000 simulated clients against the server and wait for one million, and then four million tasks to execute. I figured that this was enough for the JVM to be "warmed-up" and any server initialization to be minimized in the averages. I ran the server with the modification detection enabled (the default) and then turned-off. At the 1 and 4 million task marks I checked the task average lengths expecting to see an improvement (shorter task times) when modification detection was disabled.3 To my surprise (the first of several) there was no significant difference. The surprise was short lived when I realized that in the snowman server it is rare for an object to be accessed but not modified. So in the default case, very few checks were wasted and removing those extra checks (by marking the objects) had very little impact.

OK, things made sense, for at least ten minutes

My original surprise had turned to disappointment and in an effort to justify my work to this point I wondered how much better the server with markForUpdate() calls and no detection preformed over the server with no calls. I grabbed the original server code, and to make sure I would see the biggest difference between the two approaches I removed all of the markForUpdate() calls from the server. I then tested the two servers and indeed found a significant performance difference - the server without the detection was... slower!? My surprise here took somewhat longer to dissipate.

To understand what was happening, lets recap (from Seth's post) how the modification detection works. When a task ends, any object accessed and not marked for update is serialized and its serialized form is compared with the original. If they are different, the object is marked for update and is written-out when the task's transaction ends. In order to write-out an object it must be serialized, so in the case of a check discovering that an object has been modified the only extra overhead was the compare. Only if the object was not changed is the serialization a wasted step. In the snowman's case all of the objects are pretty small, so the compare was cheap.

The second piece of the puzzle is what markForUpdate() was doing. It turns out that markForUpdate() causes an operation in the data store (and ultimately the DB) so is not without cost. So in the two servers we had the following series of operations for objects that are modified:

A. Server with markForUpdate() calls
  1. get managed object - DB operation (returns serialized object)
  2. markForUpdate() - DB operation
  3. serialize (marked) object
  4. write serialized object - DB operation

B. Server using only modification detection

  1. get managed object - DB operation (returns serialized object)
  2. serialize object
  3. compare with original
  4. write serialized object - DB operation
The extra operation in A is step 2, the markForUpdate() and associated DB operation. The extra operation in B is step 3, comparing two serialized objects. It seemed clear from the data that the DB operation is much more expensive than the compare. Though understandable it does fly in the face of current thinking in regard to marking objects. Could it be that marking is always a bad thing? Though I do not (yet) have data to back it up, it would seem intuitive that if the application access lots of objects for read-only, trying to skip the serialization in an unnecessary check could be of benefit. Another thing to consider is whether or not you know for sure that an object will be modified before it is accessed. In these cases calling DataManager.getForUpdate() instead of get() + markForUpdate() should be an improvement. Which brings us to my next discovery.

The ride is not quite over

When I discovered that modification detection was faster than marking objects I wanted to see how the difference scaled. The default settings of the snowman server is that two clients are needed for a game to star and each game has two robot players. At a 1000 clients that comes out to 500 games and 2000 players in all (1k clients + 1k robots). At that load the server takes up about 25% of the test host. An easy way to increase the load on the snowman server is to increase the number of robots per game. I could increase the number of clients, but that would require another test machine to run a second client simulator. Taking the path of least resistance, I set the number of robots per game to 10 and re-ran the tests. Yet again, the results were surprising. The markForUpdate() server was faster than the server doing modification detection. Completely opposite the results with two robots per game!

Each snowman game is independent and other then login, there is very little contention in the system outside of the game. Within the game, player objects can contend for each other and the flag objects. So the difference between the server with two robots per game and 10 per game is more than just load, it is also the amount of contention. Given this, the theory on the performance reversal is due to there being a lot more contention and marking for update helps because conflict is found sooner rather then later. When objects are marked, conflict can be detected at the markForUpdate() call, and if an object is marked, some other task trying to access it (for read or write) will block, preventing that task from going forward, only to fail later. Thats my theory and I'm stikcing to it. At least until I run the next test...4

Things don't always turn out like you expect

The take-away from all of this is, if there is one, is that marking for update is bad, unless it is good. Actually the real moral here is that applications are different and you will need to test with different options to find what is best. Also, sometimes code behaves different than you expect, even if you helped write it.


1"Two wrongs don't make a right, but five rights will get you back on the highway..." refers to the rule from way back, used by Sun East Coast travelers to remember how to get on Rt 101 south from the rental car lot at the San Francisco airport. I often think of it when I see something take a lot of twists and turns. Original author unknown. (Note: things have changed A LOT at SFO, so the rule is no longer valid. At least for getting onto 101)

2The challenge turned out to be one of those DOH! moments. I was certain I had marked all of the cases where the robot changed their own game state (position, hit points, etc). I was also careful to not mark the cases where the game state was not modified. No sense in writing an object that was not changed. Now the way the robot moves is that a task is run periodically that causes the robot to move, attack, etc. and the way that task is scheduled is by calling TaskManager.scheduleTask() at the end of the task. The actual scheduling code looks like this:


private void scheduleMove(int delay) {
    AppContext.getTaskManager().scheduleTask(
        new MoveTask(appContext.getDataManager().createReference((RobotImpl)this)),
        delay + random.nextInt(500));
}

The reason the task reschedules itself, instead of using TaskManager.schedulePeriodicTask() when the robot is created is so I can add some randomness into the robot's movement, i,e, the delay + random.nextInt(500) in the code above. This line causes the robots to move after a delay that varies by a half of a second. And this was my problem. Calling random.nextInt() changed the state of the random object, and therefore the state of the robot. This happens every time the robot reschedules itself, even if the robot did not change its game state!

I briefly considered using a static random number generator such as Math.random() but this would be a synchronizing point for all the robots in the server. Not a good thing. Also, it turns out that the robot task almost always changes its game state. The only cases where the game state is not modified is if the task runs while the game is being initialized and if it runs after the game ends. In that case the the task is not rescheduled. So, after all of this, what made sense was to remove all of the markForUpdate() calls in the robot and simply do a getForUpdate() when the task starts! Check out the run() method in the robot's move task:

static private class MoveTask implements Task, Serializable {
    private static final long serialVersionUID = 1L;
    final ManagedReference<RobotImpl> robotRef;
        
    MoveTask(ManagedReference<RobotImpl> robotRef) {
        this.robotRef = robotRef;
    }

    public void run() throws Exception {
        try {
            robotRef.getForUpdate().moveRobot();
        } catch (ObjectNotFoundException gameDone) {}
    }
}

3I did verify that the task length being reported does include the activity that occurs outside of the application code and includes the modification detection.

4Note that all of the tests described in this note were run with the server using Berkeley DB Java Edition. The C version behaves somewhat differently due to in the way the two version preform locking and can produce very different results when running the same application. At two robots per game, the two versions showed similar relative behavior across the different tests. At 10 robots per game the C version failed to handle the load when modification detection was enabled. For the snowman game the JE seems to scale much better than the C version. Exactly why is not completely understood, and may be worth a post just on that topic. So for now, deciding which DB is best for any given application is hard without actually testing both.

Thursday Sep 25, 2008

Case study: Improving login performance in Project Snowman

Project Snowman is a multiplayer capture-the-flag type game. The goals of the project were to showcase techniques for building a game server using Project Darkstar and to demonstrate the resulting benefits. Part of that demonstration was to show the snowman game server handling hundreds of client logins. To this end a client simulator was created as part of Project Snowman. The client simulator presents a simple GUI which allows the user to move a slider, selecting the number of clients to login to the server and play the game. The first implementation of the client simulator was designed to login clients as fast as possible. When we tested it against the game server the server was swamped after only a couple of hundred logins, performing far below our expectations. The following outlines the investigation into this initial poor login performance and the steps taken to improve it.

Before we dive into the investigation, lets look at the login requirements of the snowman game. Being a demonstration, the snowman game login is relatively simple. The server collects clients, or players, as they login, and once the required number of players-per-game is reached, a game is created and started. There is no consideration of which client joins which game, so login identity or order is unimportant. The simple login requirements naturally led to a fairly simple initial implementation. As a Darkstar programer is aware, when a client logs-in, the loggedIn() method is called on the application's AppListener class. The snowman's app listener class looked something like this: 

public class SnowmanServer implements Serializable, AppListener {

private Collection<SnowmanPlayer> waitingPlayers = // some unordered collection...

public ClientSessionListener loggedIn(ClientSession session) {
AppContext.getDataManager.markForUpdate(this);
SnowmanPlayer player =   new SnowmanPlayer(session));
waitingPlayers.add(player);
if (waitingPlayers.size() == NUM_PLAYERS_PER_GAME) {
startGame(waitingPlayers);
waitingPlayers.clear();
}
return player; // player is a ClientSessionListener
}
}

In this pseudo code, when a client logs-in a player object is created and then added to the list of waiting players. If there are enough players to start a game, startGame() is called, with the set of waiting players. The waiting set is cleared so that the process can repeat.

What the #^@%!& is going on? 

Though the login code above was straightforward, it was not immediately clear why it performed so poorly. One of the more useful tools for determining what the Darkstar server is doing is the profile listeners. These can be found in the package:

com.sun.sgs.impl.profile.listener

and Seth has blog entry that has more detail on using Darkstar's profiling sub-system. The particular profile listener that was useful in our case was SnapshotProfileListener. To enable this listener, add the following to the Darkstar server's properties:

com.sun.sgs.impl.kernel.profile.listeners=com.sun.sgs.impl.profile.listener.SnapshotProfileListener

This listener reports, on ten second intervals, the number of successful tasks, the number of tasks run, and the average size of the task queue. It reports this data on server port 43007 and can be easily monitored remotely using telnet:

telnet server.host 43007

will produce output looks something like this:

Snapshot[period=10000ms]:
 Threads=40  Tasks=9618/9629
 AverageQueueSize=1.90 tasks  

 

The report above is showing very low contention, 9,618 tasks succeeded out of 9,629 and the average queue size during that time was less than two tasks. During our simulated login "storm", the profiling information showed that the average queue size rapidly increased, into the hundreds, while the number of successful tasks was much lower than the number of tasks run. It is not unusual to see tasks failing, aborting and rescheduling tasks is how Darkstar manages data contention between competing tasks. However, the number of successful tasks was a fraction of the tasks started, which means that something was highly contentions resulting in very little progress by the server. One way we verified this was to add a configurable delay to the client simulator. When client logins were spaced 200 ms apart the number of failed tasks was greatly reduced as well as the average queue length. Unfortunately, five logins per second was not very exciting.

Examining the loggedIn() method above it is clear that each client login modifies the SnowmanServer object (the app listener) when it changes the contents of waitingPlayers. Since there is only one app listener in the server, and only one task can modify the same object at a time it makes the app listener a single choke point for all logins. Creating a new player and adding it to the waiting set is trivial, and even though there is contention on the app listener object, these login tasks should finish quickly, and for the most part keep out of each other's way. Starting a new game is a different story. To start a new game the server must create a game object, two flag objects, and some number of robot player objects. Each of which are managed objects. In addition, a channel is created and each "real" player (non-robot) joins that channel. While this lengthy game creation is taking place, no other login tasks  can succeed because the app listener is being modified. Even worse, it is highly likely that occasionally the game creation will fail, throwing away all that was done.

Short tasks are good tasks

Since logging-in without creating a game was quick, it seemed desirable to make all logins that speedy. To achieve this, the game creation was moved to a separate task:

public ClientSessionListener loggedIn(ClientSession session) {

AppContext.getDataManager.markForUpdate(this);
SnowmanPlayer player =   new SnowmanPlayer(session));
waiting.add(player);
if (waiting.size() == NUM_PLAYERS_PER_GAME) {
AppContext.getTaskManager.scheduleTask(new StartGameTask(waiting));
waiting.clear();
}
return player; // player is also ClientSessionListener
}

private class StartGameTask implements Task {

Collection<SnowmanPlayer> waitingPlayers = // some unordered collection...

StartGameTask(Collection<SnowmanPlayer> waiting) {
waitingPlayers.addAll(waiting);
}

public run() {
statrtGame(waitingPlayers);
}
} 

By removing the game creation from the login, we shortened the login task in the case that a game is ready to start. Since there is no contention between game creation tasks, those are unlikely to abort. This simple fix was quite effective in reducing task failures, verified by the profiling information, and allowed a much higher rate of client logins. Note that the collisions between the login tasks was reduced, the contention remained unchanged. The next step was an effort to actually reduce the contention.

Lets get clever 

One of the ideas that has been kicking around the group to deal with the multiple writer problem is to use a series of independent queues, with multiple writers feeding the tops of the queues, and another set of readers emptying the queues. The theory is that if the queues are independent, contention would only occur when the same queue was accessed for write. If there were many queues the chance that two writers hit the same queue at the same time could be small. Mapping this scheme into the snowman server code, we added a set of queues to the app listener object and when a client logged-in, the newly created player was placed into one of the queues. On the back side, a (new) task would read off the queues until it found enough players to start a game. Implementing this was not as straightforward as it sounds. In the app listener you could not just add an array of queues and expect any benefit. This is because changing the state of any one of the queues effectively changes the state of the app listener. We would sill have the single point of contention. Instead, the queues needed to be wrapped in a ManagedObject. This allows the app listener to keep an array of ManagedReferences to the queues. Now the app listener is immutable (in the context of the login task) and no longer a point of write contention (Managed objects, such as the app listener, accessed only for reading can be shared). Ignoring for now the task reading the end of the queues, here is the new app listener and queue objects:

public class SnowmanServer implements Serializable, AppListener {
private ManagedReference<Deque<ManagedReference<SnowmanPlayer>>>[] waitingDeques;

public void initialize(Properties arg0) {
this.waitingDeques = new ManagedReference[NUMDEQUES];
for(int i = 0; i < waitingDeques.length; i++) {
Deque<ManagedReference<SnowmanPlayer>> deque = new ManagedDeque<ManagedReference<SnowmanPlayer>>();
waitingDeques[i] = AppContext.getDataManager().createReference(deque);
}
AppContext.getTaskManager().scheduleTask(new MatchmakerTask(waitingDeques));
}

public ClientSessionListener loggedIn(ClientSession session) {
SnowmanPlayer player =  new SnowmanPlayer(session);
BigInteger id = player.getSnowmanPlayerRef().getId(); // get a ManagedReference to the player
BigInteger index = id.mod(BigInteger.valueOf((long)NUMDEQUES));

waitingDeques[index.intValue()].get().add(player.getSnowmanPlayerRef());
return player;
}

class ManagedDeque<E> extends LinkedList<E> implements ManagedObject {}
}
}

There are a couple of important things to note here. First is the use an array of ManagedReferences to keep track of the queues in the app listener. Since the queues are ManagedObjects their persistence is handled by Darkstar so the app listener does not need to maintain references to the queues directly. So a change to the queue does not translate into a change in the app listener object. A potential trap is in selecting a queue to place the player in. A temptation might be to keep a index in order to insert into the queues round-robin or to keep a Random object to distribute the players randomly. The problem would be that an int index or the Random object's state would change, and therefore the app listener would no longer be read only. Luckily, in this example, the player's (ManagedReference) ID can be used to create an index.

On the other end of the queue, the MatchmakerTask can be implemented several different ways. Here is a fairly straightforward approach:

public class MatchmakerTask implements Task, Serializable
{

private List<ManagedReference<SnowmanPlayer>> waitingPlayers;
private ManagedReference<Deque<ManagedReference<SnowmanPlayer>>>[] waitingDeques;

public MatchmakerTask(ManagedReference<Deque<ManagedReference<SnowmanPlayer>>>[] waitingDeques) {
this.waitingPlayers = new ArrayList<ManagedReference<SnowmanPlayer>>();
this.waitingDeques = waitingDeques;
}

public void run() throws Exception {
boolean playersFound = false;
for(int i = 0; i < waitingDeques.length; i++) {
ManagedReference<SnowmanPlayer> nextPlayer = waitingDeques[i].get().poll();
if(nextPlayer != null) {
playersFound = true;
waitingPlayers.add(nextPlayer);
}
if(waitingPlayers.size() == NUM_PLAYERS_PER_GAME) {
startGame(waitingPlayers);
break;
}
}

// if no players are found in the queue during this iteration
// schedule a delay for the next polling cycle
// otherwise, schedule the next cycle to occur immediately
if(playersFound)
AppContext.getTaskManager().scheduleTask(this);
else
AppContext.getTaskManager().scheduleTask(this, POLLING_INTERVAL);
}
}
} 

Basically, the task loops once through the set of queues collecting players. If enough for a game are found, a game is started. Before the task ends, it schedules itself to be run again. One could imagine changes in this task, like creating the game in its own task or polling the queues in a more round-robin fashion, but the important accomplishment here is the reduction in contention with the client login task. The code improvements to this point have left only these two points of contention: 1) between login tasks when inserting to the same queue, and 2) between login task and the match maker task when inserting and removing, respectively, from the same queue.

Are we done squeezing?

Though the changes so far have achieved a more than satisfactory login rate, there is one more improvement we made. Darkstar provides a utility data structure that we could use as a drop-in replacement for our ManagedDeque. The class com.sun.sgs.app.util.ScalableDeque is an implementation of Deque that provides concurrent, non-contentious, inserts and removals (as long as there is more than one element in the queue). This further reduces the contention between the client login task and the matchmaker task during a login storm.

And there was great rejoicing... 

Yea team!

Project Snowman was created by a bunch of folks, and to give credit where credit is due, Tim Blackman, long time Project Darkstar dude,  had some time ago mentioned to me the idea of using multiple queues to reduce contention. Owen Kellett, who like myself is a bit newer to Darkstar, implemented the multiple queue mechanism in the snowman server. And David Jurgens, our perennial intern, implemented ScalableDeque. Owen has also posted a blog about Project Snowman that is worth reading. I know Chuck Norris would want you to.

Wednesday Sep 24, 2008

in the beginning

So, this is my first entry into the world of blogging. It has come somewhat as a necessity, as I have some material that I wish to disseminate and this, blogging, seems to be the forum of choice. I am currently working on Project Darkstar and I expect most, if not all of my posts here will be related to that effort.


Today's Page Hits: 20