Miracle #2
This might seem to be a simple matter of programming. After all, we know what data is being cached on which server, so we ought to be able to simply move any task that wants to touch that data, and is not already on that server, to the right server. There are a couple of problems with this approach, however. The first is that moving a task is more expensive than moving the data, since tasks are associated with network connections that would also have to be moved (among other things). The second is that tasks tend to access more than one data item at a time, so we need to make sure that all of those data items (and all the tasks that are accessing them) are co-located. Doing this by keeping track of each task and each data item quickly becomes, as they say in the trade, computationally infeasible.
So the first thing we need is some way of collecting groups of tasks that can be moved as a unit. The obvious grouping is by player. Tasks that are performed in response to a particular player are grouped together, so that if it makes sense to move any of them, they will all be moved. We think this will also help in localizing the data access, since if a player is interacting with some data, the chances are that he or she will continue to interact with that data. Organizing tasks by identity works not just for players, but also for non-player characters (NPCs), which also have identities. Finally, organizing by identity allows us to have a notion of history if needed. Tasks just happen and then cease to exist, but identities last over time, and give us something that we can instrument and observe, allowing us to measure the effectiveness of what we are doing.
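To make the grouping concrete, here is a minimal sketch in Python. It is an illustration only, not Darkstar code: the Task record, its identity field, and the server names are all hypothetical. It shows the one property that matters here, which is that placement is recorded per identity, so moving an identity moves every task that belongs to it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: int
    identity: str  # hypothetical field: the player or NPC the task runs for


def group_by_identity(tasks):
    """Return a mapping from identity to the list of tasks run for it."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task.identity].append(task)
    return dict(groups)


def move_identity(placement, identity, target_server):
    """Return a new placement with one identity reassigned.

    Placement maps identity -> server, so every task owned by the
    identity follows it to the target server as a unit.
    """
    updated = dict(placement)
    updated[identity] = target_server
    return updated


tasks = [Task(1, "alice"), Task(2, "npc-7"), Task(3, "alice")]
groups = group_by_identity(tasks)
placement = {"alice": "server-a", "npc-7": "server-a"}
placement = move_identity(placement, "alice", "server-b")
```

After the move, both of the tasks owned by "alice" are considered to live on "server-b", while the NPC stays where it was.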
While organizing tasks into groups by identity helps, we now think that we can do even better. And here is where the miracle could occur.
The idea first came when the engineer working on the problem of clustering tasks, Jane Loizeaux, started looking at tools that might let her visualize what players were doing in a game. She started looking into social networking tools, which do a lot of this sort of visual representation, and then realized that players in games form just these kinds of social networks. Whether the network is long-lasting like a guild or short-term like a particular encounter, the connections revealed by communication patterns and data access split the set of identities into clusters. Our current thought is that if we can determine what those clusters are, we can try to ensure that all the members of a particular cluster are co-located on a particular server. When we add the assumption that those who are in a particular cluster are more likely to be interacting with the same game data, we get a way of ensuring that the data and the tasks that are using that data are co-located on the same server.
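As a rough illustration of what "finding the clusters" could mean, the sketch below builds a weighted interaction graph from observed events and treats identities as clustered when they are linked by enough interactions. This is deliberately the simplest technique available (thresholded connected components using union-find), not the algorithm we will end up using; the event format and the min_weight parameter are assumptions made for the example.

```python
from collections import defaultdict


def cluster_identities(interactions, min_weight=2):
    """Group identities linked by at least min_weight observed interactions.

    interactions: iterable of (identity_a, identity_b) pairs, one per
    observed event (a message sent, or both touching the same data item).
    Returns a sorted list of clusters, each a sorted list of identities.
    """
    weights = defaultdict(int)
    identities = set()
    for a, b in interactions:
        identities.update((a, b))
        if a != b:
            weights[frozenset((a, b))] += 1  # undirected edge weight

    parent = {i: i for i in identities}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for pair, weight in weights.items():
        if weight >= min_weight:
            a, b = tuple(pair)
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = defaultdict(set)
    for i in identities:
        clusters[find(i)].add(i)
    return sorted(sorted(members) for members in clusters.values())


events = [
    ("ann", "bob"), ("bob", "ann"),  # ann and bob interact twice
    ("bob", "cy"), ("bob", "cy"),    # bob and cy interact twice
    ("dee", "eve"),                  # a single, weak interaction
    ("ann", "dee"),                  # another weak interaction
]
clusters = cluster_identities(events)
```

With these events, ann, bob, and cy end up in one cluster, while dee and eve remain on their own because their links fall below the threshold. Real community-detection algorithms from the social networking literature are considerably more discriminating than this, which is exactly why we are reading that literature.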
So the Darkstar engineering group is now looking at a lot of the literature on discovering social networks, and trying to figure out how to apply those algorithms to the players in a game so that we can determine the clusters of players in that game. We can then use these clusters as units to place groups of players on game servers, and (when necessary) move entire groups from one server to another. Such a move might be needed because the group is now interacting with a different set of data (as happens over the course of a game) or because the set of groups on a particular server is getting too large and is causing the server to slow down.
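The placement and movement of whole groups could be sketched along the following lines. Everything here is an assumption made for illustration: load is measured simply as the number of identities on a server, the capacity figure is invented, and the greedy policy (largest cluster first onto the least-loaded server, smallest clusters moved first off a crowded one) is just one plausible choice rather than a description of what Darkstar does.

```python
def server_loads(placement, clusters, servers):
    """Count identities per server for a placement of cluster -> server."""
    load = {s: 0 for s in servers}
    for name, server in placement.items():
        load[server] += len(clusters[name])
    return load


def place_clusters(clusters, servers):
    """Greedy placement: largest cluster first, onto the least-loaded server."""
    load = {s: 0 for s in servers}
    placement = {}
    by_size = sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, members in by_size:
        target = min(servers, key=lambda s: load[s])
        placement[name] = target
        load[target] += len(members)
    return placement


def relieve(placement, clusters, servers, crowded, capacity):
    """Move whole clusters off a crowded server until it is within capacity."""
    placement = dict(placement)
    load = server_loads(placement, clusters, servers)
    others = [s for s in servers if s != crowded]
    # Consider the smallest clusters first, so small groups move before big ones.
    local = sorted((c for c, s in placement.items() if s == crowded),
                   key=lambda c: (len(clusters[c]), c))
    for name in local:
        if load[crowded] <= capacity or not others:
            break
        target = min(others, key=lambda s: load[s])
        size = len(clusters[name])
        placement[name] = target
        load[crowded] -= size
        load[target] += size
    return placement


clusters = {
    "guild": ["a", "b", "c", "d"],
    "raid": ["e", "f", "g"],
    "duo": ["h", "i"],
}
servers = ["s1", "s2"]
initial = place_clusters(clusters, servers)

# Suppose instead that every cluster had started out on s1 and slowed it down.
everything_on_s1 = {"guild": "s1", "raid": "s1", "duo": "s1"}
relieved = relieve(everything_on_s1, clusters, servers, "s1", capacity=6)
```

Note that a cluster is never split: the unit of movement is always the whole group, which is the point of finding the clusters in the first place. A fuller version would also move a group when the data it is interacting with changes, which this sketch does not model.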
We don't actually know if this scheme is going to work. It is only speculation on our part at the moment that the partitioning of players (and other identities) produced by the social networking algorithms will give us the sets of players that are interacting with a particular data set. But it seems like a reasonable possibility, and the only way to really find out is experimentally. So we are implementing the pieces now, and will see if we actually get the results that we think we will get. This is one of those places where it is clear that we are a research group (where the definition of research is that we don't know what we are doing). The good thing about being a research group is that our goal is to learn things, so even if this turns out to be the wrong approach, we will have learned something.