"The time has come," the Walrus said, "To talk of many things..."
-- Lewis Carroll
This is the first posting for a blog about computer games, virtual worlds,
distributed systems, and concurrent programming. If this seems like a strange
combination of subjects, then please read on; I'd like to convince you that
the combination is not only natural but necessary.
First, a bit about me. I've been doing distributed computing and
object-oriented programming for some time. I started all this at Apollo
Computer, where I worked on text libraries (shipping one of the first
commercial C++ class libraries not done at Bell Labs) and distributed systems.
When Apollo was acquired by Hewlett-Packard I found myself an HP employee.
While there, I led a group that implemented a system that became the base of
the first CORBA specification.
About 16 years ago, I joined Sun (and more specifically Sun Labs), where I
continued working in the area of distributed systems. I did some kibitzing
during the design and implementation of Java Remote Method Invocation (RMI),
which got me and the rest of my group thrown out of the labs for being overly
relevant, and then led the group that did the Jini Networking Technology. As I
said, I've been doing distributed systems for some time.
I came back to the labs about four years ago, looking around for interesting
work in distributed systems. I did some work in systems for large-scale
medical sensing that, while interesting, couldn't have been deployed in the
current environment for medical care in the United States. About the time that
I was coming to this realization, I was asked to do some consulting on another
project in the lab that, to be honest, I hadn't really paid that much
attention to up to that time. But, being between projects, I figured that a
little consulting work might be refreshing, and give me time to find some
interesting research topic to tackle next.
This is how I got started with Project Darkstar, and how I first encountered
the world of large-scale on-line games and virtual worlds. And in fact the
project did refresh me, and helped me to find some interesting research
problems. But they were all involved with the project itself, which is why I
am still a part (in fact, the technical lead) of the project to this day.
The problems faced by those developing on-line games or virtual worlds are all
centered on scale. When a game or a world is released, there is no way of
telling how popular it will be. You can (and game companies do) try to
estimate the number of players, but being off by an order of magnitude or two
is not uncommon. If you estimate too low, there are a lot of frustrated
players who either find the game too slow or who can't play the game at
all. If you estimate too high, you have a lot of (expensive) infrastructure
around that isn't being used. What makes this all the more difficult is that
current mechanisms for scaling in games and virtual worlds are based on
cutting up the geography of the virtual environment and assigning different
parts of the geography to different servers. And this cutting up is done as
part of the design of the game. So if you get the scaling wrong, you need to
change the design of the game, and the source code that implements that
design. Not the kind of thing that can be done quickly.
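To make the geographic approach concrete, here is a minimal sketch of a zone-to-server map that is fixed at design time. The zone names, server names, and player counts are all invented for illustration; they are not taken from any real game.

```python
# Hypothetical static sharding: the zone-to-server map is part of the design.
ZONE_TO_SERVER = {"forest": "server-a", "desert": "server-b", "city": "server-c"}


def server_load(players_by_zone, zone_to_server=ZONE_TO_SERVER):
    """Return the number of players each server has to handle."""
    load = {server: 0 for server in zone_to_server.values()}
    for zone, players in players_by_zone.items():
        load[zone_to_server[zone]] += players
    return load


# What the designers planned for versus what (hypothetically) happened.
launch_estimate = {"forest": 1000, "desert": 1000, "city": 1000}
actual = {"forest": 200, "desert": 300, "city": 12000}

print(server_load(launch_estimate))
print(server_load(actual))
```

With the invented numbers above, one server carries an order of magnitude more players than planned while the other two sit nearly idle, and the only fix is to change the map, which here stands in for the game design itself.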
Even if you are able to estimate the total number of players or users
accurately, these environments are still open to tremendous fluctuations of
load over time. Whether it is the discovery of some new feature in the game
that causes everyone to crowd into the same region, or something as random as
snow shutting down all of the schools on the east coast, the difference
between normal use and peak use can be as much as a factor of 10. Which makes
capacity planning all that much harder.
This sort of problem cries out for a solution using a bunch of machines
working together in a way that allows load to be balanced at run-time rather
than at compile time. So one of the goals of Project Darkstar is to provide a
server infrastructure that would allow such balancing without requiring the
game programmer to be involved in the workings of the distributed
system. That by itself would be pretty challenging, but that is only the
first of the project's goals.
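As a rough illustration of balancing at run-time, the sketch below assigns units of work to whichever server is least loaded at the moment. The greedy heuristic, the `balance` function, and the unit and server names are assumptions made for this example; this is not a description of how Project Darkstar actually places load.

```python
import heapq


def balance(work_units, servers):
    """Greedily assign each unit of work to the least-loaded server.

    work_units: dict mapping a unit name to its current load (e.g. players).
    servers: list of server names.
    Returns (assignment, load) dictionaries.
    """
    heap = [(0, name) for name in servers]
    heapq.heapify(heap)
    assignment = {}
    load = {name: 0 for name in servers}
    # Place the heaviest units first so they spread across the servers.
    for unit, weight in sorted(work_units.items(), key=lambda kv: -kv[1]):
        current, name = heapq.heappop(heap)
        assignment[unit] = name
        load[name] = current + weight
        heapq.heappush(heap, (load[name], name))
    return assignment, load


# Hypothetical load, measured while the game is running rather than guessed
# at design time. The units are groups of activity, not pieces of geography.
units = {"raid-1": 4000, "raid-2": 4000, "market": 4000,
         "forest": 200, "desert": 300}
assignment, load = balance(units, ["server-a", "server-b", "server-c"])
print(load)
```

Because the assignment is computed from measured load, it can be recomputed whenever the load shifts, without touching the game's source code.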
The second goal is to fully exploit the multi-threaded capabilities available
on modern chips without requiring the game programmer to become a concurrent
programming wizard. With the possible exception of scientific super-computing,
no area of programming has more aggressively ridden the Moore's law curve than
games. But now the chip makers (including Sun) are changing the rules. Rather
than making the clocks faster, we are producing chips with multiple cores. The
argument is that chips are still getting faster, but the way that they are
getting faster is that they can do multiple things at the same time. A
particular sequence of instructions may not happen any faster, but you can run
multiple streams of instructions through the chip at the same time, which
means that you can do twice as much (or four times, or 16 times, depending on
the number of cores) as you used to do in the same period of time.
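The arithmetic behind that claim can be sketched in a few lines. The model is deliberately idealized: it assumes every task is independent, takes the same amount of time, and costs nothing to schedule. The function name `batch_time` is just a label for this example.

```python
import math


def batch_time(num_tasks, task_seconds, cores):
    """Wall-clock time to finish equal, independent tasks on `cores` cores."""
    return math.ceil(num_tasks / cores) * task_seconds


# Sixteen one-second tasks on 1, 2, 4, and 16 cores.
for cores in (1, 2, 4, 16):
    print(cores, "cores:", batch_time(16, 1.0, cores), "seconds")

# A single task is no faster on 16 cores than on one.
print(batch_time(1, 1.0, 1), batch_time(1, 1.0, 16))
```

The batch finishes sooner as cores are added, but the time for any one task never drops below one second, which is the difference between throughput and latency.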
All this is well and good, but it assumes that you can actually run parallel
tasks to exploit the new chips. For some tasks (serving up web pages, for
example) this is pretty easy. But for games, it is a bit more complex. Games
(and virtual worlds) ought to be great candidates for multi-core and
multi-threaded approaches, since most of what happens in a game (or virtual
world) is independent of the other things that are happening, so the whole
thing looks embarrassingly parallel. The problem is that these applications
are not entirely separate (sometimes players or members of the world do
interact), and are very much worried about latency (rather than throughput).
Even worse, there are few if any game programmers who understand how to write
reliable concurrent programs. This doesn't make game programmers any different
from almost all other programmers. But they don't get the benefits of
multi-core machines without writing that kind of code.
So the overall goal of Project Darkstar is to produce an infrastructure that
will allow game programmers to write their server code as though they were on
a single machine running on a single thread, while exploiting multiple threads
and being able to share the load on lots of different machines. And that turns
out to be a pretty interesting research topic.
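To give a flavor of what that programming model could look like, here is a toy sketch in which the game logic is plain sequential code and a separate layer runs it on many threads. The `TaskRunner` class, the per-object lock scheme, and all the names are assumptions invented for this example; none of this is the Project Darkstar API, and it ignores the multi-machine half of the problem entirely.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Player:
    """Game state written with no awareness of threads."""

    def __init__(self, name):
        self.name = name
        self.gold = 0


def give_gold(player, amount):
    # Plain sequential game logic: read, compute, write.
    player.gold = player.gold + amount


class TaskRunner:
    """Hypothetical infrastructure layer: runs tasks on many threads,
    but never runs two tasks on the same object at the same time."""

    def __init__(self, workers=8):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, obj):
        # One lock per game object, created on first use.
        with self._locks_guard:
            return self._locks.setdefault(id(obj), threading.Lock())

    def submit(self, task, obj, *args):
        def run():
            with self._lock_for(obj):
                return task(obj, *args)
        return self._pool.submit(run)

    def shutdown(self):
        # Wait for every submitted task to finish.
        self._pool.shutdown(wait=True)


runner = TaskRunner()
alice, bob = Player("alice"), Player("bob")
for _ in range(1000):
    runner.submit(give_gold, alice, 1)
    runner.submit(give_gold, bob, 2)
runner.shutdown()
print(alice.gold, bob.gold)
```

The author of `give_gold` never sees a lock or a thread. The sketch only handles tasks that touch a single object; tasks in which players interact with each other, which is where the hard problems are, would need something considerably more careful.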
The added bonus is that I've been introduced to the world (and culture) of
games and virtual worlds. It is a very different world and a very different
culture from the enterprise world that I had been used to, and I must say that
it is a lot of fun. Games and virtual worlds are part of the entertainment
business, or perhaps part of the education business, but definitely not the
same as the usual enterprise business. These differences have led to a number
of what I call Margaret Mead moments, when the differences in culture have
made communication difficult and I found I needed to think in a new way. Which
is a form of research in itself.
Hence this blog. I will be discussing games, virtual worlds, distributed
computing, concurrent programming, Project Darkstar, and the community that we
are beginning to build around all of these topics. It's a pretty broad
set of topics, but fitting them together is at least fun, almost always
interesting, and often instructive.