Git: yet another SCM system or a revolutionary model of software development?
I recently came across a video recording of Linus Torvalds himself arguing
the case for Git as the ultimate source code
management system. The presentation is done in the usual trademark Linus style
(which I happen to like immensely, since it seems to be the only way to wake up
the audience enough to engage it in a conversation), but the issues
he tackles are very pertinent and go far beyond the merits of a particular SCM
system, into the realm of what the best tool is for supporting a highly decentralized,
peer-reviewed model of software development:
- Distributed vs. centralized development. Bottom line: centralized SCMs
  run against the proverbial Bazaar.
- Using networks of trust as the key means of conquering the complexity of
  modern software projects and dealing with the compartmentalization of
  key developers/experts. Bottom line: an SCM should reflect how humans are wired.
- How to keep the pace of development at the highest possible level
  (by making it cheap and easy to experiment without worrying about breaking other stuff)
  yet allow the easiest transition path for the changes that seem to be
  beneficial for the project. Bottom line: branch early, branch often.
- Developers vs. gatekeepers, and why the problems they face are
  fundamentally different. Bottom line: even the best branching is useless without
  merging.
I find quite convincing his arguments about why CVS is the most braindead SCM ever
invented and why Subversion simply denies itself the right to exist by proclaiming
itself to be "CVS done right". Especially so when they come from
a guy who has a project of ~22000 files to maintain and does about 25 merges
per day. His main pain points with CVS/Subversion hit very close to home:
- There is no data coherency model to write home about. That might not
  be a problem for most projects, but it certainly is for something as
  security-sensitive as an OS kernel. Basically, with CVS (and I think even Subversion)
  the only way you find out that your data got corrupted is when it's too late.
- Branching is waaay too painful because of things like the global
  namespace for branches (read: constant collisions and things like
  test_12345 branches), all branches being pushed down every developer's throat, etc.
  Subversion patches certain things as far as branches go, but fails miserably
  as far as merging is concerned.
- The tools you use are supposed to make you more productive. Period.
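
To make the data coherency point concrete: Git names every object by a hash of
its content, so corruption shows up as a mismatch between the content and its
name. Below is a minimal Python sketch of that idea for a single file ("blob")
using SHA-1, Git's default object format. The helper names git_blob_id and
is_intact are mine, made up for illustration, and this is not how Git itself
is implemented; it only reproduces the ID that Git computes for a blob.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def is_intact(content: bytes, recorded_id: str) -> bool:
    """Hypothetical helper: does the content still match the ID recorded for it?"""
    return git_blob_id(content) == recorded_id


original = b"int main(void) { return 0; }\n"
recorded = git_blob_id(original)

# Simulate silent corruption: a single character changes on disk.
corrupted = original.replace(b"0", b"1")

print(is_intact(original, recorded))   # True
print(is_intact(corrupted, recorded))  # False
```

Since the name is derived from the content, there is no way to change the
bytes without the name ceasing to match, which is exactly the kind of check
CVS does not give you.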
Now, I must admit that as successful as he was at positioning Git as the
best tool for the job, I am still not convinced on two counts:
- Does Git really offer a nice way of structuring complex projects like KDE and such?
  His suggestion of super projects (workspaces with pointers to individual
  Git repositories) might be an interesting one, but it certainly requires
  some practice and experience to be evaluated properly.
  [2008 Update: It seems that this is no longer an issue. Git got the infrastructure
  for supporting submodules and, from
  what is documented on their TWiki, it looks like they got it right.]
- Is the approach Git takes, making it easier to work with projects as
  a whole at the expense of treating projects as collections of files, the right one?
  See, the problem is -- I'm much more of a lone developer
  than a gatekeeper. So Git might be optimizing for a role I rarely
  find myself in (and Linus finds himself in all the time).
But regardless of these concerns, I highly recommend you watch the presentation
yourself -- it is well worth it. Keep one thing in mind, though: right before
BitKeeper (something Linus seems to have very fond memories of),
Larry McVoy used to develop this little project called
TeamWare here at Sun.
What was TeamWare? Well, it was "a distributed source code revision control system...
which BitKeeper seems to share a number of design concepts with".