Kelly O'Hair's Weblog (blogs.sun.com)

Tuesday Apr 17, 2007

The Mercurial Mindset - Food for Thought

So I'm sitting in my office at home and I hear this noise in our Ornamental Pear tree outside. It's not a Mercurial user, just a squirrel, eating fruit from a fruitless tree. Why a fruitless tree is bearing fruit is beyond me, but the squirrel likes it. Hopefully the new Mercurial users will be as happy as Rocky here. ;^)

I've been deep into the technical and developer concerns around Mercurial conversion issues for a few months now. I see the light at the end of the tunnel, most of the time anyway. ;^) So consider this a perspective from deep in the trenches.

One brave documentation team inside Sun has actually stepped up and done the conversion from Teamware to Mercurial, and although these are not software developers, and Mercurial isn't managing Java or C source files, the experience is interesting. Some of the problems they ran into:

  • Using NFS, file/directory ownership and permissions for the groups was a problem.

    The standard Unix rules on user/group permissions seem to be a problem for teams whose members belong to different Unix groups. Now you might think this is obvious, but it's often not. Teams using Mercurial with NFS or local file systems need to make sure that their umask and groups are set up correctly for standard Unix file sharing.

    As far as I can tell Mercurial is obeying the umask/group rules and doing the right thing, but Teamware appeared to have avoided this issue by making sure all directories created with a 'putback' (the Teamware counterpart of 'hg push') had wide-open permissions (read-write for everyone), and then I assume it re-creates files as needed in those directories. In any case, either Teamware did not care that multiple team members didn't share the same groups, or it did this on purpose.
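
    A minimal sketch of the usual Unix recipe for a group-writable directory: a umask of 002 plus the setgid bit on the directory. The directory here is a throwaway created with mktemp, purely for illustration; a real setup would also assign the directory to the team's shared group with 'chgrp', which is left out because the group name is site-specific.

    ```shell
    # Throwaway directory standing in for a shared repository location (hypothetical).
    shared_dir="$(mktemp -d)/team-repo"

    # umask 002: new files and directories stay group-writable.
    umask 002
    mkdir -p "$shared_dir"

    # setgid on a directory: files created inside inherit the directory's group.
    chmod g+s "$shared_dir"

    # Files created from here on get rw-rw-r-- permissions.
    touch "$shared_dir/example-file"
    ls -ld "$shared_dir" "$shared_dir/example-file"
    ```

    Each team member would need the same umask in their own shell for this to hold, since the umask is a per-process setting.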

  • Multiple Heads (by accident?)

    Mercurial allows for multiple branches of development, but in general I don't think this is a good idea for anyone starting out. In many cases repositories would end up with multiple heads, which is normal after an 'hg pull' that brings in changes made in parallel with your own commits. But the state was very confusing for new users. An 'hg push' would fail and tell you to 'hg update', then you would be told to 'hg merge', and then to 'hg commit'. Well, in fact, you do need to 'hg update', 'hg merge', and 'hg commit', but the error messages didn't seem to help explain what was happening.

    One confusing item in particular was the fact that even if the merge was empty, you still need to follow the 'hg merge' with an 'hg commit'. Quitting the editor when the file says "no changes" will often terminate the commit and leave you in the same state. Once they understood the pattern they could follow it, but it was considered 'odd' to be committing what appeared to be nothing.

    Repeated 'hg pull' actions after this state just created more and more heads or unmerged changesets, which made it even more confusing. Once the pattern of 'hg pull && hg update && hg merge && hg commit' was established, things went much more smoothly.

  • The basic steps.

    The basic steps of doing a change and pushing the change were not obvious to some people.

    With Teamware you would:

    
    bringover -p /nfs-path/parent-workspace -w /path/workspace .
    cd /path/workspace
    (cd some_dir && sccs edit files && vi/emacs files && sccs delget filename)
    bringover -p /nfs-path/parent-workspace .
    resolve
    putback .   -OR   putback some_directory/filename
    

    The Teamware bringover and putback are per-file operations, with the default being the entire workspace, but the "I'm putting back files" attitude seems to be rather burned into the soul or something. When a resolve happens, it's on a per-file basis, and each file is resolved separately.

    With Mercurial you would:

    
    hg clone /nfs-http-ssh-path/parent-repository /path/repository
    cd /path/repository
    vi/emacs any files you want
    hg commit
    hg pull && hg update
    hg merge && hg commit
    hg push
    

    The merge could be unnecessary, but if 'hg heads' lists more than one head, you need to 'hg merge && hg commit' to lop it off. (The 'hg commit' creates the changeset).

    With Mercurial you push and pull changesets, not files. Although you can create a changeset for one file change, you still don't push files. And if changeset A was created after changeset B, you can't skip B, you'd need to push both, even if these two changesets change completely unrelated files.
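
    The 'hg heads' check mentioned above can be scripted. A minimal sketch, assuming a repository with a single named branch, where counting the lines printed by 'hg heads' is a good enough test; the function name merge_if_needed and the commit message are made up for illustration.

    ```shell
    # Hypothetical helper: merge only when the repository has more than one head.
    merge_if_needed() {
        # Print one line per head, then count the lines.
        heads=$(hg heads --template '{node}\n' | wc -l | tr -d ' ')
        if [ "$heads" -gt 1 ]; then
            # The merge only changes the working directory; the commit records it.
            hg merge && hg commit -m "Merge"
        fi
    }

    # Possible use inside a repository, after committing local work:
    # hg pull && merge_if_needed
    ```

    If the merge hits real conflicts, 'hg merge' stops and the commit is skipped, so the conflicts still have to be resolved by hand.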

  • No partial workspaces.

    I've come to the conclusion that this is a good thing. Over the years I have been amazed at the times two supposedly unrelated file changes can cause a build failure or a runtime failure. Allowing partial workspaces or partial pushes is a risk to anyone that pulls these changes into their own repository. The integrity of the repository should be a top priority, and you can only guarantee that when you operate on the entire repository.

    Maybe after we get more experience with the merge issues I'll change my mind.

  • Merge issues.

    I've gotten myself into a merge nightmare a few times, tried kdiff3 on the Mac but it didn't seem to work for me, or I need to RTFM. Simple merges were fine, but the nasty ones where the same lines were changed were challenging for me, and for others too. The Teamware filemerge tool ('f' command during a resolve) was quite nice, and I'm beginning to miss it. Hooking filemerge into Mercurial is possible but on a file by file basis, and when you get those 100 files to merge, that may be awkward. But it's only a matter of time before I find a good merge tool and configuration.
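
    As a starting point for that configuration, Mercurial reads the merge program from the 'merge' setting in the [ui] section of an hgrc file, and the HGMERGE environment variable overrides it. A minimal sketch, assuming kdiff3 is installed and on the PATH; it writes to a scratch file rather than a real ~/.hgrc so that nothing is overwritten.

    ```shell
    # Scratch file standing in for ~/.hgrc (hypothetical, for illustration only).
    hgrc_file="$(mktemp)"

    # Select kdiff3 as the program Mercurial runs for file merges.
    printf '[ui]\nmerge = kdiff3\n' >> "$hgrc_file"

    cat "$hgrc_file"
    ```

    Copying those two lines into ~/.hgrc would apply the setting to every repository. The tool is still run once per file that needs merging.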

Hope this helps someone. I'll try to talk about the CVS or Subversion mindset problems with Mercurial next time. Although I'm not a CVS or Subversion expert, I can address the benefits of a DSCM (Distributed Source Code Management) system like Mercurial.

Comments:

I've gotten to love kdiff3, but it took me 4-5 sessions of sitting down and struggling with it before I figured out how to use it. I use it for directory-based merging. I like the feature of being able to split a conflict into two parts if the automatic algorithm gets something wrong. I wish filemerge would do that. It also lets you resolve conflicting lines from A and B as (A) or (B) or (AB) or (BA) in an intuitive way. Make some more posts about what you don't like (or find confusing) about kdiff3. Maybe I'll learn to hate it too. ;-)

Posted by Chris Quenelle on April 17, 2007 at 10:42 PM PDT #

The problem I always had with KDiff3 is the fact that I'm not able to manually edit the files while comparing them, even though according to the documentation it should be possible.

Well, and the fact that the interface looks like a Christmas tree decorated by someone with bad taste ;-)

So far I have always liked Araxis Merge best but of course it's not FLOSS and only available for Windows. So what I actually use on my Linux systems is Meld, which lacks a lot of features but at least it is simple, allows me to edit and doesn't hurt my eyes.

PS: Next time you should really try to focus on the squirrel (but mind the Squirrely Wrath!) ;-)

Posted by quintesse on April 18, 2007 at 01:48 AM PDT #

The multiple heads thing is the problem I've always had with Mercurial - darcs just works in this regard, whereas with mercurial I have to remember to hg pull && hg up && hgmerge before I edit anything in the repo.

Posted by bma on April 18, 2007 at 02:18 AM PDT #

You can use the "fetch" extension, which does "pull&merge&commit" in one go. Dmitri

Posted by Dmitri Trembovetski on April 18, 2007 at 11:24 AM PDT #

Hi Kelly, did you really take that squirrel picture yourself? It's beautiful. I thought American squirrels were grey but I was mistaken... BTW - I've played with Mercurial a bit and liked it.

Posted by daniel on April 20, 2007 at 02:40 AM PDT #

Thanks for the wonderful Mercurial-related posts :-)

FWIW, regarding merges, one tricky detail which got me when I first started doing a lot of pull && merge operations with mercurial is that the head to which the workspace is updated before the merge is significant!

So, for example, if there's a parent/child workspace and the parent initially contains two files ('alpha' and 'beta') at the time the clone operation is done, then after the following set of operations:

% find parent
parent/alpha
parent/beta

% hg clone parent child
% find child
child/alpha
child/beta

% cd child ; \
  hg rm alpha && hg commit -m 'del alpha' ; \
  cd ..

% cd parent ; \
  echo foo >> alpha ; \
  hg ci -m 'more alpha' alpha ; \
  cd ..

% cd child ; \
  hg pull

if the child workspace is updated to its "local" head (i.e. not the tip), the merge will keep the 'alpha' file deleted...

if the child workspace is updated to "tip" before the merge, the file 'alpha' may reappear even though it was locally deleted in the 'child' workspace.

The general rule appears to be:

"the head which is the 'parent' revision of the workspace in the history graph (i.e. the one to which we "update" before the merge) has precedence when the merge operation considers file renames, moves, copies or deletions"

Unfortunately, I don't know if Teamware has a similar notion of whose changes have precedence, so I can't really explain this in terms which would be more useful to people who are accustomed to 'bringover' and 'merge' with TW.

Posted by Giorgos Keramidas on May 03, 2007 at 04:50 PM PDT #
