Chandan chandanlog(3C)
or sayings of an hearer
or A Blog of a Security Engineer
or The Official Online Journal of Chief Executive Prankster, Sun Microsystems Inc.,


14 Jun 2005 OpenSolaris Source Browser
As a security engineer, I spend a great deal of my time wandering through the Solaris source code jungle, hunting down security bugs. I had written a simple search engine in Java, meant specifically for searching source code. This search engine, now enhanced with cross-referencing, is being used to search and browse the OpenSolaris code online.

It was an experiment in extreme programming, testing my principle: "Make the common case faster and easier". At each step of programming, care was taken to make it execute faster. A lot more time was spent making sure it was easy to use.

A simple abstraction of a "program file" is central to the search engine. It was needed to make the search work across various source languages and binary formats:
  1. A program has definitions: In C, these could be macros, structures, functions or variables. In Java, these could be classes, methods or fields. For the sake of abstraction they are all "Definitions". When a cross reference is generated, definitions are not linked to anything else. They are shown in bold italic in a browser.
  2. Then there are symbols. These are the places where definitions are used, referenced, called, invoked or used in expressions. When cross-referenced, they are linked to take you to the point where they are defined. When the target of the link is defined within the same source file, they are colored purple; otherwise they are normal links.
  3. Then there is all the human-searchable text content. This includes everything else, such as messages, string constants, keywords, numbers and operators.
  4. There is the file's revision history, which contains change comments, the people who made the changes, dates, etc.
  5. Other than these, a source file has a path, which can include the source tree's release and product consolidation names in addition to the path to the source, and a file extension that can tell the type of programming language.
You can search on all of these fields.
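
To make the abstraction concrete, here is a minimal sketch of what such a "program file" record could look like in Java. It is not the search engine's actual code: the class name, the field names other than "path", the sample file and the crude substring matching are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the "program file" abstraction. Apart from "path",
// the field names are illustrative and not taken from the search engine.
final class ProgramFileSketch {
    final List<String> definitions; // 1. macros, functions, classes, methods...
    final List<String> symbols;     // 2. places where definitions are used
    final String fullText;          // 3. messages, strings, keywords, numbers
    final String history;           // 4. change comments, authors, dates
    final String path;              // 5. release, consolidation, path, extension

    ProgramFileSketch(List<String> definitions, List<String> symbols,
                      String fullText, String history, String path) {
        this.definitions = definitions;
        this.symbols = symbols;
        this.fullText = fullText;
        this.history = history;
        this.path = path;
    }

    // Returns the text stored under a searchable field name.
    String field(String name) {
        switch (name) {
            case "defs": return String.join(" ", definitions);
            case "refs": return String.join(" ", symbols);
            case "full": return fullText;
            case "hist": return history;
            case "path": return path;
            default: throw new IllegalArgumentException("unknown field: " + name);
        }
    }

    // Crude substring match; a real index would tokenize each field.
    boolean matches(String fieldName, String term) {
        return field(fieldName).contains(term);
    }

    // A made-up source file, used only to exercise the sketch.
    static ProgramFileSketch example() {
        return new ProgramFileSketch(
            Arrays.asList("main", "usage"),
            Arrays.asList("printf", "exit", "usage"),
            "usage: hello [name]",
            "initial version",
            "usr/src/cmd/hello/hello.c");
    }

    public static void main(String[] args) {
        ProgramFileSketch f = example();
        System.out.println(f.matches("defs", "usage"));
        System.out.println(f.matches("path", "usr/src/cmd"));
    }
}
```

Keeping every kind of content behind one field lookup is what lets the same query form work whether the file is C, Java or a shell script.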

Apache Lucene was chosen as the core search library that indexes source code, after considering a variety of other search engines and tools. Exuberant Ctags is used to extract definitions from a source file. A simple HTML pretty printer and cross-referencer formats the C/C++/Java family of languages, shell scripts and plain text formats. The source browser understands the CVS, RCS and SCCS source code versioning systems.

Some useful tips for searching the OpenSolaris Source:

How do you limit your searches only to a particular directory tree?

Searching for uses of MAXPATHLEN on the whole tree could return several hundred files. If you want to search only in the kernel, add usr/src/uts to the path field: MAXPATHLEN path:usr/src/uts

Given an error message, how do you find the code responsible for it?

Assume I see something like this in the log messages: ...
Jun 10 21:09:36 ahost bge: [ID 801593 kern.notice] NOTICE: bge0: link down
Jun 10 21:09:36 ahost in.routed[142]: [ID 238047 daemon.warning] interface bge0 to 123.123.56.78 turned off
Jun 10 21:13:00 ahost in.routed[142]: [ID 464608 daemon.error] route 123.123.56.78/23 --> 123.123.56.78 nexthop is not directly connected
Jun 10 21:23:00 ahost in.routed[142]: [ID 464608 daemon.error] route 0.0.0.0 --> 123.123.56.1 nexthop is not directly connected
Jun 10 21:33:00 ahost in.routed[142]: [ID 464608 daemon.error] route 123.123.56.0/23 --> 123.123.56.78 nexthop is not directly connected
...
Now I want to go to the source code that is causing these messages. Pick the fragments of the message that are most likely to appear literally in the source code and search for them. "nexthop is not directly connected" is a good candidate. Searching for the phrase (in quotes) immediately finds the source file that produces that message.
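
Picking the fragment can be done by eye, but the idea is simple enough to sketch in Java. The heuristic below is my own assumption and not something the source browser does for you: it treats any token that is not purely alphabetic (addresses, arrows, counters) as variable data and keeps the longest run of plain words. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: guess which part of a log message is a string
// constant in the source, by keeping the longest run of alphabetic words.
final class PhrasePickerSketch {
    static String longestWordRun(String message) {
        String best = "";
        int bestCount = 0;
        List<String> run = new ArrayList<>();
        for (String token : message.trim().split("\\s+")) {
            if (token.matches("[A-Za-z]+")) {
                run.add(token);
                if (run.size() > bestCount) {
                    bestCount = run.size();
                    best = String.join(" ", run);
                }
            } else {
                // Numbers, addresses and punctuation are likely format
                // arguments, so they end the current run.
                run.clear();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String msg = "route 123.123.56.78/23 --> 123.123.56.78 "
                   + "nexthop is not directly connected";
        // Quote the result so it can be pasted as a phrase search.
        System.out.println("\"" + longestWordRun(msg) + "\"");
    }
}
```

For the in.routed message above, this picks the same phrase I chose by hand.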

How do you limit your searches only to Perl scripts?

Assuming that Perl scripts have the extension "pl", add the term pl to the path field. For example, to search for "connect" in Perl scripts, use the syntax: connect path:pl
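
Both path: examples work the same way if you think of a file's path as a sequence of tokens. The Java sketch below is an assumption about how such matching could behave, not the browser's real implementation: it splits paths on "/" and "." and asks whether the query's tokens appear as a consecutive run. The class name, method names and sample paths are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical path filter: a query such as path:usr/src/uts or path:pl
// matches when its tokens occur consecutively in the file's path tokens.
final class PathFilterSketch {
    // Split on '/' and '.', dropping empty pieces.
    static List<String> tokens(String path) {
        return Arrays.stream(path.split("[/.]+"))
                     .filter(t -> !t.isEmpty())
                     .collect(Collectors.toList());
    }

    static boolean pathMatches(String filePath, String queryPath) {
        List<String> hay = tokens(filePath);
        List<String> needle = tokens(queryPath);
        if (needle.isEmpty()) {
            return true; // an empty path term restricts nothing
        }
        return Collections.indexOfSubList(hay, needle) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(pathMatches("usr/src/uts/common/os/main.c", "usr/src/uts"));
        System.out.println(pathMatches("usr/src/cmd/perl/contrib/foo.pl", "pl"));
        System.out.println(pathMatches("usr/src/cmd/sh/main.c", "pl"));
    }
}
```

Because the extension is just another token, "pl" selects foo.pl without also matching the perl directory name.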

More information about usage, along with examples, is on the source browser help page.


Comments:

Fantastic. Really useful info.

Posted by Bharath on June 15, 2005 at 12:56 AM PDT #

Wtf! This source browser owns everything I have ever seen before. Can you please release the sourcecode of this tool?

Posted by Maw on June 15, 2005 at 02:10 AM PDT #

Efforts to open-source the tool are under way. Watch this blog for more news and progress.

Posted by Chandan on June 15, 2005 at 09:19 PM PDT #

Bravo! This is a very nice tool. Please oh please release this to the public. I would love to browse my code like the OpenSolaris Source Browser.

Posted by Mike on July 26, 2005 at 09:17 PM PDT #

This is the best source browsing tool that I have seen. I had been looking into how the OpenSolaris Source Browser works for 2 weeks and finally found this page. A new 'must have' tool for developers. Can't wait to use the tool!

Posted by CK on August 06, 2005 at 03:45 AM PDT #

I agree about the source browser - it is far superior to anything I have used and I would welcome the opportunity to use it within my organization.

Posted by Dave on August 19, 2005 at 11:59 AM PDT #

Just released the sources and the tool at http://www.opensolaris.org/os/project/opengrok/. The project is named OpenGrok (it helps you Grok Open Source better!).

Posted by Chandan on November 15, 2005 at 11:41 PM PST #

How do you get the tool to recurse down a dir and index all files? It only sees files under the SRC_ROOT.

Posted by Zvendil on November 19, 2005 at 08:56 PM PST #

Make sure that directories are not symlinks - it ignores them. You can send bugs/comments to opengrok at sun-dot-com.

Posted by Chandan on November 21, 2005 at 08:15 AM PST #

I've created a brief description of installing and running OpenGrok:

http://www.ioplex.com/~miallen/

It might be helpful for those who may not be terribly familiar with setting up Java, Servlet containers, and so on. It also has notes about a few hiccups I ran into (e.g. SRC_ROOT and DATA_ROOT must be full paths).

Posted by Mike on December 02, 2005 at 12:52 PM PST #

too simple

Posted by 202.122.23.3 on June 06, 2006 at 06:16 AM PDT #


Copyright (cc) 2004-2006 by Chandan chandanlog(3C): OpenSolaris Source Browser