14 Jun 2005
OpenSolaris Source Browser

As a security engineer, I spend a great deal of my time wandering through the Solaris source code jungle, hunting for security bugs. I had written a simple search engine in Java specifically for searching source code. This search engine, now enhanced with cross-referencing, is being used to search and browse the OpenSolaris code on-line.
It was an experiment in extreme programming, meant to test my principle: "Make the common case faster and easier".
At each step of programming, care was taken to make it execute faster. Far more time was spent making sure it was easy to use.
A simple abstraction of a "program file" is central to the search engine. It was needed to make the search work across various source languages and binary formats:
- A program has definitions. In C, these could be macros, structures, functions or variables. In Java, these could be classes, methods or fields. For the sake of abstraction they are all "Definitions".
  When a cross reference is generated, definitions are not linked to anything else. They appear in bold italic in the browser.
- Then there are symbols. These are the places where definitions are referenced, called, invoked or used in expressions.
  When cross-referenced, they are linked to the point where they are defined.
  When the link target is defined within the same source file, the link is colored purple; otherwise it is a normal link.
- Then there is all the human-searchable text content. This includes everything: messages, string constants, keywords, numbers and operators.
- There is the file's revision history, which contains the change comments, the people who made the changes, dates and so on.
- Beyond these, a source file has a path, which can include the source tree's release and product consolidation names in addition to the path to the source, and a file extension that can indicate the programming language.
You can search on all of these fields.
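To make the abstraction concrete, here is a minimal sketch of how a "program file" could be modeled and flattened into searchable fields. It is an illustration only, not the browser's actual code: the class names and every field name except `path` (which appears in the search examples further down) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "program file" abstraction; all names are illustrative.
final class ProgramFileSketch {

    /** One source file (or binary) reduced to its searchable parts. */
    static final class ProgramFile {
        final String path;              // path within the source tree
        final List<String> definitions; // names defined in this file
        final List<String> symbols;     // names used or referenced in this file
        final String text;              // all human-searchable text content
        final String history;           // change comments, authors, dates

        ProgramFile(String path, List<String> definitions, List<String> symbols,
                    String text, String history) {
            this.path = path;
            this.definitions = new ArrayList<>(definitions);
            this.symbols = new ArrayList<>(symbols);
            this.text = text;
            this.history = history;
        }

        /** Flattens the file into field name -> field text, ready to hand to an indexer. */
        Map<String, String> toFields() {
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("path", path);
            fields.put("defs", String.join(" ", definitions)); // assumed field name
            fields.put("refs", String.join(" ", symbols));     // assumed field name
            fields.put("full", text);                          // assumed field name
            fields.put("hist", history);                       // assumed field name
            return fields;
        }
    }

    public static void main(String[] args) {
        ProgramFile file = new ProgramFile(
                "usr/src/example/hello.c",            // made-up path
                Arrays.asList("main"),
                Arrays.asList("MAXPATHLEN", "printf"),
                "int main(void) { char buf[MAXPATHLEN]; printf(\"hi\"); }",
                "initial version");
        file.toFields().forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Keeping the five parts as separate fields is what lets a query restrict one of them (for example the path) while matching free text in another.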
Apache Lucene was chosen as the core search library that indexes source code,
after considering a variety of other search engines and tools.
Exuberant Ctags is used to extract definitions from a source file.
A simple HTML pretty printer and cross-referencer formats the C/C++/Java family of languages, shell scripts and plain text.
The source browser understands the CVS, RCS and SCCS version control systems.
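The cross-referencing rules described earlier (definitions in bold italic, symbols linked, a different style for links that stay within the same file) can be sketched roughly as follows. This is a simplified illustration rather than the real cross-referencer: it treats every identifier-like token the same way, does not skip comments or string literals, and the `class="local"` attribute and the `/search?defs=` link target are made-up placeholders.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a line-by-line HTML cross-referencer.
final class XrefSketch {
    // An identifier that is not the tail of a longer word or number.
    private static final Pattern IDENT =
            Pattern.compile("(?<![A-Za-z0-9_])[A-Za-z_][A-Za-z0-9_]*");

    /** Escapes the characters that are special in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /**
     * Renders one source line as HTML.
     * definedHere: names defined on this line (shown bold italic, not linked).
     * localDefs:   names defined elsewhere in this file (linked within the page).
     * knownDefs:   names defined somewhere else in the tree (normal links).
     */
    static String renderLine(String line, Set<String> definedHere,
                             Set<String> localDefs, Set<String> knownDefs) {
        StringBuilder out = new StringBuilder();
        Matcher m = IDENT.matcher(line);
        int last = 0;
        while (m.find()) {
            out.append(escape(line.substring(last, m.start())));
            String name = m.group();
            if (definedHere.contains(name)) {
                out.append("<b><i>").append(name).append("</i></b>");
            } else if (localDefs.contains(name)) {
                out.append("<a class=\"local\" href=\"#").append(name).append("\">")
                   .append(name).append("</a>");
            } else if (knownDefs.contains(name)) {
                out.append("<a href=\"/search?defs=").append(name).append("\">")
                   .append(name).append("</a>");
            } else {
                out.append(name); // keywords and unknown names stay plain
            }
            last = m.end();
        }
        out.append(escape(line.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> definedHere = new HashSet<>(Arrays.asList("len"));
        Set<String> localDefs = new HashSet<>(Arrays.asList("len", "n"));
        Set<String> knownDefs = new HashSet<>(Arrays.asList("MAXPATHLEN"));
        System.out.println(
                renderLine("int len = MAXPATHLEN + n;", definedHere, localDefs, knownDefs));
    }
}
```

In a real pipeline the three sets would come from the definitions extracted for the file and from the index, and a language-aware lexer would replace the single regular expression.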
Some useful tips for searching the OpenSolaris Source:
How do you limit your searches only to a particular directory tree?
Searching for uses of MAXPATHLEN across the whole tree could turn up several hundred files.
If you only want to search the kernel, add usr/src/uts to the path field:
MAXPATHLEN path:usr/src/uts
Given an error message, how do you find the code responsible for it?
Suppose I see something like this in the log messages:
....
Jun 10 21:09:36 ahost bge: [ID 801593 kern.notice] NOTICE: bge0: link down
Jun 10 21:09:36 ahost in.routed[142]: [ID 238047 daemon.warning] interface bge0 to 123.123.56.78 turned off
Jun 10 21:13:00 ahost in.routed[142]: [ID 464608 daemon.error] route 123.123.56.78/23 --> 123.123.56.78 nexthop is not directly connected
Jun 10 21:23:00 ahost in.routed[142]: [ID 464608 daemon.error] route 0.0.0.0 --> 123.123.56.1 nexthop is not directly connected
Jun 10 21:33:00 ahost in.routed[142]: [ID 464608 daemon.error] route 123.123.56.0/23 --> 123.123.56.78 nexthop is not directly connected
...
Now I want to get to the source code that produces these messages.
Pick the fragments of the message that are most likely to appear verbatim in the source code and search for them.
"nexthop is not directly connected" is a good candidate. Searching for the phrase (in quotes) immediately hits the source file that produces the message.
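Picking the fragment is easy to do by eye, but the idea can also be sketched in code. The heuristic below is my own assumption, not something the source browser does: it drops everything up to the syslog `[ID ...]` tag, then keeps the longest run of purely alphabetic words, on the theory that addresses, numbers and interface names are filled in at run time while plain words come from the format string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turn a syslog line into a quoted phrase worth searching for.
final class LogPhraseSketch {

    /** Returns the text after the "[ID ...]" tag, or the whole line if there is no tag. */
    static String messageOf(String logLine) {
        int tagStart = logLine.indexOf("[ID ");
        int tagEnd = tagStart < 0 ? -1 : logLine.indexOf(']', tagStart);
        return tagEnd < 0 ? logLine.trim() : logLine.substring(tagEnd + 1).trim();
    }

    /** Longest run of consecutive alphabetic words, wrapped in quotes for a phrase search. */
    static String phraseQuery(String logLine) {
        List<String> best = new ArrayList<>();
        List<String> run = new ArrayList<>();
        for (String word : messageOf(logLine).split("\\s+")) {
            if (word.matches("[A-Za-z]+")) {
                run.add(word);
                if (run.size() > best.size()) {
                    best = new ArrayList<>(run); // remember the longest run so far
                }
            } else {
                run.clear(); // a number, address or punctuation breaks the run
            }
        }
        return "\"" + String.join(" ", best) + "\"";
    }

    public static void main(String[] args) {
        String line = "Jun 10 21:23:00 ahost in.routed[142]: [ID 464608 daemon.error] "
                + "route 0.0.0.0 --> 123.123.56.1 nexthop is not directly connected";
        System.out.println(phraseQuery(line));
    }
}
```

For the in.routed lines above this yields the same phrase chosen by hand. It is only a starting point: a message whose fixed text is split up by many run-time values would need a smarter approach.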
How do you limit your searches only to Perl scripts?
Assuming that Perl scripts have the extension "pl", add the term pl to the path field. For example, to search for "connect" in Perl scripts, use the syntax:
connect path:pl
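If you generate such queries from a script or tool, the pattern in these tips is simply "free-text terms, then an optional path restriction". The helper below is a small sketch of that pattern; the class and method names are made up for this example, and it only builds the query string without talking to the source browser.

```java
// Hypothetical helper that assembles query strings in the style shown in the tips.
final class QuerySketch {

    /** Wraps text in double quotes so it is searched as a phrase. */
    static String phrase(String text) {
        return "\"" + text.replace("\"", "\\\"") + "\"";
    }

    /** Appends a path restriction to the terms; a blank pathTerm means no restriction. */
    static String withPath(String terms, String pathTerm) {
        String trimmed = terms.trim();
        if (pathTerm == null || pathTerm.trim().isEmpty()) {
            return trimmed;
        }
        return trimmed + " path:" + pathTerm.trim();
    }

    public static void main(String[] args) {
        System.out.println(withPath("MAXPATHLEN", "usr/src/uts")); // kernel only
        System.out.println(withPath("connect", "pl"));             // Perl scripts only
        System.out.println(phrase("nexthop is not directly connected"));
    }
}
```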
More information about usage, along with examples, is on the source browser help page.
http://www.ioplex.com/~miallen/
It might be helpful for those who may not be terribly familiar with setting up Java, Servlet containers, and so on. It also has notes about a few hiccups I ran into (e.g. SRC_ROOT and DATA_ROOT must be full paths).
Posted by Mike on December 02, 2005 at 12:52 PM PST #