Minion: An open source search engine from Sun Labs
Saturday Apr 19, 2008
I just created a java.net project for Minion. Minion is the name that we (which is to say, Jeff) came up with for the open sourcing of the Sun Labs search engine. The engine we're open sourcing is a substantial revision of the engine that ships with the Portal Server and Web Server.
In the simplest terms, Minion provides an API for indexing and searching documents. Minion has a pretty liberal interpretation of what a document is: a document is a map from field names to field values. If you want to index data, you just have to figure out what the fields are that you want indexed and how they should be treated by the engine. The indexer takes a document as a java.util.Map and adds it to the index. This simple model turns out to be fairly useful for a pretty wide range of things.
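To make the document-as-a-map idea concrete, here is a minimal sketch in Java. It is not Minion's API: the class ToyIndex, its index and search methods, and the field names "title" and "body" are all hypothetical, invented only for illustration. It treats every field value as text and builds a tiny in-memory inverted index, with none of the per-field treatment a real engine would offer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index for illustration only; this is NOT Minion's API.
class ToyIndex {
    // Maps each term to the keys of the documents that contain it in any field.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Adds a document, given as a map from field names to field values.
    public void index(String key, Map<String, Object> document) {
        for (Map.Entry<String, Object> field : document.entrySet()) {
            // Assumption: every field value is treated as lowercased free text.
            String text = String.valueOf(field.getValue()).toLowerCase();
            for (String term : text.split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(key);
                }
            }
        }
    }

    // Returns the keys of all documents containing the given term.
    public Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();

        Map<String, Object> first = new HashMap<>();
        first.put("title", "Minion search engine");
        index.index("d1", first);

        Map<String, Object> second = new HashMap<>();
        second.put("title", "Portal Server");
        second.put("body", "ships with an older engine");
        index.index("d2", second);

        System.out.println(index.search("engine")); // prints [d1, d2]
    }
}
```

The point of the sketch is only the shape of the input: the caller decides what the fields are, and the indexer receives one java.util.Map per document.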
As far as querying goes, Minion provides ranked boolean, proximity, and parametric query operators. In addition to the query operations, Minion provides document similarity operations as well as automatic document classification and document clustering capabilities.
Once the project's officially approved (and we clean things up a wee bit) we'll be putting the source code into the java.net repository.
For the next little while, I'll be blogging about the engine, both in general and in extremely specific detail.





