On The Margins

(Masood Mortazavi)



Friday October 01, 2004

[ Web ] The Failure of Search (or the Fallacy of Abundance)

So, what's going on with web search? Why does it give such high rank to (at best) marginal material, such as what this weblog has to say on certain topics? Or, as I asked earlier, why do we even feel that we get anything relevant when we perform a search on the Web? How much better material are we actually missing when we limit ourselves to the findings of a search engine?

It is in asking those sorts of questions that we can arrive at modest discoveries or at least novel explanations of what we see around us.

To further the investigation I reported earlier, I went back to the chapters on search in Hubert Dreyfus' little book, On the Internet. According to Dreyfus, given the immense size of the Net, it is "estimated that search engines can recall at most 2 per cent of the relevant sites." (The number might have changed in the last three years but I don't believe that the changes, if any, would affect the arguments in any drastic way.)

We need to ask why "content" (or "information") retrieval systems receive so much hype even though they are hardly adequate when it comes to searching for specific content. How could my weblogs, even if they are somewhat useful, be ranked as the third most useful or important content on certain scholars whom I've only occasionally quoted and on whose works I still consider myself a novice?

Surely, this sort of system behavior cannot be good if we hope to find important documents or bits of knowledge through search and information retrieval.

To explain the hype regarding search and information retrieval, Dreyfus quotes computer scientist David Blair, who cites information retrieval (IR) pioneer Don Swanson:

IR pioneer Don Swanson observed this phenomenon decades ago, and calls it the "fallacy of abundance". The fallacy of abundance is the mistake a searcher makes when he uses a large IR system and is able to find some useful documents. Swanson pointed out that on a sufficiently large system . . . almost any query will retrieve some useful documents. The mistake is to think that just because you got some useful documents the IR system is performing well. What you don't know is how many better documents the system missed.

And so . . . since my weblogs can be ranked highly by Google for certain subjects, they may be perceived (by some searchers) to be more important than they really are.
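
Swanson's point can be made concrete with a little arithmetic. The sketch below is only an illustration: the counts are hypothetical numbers I made up (chosen so that recall echoes the 2 per cent figure Dreyfus cites), and `precision_and_recall` is a helper name invented for this example. Precision is the share of retrieved documents that are useful, which is what the searcher sees; recall is the share of all useful documents that were retrieved, which the searcher cannot see.

```python
def precision_and_recall(retrieved_relevant, retrieved_total, relevant_total):
    """Return (precision, recall) for a single query.

    precision = useful documents retrieved / all documents retrieved
    recall    = useful documents retrieved / all useful documents in the collection
    """
    precision = retrieved_relevant / retrieved_total
    recall = retrieved_relevant / relevant_total
    return precision, recall


# Hypothetical counts, not measurements of any real search engine.
relevant_total = 1000      # useful documents that exist somewhere in the collection
retrieved_total = 25       # results the searcher actually looks at
retrieved_relevant = 20    # of those results, how many are useful

precision, recall = precision_and_recall(
    retrieved_relevant, retrieved_total, relevant_total
)

print(f"precision: {precision:.0%}")  # what the searcher experiences
print(f"recall:    {recall:.0%}")     # what the searcher never gets to see
```

With these made-up counts, four out of five results look useful, so the system feels as if it is performing well, even though it has surfaced only a sliver of the useful material. That gap is the fallacy of abundance.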

2004-10-01 11:11:02.0



Disclaimer

I work at Sun Microsystems. The opinions expressed here are purely my own, and neither Sun nor any other party necessarily agrees with them.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.
© Masood Mortazavi
This is a personal weblog; I do not speak for my employer.