Oct
12

[from blog4mantero's FlickrStream]

OK, maybe that's a bit harsh - it can't be that bad; after all, people (including myself) use it all the time. But it is fair to say that search on the web is a problem yet to be solved.

Google, Ask, Y!, MSN, AltaVista - they all do a fairly basic job of locating simple strings on web pages (and, in the case of Google, strings embedded in other rich document formats). But the way search works today is pretty basic, and it's common for the user to have to repeat the search process multiple times to get what they want - i.e. the user has to adapt to the machine's results and try again (and again). So basically, search is a largely human endeavour, and the machines (Google, etc.) just help with the volume problem.

There are other problems:

  • The problem is growing very fast. More people are adding more stuff to the Web, and much of it is not good quality (in terms of correctness of content / authority) - e.g. this post. I know very little about 'search' and I'm far from being an expert on the subject, but even the most powerful search algorithms can't really tell you whether I'm an authority on the subject or just someone with spare time and an opinion. Max has commented on the more general issue in his post on Existential Phenomenology.
  • Much of the decent content on the Web is opaque to search. Historically, much of the content on the web has been text (HTML, XML, XHTML, etc.), which is handy as it's trivial for a search engine to access. Now we're seeing more rich document formats - PDF, MS Word, OpenDocument in the future - and these present a problem, but one that's being overcome. The real challenge comes when more of the content is in the form of audio (podcasts, etc.) and video. This matters because (traditionally at least) the quality of this content has been higher, since the production costs were higher. So, for example, if you wanted to learn about "The Feeding Habits of Mountain Lions", I would suggest getting hold of a video documentary by the BBC Natural History Unit (world renowned for being good at this kind of thing) as opposed to spending an hour or so searching the web for something appropriate and authoritative. The problem is that video is completely opaque to search engines - there is a little bit of meta-data describing the video, but that isn't enough.
  • Old habits die hard / searchers are lazy. The people doing the searching have adopted a bad habit - type a word (or two) into the box and click (repeat as necessary). There are more effective ways to find what you want - i.e. go to the place likely to have a good answer, or somewhere you trust, and start your search there.
Fortunately, there are many companies / organizations applying brain-power to the problem. The following links are probably worth clicking if you're interested in how these problems and others are being tackled:

There are plenty of other promising technologies and services out there (too many to list) - this is clearly an area where a winner is yet to emerge.
