Kirk Pearson's Weblog

Wednesday July 28, 2004
how the web bypasses language barriers
I want the world to know about distributed computing, and to put its wasted computing cycles to good use. But since I only speak English (yes, I'm another one of those lazy Americans who hasn't learned another language fluently because much of the rest of the world has been gracious enough to learn English), I can only tell others about distributed computing in English.
Fortunately, some of those other world citizens who have learned English have also been gracious enough to translate my site into their languages: Hebrew, Italian, Portuguese, German, Czech, and now Chinese (hello to the Equn.com team!). Thanks to these people I can reach more of the world in spite of my language handicap. And thanks to the web, I can do this much more easily and quickly than was ever possible before.
Some day Douglas Adams' babel fish may exist, and none of us will have language barriers, but until then we can use some of the increasing number of tools on the web to help us bypass our language barriers. The babelfish machine translator site allows people to translate small amounts of text or web pages among various languages. The translations are not always of high quality, but they are free, fast, and usually understandable (and the mistranslations are often funny). I use this site frequently to read discussion forum postings in French, German, Swedish, Russian, Korean and other languages: without this site I would miss out on these discussions. Wikipedia, an online, "open-source" encyclopedia, allows people to create encyclopedia articles in many languages and to easily translate them into other languages. The Worldwide Lexicon project will create a "network of multilingual dictionary and translation services for the Internet." This project will allow human volunteers to translate words and phrases quickly and accurately among many languages.
I am amazed at how many ways there are now for people to access all of the information on the web, regardless of the language it is written in.
(2004-07-28 14:32:18.0)

Monday July 26, 2004
Google search is broken!
As of 10:15 AM MDT (4:15 PM UTC), if you try to perform a search at Google, you will see:
503 Server Error
Google
Error
Server Error
The service you requested is not available at this time.
Service error -27.
I can't think of a time that Google search has ever been broken when I've tried to use it. I wonder if this is related to all of today's news about Google's upcoming IPO?
And speaking of IPOs, what was the last company to IPO between $106 and $136 per share? Google must not be worried about attracting investors.
11:30 AM MDT update: according to a Slashdot article, "the latest MyDoom worm variant has caused a bit of an Internet storm. Google, at this time (12:28 EDT), is returning 503 errors on all queries submitted from certain locations."
Thanks to Chris Calkins for pointing out Google's search problems.
(2004-07-26 09:48:38.0)

Friday July 23, 2004
Just one page a day
What do you do while you're listening to a conference call, or waiting for an application to compile, or waiting for a document to print? Twiddle your thumbs? Surf the web for the latest depressing financial news and gossip on celebrities? Why not use that wasted time to contribute to the world's first online library of free electronic texts, Project Gutenberg?
Project Gutenberg was begun in 1971 by Michael Hart, with the goal of creating a library of 10,000 electronic books which would be free and downloadable to anyone with a computer. The library contains books that have entered the public domain after their copyrights expired (these books were typically published before 1923). Until a few years ago, each electronic book was created by one person, who laboriously scanned a paper copy of the book, one page at a time, on an electronic scanner, then converted each scanned image into ASCII text, then assembled all the pieces of text into one file and proofread and formatted the file. I have contributed five books to Project Gutenberg using this technique.
In 2000, Charles Franks created the Distributed Proofreaders project to help speed up the process of creating electronic books. In this project, volunteer project managers scan paper books to create a set of images for each book, and use Optical Character Recognition (OCR) software to convert each image to a page of ASCII text. Then a project manager uploads the images and text pages to the Distributed Proofreaders server. Next, volunteer proofreaders view a list of projects at the website and choose one to proofread. Then a simple proofreading interface is displayed and a volunteer can proofread pages of the project, comparing the OCRed text to the image of the page, and making any corrections necessary to the text. After a book is completed, a post-processor assembles all of the text pages into the final e-text and submits it to Project Gutenberg, where it is sent to ftp archives all over the world. This project has made it possible to create an electronic book in hours instead of weeks.
With the help of hundreds of volunteers, the project now proofreads over 6,000 pages (twenty-four 250-page books) every day. On September 3, 2003, the project completed its 2,000th book: soon after, on October 15, 2003, it helped Project Gutenberg reach its first goal of publishing its 10,000th book. As of today Distributed Proofreaders has completed 4,841 books.
So how can you help? Sign up, choose a book that interests you, and proofread just one page a day. Depending on the difficulty of the book, this can take 5 to 30 minutes. You can proofread a sentence while you're waiting for another web page to load, a paragraph while you're waiting for your document to print, or a page while you're waiting for someone else to finish talking on a conference call. If 100 people proofread just one page a day, they will create a new electronic book in just two to three days. You don't need any special tools or skills to proofread. The project site contains detailed proofreading guidelines and tutorials, proofreading mentors, and very active discussion groups.
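For anyone who wants to check the arithmetic behind those numbers, here is a rough sketch in Python. It assumes the 250-page book length used in the example above and a single proofreading pass per page; the function names are made up for illustration and are not part of any Distributed Proofreaders tool.

```python
# Back-of-the-envelope throughput figures for volunteer proofreading.
# Assumption: a "typical" book is 250 pages, as in the post's own example,
# and every page is proofread exactly once.
PAGES_PER_BOOK = 250


def books_per_day(pages_per_day, pages_per_book=PAGES_PER_BOOK):
    """Number of whole-book equivalents proofread in one day."""
    return pages_per_day / pages_per_book


def days_to_finish_book(volunteers, pages_each_per_day=1,
                        pages_per_book=PAGES_PER_BOOK):
    """Days needed for a group of volunteers to proofread one book."""
    return pages_per_book / (volunteers * pages_each_per_day)


if __name__ == "__main__":
    print(books_per_day(6000))       # 24.0 books a day at 6,000 pages a day
    print(days_to_finish_book(100))  # 2.5 days for 100 one-page-a-day volunteers
```

Shorter or longer books, or extra proofreading rounds, would shift these figures, so treat them as estimates only.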
If you're fluent in another language besides English, you can also help proofread books in other languages in the Distributed Proofreaders Europe project. This project enables volunteers to proofread books in any Unicode-supported script.
(2004-07-23 10:47:33.0)

Wednesday July 21, 2004
Have you BOINCed today?
No, not boinked, but BOINCed. BOINC is the Berkeley Open Infrastructure for Network Computing, the next-generation distributed computing platform. It allows distributed computing organizations to develop and manage client applications on a common distributed computing platform. It also allows participants to join multiple projects without having to download a client application for each project. The BOINC core client is available for Windows, Linux, UNIX, and Mac OS X, and project client applications are available for some or all of those platforms.
SETI@home is the first distributed computing project to use BOINC. It began its BOINC-based project to search for alien radio signals on June 22, 2004. The project has had a hard time generating enough work for all of the participants who want it, but the project server and website are becoming more stable. It has a Solaris client.
Predictor@home is the second project to use BOINC: it is trying to predict protein structures from protein sequences. The project client is still in development, and is not available for Solaris, but you can participate in the project with the other platforms and you can contribute useful work.
Einstein@Home and climateprediction.net are also developing BOINC-based clients, although it doesn't look like they plan to develop Solaris clients.
If your CPU is idle, join one of these projects and start BOINCing!
(2004-07-21 09:41:43.0)

Tuesday July 20, 2004
I'm #1 in search results for "distributed computing" at google.com, anyway. Sometime in the last few days my distributed computing website took over first place in search results for "distributed computing" at Google. And it took me only 5 years to get there :-) Woohoo!
(2004-07-20 11:44:50.0)

Thursday July 15, 2004
What does your workstation/server/PC do in its spare time?
Six years ago I learned about the distributed.net project and became an active participant in its RC5-64 encryption cracking contest. A year after that I was completely hooked on the idea of distributed computing projects and I created a website to track active and upcoming public distributed computing projects (those in which anyone can participate) and to get others involved with them. The site has grown from tracking distributed.net, the SETI@home project to search for alien radio signals, and the GIMPS project to find world-record large Mersenne prime numbers, to tracking over 40 projects in many areas of science, biological sciences, and mathematics.
Since I created my website, it has grown to a viewership of 24,000 people per month and a mailing list of over 200 people who subscribe to website updates, and I have received lots of emails from people all over the world who have said my site got them interested in, and participating in, one or more distributed computing projects. But in the same time I have not gotten a single coworker, friend or relative interested in distributed computing enough to join a project and stay active in it for more than a few days. Why is this? Why is it so hard to convince the people around me of the benefits of using spare CPU cycles for distributed computing, and to participate in a project?
Modern CPUs are so fast that they have nothing to do most of the time. Your CPU is wasting millions of computing cycles doing nothing while you read this blog entry. It wastes billions more cycles while you view web pages, read emails, create documents, go out to lunch, or go home at night. With about 5 minutes of work you can install a project and use all of those spare cycles. While I am writing this message, one of my CPUs is looking for Optimal Golomb Rulers, the other is looking for a factor of 3^491 + 1, a number for which no factors are known, the idle CPU on my lab system is simulating protein folding, and the CPU on my PC at home is predicting the weather 50 years from now. Another Sun employee, lemonsqzz in the Distributed Folding project, is using enough spare CPU cycles to be the 6th-highest participating member in that project.
What is your CPU doing right now? Does it run a screensaver when you're not using it? If it's wasting computing cycles, why not use those cycles to help find a cure for cancer, design better hardware circuits, or contribute to other scientific or mathematics research?
(2004-07-15 15:28:43.0)