PANOLA and the Chuvo Test
The other day I wrote
about Chuvo, the Portuguese water dog we use to
sniff out stale web content. We have another secret weapon called the
PAges NObody Looks At (PANOLA) process that we deploy periodically.
Chuvo and PANOLA work together to keep our web site clean. The process
works roughly as follows, and you can follow it on your web site too:
Technorati Tag: Web-Design
- Create a dump of all URLs on your site. Call this AllURLs.txt.
- From your web metrics system, dump a separate list of all pages that got traffic. (Your metrics program won't know about pages that didn't get any traffic; by design, metrics systems can only measure what does happen, not what doesn't.) Call this ViewedURLs.txt.
- Choose a cutoff number of views. We use Chuvo's 8-10 views-per-quarter for inspiration, but then move the bar a bit higher, since we figure any page on Sun.com ought to be well more popular than a Portuguese water dog. So we actually cut off all pages that get fewer than about 100 views in 90 days. Call the list that remains TrimmedViewedURLs.txt.
- Normalize the format of the URLs in TrimmedViewedURLs.txt so that they exactly match the format of the URLs in the AllURLs.txt dump.
- Run a diff comparing TrimmedViewedURLs.txt against AllURLs.txt.
- The result, every URL in AllURLs.txt that does not appear in TrimmedViewedURLs.txt, is your list of purge candidates.
- Send automated email to all the affected page owners and their bosses alerting them that their pages are about to be deleted. (You do have an up-to-date online list of page owners, right?) Set a deadline for them to reply.
- In parallel, visually inspect the pages on the purge list to understand the navigational implications of deleting them. (Some pages may be linked to by other pages, and you'll want to fix that... unless you have an automated publishing system that takes care of that sort of thing automagically.)
- Review the emails from the page owners, assuming you actually got any replies. Send another reminder and mention the deadline really clearly. Actually, send about four more reminders.
- (To keep up morale, send around short humorous notes within your
team with the funniest or strangest content you find as you rummage
through the purge list.)
- After consulting with anyone who has replied, revise your final purge list. Pick a day.
- Make sure you have a backup image of the site so you can recover anything quickly if you accidentally delete it.
- Purge the pages.
- Brace yourself for one or two frantic emails from page owners who ask "why wasn't I informed of this?" even though you mailed them about five times.
- Monitor site comments and 404s to make sure you didn't delete anything really important.
- Open the frosty beverage of your choice and celebrate a cleaner
web site!
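
If you'd rather script the normalize-and-diff steps from the list above than do them by hand, here is a minimal Python sketch. It is an illustration under assumptions, not a description of any actual tooling: the function names `normalize` and `purge_candidates`, the example.com URLs, and the normalization rules (drop the scheme, query string and fragment, lowercase the host, strip a trailing slash) are all hypothetical, and your site may need different rules. The 100-view cutoff is the one mentioned above.

```python
from urllib.parse import urlsplit

VIEW_CUTOFF = 100  # views per 90 days, the cutoff described in the list above


def normalize(url):
    """Reduce a URL to host + path so both dumps use one format.

    Assumed rules (adjust for your site): ignore scheme, query and
    fragment, lowercase the host, and strip any trailing slash.
    """
    url = url.strip()
    if "//" not in url:
        # Scheme-less entries like "www.example.com/a" need a leading
        # "//" so urlsplit treats the first part as the host.
        url = "//" + url
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return parts.netloc.lower() + path


def purge_candidates(all_urls, view_counts, cutoff=VIEW_CUTOFF):
    """Return every URL in all_urls that did not reach the view cutoff.

    all_urls    -- iterable of URLs (the AllURLs.txt dump)
    view_counts -- dict of URL -> views (the ViewedURLs.txt dump)
    """
    # Sum views per normalized URL, in case the metrics dump lists the
    # same page under more than one spelling.
    totals = {}
    for url, views in view_counts.items():
        key = normalize(url)
        totals[key] = totals.get(key, 0) + views

    # Pages at or above the cutoff survive (TrimmedViewedURLs.txt).
    kept = {url for url, views in totals.items() if views >= cutoff}

    # Everything else on the site is a purge candidate.
    return sorted({normalize(u) for u in all_urls} - kept)


if __name__ == "__main__":
    all_urls = [
        "http://www.example.com/a/",
        "http://www.example.com/b",
        "http://www.example.com/c",
    ]
    view_counts = {
        "www.example.com/a": 250,        # popular, stays
        "HTTP://WWW.Example.COM/b/": 12,  # below the cutoff
        # /c never shows up in the metrics dump at all
    }
    for url in purge_candidates(all_urls, view_counts):
        print(url)
```

Note that the page missing from the metrics dump entirely (`/c` here) lands on the purge list alongside the low-traffic one, which is the whole point of starting from the full URL dump rather than from the metrics data.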