Data Models: Don't get me started...
Kristen's
recent posting on web site data models
digs into a problem I think a lot of us everywhere have on our web
properties: There's a lot of cruft.
Even if most of what you've got on
your Web site is state-of-the-art, data-driven pages, chances are if your
site has been around a while (Sun.com, for instance, has been around 10
years and our intranet is older than that) you're going to have an
embarrassing amount of manually maintained chunks of HTML that aren't
really well suited for interchange between different systems.
This
state of disarray leads to what the industry humorously calls
"manual repurposing" or "swivel chair interfaces" -- that is, having
people mindlessly copy and
paste chunks of text from one place to another in order to keep the
engines of the ship running, rather like shoveling coal.
It's a bit of
a hopeless exercise, because once you've decided that it's easier to
copy and paste than to invest a week or two thinking about a
data model, you've trapped yourself into a commitment to manually
feed
a potentially ever-expanding set of sites and systems that all might
need the same elements of information and each of which might want to
do something different with it. There are some reasonable ways to
automate the coal shoveling, such as
Web-based syndication services that can snarf up swatches of HTML
content that your content partners can have magically appear on their
sites, but these only go so far because blobs of HTML inherently
contain very little knowledge about the content. And so you might have
a nifty product spec description that includes weight or voltage
requirements, but if it's imprisoned in an HTML chunk without any
special markup, you can't use that weight information in, say, a
shipping description because the information just isn't accessible that
way. So, somebody retypes it and heaven help them if it changes later.
As a Web "user experience" guy, I care about this problem a lot more
than you might think. I care about it because when there's a lack of
order in
the data, I can't create systematically navigable interfaces around
content, and I can't make sure that rendered pages contain all the
content they're supposed to, and sometimes I can't even know that the
content everywhere is up to date because there's no data model or
system of record to consult for each data element.
A good case study in this topic is the new Business Solutions area
which just launched on sun.com. Originally, much of the
content was
scattered around the site in literally thousands of HTML files and
PDFs, organized into a byzantine set of directory structures that had
evolved over the years, with the navigation roughly tracking to the
byzantine directories and sometimes not working quite right because it
was all hand-maintained through the heroic efforts of various web
folks. End result from a user experience standpoint:
Lots of great case studies and other information that few people could
find.
The new system puts important content in an XML
repository and then presents it to the site visitor as if it's all
organized in one place on the web site. In the new system, content is
tagged with metadata... whose taxonomy
drives a navigational system... that in turn allows a user to zip
quickly to business case studies and articles by industry type,
technology solution type, or business goal. The rendered destination
pages are fed underneath by data in a content model that is defined in
an XML schema that also helps define template-driven authoring that
ultimately makes it easier for writers to know what content needs to
exist. The standard elements of the sun.com design here -- such
as the navigational "breadcrumbing" toward the top of the page -- are
all driven by the metadata taxonomy.
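As a rough illustration of metadata driving navigation, the sketch below tags a few content items and derives both the browse-by lists and the breadcrumb trail from those tags. It is not the actual sun.com implementation: the facet names, titles, and tag values are invented, and a plain Python list stands in for the XML repository.

```python
from collections import defaultdict

# Hypothetical content items; titles and tag values are made up.
CONTENT = [
    {"title": "Case Study A", "industry": "Finance",
     "solution": "Consolidation", "goal": "Reduce Cost"},
    {"title": "Article B", "industry": "Finance",
     "solution": "Security", "goal": "Manage Risk"},
    {"title": "Case Study C", "industry": "Healthcare",
     "solution": "Security", "goal": "Manage Risk"},
]

# The three ways a visitor can browse, per the description above.
FACETS = ("industry", "solution", "goal")


def build_navigation(items):
    """Group item titles under each facet value, e.g. industry -> Finance."""
    nav = {facet: defaultdict(list) for facet in FACETS}
    for item in items:
        for facet in FACETS:
            nav[facet][item[facet]].append(item["title"])
    return nav


def breadcrumb(item, facet, root="Business Solutions"):
    """Derive a breadcrumb trail from the item's metadata, not its file path."""
    return [root, item[facet], item["title"]]


nav = build_navigation(CONTENT)
print(nav["industry"]["Finance"])
print(" > ".join(breadcrumb(CONTENT[0], "industry")))
```

The point of the design is that no item stores its own navigation. Retagging an item or adding a facet changes the menus and breadcrumbs without anyone editing the content itself.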
It probably sounds complicated (and actually I've oversimplified quite a bit), but what it means to me as a UE guy is I can later change around the UI... or make content available for subscription... or use the same stuff on another subsite... or add new features... all without touching any of the original content. And that's all I want from my data models.
Posted by Anonymous on June 30, 2004 at 12:56 PM PDT