Jed Michnowicz's Weblog

ZFS and Interwoven TeamSite

Thursday Nov 06, 2008

We recently upgraded our core CMS software.  Due to various platform dependencies, we were forced to upgrade EVERYTHING in one swoop including:

Given how many things could go wrong with the upgrade, we made it through without too much pain.

One of the main reasons for using ZFS was to simplify our backup process.  With our old setup, the process was:

  • Freeze the application
  • Offline a mirror
  • Unfreeze the application
  • Backup the offline mirror
  • Online the mirror and wait, and wait, and wait, and wait some more for everything to sync up

After 23.8 hours and a completely pegged IO subsystem, that process would [usually] complete and we could start on the next day's backup.

Switching to ZFS gives us the ability to:

  • Freeze the application
  • Create a snapshot
  • Unfreeze the application
  • Backup the snapshot

With the new process, the backup completes in ~5 hours and we do not saturate the IO subsystem with constant disk resyncing. 
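The new sequence can be sketched in Python as follows. This is only an illustration of the ordering, not our actual backup script: the `freeze`, `unfreeze` and `backup` callables are hypothetical placeholders for whatever your application and backup tool provide, and the snapshot naming scheme is an assumption. The only real command used is `zfs snapshot`.

```python
import subprocess
from datetime import date

def snapshot_backup(dataset, freeze, unfreeze, backup,
                    run=subprocess.check_call, today=None):
    """Freeze, snapshot, unfreeze, then back up from the snapshot."""
    # Hypothetical naming scheme: <dataset>@backup-YYYY-MM-DD
    snap = "%s@backup-%s" % (dataset, (today or date.today()).isoformat())
    freeze()
    try:
        # Taking a snapshot is quick, so the freeze window stays short.
        run(["zfs", "snapshot", snap])
    finally:
        # Always unfreeze, even if the snapshot command failed.
        unfreeze()
    # The backup reads from the snapshot while the application keeps running.
    backup(snap)
    return snap
```

Passing `run` as a parameter makes it possible to dry-run the sequence on a machine without ZFS.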

While we were sure to get significant administrative gains with ZFS, there was no case history of ZFS with TeamSite.  On a developers' forum, I could not find anyone using ZFS with TeamSite.  I guess someone has to be the first?

Overall, we did not have any significant setbacks, but we did run into a conflict with the ARC cache.

(If you are a TeamSite/Solaris customer, take note....) 

ZFS is known to be aggressive with memory usage and will utilize unallocated memory.  This conflicted with TeamSite, which likes to monitor free memory.  When the application senses the box is low on memory, TeamSite will shut itself down to prevent data corruption.  After a few incidents of this and a few hours of searching for "zfs" and "memory", I quickly realized this was a simple fix: limit the ARC cache and reboot the box.  (Some might argue scheduling the reboot was a greater feat than finding a fix.)
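For anyone wanting to try the same thing: on Solaris 10 the ARC is commonly capped with the `zfs_arc_max` tunable in /etc/system, which takes a size in bytes and needs a reboot to take effect. The helper below is a small Python sketch that formats such a line. The 4 GiB figure is only an example; the right value depends on how much memory your application needs, so treat it as an assumption.

```python
def arc_max_line(limit_gib):
    """Return an /etc/system line capping the ZFS ARC at limit_gib GiB."""
    limit_bytes = int(limit_gib * 1024 ** 3)
    # /etc/system accepts the byte count in hex.
    return "set zfs:zfs_arc_max = 0x%x" % limit_bytes

if __name__ == "__main__":
    # Example value only; size the cap for your own box.
    print(arc_max_line(4))
```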

Another oddity during the upgrade, which is really filesystem independent, was having too much storage.  After initializing our new storage system and firing up the app, it threw the following error:

[Tue May  6 14:29:50 2008]     iwts    master  DiskFail        /store/temp/default     statvfs 
failed on /store/temp/default (Value too large for defined data type)

With 2.9TB of available storage, I believe we ran into a 2^32 error.  We have "5,528,735,715" blocks on our filesystem while 2^32 is "4,294,967,296".  With ZFS, the fix was as simple as:

zfs set quota=1.5GB pool/store

(Still waiting for an official answer from the vendor.  Holding breath.)
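The arithmetic behind that guess can be checked with a few lines of Python. The two counts come from the numbers above; the 512-byte block size is my assumption (it is roughly what 2.9TB divided by the reported block count works out to) and is not something the vendor has confirmed.

```python
BLOCKS_REPORTED = 5528735715   # block count quoted above
LIMIT_32BIT = 2 ** 32          # 4,294,967,296
ASSUMED_BLOCK_SIZE = 512       # bytes per block; an assumption, not confirmed

def fits_in_32_bits(blocks):
    """True if the block count can be held in an unsigned 32-bit field."""
    return blocks < LIMIT_32BIT

def max_safe_bytes(block_size=ASSUMED_BLOCK_SIZE):
    """Largest filesystem size whose block count still fits in 32 bits."""
    return (LIMIT_32BIT - 1) * block_size

if __name__ == "__main__":
    print(fits_in_32_bits(BLOCKS_REPORTED))
    print(max_safe_bytes())
```

Under that assumption the ceiling sits just below 2 TiB, so any quota that keeps the filesystem under that size should keep the block count in range.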

Finally, we have experienced some performance issues when "cloning" the "backing store" filesystem.  Upon clone destruction, the application intermittently freezes and zfs commands hang for a few minutes.  This issue seems to be limited to clones, as we can create and destroy snapshots without issue.  (We don't need clones, so we decided to avoid them until we can properly investigate.)

Overall, I have been very happy with ZFS as a storage solution for TeamSite.  It simplifies backups and is very simple to manage.



Webserver 7 and sun.com

Thursday Nov 06, 2008

This past summer, our team upgraded many of Sun's websites to Sun Webserver 7. This included www.sun.com, most of our "country" sites (uk.sun.com, cn.sun.com, etc.) and, most recently, www.java.com. In all, we upgraded approximately seven different server instances across 50 unique sites.

During the initial migration, we intentionally ignored some of the new features of Webserver 7. We knew the upgrade had the potential to make our lives a lot easier, but we wanted to focus on stability before we started making significant changes. Over the past few months, I have fallen in love with Webserver 7 and the complex functionality you can pull off with some simple configuration tweaks. 

(One trend you will notice is that none of these hacks would be possible without the "<If>" statement.)

Selective Expires

A group requested the ability to add an "Expires" header to certain content.  For an admin, this can be a nightmare since the list of files getting the expires header will constantly change over time.  Also, before big content updates, the header should be temporarily disabled.  Using the "lookup" function, I can empower web publishers to control this setting themselves.

Create a control file:

expires.txt
# days until expiration (must be an integer)
/images/file.jpg       1
/js/Menu.js            14
/im/never-updates.jpg  500
/js/gl.js              1

Add the following config to:

  • Add an expires header to any matched URI
  • Process updates to the control file in real-time

<If lookup('expires.txt',$uri)>
  ObjectType fn="set-variable"
     insert-srvhdrs="Expires: $(httpdate($time + (lookup('expires.txt',$uri)) * 86400))"
</If>
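To make the expression easier to follow, here is the same calculation as a Python sketch: look the URI up in the control file, multiply the day count by 86400 seconds, add it to the current time and format the result as an HTTP date. The function names are made up for illustration, and this models only the arithmetic, not how the web server actually evaluates `lookup` or `httpdate`.

```python
from email.utils import formatdate

def load_expires(text):
    """Parse 'URI days' lines, skipping blank lines and # comments."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        uri, days = line.split()
        table[uri] = int(days)
    return table

def expires_header(table, uri, now):
    """Return the Expires value for uri, or None if the URI is not listed."""
    days = table.get(uri)
    if days is None:
        return None
    # 86400 seconds per day, formatted as an RFC-style GMT date.
    return formatdate(now + days * 86400, usegmt=True)
```

Calling `expires_header(table, "/js/Menu.js", time.time())` with the control file above would give a date 14 days out, while an unlisted URI gets no header at all.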

Backend Application Hand-off

Significant portions of our sites are serviced by a back-end application.  In the old days, our only option was to map the back-end URIs in a config file processed by a custom NSAPI module.  While this worked, it was not an elegant solution and resulted in delays whenever people wanted to shuffle content between the webserver and the back-end application.  With Webserver 7, we can perform a file stat to determine whether we should serve content locally from the webserver or send the request to the back end.

<If not -f "$docroot/$path" and not -f "$docroot/$path/index.html" and not -f "$docroot/$path/index.jsp">
     NameTrans fn="map" from="/" name="passthrough-backend" to="http:/"
</If>
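The decision the `<If>` block makes can be modelled in a few lines of Python. This is a sketch of the logic only: the function name is hypothetical, and it glosses over details such as how the server normalizes `$path`.

```python
import os

def serve_locally(docroot, path):
    """True if the docroot has a file or an index page for this request path."""
    base = os.path.join(docroot, path.lstrip("/"))
    candidates = [
        base,                              # the file itself
        os.path.join(base, "index.html"),  # a directory with a static index
        os.path.join(base, "index.jsp"),   # a directory with a JSP index
    ]
    # Anything that matches none of these is handed to the back end.
    return any(os.path.isfile(c) for c in candidates)
```

The nice property is that moving content between the two tiers needs no config change: publish a file and the webserver serves it, remove it and the request falls through to the back end.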

URI Rewrites

Some of our sites share the docroot for http and https content.  The SEO czars wanted to control the crawling of http and https content separately.  To enable this, we used the "restart" function, which is very similar to Apache's "mod_rewrite".

<Client security="true">
  <If $uri =~ '^/robots.txt'>
    NameTrans fn="restart" uri="/robots-ssl.txt"
  </If>
</Client>

Now we can restrict robots from crawling the https content. 
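As a rough model of what that rule does, here is the same match in Python using the identical pattern. The `rewrite` function is purely illustrative, and I am assuming the server's regex treats `.` as "any character" the way Python's does. If so, the pattern is a little looser than it looks, since it is not anchored at the end and the dot is unescaped.

```python
import re

# Same pattern as in the config: anchored at the start only, dot unescaped.
ROBOTS = re.compile(r"^/robots.txt")

def rewrite(uri, secure):
    """Return the URI to serve, swapping in the SSL robots file on https."""
    if secure and ROBOTS.search(uri):
        return "/robots-ssl.txt"
    return uri
```

A stricter pattern such as `^/robots\.txt$` would match only the exact file, if that matters for your site.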

We also use restarts to maintain some legacy URLs that cannot be redirected.  Previously, we used a custom NSAPI module for this functionality, which we happily EOL'd.


∞ miles to the gallon

Friday Apr 07, 2006

With this switch to daylight saving time, it is officially bike-to-work season! While other Sun bloggers may not have a season per se, I have too many cell-phone-talking, makeup-applying, bad-driving drivers to contend with for a safe ride in the dark. I have enough trouble playing "frogger" in traffic during daylight hours.

My commute is ~18 miles one way from Santa Clara up to Menlo Park. Generally it takes about an hour, but it really depends on how many red lights I hit. While cycling takes no "gas", it does take a lot of fuel. On average, I probably consume an extra $2-$3 in food on bike commute days, but with a car commute cost of ~$15, I still end up saving money. Oh yeah, there is also that whole good-for-the-environment thing.


PowerPoint Detox

Thursday Apr 06, 2006

I was at the Interwoven "GearUp" conference earlier this week. This is an annual conference where customers, analysts, and reporters can get together for a few days to learn, chat, and complain about Interwoven products and generic content management topics. Most of the day is spent staring at PowerPoint presentations from product managers, customers and partners. Since I was a speaker at this year's conference, I was able to slip a little StarOffice into the mix during my content distribution talk. Most attendees probably did not realize it until my last graphic, but they were being "StarOffice-toxed". Ha ha! Take that, Microsoft Office!

Overall I enjoyed commiserating with colleagues and engaging in surprisingly candid conversations with some of the Interwoven folks. It is always good to know that:

  1. For whatever reason, "the web"* is hard to get right
  2. "Enterprise" software always falls short of expectations, but it does get a little bit better each year
  3. We run a rather complete web platform compared to other companies, but it is by no means perfect or complete

* By "web" I mean content management, content distribution, content presentation and localization. As for the actual content, I tend to be agnostic toward what is contained within the web pages. I leave that to the creative folk.

PS: This is my first post so I will do a quick who am I...

I am a content management engineer in the content management engineering group. We provide the platform for content authoring, management, distribution (publishing) and delivery for sun.com and many other external web properties.
