Useful stuff for your blog-reading pleasure.

20090917 Thursday September 17, 2009

New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene


One of the most important features of ZFS is the ability to detect data corruption through the use of end-to-end checksums. In redundant ZFS pools (pools that are either mirrored or use a variant of RAID-Z), this can be used to fix broken data blocks by using the redundancy of the pool to reconstruct the data. This is often called self-healing.

This mechanism works whenever ZFS accesses any data, because it will always verify the checksum after reading a block of data. Unfortunately, this does not work if you don't regularly look at your data: Bit rot happens and with every broken block that is not checked (and therefore not corrected), the probability increases that even the redundant copy will be affected by bit rot too, resulting in data corruption.

Therefore, zpool(1M) provides the useful scrub sub-command which will systematically go through each data block on the pool and verify its checksum. On redundant pools, it will automatically fix any broken blocks and make sure your data is healthy and clean.

It should now be clear that every system should regularly scrub its pools to take full advantage of the ZFS self-healing feature. But you know how it is: You set up your server, those little things get overlooked, and that cron(1M) job you wanted to set up for regular pool scrubbing falls off your radar.
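For comparison, here is a minimal sketch of the hand-rolled cron approach mentioned above. It is not part of the Auto-Scrub service: the function name scrub_all_pools and the script path in the crontab comment are made up for illustration, and the ZPOOL variable only exists so the command can be swapped out for testing.

```bash
# Hypothetical helper, not taken from the Auto-Scrub service.
# ZPOOL can be overridden, e.g. to point at a stub while testing.
ZPOOL=${ZPOOL:-zpool}

# scrub_all_pools: start a scrub on every imported pool.
scrub_all_pools() {
    local pool
    # "zpool list -H -o name" prints one pool name per line, without headers.
    for pool in $("$ZPOOL" list -H -o name); do
        "$ZPOOL" scrub "$pool"
    done
}

# Example root crontab entry (03:00 on the 1st of every month), assuming the
# function above is saved in a script at the made-up path below and called there:
#   0 3 1 * * /usr/local/bin/scrub-all-pools.sh
```

This works, but it scrubs on a fixed calendar date no matter when a pool was last scrubbed, which is one of the things the service described below handles for you.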

Introducing the ZFS Auto-Scrub SMF Service

Here's a service that is easy to install and configure that will make sure all of your pools will be scrubbed at least once a month. Advanced users can set up individualized schedules per pool with different scrubbing periods. It is implemented as an SMF service which means it can be easily managed using svcadm(1M) and customized using svccfg(1M).

The service borrows heavily from Tim Foster's ZFS Auto-Snapshot Service. This is not just coding laziness, it also helps minimize bugs in common tasks (such as setting up periodic cron jobs) and provides better consistency across multiple similar services. Plus: Why invent the wheel twice?

Requirements

The ZFS Auto-Scrub service assumes it is running on OpenSolaris. It should run on any recent distribution of OpenSolaris without problems.

More specifically, it uses the -d switch of the GNU variant of date(1) to parse human-readable date values. Make sure that /usr/gnu/bin/date is available (which is the default in OpenSolaris).
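To illustrate why the -d switch is handy, here is a small sketch of a "is a scrub due?" check built on it. This is not the service's actual code: the function name scrub_due and its arguments are assumptions for illustration, and the fallback to a plain "date" in the PATH only helps if that one is the GNU variant too.

```bash
# Hypothetical helper, not taken from the service's method scripts.
# Prefer the OpenSolaris location of GNU date, else whatever "date" is in PATH.
GNU_DATE=/usr/gnu/bin/date
[ -x "$GNU_DATE" ] || GNU_DATE=date

# scrub_due LAST_SCRUB PERIOD_DAYS [NOW]
# Succeeds if at least PERIOD_DAYS days have passed since LAST_SCRUB.
# LAST_SCRUB and NOW are human-readable dates that GNU date -d can parse.
scrub_due() {
    local last now
    last=$("$GNU_DATE" -u -d "$1" +%s) || return 2       # seconds since epoch
    now=$("$GNU_DATE" -u -d "${3:-now}" +%s) || return 2
    [ $(( now - last )) -ge $(( $2 * 86400 )) ]
}

# Usage: compare two explicit dates with a 30-day period.
if scrub_due "Sat Aug  1 03:00:00 2009" 30 "Thu Sep 17 07:25:34 2009"; then
    echo "scrub is due"
fi
```

Converting both dates to seconds since the epoch turns the comparison into plain integer arithmetic, which is the part the stock Solaris date(1) cannot do from a human-readable string.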

Right now, this service does not work on Solaris 10 out of the box (unless you install GNU date in /usr/gnu/bin). A future version of this script will work around this issue to make it easily usable on Solaris 10 systems as well.

Download and Installation

You can download Version 0.5b of the ZFS Auto-Scrub Service here. The included README file explains everything you need to know to make it work.

After unpacking the archive, start the install script as a privileged user:

pfexec ./install.sh

The script will copy three SMF method scripts into /lib/svc/method, import three SMF manifests and start a service that creates a new Solaris role for managing the service's privileges while it is running. It also installs the OpenSolaris Visual Panels package and adds a simple GUI to manage this service.

ZFS Auto-Scrub GUI

After installation, you need to activate the service. This can be done easily with:

svcadm enable auto-scrub:monthly

or by running the GUI with:

vp zfs-auto-scrub

This will activate a pre-defined instance of the service that makes sure each of your pools is scrubbed at least once a month.

This is all you need to do to make sure all your pools are regularly scrubbed.

If your pools haven't been scrubbed before or if the time of their last scrub is unknown, the script will proceed and start scrubbing. Keep in mind that scrubbing consumes a significant amount of system resources, so if you feel that a currently running scrub slows your system too much, you can interrupt it by saying:

pfexec zpool scrub -s <pool name>

In this case, don't worry, you can always start a manual scrub at a more suitable time or wait until the service kicks in by itself during the next scheduled scrubbing period.

Should you want to get rid of this service, use:

pfexec ./install.sh -d

The script will then disable any instances of the service, remove the manifests from the SMF repository, delete the scripts from /lib/svc/method, remove the special role and the authorizations the service created and finally remove the GUI. Notice that it will not remove the OpenSolaris Visual Panels package in case you want to use it for other purposes. Should you want to get rid of this as well, you can do so by saying:

pkg uninstall OSOLvpanels

Advanced Use

You can create your own instances of this service for individual pools at specified intervals. Here's an example:

  constant@fridolin:~$ svccfg
  svc:> select auto-scrub
  svc:/system/filesystem/zfs/auto-scrub> add mypool-weekly
  svc:/system/filesystem/zfs/auto-scrub> select mypool-weekly
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> addpg zfs application
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/pool-name=mypool
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/interval=days 
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/period=7
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/offset=0
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/verbose=false
  svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> end
  constant@fridolin:~$ svcadm enable auto-scrub:mypool-weekly

This example will create and activate a service instance that makes sure the pool "mypool" is scrubbed once a week.

Check out the zfs-auto-scrub.xml file to learn more about how these properties work.
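If you need several such instances, typing them interactively gets tedious. The following sketch generates the same svccfg commands as in the example above so they can be fed to svccfg from a file. The function name gen_scrub_instance and the file name are made up, and it assumes the commands behave the same from a command file as they do at the interactive prompt.

```bash
# Hypothetical generator, not part of the Auto-Scrub download.
# gen_scrub_instance INSTANCE POOL PERIOD_DAYS
# Prints the svccfg commands from the interactive example for one instance.
gen_scrub_instance() {
    cat <<EOF
select auto-scrub
add $1
select $1
addpg zfs application
setprop zfs/pool-name=$2
setprop zfs/interval=days
setprop zfs/period=$3
setprop zfs/offset=0
setprop zfs/verbose=false
end
EOF
}

# Usage (not executed here; the file name is an assumption):
#   gen_scrub_instance tank-weekly tank 7 > /tmp/tank-weekly.cfg
#   svccfg -f /tmp/tank-weekly.cfg
#   svcadm enable auto-scrub:tank-weekly
```

Keeping the generated file around also documents which instances you created and with which settings.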

Implementation Details

Here are some interesting aspects of this service that I came across while writing it:

  • The service comes with its own Solaris role zfsscrub under which the script runs. The role has just the authorizations and profiles necessary to carry out its job, following the Solaris Role-Based Access Control philosophy. It comes with its own SMF service that takes care of creating the role if necessary, then disables itself. This makes a future deployment of this service with pkg(1) easier, which does not allow any scripts to be started during installation, but does allow activation of newly installed SMF services.
  • While zpool(1M) status can show you the last time a pool has been scrubbed, this information is not stored persistently. Every time you reboot or export/import the pool, ZFS loses track of when the last scrub of this pool occurred. This has been filed as CR 6878281. Until that has been resolved, we need to take care of remembering the time of the last scrub ourselves. This is done by introducing another SMF service that periodically checks the scrub status, then records the completion date/time of the scrub in a custom ZFS property called org.opensolaris.auto-scrub:lastscrub in the pool's root filesystem when finished. We call this service whenever a scrub is started and it deactivates itself once its job is done.
  • As mentioned above, the GUI is based on the OpenSolaris Visual Panels project. Many thanks to the people on its discussion list for helping me get going. More about creating a Visual Panels GUI in a future blog entry.
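The lastscrub bookkeeping described in the list above boils down to setting and reading a ZFS user property. Here is a minimal sketch of that idea. The function names are assumptions and this is not the service's own script; only the property name comes from the description above. How the completion time is extracted from the zpool status output is left out.

```bash
# Hypothetical helpers, not taken from the service's method scripts.
# ZFS can be overridden, e.g. to point at a stub while testing.
ZFS=${ZFS:-zfs}
PROP=org.opensolaris.auto-scrub:lastscrub

# record_last_scrub POOL WHEN
# Stores WHEN in a user property on the pool's root filesystem, which has
# the same name as the pool. User properties survive reboots and exports.
record_last_scrub() {
    "$ZFS" set "$PROP=$2" "$1"
}

# last_scrub POOL
# Prints the stored value; -H drops the header, -o value prints only the value.
last_scrub() {
    "$ZFS" get -H -o value "$PROP" "$1"
}

# Usage (not executed here):
#   record_last_scrub mypool "Thu Sep 17 07:25:34 2009"
#   last_scrub mypool
```

Because the value lives in the pool itself, it travels with the pool when it is exported and imported on another machine.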

Lessons learned

It's funny how a very simple task like "Write an SMF service that takes care of regular zpool scrubbing" can develop into a moderately complex thing. It grew into three different services instead of one, each with their own scripts and SMF manifests. It required an extra RBAC role to make it more secure. I ran into some zpool(1M) limitations which I now feel are worthy of RFEs and working around them made the whole thing slightly more complex. Add an install and de-install script and some minor quirks like using GNU date(1) instead of the regular one to have a reliable parser for human-readable date strings, not to mention a GUI and you cover quite a lot of ground even with a service as seemingly simple as this.

But this is what made this project interesting to me: I learned a lot about RBAC and SMF (of course), some new scripting hacks from the existing ZFS Auto-Snapshot service, found a few minor bugs (in the ZFS Auto-Snapshot service) and RFEs, programmed some Java including the use of the NetBeans GUI builder and had some fun with scripting, finding solutions and making sure stuff is more or less cleanly implemented.

I'd like to encourage everyone to write their own SMF services for whatever tools they install or write for themselves. It helps you think your stuff through, make it easy to install and manage, and you get a better feel of how Solaris and its subsystems work. And you can have some fun too. The easiest way to get started is by looking at what others have done. You'll find a lot of SMF scripts in /lib/svc/method and you can extract the manifests of already installed services using svccfg export. Find an SMF service that is similar to the one you want to implement, check out how it works and start adapting it to your needs until your own service is alive and kicking.

If you happen to be in Dresden for OSDevCon 2009, check out my session on "Implementing a simple SMF Service: Lessons learned" where I'll share more of the details behind implementing this service including the Visual Panels part.

Edit (Sep. 21st) Changed the link to CR 6878281 to the externally visible OpenSolaris bug database version, added a link to the session details on OSDevCon.

"New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene" has been brought to you by Constantin's Blooog.
This entry was created on 2009-09-17 07:25:34.0 PST and is associated with the following tags:

You're welcome to use this Permalink , add a comment below or send your feedback to constantin at sun dot com.
Comments [4]


20090325 Wednesday March 25, 2009

Think Twice Before Deleting Stuff (Or Better Not at All!)


No, this is not going to be another "Remember to do snapshots" post. I'm also not going to talk about backups. Instead, let's look at some very practical aspects of deleting files.

So, why delete a file? "Trivial", you think, "so I can save space!". Sure, dear reader, but at the expense of what?

Let's stop and think for a minute. Our lives try to center around doing cool, worthwhile, meaningful, useful stuff. Deleting files isn't really cool, nor fun, it is a necessity we're forced to do. Don't you hate it when that dreaded "Your startup disk is almost full" message appears while you're in the middle of downloading new photos from your latest exciting vacation trip?

Actually, the seemingly simple act of deleting is really a challenge: "Will I need this again?", "Wouldn't it be better to archive this instead?", "Last time I was really glad I kept that email from 2 years ago, so why delete this one?". Sometimes I surprise myself thinking a long time before I really press that "ok" button or hit "Enter" after the "rm".

The reality is: Storage is cheap, so why delete stuff in the first place?

To put things in perspective, let's try an ROI analysis of deleting files. Let's say we need about 6 seconds of thinking time before we can decide whether a particular file can really be deleted without regret. Let's also assign some value to our time, say $12 per hour (I hope you're getting paid much more than that, but this is just to keep the numbers simple).

Storage is cheap, and last time I checked, a 1 TB USB hard drive cost about $100 at a major electronics retailer, with prices falling by the hour.

Now, how much space does the act of deleting a file need to free up so it justifies the effort of deciding whether to delete or keep it?

Well, our $12 per hour conveniently breaks down to $0.20 per minute, which allows us to perform 10 delete-it-or-not decisions per minute at $0.02 each. Fine. Deleting seems to be cheap, doesn't it?

Now, for that $0.02 you can buy 1/5000th of a 1 TB hard drive. Wait a minute, 1TB/5000 still amounts to 200 MB of data per $0.02! That's more than you need to store a 10-minute video, or a full CD of music, compressed at high quality! Or 20 presentations at 10MB each! Not to mention countless emails, source code and other files!

So, unless the file you're pondering is bigger than 200MB, it's not really worth even considering deleting it. I'll call this 200MB boundary the "Destructive Utility Heuristic (DUH)".
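If you want to plug in your own numbers, the calculation above fits into a few lines of shell. The function name duh_mb is made up for this sketch. It works in cents and decimal megabytes (1 TB = 1,000,000 MB) so that integer arithmetic is enough.

```bash
# duh_mb SECONDS_PER_DECISION HOURLY_RATE_CENTS DRIVE_PRICE_CENTS DRIVE_SIZE_MB
# Prints how many MB of disk space the cost of one delete decision would buy.
duh_mb() {
    # cost of decision  = seconds * hourly rate / 3600
    # MB per cent spent = drive size / drive price
    echo $(( $4 * $1 * $2 / 3600 / $3 ))
}

duh_mb 6 1200 10000 1000000   # 6 s at $12/h, $100 per TB: prints 200
duh_mb 1 1200 10000 1000000   # 1 s per decision:          prints 33
```

The threshold scales linearly with your decision time and hourly rate, and inversely with the price per terabyte, so it only goes up as disks get cheaper.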

The result is therefore: Save your time, buy more harddisk space (or upgrade your old hard drive to a bigger one before it dies) and move on. Life's too precious to waste it on deleting stuff. Create good stuff instead! Only think about deleting stuff if the file in question is bigger than 200MB.

I can hear some "Wait, but!"'s in the audience, ok, one at a time:

  • "But I can delete much faster than 6 seconds!"
    No big deal. So you can delete 1 file per second, that's still a threshold of 33MB, more than 5 songs worth or even the biggest practical business presentation or the source code to a major open source project. And harddisks are getting cheaper every day, while your time will become more and more precious as you age. Yes, if you're dead sure that file is useless junk and don't need to think about it, go ahead and delete it, but why did you save it in the first place?

  • "But I like my directories to be clean and tidy!"
    Congratulations, that's a good habit! Keeping files organized doesn't mean you need to delete stuff, though. Set up an "Archive" folder somewhere and dump everything you think you may or may not use again there. Use one archive folder for each year if you want. File search technology is pretty advanced these days so you should be able to find your archived files quicker than the time you'd take to decide which ones you'll never want to find again. Then, you can still decide to delete your whole archive from 3 years ago because you never used it, and it will likely make some sense, because its size may be above the destructive utility heuristic, but chances are you won't really care because storage will have become even cheaper after those 3 years so you won't save a big deal, relatively speaking.

  • "That still doesn't help me when that damn 'Your startup disk is almost full' message comes!"
    You're right. The point is: It's often hard to sift through data and decide what to keep and what not. That's why we dread deleting stuff and instead wait until that message comes. I'm only offering relief to those that felt that the act of having to delete stuff isn't really rewarding, and it isn't (at least while you're below the DUH). Go buy a bigger harddrive for your laptop, it's really the cost effective option. Use the numbers above to help you justify that towards your finance department.

  • "I'm still not convinced. I actually kinda like going through my files and deleting them once in a while..."
    Sure, go ahead. Just know that you could use that time to do more productive stuff, such as checking out the Sun cloud, installing OpenSolaris or testing our new Sun OpenStorage products.

  • "Wait, aren't you supposed to write about OpenSolaris, ZFS and this stuff anyway?"
    I'm glad you mentioned that :). Actually, OpenSolaris and ZFS make it even easier for you to both not care about deleting stuff while keeping your files organized at the same time. The amazing ZFS auto snapshot SMF service will create snapshots of your data automagically every 15 minutes, so it won't matter whether you delete files or not. You can then choose to either not delete them at all and just move them to some archive, or you can delete whatever you want, without the 6 seconds of thinking (just to keep stuff tidy), knowing that you'll always be able to recover those files with Time Slider later. You could then use zfs send/receive to dump your data incrementally to a file server as a backup mechanism and the hooks are already there to automate this.

See, once you think of it, there's not really a need to delete files at all any more. At least not for mere mortals like us with file sizes that are typically below the destructive utility heuristic of currently 200MB (and rising...) most of the time. Music has already reached the point where a song can be stored at studio quality with lossless compression at manageable file sizes so that kind of data won't see significant growth any more. And photos and videos will soon follow. This means we'll need to care less and less about restricting personal data storage. Instead, we now need to focus more on managing personal storage.

Now there's a completely different problem that'll keep us entertained for some time...

"Think Twice Before Deleting Stuff (Or Better Not at All!)" has been brought to you by Constantin's Blooog.
This entry was created on 2009-03-25 07:07:19.0 PST and is associated with the following tags:



20090114 Wednesday January 14, 2009

How to get Audio to work on OpenSolaris on VirtualBox

My regular working environment on the go or when working from home is, of course, OpenSolaris. I've been using it on an Acer Ferrari Laptop for years now and I can say I'm very happy with it, and that's not just because I work for Sun.

Lately, I tried OpenSolaris on VirtualBox on my private MacBook Pro. This configuration turned out to work better than the native OpenSolaris on my company's Acer Ferrari laptop! Because the MBP is 2 years newer and has a dual-core CPU plus 4 GB of RAM, it turned out to be the better machine to host my OpenSolaris work environment.

With one exception: Audio.

Audio isn't enabled in VirtualBox by default in the Mac version and that has already been blogged elsewhere. The solution is simply to enable Audio in VirtualBox settings and select the Intel ICH AC97 soundchip.

Then, OpenSolaris doesn't come with an ICH AC97 audio driver and even the new SUNWaudiohd driver doesn't support it. The solution here is to download the OSS sound drivers from 4Front technologies. So far, so good.

But this didn't work for me: Either the sound would play for a few seconds, then hang, or the sound drivers wouldn't be recognized by GNOME/GStreamer at all, resulting in a crossed-out loudspeaker icon at the top! This is very frustrating if you want to show Brendan's excellent shouting video to an audience and have to switch out of OpenSolaris/VirtualBox back to Mac OS X just for that.

Apparently others suffered from the same annoyance, too, but none of the solutions I found seemed to help: I installed and uninstalled and reinstalled the OSS drivers a number of times, ran the ossdevlinks script to recreate device links, even installed a newer, experimental version of the SUNWaudiohd driver. No luck yet.

Then Frank, a Sun sales person who happens to use OpenSolaris on his laptop as well (Yay! a salesrep using OpenSolaris! Kudos to Frank!) suggested uninstalling the SUNWaudiohd driver, then installing the OSS sound driver, which worked for him. It didn't occur to me that uninstalling SUNWaudiohd might be the solution, so I wanted to give it a try.

But, alas, "pfexec pkg uninstall SUNWaudiohd" didn't work for me either! Apparently there's a dependency between this package and the slim_install package bundle. Again, Google is your friend and it turned out to be a known bug that prevented me from uninstalling SUNWaudiohd. The workaround is simply to "pfexec pkg uninstall slim_install" which is no longer needed after the installation process anyway.

So lo and behold, gone is slim_install, gone is SUNWaudiohd, installed the OSS drivers, logged out and back in and audio works fine now! (Notice: no reboot required).

Here's the sweet and short way to audio goodness on OpenSolaris on VirtualBox:

  1. Shut down your OpenSolaris VirtualBox image if it is running, so you can change its settings.
  2. Activate audio for your OpenSolaris VM in VirtualBox. Select the ICH AC97 Chip. Here's a blog entry that describes the process.
  3. Boot your OpenSolaris VirtualBox image.
  4. Uninstall the slim_install package: "pfexec pkg uninstall slim_install"
  5. Uninstall the SUNWaudiohd driver: "pfexec pkg uninstall SUNWaudiohd"
  6. Download the OSS sound driver for OpenSolaris.
  7. Install the OSS sound driver: "pfexec pkgadd -d oss-solaris-v4.1-1051-i386.pkg" (Or whatever revision you happened to download).
  8. Log out of your desktop and log back in. Sound should work now.

"How to get Audio to work on OpenSolaris on VirtualBox" has been brought to you by Constantin's Blooog.
This entry was created on 2009-01-14 07:32:19.0 PST and is associated with the following tags:



20071127 Tuesday November 27, 2007

Shrink big presentations with ooshrink

I work in an environment where people use presentations a lot. Of course, we like to use StarOffice, which is based on OpenOffice, for all of our office needs.

Presentation files can be big. Very big. Never-send-through-email-big. Especially, when they come from marketing departments and contain lots of pretty pictures. I just tried to send a Sun Systems overview presentation (which I created myself, so less marketing fluff), and it still was over 22MB big!

So here comes the beauty of Open Source, and in this case: Open Formats. It turns out that OpenOffice and StarOffice documents are actually ZIP files that contain XML for the actual documents, plus all the image files that are associated with them in a simple directory structure. A few years ago I wrote a script that takes an OpenOffice document, unzips it, looks at all the images in the document's structure and optimizes their compression algorithm, size and other settings based on some simple rules. That script was very popular with my colleagues. It got lost for a while and, thanks to Andreas, it was found again. Still, colleagues ask me once in a while about "That script, you know, that used to shrink those StarOffice presentations."

Today, I brushed it up a little, taught it to accept the newer od[ptdc] extensions and it still works remarkably well. Here are some examples:

  • The Sun homepage has a small demo presentation with a few vacation photos. Let's see what happens:
    bash-3.00$ ls -al Presentation_Example.odp
    -rw-r--r--   1 constant sun       392382 Mar 10  2006 Presentation_Example.odp
    bash-3.00$ ooshrink -s Presentation_Example.odp
    bash-3.00$ ls -al Presentation_Example.*
    -rw-r--r--   1 constant sun       337383 Nov 27 11:36 Presentation_Example.new.odp
    -rw-r--r--   1 constant sun       392382 Mar 10  2006 Presentation_Example.odp

    Well, that was a 14% reduction in file size. Not earth-shattering, but we're getting there. BTW: The -s flag is for "silence", we're just after results (for now).

  • On BigAdmin, I found a presentation with some M-Series config diagrams:

    bash-3.00$ ls -al Mseries.odp
    -rw-r--r-- 1 constant sun 1323337 Aug 23 17:23 Mseries.odp
    bash-3.00$ ooshrink -s Mseries.odp
    bash-3.00$ ls -al Mseries.*
    -rw-r--r-- 1 constant sun 379549 Nov 27 11:39 Mseries.new.odp
    -rw-r--r-- 1 constant sun 1323337 Aug 23 17:23 Mseries.odp

    Now we're getting somewhere: This is a reduction by 71%!

  • Now for a real-world example. My next victim is a presentation by Teera about JRuby. I just used Google to search for "site:sun.com presentation odp", so Teera is completely innocent. This time, let's take a look behind the scenes with the -v flag (verbose):
    bash-3.00$ ooshrink -v jruby_ruby112_presentation.odp
    Required tools "convert, identify" found.
    ooshrink 1.2
    Check out "ooshrink -h" for help information, warnings and disclaimers.

    Creating working directory jruby_ruby112_presentation.36316.work...
    Unpacking jruby_ruby112_presentation.odp...
    Optimizing Pictures/1000020100000307000000665F60F829.png.
    - This is a 775 pixels wide and 102 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 947, New: 39919. We better keep the original.
    Optimizing Pictures/100000000000005500000055DD878D9F.jpg.
    - This is a 85 pixels wide and 85 pixels high JPEG file.
    - We will try re-encoding this image with JPEG quality setting of 75%.
    - Failure: Old: 2054, New: 2089. We better keep the original.
    Optimizing Pictures/1000020100000419000003C07084C0EF.png.
    - This is a 1049 pixels wide and 960 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 99671, New: 539114. We better keep the original.
    Optimizing Pictures/10000201000001A00000025EFBC8CCCC.png.
    - This is a 416 pixels wide and 606 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 286677, New: 349860. We better keep the original.
    Optimizing Pictures/10000000000000FB000001A6E936A60F.jpg.
    - This is a 251 pixels wide and 422 pixels high JPEG file.
    - We will try re-encoding this image with JPEG quality setting of 75%.
    - Success: Old: 52200, New: 46599 (-11%). We'll use the new picture.
    Optimizing Pictures/100000000000055500000044C171E62B.gif.
    - This is a 1365 pixels wide and 68 pixels high GIF file.
    - This image is too large, we'll resize it to 1280x1024.
    - We will convert this image to PNG, which is probably more efficient.
    - Failure: Old: 2199, New: 39219. We better keep the original.
    Optimizing Pictures/100000000000019A000002D273F8C990.png.
    - This is a 410 pixels wide and 722 pixels high PNG file.
    - This picture has 50343 colors, so JPEG is a better choice.
    - Success: Old: 276207, New: 32428 (-89%). We'll use the new picture.
    Patching content.xml with new image file name.
    Patching styles.xml with new image file name.
    Patching manifest.xml with new image file name.
    Optimizing Pictures/1000000000000094000000E97E2C5D52.png.
    - This is a 148 pixels wide and 233 pixels high PNG file.
    - This picture has 4486 colors, so JPEG is a better choice.
    - Success: Old: 29880, New: 5642 (-82%). We'll use the new picture.
    Patching content.xml with new image file name.
    Patching styles.xml with new image file name.
    Patching manifest.xml with new image file name.
    Optimizing Pictures/10000201000003E3000003E4CFFA65E3.png.
    - This is a 995 pixels wide and 996 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 196597, New: 624633. We better keep the original.
    Optimizing Pictures/100002010000013C0000021EDE4EFBD7.png.
    - This is a 316 pixels wide and 542 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 159495, New: 224216. We better keep the original.
    Optimizing Pictures/10000200000002120000014A19C2D0EB.gif.
    - This is a 530 pixels wide and 330 pixels high GIF file.
    - This image is transparent. Can't convert to JPEG.
    - We will convert this image to PNG, which is probably more efficient.
    - Failure: Old: 39821, New: 56736. We better keep the original.
    Optimizing Pictures/100000000000020D0000025EB55F72E3.png.
    - This is a 525 pixels wide and 606 pixels high PNG file.
    - This picture has 17123 colors, so JPEG is a better choice.
    - Success: Old: 146544, New: 16210 (-89%). We'll use the new picture.
    Patching content.xml with new image file name.
    Patching styles.xml with new image file name.
    Patching manifest.xml with new image file name.
    Optimizing Pictures/10000000000000200000002000309F1C.png.
    - This is a 32 pixels wide and 32 pixels high PNG file.
    - This picture has 256 colors, so JPEG is a better choice.
    - Success: Old: 859, New: 289 (-67%). We'll use the new picture.
    Patching content.xml with new image file name.
    Patching styles.xml with new image file name.
    Patching manifest.xml with new image file name.
    Optimizing Pictures/10000201000001BB0000006B7305D02E.png.
    - This is a 443 pixels wide and 107 pixels high PNG file.
    - This image is transparent. Can't convert to JPEG.
    - We will try re-encoding this image with PNG compression level 9.
    - Failure: Old: 730, New: 24071. We better keep the original.
    All images optimized.
    Re-packing...
    Success: The new file is only 67% as big as the original!
    Cleaning up...
    Done.

    Neat. We just shaved a third off of a 1.3MB presentation file and it still looks as good as the original!

    As you can see, the script goes through each image one by one and tries to come up with better ways of encoding images. The basic rules are:

    • If an image is PNG or GIF and it has more than 128 colors, it's probably better to convert it to JPEG (if it doesn't use transparency). It also tries recompressing GIFs and other legacy formats as PNGs if JPEG is not an option.
    • Images bigger than 1280x1024 don't make a lot of sense in a presentation, so they're resized to be at most that size.
    • JPEG lets you set a quality level. 75% is "good enough" for presentation purposes, so we'll try that and see how much it buys us.
    The hard part is to patch the XML files with the new image names. They don't have any newlines, so basic Unix scripting tools may hiccup. The script therefore uses a more conservative approach to patching, but it works.
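The rules above can be sketched as two small shell functions. This is an illustration of the decision logic only, not code from ooshrink itself: the function names are made up, and the image format, color count and transparency are passed in as arguments instead of being detected. The ImageMagick calls in the comments show one way to get those values and do the re-encoding.

```bash
# Hypothetical sketch of the rules, not the actual ooshrink code.
# choose_format FORMAT COLORS TRANSPARENT
#   FORMAT: PNG, GIF, JPEG, ...   COLORS: number of colors   TRANSPARENT: yes|no
# Prints the format to try re-encoding the image in.
choose_format() {
    local format=$1 colors=$2 transparent=$3
    case "$format" in
        PNG|GIF)
            if [ "$colors" -gt 128 ] && [ "$transparent" = no ]; then
                echo jpeg     # many colors, no transparency: JPEG wins
            else
                echo png      # otherwise recompress as PNG
            fi ;;
        JPEG) echo jpeg ;;    # re-encode at a lower quality setting
        *)    echo png ;;     # other legacy formats: try PNG
    esac
}

# keep_smaller OLD_BYTES NEW_BYTES
# The re-encoded image is only used if it is actually smaller.
keep_smaller() {
    if [ "$2" -lt "$1" ]; then echo new; else echo original; fi
}

# ImageMagick calls that could feed these functions (not executed here):
#   identify -format '%m %k' image.png    # format and number of unique colors
#   convert image.png -resize '1280x1024>' -quality 75 image.jpg
# The '>' in the geometry means "only shrink images larger than this".

choose_format PNG 50343 no    # prints jpeg
keep_smaller 947 39919        # prints original
```

The second function is what produces the "Failure: ... We better keep the original." lines in the verbose output: trying a re-encoding is cheap, so the script simply tries and compares sizes afterwards.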

 

Before I give you the script, here's the obvious
Disclaimer: Use this script at your own risk. Always check the shrunk presentation for any errors that the script may have introduced. It only works 9 out of 10 times (sometimes, there's some funkiness about how OpenOffice uses images going on that I still don't understand...), so you have to check if it didn't damage your file.

The script works with Solaris (of course), but it should also work on Linux or any other Unix just fine. It relies on ImageMagick to do the image heavy lifting, so make sure you have identify(1) and convert(1) in your path.

My 22 MB Systems Overview presentation was successfully shrunk into a 13MB one, so I'm happy to report that after so many years, this little script is still very useful. I hope it helps you too, let me know how you use it and what shrink-ratios you have experienced!

"Shrink big presentations with ooshrink" has been brought to you by Constantin's Blooog.
This entry was created on 2007-11-27 03:20:08.0 PST and is associated with the following tags:



20071101 Thursday November 01, 2007

7 Tips for Enhancing Your Email Efficiency

I think I sent my first email in 1987. We lived in Rome, Italy and my brother and I shared a modem with which we collected our very first online experiences on a Commodore Amiga 500.

Today I receive about 500-700 emails a day on my Sun account. Not counting Spam (most of which is filtered by our mail system anyway). That's a lot, but over time I grew accustomed to dealing with more and more email as efficiently as possible.

Here's what helps me use email as a productivity tool rather than a burden, while still having fun. This is going to be a long post, but if your Inbox currently has more than 100 emails, possibly sitting there for more than a week or two, then I promise you an easy to use way of getting your Inbox to 0.

Zero mails in your Inbox. Once and for all. Still, you will be informed about what's going on and it'll be earlier, with less effort, and more reliably.

The Email Client

In my email career, I've used a lot of mail clients. During university days I started with the classic mail(1) on SunOS 4 and its counterparts on VMS and on an IBM 3090 mainframe. Then I used Elm for a long time, then Pine. When I joined Sun in 1998, one of the first things I did was to compile Pine myself so I could keep my habit of reading email on a terminal.

Why on a terminal? It's always quicker and more efficient than a GUI (Yes, I'm one of those old-schoolers that still prefer vi as their favorite text editor). It really is. So much that I'd like to make it...

Email Efficiency Rule #1: Make sure you can use your email client with keystroke commands only.

When dealing with hundreds of emails, the extra time to move the mouse cursor and to click on some buttons etc. really adds up. Learning keystrokes might seem tedious at first, but it will quickly become second nature and you'll be amazed at how quickly you can scan through emails with just one hand sitting on your keyboard, while having your other hand free to drink coffee while reading email.

After a while, I migrated to another email client called Mutt. This introduced two major new features that made my email-life much, much easier: Threads and Filters.

A threading email client automatically groups emails that have the same subject (or that are related to each other based on the header information) into threads. Threads are more efficient to read because they contain all emails related to a certain subject or conversation in one go. And more importantly: You can delete dozens of "Please take me off this list" or "me too" emails and other uninteresting discussions with a single keystroke!

Mozilla Thunderbird supports threads very nicely, so does Apple's Mail and of course GMail, only they call it "conversations" (and they dig up all related mail from the past too, which is very nice).

Message filtering is another powerful feature of modern email clients. It lets you pre-sort email into folders or assign different colors/priorities/etc., based on simple rules. I don't feel comfortable with automatically filing away emails without at least looking at their subject. So I use rules exclusively to assign priorities to emails: Emails that are addressed directly to me or come from my management chain automatically get prioritized highest. Emails where my email address shows up on the CC line or that are addressed to working groups that are dear to my heart get the second highest priority. All other email gets normal priority. Emails from Sun get a different color than external email. Other similar rules are of course possible and can be very useful.
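To make the priority rules above a bit more concrete, here is a minimal Python sketch of how such a classification could look. It is not how Mutt or Thunderbird implement filters; the addresses, the set names and the priority() function are all made up for illustration.

```python
from email.message import EmailMessage
from email.utils import getaddresses

# Hypothetical values for illustration only; substitute your own.
MY_ADDRESS = "me@example.com"
MANAGEMENT_CHAIN = {"boss@example.com", "vp@example.com"}
FAVORITE_LISTS = {"zfs-team@example.com"}


def addresses(msg, *headers):
    """Return the lower-cased addresses found in the given header fields."""
    values = []
    for header in headers:
        values.extend(str(v) for v in msg.get_all(header, []))
    return {addr.lower() for _, addr in getaddresses(values) if addr}


def priority(msg):
    """Classify a message as 'highest', 'high' or 'normal'."""
    sender = addresses(msg, "From")
    to = addresses(msg, "To")
    cc = addresses(msg, "Cc")
    # Addressed directly to me, or sent by my management chain.
    if MY_ADDRESS in to or sender & MANAGEMENT_CHAIN:
        return "highest"
    # I'm only on CC, or the mail went to a list I care about.
    if MY_ADDRESS in cc or (to | cc) & FAVORITE_LISTS:
        return "high"
    return "normal"
```

Note that the rules only label messages. Nothing gets filed away or deleted, so every subject line still passes in front of your eyes.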

Using filters makes it easy to get a picture of what's going on when you only have a few minutes to check email in between meetings or when on the go, without the risk of overlooking any important email. Therefore, let's postulate...

Email Efficiency Rule #2: Let your email client do the reading before you do.

I now use Mozilla Thunderbird to read my Sun email. At some point, I just felt that there has to be a way to efficiently read email and still use a GUI, and Thunderbird is quite good at it: It supports keystrokes, threads nicely, you can program complex rules to pre-digest email easily and it is multi-platform, open source and contributed to by Sun.

With threading and rule-based priority sorting enabled, my 500-700 emails a day split into about 10-20 "Highest" and another 30-40 or so "High" priority emails. This is much more manageable as I can work through the higher prioritized emails with a more concentrated mind before quickly scanning through the rest just in case there's something interesting there.

For my personal email, I use Google's GMail, because it completely outsources my need to archive emails and has a great browser-based user interface that can be accessed from anywhere (even a mobile phone) while still feeling like a real application. And of course it supports keystrokes, has very nice threading support and supports filters too.

After my company gave me a Nokia E61i so I can read email on the go, I had a new problem: Nokia's email client supports neither threading nor message filters (please tell me if you know a better email client for SymbianOS), and hundreds of truncated sender/subject lines on a mobile phone aren't really useful. So let's have a look at the server side of the picture:

The Email Server

Today, the two main mail server protocols are POP3 and IMAP4. As soon as you connect to your mail server, POP3 basically dumps all your email onto your client, and the server then (optionally) forgets about it. Not good if you're on the go. You also need to take care of all archiving yourself. And what if you access the same mailbox from different clients?

IMAP4, on the other hand, lets your client choose whether to pull only the headers or the whole message. It supports server-side folders to sort your mail into, and you can keep your mail on the server while accessing it from multiple clients on multiple devices, and still everything stays perfectly synchronized.
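As an illustration of the "headers only" choice, here is a small sketch using Python's standard imaplib module. The fetch_headers() helper is hypothetical; it expects an already logged-in imaplib connection and asks the server for just the From and Subject fields of each message, leaving the bodies on the server.

```python
import email
import imaplib  # needed by callers to create the connection, see the note below


def fetch_headers(conn, mailbox="INBOX"):
    """Return (From, Subject) pairs for every message without fetching bodies."""
    # Open the mailbox read-only so that no message flags are changed.
    conn.select(mailbox, readonly=True)
    status, data = conn.search(None, "ALL")
    summaries = []
    for num in data[0].split():
        # BODY.PEEK returns only the named header fields and does not
        # mark the message as read.
        status, parts = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT)])")
        for part in parts:
            if isinstance(part, tuple):
                headers = email.message_from_bytes(part[1])
                summaries.append((headers["From"], headers["Subject"]))
    return summaries
```

A connection would typically be created with imaplib.IMAP4_SSL(host) followed by login(user, password), where host, user and password are whatever your mail service gives you.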

So, whenever possible, choose IMAP4. If you can't choose IMAP4, change your email service. Fortunately, Google just introduced IMAP4 support, in case you want to read your mail with something other than their web interface.

Thanks to IMAP4, we don't have to organize our mails on our clients, instead we should go by... 

Email Efficiency Rule #3: Keep your emails on the server, always.

Really. There's no point in downloading all your email to some client that can suffer a hard-drive crash or a virus infection or whatever. Chances are that your email server is a much more reliable machine than your email client. Keeping mail on the server also limits the bandwidth needed to read and manage emails to what your brain can handle, instead of downloading hundreds of emails that you'll never read past the subject line. You can still dump your favorite folders to disk or to a CD for archiving purposes once in a while, if you want to.

Back to my mobile-phone-can't-thread-nor-prioritize problem. One feature of the Sun Java System Communications Server that we use is server-side filtering. It lets you forward, file or delete mails based on simple rules. Again, I like to be conservative here, so I never want to automatically delete anything, just file away what I know for sure is not important enough to waste my precious mobile phone's bandwidth with.

The vast majority of the emails I get come from internal and external mailing lists, whether I subscribed to them or not, or otherwise find their way into my Inbox. These are natural candidates for "If the mail was addressed to <insert mailinglist alias here>, then file it to <some folder>" type of rules. Keeping it simple, I only use one folder for this purpose, called "ToBeRead". You could also name it "Inbox2" or "Later" but the important thing here is to actually treat this folder as a real Inbox folder the next time you have some time and a more comfortable client. Don't create a growing monster pile of unread mail just because you started playing with rules; it won't really help you.

Email Efficiency Rule #4: Let your email server do some reading, too.

Sorting email on your server is different from sorting email on your client: The former gives you a bandwidth choice that enables the use of mobile devices or helps you quickly check email through a web interface (by pre-sorting email into folders), while the latter helps you look at your email in the right sequence (by threading and prioritizing it).

I just checked my Sun mail through the Nokia E61i after not having checked mail for a day (today is a bank holiday in Bavaria) and I have 47 new mails. I didn't check my ToBeRead folder, but I'm sure it has more mails than I can handle on a mobile device comfortably. Seems to work for me (and I've seen a couple of mails that will make nice new rules to my server-side filter).

I usually don't check emails after work hours, on the weekend or during bank holidays. I seem to be immune to the Crackberry disease, which I guess is a good thing. This brings us to the most important email efficiency part of all:

Email Workflow

One of the first trainings that Sun sent me to after I was hired was about time-management. This is a fascinating subject by itself but it turns out that a lot of the principles taught under the umbrella of time-management can be applied beautifully to organizing your email.

If you're looking for a great blog on the subject of life hacks (a term for "when geeks start digging into time and self management") then check out Merlin Mann's "43Folders". If you prefer to read a book, then I can highly recommend David Allen's "Getting Things Done" (GTD).

Here's an easy but very efficient email workflow that is very similar to the GTD workflow:

  1. Go to the next email in your Inbox and ask yourself:
    "Do I need to do something because of this email?" (or: "Is it actionable?")
    • If the answer is "yes", then you either have to reply to the email or do some action that is associated with it. Now ask yourself:
      "Can I do it in less than 2 minutes?"
      • If the answer is "yes", then just do it. Really. Now.
      • If the action takes longer than 2 minutes, ask yourself:
        "Can I delegate it or do I need to do it myself?"
        • If you can delegate it, delegate it. Now. Forward the email to the person that is supposed to do the job, then make yourself a note so you can follow up with her if needed.
        • If you need to act upon the email yourself (it'll take more than 2 minutes), write this down as a new task into your to-do list (so it never gets forgotten).
    • File away or delete the email. There's no more reason for it to sit in your Inbox.
  2. Go to 1.
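For those who think in code, the decision tree above can be sketched in a few lines of Python. This is only a toy model: the Email fields (actionable, minutes_needed, can_delegate) are assumptions standing in for the questions you answer in your head, and triage() and process_inbox() are made-up names.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "nothing to do"
    DO_NOW = "do it now"
    DELEGATE = "forward it and note a follow-up"
    ADD_TASK = "write it into the to-do list"


@dataclass
class Email:
    subject: str
    actionable: bool
    minutes_needed: int = 0
    can_delegate: bool = False


def triage(mail):
    """Answer the workflow questions for a single email."""
    if not mail.actionable:
        return Action.NONE
    if mail.minutes_needed < 2:
        return Action.DO_NOW
    if mail.can_delegate:
        return Action.DELEGATE
    return Action.ADD_TASK


def process_inbox(inbox):
    """Work through the inbox until it is empty; return what was decided."""
    decisions = []
    while inbox:
        mail = inbox.pop(0)  # step 1: take the next email
        decisions.append((mail.subject, triage(mail)))
        # Whatever the decision was, the email has now left the Inbox
        # (filed away or deleted).
    return decisions
```

The point of the loop is that every branch ends the same way: the email leaves the Inbox, and only the tasks that survive end up on the to-do list.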

After a couple of iterations, you should have an empty Inbox. Really. 0 emails. Take a deep breath, celebrate and get used to it.

"But now I have this big and long to do list!" I hear you say. Well, that might be true, but a to-do list and an email Inbox are really two different things. Email is for communication, your to-do list is a way for you to organize your tasks. Never mix them up.

The important thing here is to get rid of all those emails in your Inbox. Feel the joy of hitting the delete key or filing away that email with the knowledge that it has been dealt with, once and for all!

Email Efficiency Rule #5: Develop an email workflow that helps you clean your Inbox.

You're invited to try the above workflow or you can develop your own. The point is to have a system that helps you get your Inbox to zero and free your mind for what's really important (Hint: It isn't email). Your workflow should be easy to implement, no matter how, where and when you read email. There should be no excuse left that prevents you from cleaning up your Inbox.

Having an empty Inbox has a great motivational power. You'll feel as if a big weight has been taken off your shoulder. You'll feel free to actually get some work done, instead of looking at all those emails. Try it out just once, but beware: Having an empty Inbox can be highly addictive...

Two things are left now: Dealing with that long to-do list and an easy and efficient way of filing those emails that you've dealt with already. As said, dealing with to-do lists is the subject of a whole science and I can only encourage you to check out one of the many sources on time and self management. This introduction might be a good start.

So what to do about filing emails? I know quite a lot of colleagues with elaborate folder systems that they use to file their emails and stuff in. One can base a filing structure on project names, client names, products, events, themes, priorities, whatnot. My easy answer to this problem is: File everything into one single folder, then let the computer find it when you need it.

Really, it works. Modern email clients are very good at searching through vast amounts of email. In fact, thanks to IMAP, it's actually the server that does it for you. I have just one single folder on my mail server that I use for filing mail away, it has thousands of emails and it is called "file". That's it.

You still think it can't be that simple? Well the ultimate test is: Will you be able to find any particular email quickly and easily? With an elaborate filing system, based on many different folders, this may or may not be the case. I've seen many colleagues try different folders while desperately looking for that one important email. Did I sort it into the client's folder? Wait, it was related to that project so it's probably in that folder. Or was it in "Pending"?

If you only have one folder to file stuff in, you rely on using your email client's search mechanism. This gives you at least four different ways to search for an email:

  • By person: If you're looking for a particular email, you probably remember its sender or recipient. Search for that person, then find the email in the results. Done.
  • By keywords in the subject: Think of one or two words that are guaranteed to show up in the subject of the mail you're searching for.
  • By time: Some study has found that the most brain-friendly way to organize stuff is by date/time. Try to remember the point in time you got or sent that email, then scroll back in time in your filing folder. This works best for emails associated with projects, stuff that is quite recent, etc.
  • Full text search: If all else fails, do a full text search. This shouldn't happen often, but works as a last resort. And it's reasonably quick on modern computers. Quicker than going through all those other folders...

Of course, combinations work well, too. Searching by person, then subject or time usually works for me 99% of the time. I only need to resort to full text search about once every 6 months.
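Since the server does the searching under IMAP, the four search styles above map quite directly onto IMAP SEARCH keys (FROM/TO, SUBJECT, SINCE/BEFORE and TEXT). Here is a hedged sketch of a helper that combines them into one query string; build_search() and its parameter names are invented for this example, and most mail clients hide all of this behind their search box anyway.

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def imap_date(day):
    """Format a date as IMAP expects it, e.g. 01-Nov-2007 (locale independent)."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year}"


def build_search(person=None, subject=None, since=None, before=None, text=None):
    """Combine the given filters into one IMAP SEARCH string (keys are ANDed)."""
    keys = []
    if person:
        # The person may be the sender or a recipient.
        keys.append(f"OR FROM {imap_quote(person)} TO {imap_quote(person)}")
    if subject:
        keys.append(f"SUBJECT {imap_quote(subject)}")
    if since:
        keys.append(f"SINCE {imap_date(since)}")
    if before:
        keys.append(f"BEFORE {imap_date(before)}")
    if text:
        # Full text search over headers and body: the last resort.
        keys.append(f"TEXT {imap_quote(text)}")
    return " ".join(keys) or "ALL"
```

For example, build_search(person="tim", subject="zfs") combines the "by person" and "by subject" searches into a single query that could be handed to the search() method of a Python imaplib connection after selecting the "file" folder.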

Email Efficiency Rule #6: File away your email and let the computer do the searching.

Filing or deleting? When in doubt, file! Storage space has become cheap and search algorithms have become so powerful that there really is no reason not to file everything. Google has made this a major point when advertising their GMail service, and they're right.

So we have now found a good email client that supports keystrokes. We taught it how to thread and how to prioritize our emails. We like to keep emails on the server because they're really better off there and we let the server do some pre-work so we can deal with low-bandwidth situations. We've developed an email workflow that empties our Inbox in no time and an easy way to file all those emails too, relying on our computer's ever increasing power to always find what we look for.

We're almost in email heaven now, but we want to make sure to stay there and avoid going back to email hell after the next period of hectic activity or after a long vacation that filled up our Inboxes to DoS-inducing levels. We want to attack the problem at the root.

Remember those server-side rules that said "Email addressed to X should be filed into Y for later review"? Well, why did you subscribe to that newsletter/mailing list/discussion group in the first place? Is email really the right way to stay current on a certain subject?

The truth is: No. Email is a communication mechanism between people who know each other and have to say something to each other. It is not a news delivery mechanism (RSS can do that better and more efficiently). It is not a way to gather and harvest information (Google and other search engines on the internet can do it better). And it is not a discussion forum (Use Newsgroups, IM and chat or web based forums).

So let's go through our server-side rules and ask ourselves: Do I really want to keep subscribed to this service? Why don't I switch to a pull model for staying informed where I'm in control vs. being flooded by all those "informational" emails that I don't have the time to read anyway?

There's also email minimization potential with day-to-day emails to and from your co-workers. Do you really need to forward that email to your 30 or 100 other co-workers that may or may not be interested in that particular news item? Is that joke, video, URL really so funny that your entire office has to look at it? Do you really want to be "kept posted" on all minutiae of that process or just receive a short "done" notification at the end?

Email Efficiency Rule #7: Go on an email diet. Limit newsletters/mailing lists/mass emails to a necessary amount and write/forward emails only when necessary. Especially when addressing a large group of people.

I know that this rule is the hardest. But think of it. It makes sense. It may not be easily implemented everywhere (And I'm known for being an occasionally passionate participant in large email discussions myself), but using the right information resource/channel for the task at hand is often a very good idea.

Let me know if the above tips and rules are helpful to you. Share your own secrets of email efficiency. Let me know how large your Inbox is and whether you like it or not. What is your perfect way of dealing with large amounts of email?

 

"7 Tips for Enhancing Your Email Efficiency" has been brought to you by Constantin's Blooog.
This entry was created on 2007-11-01 15:56:50.0 PST and is associated with the following tags:





This is Sun employee Constantin Gonzalez' personal blog.
All opinions expressed herein are solely of the author and do not necessarily reflect those of his employer.
If you want to contact the author, please send email to constantin (dot) gonzalez (at) sun (dot) com.
Thank you for reading this blog!