Thursday September 17, 2009 | Constantin's Blooog |
Useful stuff for your blog-reading pleasure.
New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene
One of the most important features of ZFS is the ability to detect data corruption through the use of end-to-end checksums. In redundant ZFS pools (pools that are either mirrored or use a variant of RAID-Z), this can be used to fix broken data blocks by using the redundancy of the pool to reconstruct the data. This is often called self-healing. This mechanism works whenever ZFS accesses any data, because it will always verify the checksum after reading a block of data.
Unfortunately, this does not work if you don't regularly look at your data: Bit rot happens, and with every broken block that is not checked (and therefore not corrected), the probability increases that even the redundant copy will be affected by bit rot too, resulting in data corruption.
It should now be clear that every system should regularly scrub its pools to take full advantage of the ZFS self-healing feature. But you know how it is: You set up your server and often those little things get overlooked and that
Introducing the ZFS Auto-Scrub SMF Service
Here's a service that is easy to install and configure and that will make sure all of your pools are scrubbed at least once a month. Advanced users can set up individualized schedules per pool with different scrubbing periods. It is implemented as an SMF service, which means it can be easily managed using
The service borrows heavily from Tim Foster's ZFS Auto-Snapshot Service. This is not just coding laziness, it also helps minimize bugs in common tasks (such as setting up periodic cron jobs) and provides better consistency across multiple similar services. Plus: Why invent the wheel twice?
Requirements
The ZFS Auto-Scrub service assumes it is running on OpenSolaris. It should run on any recent distribution of OpenSolaris without problems. More specifically, it uses the -d switch of the GNU variant of date(1) to parse human-readable date values. Make sure that /usr/gnu/bin/date is available (which is the default in OpenSolaris).
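To illustrate why the GNU date -d switch matters here, the following is a minimal sketch of a "is this pool due for a scrub?" check. It is not taken from the service itself: the function names, the DATE variable and the way the last scrub time is passed in are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only, not the service's real code.
# Assumes GNU date; on OpenSolaris set DATE=/usr/gnu/bin/date.
DATE=${DATE:-date}

# to_epoch "human-readable date" -> seconds since 1970-01-01 00:00:00 UTC
to_epoch() {
    "$DATE" -u -d "$1" +%s
}

# scrub_due LAST_SCRUB PERIOD_DAYS NOW   (hypothetical helper)
# Prints "yes" if at least PERIOD_DAYS lie between LAST_SCRUB and NOW.
scrub_due() {
    last=$(to_epoch "$1") || return 2
    now=$(to_epoch "$3") || return 2
    if [ $(( now - last )) -ge $(( $2 * 86400 )) ]; then
        echo yes
    else
        echo no
    fi
}

scrub_due "2009-08-10 03:00:00" 30 "2009-09-17 07:25:00"   # 38 days apart
scrub_due "2009-09-10 03:00:00" 30 "2009-09-17 07:25:00"   # 7 days apart
```

Passing the current time in as an argument keeps the check easy to try out with made-up dates; a real script would more likely call date without arguments for "now".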
Right now, this service does not work on Solaris 10 out of the box (unless you install GNU date in /usr/gnu/bin). A future version of this script will work around this issue to make it easily usable on Solaris 10 systems as well.
Download and Installation
You can download Version 0.5b of the ZFS Auto-Scrub Service here. The included README file explains everything you need to know to make it work: After unpacking the archive, start the install script as a privileged user:
The script will copy three SMF method scripts into
After installation, you need to activate the service. This can be done easily with:
or by running the GUI with:
This will activate a pre-defined instance of the service that makes sure each of your pools is scrubbed at least once a month. This is all you need to do to make sure all your pools are regularly scrubbed. If your pools haven't been scrubbed before or if the time of their last scrub is unknown, the script will proceed and start scrubbing. Keep in mind that scrubbing consumes a significant amount of system resources, so if you feel that a currently running scrub slows your system too much, you can interrupt it by saying:
In this case, don't worry, you can always start a manual scrub at a more suitable time or wait until the service kicks in by itself during the next scheduled scrubbing period. Should you want to get rid of this service, use:
The script will then disable any instances of the service, remove the manifests from the SMF repository, delete the scripts from
Advanced Use
You can create your own instances of this service for individual pools at specified intervals. Here's an example:
    constant@fridolin:~$ svccfg
    svc:> select auto-scrub
    svc:/system/filesystem/zfs/auto-scrub> add mypool-weekly
    svc:/system/filesystem/zfs/auto-scrub> select mypool-weekly
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> addpg zfs application
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/pool-name=mypool
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/interval=days
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/period=7
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/offset=0
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/verbose=false
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> end
    constant@fridolin:~$ svcadm enable auto-scrub:mypool-weekly
This example will create and activate a service instance that makes sure the pool "mypool" is scrubbed once a week. Check out the
Implementation Details
Here are some interesting aspects of this service that I came across while writing it:
Lessons learned
It's funny how a very simple task like "Write an SMF service that takes care of regular zpool scrubbing" can develop into a moderately complex thing. It grew into three different services instead of one, each with its own scripts and SMF manifests. It required an extra RBAC role to make it more secure. I ran into some zpool(1M) limitations which I now feel are worthy of RFEs, and working around them made the whole thing slightly more complex. Add an install and de-install script and some minor quirks like using GNU date(1) instead of the regular one to have a reliable parser for human-readable date strings, not to mention a GUI, and you cover quite a lot of ground even with a service as seemingly simple as this.
But this is what made this project interesting to me: I learned a lot about RBAC and SMF (of course), picked up some new scripting hacks from the existing ZFS Auto-Snapshot service, found a few minor bugs (in the ZFS Auto-Snapshot service) and RFEs, programmed some Java including the use of the NetBeans GUI builder and had some fun with scripting, finding solutions and making sure stuff is more or less cleanly implemented.
I'd like to encourage everyone to write their own SMF services for whatever tools they install or write for themselves. It helps you think your stuff through, make it easy to install and manage, and you get a better feel for how Solaris and its subsystems work. And you can have some fun too. The easiest way to get started is by looking at what others have done. You'll find a lot of SMF scripts in
If you happen to be in Dresden for OSDevCon 2009, check out my session on "Implementing a simple SMF Service: Lessons learned" where I'll share more of the details behind implementing this service, including the Visual Panels part.
Edit (Sep. 21st): Changed the link to CR 6878281 to the externally visible OpenSolaris bug database version, added a link to the session details on OSDevCon.
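For readers who would rather script the Advanced Use example than type it into an interactive svccfg session, here is a rough, untested sketch. The property names and values are copied from that example; the function name, the run helper and the RUN switch are made up for illustration. By default it only prints the commands, so nothing is changed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: non-interactive version of the svccfg session shown earlier.
# Prints the commands by default; set RUN=1 to actually execute them.
SVC=svc:/system/filesystem/zfs/auto-scrub

# run CMD...   (hypothetical helper: dry-run unless RUN=1)
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# make_scrub_instance INSTANCE POOL PERIOD_DAYS   (hypothetical helper)
make_scrub_instance() {
    inst=$1
    pool=$2
    days=$3
    run svccfg -s "$SVC" add "$inst"
    run svccfg -s "$SVC:$inst" addpg zfs application
    run svccfg -s "$SVC:$inst" setprop "zfs/pool-name=$pool"
    run svccfg -s "$SVC:$inst" setprop zfs/interval=days
    run svccfg -s "$SVC:$inst" setprop "zfs/period=$days"
    run svccfg -s "$SVC:$inst" setprop zfs/offset=0
    run svccfg -s "$SVC:$inst" setprop zfs/verbose=false
    run svcadm enable "auto-scrub:$inst"
}

make_scrub_instance mypool-weekly mypool 7
```

Reviewing the printed commands first and only then re-running with RUN=1 as a privileged user is the safer order on a machine that matters.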
"New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene" has been brought to you by Constantin's Blooog.
This entry was created on 2009-09-17 07:25:34.0 PST and is associated with the following tags:
opensolaris
script
scrub
service
smf
solaris
tool
useful
zfs
zpool
ZFS Replicator Script, New Edition
Meanwhile, the fine guys at the ZFS developer team introduced recursive send/receive into the ZFS command, which makes most of what the script does a simple -F flag to the zfs(1M) command. Unfortunately, this new version of the ZFS command has not (yet?) been ported back to Solaris 10, so my ZFS snapshot replication script is still useful for Solaris 10 users, such as Mike Hallock from the School of Chemical Sciences at the University of Illinois at Urbana-Champaign (UIUC). He wrote:
"Your script came very close to exactly what I needed, so I took it upon myself to make changes, and thought in the spirit of it all, to share those changes with you."
The first change he introduced was the ability to supply a pattern (via -p) that selects some of the potentially many snapshots that one wants to replicate. Like me, he's a user of Tim Foster's excellent automatic ZFS snapshot service and wanted to base his migration solely on the daily snapshots, not any other ones.
Then, Mike wanted to migrate across two different hosts on a network, so he introduced the -r option that allows the user to specify a target host. This option simply pipes the replication data stream through ssh at the right places, making ZFS filesystem migration across any distance very easy.
The updated version including both of the new features is available as zfs-replicate_v0.7.tar.bz2. I didn't test this new version but the changes look very good to me. Still: Use at your own risk. Thanks a lot, Mike!
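The two additions can be pictured roughly as follows. This is a simplified sketch and not Mike's actual code: the function names, snapshot names and host name are invented for the example, and the functions only print the pipeline they would run.

```shell
#!/bin/sh
# Sketch only: the idea behind a snapshot pattern and a remote target host.

# filter_snapshots PATTERN   (hypothetical helper)
# Reads snapshot names on stdin and keeps those matching PATTERN.
filter_snapshots() {
    grep -- "$1"
}

# send_cmd PREV CUR DEST [HOST]   (hypothetical helper)
# Prints the pipeline for sending snapshot CUR (incrementally from PREV
# if PREV is non-empty) into DEST, through ssh when HOST is given.
send_cmd() {
    prev=$1
    cur=$2
    dest=$3
    host=${4:-}
    if [ -n "$prev" ]; then
        send="zfs send -i $prev $cur"
    else
        send="zfs send $cur"
    fi
    if [ -n "$host" ]; then
        echo "$send | ssh $host zfs receive $dest"
    else
        echo "$send | zfs receive $dest"
    fi
}

# Made-up snapshot names; only the "daily" ones survive the filter.
printf 'tank/home@daily-01\ntank/home@hourly-07\ntank/home@daily-02\n' |
    filter_snapshots daily

send_cmd tank/home@daily-01 tank/home@daily-02 backup/home otherhost
```

Since the remote case differs from the local one only by the ssh in the middle of the pipe, the same replication loop can serve both.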
"ZFS Replicator Script, New Edition" has been brought to you by Constantin's Blooog.
This entry was created on 2008-08-13 13:25:49.0 PST and is associated with the following tags:
open
opensolaris
opensource
remote
replication
script
snapshot
solaris
source
zfs
zpool
Welcome to the year 2038!
The Year 2038 Problem
To understand the Year 2038 Problem, check out the definition of time_t:
    typedef long time_t; /* time of day in seconds */
To represent a date/time combination, most Unix OSes store the number of seconds since January 1st, 1970, 00:00:00 (UTC) in such a time_t variable. On 32-Bit systems, "long" is a signed integer between -2147483648 and 2147483647 (see types.h). This covers the range between December 13th, 1901, 20:45:52 (UTC) and January 19th, 2038, 03:14:07 (UTC), which the fathers of C and Unix thought to be sufficient back then in the seventies. On 64-Bit systems, time_t can be much bigger (or smaller), covering a range of hundreds of billions of years, but if you're 32-Bit in 2038 you'll be in trouble: A second after January 19th, 2038, 03:14:07 you'll travel back in time and immediately find yourself in the middle of December 13th, 1901, 20:45:52 with a major headache called "overflow". More details about this problem can be found on its Wikipedia page.
2038 could be today...
Well, you might say, I'll most probably be retired in 2038 anyway and of course, there won't be any 32-Bit systems that far in the future, so who cares?
A customer of mine cared. They run a very big file server infrastructure, based on Solaris, ZFS and a number of Sun Fire X4500 machines. A big infrastructure like this also has a large number of clients in many variations. And some of their clients have a huge problem with time: They create files with a date after 2040.
Now, the NFS standard will happily accept dates outside the 32-Bit time_t range and so will ZFS. But any program compiled in 32-Bit mode (and there are many) will run into an overflow error as soon as it wants to handle such a file. Incidentally, most of the Solaris file utilities (you know, rm, cp, find, etc.) are still shipped in 32-Bit, so having files 30+ years in the future is a big problem if you can't administer them.
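The two boundary dates are easy to check for yourself. The sketch below assumes GNU date (for the -d @SECONDS syntax) and a shell with 64-bit arithmetic; the helper names show_time and wrap32 are made up for this example.

```shell
#!/bin/sh
# Sketch only: the limits of a signed 32-bit time_t.
# Assumes GNU date and 64-bit shell arithmetic.
TIME_T_MAX=2147483647
TIME_T_MIN=-2147483648

# show_time EPOCH -> calendar date in UTC   (hypothetical helper)
show_time() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# wrap32 N -> N as it would look after being squeezed into a signed
# 32-bit integer (two's complement wrap-around)   (hypothetical helper)
wrap32() {
    n=$(( ($1 + 2147483648) % 4294967296 ))
    if [ "$n" -lt 0 ]; then
        n=$(( n + 4294967296 ))
    fi
    echo $(( n - 2147483648 ))
}

show_time "$TIME_T_MAX"                           # last second that still fits
show_time "$(wrap32 $(( TIME_T_MAX + 1 )))"       # one second later, wrapped
```

The second call lands on TIME_T_MIN, which is the "travel back to 1901" effect described above.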
The 64-Bit solution
One simple solution is to recompile your favourite file utilities, say, from GNU coreutils in 64-Bit mode, then put them into your path and hello future! You can do this by saying something like:
    CC=/opt/SUNWspro/bin/cc CFLAGS=-m64 ./configure --prefix=/opt/local; make
(Use /opt/SunStudioExpress if you're using Sun Studio Express.)
Now, while trying to reproduce the problem and sending some of my own files into the future, I found out, thanks to Chris and his short "what happens if I try" DTrace script, that OpenSolaris already has a way to deal with these problems: ufs and ZFS just won't accept any dates outside the 32-Bit range any more (check out lines 2416-2428 in zfs_vnops.c). Tmpfs will, so at least I could test there on my OpenSolaris 2008.05 laptop.
That's one way to deal with it, but shutting the doors doesn't help our poor disoriented client of the future. And it's also only available in OpenSolaris, not Solaris 10 (yet).
The DTrace solution
So, I followed Ulrich's helpful suggestions and Chris' example and started to hack together a DTrace script of my own that would print out who is trying to assign a date outside of 32-Bit time_t to what file, and another one that would fix those dates so files can still be accepted and dealt with the way sysadmins expect.
The first script is called "showbigtimes" and it does just that:
    constant@foeni:~/file/projects/futurefile$ pfexec ./showbigtimes_v1.1.d
    dtrace: script './showbigtimes_v1.1.d' matched 7 probes
    CPU ID FUNCTION:NAME
    0 18406 futimesat:entry
    UID: 101, PID: 2826, program: touch, file: blah
    atime: 2071 Jun 23 12:00:00, mtime: 2071 Jun 23 12:00:00
    ^C
    constant@foeni:~/file/projects/futurefile$ /usr/bin/amd64/ls -al /tmp/blah
    -rw-r--r-- 1 constant staff 0 Jun 23 2071 /tmp/blah
    constant@foeni:~/file/projects/futurefile$
Of course, I ran "
A couple of non-obvious hoops needed to be dealt with:
I hope the comments inside the script are helpful. Be sure to check out the DTrace Documentation, which was very useful to me.
The second script is called correctbigtimes.d and it not only alerts us of files being sent into the future, it automatically corrects the dates to the current date/time in order to prevent any time-travel outside the bounds of 32-Bit time_t at all:
    constant@foeni:~/file/projects/futurefile$ pfexec ./correctbigtimes_v1.1.d
    dtrace: script './correctbigtimes_v1.1.d' matched 2 probes
    dtrace: allowing destructive actions
    CPU ID FUNCTION:NAME
    0 18406 futimesat:entry
    UID: 101, PID: 2844, program: touch, fd: 0, file:
    atime: 2071 Jun 23 12:00:00, mtime: 2071 Jun 23 12:00:00
    Corrected atime and mtime to: 2008 Jul 3 16:23:25
    ^C
    constant@foeni:~/file/projects/futurefile$ ls -al /tmp/blah
    -rw-r--r-- 1 constant staff 0 2008-07-03 16:23 /tmp/blah
    constant@foeni:~/file/projects/futurefile$
As you can see, we enabled DTrace's destructive mode (of course only for constructive purposes), which allows us to change the time values on the fly and ensure a stable time continuum.
This time, I left out the code that created the file descriptor-to-filename table, because this script may potentially be running for a long time and I didn't want to consume precious memory for just a convenience feature (otherwise we'd keep an extra table of all open files for all running threads in the system!). If we get a filename string, we print it; otherwise a file descriptor needs to suffice, since we can always look it up through pfiles(1). The actual time modification takes place inside our local variables, which then get copied back into the system call through copyout().
I hope you liked this little excursion into the year 2038, which can happen sooner than we think for some. To me, this was a great opportunity to dig a little deeper into DTrace, a powerful tool that shows us what's going on while enabling us to fix stuff on the fly.
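The correction rule itself is tiny. Here is the same decision written as a plain shell sketch, outside of DTrace; it is not a translation of correctbigtimes.d, the function names are invented, and the two timestamps are simply the UTC equivalents of the dates in the transcript above.

```shell
#!/bin/sh
# Sketch only: "keep the timestamp if it fits into a signed 32-bit time_t,
# otherwise replace it with the current time".
# Assumes the shell compares 64-bit integers.

# fits_time32 EPOCH -> exit status 0 if EPOCH fits   (hypothetical helper)
fits_time32() {
    [ "$1" -ge -2147483648 ] && [ "$1" -le 2147483647 ]
}

# correct_time EPOCH NOW -> EPOCH if it fits, NOW otherwise
correct_time() {
    if fits_time32 "$1"; then
        echo "$1"
    else
        echo "$2"
    fi
}

now=1215102205                      # 2008-07-03 16:23:25 UTC
correct_time 3202286400 "$now"      # 2071-06-23 12:00:00 UTC: gets replaced
correct_time 1000000000 "$now"      # a date in 2001: left alone
```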
Update: Ulrich had some suggestions and found a bug, so I updated both scripts to version 1.2:
The new versions are already linked from above or available here: showbigtimes_v1.2.d, correctbigtimes_v1.2.d.
"Welcome to the year 2038!" has been brought to you by Constantin's Blooog.
This entry was created on 2008-07-03 08:18:35.0 PST and is associated with the following tags:
2038
bug
dtrace
problem
script
solaris
Presenting images and screenshots the cool, 3D, shiny way
But what if you have to present on software, some web service or give a Solaris training with lots of command line stuff? Sure, you can do screenshots and hope that the GUI looks nice. Or use other photos (like the one to the left) that may or may not relate to the software you present about. But screenshots and photos (to a lesser degree) are so, well, 2D. They look boring.
Wouldn't it be nice to present your screenshots the way Apple presents its iTunes software? Like adding some 3D depth to your slide deck or website, with a nice, shiny, reflective surface underneath? Well, you don't need to spend thousands of dollars on art departments and graphics artists (they'd be glad to do something different for a change) or work long hours with Photoshop or the Gimp (a most excellent piece of software, BTW), trying to create that stylish 3D look. Here's a script that can do this easily for you!
You're probably wondering why my daughter Amanda shows up at the top of this article. Well, she was volunteered to be a test subject for my new script. The script uses ImageMagick and POV-Ray in a similar way to my earlier photocube script that we now use to generate the animated cube of the HELDENFunk show notes. It places any image you give it into a 3D space and adds a nice, shiny reflection to it. Let's see what Amanda looks like after she's been through the featurepic.sh script:
    -bash-3.00$ ./featurepic.sh -s 200 Amanda_small.jpg
The size (-s) parameter defines the length of either the width or the height of the resulting image, whichever is larger. In this case, we choose a maximum image size of 200x200 pixels, so the image can fit this blog. You can see the result to the right. Nice, eh?
As you can see, her picture has now been placed into a 3D scene, slightly rotated to the left, onto a shiny, white surface. More interesting than the usual flat picture on a blog, isn't it? The script uses POV-Ray to place and rotate the photo in 3D and to generate the reflection. ImageMagick is used for pre- and post-processing the image. The reflection is not real: it is actually the same picture, flipped across the y axis and with a gradient transparency applied to it. That way, the reflection can be controlled much better. I tried the real thing and it didn't want to look artistic enough :).
The amount of rotation, the reflection intensity and the length of the reflective tail can be adjusted with command-line switches, and so can the height of the camera. Here's an example that uses all of these parameters:
    -bash-3.00$ ./featurepic.sh -h
The camera height (-c) value is relative to the picture: 0 is ground level, 1 is at the top edge. The camera will always look at the center of the image. Camera height values below 0.5 are good because a camera below the subject makes it look slightly more impressive. Values above 0.5 make you look down at the picture, making it a bit smaller and less significant.
The reflection intensity (-r) goes from 0 (no reflection) to 1 (perfect mirror) while the length of the reflection (the fade-off "tail", -t) goes from 0 (no tail) to 1 (same size as image). Smaller values for reflection and the tail length make the reflection more subtle and less distracting. I think the default values are very good for most cases.
Check out the -p option for a nicer way to integrate the resulting image into other graphical elements of your presentation. It creates a PNG image with a transparency channel. This means you can place it above other graphical elements (such as a different background color) and the reflection will still look right. See the next example to the right, where Amanda prefers a pink background. Keep in mind that the rendering step still assumes a white background, so drastic changes in background may or may not result in slight artifacts at the edges.
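To make the meaning of the size (-s) parameter concrete, here is a small sketch of the arithmetic behind "the larger side becomes the given size". How featurepic.sh computes this internally isn't shown here, so the fit_size helper and the sample image dimensions are assumptions made purely for illustration.

```shell
#!/bin/sh
# Sketch only: scale WIDTH x HEIGHT so that the larger side equals MAX,
# keeping the aspect ratio (rounded to the nearest pixel).

# fit_size WIDTH HEIGHT MAX -> "WxH"   (hypothetical helper)
fit_size() {
    w=$1
    h=$2
    max=$3
    if [ "$w" -ge "$h" ]; then
        echo "${max}x$(( (h * max + w / 2) / w ))"
    else
        echo "$(( (w * max + h / 2) / h ))x${max}"
    fi
}

fit_size 1600 1200 200    # landscape picture
fit_size 600 800 200      # portrait picture
```

ImageMagick's plain WIDTHxHEIGHT geometry follows the same idea when resizing: the image is fitted into the given box without changing its proportions.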
You can also use this script with some pictures of hardware to make them look more interesting, if the hardware shot is taken straight from the front and if it doesn't have any border at the bottom. Use an angle value of 0; this will place your hardware onto that virtual glossy carbon plastic that makes it look nicer. See below for an embellished Sun Fire T5440 Server, the new flagship in our line of Chip-Multi-Threading (CMT) servers.
This script should work on any unixoid OS, especially Solaris, that understands sh and where reasonably recent versions of ImageMagick (6.x.x) and POV-Ray are available. You can get ImageMagick and POV-Ray from their websites. On Solaris, you can easily install them through Blastwave. The version of ImageMagick that is shipped with Solaris in /usr/sfw is not recent enough for the way I'm using it, so the Blastwave version is recommended at the moment.
It's free as in "free beer". Speaking of which, if you like this script, leave a comment or send me email at constantin at sun dot com telling me what you did with it, what other features you'd like to see in the script and where I can meet you for some beer :).
"Presenting images and screenshots the cool, 3D, shiny way" has been brought to you by Constantin's Blooog.
This entry was created on 2008-04-28 00:44:56.0 PST and is associated with the following tags:
3d
cool
effects
howto
imagemagick
imaging
photos
povray
presentations
script
Shrink big presentations with ooshrink
I work in an environment where people use presentations a lot. Of course, we like to use StarOffice, which is based on OpenOffice, for all of our office needs. Presentation files can be big. Very big. Never-send-through-email-big. Especially when they come from marketing departments and contain lots of pretty pictures. I just tried to send a Sun Systems overview presentation (which I created myself, so less marketing fluff), and it still was over 22 MB!
So here comes the beauty of Open Source, and in this case: Open Formats. It turns out that OpenOffice and StarOffice documents are actually ZIP files that contain XML for the actual documents, plus all the image files that are associated with them in a simple directory structure. A few years ago I wrote a script that takes an OpenOffice document, unzips it, looks at all the images in the document's structure and optimizes their compression algorithm, size and other settings based on some simple rules. That script was very popular with my colleagues; it got lost for a while and, thanks to Andreas, it was found again.
Still, colleagues ask me once in a while about "That script, you know, that used to shrink those StarOffice presentations." Today, I brushed it up a little, taught it to accept the newer od[ptdc] extensions, and it still works remarkably well. Here are some examples:
Before I give you the script, here's the obvious
The script works with Solaris (of course), but it should also work on any Linux or any other Unix just fine. It relies on ImageMagick to do the image heavy lifting, so make sure you have identify(1) and convert(1) in your path. My 22 MB Systems Overview presentation was successfully shrunk into a 13 MB one, so I'm happy to report that after so many years, this little script is still very useful. I hope it helps you too. Let me know how you use it and what shrink ratios you have experienced!
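The general approach (unzip, shrink the images, zip again) can be sketched in a few lines. This is not the ooshrink script: the single resize rule, the 1280 pixel limit and the function names are assumptions made for illustration, and the sketch expects unzip, zip, mktemp and ImageMagick's convert to be installed. The output file should not exist yet, and as always, work on a copy.

```shell
#!/bin/sh
# Sketch only: shrink the images inside an OpenDocument file.

# shrink_odf SRC DST   (hypothetical helper)
shrink_odf() {
    src=$1
    dst=$2
    work=$(mktemp -d) || return 1
    unzip -q "$src" -d "$work" || return 1
    # Assumed rule: cap JPEG and PNG images at 1280 pixels on the longer
    # side; the '>' makes ImageMagick shrink only, never enlarge.
    find "$work" -type f \( -name '*.jpg' -o -name '*.png' \) |
    while read -r img; do
        convert "$img" -resize '1280x1280>' "$img"
    done
    # OpenDocument packages want "mimetype" as the first entry, uncompressed.
    dst_abs=$(cd "$(dirname "$dst")" && pwd)/$(basename "$dst")
    ( cd "$work" &&
      zip -q -0 -X "$dst_abs" mimetype &&
      zip -q -r -X "$dst_abs" . -x mimetype )
    rm -rf "$work"
}

# percent_saved BEFORE AFTER -> whole percent saved   (hypothetical helper)
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

percent_saved 22 13    # the 22 MB -> 13 MB presentation mentioned above
```

Usage would be something like shrink_odf big.odp small.odp, followed by opening small.odp to check that nothing important got lost.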
"Shrink big presentations with ooshrink" has been brought to you by Constantin's Blooog.
This entry was created on 2007-11-27 03:20:08.0 PST and is associated with the following tags:
imagemagick
imaging
open
openoffice
opensource
script
source
staroffice
useful
Cool Apple-Like Photo Animations With POV-Ray, ImageMagick and Solaris
Recently we took a team photograph for an internal web page. I wanted that effect and I love the open source raytracer POV-Ray, so I wrote a script that renders the same animation effect and creates an animated GIF using ImageMagick. You can see an example output to the right featuring photos of some popular Sun products. BTW, check out photos.sun.com for free, high-quality access to Sun product photography.
To create your own photocubes, you just need POV-Ray and ImageMagick in your path and the photocube.sh script. Being open source, they all run on Solaris, but also on Linux, NetBSD or any other operating system that can run open source software. I'd love to try this script out on a Niagara 2 system with its 8 cores, 16 pipelines, 64 threads and 8 FPUs. Hmmm, rendering all frames in parallel :). There are already precompiled distributions of POV-Ray and ImageMagick on Blastwave that you can install very easily onto your Solaris machine if you don't have them already.
Just call the script with 6 URLs or pathnames. It will then automatically read in the images, render the animation frames and then combine them all into an animated GIF:
    -bash-3.00$ ../photocube.sh *.jpg
The script uses ImageMagick to make the pictures square and to limit their size to 1200x1200 pixels if necessary. Since the
Feel free to modify this script to your needs. You may want to experiment with other ways of animating the cube or other image transition effects. Maybe you want to use ffmpeg to create real video files instead of animated GIFs. Be careful when cranking up the number of frames while using ImageMagick to create animated GIFs: ImageMagick wants to load all frames into memory before creating the animated GIF, so you may end up using a lot of memory. If someone has a more elegant, scriptable animated GIF creator, please leave me a comment.
I hope you enjoy this little exercise in raytracing and animation. Let me know if you have suggestions or other ideas to improve this script!
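For anyone who wants to play with the "render all frames in parallel" idea, here is a rough sketch of the shape such a change could take. It is not part of photocube.sh: render_frame is only a placeholder for the real POV-Ray call, and the frame file names are made up. The last helper prints the ImageMagick command that would join the finished frames into a looping GIF.

```shell
#!/bin/sh
# Sketch only: one background job per frame, then one GIF at the end.

# render_frame N   (placeholder for the real rendering command)
render_frame() {
    echo "rendered frame $1"
}

# render_parallel COUNT: start one background job per frame, wait for all.
render_parallel() {
    i=1
    while [ "$i" -le "$1" ]; do
        render_frame "$i" &
        i=$(( i + 1 ))
    done
    wait
}

# gif_cmd DELAY OUT FRAME...   (hypothetical helper)
# Prints the ImageMagick call; DELAY is in 1/100 s, -loop 0 loops forever.
gif_cmd() {
    delay=$1
    out=$2
    shift 2
    echo "convert -delay $delay -loop 0 $* $out"
}

render_parallel 4
gif_cmd 10 cube.gif frame01.png frame02.png frame03.png frame04.png
```

The order in which the background jobs finish is not fixed, which is fine as long as every frame is written to its own file.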
"Cool Apple-Like Photo Animations With POV-Ray, ImageMagick and Solaris" has been brought to you by Constantin's Blooog.
This entry was created on 2007-08-23 13:20:47.0 PST and is associated with the following tags:
apple
cool
diy
howto
imagemagick
iphoto
opensolaris
opensource
photos
pov-ray
raytracing
script
slideshow
solaris
ZFS Snapshot Replication Script
One of the greatest features of OpenSolaris' ZFS filesystem is its snapshots. You can easily create a snapshot by saying
Now let's say you have a nice pool and have been creating snapshots on a regular basis. After a few months, you decide to remodel your pool layout or migrate some of your filesystems over to a new pool for whatever reason. Then, you're facing a lot of those
I had to migrate quite a few filesystems and many snapshots (thanks to Tim's excellent ZFS Snapshot SMF Service) lately when I set up a new pool strategy for my home server, so I wrote myself a script to do the replication job. Since it may take some time for the
Disclaimer: Please be advised that this script has only been tested a couple of times and it is provided to you completely on an "as-is" basis. Please have a look at the script to understand how it works and try it out on some non-risky pools and filesystems before you do real stuff with it. Run a backup before using this script and don't shoot me if something goes wrong.
Ok, what can this script do for you? First of all, check out its -h flag to see what options it provides:
Great, let's try it out. Here's a pool with some data and some snapshots as well as another, empty pool:
Now, let's copy the
It works. And it automatically used incremental snapshots to save space, too! If we now add another snapshot to our original pool piscina and then run zfs-replicate again, it will skip already replicated snapshots and just copy those that are additional:
This is useful because you can now run this script on a regular basis to have one pool automatically backed up to another pool. In fact, the
Sometimes, the destination filesystem gets touched, or otherwise acted upon and then
Finally, another scenario is file system migration: You have a filesystem in one pool and want to migrate it with all its snapshots to another pool, with minimal downtime. This can be done using the
If you're worried about some daemons depending on your filesystem's availability (like Samba), you can use the -c option to provide their names. zfs-replicate will then bring down the matching SMF services right before unmounting and restart them automatically after re-mounting the migrated filesystem. Again, you might need to wait until the SMF service is really down (read: the last Samba connection has closed).
I hope this script is useful to you and again, I assume you know what you're doing and do some testing before using it in production. I'm sure there are still some bugs and shortcomings, so please send me email at constantin (dot) gonzalez (at) sun (dot) com or leave a comment and I'll try to make the script better for you.
Many thanks to Chris Gerhard, whose backup script was an inspiration for me in hacking together this utility. Also, many thanks to Tim Foster for some code review and initial feedback (Sorry, I haven't managed to implement some locking yet...). Let me know when you're in Munich and you'll get some well-deserved beer!
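The "skip what is already there, send the rest incrementally" behaviour can be sketched as follows. This is a simplified illustration and not the zfs-replicate code: the function name, the snapshot names s1 to s3 and the destination pool "lago" are made up, and the function only prints the zfs send/receive pipelines instead of running them.

```shell
#!/bin/sh
# Sketch only: decide which snapshots still need to be replicated.

# replicate_cmds SRC_FS DST_FS "SRC_SNAPS" "DST_SNAPS"   (hypothetical helper)
# SRC_SNAPS: snapshot names on the source, oldest first, space separated.
# DST_SNAPS: snapshot names that already exist on the destination.
replicate_cmds() {
    src=$1
    dst=$2
    prev=""
    for snap in $3; do
        case " $4 " in
            *" $snap "*)
                ;;    # already replicated: nothing to do
            *)
                if [ -z "$prev" ]; then
                    # first snapshot: full stream
                    echo "zfs send $src@$snap | zfs receive $dst"
                else
                    # later snapshots: only the difference to the previous one
                    echo "zfs send -i $src@$prev $src@$snap | zfs receive $dst"
                fi
                ;;
        esac
        prev=$snap
    done
}

replicate_cmds piscina/data lago/data "s1 s2 s3" ""        # empty destination
replicate_cmds piscina/data lago/data "s1 s2 s3" "s1 s2"   # only s3 is missing
```

Running the same call again after s3 has arrived prints nothing, which is what makes a script like this safe to run regularly.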
"ZFS Snapshot Replication Script" has been brought to you by Constantin's Blooog.
This entry was created on 2007-08-16 13:41:02.0 PST and is associated with the following tags:
administration
filesystem
howto
open
opensolaris
opensource
programming
replication
script
shell
snapshot
software
solaris
source
unix
utility
zfs