Tuesday August 12, 2008 | Constantin's Blooog |
Useful stuff for your blog-reading pleasure.
ZFS saved my data. Right now.
For storage, I use Western Digital's MyBook Essential Edition USB drives because they are the cheapest ones I could find from a well-known brand. The packaging says "Put your life on it!". How fitting.
Last week, I had a team meeting and a colleague introduced us to some performance tuning techniques. When we started playing with iostat(1M), I logged into my server to do some stress tests. That was when my server said something like this:
constant@condorito:~$ zpool status
(data from other pools omitted)
pool: santiago
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed after 16h28m with 0 errors on Fri Aug 8 11:19:37 2008
config:
NAME          STATE     READ WRITE CKSUM
santiago      DEGRADED     0     0     0
  mirror      DEGRADED     0     0     0
    c10t0d0   DEGRADED     0     0   135  too many errors
    c9t0d0    DEGRADED     0     0    20  too many errors
  mirror      ONLINE       0     0     0
    c8t0d0    ONLINE       0     0     0
    c7t0d0    ONLINE       0     0     0
errors: No known data errors
This tells us 3 important things:
Over the weekend, I ordered myself a new disk (sheesh, they dropped EUR 5 in price already after just 5 days...) and after a "
constant@condorito:~$ zpool status
(data from other pools omitted)
pool: santiago
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver in progress for 1h13m, 6.23% done, 18h23m to go
config:
NAME            STATE     READ WRITE CKSUM
santiago        DEGRADED     0     0     0
  mirror        DEGRADED     0     0     0
    replacing   DEGRADED     0     0     0
      c10t0d0   DEGRADED     0     0   135  too many errors
      c11t0d0   ONLINE       0     0     0
    c9t0d0      DEGRADED     0     0    20  too many errors
  mirror        ONLINE       0     0     0
    c8t0d0      ONLINE       0     0     0
    c7t0d0      ONLINE       0     0     0
errors: No known data errors
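If you don't want to eyeball the CKSUM column every time, the check is easy to script. What follows is only a rough sketch in Python, not anything that ships with ZFS: it reads the config table from a pasted copy of the output above and lists every row with a non-zero READ, WRITE or CKSUM counter. The names STATUS_TEXT, ROW and devices_with_errors are made up for this example, and the parser assumes plain integer counters and the column layout shown here; other zpool versions may print things differently.

```python
import re

# Pasted copy of the config table from the zpool status output above.
STATUS_TEXT = """\
config:
NAME            STATE     READ WRITE CKSUM
santiago        DEGRADED     0     0     0
  mirror        DEGRADED     0     0     0
    replacing   DEGRADED     0     0     0
      c10t0d0   DEGRADED     0     0   135  too many errors
      c11t0d0   ONLINE       0     0     0
    c9t0d0      DEGRADED     0     0    20  too many errors
  mirror        ONLINE       0     0     0
    c8t0d0      ONLINE       0     0     0
    c7t0d0      ONLINE       0     0     0
errors: No known data errors
"""

# Assumed row layout: name, state, three integer counters, optional note.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)(?:\s+(.*))?$")


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for each config row with a non-zero counter."""
    flagged = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("NAME"):
            # The header row marks the start of the device table.
            in_config = True
            continue
        if stripped.startswith("errors:"):
            # The table ends where the error summary begins.
            break
        if not in_config:
            continue
        match = ROW.match(line)
        if match is None:
            continue  # Skip anything that does not look like a table row.
        name, _state, read, write, cksum, _note = match.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            flagged.append((name, *counts))
    return flagged


if __name__ == "__main__":
    for name, read, write, cksum in devices_with_errors(STATUS_TEXT):
        print(f"{name}: read={read} write={write} cksum={cksum}")
```

Fed with the output above, it flags c10t0d0 and c9t0d0 and leaves the healthy mirror alone. To run it against a live pool, you would have to feed it the text of a real zpool status call instead of the pasted string.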
The next step for me is to send the c10t0d0 drive back and ask for a replacement under warranty (it's only a couple of months old). After receiving c10's replacement, I'll consider sending in c9 for replacement (depending on how the next scrub goes). Which makes me wonder: How will drive manufacturers react to a new wave of warranty cases based on drive errors that were not easily detectable before?
[1] To the guys at Drobo: Of course you're invited to implement ZFS into the next revision of your products. It's open source. In fact, Drobo and ZFS would make a perfect team!
"ZFS saved my data. Right now." has been brought to you by Constantin's Blooog.
This entry was created on 2008-08-12 06:44:22.0 PST and is associated with the following tags:
corruption
data
drobo
integrity
opensolaris
solaris
storage
zfs