Wednesday September 14, 2005

Another question that seems to come up again and again: what do the errors returned by iostat -E really mean? Now that we have OpenSolaris, the answer is all in the source. iostat -E gives output like this:
sd21 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST373307LSUN72G Revision: 0207 Serial No: 0241B09EN1
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
These entries are simply all the error kstats for the device and can be printed using the kstat command:
$ kstat -n sd21,err
module: sderr instance: 21
name: sd21,err class: device_error
Device Not Ready 0
Hard Errors 0
Illegal Request 0
Media Error 0
No Device 0
Predictive Failure Analysis 0
Product ST373307LSUN72G Revision
Recoverable 0
Revision 0207
Serial No 0241B09EN1
Size 73400057856
Soft Errors 0
Transport Errors 0
Vendor SEAGATE
crtime 315.74644992
snaptime 852020.62705776
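If you want to pick these counters up from a script, kstat's parseable mode (kstat -p) is easier to handle than the listing above. The following is a minimal Python sketch, assuming each line has the form module:instance:name:statistic, then a tab, then the value. The SAMPLE text is a hypothetical reconstruction from the values shown above, not captured output, and parse_kstat_p and STRING_FIELDS are names made up for this illustration.

```python
# Hypothetical sample of `kstat -p -n sd21,err` output, rebuilt from the values
# in the listing above. Assumed line format: module:instance:name:statistic<TAB>value
SAMPLE = "\n".join([
    "sderr:21:sd21,err:Device Not Ready\t0",
    "sderr:21:sd21,err:Hard Errors\t0",
    "sderr:21:sd21,err:Media Error\t0",
    "sderr:21:sd21,err:Revision\t0207",
    "sderr:21:sd21,err:Serial No\t0241B09EN1",
    "sderr:21:sd21,err:Size\t73400057856",
    "sderr:21:sd21,err:Soft Errors\t0",
    "sderr:21:sd21,err:Transport Errors\t0",
    "sderr:21:sd21,err:Vendor\tSEAGATE",
])

# Statistics that are identification strings and must not be turned into numbers.
STRING_FIELDS = {"Vendor", "Product", "Revision", "Serial No"}


def parse_kstat_p(text):
    """Return {statistic: value}, converting purely numeric counters to int."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("\t")
        # The statistic name is the fourth colon-separated field and may contain spaces.
        statistic = key.split(":", 3)[-1]
        value = value.strip()
        if statistic not in STRING_FIELDS and value.isdigit():
            stats[statistic] = int(value)
        else:
            stats[statistic] = value
    return stats


stats = parse_kstat_p(SAMPLE)
# Error counters that are non-zero; Size is numeric but is not an error counter.
nonzero = {k: v for k, v in stats.items()
           if isinstance(v, int) and v != 0 and k != "Size"}
print(nonzero)
```

Keeping Revision and Serial No as strings matters because a value such as 0207 would otherwise lose its leading zero.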
So what do they all mean? For SCSI disk drives they are set up in the sd_create_errstats() routine and then updated in the sd driver whenever one of the errors is hit.
Device Not Ready: The drive returned the sense key 0x2 (Not ready).
Media Error: The drive returned the sense key 0x3 (Medium Error).
No Device: The drive returned the sense key 0x6 (Unit Attention); in the case of a removable device, it must have happened multiple times.
Hard Errors: All the above conditions are counted as Hard errors with the addition of the SCSI sense key 0x4 (Hardware Error).
Illegal Request: The drive returned the sense key 0x5 (Illegal Request). This is also treated as a Soft Error, so that kstat is incremented as well.
Recoverable: The drive returned the sense key 0x1 (Recovered Error) to indicate that the last command completed successfully but some recovery action had to be taken by the drive. This is also treated as a Soft Error, so that kstat is incremented as well.
Predictive Failure Analysis: The drive returned sense key 0x6 (Unit Attention) with an ASC (Additional Sense Code) of 0x5D, indicating that the drive has exceeded its predictive failure threshold. This is treated as a soft error.
Transport Errors: This error occurs for a number of reasons, all related to being unable to transport the command. The command could have been timed out or reset, or the host bus adapter was unable to put the command onto the SCSI bus. This is neither a soft nor a hard error.
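The classification above can be summarised as a small lookup. This is only a Python sketch of the rules as described in this post, not the sd driver's code: the function and constant names are made up for illustration, and the "multiple times" condition for removable devices is deliberately left out.

```python
# Sense keys and the ASC value named in the text above.
RECOVERED_ERROR, NOT_READY, MEDIUM_ERROR = 0x1, 0x2, 0x3
HARDWARE_ERROR, ILLEGAL_REQUEST, UNIT_ATTENTION = 0x4, 0x5, 0x6
ASC_PREDICTIVE_FAILURE = 0x5D


def counters_for_sense(sense_key, asc=0):
    """List the iostat -E counters the text says are incremented for a sense key."""
    # Unit Attention with ASC 0x5D is counted as a predictive failure (soft error).
    if sense_key == UNIT_ATTENTION and asc == ASC_PREDICTIVE_FAILURE:
        return ["Predictive Failure Analysis", "Soft Errors"]
    table = {
        RECOVERED_ERROR: ["Recoverable", "Soft Errors"],
        NOT_READY: ["Device Not Ready", "Hard Errors"],
        MEDIUM_ERROR: ["Media Error", "Hard Errors"],
        HARDWARE_ERROR: ["Hard Errors"],
        ILLEGAL_REQUEST: ["Illegal Request", "Soft Errors"],
        # Simplification: the repeat condition for removable devices is not modelled.
        UNIT_ATTENTION: ["No Device", "Hard Errors"],
    }
    # Sense keys not discussed in the text map to no counter in this sketch.
    return table.get(sense_key, [])


def counters_for_transport_failure():
    """Transport failures carry no sense key and count as neither soft nor hard."""
    return ["Transport Errors"]


print(counters_for_sense(MEDIUM_ERROR))
print(counters_for_sense(UNIT_ATTENTION, ASC_PREDICTIVE_FAILURE))
```

Read this way, Hard Errors and Soft Errors are roll-up counters, while Transport Errors stands on its own.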
Tags: opensolaris disks kstats