A common troubleshooting method is to count errors and declare the part with the most errors the culprit. Solaris supports this through the kstat structures that are part of its interface; these structures are initialized at boot and maintained from then on. Used by itself, however, this method is flawed and can be dangerous, because it does not take cause and effect into consideration.
For instance, when a disk on a SCSI bus is having problems, the HBA will reset the bus to try to re-establish communication with the disk. Every drive on that bus will reset. The kstat counters will then increment and report errors on drives that may not be experiencing any problems. Over time these counters may accumulate more errors on the healthy drives than on the failing disk.
Here is a real example. Looking at this output, you will notice that targets 0 and 1 have a huge number of hard errors, so you may be apt to conclude that one of these disks is causing c1t3d0 to have some transport errors.
# iostat -En
c1t0d0 Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Vendor: SEAGATE Product: ST373207LSUN72G Revision: 045A Serial No: 060433Y77Y
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t1d0 Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Vendor: SEAGATE Product: ST373207LSUN72G Revision: 045A Serial No: 060433Y55L
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t2d0 Soft Errors: 0 Hard Errors: 45 Transport Errors: 13
Vendor: SEAGATE Product: ST314670LSUN146G Revision: 045A Serial No: 060843QDRV
Size: 146.80GB <146800115712 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 45 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t3d0 Soft Errors: 0 Hard Errors: 7 Transport Errors: 482
Vendor: SEAGATE Product: ST314670LSUN146G Revision: 045A Serial No: 060843RTC7
Size: 146.80GB <146800115712 bytes>
Media Error: 5 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
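The counting approach described above can be sketched in a few lines of Python. This is only an illustration, not a Solaris tool: the names parse_iostat_en and naive_culprit are hypothetical, and the sample is trimmed to the counter lines of the output above. It assumes the counters appear in the same order and with the same labels as shown here.

```python
import re

# Counter lines copied from the iostat -En output above (vendor/size lines omitted).
SAMPLE = """\
c1t0d0 Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
c1t1d0 Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
c1t2d0 Soft Errors: 0 Hard Errors: 45 Transport Errors: 13
Media Error: 0 Device Not Ready: 0 No Device: 45 Recoverable: 0
c1t3d0 Soft Errors: 0 Hard Errors: 7 Transport Errors: 482
Media Error: 5 Device Not Ready: 0 No Device: 2 Recoverable: 0
"""

HEADER = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)\s+Transport Errors:\s*(\d+)"
)
DETAIL = re.compile(
    r"Media Error:\s*(\d+)\s+Device Not Ready:\s*(\d+)\s+No Device:\s*(\d+)"
)


def parse_iostat_en(text):
    """Return {device: {counter_name: value}} for each device in the text."""
    devices = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = HEADER.match(line)
        if m:
            current = m.group(1)
            devices[current] = {
                "soft": int(m.group(2)),
                "hard": int(m.group(3)),
                "transport": int(m.group(4)),
            }
            continue
        m = DETAIL.search(line)
        if m and current is not None:
            devices[current].update(
                media=int(m.group(1)),
                not_ready=int(m.group(2)),
                no_device=int(m.group(3)),
            )
    return devices


def naive_culprit(devices):
    """The flawed method: blame whichever device has the most hard errors."""
    return max(devices, key=lambda name: devices[name]["hard"])


if __name__ == "__main__":
    devices = parse_iostat_en(SAMPLE)
    for name, counters in devices.items():
        print(name, counters)
    print("naive culprit:", naive_culprit(devices))
```

Run against the sample, the naive ranking blames c1t0d0 (tied with c1t1d0), which is exactly the conclusion a person reading the counters would draw.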
This would be a reasonable conclusion, but it is incorrect. What is really happening in this case is that target 3 is failing with an incorrect phase, and the HBA resets the bus. On a bus reset every drive on the bus must reset, and since a write is in flight to target 0, target 0 records an error.
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2 (glm0):
Jan 27 19:11:44 blaze Resetting scsi bus, got incorrect phase from (3,0)
Jan 27 19:11:44 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Jan 27 19:11:44 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: Resetting scsi bus, got incorrect phase from (3,0)
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2 (glm0):
Jan 27 19:11:44 blaze got SCSI bus reset
Jan 27 19:11:44 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Jan 27 19:11:44 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: got SCSI bus reset
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@3,0 (sd3):
Jan 27 19:11:44 blaze auto request sense failed (reason=reset)
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@0,0 (sd0):
Jan 27 19:11:47 blaze Error for Command: write(10) Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Requested Block: 53976752 Error Block: 53976752
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 060433Y77Y
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@1,0 (sd1):
Jan 27 19:11:47 blaze Error for Command: write(10) Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Requested Block: 53997104 Error Block: 53997104
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 060433Y55L
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
So when using the counters, use them to give you the big picture, but make sure you look through the events to evaluate the problem.
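Looking through the events can also be scripted. The following Python is a rough sketch under stated assumptions, not a general syslog parser: the function name attribute_resets is hypothetical, the message wording is assumed to match the glm and sd messages shown above, and the genunix NOTICE lines are skipped because in this log they repeat the driver WARNING. It separates the target that provoked a bus reset from the targets that merely reported one.

```python
import re
from collections import Counter

# A few lines copied from the messages above.
SAMPLE_LOG = """\
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2 (glm0):
Jan 27 19:11:44 blaze Resetting scsi bus, got incorrect phase from (3,0)
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@0,0 (sd0):
Jan 27 19:11:47 blaze Error for Command: write(10) Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@1,0 (sd1):
Jan 27 19:11:47 blaze Error for Command: write(10) Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
"""

# Target that the HBA names as the reason for resetting the bus.
CAUSE = re.compile(r"Resetting scsi bus, got incorrect phase from \((\w+),\w+\)")
# WARNING header for a specific disk, e.g. .../sd@0,0 (sd0):
DEVICE = re.compile(r"WARNING: \S*/sd@(\w+),\w+ \(sd\d+\):")


def attribute_resets(log_text):
    """Return (instigators, victims): Counters keyed by SCSI target (as strings)."""
    instigators = Counter()
    victims = Counter()
    current = None  # target named by the most recent disk-level WARNING header
    for line in log_text.splitlines():
        if "NOTICE:" in line:
            continue  # assumed duplicate of the driver WARNING; avoid double counting
        if "WARNING:" in line:
            m = DEVICE.search(line)
            current = m.group(1) if m else None  # HBA-level warnings name no disk
            continue
        m = CAUSE.search(line)
        if m:
            instigators[m.group(1)] += 1
            continue
        if "scsi bus reset occurred" in line and current is not None:
            victims[current] += 1
    return instigators, victims


if __name__ == "__main__":
    instigators, victims = attribute_resets(SAMPLE_LOG)
    print("provoked resets:", dict(instigators))
    print("reported resets:", dict(victims))
```

On the sample, target 3 is the only one listed as provoking a reset, while targets 0 and 1 only report that a reset occurred, which matches the reading of the log given above.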