Notes to Self

Alex Sokolsky's Notes on Computers and Programming

HD Management in TrueNAS

My raidz2 pool, tank, reports a read error on one of its disks:

# zpool status
  pool: boot-pool
 state: ONLINE
status: One or more features are enabled on the pool despite not being
	requested by the 'compatibility' property.
action: Consider setting 'compatibility' to an appropriate value, or
	adding needed features to the relevant file in
	/etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d.
  scan: scrub repaired 0B in 00:00:50 with 0 errors on Fri Mar 27 03:45:52 2026
config:

	NAME                                             STATE     READ WRITE CKSUM
	boot-pool                                        ONLINE       0     0     0
	  ata-SuperMicro_SSD_SMC0515D91120B964037-part2  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub in progress since Sun Mar 29 09:07:17 2026
	4.23T / 20.2T scanned at 1.33G/s, 1.90T / 20.2T issued at 612M/s
	0B repaired, 9.42% done, 08:41:54 to go
config:

	NAME                                          STATE     READ WRITE CKSUM
	tank                                          ONLINE       0     0     0
	  raidz2-0                                    ONLINE       0     0     0
	    ata-ST6000VN001-2BB186_ZCT2WLTH           ONLINE       0     0     0
	    ata-WDC_WD60EFZX-68B3FN0_WD-C81G6HGK      ONLINE       0     0     0
	    ata-WDC_WD60EFZX-68B3FN0_WD-C81EX39K      ONLINE       0     0     0
	    ata-WDC_WD8003FFBX-68B9AN0_VR368JHK       ONLINE       0     0     0
	    ata-WDC_WD60EFRX-68MYMN1_WD-WX31DB48X2ZY  ONLINE       1     0     0

errors: No known data errors
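
With several disks in the pool, the one non-zero counter is easy to miss. Here is a minimal sketch that filters `zpool status` output down to the rows whose READ, WRITE or CKSUM column is not 0. The function name `zpool_nonzero` is my own invention, and the list of device states it matches is an assumption that may not cover every state ZFS can print.

```shell
# Print pool members whose READ, WRITE or CKSUM counter is non-zero.
# Reads `zpool status` output on stdin.
zpool_nonzero() {
    # A device row has 5+ columns and a known state in column 2;
    # this skips "state: ONLINE" lines and the NAME/STATE header.
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ && NF >= 5 &&
         ($3 != "0" || $4 != "0" || $5 != "0") {
             print $1, "read=" $3, "write=" $4, "cksum=" $5
         }'
}

# Usage on a live system:
#   zpool status tank | zpool_nonzero
```

Fed the output above, this should leave only the ata-WDC_WD60EFRX-68MYMN1_WD-WX31DB48X2ZY row.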

The SMART error log suggests that /dev/sda has a history of errors:

root@nass:~# smartctl -x /dev/sda|grep Error
Error logging capability:        (0x01)	Error logging supported.
					SCT Error Recovery Control supported.
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    3
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
0x10       GPL     R/O      1  NCQ Command Error log
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 35 (device log contains only the most recent 24 errors)
	ER     = Error register
Error 35 [10] occurred at disk power-on lifetime: 3409 hours (142 days + 1 hours)
  40 -- 51 00 40 00 00 2c cd 57 a8 40 00  Error: UNC at LBA = 0x2ccd57a8 = 751654824
Error 34 [9] occurred at disk power-on lifetime: 64104 hours (2671 days + 0 hours)
  10 -- 51 00 40 00 01 7a 87 28 e8 40 00  Error: IDNF at LBA = 0x17a8728e8 = 6350645480
Error 33 [8] occurred at disk power-on lifetime: 57832 hours (2409 days + 16 hours)
  10 -- 51 00 40 00 02 b5 2e 80 18 40 00  Error: IDNF at LBA = 0x2b52e8018 = 11629658136
Error 32 [7] occurred at disk power-on lifetime: 57529 hours (2397 days + 1 hours)
  10 -- 51 00 40 00 01 1d 80 b3 78 40 00  Error: IDNF at LBA = 0x11d80b378 = 4789941112
Error 31 [6] occurred at disk power-on lifetime: 57307 hours (2387 days + 19 hours)
  10 -- 51 00 40 00 02 b5 50 1b 10 40 00  Error: IDNF at LBA = 0x2b5501b10 = 11631860496
Error 30 [5] occurred at disk power-on lifetime: 56039 hours (2334 days + 23 hours)
  10 -- 51 00 40 00 00 00 00 01 b0 40 00  Error: IDNF at LBA = 0x000001b0 = 432
Error 29 [4] occurred at disk power-on lifetime: 54935 hours (2288 days + 23 hours)
  10 -- 51 00 40 00 02 0e 18 66 10 40 00  Error: IDNF at LBA = 0x20e186610 = 8826414608
Error 28 [3] occurred at disk power-on lifetime: 54401 hours (2266 days + 17 hours)
  10 -- 51 00 40 00 01 f1 e6 3d b8 40 00  Error: IDNF at LBA = 0x1f1e63db8 = 8353365432
SCT Error Recovery Control:
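
To see at a glance which kinds of errors dominate the log, the entries can be tallied by type. In the ATA error register, UNC flags an uncorrectable data error and IDNF means the requested sector ID was not found. This is a rough sketch: the function name `smart_error_tally` is hypothetical, and it assumes each entry contains the `Error: <TYPE> at LBA` text seen above.

```shell
# Count SMART error-log entries by type (UNC, IDNF, ...).
# Reads `smartctl -x` output on stdin.
smart_error_tally() {
    awk 'match($0, /Error: .* at LBA/) {
             # Strip the leading "Error: " and trailing " at LBA" (7 chars each).
             kind = substr($0, RSTART + 7, RLENGTH - 14)
             count[kind]++
         }
         END { for (k in count) print count[k], k }' | sort -k2
}

# Usage on a live system:
#   smartctl -x /dev/sda | smart_error_tally
```

For the eight entries shown above this would report 7 IDNF and 1 UNC.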

According to Common Hard Drive Error Codes and Diagnostics:

Compare the above to /dev/sdb:

root@nass:~# smartctl -x /dev/sdb|grep Error
Error logging capability:        (0x01)	Error logging supported.
					SCT Error Recovery Control supported.
  1 Raw_Read_Error_Rate     POSR--   100   064   006    -    0
  7 Seek_Error_Rate         POSR--   093   060   045    -    1966565051
184 End-to-End_Error        -O--CK   100   100   099    -    0
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    3
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
0x10       GPL     R/O      1  NCQ Command Error log
SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged
SCT Error Recovery Control:
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x06  0x018  4               0  ---  Number of Interface CRC Errors
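
Rather than reading each drive's output by eye, the same comparison can be reduced to one number per drive. This sketch pulls the logged error count out of `smartctl -x` output. It assumes the log prints either a `Device Error Count:` line or `No Errors Logged`, as in the two outputs above. The function name `smart_error_count` and the `/dev/sd?` glob in the usage comment are hypothetical.

```shell
# Print the logged error count from `smartctl -x` output on stdin.
# Prints 0 for "No Errors Logged" and "unknown" if neither line is present.
smart_error_count() {
    awk '/^Device Error Count:/ { print $4; found = 1; exit }
         /^No Errors Logged/    { print 0;  found = 1; exit }
         END { if (!found) print "unknown" }'
}

# Usage on a live system (the device glob is an assumption):
#   for d in /dev/sd?; do
#       printf '%s: %s\n' "$d" "$(smartctl -x "$d" | smart_error_count)"
#   done
```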