Getting CKSUM errors on new drives.

Jolinar

Cadet
Joined
Apr 6, 2020
Messages
4
Hello.

I am new to this forum and also in dire need of some help.
I have been running a SuperMicro X10DSC+ with 30 6TB disks for a couple of years now. Everything has been working exceptionally well.
As I was running a bit low on space, I ordered 10 more. Since adding the new drives to the zpool I have been getting CKSUM errors on them.
After a full scrub, each of the new drives has about 1.5-1.6k errors.

I also noticed that I get SCSI errors on the console - again only for the new disks.
Code:
....
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 65 01 26 18 00 00 40 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 9a 85 82 58 00 00 40 00 
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc d3 73 30 00 00 40 00 
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 71 cd 7c 78 00 01 00 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da35:mrsas0:1:46:0): READ(10). CDB: 28 00 9a 85 9d f0 00 00 30 00 
> (da35:mrsas0:1:46:0): CAM status: SCSI Status Error
> (da35:mrsas0:1:46:0): SCSI status: OK
> (da33:mrsas0:1:44:0): READ(10). CDB: 28 00 1c 73 06 78 00 01 00 00 
> (da33:mrsas0:1:44:0): CAM status: SCSI Status Error
> (da33:mrsas0:1:44:0): SCSI status: OK
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 6e be 01 b8 00 00 f0 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 94 02 1b a0 00 00 40 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 71 63 f7 88 00 01 00 00 
> (da33:mrsas0:1:44:0): READ(10). CDB: 28 00 15 52 ad a8 00 00 30 00 
> (da34:mrsas0:1:45:0): WRITE(10). CDB: 2a 00 ea dc e5 b8 00 00 28 00 
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 70 6f 9f d8 00 00 f0 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 70 6f a1 c8 00 01 00 00 
> (da35:mrsas0:1:46:0): READ(10). CDB: 28 00 63 73 27 10 00 01 00 00 
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 af 69 da e0 00 00 10 00 
> (da35:mrsas0:1:46:0): CAM status: SCSI Status Error
> (da35:mrsas0:1:46:0): SCSI status: OK
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 9d 83 95 50 00 00 38 00 
> (da38:mrsas0:1:49:0): WRITE(10). CDB: 2a 00 ea dd 52 68 00 00 30 00 
> (da38:mrsas0:1:49:0): CAM status: SCSI Status Error
> (da38:mrsas0:1:49:0): SCSI status: OK
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 9d 83 d6 80 00 00 40 00 
> (da36:mrsas0:1:47:0): CAM status: SCSI Status Error
> (da36:mrsas0:1:47:0): SCSI status: OK
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 7d 4b c8 68 00 01 00 00 
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 d6 dd 35 a8 00 00 08 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 08 cf 25 90 00 01 00 00 
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 da 07 ef 48 00 00 30 00 
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 af 62 db 10 00 01 00 00 
> (da30:mrsas0:1:41:0): WRITE(10). CDB: 2a 00 c9 1d e8 70 00 00 40 00 
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da37:mrsas0:1:48:0): WRITE(10). CDB: 2a 00 ea de 41 30 00 00 40 00 
> (da37:mrsas0:1:48:0): CAM status: SCSI Status Error
> (da37:mrsas0:1:48:0): SCSI status: OK
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 9a 89 b6 e0 00 00 40 00 
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc d7 51 70 00 00 40 00 
> (da32:mrsas0:1:43:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 b0 68 a0 80 00 00 28 00 
> (da39:mrsas0:1:50:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da37:mrsas0:1:48:0): CAM status: SCSI Status Error
> (da37:mrsas0:1:48:0): SCSI status: OK
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 7d 5e 84 88 00 01 00 00 
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 ea df 1c 28 00 00 40 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 ba 9d 28 b0 00 00 40 00 
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 70 0c 70 d0 00 01 00 00 
> (da35:mrsas0:1:46:0): WRITE(10). CDB: 2a 00 ea df 6c 58 00 00 40 00 
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 9a 8a db 40 00 00 40 00 
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 9a 8b e4 98 00 00 68 00 
> (da39:mrsas0:1:50:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 7b d5 a6 a0 00 00 80 00 
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 35 f9 2b 30 00 00 d8 00 
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc d9 ca 10 00 00 40 00 
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 b5 78 50 50 00 00 08 00 
> (da32:mrsas0:1:43:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da30:mrsas0:1:41:0): WRITE(10). CDB: 2a 00 dc da 40 f8 00 00 40 00 
> (da31:mrsas0:1:42:0): WRITE(10). CDB: 2a 00 dc da 41 00 00 00 40 00 
> (da31:mrsas0:1:42:0): CAM status: SCSI Status Error
> (da31:mrsas0:1:42:0): SCSI status: OK
> (da37:mrsas0:1:48:0): WRITE(10). CDB: 2a 00 ea e2 6f 60 00 00 40 00 
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 9a 8d f7 78 00 00 38 00 
> (da35:mrsas0:1:46:0): WRITE(10). CDB: 2a 00 ea e2 fb 48 00 00 40 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 74 b9 83 18 00 01 00 00 
> (da35:mrsas0:1:46:0): READ(10). CDB: 28 00 9a 8e 40 00 00 00 40 00 
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 c2 04 db e8 00 00 18 00 
> (da36:mrsas0:1:47:0): WRITE(10). CDB: 2a 00 dc dc 0a 18 00 00 40 00 
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 76 9e d5 20 00 00 f0 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 a3 16 16 a0 00 01 00 00 
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 ea e4 13 88 00 00 40 00 
> (da31:mrsas0:1:42:0): CAM status: SCSI Status Error
> (da31:mrsas0:1:42:0): SCSI status: OK
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 5a ea 15 18 00 01 00 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 9d 89 c7 c0 00 00 40 00 
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da34:mrsas0:1:45:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da30:mrsas0:1:41:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 ea b3 40 d0 00 00 18 00 
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 a2 4a cd 38 00 01 00 00 
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 c6 e1 f1 40 00 00 40 00 
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da33:mrsas0:1:44:0): READ(10). CDB: 28 00 4a 58 ed d0 00 00 40 00 
> (da33:mrsas0:1:44:0): CAM status: SCSI Status Error
> (da33:mrsas0:1:44:0): SCSI status: OK
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 37 84 a3 38 00 00 38 00 
> (da38:mrsas0:1:49:0): CAM status: SCSI Status Error
> (da38:mrsas0:1:49:0): SCSI status: OK
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 7a cb 74 90 00 00 10 00 
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 c9 1e 27 30 00 00 40 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 ea e6 50 98 00 00 30 00 
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 aa a3 3d 80 00 00 40 00 
> (da33:mrsas0:1:44:0): READ(10). CDB: 28 00 c9 15 52 08 00 00 30 00 
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 76 1b bd e8 00 00 30 00 
> (da35:mrsas0:1:46:0): WRITE(10). CDB: 2a 00 ea e6 cb 60 00 00 40 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 c3 55 9d 98 00 00 08 00 
> (da35:mrsas0:1:46:0): CAM status: SCSI Status Error
> (da35:mrsas0:1:46:0): SCSI status: OK
> (da31:mrsas0:1:42:0): CAM status: SCSI Status Error
> (da31:mrsas0:1:42:0): SCSI status: OK
> (da35:mrsas0:1:46:0): READ(10). CDB: 28 00 9a 91 d6 80 00 00 40 00 
> (da35:mrsas0:1:46:0): CAM status: SCSI Status Error
> (da35:mrsas0:1:46:0): SCSI status: OK
> (da37:mrsas0:1:48:0): WRITE(10). CDB: 2a 00 ea e7 3c 48 00 00 40 00 
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 d6 de 12 48 00 00 08 00 
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc df 12 38 00 00 40 00 
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 53 9b 18 10 00 00 e8 00 
> (da36:mrsas0:1:47:0): CAM status: SCSI Status Error
> (da36:mrsas0:1:47:0): SCSI status: OK
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 70 81 d4 28 00 00 40 00 
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 a3 02 55 68 00 00 40 00 
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 aa a3 56 38 00 00 40 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 cf 54 91 d8 00 00 40 00 
> (da36:mrsas0:1:47:0): CAM status: SCSI Status Error
> (da36:mrsas0:1:47:0): SCSI status: OK
> (da35:mrsas0:1:46:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da31:mrsas0:1:42:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 aa 8b 22 c8 00 00 08 00 
> (da38:mrsas0:1:49:0): CAM status: SCSI Status Error
> (da38:mrsas0:1:49:0): SCSI status: OK
> (da32:mrsas0:1:43:0): WRITE(10). CDB: 2a 00 ea e8 2b 80 00 00 28 00 
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 ba 4d 11 c0 00 00 40 00 
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 08 1f e7 60 00 00 10 00 
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 7c 13 13 e0 00 01 00 00 
> (da34:mrsas0:1:45:0): WRITE(10). CDB: 2a 00 ea e8 b0 68 00 00 40 00 
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 9a 95 4f 60 00 00 40 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 9d 8f 76 98 00 00 40 00 
> (da31:mrsas0:1:42:0): CAM status: SCSI Status Error
> (da31:mrsas0:1:42:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 08 48 96 50 00 00 38 00 
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 9d 8f d8 40 00 00 40 00 
> (da30:mrsas0:1:41:0): CAM status: SCSI Status Error
> (da30:mrsas0:1:41:0): SCSI status: OK
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc e0 dd 48 00 00 28 00 
> (da39:mrsas0:1:50:0): READ(10). CDB: 28 00 9d 90 5f f0 00 00 40 00 
> (da39:mrsas0:1:50:0): CAM status: SCSI Status Error
> (da39:mrsas0:1:50:0): SCSI status: OK
> (da30:mrsas0:1:41:0): READ(10). CDB: 28 00 ad 78 8a 58 00 00 40 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 ad 78 8a 58 00 00 40 00 
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 aa 9b 1c 60 00 00 08 00 
> (da31:mrsas0:1:42:0): READ(10). CDB: 28 00 1c ff f8 68 00 01 00 00 
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 ce bb b6 58 00 00 40 00 
> (da38:mrsas0:1:49:0): READ(10). CDB: 28 00 9a 97 c5 30 00 00 20 00 
> (da38:mrsas0:1:49:0): CAM status: SCSI Status Error
> (da38:mrsas0:1:49:0): SCSI status: OK
> (da36:mrsas0:1:47:0): READ(10). CDB: 28 00 79 e9 ba 20 00 00 f0 00 
> (da34:mrsas0:1:45:0): WRITE(10). CDB: 2a 00 ea ea d0 f8 00 00 28 00 
> (da36:mrsas0:1:47:0): CAM status: SCSI Status Error
> (da36:mrsas0:1:47:0): SCSI status: OK
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da37:mrsas0:1:48:0): READ(10). CDB: 28 00 64 c5 3f e8 00 00 f8 00 
.....
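
For anyone triaging a similar console log, here is a minimal sketch that tallies the "CAM status: SCSI Status Error" lines per device, to check whether the errors really are confined to one group of disks. It assumes the messages have been saved as text in the format shown above; the function name `count_cam_errors` and the `sample` excerpt are illustrative only.

```python
import re
from collections import Counter

# Matches lines such as "(da32:mrsas0:1:43:0): CAM status: SCSI Status Error"
# and captures the device name (da32).
CAM_ERROR = re.compile(
    r"\((da\d+):mrsas\d+:\d+:\d+:\d+\): CAM status: SCSI Status Error"
)


def count_cam_errors(lines):
    """Count CAM status error lines per device name."""
    counts = Counter()
    for line in lines:
        match = CAM_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A short excerpt in the same format as the log above.
sample = """\
> (da32:mrsas0:1:43:0): READ(10). CDB: 28 00 65 01 26 18 00 00 40 00
> (da32:mrsas0:1:43:0): CAM status: SCSI Status Error
> (da32:mrsas0:1:43:0): SCSI status: OK
> (da34:mrsas0:1:45:0): READ(10). CDB: 28 00 9a 85 82 58 00 00 40 00
> (da34:mrsas0:1:45:0): CAM status: SCSI Status Error
> (da34:mrsas0:1:45:0): SCSI status: OK
> (da39:mrsas0:1:50:0): WRITE(10). CDB: 2a 00 dc d3 73 30 00 00 40 00
""".splitlines()

if __name__ == "__main__":
    for device, errors in sorted(count_cam_errors(sample).items()):
        print(device, errors)
```

If every device with a non-zero count sits behind the same connector or backplane, that points at the shared path rather than at the individual disks.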


I have updated the system firmware and BIOS.
The FreeNAS version is FreeNAS-11.3-U1.
What I did notice is that the old drives report "block size: 512B configured, 4096B native".
The new drives are 512B native. Could that be the issue?

The specs for the system:
SuperMicro X10DSC+
2x Intel E5-2603 v4 @ 1.7GHz
64GB RAM
System is running on two SSDs that are mirrored with ZFS
Drives:
30x ST6000NM0024-1HT
10x ST6000NM0235-2AB


Current output of zpool status (mid-scrub):
Code:
zpool status -v Backups
  pool: Backups
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Fri Apr  3 20:04:26 2020
        87.9T scanned at 388M/s, 87.3T issued at 386M/s, 126T total
        78.9M repaired, 69.37% done, 1 days 05:05:30 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Backups                                         DEGRADED     0     0    14
          raidz1-0                                      ONLINE       0     0     0
            gptid/9829c871-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/98bac365-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/995238d1-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/99e7e9b4-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9a7d42e7-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-1                                      ONLINE       0     0     0
            gptid/9b138896-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9bb00809-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9c477544-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9ce69eb1-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9d892148-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-2                                      ONLINE       0     0     0
            gptid/9e2db526-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9edc0ed6-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9f7b0625-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a01827b8-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a0b6747f-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-3                                      ONLINE       0     0     0
            gptid/a1619b2b-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a1f8a461-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a29d5c37-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a33c40b4-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a3dc1795-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-4                                      ONLINE       0     0     0
            gptid/a4898277-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a529b97c-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a5cf3ad5-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a6719352-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a717b75b-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-5                                      ONLINE       0     0     0
            gptid/a7cf10c8-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a8753ed1-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a924845f-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a9dd6f1c-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/d766cdda-b40b-11e7-b647-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-6                                      DEGRADED     0     0    15
            gptid/60ecdd80-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   448  too many errors
            gptid/634407ff-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   437  too many errors
            gptid/6647c5a4-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   379  too many errors
            gptid/688f5f78-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   351  too many errors
            gptid/6acbb05b-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   365  too many errors
          raidz1-7                                      DEGRADED     0     0    15
            gptid/925b275a-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   400  too many errors
            gptid/94968010-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   359  too many errors
            gptid/96d5ee69-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   355  too many errors
            gptid/99166c0e-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   408  too many errors
            gptid/9b4e3d8e-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   402  too many errors
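
With 40 disks in the pool, it can help to pull out just the devices that have checksum errors and group them by vdev. The following is a rough sketch, assuming the `zpool status` text has been captured as a string in the layout shown above. The function name `cksum_by_vdev` is hypothetical, and it only recognises `raidz*` vdevs and `gptid/` leaf names as they appear in this pool.

```python
def cksum_by_vdev(status_text):
    """Return {vdev: [(device, cksum), ...]} for leaf devices with CKSUM > 0."""
    result = {}
    vdev = None
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        # The device table starts after the "NAME STATE READ WRITE CKSUM" header.
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True
            continue
        if not in_config or len(fields) < 5:
            continue
        name, cksum = fields[0], fields[4]
        # Skip rows whose CKSUM column is not a plain integer
        # (zpool may abbreviate large counts, e.g. "1.5K").
        if not cksum.isdigit():
            continue
        if name.startswith("raidz"):
            vdev = name
        elif name.startswith("gptid/") and int(cksum) > 0:
            result.setdefault(vdev, []).append((name, int(cksum)))
    return result


# A few rows copied from the output above.
sample_status = """
        NAME                                            STATE     READ WRITE CKSUM
        Backups                                         DEGRADED     0     0    14
          raidz1-5                                      ONLINE       0     0     0
            gptid/a7cf10c8-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-6                                      DEGRADED     0     0    15
            gptid/60ecdd80-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   448  too many errors
            gptid/634407ff-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   437  too many errors
"""

if __name__ == "__main__":
    for vdev_name, devices in cksum_by_vdev(sample_status).items():
        for device, cksum in devices:
            print(vdev_name, device, cksum)
```

In the full output above, this would list only the ten disks in raidz1-6 and raidz1-7, which are the new drives.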

I am aware that raidz1 with 6TB disks is not the best idea in the world.

The drives are connected through an Avago MegaRAID 3108 as JBOD disks.
The controller has two connectors. As far as I understand, the first 30 drive bays are connected through one connector and the second 30 bays (which hold the 10 problematic drives in my case) through the other.

Can someone point me in the right direction - I'm out of ideas.
Should I just order 4k native drives and swap one out to see what happens?
Should I maybe move some of the old drives to the second half of the system to see if they also get errors there?
Maybe someone has already seen this kind of error and knows what's wrong?

English is not my first language so please let me know if I was not clear enough on something.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Looks like a cabling or controller/port problem. Check into that side first.

The disks are probably still fine.
 

Jolinar

So I switched out the RAID controller today. It did not help.

Do I understand correctly that when I move the disks to another bay (from one backplane to another) they will continue to work and the pool will not break?
That would let me test whether it's a backplane issue by swapping a few disks with each other.
 

anmnz

Patron
Joined
Feb 17, 2018
Messages
286
Do I understand correctly that when I move the disks to another bay (from one backplane to another) they will continue to work and the pool will not break?

Yes. ZFS identifies disks by metadata stored on the disks themselves, not by where they are connected to the system.
 

Jolinar

Seems that it was the backplane that was faulty. After a lengthy wait I received a new one and I am not getting any SCSI errors anymore.

BUT
CKSUM errors still show up during scrubbing. I'm guessing they're just old errors that now need to be fixed somehow.
Is there a way I can get the CKSUM column to 0 again? Do I need to delete the files with "permanent errors" and re-create them? Will this remove the CKSUM errors?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
I got similar errors with SMR drives. Make sure the 10 new drives aren’t SMR.
 

Jolinar

It seems that the error is fixed.
For testing, I swapped some drives before I changed the backplane. When I changed the backplane, I swapped the drives back.
The new drives that were on the working backplane have not gotten any errors.
The old drives that were connected to the faulty backplane, and are now back on the first, working backplane, still show some CKSUM errors.
I guess the others will also clear with time and a few runs of "zpool clear".
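
On the earlier question of getting the CKSUM column back to 0, here is a hedged sketch of the usual sequence, assuming the standard `zpool clear`, `zpool scrub` and `zpool status -v` subcommands and the pool name `Backups` from this thread. The helper name `clear_and_rescrub_commands` is made up for illustration, and the script only prints the commands rather than running them. Note that `zpool clear` resets the counters but repairs nothing; files listed under permanent errors generally have to be restored from backup or deleted, and that list may only empty once a later scrub has completed.

```python
import shlex


def clear_and_rescrub_commands(pool):
    """Return the zpool invocations, in order, as argument lists."""
    return [
        # Reset the READ/WRITE/CKSUM counters for every device in the pool.
        ["zpool", "clear", pool],
        # Start a scrub; it runs in the background and re-verifies all data.
        ["zpool", "scrub", pool],
        # After the scrub finishes, check the counters and the list of
        # files with permanent errors.
        ["zpool", "status", "-v", pool],
    ]


if __name__ == "__main__":
    for argv in clear_and_rescrub_commands("Backups"):
        print(shlex.join(argv))
```

If the counters stay at 0 after a full scrub on the replaced backplane, the remaining errors were most likely left over from the faulty hardware.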


The red and green drives were swapped out with each other.
        NAME                                            STATE     READ WRITE CKSUM
        Backups                                         DEGRADED     0     0   472
          raidz1-0                                      ONLINE       0     0     0
            gptid/9829c871-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/98bac365-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/995238d1-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/99e7e9b4-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9a7d42e7-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-1                                      DEGRADED     0     0    16
            gptid/9b138896-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9bb00809-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9c477544-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0   273  too many errors
            gptid/9ce69eb1-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0   253  too many errors
            gptid/9d892148-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-2                                      ONLINE       0     0     0
            gptid/9e2db526-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9edc0ed6-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/9f7b0625-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a01827b8-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a0b6747f-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-3                                      ONLINE       0     0     0
            gptid/a1619b2b-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a1f8a461-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a29d5c37-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a33c40b4-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a3dc1795-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-4                                      ONLINE       0     0     0
            gptid/a4898277-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a529b97c-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a5cf3ad5-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a6719352-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gptid/a717b75b-be0b-11e6-a931-0cc47aa501be  ONLINE       0     0     0  block size: 512B configured, 4096B native
          raidz1-5                                      DEGRADED     0     0   850
            gptid/a7cf10c8-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0     0  too many errors
            gptid/a8753ed1-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0     0  too many errors
            gptid/a924845f-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0   285  too many errors
            gptid/a9dd6f1c-be0b-11e6-a931-0cc47aa501be  DEGRADED     0     0   271  too many errors
            gptid/d766cdda-b40b-11e7-b647-0cc47aa501be  DEGRADED     0     0     0  too many errors
          raidz1-6                                      DEGRADED     0     0     2
            gptid/60ecdd80-67e1-11ea-9d80-0cc47aa501be  ONLINE       0     0     0
            gptid/634407ff-67e1-11ea-9d80-0cc47aa501be  ONLINE       0     0     0
            gptid/6647c5a4-67e1-11ea-9d80-0cc47aa501be  ONLINE       0     0     0
            gptid/688f5f78-67e1-11ea-9d80-0cc47aa501be  ONLINE       0     0     0
            gptid/6acbb05b-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   323  too many errors
          raidz1-7                                      DEGRADED     0     0    76
            gptid/925b275a-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   346  too many errors
            gptid/94968010-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   364  too many errors
            gptid/96d5ee69-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   376  too many errors
            gptid/99166c0e-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   381  too many errors
            gptid/9b4e3d8e-67e1-11ea-9d80-0cc47aa501be  DEGRADED     0     0   342  too many errors
 

sretalla

You might also want to re-think this:
I am aware that raidz1 with 6TB disks is not the best idea in the world.

The drives are connected through an Avago MegaRAID 3108 as JBOD disks.

You're right about RAIDZ1, but even more concerning is the RAID controller in JBOD mode...
 