SOLVED Unknown error when trying to replace a drive

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

/dev/da3 in my main server Atlas (TrueNAS 12.0-U1; more details in my signature) started giving me SMART errors. I offlined the drive, put the spare in, and tried to replace the drive, all from the GUI. Unfortunately, the process fails almost immediately and gives little useful information. I checked the console for more and found this in /var/log/messages:

I removed and re-inserted the drive in case it was not seated properly the first time. That is why the log first shows the drive being detached and then re-attached right after.

Code:
Apr 28 00:49:31 Atlas.jb.lan da3: <ATA ST4000VN008-2DR1 SC60>  s/n ZGY74QQK detached
Apr 28 00:49:31 Atlas.jb.lan (da3:mrsas0:1:3:0): Periph destroyed
Apr 28 00:50:00 Atlas.jb.lan mrsas0: System PD created target ID: 0x3
Apr 28 00:50:03 Atlas.jb.lan da3 at mrsas0 bus 1 scbus1 target 3 lun 0
Apr 28 00:50:03 Atlas.jb.lan da3: <ATA ST4000VN008-2DR1 SC60> Fixed Direct Access SPC-4 SCSI device
Apr 28 00:50:03 Atlas.jb.lan da3: Serial Number ZGY74QQK
Apr 28 00:50:03 Atlas.jb.lan da3: 150.000MB/s transfers
Apr 28 00:50:03 Atlas.jb.lan da3: 3815447MB (7814037168 512 byte sectors)
Apr 28 00:50:03 Atlas.jb.lan kernel: igb3: link state changed to DOWN
Apr 28 00:50:07 Atlas.jb.lan kernel: igb3: link state changed to UP
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 00 00 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan mrsas0: (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 3 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 02 30 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 3 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 00 00 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 2 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 02 30 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 2 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 00 00 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 1 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 02 30 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 1 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 00 00 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 0 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 02 30 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 0 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 00 00 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Error 5, Retries exhausted
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 02 30 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Error 5, Retries exhausted
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 04 60 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 3 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 04 60 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 2 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 04 60 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 1 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 04 60 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Retrying command, 0 more tries remain
Apr 28 00:51:25 Atlas.jb.lan mrsas0: mrsas_data_load_cb_prp: error=27
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Data map/load failed.
Apr 28 00:51:25 Atlas.jb.lan mrsas0: Build SYSPDIO failed.
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): WRITE(10). CDB: 2a 00 00 00 04 60 00 02 30 00
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): CAM status: SMP Status Error
Apr 28 00:51:25 Atlas.jb.lan (da3:mrsas0:1:3:0): Error 5, Retries exhausted
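
For anyone trying to read the log above, the failing commands can be decoded from their CDB bytes. This is only a minimal Python sketch, assuming the standard 10-byte SCSI WRITE(10) layout (opcode 0x2a, a 4-byte big-endian LBA in bytes 2 to 5, a 2-byte transfer length in bytes 7 to 8) and the 512-byte sectors reported for da3. The function name `decode_write10` is made up for illustration; it is not part of TrueNAS or the mrsas driver.

```python
def decode_write10(cdb_hex: str, sector_size: int = 512) -> dict:
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # starting logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # transfer length in blocks
    return {"lba": lba, "blocks": blocks, "bytes": blocks * sector_size}


# The three distinct CDBs that appear in the log.
for cdb in (
    "2a 00 00 00 00 00 00 02 30 00",
    "2a 00 00 00 02 30 00 02 30 00",
    "2a 00 00 00 04 60 00 02 30 00",
):
    print(cdb, "->", decode_write10(cdb))
```

Under those assumptions, the three CDBs are writes of 560 sectors each (280 KiB) at LBA 0, 560 and 1120, which suggests the very first writes to the start of the newly inserted disk are the ones being rejected.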


I have no clue what that mrsas_data_load_cb_prp error=27 can be. I found nothing about it on Google or in the forums.
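
One possible starting point, offered as a sketch rather than a diagnosis: if the driver's `error=27` and the CAM layer's `Error 5` are ordinary errno values (an assumption, the log does not say so explicitly), they can be looked up with Python's standard `errno` and `os` modules. The helper name `describe_errno` is hypothetical, and the numbers are resolved on whatever machine runs the script; on FreeBSD and Linux, 27 is EFBIG and 5 is EIO.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and system message for an errno value."""
    # Assumption: the number printed in the log is a standard errno.
    name = errno.errorcode.get(code, "unknown")
    return f"{code} = {name} ({os.strerror(code)})"


for code in (27, 5):
    print(describe_errno(code))
```

This only names the error codes. It does not explain why the controller driver hit that condition on a hot-inserted drive.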

Does anyone have an idea what is going on here?

Thanks in advance,
 

Piero21

Cadet
Joined
Jul 28, 2021
Messages
2
Hello,
I have the same problem. After a sudden shutdown caused by an air conditioning failure, my ZFS pool was degraded. The disk that caused the ZFS inconsistency still appears to be functional. I erased it on another system, and now when I try to "replace" it with itself, I get the exact same error. This is an 18TB Seagate drive. Were you able to solve your problem?
 

Heracles

Hi,

Nope, no progress here (or should I say, no effort...). I will have to RMA both the failing drive and the cold spare that I had. For the moment, I keep my 3rd server hot so I have an up-to-date copy onsite in case of a catastrophic failure in my main server. Still, I should replace these drives sooner rather than later...
 

Piero21

Thank you for your reply.
For information, I inserted the problematic disk into another TrueNAS server and created a pool on it. It seems to be working perfectly, including a successful long SMART test.
I still do not know what caused the problem. I suspect it has to do with the disk being hot-inserted, but I would have to do more testing to confirm this.
 

Heracles

Hi,

Thanks a lot for sharing your experience. Here, I powered down my entire environment (which I have to do before powering down TrueNAS) and cold-replaced the drive. Indeed, the spare drive was detected properly and is now working as expected. So that error does seem to be somehow related to the drive being hot-plugged.

Thanks to that, I have now replaced the previous drive, which had failed two long SMART tests. I will now return it (still under warranty...).
 