Cannot replace disk: NOT READY, Logical unit not ready, initial

Oir0Quee

Cadet
Joined
Aug 27, 2020
Messages
3
Two weeks ago, one of the disks in my FreeNAS unit failed, so I sent it to HGST for a warranty replacement. After I received the replacement disk (same type, same size), I installed it, but I could not replace the failed disk in the degraded ZFS pool. It seems the new disk is not fully recognized by my system, and I am clueless. The system cannot query the device's size, and I cannot write anything to it. I also tried different slots in the chassis, with the same results.
Does anyone here have an idea? Is there some special BSD magic that I overlooked? Or is it possible that they sent me a faulty device? Unfortunately, this is the only machine I have available that has SAS ports.

System info:
Code:
Mainboard: Supermicro X11SSL-CF
SAS Controller: LSI SAS3008 (Onboard)
Chassis: Supermicro SuperChassis 836BE1C-R1K03B
SAS Backplane: Supermicro HD Backplane BPN-SAS3-836EL2
CPU: Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz
RAM: 32 GB ECC
Boot device: Samsung SSD 850 EVO 250GB
ZFS Pool (RAIDZ2):
 4 x SEAGATE ST10000NM0206 10TB
 4 x HGST HUH721010AL5200 10TB (One of these failed)
 


zpool status:
Code:
root@discworld ~ # zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:36 with 0 errors on Sat Aug 22 03:45:36 2020
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada0p2    ONLINE       0     0     0

errors: No known data errors

  pool: pool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 0 days 20:25:18 with 0 errors on Mon Jul 20 20:25:19 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/efc6973c-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            gptid/f0f67dd6-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            gptid/f21563c2-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            gptid/f336aa8a-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            gptid/f450bd90-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            gptid/f563cdb8-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0
            426474221169454384                          OFFLINE      0     0     0  was /dev/gptid/f678f62b-f415-11e8-a34d-0cc47a85297e
            gptid/f78ff78b-f415-11e8-a34d-0cc47a85297e  ONLINE       0     0     0

errors: No known data errors


I set the defective disk offline and tried to replace it with zpool replace:
Code:
root@discworld ~ # zpool replace pool 426474221169454384 /dev/da6
cannot replace 426474221169454384 with /dev/da6: device is too small
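
One way to sanity-check the "device is too small" message is to compare the size the kernel reports for the new disk with that of an existing pool member. The following is only a minimal sketch, assuming that FreeBSD's `diskinfo <device>` prints the media size in bytes as the third field of its one-line output. The helper names `mediasize_bytes` and `large_enough` are hypothetical and not part of FreeNAS.

```shell
# Extract the media size in bytes (assumed to be the third field)
# from one line of `diskinfo <device>` output read on stdin.
mediasize_bytes() {
    awk '{ print $3 }'
}

# Succeed if a replacement of $1 bytes is at least as large as
# an existing member of $2 bytes.
large_enough() {
    [ "$1" -ge "$2" ]
}
```

For example, `large_enough "$(diskinfo da6 | mediasize_bytes)" "$(diskinfo da0 | mediasize_bytes)"` would compare da6 with da0 (da0 is just an assumed name for a healthy member). If `diskinfo` cannot open the new disk at all, or reports a size of 0, that would fit the failed size query rather than a genuinely smaller disk.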


Even smartctl is not successful:
Code:
smartctl -i /dev/da6
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUH721010AL5200
Revision:             A21D
Compliance:           SPC-4
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca251100588
Serial number:        7PG8U4PS
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Thu Aug 27 19:57:17 2020 CEST
device is NOT READY (e.g. spun down, busy)
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

This is in /var/log/messages:
Code:
root@discworld ~ # grep da6 /var/log/messages | grep '10:2'
Aug 27 10:28:01 discworld da6 at mpr0 bus 0 scbus0 target 14 lun 0
Aug 27 10:28:01 discworld da6: <HGST HUH721010AL5200 A21D> Fixed Direct Access SPC-4 SCSI device
Aug 27 10:28:01 discworld da6: Serial Number 7PG8U4PS
Aug 27 10:28:01 discworld da6: 1200.000MB/s transfers
Aug 27 10:28:01 discworld da6: Command Queueing enabled
Aug 27 10:28:01 discworld da6: Attempt to query device size failed: NOT READY, Logical unit not ready, initial
Aug 27 10:28:01 discworld ses0: da6,pass6 in 'Slot06', SAS Slot: 1 phys at slot 6
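
To check whether other disks are affected in the same way, the log can be filtered for the size-query message shown above. This is a rough sketch that assumes the message format from this excerpt; the function name `size_query_failures` is made up for illustration.

```shell
# Print the names of da devices whose size query failed, one per line.
# Reads syslog-style text (such as /var/log/messages) on stdin.
size_query_failures() {
    sed -n 's/.* \(da[0-9][0-9]*\): Attempt to query device size failed.*/\1/p' | sort -u
}
```

It could be run as `size_query_failures < /var/log/messages`. If only the new disk shows up, that points at the disk rather than at the controller or backplane.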

I tried to dd to the disk, but nothing can be written:
Code:
root@discworld ~ # dd if=/dev/zero of=/dev/da6 count=100
dd: /dev/da6: Device not configured
1+0 records in
0+0 records out
0 bytes transferred in 0.000036 secs (0 bytes/sec)

Replacing the disk with the web interface fails with:
Code:
Replacing Disk
Error: [EFAULT] Failed to wipe disk da6: [EFAULT] Command gpart create -s gpt /dev/da6 failed (code 1): gpart: provider: Operation not supported by device
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hey,

It looks like you got a DOA (dead on arrival) drive...

This is why burn-in testing is so important.
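
A burn-in usually means running SMART self-tests and a full write/read pass before the disk goes into a pool. The sketch below only prints a commonly used sequence and does not run anything. The exact steps and the block size are assumptions, not something stated in this thread, and `burnin_plan` is a hypothetical helper name.

```shell
# Print (do not run) a typical burn-in sequence for the disk given as $1.
# WARNING: "badblocks -w" is destructive and overwrites the whole disk,
# so it must only be used on a disk that holds no data.
burnin_plan() {
    dev=$1
    printf '%s\n' \
        "smartctl -t short $dev" \
        "smartctl -t long $dev" \
        "badblocks -b 4096 -ws $dev" \
        "smartctl -a $dev"
}
```

Calling `burnin_plan /dev/da6` lists the commands for review. Each self-test has to finish before the next step is started, and on a 10TB disk the long test and the badblocks pass can each take many hours or longer.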
 

Oir0Quee

Thanks for your reply... So there is no chance that I overlooked some weird setting that stops new disks from being recognized? Damn. Another week of waiting for the next replacement.
 

Heracles

I overlooked some weird setting which stops new disks from being recognized?

No. I think that your new drive itself is defective. To be completely sure, try to use it in another system. If it does not work in that second system, you will have your evidence...
 

Oir0Quee

Just for information: the disk was replaced. I received the new disk yesterday, and FreeNAS is currently replacing the failed disk with it in my pool. Thank you for the advice!
 