WD40EFRX drive turned Unknown.

ah1970

Cadet
Joined
Jan 16, 2021
Messages
2
I have two HP N40L servers at home which were running 12.0. Each has three 4TB drives, but I was running out of room, so I decided to upgrade them.

On Wednesday, the new IronWolf drive arrived for the backup server, so I killed the existing pool, moved the SSD from ADA0 to the optical drive bay, added the new drive and rebooted. When the system came back, I rebuilt the pool, replicated across from the main server and everything just worked.

Today, the two new WD40EFRX drives arrived for the main server. One is an extra and the other is to replace an EFAX which was part of the original pool.

I did the same as on the other server: killed the pool, moved the SSD to the optical drive bay, removed the EFAX, added the two EFRX drives and rebooted.

When I tried to rebuild the pool, one of the drives showed as Unknown. I can add it to the pool, but TN complains that you can't make a vdev out of drives of different sizes, even though they all show as 3.64TiB.

After faffing around for a while, I thought it might just be a DOA drive, but on checking the serial numbers, it turns out this was one of the original drives, which was working fine under 11.3 and then 12.0, until today.

If I move it around the bays, the problem follows it. I've cleaned the drive edge connectors and air dusted the sockets in the backplane.

The other three drives all have the same firmware; this one has a completely different firmware version and a different case design.

root@freenas[~]# dmesg | grep WD40
ada0: <WDC WD40EFRX-68N32N0 82.00A82> ACS-3 ATA SATA 3.x device
ada1: <WDC WD40EFRX-68N32N0 82.00A82> ACS-3 ATA SATA 3.x device
ada2: <WDC WD40EFRX-68N32N0 00.02C.5> ATA8-ACS SATA 2.x device
ada3: <WDC WD40EFRX-68N32N0 82.00A82> ACS-3 ATA SATA 3.x device

I've tried going back to 11.3 but it's just the same.

To be honest, I'll probably just trash this rogue disk and buy another new one, but it bugs me that TN was perfectly happy with it up until this morning.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The drive has probably failed. If you look at "dmesg" to see what size is listed, you will probably see something radically different from what you expect. If you query it with smartctl, it will probably show "failed". A classic sign of failure of the mechanical bits is for the drive to suddenly not know exactly what it is.
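
A minimal sketch of those two checks, from the TrueNAS shell. `check_drive` is just a made-up wrapper name, and the example call assumes the suspect drive is still ada2 as in the dmesg listing earlier in the thread; the name can change when the drive is moved between bays, so substitute whatever it shows up as now.

```shell
# check_drive DEV: print what the kernel and the drive itself report
# about DEV, a bare device name such as ada2 (assumed, not verified).
check_drive() {
    dev="$1"
    [ -n "$dev" ] || { echo "usage: check_drive DEV" >&2; return 64; }

    # Bail out early if any of the tools is missing on this system.
    for tool in dmesg diskinfo smartctl; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "check_drive: $tool not found" >&2
            return 2
        }
    done

    echo "== kernel probe messages for $dev =="
    dmesg | grep "^${dev}:"

    echo "== media size as seen by the OS =="
    diskinfo -v "/dev/${dev}"

    echo "== identity and capacity reported by the drive =="
    smartctl -i "/dev/${dev}"

    echo "== SMART overall health =="
    smartctl -H "/dev/${dev}"
}
```

Call it as `check_drive ada2` and compare the result with `check_drive ada0` for a known-good drive. A size that is wildly off in the dmesg or diskinfo output, or a health result other than PASSED, would point to the drive rather than to TrueNAS.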

Stopping a hard drive after a long time running is always a risky proposition; they do not always start spinning again. You may be able to feel that it is not spinning up, or is having trouble spinning up, if you take it out of the chassis, put it on a flat surface, and power it up with your hand on top of it.
 