Missing Disk

Status
Not open for further replies.

Bhoot

Patron
Joined
Mar 28, 2015
Messages
241
Yesterday evening I got this message by email:

The volume bhoot (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.

It happened without warning. I realized that ada3 was not being detected. There were no other alerts and no SMART report indicating an impending failure.
I opened the box and reseated the cable, and the alert disappeared.

Later last night I got new emails:

The volume bhoot (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
followed by
Device: /dev/ada3, unable to open device
The volume bhoot (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.

This is the last SMART report generated by FreeNAS for ada3, on 19th July.

########## SMART status report for ada3 drive (Western Digital Red: WD-WCC4E7UUT762) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0027   180   179   021    Pre-fail  Always   -           8000
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           46
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always   -           3649
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           42
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always   -           26
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always   -           433
194 Temperature_Celsius     0x0022   114   108   000    Old_age   Always   -           38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline  -           0

No Errors Logged

Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
Short offline     Completed without error  00%        3644             -

I still don't understand whether it's a problem with the SATA cable or the HDD. I also want to know what I should tell WD, because this isn't a straightforward SMART error. The system wasn't opened and didn't suffer any kind of bump that could lead to a hard disk failure. Could someone help me with the problem? I do have a cold spare, but I would like to know whether replacing the disk would really solve the problem.
 

Bhoot

Patron
Joined
Mar 28, 2015
Messages
241
Here is a shot of the volume status

freenas.png
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Have you tried replacing the cable? As rare as it is, cables do go bad, and that could easily be your problem here. That said, I've seen hard drives that pass a SMART test and still don't work, so that could also be your problem.

The simple solution is to replace the cable. There is an outside possibility it's some problem with your PSU/drive power cable, so if you can replace/swap that, I'd do that too. If the drive still shows up bad, then I'd say it's the drive.
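
As a rough aid for the cable-versus-drive question, the attribute table from a SMART report can be scanned for nonzero raw counters. The sketch below is only an illustration: the function names are made up for this example, and the grouping of attribute IDs is a common rule of thumb (CRC errors tend to point at the cable or link, reallocated or pending sectors at the drive), not an official classification. All-zero counters do not prove that either part is healthy.

```python
# Hypothetical helper: scan smartctl-style attribute lines for nonzero counters.
# The ID groupings below are a rule of thumb chosen for this sketch.
CABLE_HINTS = {199}               # UDMA_CRC_Error_Count
MEDIA_HINTS = {5, 196, 197, 198}  # reallocated / pending / uncorrectable sectors


def parse_smart_attributes(report: str) -> dict[int, tuple[str, int]]:
    """Map attribute ID -> (name, raw value) for lines shaped like the table above."""
    attrs = {}
    for line in report.splitlines():
        fields = line.split()
        # A data row has ten columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            attrs[int(fields[0])] = (fields[1], int(fields[9]))
        except ValueError:
            continue  # raw value is not a plain integer; skip it in this sketch
    return attrs


def suspects(attrs: dict[int, tuple[str, int]]) -> list[str]:
    """List nonzero counters and which component they loosely hint at."""
    findings = []
    for attr_id, (name, raw) in sorted(attrs.items()):
        if raw == 0:
            continue
        if attr_id in CABLE_HINTS:
            findings.append(f"{name}={raw}: hints at cable/link")
        elif attr_id in MEDIA_HINTS:
            findings.append(f"{name}={raw}: hints at the drive")
    return findings


# A few rows copied from the ada3 report posted earlier in this thread.
REPORT = """\
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
"""

attrs = parse_smart_attributes(REPORT)
findings = suspects(attrs)
print(findings if findings else "no nonzero cable or media counters")
```

For the report in this thread every one of those counters is zero, so SMART alone does not settle the question, which is why physically swapping the cable is the quicker test.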
 

Bhoot

Patron
Joined
Mar 28, 2015
Messages
241
Thank you for the quick reply. Really can't thank the community enough for having such amazing response time. :)
Cool, I'll try that. But TBH I don't currently have a spare SATA cable. Is it safe to swap cables with a working disk? It's a RAID-Z2, so even if both the drive and the cable turn out to be bad, I should still be safe from a catastrophe, right? Currently I am not really writing to the system.
Another thing I want to know: should I be satisfied if the BIOS detects the disk, or should I let it boot all the way to the web UI?
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Personally, I'd hesitate to mess with a working disk when my array is degraded. In theory you should be fine, but you never know when a second disk might decide to die on you.

Do you have any other computer(s) that you could steal the cable from, even just temporarily? At this point, the goal is to quickly rule out the cable, because if the drive is really bad, you want to get it replaced ASAP.
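
To put numbers on the "in theory you should be fine" part: a RAID-Z2 vdev carries two disks' worth of parity, so it survives two unavailable disks and loses data on the third. The sketch below is plain arithmetic under the assumption that the pool is a single RAID-Z2 vdev; the function name is made up for illustration.

```python
RAIDZ2_PARITY = 2  # RAID-Z2 tolerates two unavailable disks per vdev


def remaining_redundancy(parity_disks: int, unavailable_disks: int) -> int:
    """How many more disks the vdev can lose before data loss.

    Zero means the data is still readable but nothing is protecting it;
    a negative result means data has already been lost.
    """
    return parity_disks - unavailable_disks


# Today: only ada3 is missing.
print(remaining_redundancy(RAIDZ2_PARITY, 1))

# Worst case after swapping cables: ada3 really is bad, and the bad cable
# now takes the previously working disk offline as well.
print(remaining_redundancy(RAIDZ2_PARITY, 2))
```

The second case leaves no margin at all, which is the reason for borrowing a cable from another machine rather than from a healthy disk in the same pool.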
 