Disk Offline Problems

Status
Not open for further replies.

centex99

Dabbler
Joined
Jul 29, 2012
Messages
45
I had a disk change status to "Removed", so I shut down the server, checked the cables, and turned it back on. It now seems to be OK, except that it shows this warning: "WARNING: The volume NAS_VOL (ZFS) status is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'."

Running zpool status -v reveals:
  pool: NAS_VOL
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 988K in 0h0m with 0 errors on Fri Jan 31 13:48:50 2014
config:

        NAME                                            STATE     READ WRITE CKSUM
        NAS_VOL                                         ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/8138d265-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     0
            gptid/6b5683d1-ef98-11e1-992b-f46d0471c67b  ONLINE       0     0     0
            gptid/826c766e-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     0
            gptid/8308e26e-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     0
            gptid/83a636ea-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     1

errors: No known data errors
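
For anyone who wants to watch these counters without reading the table by eye, here is a minimal Python sketch that picks out rows of the config table with a non-zero READ, WRITE, or CKSUM counter. The function name devices_with_errors and the shortened SAMPLE text are illustrative assumptions, not part of any ZFS tooling; in practice the text would come from the output of zpool status -v.

```python
def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for config rows with a non-zero counter.

    Counters are kept as strings because zpool may abbreviate large counts.
    """
    flagged = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        # The header row marks the start of the device table.
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True
            continue
        # The trailing "errors:" line marks the end of the device table.
        if line.startswith("errors:"):
            in_config = False
        # Device rows carry at least NAME STATE READ WRITE CKSUM.
        if not in_config or len(fields) < 5:
            continue
        name = fields[0]
        read, write, cksum = fields[2:5]
        if any(counter != "0" for counter in (read, write, cksum)):
            flagged.append((name, read, write, cksum))
    return flagged


# Shortened copy of the table above (assumption: only two disks shown).
SAMPLE = """\
config:

        NAME                                            STATE     READ WRITE CKSUM
        NAS_VOL                                         ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/8138d265-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     0
            gptid/83a636ea-d902-11e1-b300-f46d0471c67b  ONLINE       0     0     1

errors: No known data errors
"""

for row in devices_with_errors(SAMPLE):
    print(row)
```

With the sample text, only the last disk (the one with a single checksum error) is reported.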
 

centex99

Dabbler
Joined
Jul 29, 2012
Messages
45
The last disk is the one that showed "Removed" earlier... it's on ada4.
Also, my ada0 disk has been having a few issues... it reports "Device: /dev/ada0, 8 Offline uncorrectable sectors".
I've been keeping an eye on this, and the number has not increased from 8. Do I need to replace this drive? What about the ada4 drive?
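
Keeping an eye on that count can be scripted. The following is a rough Python sketch that pulls the raw value of a few sector-health attributes out of the attribute table printed by smartctl -A. Everything here is an assumption for illustration: the function name watched_raw_values, the WATCHED set (attribute names vary between drive vendors), and the SAMPLE row, in which only the raw value 8 comes from this thread and the other columns are placeholders.

```python
# Attribute names commonly associated with bad sectors (assumption: the
# exact names differ between vendors, so adjust for the drive in question).
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def watched_raw_values(smart_text):
    """Map each watched attribute name to its raw value from a smartctl -A table."""
    values = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns;
        # the tenth column is the start of RAW_VALUE.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            values[fields[1]] = int(raw) if raw.isdigit() else raw
    return values


# Illustrative table: placeholder columns, raw value 8 as reported above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       8
"""

previous = 8  # last count noted by hand
current = watched_raw_values(SAMPLE).get("Offline_Uncorrectable", 0)
print("count increased" if current > previous else "count unchanged or lower")
```

Comparing each new reading against the last one noted makes it obvious when the count starts to climb.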
 

centex99

Dabbler
Joined
Jul 29, 2012
Messages
45
So, the drive just went "Removed" again... is this likely due to the controller or the drive? This drive hasn't shown any other signs of failure...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I see this thread and I want to cry with tears of despair for mankind.

Did you even try searching the forums for phrases like "offline uncorrectable errors"? You should. If you had, you would not have needed to post this thread.
 

centex99

Dabbler
Joined
Jul 29, 2012
Messages
45
Regarding the offline errors... yes, I've searched the forum and read about 15 different posts, and they all seem to say something different or just say to search the forum. Some threads say to replace the drive, others say that if it's just a few sectors it's OK, and some say to run a scrub and it'll get rid of the error. I'm fairly certain a scrub has been run, but the error persists... I'm sorry if it frustrates you, but I'm asking for help.

My larger concern/question was about the other disk going offline... I've since switched the cables like I mentioned, and neither disk has gone offline in at least 24 hours... maybe it was a bad connection, I'm not entirely sure. One thing I do know for sure, though, is that I don't want to replace ada0 (which has the bad sectors) until I'm 99% confident that the "Removed" status won't come back on the other drive during a resilver.
 

centex99

Dabbler
Joined
Jul 29, 2012
Messages
45
So, back to the "disk removed" problem... I didn't encounter the error for over a week, and then decided today to put the case side back on... voila, the disk went "Removed" again... and it followed the disk. So that tells me it wasn't the SATA cable or the SATA port, but perhaps the disk or the power cable (I didn't swap those). Will a disk that is overheated get changed to "Removed" status? It's weird, because I had this up and running for over two years without this problem occurring.
 