Bad Block, Identify bad drive.

Status
Not open for further replies.

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
I'm getting continuous pages of the following error:
vm_fault: pager read error, pid xxxxx (nginx)
swap_pager: I/O error - pagein failed; blkno 1048907, size 4096, error 6

How can I identify which drive is the bad one? Will a zpool scrub mark that block as bad? Before I do that, I would like to identify the bad drive so I can replace it.

After booting, I get an alert in the web interface telling me to determine whether the drive needs to be replaced, yet the alert doesn't identify the drive.

Thanks for any help.
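
For reference, the fields in the swap_pager line can be pulled apart mechanically. This is only a minimal sketch, assuming the log line always has the exact format quoted above; `parse_swap_pager_error` is a hypothetical helper, not a FreeNAS or FreeBSD tool. On FreeBSD, error 6 is ENXIO ("Device not configured"), which tends to point at a device that stopped responding rather than a single unreadable block, though the line alone does not name the drive.

```python
import errno
import re

# Pattern for the swap_pager line quoted above (format assumed from that one sample).
SWAP_ERR = re.compile(
    r"swap_pager: I/O error - pagein failed; "
    r"blkno (?P<blkno>\d+), size (?P<size>\d+), error (?P<error>\d+)"
)


def parse_swap_pager_error(line):
    """Return blkno, size and error as ints plus the errno name, or None if no match."""
    m = SWAP_ERR.search(line)
    if m is None:
        return None
    fields = {key: int(value) for key, value in m.groupdict().items()}
    # The name is looked up on the machine running this script, not on the NAS.
    fields["errno_name"] = errno.errorcode.get(fields["error"], "unknown")
    return fields


sample = "swap_pager: I/O error - pagein failed; blkno 1048907, size 4096, error 6"
print(parse_swap_pager_error(sample))
```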
 

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
Now a different drive has died. FreeNAS says it has lost device ada1 and has removed the entry, so I replaced the drive with another. When I tried to add the new drive to the pool via the web interface, the only choices offered were "stripe" and the like; there is no option to add it to the existing ZFS pool. Any idea how to add it to the existing zpool?
 

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
And now this (see screenshot) on another drive. Maybe I need a new motherboard, or replace the SATA cable.
 

Attachments

  • IMG_1936.JPG (411 KB)

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
Thanks for your suggestion. I just added a SMART test for all the drives. I can't tell whether it has already run or is running now. Where does the output go when it does run?
 

IanWorthington

Contributor
Joined
Sep 13, 2013
Messages
144
Steve Simpson said:
Thanks for your suggestion. I just added a SMART test for all the drives. I can't tell whether it has already run or is running now. Where does the output go when it does run?


IIRC that's a scheduled test, which runs at the appointed time. I /believe/ you get an email if it fails, but I've never seen one... In your case I'd run it from the command line, leave it for the recommended time, then issue another command to see the results.

See https://wiki.archlinux.org/index.php/S.M.A.R.T. for syntax
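
As a rough illustration of the two-step approach described above (start a test, then read the results later), here is a small sketch that builds the corresponding smartctl invocations. It assumes smartmontools is installed and that the drives appear as /dev/adaN; the helper names are made up for this example, and the device names are guesses based on the ada1/ada3 names mentioned in this thread. `smartctl -t` starts a self-test and `smartctl -a` prints all SMART information, including the self-test log.

```python
import subprocess


def smart_test_cmd(device, kind="short"):
    """Build the argv that asks the drive to start a self-test (smartctl -t)."""
    if kind not in ("short", "long"):
        raise ValueError("kind must be 'short' or 'long'")
    return ["smartctl", "-t", kind, device]


def smart_report_cmd(device):
    """Build the argv that prints all SMART info, including the self-test log."""
    return ["smartctl", "-a", device]


def run(cmd):
    """Run a command and return its stdout; needs root and smartmontools installed."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


if __name__ == "__main__":
    # Device names are assumptions; substitute the ones your system reports.
    for dev in ("/dev/ada1", "/dev/ada3"):
        print(" ".join(smart_test_cmd(dev, "long")))
        print(" ".join(smart_report_cmd(dev)))
```

The script only prints the commands; passing either list to `run()` as root would actually execute it. The long test can take hours, so the report command is worth running only after the estimated completion time that smartctl prints when the test starts.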
 

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
You don't, the replacement process does that for you: http://doc.freenas.org/index.php/Volumes#Replacing_a_Failed_Drive.

I was trying to replace the drive via the ZFS Volume Manager, which seemed to be the right place for it. Right now the bad file server is down. On my good master server I don't see a "Volume Status" tab, which would explain why I thought the ZFS Volume Manager was the correct place to replace a drive: it seemed to be the only place. How do you get the Volume Status tab to show up? I put the bad drive back in the bad server to see whether the tab shows up then, but with the errors on ada1 it never even gets to the point where the web server is up.

It looks like I'll need to replace both ada1 and ada3 and start from scratch *again*.
 

Steve Simpson

Dabbler
Joined
Oct 20, 2013
Messages
19
I wound up replacing two drives and recreating the pool from scratch. All data lost. Copying everything from the master file server to the backup. It'll take 4 days or so. Again.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I ran badblocks for over a week on my drives before I trusted them with my data. And FOUR were bad.

Whoa! Bad luck... go buy a lottery ticket, as luck has to swing back, right?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I ran badblocks for over a week on my drives before I trusted them with my data. And FOUR were bad.

Out of how many? Sounds like someone viewed the "fragile" sticker as a challenge.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
12. Or 15 if you count the replacements (yeah, I'm running w/o a spare right now...)

1/3 DoA rate? That is nasty...


A small study some time ago showed that packages labeled "Fragile" actually received worse treatment than "normal" packages. Not the most significant of studies, but still a rather unfortunate conclusion...
 