Attempting to determine failed drive behind RAID controller

Status
Not open for further replies.

CarlB

Dabbler
Joined
Jan 30, 2018
Messages
40
I'm a total FreeNAS rookie. Last year I bought an old PowerEdge server off eBay in the hope of making a cheap NAS. It sort of worked, but there were some complications, like the fact that all the drives sit behind a RAID controller. So I ended up making a single-drive RAID0 volume for each drive, then letting FreeNAS do RAIDZ on its own. Ghetto solution, but I figured it was OK for now.

Here we are months later and I've had my first drive problem: "The volume vol0 state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state." Excellent, I thought, ZFS has bought me time to replace that drive.
Code:
Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH    ALTROOT
freenas-boot  14.5G   834M  13.7G         -      -     5%  1.00x  ONLINE    -
vol0          14.5T  5.61T  8.89T         -    14%    38%  1.00x  DEGRADED  /mnt

  pool: vol0
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
		the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 03:34:33 with 0 errors on Sun Jan 28 03:34:35 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        vol0                                            DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/b04226f1-7899-11e7-a067-842b2b005ba9  ONLINE       0     0     0
            gptid/b08932dd-7899-11e7-a067-842b2b005ba9  ONLINE       0     0     0
            71195709623019041                           UNAVAIL      0     0     0  was /dev/gptid/b0dc468b-7899-11e7-a067-842b2b005ba9
            gptid/b125c584-7899-11e7-a067-842b2b005ba9  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/2c9bfa43-7f0e-11e7-879a-842b2b005ba9  ONLINE       0     0     0
            gptid/2d1ccd4c-7f0e-11e7-879a-842b2b005ba9  ONLINE       0     0     0
            gptid/2d915097-7f0e-11e7-879a-842b2b005ba9  ONLINE       0     0     0
            gptid/2e07d2d3-7f0e-11e7-879a-842b2b005ba9  ONLINE       0     0     0

errors: No known data errors
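
For a long status listing, the missing member can be pulled out mechanically rather than by eye. This is only a minimal sketch: the function name unavail_members is made up for illustration, and it assumes the column layout shown in the output above (name, state, three counters, then an optional "was" path).

```shell
# Print each UNAVAIL pool member along with the gptid path it used to have.
# Reads `zpool status` text on stdin, so it can be tried on saved output first.
unavail_members() {
    awk '$2 == "UNAVAIL" {
        was = ""
        # The old device path, if present, follows the literal word "was".
        for (i = 3; i < NF; i++)
            if ($i == "was") was = $(i + 1)
        print $1, was
    }'
}

# On the live system (pool name vol0 taken from the output above):
# zpool status vol0 | unavail_members
```

The gptid it prints can then be compared against `glabel status`, which lists the gptid labels of the disks that are still present next to their device names.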

The reason I have two RAIDZ1 vdevs of 4 drives each is that I started this build with 4 drives. It became clear to me later that I needed more space, so I added 4 more, only then realizing I could not expand the existing vdev. It's too late for RAIDZ2 unless I can move all this data somewhere.

Anyway, my terrible RAID strategy aside, we can see the 3rd drive in the list is unavailable. However, the GUI does not show the serial numbers for these drives because they are behind that controller. If I run smartctl -a on one of my drives, mfid0 through mfid6, I get the following error:
Code:
/dev/mfid0: To monitor disks on LSI RAID load mfip.ko module and run 'smartctl -a /dev/passX' to show SMART information
Please specify device type with the -d option.


Only 7 drives appear in the GUI, so I was going to get the serials for all the working drives and identify the failed one by elimination. With this error, I am not sure how to go about narrowing down the malfunctioning drive. The server doesn't display any errors on any of the 8 bays, so I'm out of ideas.
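
The error message itself points at one route: load the mfip module and query the /dev/passX devices instead of mfidX. A rough sketch of that follows. The helper names pass_devices and smart_serials and the DRY_RUN switch are invented for illustration, and whether this controller exposes usable pass devices (or needs an extra -d device type for smartctl) is an assumption that has to be checked on the actual hardware.

```shell
# Pull the passN device names out of `camcontrol devlist` output (stdin).
pass_devices() {
    grep -o 'pass[0-9][0-9]*' | sort -u
}

# For each device name on stdin, show the serial number line from smartctl -i.
# Dry run by default: unless DRY_RUN=0, the commands are only echoed.
smart_serials() {
    while read -r dev; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "smartctl -i /dev/$dev"
        else
            smartctl -i "/dev/$dev" | grep -i 'serial number'
        fi
    done
}

# On the FreeNAS box, as root:
# kldload mfip
# camcontrol devlist | pass_devices | DRY_RUN=0 smart_serials
```

If this works, the serials it returns belong to the drives that are still answering, so the one serial found in the chassis but not in this list would be the failed drive.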

I appreciate any suggestions you guys can give me.
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
Open the View Disks tab in the FreeNAS GUI, count up your drives and write down the serial numbers;
you will be missing one. Shut down your FreeNAS, open up the box and find all the serial numbers
you wrote down. The one you don't have a serial number for is your "bad" drive.
If your RAID card does not list the serial numbers in the View Disks tab, you are SOL...
 

CarlB

Dabbler
Joined
Jan 30, 2018
Messages
40
There are no serial numbers in View Disks and the smartctl -a command doesn't work. Guess I am hosed.
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
Find a way to get that data off the pool, then get a proper HBA or a RAID card whose firmware you
can flash to turn it into an HBA. FreeNAS must have DIRECT access to the drives for monitoring and
maintenance of drive health. Good luck!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Agree with what @BigDave is saying, and I think you now see (one reason) why RAID controllers aren't recommended. But does the controller's setup utility show all eight disks, or is one missing from there? You may be able to get the information you need that way.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So if it's a PowerEdge it probably has drive activity lights. Go into the root of your pool. Run "tar cvf - . > /dev/null" to create a tarball that reads all the files, just chucking it into the bitbin. Look at which lights are flickering on the array. The problem drive is the other one.
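
As a minimal sketch of that trick wrapped in a function: the name read_everything is made up, and the /mnt/vol0 path assumes the pool is mounted under the /mnt altroot shown in the zpool output. FreeNAS ships bsdtar. GNU tar is documented to minimize I/O when it detects that the archive is /dev/null, so on a Linux box this may not generate the reads you want.

```shell
# Read every file under a directory and discard the data, so the only
# visible effect is sustained read activity on the disks that hold it.
read_everything() {
    # -C changes into the directory first; "-f -" writes the archive to
    # stdout, which is then thrown away.
    tar -cf - -C "$1" . > /dev/null
}

# On the FreeNAS box (mount point is an assumption):
# read_everything /mnt/vol0
```

Leave it running while you walk over to the server. The bay whose light stays dark while the others flicker is the one to pull.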
 

CarlB

Dabbler
Joined
Jan 30, 2018
Messages
40
jgreco said:
So if it's a PowerEdge it probably has drive activity lights. Go into the root of your pool. Run "tar cvf - . > /dev/null" to create a tarball that reads all the files, just chucking it into the bitbin. Look at which lights are flickering on the array. The problem drive is the other one.
Love this idea. I do see a single disk whose activity light isn't doing anything. I'm scared of the current setup, so I ordered a drive to back up my pool. I'll try to replace this disk once I have things backed up. I'll probably find an HBA card with a SAS connector so I can use the existing PowerEdge HDD rack, then start over with 8 drives in RAIDZ2.

Is ZFS send/receive the most accepted way if I wanted to completely clone my pool with jails to an external HDD backup for total reinstall of FreeNAS?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
CarlB said:
Is ZFS send/receive the most accepted way if I wanted to completely clone my pool with jails to an external HDD backup for total reinstall of FreeNAS?

Yes.
 

CarlB

Dabbler
Joined
Jan 30, 2018
Messages
40
Thanks for all the help, guys. I've got a SAS HBA on the way that doesn't use RAID. This weekend I am making a copy of my current pool using this guide: https://forums.freenas.org/index.ph...te-data-from-one-pool-to-a-bigger-pool.40519/

My plan

1) Copy vol0 to the external HDD using snapshot, send, and receive.
2) Create a new vol0 using the non-RAID HBA.
3) Migrate from the external HDD to the new vol0. This is a little different from the guide, since I am not renaming the external and using it as the new pool. It will just be a temporary home for the data.
4) Detach the external HDD and enjoy having things set up correctly this time.
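
Step 1 of the plan above could look roughly like this. It is a sketch, not the linked guide's exact procedure: the pool name backup for the external HDD, the snapshot name migrate, and the helper names run and replicate_pool are all hypothetical. It defaults to a dry run that only prints the commands.

```shell
# Echo each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || eval "$*"
}

# Replicate pool SRC, with all child datasets, snapshots and properties,
# into DST/SRC. Arguments: source pool, destination pool, snapshot name.
replicate_pool() {
    src=$1
    dst=$2
    snap=${3:-migrate}
    # -r snapshots every dataset in the pool, jails included.
    run "zfs snapshot -r ${src}@${snap}"
    # send -R builds a full replication stream; receive -F overwrites the
    # target if it exists, and -u leaves the received datasets unmounted.
    run "zfs send -R ${src}@${snap} | zfs receive -Fu ${dst}/${src}"
}

# Preview without touching anything:
# replicate_pool vol0 backup
# Then for real:
# DRY_RUN=0 replicate_pool vol0 backup
```

Step 3 would be the same call with the roles swapped, sending from the external pool into the freshly created vol0.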
 

CarlB

Dabbler
Joined
Jan 30, 2018
Messages
40
Just a quick update on this. I actually didn't have to migrate the data, but I made a backup anyway since I wasn't sure.

ZFS was smart enough to ignore the RAID header on these drives and imported them on the new controller without issue.
 