RAID stress test

Status
Not open for further replies.

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
A little help please,

I'm looking to run a RAID stress test to make sure that my hard drives are OK before putting any valuable data onto them.

How would I go about doing that?

I've read that a command like:

Code:
badblocks -wvsb 4096 -p 3 /dev/sda


should run 3 passes of writing data across the whole RAID.

Where do I enter this kind of command to run it? Where do I find the output of the process, and what should I look for to determine the health of the RAID? How can I figure out which drive is faulty, if there is one?
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Anyone know how to make this happen?
 
Joined
May 27, 2011
Messages
566
badblocks only works on block devices (a single hard disk), not on file systems (a ZFS file system).

You'll need to run that command for each hard disk, not the entire pool. To do so, you can either log in at the console, or set up SSH and log in through that.
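As a rough sketch of "one command per disk": the snippet below only prints the commands so they can be reviewed first, it does not run them. The device names ada0 to ada3 are placeholders (an assumption, not taken from anyone's actual system), and the function name is made up for illustration. Remember that -w wipes whatever is on the disk.

```shell
#!/bin/sh
# Sketch: print (do NOT run) one destructive badblocks command per disk.
# Device names are placeholders; substitute the disks in your own box.
# WARNING: the -w test overwrites everything on the target disk.

build_badblocks_cmds() {
    # Emit one command line for each device name given as an argument.
    for disk in "$@"; do
        printf 'badblocks -wvs -b 4096 %s\n' "/dev/$disk"
    done
}

# Example: list the commands, then paste each one into its own SSH session.
build_badblocks_cmds ada0 ada1 ada2 ada3
```

Printing instead of executing is deliberate here, since a typo in a device name would destroy the wrong disk.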
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
This would be a good test to add to FreeNAS maybe. A utilities menu could come in handy.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Yes, I meant the GUI. I'm sure once version 8 is fairly complete and stable, maybe we could work in some good utilities that would help diagnose problems or just make it easy to run a simple test. I had no idea about this badblocks test, but then again, I'm not a Linux/BSD-trained person, and even though I've been exposed to it for a few years, it's still new to me. Many people will use FreeNAS in the home, so we might as well make it as complete as possible.

Right now I have all four of my drives running the test, it's been just over 8 hours since I started and I'm ~55% complete. I assume a failure is obvious.
 

Tekkie

Patron
Joined
May 31, 2011
Messages
353
What does badblocks actually do? :)
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Here is a description of the command. http://linux.about.com/library/cmd/blcmdl8_badblocks.htm

For the example above, it is a destructive test using 4 KB (4096-byte) blocks: it writes patterns to the hard drive and then reads them back to see whether the drive has errors.

I'm going to play around with the -c option and increase it from 16 to 64 since I have a lot of RAM. I'd like to see what speed changes it makes.
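
For anyone reading along, here is my reading of the flags in the command from the first post, going by the badblocks(8) man page. The -o option is not in the original command; I've added it as one way to capture the list of bad blocks in a file. The function name, device and output path are just placeholders, and the snippet prints the command rather than running it.

```shell
#!/bin/sh
# Flag-by-flag reading of the command, per the badblocks(8) man page:
#   -w       destructive write-mode test: writes the patterns 0xaa, 0x55,
#            0xff and 0x00 to every block and reads each one back
#   -v       verbose: report read errors, write errors and corruption counts
#   -s       show rough progress while scanning
#   -b 4096  block size in bytes
#   -p 3     rescan until 3 consecutive scans find no new bad blocks
#   -o FILE  write the list of bad blocks to FILE instead of standard output
#            (added here for illustration; not in the original command)

full_test_cmd() {
    # $1 = device node, $2 = file that will receive the bad-block list
    printf 'badblocks -w -v -s -b 4096 -p 3 -o %s %s\n' "$2" "$1"
}

# Example with placeholder paths; this only prints the command.
full_test_cmd /dev/ada1 /tmp/ada1.bad
```

Note that, as the man page words it, -p is not simply "run N passes": it keeps scanning until that many consecutive scans turn up nothing new.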
 
Joined
May 27, 2011
Messages
566
So are you planning on running this on each individual drive? Let us know how it goes.
 

Tekkie

Patron
Joined
May 31, 2011
Messages
353
Running it now on my 6 drives with -c 64, not sure if it's much faster than with the default -c 16.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I ran this yesterday for a single pass on each drive (opened 4 SSH windows) without error. Now I'm trying larger blocks just because I can, and I'd like to run a few more passes. I don't believe there is anything wrong with my drives, I'm just goofing around right now. I'm still in the testing phase of my FreeNAS box, so I don't have any problem doing whatever it takes to learn something new.

Testing complete!
After some quick testing, the optimal command line for my system is:
Code:
badblocks -wvsb 4096 -c 32 /dev/ada1


I timed each run until it reached 1% complete; not highly accurate, but close enough for this test. These timings are with two drives running at the same time (for comparison, a single drive with -c 32 took 2 min 26 sec):

-c 16 (default) = 4 min 50 sec
-c 32 = 3 min 04 sec
Values above 32 showed no improvement, and I went up to 64K.

So a value of 32 or 64 is very reasonable. A rough approximation shows that the default value of 16 would take approx 8 hours 20 minutes to complete, while a -c value of 32 would take 5 hours 6 minutes. Keep in mind these numbers are rough, but the difference is substantial either way. Your values may be different, but I'd say it's a safe bet you could run a value of 32 or 64, your choice. Read the link I posted above about the command.
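
The estimates above are just the 1% timings multiplied by 100. A small sketch of that arithmetic, in case you want to plug in your own numbers (the function name is made up for illustration, and it assumes the drive keeps the same speed for the whole run, which it probably won't):

```shell
#!/bin/sh
# Extrapolate a full-run estimate from the time taken to reach 1%.
# Assumes constant throughput across the whole disk, so treat it as rough.

estimate_full_run() {
    # $1 = minutes, $2 = seconds measured for the first 1%
    # (pass plain integers without leading zeros, e.g. 4 not 04)
    total=$(( ($1 * 60 + $2) * 100 ))
    printf '%dh %02dm\n' $(( total / 3600 )) $(( (total % 3600) / 60 ))
}

estimate_full_run 4 50   # -c 16 timing
estimate_full_run 3 4    # -c 32 timing
```

Rounding the 4 min 50 sec sample up to 5 minutes is what gives the 8 hours 20 minutes figure; the unrounded sample comes out a little under that.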

-Mark
 
Joined
May 27, 2011
Messages
566
Glad to hear it worked. Since this is your baby, maybe you should add a feature request for it. It would be nice to have in the GUI. It would only work per drive, though, but you seem to have gotten it working to your satisfaction.

Hell, if you had a raidz2, you could remove a single disk, do a read/write test, then re-add it.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I have a lot of features I'd like to request, but right now I'm focusing on letting the developers build a fully functional and stable build. I'll give it some thought, but besides a test of the hard drive via badblocks, what else would you folks like? Please keep in mind I'm looking specifically at troubleshooting/diagnostics tools. And since I'm not a FreeBSD (or Linux variant) expert, a little help with how this would be implemented would go a long way when submitting a feature ticket.

So for badblocks testing we could have a menu: if a drive meets the criteria for being safely removed from a RAID (e.g. a 3-disk RAIDZ), that one drive could be removed and tested, with a warning message of course. Multiple drives could be tested provided they were not assigned. And of course there would be a report with the drive serial number and a Pass or Fail (listing how many blocks failed). Yes, easy stuff.
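
To give an idea of what the Pass/Fail report part could look like, here is a minimal sketch. It assumes the test was run with badblocks -o so the bad blocks end up in a file, one block number per line; the function name, the drive label and the demo data are all made up for illustration and are not real test output.

```shell
#!/bin/sh
# Sketch: turn a bad-block list file (as written by badblocks -o) into a
# one-line Pass/Fail report. An empty file means no bad blocks were listed.

report_result() {
    # $1 = label for the drive (e.g. its serial number)
    # $2 = file holding one bad block number per line
    count=$(grep -c . "$2" || true)   # count non-empty lines
    if [ "$count" -eq 0 ]; then
        printf '%s: PASS\n' "$1"
    else
        printf '%s: FAIL (%s bad blocks)\n' "$1" "$count"
    fi
}

# Demo with a fabricated list of two block numbers (not real results).
tmp=$(mktemp)
printf '1024\n2048\n' > "$tmp"
report_result "SERIAL-PLACEHOLDER" "$tmp"
rm -f "$tmp"
```

How the GUI would look up the real serial number is left open here; that part would need input from someone who knows FreeBSD better than I do.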

Just to let you know, there is a non-destructive test mode as well. I want to give that a shot, and I think I will install a drive I know is failing to see what results it produces. I'd like to know whether badblocks does any repair, or what would need to be done to automate a repair if needed.

For other utilities, please list what you would like. I think a CPU Stress Test is in order. RAM Test as well. These are currently available so I might try to add these to a build in the future.

I didn't want to take over this thread, maybe we should start another one.

EDIT: On second thought, there are better utilities for testing the PC itself. I still like any tools that could be used live on the FreeNAS server, but the RAM and CPU tests are out.
 