Hard Drive Burn-In Testing - Discussion Thread

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
smartctl is not a process like that. You run the command and hit enter. smartctl sends a command to the disk that says 'run a long smart test' and the disk's firmware obliges. The command line is immediately returned while the hard drive does its magical long test.
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
it'll be doing its magical thing while I'm sleeping shortly. :D
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
right, but what happens if you close the ssh session before smartctl -t long finishes?
A key fact that will help you to understand is that a SMART test is a self-test. You tell the disk to do it, and it goes and does it on its own. When you initiate the test, you're told what the polling interval is. What that means to a human is, "don't bother checking for results until at least that much time has elapsed."
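If you do want to peek before the suggested wait has elapsed, the drive reports how much of the self-test is left. A minimal sketch, assuming the capabilities output of `smartctl -c` contains a line such as "90% of test remaining." while a test is running (the helper name `selftest_remaining` and the device name are made up for illustration):

```shell
#!/bin/sh
# Print the percentage of a running self-test that is still to do.
# Reads `smartctl -c` output on stdin; prints nothing if no test is running.
# Assumption: the status text contains a line like "90% of test remaining."
selftest_remaining() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)% of test remaining.*$/\1/p'
}

# Usage (the device name /dev/ada0 is an assumption):
#   smartctl -c /dev/ada0 | selftest_remaining
```

Because the drive itself does the work, this can be run from any later login; nothing needs to stay attached to the session that started the test.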

EDIT: doh, too slow!
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
Also worth noting: because of the way SMART tests are conducted, the results are stored by the drive firmware itself. If you pull a drive and stick it into a brand new machine, you can still view the results of the tests performed on that drive (or maybe only the most recent few are stored; I'm not sure on that part, but that's beside the point).
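To read those stored results back, `smartctl -l selftest` prints the drive's self-test log. A rough sketch of filtering that log for anything that did not finish cleanly, assuming each entry starts with "#" and a clean run reads "Completed without error" (the helper name `selftest_problems` and the device name are hypothetical):

```shell
#!/bin/sh
# Print self-test log entries that did not finish cleanly.
# Reads `smartctl -l selftest` output on stdin.
# Assumption: entries start with "#" and a clean run says
# "Completed without error". Entries for aborted or still-running
# tests are printed as well.
selftest_problems() {
    awk '/^#/ && !/Completed without error/'
}

# Usage (the device name /dev/ada0 is an assumption):
#   smartctl -l selftest /dev/ada0 | selftest_problems
```

No output means every logged test completed without error.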
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
might not be a half bad idea to note in the OP that smartctl tests do not need to be run using tmux, and that they'll keep running even after you close your ssh session.

on another note, my smartctl -a results came back without any disk errors. YAY! now off to badblocks...
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
It's already in the OP, though I suppose I could re-word it a bit.

First of all, the S.M.A.R.T. tests. The first thing that someone unfamiliar with S.M.A.R.T. tests might find strange is the fact that no results are shown when you run the test. The way these tests work is that you initiate the test, it goes off and does its thing, then it records the results for you to check later. So, if this is an initial burn-in test for your entire system, you can initiate tests on all of the drives simultaneously by simply issuing the test command for each drive one after another.
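Issuing the test command for each drive one after another can be sketched as a small loop. This is only an illustration: the function name `start_long_tests` and the `SMARTCTL` override variable are made up here, and the device names are assumptions, so list your own drives first.

```shell
#!/bin/sh
# Start a long self-test on every device given as an argument.
# Each smartctl call returns right away; the drives test themselves.
# SMARTCTL may be set to a different binary or a stub (hypothetical knob).
start_long_tests() {
    for dev in "$@"; do
        "${SMARTCTL:-smartctl}" -t long "$dev"
    done
}

# Usage (device names are assumptions):
#   start_long_tests /dev/ada0 /dev/ada1 /dev/ada2
```

The loop finishes in seconds even though the tests themselves take hours, which is why no results appear when you run it.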
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
so the test is about 15 hours in and I just checked on it. It's showing some data that I didn't see last night and I'm not exactly sure how data is supposed to be presented for badblocks. if someone could let me know, that'd be great:

[root@freenas] ~# tmux attach -t 1
1484278203
1484278204
1484278205
1484278206
1484278207
1484278208
1484278209
1484278210
1484278211
1484278212
1484278213
1484278214
1484278215
90.12% done, 15:24:15 elapsed. (0/0/124480 errors)
----------------------------------------
1484278252
1484278253
1484278254
1484278255
1484278256
1484278257
1484278258
1484278259
89.85% done, 15:21:59 elapsed. (0/0/121248 errors)
----------------------------------------
1484278143
1484278144
1484278145
1484278146
1484278147
90.67% done, 15:21:45 elapsed. (0/0/121904 errors)
----------------------------------------
1484278531
1484278532
1484278533
1484278534
1484278535
88.56% done, 15:21:31 elapsed. (0/0/122216 errors)
----------------------------------------
1484278459
1484278460
1484278461
1484278462
1484278463
88.88% done, 15:21:19 elapsed. (0/0/123072 errors)

edited to add: I did have a zpool set up before running this test. I don't care if it deletes it or not as the only data on the disks is a /home directory. should I have deleted the zpool from the FreeNAS GUI BEFORE running this test? if so, I'll cancel this test (if it's possible), delete the zpool and then rerun it.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Looks like something is writing to the pool, which is no surprise (e.g. FreeNAS does frequent small writes to whichever pool houses the syslog). This will make it very hard to interpret the results of badblocks, so you're probably best off killing this run, detaching the pool, and starting over.
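For reading the output pasted above: the bare numbers are the block numbers badblocks is reporting as bad, and the triple in parentheses is, as far as I understand the e2fsprogs implementation, the count of read, write, and corruption (data comparison) errors. A small sketch that splits one progress line into those counters (the helper name `badblocks_errors` is made up for illustration):

```shell
#!/bin/sh
# Split a badblocks progress line into its three error counters.
# Assumption: the triple is read/write/corruption errors, in that order.
badblocks_errors() {
    printf '%s\n' "$1" |
        sed -n 's|.*(\([0-9]*\)/\([0-9]*\)/\([0-9]*\) errors).*|read=\1 write=\2 corruption=\3|p'
}

# Usage, with a line taken from the output above:
#   badblocks_errors '90.12% done, 15:24:15 elapsed. (0/0/124480 errors)'
#   prints: read=0 write=0 corruption=124480
```

Zero read and write errors next to a large corruption count means the data read back did not match what was written, which is what you would expect if something else was writing to the disks during the run.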
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
yeah, I'm going to do that shortly, remove the pool and then rerun badblocks.

thanks for the reply.
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
0 errors for all drives in both badblocks and the long version of the smartctl test. woot..

I'm not exactly sure how long it took to run badblocks on my 4TB HDD (is there a way to check this?), but for one of my disks, the last time I checked before it finished, it was coming up on its 54th hour of running.
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
I think mine was like 72 hours for the standard 4-pass option on 2TB drives. It takes... a while.

But hey, it's a burn-in test. You wouldn't REALLY want to test them for 20 minutes and call it good, would you?
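On the question of checking how long a run took: one option is to record the wall-clock time around the command. A minimal sketch, assuming a `date` that supports `+%s` (true on FreeBSD and Linux); the helper names `format_elapsed` and `timed` are made up for illustration, and the device name is an assumption.

```shell
#!/bin/sh
# Format a number of seconds as H:MM:SS.
format_elapsed() {
    printf '%d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Run any command and report on stderr how long it took (wall clock),
# passing the command's exit status through.
timed() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $(format_elapsed $(( end - start )))" >&2
    return $status
}

# Usage (DESTRUCTIVE write test; the device name is an assumption):
#   timed badblocks -ws /dev/ada0
```

Run inside tmux, the elapsed line is still there when you reattach after the test has finished.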
 

andrewjs18

Contributor
Joined
Oct 19, 2014
Messages
141
nope! thanks for the guide, btw..very useful.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Is this a valid test for SSDs too?
Absolutely not. I'm not saying you couldn't try, but an SSD has a built-in hardware manager that automatically retires failed memory and remaps the block using the over-provisioned memory; that's what it's for.
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
What joeschmuck said is correct. It's known as wear leveling, and it introduces a lot of weirdness any time you want to do anything involving raw drive access. Another thing you can't reliably do is secure file deletion, so that's a good argument for full drive encryption, especially on SSDs.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Another thing you can't reliably do is secure file deletion, so that's a good argument for full drive encryption, especially on SSDs.
That is not exactly true. The TRIM command will erase a marked block of data when the command is issued, and you can command the drive to do a secure erase operation, but that wipes out all the data on the SSD; it's not selective.
 