Github repository for FreeNAS scripts, including disk burnin and rsync support

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Awesome, thanks a ton, running like a charm now!

Code:
root@freenas:~/disk-burnin-and-testing # ./disk-burnin.sh da0
+-----------------------------------------------------------------------------
+ Started burn-in of /dev/da0 : Thu Jun  8 11:07:07 PDT 2017
+-----------------------------------------------------------------------------
Host: freenas
Drive Model: WDC_WD80EFZX-68UW8N0
Serial Number: VK0W2LRY
Short test duration: 2 minutes
Short test sleep duration: 120 seconds
Extended test duration: 1242 minutes
Extended test sleep duration: 74520 seconds
Log file: ./burnin-WDC_WD80EFZX-68UW8N0_VK0W2LRY.log
Bad blocks file: ./burnin-WDC_WD80EFZX-68UW8N0_VK0W2LRY.bb
+-----------------------------------------------------------------------------
+ Run SMART short test on drive /dev/da0: Thu Jun  8 11:07:07 PDT 2017
+-----------------------------------------------------------------------------
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Thu Jun  8 11:09:07 2017

Use smartctl -X to abort test.
Short test started, sleeping 120 seconds until it finishes



Is there a recommended way to run all 16 drives in parallel, or should I just open a bunch of SSH sessions and kick them off? I'm not well versed enough in shell scripting; I'm assuming a for loop would wait for each one to finish before beginning the next one, which I probably don't want, or it will take a month.
Glad to hear it!

As @danb35 pointed out, tmux is very useful for starting multiple simultaneous burn-in sessions. Search the forum and you'll find detailed instructions on using it.
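
For anyone who hasn't used tmux before, here is a minimal sketch of the idea: one detached tmux session per drive, so every burn-in runs at the same time and keeps running if the SSH connection drops. The drive names and the `burnin-` session prefix are placeholders chosen for illustration, and the function only prints the commands so they can be reviewed before anything touches a disk.

```shell
#!/bin/sh
# Print one "tmux new-session" command per drive name given as an argument.
# -d starts the session detached; -s gives it a name you can attach to later.
print_burnin_commands() {
    for drive in "$@"; do
        printf 'tmux new-session -d -s burnin-%s "./disk-burnin.sh %s"\n' "$drive" "$drive"
    done
}

# Example drive names only; substitute your own.
print_burnin_commands da0 da1 da2
```

Once the printed commands look right, piping the output to `sh` (as root, from the directory that holds disk-burnin.sh) should start the sessions. `tmux ls` lists them and `tmux attach -t burnin-da0` shows the progress of a single drive.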
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Spearfoot updated Github repository for FreeNAS scripts, including disk burnin with a new update entry:

Disk burnin script modified to correctly parse extended test duration for large drives

Summary: Forum member @Mugiwara discovered a bug in the disk-burnin.sh script, which was unable to parse the extended test duration from his 8TB disks. I have modified the script to correct this bug.

Details: The bug had to do with the parentheses delimiting the test durations as they are reported by the smartctl command:
Code:
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		(1242)...

Read the rest of this update entry...
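
To illustrate the kind of parsing involved, here is a sketch that is not taken from disk-burnin.sh. In the excerpt above, the padded short value "(   2)" splits into two whitespace-separated fields while "(1242)" is a single field, so counting fields is fragile. One alternative is to take whatever digits sit between the parentheses on the line after the test name. The function name `parse_polling_minutes` is hypothetical, and it assumes `smartctl -c` style text on standard input.

```shell
#!/bin/sh
# parse_polling_minutes KIND   (KIND is "Short" or "Extended")
# Reads smartctl capability text on stdin and prints the polling time in minutes.
parse_polling_minutes() {
    awk -v kind="$1" '
        # Remember that the line naming the requested test has been seen.
        index($0, kind " self-test routine") == 1 { found = 1; next }
        # On a following line, take the digits between the parentheses.
        found && match($0, /\([ 0-9]+\)/) {
            minutes = substr($0, RSTART + 1, RLENGTH - 2)
            gsub(/ /, "", minutes)   # drop the padding inside the parentheses
            print minutes
            exit
        }
    '
}
```

Usage would look something like `smartctl -c /dev/da0 | parse_polling_minutes Extended`.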
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
One must appreciate your commenting and detail @Spearfoot !
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Spearfoot updated Github repository for FreeNAS scripts, including disk burnin with a new update entry:

Added IPMI support to get_hdd_temp.sh script

I've added IPMI support to the get_hdd_temp.sh script. When enabled, the script uses IPMI to determine the number of socketed CPUs and reports the current temperature for each of them.

This is particularly handy for those of us who run FreeNAS as a virtual machine, as sysctl always returns a CPU temperature of '-1' in this case.

I also tweaked the drive output to display drive capacity and more extensive family and model information.

Kudos to P. Robar, who made...

Read the rest of this update entry...
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Spearfoot updated Github repository for FreeNAS scripts, including disk burnin with a new update entry:

New script to save your configuration file and email it to you as an encrypted tarball

The title pretty much says it all.

I've posted a new script on GitHub (github.com/Spearfoot/FreeNAS-scripts) named save_config_enc.sh.

It's similar to the older save_config.sh in that its primary function is to copy your configuration file onto your pool. But if you set it up to do so, this version goes further and will also:
  • Validate the configuration file with sqlite3 pragma...

Read the rest of this update entry...
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
@Spearfoot thanks for this excellent script. I'd like to suggest an enhancement: support for the -d option in smartmontools. I don't have any available SATA ports to run the burn-in tests, so I'm using a USB dock. When I initially ran the script, I was seeing an "Unknown USB bridge" error. I was able to resolve this by adding:

-d sat

to all the smartctl commands in the script. An option to specify the device type via variable would be a great addition to this, and might help others. For reference, I found this page which lists the appropriate device type options for a variety of supported USB devices:

https://www.smartmontools.org/wiki/Supported_USB-Devices
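
As a rough sketch of what such an option could look like (the variable name `DEVICE_TYPE` and the helper `smartctl_command` are made up for illustration and are not part of the script), the device type can live in one variable and be added to every smartctl call only when it is set. The helper prints the command line instead of running it, so the effect is easy to check first.

```shell
#!/bin/sh
# Hypothetical setting: leave empty to let smartctl auto-detect the device type,
# or set it to a type from the smartmontools USB device list, such as "sat".
DEVICE_TYPE="sat"

# Print the smartctl command line for the given arguments,
# adding "-d TYPE" only when DEVICE_TYPE is non-empty.
smartctl_command() {
    if [ -n "$DEVICE_TYPE" ]; then
        printf '%s\n' "smartctl -d $DEVICE_TYPE $*"
    else
        printf '%s\n' "smartctl $*"
    fi
}

smartctl_command -t short /dev/sdb
```

With the variable empty, the printed command is the same as what the script runs today, so drives on ordinary SATA ports would be unaffected.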
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
My tests finally completed...almost. I noticed the disk extended test failed before badblocks, but the status was "Interrupted (host reset)." I let it do the badblocks stuff, and the same thing happened at the exact same time during the followup extended test. I'm assuming this has something to do with my USB dock or some kind of idle timeout on the drive itself. I'm still investigating that (but any advice would be appreciated).

My main question is...why the script isn't responding after the "sleeping 23880 seconds..." has elapsed. Here's what I'm seeing, from after the badblocks tests completed:

Code:
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  67  -
# 2  Extended offline  Interrupted (host reset)  90%  7  -
# 3  Short offline  Completed without error  00%  0  -

Finished SMART short test on drive /dev/sdb: Wed Aug 16 09:39:05 EDT 2017
+-----------------------------------------------------------------------------
+ Run SMART extended test on drive /dev/sdb: Wed Aug 16 09:39:05 EDT 2017
+-----------------------------------------------------------------------------
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 398 minutes for test to complete.
Test will complete after Wed Aug 16 16:17:05 2017

Use smartctl -X to abort test.
Extended test started, sleeping 23880 seconds until it finishes


The test completed at around 4:17pm, but it's now after 7:00pm and the script is just hanging. I confirmed in a separate window that the extended test did complete (it failed again at 90% due to the same host reset issue), but I was expecting the script to complete. Has anyone seen this behavior?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Or run another script to hit up the drive periodically to prevent the enclosure spinning down

Just reading the SMART data might do it.

Happens enough that this feature should probably be added to the script :)
@Spearfoot
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
Yeah, you'll want to test with something more reliable, like SATA.

Yep, would love to--no SATA ports available currently. I guess I'll try to see if there's something I can configure with the dock. Or maybe this could be some setting on my Linux machine. I realize this isn't the place for support for either of those.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
Or run another script to hit up the drive periodically to prevent the enclosure spinning down

Just reading the SMART data might do it.

Happens enough that this feature should probably be added to the script :)
@Spearfoot

Derp--should have thought about that, thank you.

Any ideas on why the script is hanging though? The extended test failed the same way on the pre-badblocks tests, but the script happily continued on to run badblocks and the post-badblocks smartctl tests.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Derp--should have thought about that, thank you.

Any ideas on why the script is hanging though? The extended test failed the same way on the pre-badblocks tests, but the script happily continued on to run badblocks and the post-badblocks smartctl tests.
Well... to be precise, if the extended test 'failed' it would report errors. The behavior you're seeing is what happens when a SMART test gets interrupted. In your case, it appears that the test was interrupted when it was ~90% complete. Post the full contents of your disk burn-in log file and perhaps we can confirm this.

My suspicion is that these tests take much longer to run because you're connected via a USB dock. Or not. But in any case, the test duration returned by the drive isn't always very accurate. Accurate or not, the script first sleeps for whatever duration the drive reports. It then begins querying the drive at 15 second intervals to check for the test having completed, and will eventually time out in 4 more hours.

So @Stux: it does periodically read the SMART data... but only after having waited for the complete test duration. :p I suppose the code could be restructured to read the SMART data throughout the test... But frankly -- it always works flawlessly for me when I run it on my Ubuntu 'bench' PC. I just burned-in two new 4TB drives last week.

Something is interrupting the SMART extended test at ~90% for @budmannxx, but I don't believe it's this script. Again, post the full contents of your burn-in logs and I can do a better job of troubleshooting.
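
The sleep-then-poll behavior described above can be sketched roughly as follows. This is a simplified illustration, not the code from disk-burnin.sh. The 15-second interval and 4-hour timeout come from the description above, while the function names and the `grep` pattern used to detect a running test are assumptions. For the 398-minute test in this thread, the initial sleep is 398 × 60 = 23880 seconds, followed by up to 14400 more seconds of polling.

```shell
#!/bin/sh
POLL_INTERVAL=15     # seconds between status checks
POLL_TIMEOUT=14400   # stop polling after 4 hours

# Assumed check: succeeds while smartctl still reports a self-test in progress.
test_in_progress() {
    smartctl -a "/dev/$1" | grep -q "of test remaining"
}

# wait_for_selftest DRIVE SLEEP_SECONDS
# Sleep for the duration the drive reported, then poll until done or timed out.
wait_for_selftest() {
    sleep "$2"
    waited=0
    while [ "$waited" -lt "$POLL_TIMEOUT" ]; do
        if ! test_in_progress "$1"; then
            echo "SMART self-test complete"
            return 0
        fi
        sleep "$POLL_INTERVAL"
        waited=$((waited + POLL_INTERVAL))
    done
    echo "Timeout polling for SMART self-test status"
    return 1
}

# Example (needs root and a real drive): wait_for_selftest sdb 23880
```

Note that nothing reads from the drive during the initial sleep, which is the window in which an idle timeout on a USB dock could interrupt the test.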
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
Well... to be precise, if the extended test 'failed' it would report errors. The behavior you're seeing is what happens when a SMART test gets interrupted. In your case, it appears that the test was interrupted when it was ~90% complete. Post the full contents of your disk burn-in log file and perhaps we can confirm this.
Totally agreed. The extended test was interrupted--it didn't fail--at 90% both times.
My suspicion is that these tests take much longer to run because you're connected via a USB dock. Or not. But in any case, the test duration returned by the drive isn't always very accurate. Accurate or not, the script first sleeps for whatever duration the drive reports. It then begins querying the drive at 15 second intervals to check for the test having completed, and will eventually time out in 4 more hours.
Very insightful. I'm running an extended test now to see how long it takes on its own. I was unaware of the 4-hour timeout in the script (upon further review, I see it in the code now). I was within that window when I thought the script was hanging--it just hadn't timed out yet. This explains what I saw and confirms the script is working as intended.
So @Stux: it does periodically read the SMART data... but only after having waited for the complete test duration. :p I suppose the code could be restructured to read the SMART data throughout the test... But frankly -- it always works flawlessly for me when I run it on my Ubuntu 'bench' PC. I just burned-in two new 4TB drives last week.

Something is interrupting the SMART extended test at ~90% for @budmannxx, but I don't believe it's this script. Again, post the full contents of your burn-in logs and I can do a better job of troubleshooting.
Agreed again. The script is NOT causing the interrupt. I think it's a setting either in the USB dock or on my PC. I'm doing as @Stux suggested above and running a command to read data from the drive throughout the extended test to confirm this lets the test complete. If so, I'll try to figure out how to prevent whatever is causing the interrupt.

EDIT: Here are the burn-in logs:
Code:
$ sudo sh disk-burnin.sh sdb
+-----------------------------------------------------------------------------
+ Started burn-in of /dev/sdb : Sun Aug 13 14:33:07 EDT 2017
+-----------------------------------------------------------------------------
Host: mint
Drive Model: WDC_WD30EFRX-68EUZN0
Serial Number: [REDACTED]
Short test duration: 2 minutes
Short test sleep duration: 120 seconds
Extended test duration: 398 minutes
Extended test sleep duration: 23880 seconds
Log file: ./burnin-WDC_WD30EFRX-68EUZN0_[REDACTED].log
Bad blocks file: ./burnin-WDC_WD30EFRX-68EUZN0_[REDACTED].bb
+-----------------------------------------------------------------------------
+ Run SMART short test on drive /dev/sdb: Sun Aug 13 14:33:07 EDT 2017
+-----------------------------------------------------------------------------
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Sun Aug 13 14:35:07 2017

Use smartctl -X to abort test.
Short test started, sleeping 120 seconds until it finishes
SMART self-test complete
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  0  -

Finished SMART short test on drive /dev/sdb: Sun Aug 13 14:35:08 EDT 2017
+-----------------------------------------------------------------------------
+ Run SMART extended test on drive /dev/sdb: Sun Aug 13 14:35:08 EDT 2017
+-----------------------------------------------------------------------------
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 398 minutes for test to complete.
Test will complete after Sun Aug 13 21:13:08 2017

Use smartctl -X to abort test.
Extended test started, sleeping 23880 seconds until it finishes
Timeout polling for SMART self-test status
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Interrupted (host reset)  90%  7  -
# 2  Short offline  Completed without error  00%  0  -

Finished SMART extended test on drive /dev/sdb: Mon Aug 14 01:15:50 EDT 2017
+-----------------------------------------------------------------------------
+ Run badblocks test on drive /dev/sdb: Mon Aug 14 01:15:50 EDT 2017
+-----------------------------------------------------------------------------
Checking for bad blocks in read-write mode
From block 0 to 732566645
Testing with pattern 0xaa: done  
Reading and comparing: done  
Testing with pattern 0x55: done  
Reading and comparing: done  
Testing with pattern 0xff: done  
Reading and comparing: done  
Testing with pattern 0x00: done  
Reading and comparing: done  
Pass completed, 0 bad blocks found. (0/0/0 errors)
Finished badblocks test on drive /dev/sdb: Wed Aug 16 09:37:05 EDT 2017
+-----------------------------------------------------------------------------
+ Run SMART short test on drive /dev/sdb: Wed Aug 16 09:37:05 EDT 2017
+-----------------------------------------------------------------------------
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Wed Aug 16 09:39:05 2017

Use smartctl -X to abort test.
Short test started, sleeping 120 seconds until it finishes
SMART self-test complete
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  67  -
# 2  Extended offline  Interrupted (host reset)  90%  7  -
# 3  Short offline  Completed without error  00%  0  -

Finished SMART short test on drive /dev/sdb: Wed Aug 16 09:39:05 EDT 2017
+-----------------------------------------------------------------------------
+ Run SMART extended test on drive /dev/sdb: Wed Aug 16 09:39:05 EDT 2017
+-----------------------------------------------------------------------------
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 398 minutes for test to complete.
Test will complete after Wed Aug 16 16:17:05 2017

Use smartctl -X to abort test.
Extended test started, sleeping 23880 seconds until it finishes
^C

(The ^C at the end is me killing the script during the 4-hour timeout period, so the log file didn't write and wasn't cleaned up, per the code.)
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Very insightful. I'm running an extended test now to see how long it takes on its own. I was unaware of the 4-hour timeout in the script (upon further review, I see it in the code now). I was within that window when I thought the script was hanging--it just hadn't timed out yet. This explains what I saw and confirms the script is working as intended.

Agreed again. The script is NOT causing the interrupt. I think it's a setting either in the USB dock or on my PC. I'm doing as @Stux suggested above and running a command to read data from the drive throughout the extended test to confirm this lets the test complete. If so, I'll try to figure out how to prevent whatever is causing the interrupt.

Your analysis is completely accurate!

Thanks for posting your log... it shows that the polling timed out after the first extended test:
Code:
...snip...
Use smartctl -X to abort test.
Extended test started, sleeping 23880 seconds until it finishes
Timeout polling for SMART self-test status
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-32-generic] (local build)
...snip...
So I think you're on the right track. As you indicated, the remedy is probably either to disable whatever USB timeout is aborting the extended test or to poll the drive during the entire test instead of sleeping for most of the expected test duration.
 

budmannxx

Contributor
Joined
Sep 7, 2011
Messages
120
Polling the drive periodically did it! My drive has now passed the extended test. I'm still not sure how to change the setting on my dock (or maybe in my OS), but I'm not going to look too much further at that. Let the resilvering begin!

This would come up on a simple Google search, but here's the command I used (on a Linux box) to poll the drive:

while true; do dd if=/dev/XXX of=/dev/null count=1; sleep 120; done

That will run forever. Replace the "XXX" with your device and choose a sleep duration (in seconds) of something less than your drive timeout time (hopefully that's obvious). Hit Ctrl+C to kill it when the test is done.
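
For a variant that stops on its own, here is a small sketch of the same idea with a time limit. The function name `keep_awake` and its arguments are made up for illustration. It reads one block from the device every INTERVAL seconds until roughly TOTAL seconds have passed, then reports how many reads it made.

```shell
#!/bin/sh
# keep_awake DEVICE INTERVAL TOTAL
# Read one block from DEVICE every INTERVAL seconds for about TOTAL seconds.
keep_awake() {
    elapsed=0
    reads=0
    while [ "$elapsed" -lt "$3" ]; do
        # Same read as the one-liner above; dd's status output is discarded.
        dd if="$1" of=/dev/null count=1 2>/dev/null
        reads=$((reads + 1))
        sleep "$2"
        elapsed=$((elapsed + $2))
    done
    echo "$reads reads"
}

# Example (as root): keep_awake /dev/XXX 120 23880
```

Setting TOTAL to the extended test's reported duration in seconds (23880 for the 398-minute test here) should cover the period where the script itself is only sleeping.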

Thanks for everyone's help.
 

svtkobra7

Patron
Joined
Jan 12, 2017
Messages
202
@Spearfoot RE: utility-scripts-for-freenas-and-vmware-esxi ... and at the risk of sounding like a complete noob ...

The readme notes: The SSH service must be enabled on the ESXi host and public key logons configured from the FreeNAS VM where you wish to execute the scripts.

I know how to enable SSH on the ESXi host but I'm not sure how to perform this task: "public key logons configured from the FreeNAS VM where you wish to execute the scripts." Can you provide some guidance here, please? I'd love to deploy these scripts but not knowing how to get past this step has stopped me for a while. If this is detailed somewhere else, and I missed it, my apologies (just let me know).

I look forward to your reply and thanks in advance for your time (I know this is well beneath you!).
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

svtkobra7

Patron
Joined
Jan 12, 2017
Messages
202
  • Thanks for that ... I'm relatively familiar with the users functionality, but was missing one fundamental (which I've overlooked teaching myself for some time, and which isn't described there).
  • I followed the guide here: https://forums.freenas.org/index.ph...ows-clients-using-putty-on-freenas-9-3.34893/
  • I can finally remove SSH / Allow password authentication (slaps head for you) and with a strong p/w it is so much faster to access via Putty now.
  • Now to set up @Spearfoot 's scripts.
  • Much appreciated, sir. :) [I should have spent the time to understand this some time ago ...]
 