Proposed FreeNAS Build

Status
Not open for further replies.
Joined
Jun 26, 2012
Messages
260
Badblocks 51 hours in on 6 3TB WD Reds.
At around 40 hours I noticed 1 drive was slowing down on the % done counter. At this point, I have
57%, 53%, 51%, 49%, 44% and 15% all on the 0x00 pattern.

I saw someone mention in another thread that a long test may have been started accidentally. Is there any way to check which tests are running? I guess I may have kicked one off accidentally when running smartctl -A to check the drive temps while badblocks was running...
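
For anyone landing here with the same question: as far as I know, smartctl -A only reads the attribute table and does not start a test (tests are started with -t), and the "Self-test execution status" field printed by smartctl -c (or -a) reports whether a self-test is currently running. Below is a minimal Python sketch that pulls that field out of saved smartctl output. The function names and the SAMPLE text are illustrative assumptions, not part of any tool.

```python
import re

# Illustrative excerpt of `smartctl -c` output (hypothetical sample text).
SAMPLE = """Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (39240) seconds.
"""

def selftest_status(capabilities_text):
    """Return (code, description) for the self-test status field, or None."""
    lines = capabilities_text.splitlines()
    for i, line in enumerate(lines):
        m = re.match(r"\s*Self-test execution status:\s*\(\s*(\d+)\)\s*(.*)", line)
        if not m:
            continue
        code = int(m.group(1))
        desc = m.group(2).strip()
        # Continuation lines are indented; stop at the next unindented field.
        for cont in lines[i + 1:]:
            if cont[:1].isspace() and cont.strip():
                desc += " " + cont.strip()
            else:
                break
        return code, desc
    return None

def selftest_running(capabilities_text):
    """True if the status text says a self-test is in progress."""
    status = selftest_status(capabilities_text)
    return status is not None and "in progress" in status[1]

if __name__ == "__main__":
    print(selftest_status(SAMPLE))
    print(selftest_running(SAMPLE))
```

To use it, save the output of smartctl -c for each drive to a text file and pass the file contents to selftest_running.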
 
Joined
Jun 26, 2012
Messages
260
OK, almost everything is OK with badblocks.
The last drive that was lagging far behind...when I came up in the morning to check, all drives were done, except that one. It looks like it had restarted the badblocks from the beginning. Maybe as I was scrolling through previous commands I accidentally kicked it off again?

Regardless, I'm carrying on unless someone says I should stop there because something is clearly wrong?!

I shut down and powered back up via IPMI, then ran SMART tests on all the drives and everything looked good. SMART tests (both long and short) and scrubs are scheduled based on CyberJock's guide HERE

Created my pool (6 x 3TB WD Red in RAIDZ2, FreeNAS 9.10.2-U2)

I did my basic setup based on DrKK's guide HERE

Do I need a boot scrub when using an SSD?

I need to deal with the LSI 3008 controller firmware/driver mismatch issue.

Is there a definitive guide to setting up Plex in 9.10 (or similar)? I know it was confusing when I first set it up several years ago. Anyway, I downloaded the version available...and it created a jail already. And...noob question (I think), should I create a dataset to store the media I expect to be using in Plex?


Sometime soon I'll either edit the early posts here or start a new one with pictures.
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
And...noob question (I think), should I create a dataset to store the media I expect to be using in Plex?
Yes and no. Technically no, because the media could very well live within the jail, but it is then subject to total loss if the jail is destroyed (then again, the same is true of destroying a dataset). Bottom line: set up as many datasets as you wish to divvy up your media. It could be just one dataset with Music, Movies and TV folders inside. It could be three datasets with no folders and just that type of media per dataset... *ahem* however, Plex and Emby prefer to have a top-level folder containing each piece of media in its own folder.

Datasets are designed for setting permissions and controlling access to their contents. Therefore, you really only need a "media" dataset containing folders labeled Movies, TV, etc.
 
Joined
Jun 26, 2012
Messages
260
Yes and no. Technically no, because the media could very well live within the jail
What's the pro/con of having the data reside in the jail or not?
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
Okay, done for now. Previous post edited for completeness.
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
What's the pro/con of having the data reside in the jail or not?
Con: You have a slightly higher chance of inadvertently deleting the jail and thus the media. It's much more work to set permissions and access when your data is inside the jail.

Pro: none that outweigh the cons.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Datasets are also useful for setting quotas.

Your media should be in a dataset, not the jail.

The only real question is whether that should be a "share" dataset, "media", or "movies".

FWIW, I find too many datasets to be a pain.

BUT replication schedules are per dataset. So if you want to replicate your documents, and music, but not movies, then music/documents need to be individual datasets.
 
Joined
Jun 26, 2012
Messages
260
So, according to ftp://ftp.supermicro.com/driver/SAS/LSI/3008/Firmware/Previous%20Releases/
there is no version 11. Version 12 it is!
At least it isn't confusing *sarcasm*

Successfully flashed the firmware. Minor heart attack along the way as I thought I might have bricked it.
From the main UEFI menu I was able to start the executable provided with the firmware, BIOS, etc. But when it asked for the firmware file, it could not open it (after clearing out the controller). After that I was getting errors about the controller not being active. Apparently I needed to find the USB stick first, or switch to it and execute from there. Operating from the "fs0" prompt, I ran it again and it was able to open the file it needed. PHEW! Rebooted, and the warning is now gone in the GUI.

For the record, it is firmware version 12 on my LSI 3008 controller. Maybe this was before the offset between firmware and driver versions started? Or have they reset to matching?
 
Joined
Jun 26, 2012
Messages
260
Got my SMB Windows share set up.
Still waiting to get some uninterrupted time to set up Plex and to update the thread with pics of the build.
In the meantime I'm just about to start the HDD burn-in on what will be my cold spare drive and on the 2 drives from my previous build, to see if I can salvage them and use them as a separate mirrored backup pool.

Can I run wdidle3 from the Ultimate Boot CD while all of my other drives are connected?
Or is it vitally important not to have anything else connected when I'm doing that?
 
Joined
Jun 26, 2012
Messages
260
Damn. Woke up to an email from my FreeNAS box:
Device: /dev/ada3, Self-Test Log error count increased from 0 to 1

smartctl -a output. (A forum error forced me to refresh and I lost the post I was in the middle of writing, so this output is from after I started the long test that is currently running, not from before.)
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0
Serial Number:    WD-WCC4N5AHL22L
LU WWN Device Id: 5 0014ee 2b87371f6
Firmware Version: 82.00A82
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon May  8 09:25:05 2017 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (39240) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 394) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   179   179   021    Pre-fail  Always       -       6025
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       8
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       268
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       8
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       3
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       162
194 Temperature_Celsius     0x0022   119   112   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       263         135447376
# 2  Short offline       Completed without error       00%       192         -
# 3  Extended offline    Completed without error       00%       100         -
# 4  Short offline       Completed without error       00%         0         -
# 5  Conveyance offline  Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


RMA time?

I know this is when I want infant mortality to strike, but...Ugh.
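
Side note for anyone who wants to check several drives at once: the part of the output that triggered the email is the self-test log, where entry # 1 shows "Completed: read failure" with an LBA of first error. Here is a rough Python sketch that picks failed entries out of that log. The LOG text is trimmed from the output above, the function name is made up for illustration, and treating any status containing "failure" as a failed test is my own simplification.

```python
import re

# Self-test log excerpt, trimmed from the smartctl -a output in this post.
LOG = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       263         135447376
# 2  Short offline       Completed without error       00%       192         -
# 3  Extended offline    Completed without error       00%       100         -
"""

def parse_selftest_log(text):
    """Parse self-test log rows into dicts (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        if not line.startswith("#"):
            continue  # skip the header and anything that is not a log row
        # Columns are separated by a tab or by two or more spaces.
        fields = [f for f in re.split(r"[ \t]{2,}|\t", line.strip()) if f]
        if len(fields) != 6:
            continue  # unexpected layout, ignore rather than guess
        num, desc, status, remaining, hours, lba = fields
        entries.append({
            "num": int(num.lstrip("#")),
            "test": desc,
            "status": status,
            "remaining_pct": int(remaining.rstrip("%")),
            "lifetime_hours": int(hours),
            "first_error_lba": None if lba == "-" else int(lba),
        })
    return entries

if __name__ == "__main__":
    entries = parse_selftest_log(LOG)
    # Assumption: a status containing "failure" marks a failed test.
    failed = [e for e in entries if "failure" in e["status"]]
    for e in failed:
        print(e["num"], e["test"], e["status"], e["first_error_lba"])
```

Running it over the saved smartctl -a output of each drive gives a quick list of which ones have logged a failed self-test.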
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Damn. Woke up to an email from my FreeNAS box:
Device: /dev/ada3, Self-Test Log error count increased from 0 to 1

RMA time?

I know this is when I want infant mortality to strike, but...Ugh.

Less than 2 weeks of uptime and it's already failing to return data.

RMA it.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Was this drive burned in, the whole shebang, using badblocks?
 
Joined
Jun 26, 2012
Messages
260
Was this drive burned in, the whole shebang, using badblocks?
Yup. Standard 4-pass run w/ badblocks. Finished it early last week. I don't even have any data on the drives yet except a few test files to check my Windows share.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
wow.
 
Joined
Jun 26, 2012
Messages
260
Lol, short and to the point.

I did have one drive that was lagging behind all the others by several hours during the badblocks run, maybe it was this one.
I didn't note which drive it was unfortunately.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Lol, short and to the point.

I did have one drive that was lagging behind all the others by several hours during the badblocks run, maybe it was this one.
I didn't note which drive it was unfortunately.
Damn, that would have been so interesting :)
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
This is troublesome because now you don't know if the lagging drive is the same one throwing the SMART errors. If it isn't, the lagging drive could still be in the pool. RMA the failing one, throw in the spare drive, and when the replacement comes in, burn it in and keep it as your spare. Read the manual for replacing drives.
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
However... You could run badblocks on the drive again to test it one more time. But, why bother, just send it back and ease your mind.
 
Joined
Jun 26, 2012
Messages
260
However... You could run badblocks on the drive again to test it one more time. But, why bother, just send it back and ease your mind.
I'm not waiting around another 50+ hours for badblocks to run. Sending this puppy back.
 