Does FreeNAS access the drives when not in use?

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
Hi,

My backup NAS consists of Western Digital Green drives, as they have proven (to me at least) to be efficient at spinning down and avoiding wear and tear whilst the main machine remains on.

Does FreeNAS have any processes or checks that periodically access the drives whilst running? Could it be doing anything to cause the drives to spin up when they're not actually needed for data access?

Regards,
Rob.
 

marbus90

Guru
Joined
Aug 2, 2014
Messages
818
A standard FreeNAS install accesses the drives at least every 10 seconds. That means that your drives are spinning down and being woken up 2 seconds later. To avoid this with FreeNAS 9.3 you could redirect the .system dataset to an SSD, which keeps collectd off the storage disks.

To be sure: can you post the smartctl -x output for each of your drives in code-tags?
 

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
Wow, that was painful. Is there an easier way of doing that than copying and pasting each section via the web command shell?
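
One way to avoid the copy and paste is to script the collection and write each drive's report to a file. A minimal sketch, assuming Python 3 is available wherever smartctl runs and that the four drives appear as /dev/ada0 through /dev/ada3 (only /dev/ada0 is confirmed in this thread); the function names and output file names are made up for illustration:

```python
import subprocess


def smartctl_command(device):
    """Build the argument list for `smartctl -x` on one device."""
    return ["smartctl", "-x", device]


def device_paths(count, prefix="/dev/ada"):
    """Return paths such as /dev/ada0../dev/ada3 (the naming is an assumption)."""
    return [f"{prefix}{i}" for i in range(count)]


def dump_smart_reports(devices, out_dir="."):
    """Run smartctl -x on each device and save one text file per drive."""
    for dev in devices:
        # smartctl uses its exit status as a bit mask, so a non-zero code
        # does not necessarily mean the command failed; check=True is not used.
        result = subprocess.run(smartctl_command(dev), capture_output=True, text=True)
        name = dev.strip("/").replace("/", "_") + ".txt"  # e.g. dev_ada0.txt
        with open(f"{out_dir}/{name}", "w") as fh:
            fh.write(result.stdout)


# Usage (needs smartctl installed and sufficient privileges):
# dump_smart_reports(device_paths(4))
```

The resulting text files can then be attached or pasted one at a time.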

/dev/ada0

Code:
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p8 amd64] (local build)      
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org    
                                                                               
=== START OF INFORMATION SECTION ===                                            
Model Family:     Western Digital Caviar Green (AF, SATA 6Gb/s)                
Device Model:     WDC WD30EZRX-22D8PB0                                          
Serial Number:    WD-WCC4N4PE53X0                                              
LU WWN Device Id: 5 0014ee 20b1cb569                                            
Firmware Version: 80.00A80                                                      
User Capacity:    3,000,592,982,016 bytes [3.00 TB]                            
Sector Sizes:     512 bytes logical, 4096 bytes physical                        
Rotation Rate:    5400 rpm                                                      
Device is:        In smartctl database [for details use: -P show]              
ATA Version is:   ACS-2 (minor revision not indicated)                          
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)                        
Local Time is:    Mon Feb  2 18:32:08 2015 GMT                                  
SMART support is: Available - device has SMART capability.                      
SMART support is: Enabled                                                      
AAM feature is:   Unavailable                                                  
APM feature is:   Unavailable                                                  
Rd look-ahead is: Enabled                                                      
Write cache is:   Enabled                                                      
ATA Security is:  Disabled, frozen [SEC2]  
Wt Cache Reorder: Enabled                                                      
                                                                               
=== START OF READ SMART DATA SECTION ===                                        
SMART overall-health self-assessment test result: PASSED                        
                                                                               
General SMART Values:                                                          
Offline data collection status:  (0x80) Offline data collection activity        
                                        was never started.                      
                                        Auto Offline Data Collection: Enabled.  
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever  
                                        been run.                              
Total time to complete Offline                                                  
data collection:                (41760) seconds.                                
Offline data collection                                                        
capabilities:                    (0x7b) SMART execute Offline immediate.        
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new    
                                        command.                                
                                        Offline surface scan supported.        
                                        Self-test supported.                    
                                        Conveyance Self-test supported. 
                                        Selective Self-test supported.          
SMART capabilities:            (0x0003) Saves SMART data before entering        
                                        power-saving mode.                      
                                        Supports SMART auto save timer.        
Error logging capability:        (0x01) Error logging supported.                
                                        General Purpose Logging supported.      
Short self-test routine                                                        
recommended polling time:        (   2) minutes.                                
Extended self-test routine                                                      
recommended polling time:        ( 418) minutes.                                
Conveyance self-test routine                                                    
recommended polling time:        (   5) minutes.                                
SCT capabilities:              (0x7035) SCT Status supported.                  
                                        SCT Feature Control supported.          
                                        SCT Data Table supported.              
                                                                               
SMART Attributes Data Structure revision number: 16                            
Vendor Specific SMART Attributes with Thresholds:                              
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE          
  1 Raw_Read_Error_Rate     POSR-K   100   253   051    -    0                  
  3 Spin_Up_Time            POS--K   100   253   021    -    0                  
  4 Start_Stop_Count        -O--CK   100   100   000    -    2                  
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0        
  7 Seek_Error_Rate         -OSR-K   100   253   000    -    0                  
  9 Power_On_Hours          -O--CK   100   100   000    -    2                  
 10 Spin_Retry_Count        -O--CK   100   253   000    -    0                  
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0                  
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2                  
192 Power-Off_Retract_Count -O--CK   200   200   000    -    0                  
193 Load_Cycle_Count        -O--CK   200   200   000    -    205                
194 Temperature_Celsius     -O---K   118   118   000    -    32                
196 Reallocated_Event_Count -O--CK   200   200   000    -    0                  
197 Current_Pending_Sector  -O--CK   200   200   000    -    0                  
198 Offline_Uncorrectable   ----CK   100   253   000    -    0                  
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0                  
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0                  
                            ||||||_ K auto-keep                                
                            |||||__ C event count                              
                            ||||___ R error rate                                
                            |||____ S speed/performance                        
                            ||_____ O updated online                            
                            |______ P prefailure warning                        
                                                                               
General Purpose Log Directory Version 1                                        
SMART           Log Directory Version 1 [multi-sector log support]              
Address    Access  R/W   Size  Description 
0x00       GPL,SL  R/O      1  Log Directory                                    
0x01           SL  R/O      1  Summary SMART error log                          
0x02           SL  R/O      5  Comprehensive SMART error log                    
0x03       GPL     R/O      6  Ext. Comprehensive SMART error log              
0x06           SL  R/O      1  SMART self-test log                              
0x07       GPL     R/O      1  Extended self-test log                          
0x09           SL  R/W      1  Selective self-test log                          
0x10       GPL     R/O      1  NCQ Command Error log                            
0x11       GPL     R/O      1  SATA Phy Event Counters                          
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log                        
0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log                      
0xa8-0xb7  GPL,SL  VS       1  Device vendor specific log                      
0xbd       GPL,SL  VS       1  Device vendor specific log                      
0xc0       GPL,SL  VS       1  Device vendor specific log                      
0xc1       GPL     VS      93  Device vendor specific log                      
0xe0       GPL,SL  R/W      1  SCT Command/Status                              
0xe1       GPL,SL  R/W      1  SCT Data Transfer                                
                                                                               
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)                  
No Errors Logged                                                                
                                                                               
SMART Extended Self-test Log Version: 1 (1 sectors)                            
No self-tests have been logged.  [To run self-tests, use: smartctl -t] 
SMART Selective self-test log data structure revision number 1                  
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS                                    
    1        0        0  Not_testing                                            
    2        0        0  Not_testing                                            
    3        0        0  Not_testing                                            
    4        0        0  Not_testing                                            
    5        0        0  Not_testing                                            
Selective self-test flags (0x0):                                                
  After scanning selected spans, do NOT read-scan remainder of disk.            
If Selective self-test is pending on power-up, resume after 0 minute delay.    
                                                                               
SCT Status Version:                  3                                          
SCT Version (vendor specific):       258 (0x0102)                              
SCT Support Level:                   1                                          
Device State:                        Active (0)                                
Current Temperature:                    32 Celsius                              
Power Cycle Min/Max Temperature:     23/32 Celsius                              
Lifetime    Min/Max Temperature:     22/32 Celsius                              
Under/Over Temperature Limit Count:   0/0                                      
                                                                               
SCT Temperature History Version:     2                                          
Temperature Sampling Period:         1 minute  
Temperature Logging Interval:        1 minute                                  
Min/Max recommended Temperature:      0/60 Celsius                              
Min/Max Temperature Limit:           -41/85 Celsius                            
Temperature History Size (Index):    478 (158)                                  
                                                                               
Index    Estimated Time   Temperature Celsius                                  
 159    2015-02-02 10:35     ?  -                                              
 ...    ..(318 skipped).    ..  -                                              
   0    2015-02-02 15:54     ?  -                                              
   1    2015-02-02 15:55    22  ***                                            
   2    2015-02-02 15:56    22  ***                                            
   3    2015-02-02 15:57     ?  -                                              
   4    2015-02-02 15:58    23  ****                                            
 ...    ..(  2 skipped).    ..  ****                                            
   7    2015-02-02 16:01    23  ****                                            
   8    2015-02-02 16:02    24  *****                                          
 ...    ..(  5 skipped).    ..  *****                                          
  14    2015-02-02 16:08    24  *****                                          
  15    2015-02-02 16:09    25  ******                                          
 ...    ..(  2 skipped).    ..  ******                                          
  18    2015-02-02 16:12    25  ******                                          
  19    2015-02-02 16:13    26  *******                                        
 ...    ..(  2 skipped).    ..  *******              
  22    2015-02-02 16:16    26  *******                                        
  23    2015-02-02 16:17    27  ********                                        
 ...    ..(  3 skipped).    ..  ********                                        
  27    2015-02-02 16:21    27  ********                                        
  28    2015-02-02 16:22    28  *********                                      
 ...    ..( 12 skipped).    ..  *********                                      
  41    2015-02-02 16:35    28  *********                                      
  42    2015-02-02 16:36    29  **********                                      
 ...    ..( 22 skipped).    ..  **********                                      
  65    2015-02-02 16:59    29  **********                                      
  66    2015-02-02 17:00    30  ***********                                    
  67    2015-02-02 17:01    29  **********                                      
 ...    ..(  2 skipped).    ..  **********                                      
  70    2015-02-02 17:04    29  **********                                      
  71    2015-02-02 17:05    30  ***********                                    
  72    2015-02-02 17:06    29  **********                                      
  73    2015-02-02 17:07    30  ***********                                    
  74    2015-02-02 17:08    29  **********                                      
  75    2015-02-02 17:09    30  ***********                                    
 ...    ..( 11 skipped).    ..  ***********                                    
  87    2015-02-02 17:21    30  ***********                                    
  88    2015-02-02 17:22    31  ************                                    
  89    2015-02-02 17:23    30  ***********    
  90    2015-02-02 17:24    30  ***********                                    
  91    2015-02-02 17:25    31  ************                                    
  92    2015-02-02 17:26    30  ***********                                    
 ...    ..(  2 skipped).    ..  ***********                                    
  95    2015-02-02 17:29    30  ***********                                    
  96    2015-02-02 17:30    31  ************                                    
  97    2015-02-02 17:31    31  ************                                    
  98    2015-02-02 17:32    30  ***********                                    
  99    2015-02-02 17:33    31  ************                                    
 ...    ..( 10 skipped).    ..  ************                                    
 110    2015-02-02 17:44    31  ************                                    
 111    2015-02-02 17:45    32  *************                                  
 112    2015-02-02 17:46    31  ************                                    
 ...    ..(  2 skipped).    ..  ************                                    
 115    2015-02-02 17:49    31  ************                                    
 116    2015-02-02 17:50    32  *************                                  
 117    2015-02-02 17:51    31  ************                                    
 ...    ..(  2 skipped).    ..  ************                                    
 120    2015-02-02 17:54    31  ************                                    
 121    2015-02-02 17:55    32  *************                                  
 122    2015-02-02 17:56    31  ************                                    
 123    2015-02-02 17:57    31  ************                                    
 124    2015-02-02 17:58    32  *************  
 ...    ..( 34 skipped).    ..  *************                                  
 159    2015-02-02 18:33    32  *************                                  
                                                                               
SCT Error Recovery Control command not supported                                
                                                                               
Device Statistics (GP Log 0x04) not supported                                  
                                                                               
SATA Phy Event Counters (GP Log 0x11)                                          
ID      Size     Value  Description                                            
0x0001  2            0  Command failed due to ICRC error                        
0x0002  2            0  R_ERR response for data FIS                            
0x0003  2            0  R_ERR response for device-to-host data FIS              
0x0004  2            0  R_ERR response for host-to-device data FIS              
0x0005  2            0  R_ERR response for non-data FIS                        
0x0006  2            0  R_ERR response for device-to-host non-data FIS          
0x0007  2            0  R_ERR response for host-to-device non-data FIS          
0x0008  2            0  Device-to-host non-data FIS retries                    
0x0009  2            6  Transition from drive PhyRdy to drive PhyNRdy          
0x000a  2            6  Device-to-host register FISes sent due to a COMRESET    
0x000b  2            0  CRC errors within host-to-device FIS                    
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC        
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC    
0x8000  4         9427  Vendor specific        



Forum wouldn't let me post all four drive outputs, but they are all the same make/model/capacity.

Can't say I understand most of this. Let me know if you need anything else.

Regards,
Rob.
 

marbus90

Guru
Joined
Aug 2, 2014
Messages
818
From looking at
193 Load_Cycle_Count -O--CK 200 200 000 - 205
the raw value seems okay to me - I was worried that you'd have racked up an LCC of 200k-600k already. wdidle'd?
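
For anyone who wants to pull that number out of the report automatically, a row of the attribute table can be split on whitespace. A small sketch, assuming the column layout shown in the smartctl -x output above and a raw value that is a plain integer (other drives may print extra text there); the function name is hypothetical:

```python
def parse_smart_attribute(line):
    """Split one row of the smartctl -x attribute table into named fields."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "flags": parts[2],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "fail": parts[6],
        "raw": int(parts[7]),  # assumes a plain integer raw value
    }


row = "193 Load_Cycle_Count -O--CK 200 200 000 - 205"
attr = parse_smart_attribute(row)
print(attr["name"], attr["raw"])  # Load_Cycle_Count 205
```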
 

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
So should I still consider your previous suggestion, or do you think these drives won't get spun up that often? Or (given this is for scheduled backups) should I consider powering the device down and having it set on a timer to start up?

Regards,
Rob.
 

marbus90

Guru
Joined
Aug 2, 2014
Messages
818
The drives don't seem to be spinning down anyway. Or that system is just a few hours to days old.

In general it isn't that feasible to have the drives spin down anyway. If you really want it: either get an SSD for the .system dataset or power off the system when not in use.

Paranoid level would include unplugging the backup server from the mains during that period to avoid having the primary and the backup storage killed by surges/dead UPS etc.
 

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
Thanks for the advice, I'll have to see how the device fares over the next few weeks. And yes, the system is less than a day old.

Regards,
Rob.
 

marbus90

Guru
Joined
Aug 2, 2014
Messages
818
Then I'd recommend you wdidle them right away to prevent the drives spinning down on their own at 8-second intervals. AFAIK the 300-second setting works way better with ZFS.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
I don't think wdidle has anything to do with spinning down every 8 seconds (which would contribute to Start/Stop Count). It's parking the heads after 8 seconds of inactivity (which contributes to Load Cycle Count). WDidle allows you to change that so they park after up to 300 seconds of inactivity instead of 8 seconds.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Hold it. Let's get a few things straight:

Drives don't just spin down on their own by default. This is the recommended scenario.
Greens park their heads after 8 seconds to save power. This leads to a rapidly-increasing Load Cycle Count (drives are rated for 300 000, IIRC) with typical NAS workloads. This can be fixed with wdidle.
By default, there's a somewhat constant stream of .system dataset writes to the pool, making it hard to get the drives to sleep/whatever.
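
To see why the idle timer matters, here is a deliberately crude model: it assumes the pool is touched at perfectly regular intervals and that the heads park whenever the gap between accesses exceeds the timer. The 10-second interval is the figure quoted earlier in this thread; the function name is made up. Real access patterns are less regular, so the numbers are illustrative only:

```python
def head_parks_per_hour(access_interval_s, idle_timer_s):
    """Head parks per hour for perfectly regular accesses (a crude model)."""
    if idle_timer_s >= access_interval_s:
        # The next access always arrives before the idle timer expires.
        return 0.0
    # Otherwise the heads park once in every gap between accesses.
    return 3600.0 / access_interval_s


print(head_parks_per_hour(10, 8))    # 360.0
print(head_parks_per_hour(10, 300))  # 0.0
```

The point is only that a timer longer than the usual gap between accesses stops the constant parking.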

Now, 100 Load Cycles per hour is very high. Run wdidle on all drives ASAP.

And please, burn in the drives before putting data on them.
 

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
I left the system running overnight, and checked the values again this morning:

Code:
  4 Start_Stop_Count        -O--CK   100   100   000    -    2
  9 Power_On_Hours          -O--CK   100   100   000    -    18
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2
193 Load_Cycle_Count        -O--CK   200   200   000    -    1501


~83 cycles per hour, so at ~300000 lifetime, that's ~3614 hours (~150 days). This is obviously growing fast even without any data or activity on it. I have switched the device off until I can make further decisions.
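
That projection can be reproduced in a few lines of Python. This is a rough sketch: it assumes the load cycle rate stays constant and takes the 300,000-cycle rating quoted from memory earlier in the thread at face value; the function name is made up for illustration.

```python
def load_cycle_projection(load_cycles, power_on_hours, rated_cycles=300_000):
    """Return (cycles per hour, hours from new until rated_cycles is reached)."""
    rate = load_cycles / power_on_hours
    return rate, rated_cycles / rate


rate, hours = load_cycle_projection(load_cycles=1501, power_on_hours=18)
print(f"{rate:.1f} cycles/h, {hours:.0f} h, {hours / 24:.0f} days")
# prints: 83.4 cycles/h, 3598 h, 150 days
```

The hour count comes out slightly below the ~3614 above because the rate is not rounded down to 83 first.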

Thanks for pointing out wdidle, that's caused me to do more research on the issues with WD Green vs Red drives, not something I had come across previously.

I'm actually beginning to feel that FreeNAS simply isn't suitable for home storage solutions. The system this was to replace is an Ubuntu + Samba + ext3/4 install that didn't use RAID at all and instead used LVM to manage the data across the disks. That system has had 4 WD Green drives in for the last 4 years, and has never encountered data corruption or hard drive failures.

I initially thought I would prefer FreeNAS for its web interface to simplify setting up storage space, automate backups, and set up permissions. However, my research since trying to get this up and running has led me to the following conclusions:

  • The ZFS file system seems to be more prone to data corruption due to memory hiccups, promoting use of ECC RAM which is not typically used in home systems.
  • As discussed here, FreeNAS seems to periodically "activate" drives, causing additional (unnecessary) wear and tear on the drives.
  • The web admin GUI periodically hangs on me until the NAS is rebooted (see https://forums.freenas.org/index.php?threads/login-lock-up.27310/)

Please feel free to dispute any of this, but I feel that FreeNAS is intended for large scale enterprise hardware setups, not home usage. I am starting to think I may ditch FreeNAS in favour of the tried and tested approach, even if it is a bit more painful to set up.

Thanks for all the help.

Regards,
Rob.
 

ipsum

Dabbler
Joined
Dec 15, 2014
Messages
29
(...) my research since trying to get this up and running has led me to the following conclusions:
  • The ZFS file system seems to be more prone to data corruption due to memory hiccups, promoting use of ECC RAM which is not typically used in home systems.
  • As discussed here, FreeNAS seems to periodically "activate" drives, causing additional (unnecessary) wear and tear on the drives.
  • The web admin GUI periodically hangs on me until the NAS is rebooted (see https://forums.freenas.org/index.php?threads/login-lock-up.27310/)
Please feel free to dispute any of this, (...)

this-is-gonna-be-good.gif


Seriously though, chances are you have experienced silent corruption on your 4 year old system without even knowing it. To each his/her own :)

Just because ECC is not typically used in your so-called home systems, does not mean it should not be.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Yeah, you'll find lots of posts detailing the virtues of FreeNAS/ZFS on this forum.

It is true that drives stay spinning in FreeNAS by default, but it is not that hard to make them spin down.
https://forums.freenas.org/index.php?threads/how-to-let-drives-spin-down.26314/
However, the reigning theory (I'm not aware of clear data either way) in the FreeNAS world seems to be that drives last longer if they stay spinning than if they start/stop, maybe even once a day.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The ZFS file system seems to be more prone to data corruption due to memory hiccups, promoting use of ECC RAM which is not typically used in home systems.
That's not really correct. All filesystems are vulnerable to data corruption from any of a number of sources, and most have no mechanism to even locate, much less correct, any corruption. ZFS has robust mechanisms to ensure data integrity, but using it without ECC memory is like putting a screen door on a bank vault.
As discussed here, FreeNAS seems to periodically "activate" drives, causing additional (unnecessary) wear and tear on the drives.
Leaving the disks spun up does not cause appreciable wear and tear. Constantly parking and unparking the heads, as WD Green drives will do by default, does. This is behavior that can be easily modified by a WD-sanctioned tool, if you choose to use WD Greens (as many people do). If you believe (contrary to any available evidence) that leaving your disks spun up is bad for them, you'll probably be better off with a system other than FreeNAS. There are workarounds, but you seem to be starting from a faulty premise.
The web admin GUI periodically hangs on me until the NAS is rebooted (see https://forums.freenas.org/index.php?threads/login-lock-up.27310/)
This is not normal. I don't know off the top of my head what the problem is, but it definitely isn't normal or expected behavior.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
I find FreeNAS fantastic for a home server solution.

Good luck with your server peridian. Ignorance is bliss.......
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm with jailer.. ignorance is bliss.

ZFS can identify and correct errors. The problem is that throughout your life you've probably had errors and there was no way to identify the cause. ZFS can identify (and correct) this kind of problem.

This reminds me of a situation I had last week with someone. He was using TCP for NFS and getting errors, so he decided to try UDP (in case you didn't know, UDP doesn't really have any kind of error correction). He was then astonished that the errors went away, concluded that TCP must be buggy, and wanted me to help him find the bug in the TCP implementation of NFS. The reality is that he took out the error correction mechanism that reported his problem to begin with. Without an identifiable error condition you'll have no error. When I explained to him that he had removed the error correction, hence no error, he laughed. He also came back later and found that he was still able to prove there were problems using other metrics. Whoops.

I would agree that FreeNAS isn't for your "typical home" setup. It's for those people that are neurotic about their data and want to protect it long-term and are willing to pay for extras like server-grade parts and ECC RAM to ensure their data is safe. If that expense is too much for you, that's fine. It's your data and your choice how to proceed. For others it's a way to enjoy FreeBSD stability without a 2-year degree in FreeBSD.

Do note though that the wdidle thing will still be required on Linux, so don't think that switching OSes will get you out of the woods with that. It surely won't, and you'll figure that out soon enough. Just Google around and you'll see for yourself. In fact I think there's a Linux version of wdidle out there that someone hacked together. ;)

As I noted in the thread you wrote (I happened to reply just seconds before reading this thread) you aren't using recommended hardware. So I'm not the least bit surprised that you aren't having a great experience with it. That's literally par for the course. We've got lots of people that don't use recommended hardware. Statistically they are much much much more likely to have problems than those that take our recommendations seriously. We're not building a desktop. We're not building a gamer system. We're building a server. The Z87 chipset is NOT appropriate for a server.
 

peridian

Dabbler
Joined
Feb 2, 2015
Messages
13
Seriously though, chances are you have experienced silent corruption on your 4 year old system without even knowing it.

Maybe so, but to date I have not actually encountered problems/lost data resulting from that.

Just because ECC is not typically used in your so-called home systems, does not mean it should not be.

Agreed, but the vast majority of consumer motherboards tend to be non-ECC, and since I had never had any data issues with memory previously, it did not occur to me that I might require a more reliable RAM solution.

It is true that drives stay spinning in FreeNAS by default, but it is not that hard to make them spin down.

Thanks for the link, perhaps spin-down was the wrong term. "Reduce activity" to save on wear and tear, which is what the WD Green idle state seems to be trying to achieve.

However, the reigning theory (I'm not aware of clear data either way) in the FreeNAS world seems to be that drives last longer if they stay spinning than if they start/stop, maybe even once a day.

That's taken up some of my time researching. I can't find any statistical evidence on the web weighing up lifespan reduction from being left running vs that done by start/stop events at differing frequencies. Would be helpful to know.

ZFS has robust mechanisms to ensure data integrity...

ZFS can identify and correct errors...

Agreed, and I like that fact, however by "more prone to data corruption due to memory hiccups" I meant that ZFS appears to have a critical dependency on the memory state of the system.

An ext4 formatted drive may sit there quietly with infrequent access, and if a memory glitch occurs it's more likely to affect the running system than the data held (granted unless the memory space in question is holding data about to be written to the disk).

However, ZFS appears to be continuously checking data on the disks to protect them, but does so assuming that the memory space is 100% uncorrupted; a memory glitch while running ZFS can result in data at rest being corrupted (I've read posts about entire storage spaces being lost, not sure how accurate that is).

Granted, the ext4 disk will be more prone to all manner of data corruption problems, same as NTFS or other journaling systems, so I think "self-healing" ZFS-like file systems are the way to go to protect data. I just had not realised the critical dependency on using ECC RAM.

Leaving the disks spun up does not cause appreciable wear and tear. Constantly parking and unparking the heads, as WD Green drives will do by default, does. This is behavior that can be easily modified by a WD-sanctioned tool, if you choose to use WD Greens (as many people do). If you believe (contrary to any available evidence) that leaving your disks spun up is bad for them, you'll probably be better off with a system other than FreeNAS. There are workarounds, but you seem to be starting from a faulty premise.

Indeed, my faulty premise was that the drives spun themselves down. I clearly had not understood the true nature of the idle mode for these drives.
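
For anyone wanting to put a number on the head parking discussed above: SMART exposes it as the Load_Cycle_Count attribute, alongside Power_On_Hours. Below is a minimal Python sketch, assuming the attribute table layout printed by `smartctl -A` and assuming the raw value is a plain integer in the last column (some drives format it differently). The function names are made up for illustration.

```python
from typing import Optional


def smart_raw_value(report: str, attribute: str) -> Optional[int]:
    """Return the raw value of a named SMART attribute, or None if absent.

    Assumption: attribute rows start with a numeric ID, then the attribute
    name, and end with a plain-integer raw value.
    """
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].isdigit() and fields[1] == attribute:
            try:
                return int(fields[-1])
            except ValueError:
                return None  # raw value is not a plain integer on this drive
    return None


def load_cycles_per_hour(report: str) -> Optional[float]:
    """Average head load/unload cycles per powered-on hour, if both are known."""
    cycles = smart_raw_value(report, "Load_Cycle_Count")
    hours = smart_raw_value(report, "Power_On_Hours")
    if cycles is None or not hours:
        return None  # missing attribute, or zero hours
    return cycles / hours
```

Pass it the text output of `smartctl -A /dev/ada0` (for example captured with `subprocess.run(..., capture_output=True, text=True).stdout`). A rate that keeps climbing while the box sits idle would suggest the heads are parking and unparking repeatedly.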

This is not normal. I don't know off the top of my head what the problem is, but it definitely isn't normal or expected behavior.

I know, happens about every other day for me. I'm wondering if it's a browser compatibility thing rather than a problem with the web host.

The problem is that throughout your life you've probably had errors and there was no way to identify the cause. ZFS can identify (and correct) this kind of problem.

None that lost me data (touch laminated substance this monitor is parked on).

Do note though that the wdidle thing will still be required on Linux, so don't think that switching OSes will get you out of the woods with that.

Why? My last storage backup was Linux and had WD Greens; as per above, it has lasted 4 years. Its OS was mounted on a separate HDD, and the Greens only held data backups, so they were not being periodically poked by running processes.

I admit that the problem is not really a FreeNAS issue but my use of WD Greens in a system that constantly performs checks on the data, which is a necessary part of how it protects that data. I picked the Greens because of the success of my previous drives, but am now starting to think I should have gotten different ones.

I haven't actually tried the original suggestion for redirecting the .system data to a separate drive, although I note several other posts around from people who have done so.

As I noted in the thread you wrote (I happened to reply just seconds before reading this thread) you aren't using recommended hardware. So I'm not the least bit surprised that you aren't having a great experience with it. That's literally par for the course. We've got lots of people that don't use recommended hardware. Statistically they are much much much more likely to have problems than those that take our recommendations seriously.

Unfortunately I read the FAQ on the FreeNAS site and didn't really research it beyond that. As you noted in your other post (I'll respond separately), the real gotcha for me here was the non-ECC RAM. Running wdidle on the drives would not have been a big issue, but the potential to lose swathes of data from a memory glitch is a biggie.


Thanks, I ended up building a bootable FreeDOS USB for this.

I would agree that FreeNAS isn't for your "typical home" setup. It's for those people that are neurotic about their data and want to protect it long-term and are willing to pay for extras like server-grade parts and ECC RAM to ensure their data is safe.

And that's the kicker, at the time I purchased hardware based on previous recommendations for building a home server with Linux. FreeNAS (or ZFS more specifically) has its own set of requirements.


Thank you everybody for your feedback, this has been an enlightening day.

For the time being I'm going to revert to an Ubuntu/ext4 USB system just so that I can use the drives as is for the near future without killing them in the process.

It's not a good long term solution, but I think realistically I'm going to have to swap out some of my hardware to ensure the reliability of the ZFS system before I try to use it in anger. *sigh* let the savings begin...

Thanks again,
Rob.
 

demon

Contributor
Joined
Dec 6, 2014
Messages
117
Agreed, but the vast majority of consumer motherboards tend to be non-ECC, and since I had never had any data issues with memory previously, it did not occur to me that I might require a more reliable RAM solution.

No one ever does, until they do. The problem is, you can have bad RAM silently chewing up your data for who-knows-how-long on a non-ECC-equipped system. Yes, data at rest won't be corrupted on that Ubuntu/ext4 setup (because unlike with ZFS, the filesystem won't be doing the periodic scrubs), but then you'll also have no way of knowing that something's gone terribly wrong until fsck or the kernel coughs up a lung because the filesystem structure is chewed up almost beyond recognition (I've dealt with chewed-up filesystems - like can't mount, fsck can't complete, debugfs barely works damaged - before, it can happen, anyone who says it can't is nuts). If that happens, ZFS or no, you can probably kiss that data goodbye, barring expensive data-recovery services.

ECC memory isn't more reliable, it just gives you a way of knowing that something's gone wrong before something horrific happens. Really, all computers should use ECC - but they don't, because it's cheaper to replace the cheap, crappy computer when it goes sideways. The real question is, how much is the data that lives on that machine worth to you?
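
To illustrate the "way of knowing" part: ECC stores a few extra check bits with each word so that a flipped bit can be detected, and a single flipped bit can also be corrected. The sketch below is a toy SECDED (single-error-correct, double-error-detect) Hamming code over 4 data bits, written only to show the principle. Real ECC DIMMs typically protect 64 data bits with 8 check bits, and the function names here are made up for illustration.

```python
def encode(nibble):
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 Hamming."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d     # data bits
    code[1] = code[3] ^ code[5] ^ code[7]      # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]      # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]      # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2                # makes the total parity even
    return code


def decode(code):
    """Return (nibble, status); nibble is None if the error is uncorrectable."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                    # XOR of positions of set bits
    parity_odd = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        fixed[syndrome] ^= 1                   # one flip: syndrome is its position
        status = "corrected"
    else:
        return None, "uncorrectable"           # two flips: detected, not fixable
    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return nibble, status


word = encode(0b1011)
word[6] ^= 1                                   # simulate a single bit flip in RAM
print(decode(word))
```

Without the check bits, the same flip would simply hand back a different value with no indication that anything had changed.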

Not trying to criticize you, just pointing out that non-ECC is always a risk. Anyway, good luck to you, whatever you do. May your RAM never eat it. :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You know, back in the early 80s all computers had a technology that was a predecessor to ECC RAM. When Apple cut corners and went with the equivalent of non-ECC RAM they were harshly criticized. People laughed and told them that there was no way that the system would be stable because of constant RAM errors and such. Of course we know that isn't true now, but back then nobody actually had evidence to the contrary. But, there were problems. If you had a job where you worked on things that required accurate math (engineering, etc.) you were expressly forbidden from using anything from Apple because the risk of data corruption in RAM was considered too high.

Strangely, we're seeing things revert. More and more people (and companies) are talking about ECC RAM and going to technologies that include error correction. Companies seem to be less and less accepting of corruption from random variables they have no control of and are seeking remedies.

What's funny is that virtually all data paths on your machine (PCIe, USB, SATA, SAS, FireWire, etc.) have some kind of error correction in them. The RAM is the only one that doesn't.
 