HELP! Directory is not reachable Disk offline ZFS all fine??

Status
Not open for further replies.
Joined
Apr 24, 2014
Messages
8
Hi,

I just switched to FreeNAS in the hope that it has advantages over stock FreeBSD.
Before making the switch, I copied all the data from these servers to other servers.
All went fine. The only problem was that the disks would not spin down (but OK, I can live with that).
The speed increase with FreeNAS is great!

Now my problem. I was moving the data around to get the new directory structure I really wanted. After copying all my pictures (about 10,000 to 30,000), the system did not respond anymore.
I tried logging in over SSH. It worked, and I can change directories fine. But when I get to the Picture directory I cannot ls it. If I go one directory up I can run ls, but not ls -al...
I also tried find . -type f, with no success. I waited for 10 minutes with no response. When I look in top, the sh command is in the wait state.

Then I enabled SMART. It says disk 3 (ada3) is offline... but zpool status shows all disks online????
I started a scrub. I hope it helps.

I hope, I really really hope, someone can help and give me a solution so I can get my life back.
And if you ask about a backup: well, I had just finished moving all the data around and needed to use my backup disk for temporary storage....

I hope there is a good answer that does not include "you lost everything".

With regards!

Aron
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
What is your FreeNAS version, hardware and pool configuration?
What is the output of zpool status and smartctl -a /dev/ada3?

Please post it in [code]-tags.
 
Joined
Apr 24, 2014
Messages
8
Hi Warri,

Thank you for responding so fast!

I use the same hardware as 2 years ago.

3 identical servers. Worked great with FreeBSD 8.2 with ZFS.

Supermicro Atom D510 mainboards with 4 GB of RAM (yes, I know it is too little, but they are archiving servers used by one other server that dumps its data to them)
6 WD Green 2TB disks.
All in a RAIDZ config.

FreeNAS version: FreeNAS-9.2.1.3-RELEASE-x64 (dc0c46b)

zpool status:

Code:
  pool: tank23
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub in progress since Thu Apr 24 21:30:10 2014
        649G scanned out of 4.92T at 165M/s, 7h35m to go
        0 repaired, 12.87% done
config:
 
        NAME                                            STATE    READ WRITE CKSUM
        tank23                                          ONLINE      0    0    0
          raidz1-0                                      ONLINE      0    0    0
            gptid/2f9ce0a4-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
            gptid/30b74190-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
            gptid/31c39a19-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
            gptid/32dbc144-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
            gptid/33f84117-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
            gptid/3516c95c-bef2-11e3-824f-0025906b33f8  ONLINE      0    0    0
 
errors: No known data errors
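
For anyone who wants to sanity-check the scrub line above, here is a minimal sketch that recomputes the percentage and the time remaining from the three numbers in the output. It assumes zpool prints sizes in binary units (1T = 1024G); the function names `to_bytes` and `scrub_estimate` are made up for this illustration.

```python
# Rough cross-check of the scrub progress line, assuming binary units (1T = 1024G).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a zpool-style size such as '649G' or '4.92T' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def scrub_estimate(scanned, total, rate_per_s):
    """Return (percent done, seconds remaining) for a scrub in progress."""
    done = to_bytes(scanned)
    size = to_bytes(total)
    percent = 100.0 * done / size
    seconds_left = (size - done) / to_bytes(rate_per_s)
    return percent, seconds_left


percent, seconds_left = scrub_estimate("649G", "4.92T", "165M")
hours, rem = divmod(int(seconds_left), 3600)
print(f"{percent:.2f}% done, about {hours}h{rem // 60:02d}m to go")
```

The result should land within a minute or two of the reported 7h35m and very close to 12.87%. The inputs here are the rounded figures that zpool displays, so an exact match is not expected.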


It seems I misread smartd in my haste. What it actually reports is 1 offline uncorrectable error.

Code:
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
 
=== START OF INFORMATION SECTION ===
Model Family:    Western Digital Caviar Green (AF, SATA 6Gb/s)
Device Model:    WDC WD20EARX-00PASB0
Serial Number:    WD-WMAZA6727533
LU WWN Device Id: 5 0014ee 0584f510d
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:    512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Apr 24 22:38:42 2014 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
 
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (39960) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 385) minutes.
Conveyance self-test routine
recommended polling time:        (  5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
 
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x002f  200  200  051    Pre-fail  Always      -      0
  3 Spin_Up_Time            0x0027  166  164  021    Pre-fail  Always      -      6675
  4 Start_Stop_Count        0x0032  099  099  000    Old_age  Always      -      1594
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x002e  200  200  000    Old_age  Always      -      0
  9 Power_On_Hours          0x0032  074  074  000    Old_age  Always      -      19698
10 Spin_Retry_Count        0x0032  100  100  000    Old_age  Always      -      0
11 Calibration_Retry_Count 0x0032  100  253  000    Old_age  Always      -      0
12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      37
192 Power-Off_Retract_Count 0x0032  200  200  000    Old_age  Always      -      23
193 Load_Cycle_Count        0x0032  183  183  000    Old_age  Always      -      52774
194 Temperature_Celsius    0x0022  123  098  000    Old_age  Always      -      27
196 Reallocated_Event_Count 0x0032  200  200  000    Old_age  Always      -      0
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      1
198 Offline_Uncorrectable  0x0030  200  200  000    Old_age  Offline      -      1
199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0
200 Multi_Zone_Error_Rate  0x0008  200  200  000    Old_age  Offline      -      1
 
SMART Error Log Version: 1
No Errors Logged
 
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]
 
 
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
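
As a rough illustration of how to pick the interesting rows out of an attribute table like the one above, here is a small sketch that flags a few sector-related attributes when their raw value is non-zero. The watch list and the function name `flag_attributes` are assumptions for this example, not official thresholds, and it only reads text in the 10-column layout shown in this post.

```python
# Attribute names treated as warning signs when their raw value is non-zero.
# This watch list is an assumption for illustration, not an official threshold.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def flag_attributes(smart_text, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in watched and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


# A few rows copied from the smartctl output in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      1
198 Offline_Uncorrectable  0x0030  200  200  000    Old_age  Offline      -      1
199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0
"""

print(flag_attributes(SAMPLE))
```

On the rows above this picks out Current_Pending_Sector and Offline_Uncorrectable, both with a raw value of 1.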
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Nothing out of the ordinary there, your pool seems fine. I have no idea why it's not listing the files.
I'd wait for the scrub to finish and then try again.
Out of curiosity, which zpool version are you using (zpool get version)?

Btw, you should probably set up regular SMART tests :)

EDIT: And you could try testing your memory with memtest86+. As cyberjock says, "you never know with non-ecc RAM".
 
Joined
Apr 24, 2014
Messages
8
Code:
NAME    PROPERTY  VALUE    SOURCE
tank23  version  -        default


I just added the ZFS volume and left everything on default. I was wondering why FreeNAS was not using the latest version.

What happened was: I used Windows Explorer to move the files (I normally would not use Windows and CIFS, but it seemed easier at first).
My server connects to the storage pools over NFS. I moved all kinds of files back and forth, until the latest move. Windows (Windows 8.1) was very slow. Instead of moving the directories quickly (because they were on the same share), it was moving them file by file. I thought maybe this was because the directory was not empty when I started.
After a while I got a network disconnected message (my laptop was not disconnected; I think CIFS kicked me).

I left my find command running. No change after 1 hour of waiting.
I tried stopping SMB, no luck there. It is still running...

I will add the SMART test. Thanks for the tip.
 
Joined
Apr 24, 2014
Messages
8
It seems that FreeNAS does not report the version but is using feature flags?
Here are my features:

Code:
NAME    PROPERTY                      VALUE                          SOURCE
tank23  size                          10.9T                          -
tank23  capacity                      45%                            -
tank23  altroot                        /mnt                          local
tank23  health                        ONLINE                        -
tank23  guid                          6515809491345406096            default
tank23  version                        -                              default
tank23  bootfs                        -                              default
tank23  delegation                    on                            default
tank23  autoreplace                    off                            default
tank23  cachefile                      /data/zfs/zpool.cache          local
tank23  failmode                      continue                      local
tank23  listsnapshots                  off                            default
tank23  autoexpand                    on                            local
tank23  dedupditto                    0                              default
tank23  dedupratio                    1.00x                          -
tank23  free                          5.95T                          -
tank23  allocated                      4.92T                          -
tank23  readonly                      off                            -
tank23  comment                        -                              default
tank23  expandsize                    0                              -
tank23  freeing                        0                              default
tank23  feature@async_destroy          enabled                        local
tank23  feature@empty_bpobj            active                        local
tank23  feature@lz4_compress          enabled                        local
tank23  feature@multi_vdev_crash_dump  disabled                      local
tank23  feature@spacemap_histogram    disabled                      local
tank23  feature@enabled_txg            disabled                      local
tank23  feature@hole_birth            disabled                      local
tank23  feature@extensible_dataset    disabled                      local
tank23  feature@bookmarks              disabled                      local
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Yes, when the version shows as a dash, it is v5000 (= feature flags). FreeNAS actually includes some of the newer ZFS features taken directly from FreeBSD 10.
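
To make that concrete, here is a minimal sketch that reads text in the NAME / PROPERTY / VALUE / SOURCE layout posted above, checks whether the version is shown as a dash, and collects the state of each feature@ property. The function names are hypothetical and it only parses pasted text; it does not call zpool itself.

```python
def uses_feature_flags(zpool_get_text):
    """True when the legacy 'version' property is shown as a dash."""
    for line in zpool_get_text.splitlines():
        fields = line.split()
        # Data rows look like: NAME PROPERTY VALUE SOURCE
        if len(fields) >= 3 and fields[1] == "version":
            return fields[2] == "-"
    return False


def feature_states(zpool_get_text):
    """Map each feature@ property to its state (disabled, enabled or active)."""
    states = {}
    for line in zpool_get_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1].startswith("feature@"):
            states[fields[1][len("feature@"):]] = fields[2]
    return states


# A few rows copied from the property listing earlier in the thread.
SAMPLE = """\
tank23  version                        -                              default
tank23  feature@async_destroy          enabled                        local
tank23  feature@empty_bpobj            active                        local
tank23  feature@hole_birth            disabled                      local
"""

print(uses_feature_flags(SAMPLE))
print(feature_states(SAMPLE))
```

Run against the full listing, this would show which features are still disabled, which matches the status message in zpool status about supported features not being enabled.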

I really don't know what else you could try. Sometimes ZFS just wants more RAM, so if you happen to have a machine with more RAM, it might be worth trying the pool there.

But maybe somebody else has an idea.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, you can and should expect erratic server behavior when you have <8GB of RAM. That's why I said you should have 8GB of RAM minimum in the manual.
 
Joined
Apr 24, 2014
Messages
8
Hi Cyberjock,

Do you have an idea what is causing it? The ZFS scrub did not find anything: 0 repairs.
I know what you wrote in your manual. I read it. Do you have a real idea for a solution?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If you read it, why are you even asking for help? Do you know how many people have had problems with FreeNAS they couldn't explain? Then after weeks of trying to figure out why FreeNAS was such a POS they dropped in more RAM and suddenly everything started working correctly? Then they show up and say stuff like "I've spent 80 man-hours on this crap, and a $100 stick of RAM was all I needed!?" I sh*t you not....

As for ideas on what is causing it or real solutions, I don't even try to solve problems that may be caused by insufficient RAM. I've spent so much time on it in the past that I take the stance "if you couldn't spend the money to build it right, why should I be expected to spend my time to solve it".

To add to the potential mischief, RAIDZ1 is very well known to not be a reliable vdev type. We've seen tons and tons of pools go bad with RAIDZ1. I'd never recommend RAIDZ1 even as a backup server.

As warri already said above (and even mentioned me by name), bit flips from non-ECC RAM have the potential to be quite destructive to ZFS. Unfortunately, you can never rule out bit flips with non-ECC RAM in your system, nor can you rule out problems related to insufficient RAM. It's well documented in the forums that both of those can create problems that aren't easily diagnosed, or even reproducible on the same system. To be honest, considering the "bizarre-ness" of your problem, I'd almost certainly expect your problems to be related to one or both of those flaws in your server design.

As a backup server, I'd simply replace it with something more appropriate and rebuild your backups from the original servers.

Now, regarding your hard drive that you provided SMART data on, you aren't running any SMART tests (which is bad). You certainly aren't doing any SMART monitoring or emailing, or you'd have complained about FreeNAS sending you emails warning of impending drive failure. If you run a SMART long test on that disk, it will certainly fail. That will qualify it for an RMA. While you are at it, you should take this time to check out all of your disks and run SMART tests on all of them.

There's a right and wrong way to set up the hardware as well as the software. Not only is the hardware not appropriate for the task, but the software doesn't appear to have been set up appropriately either. :(
 
Joined
Apr 24, 2014
Messages
8
I found the solution.

SMB crashed, corrupted its configuration, and hung. After disabling the service and killing the process, everything worked fine.
Only now I see that I need to make a new SMB config; it will not come up again.
How can a written config file be corrupted by too little memory?

Code:
store1 notifier: Performing sanity check on Samba configuration: FAILED


As for the memory: I switched from stock FreeBSD to FreeNAS. It worked in the past; it was stable like a rock stuck in a mountain.
The mainboard does not support more memory.

http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H

As for the settings: I was still in the process of building my setup. It is not a primary production server :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526

Right, so now's the time to look at buying a mainboard that can support more memory. :P

FreeBSD can handle smaller memory systems than FreeNAS. FreeBSD's minimum requirements are 256MB of RAM if I'm not mistaken. FreeNAS needs 8GB minimum.

As for that corrupt configuration, a reboot should solve that problem *if* the FreeNAS database isn't corrupt. We have seen database corruption because of insufficient RAM. FreeNAS regenerates the Samba config file on bootup based on the FreeNAS database settings.
 
Joined
Apr 24, 2014
Messages
8
Hmmmm, it seems that the database is indeed corrupt. I cannot get the Samba configuration to start again. I tried to apply all the Samba settings again. No luck. Is there a link or something to reactivate Samba without a database backup or reinstall?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Nope. Gotta redo your config from scratch. :(
 
Joined
Apr 24, 2014
Messages
8
OK, no problem. The system was not fully configured :) One last question :) The manual does not say it directly: will a factory reset in the settings menu also destroy the ZFS pool, and with it all the data stored on it?

(I think not, but I want to ask to be sure :D)

btw :) thanks for the help!
Aron
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Factory restore will only reset the configuration database of FreeNAS and does not touch your volumes.
After a reset you'll have to import your pool again with auto-import.

Factory Restore: resets the configuration database to the default base version. However, it does not delete user SSH keys or any other data stored in a user's home directory. Since any configuration changes stored in the configuration database will be erased, this option is handy if you mess up your system or wish to return a test system to the original configuration.
 