Warning from 9.1.1

Status
Not open for further replies.

bdacasc

Dabbler
Joined
Mar 15, 2012
Messages
38
This morning I got a warning from my FreeNAS 9.1.1 system:


WARNING: The volume Tank1 (ZFS) status is UNKNOWN: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'.




Looking under Storage, View volume,
the status is shown as healthy.
I cannot determine what the error is.




The setup
FreeNAS-9.1.1-RELEASE-x64 (a752d35)
Platform: Intel(R) Atom(TM) CPU D525 @ 1.80GHz
Memory: 8171MB
System Time: Sun Dec 08 10:19:40 CET 2013
Uptime: 10:19AM up 30 days, 15:22, 0 users
Load Average: 0.12, 0.15, 0.11
Connected through: 192.168.100.218


I need some advice on what to do.
The system reports an error, but I cannot see what it is.
What would you do?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Post the output of "zpool status" from the CLI. Put it in CODE blocks as the formatting of the text is crucial.
 

bdacasc

Dabbler
Joined
Mar 15, 2012
Messages
38
Thanks for the reply.

This is the output from the command "zpool status"

[root@freenas] ~# zpool status
pool: Tank1
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://illumos.org/msg/ZFS-8000-9P
scan: scrub repaired 36K in 14h11m with 0 errors on Sun Nov 10 14:11:34 2013
config:

NAME STATE READ WRITE CKSUM
Tank1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gptid/d7202b72-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d7969438-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 1
gptid/d80da5ab-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d8843b97-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d8fd7f40-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d978e569-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 1
cache
gptid/d9e4c0c2-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0

errors: No known data errors

Please advise.
 

Attachments

  • zpool_status.txt
    1.6 KB

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You didn't include the CODE block, so your post doesn't help me as the formatting is destroyed.

Edit: Never mind, your attached file does have the proper formatting.

You appear to have a problem. The disks with gptids d7969438 and d978e569 each show a checksum error. So if you weren't using ZFS, you'd have lost some data somewhere due to silent corruption.

So you had some corruption, but ZFS was able to detect and correct the error.
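
For anyone who wants to pick out the affected devices without reading the counters by eye, here is a minimal Python sketch. It is not part of FreeNAS or ZFS; the function name `devices_with_errors` is made up for illustration, and it assumes each device row has exactly five whitespace-separated columns (NAME STATE READ WRITE CKSUM) with plain integer counters. The sample rows are copied from the output posted above.

```python
# Rows copied from the "zpool status" output posted in this thread.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
Tank1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gptid/d7202b72-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d7969438-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 1
gptid/d80da5ab-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d8843b97-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d8fd7f40-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
gptid/d978e569-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 1
cache
gptid/d9e4c0c2-ec3f-11e1-9103-0025906ab8ac ONLINE 0 0 0
"""


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for each row with a non-zero counter."""
    flagged = []
    for line in status_text.splitlines():
        fields = line.split()
        # Skip the header, section labels such as "cache", and any row
        # whose last three columns are not plain integers.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
            continue
        name, _state, read, write, cksum = fields
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            flagged.append((name,) + counts)
    return flagged


if __name__ == "__main__":
    for name, read, write, cksum in devices_with_errors(SAMPLE):
        print(name, "READ=%d WRITE=%d CKSUM=%d" % (read, write, cksum))
```

One known limitation of this sketch: rows whose counters are abbreviated (for example with a K suffix) are skipped rather than parsed, so treat it as a reading aid only.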

Three things concern me (aside from the checksum errors):

1. Your last scrub was almost a month ago. Scrubs should be done every week to 2 weeks. So you need to consult the manual and change your scrub cycle appropriately.
2. You should NOT have a cache drive with 8GB of RAM. That needs to be removed. It is not actually doing you any good and may actually be hurting performance. Your cache should not exceed 5x your ARC, and you'd be lucky if your ARC is 4GB because of how little RAM you have. So removing that should be high on your priority list.
3. You are not using a system with ECC RAM. So these errors may be related to bitflips in RAM or failing RAM. You should make it a priority to test your RAM.
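
To put rough numbers on points 1 and 2, here is a small Python sketch. The helper names are hypothetical, the 14-day limit and the 5x ratio are simply the rules of thumb stated above, and the 4 GB ARC figure is the optimistic guess from point 2, not a measured value. The two timestamps are copied from the posts above and treated as naive local times.

```python
from datetime import datetime


def days_between(earlier, later):
    """Whole days elapsed between two timestamps."""
    return (later - earlier).days


def scrub_overdue(last_scrub, now, max_days=14):
    """True if the last scrub finished more than max_days ago."""
    return days_between(last_scrub, now) > max_days


def max_cache_gib(arc_gib, ratio=5):
    """Upper bound on cache (L2ARC) size under the 5x-ARC rule of thumb."""
    return arc_gib * ratio


# "scrub repaired 36K ... on Sun Nov 10 14:11:34 2013"
last_scrub = datetime(2013, 11, 10, 14, 11, 34)
# System time reported in the first post: Sun Dec 08 10:19:40 CET 2013
system_time = datetime(2013, 12, 8, 10, 19, 40)

if __name__ == "__main__":
    print("days since last scrub:", days_between(last_scrub, system_time))
    print("scrub overdue:", scrub_overdue(last_scrub, system_time))
    print("max cache size for a 4 GiB ARC (GiB):", max_cache_gib(4))
```

With the values from this thread, the last scrub is 27 whole days old, which is past the two-week limit, and a 4 GiB ARC would cap the cache device at 20 GiB under this rule.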

As for the checksum errors, they are normally a sign of a disk or storage subsystem problem. Because you are using non-ECC RAM, there's no way to be sure that you didn't have data corruption in RAM (that's why we recommend ECC RAM).

You should also run short and long SMART tests on all of your disks at regular intervals, and have SMART monitoring enabled along with email notifications set up. Considering the errors you've gotten, I can bet you are missing one or more of those.
 

bdacasc

Dabbler
Joined
Mar 15, 2012
Messages
38
Sorry, but I do not understand what you mean by "code block".
Please look at the attached file.
 
Joined
Dec 7, 2013
Messages
95
He means using the button that looks like {}# when posting. When you hover over it with the mouse, it says "Code".
 

bdacasc

Dabbler
Joined
Mar 15, 2012
Messages
38
Thanks for the input.
I will consider it.

A while ago I had a power blackout. Could this have affected the system and be the root cause of the checksum errors?
 