phretor
Cadet
- Joined: Dec 26, 2012
- Messages: 1
Hello.
In my FreeNAS 8.2 box I created a raidz1 vdev of 4x2TB disks (the vdev has been working for about 2 years). The system has 4GB of RAM (an upgrade to 16GB is planned). In the last year, I replaced 3 out of 4 disks because of SMART errors. The system is scheduled to scrub the vdev weekly. Until now I had never encountered any data errors, and so far I have been quite happy with ZFS.
Today, I stumbled upon the first occurrence of data errors. As this is my first experience with ZFS, I need some feedback on how to handle them.
Recently, the fourth disk developed some bad sectors, so I decided to replace it. While resilvering the vdev (with the new fourth disk in), several data errors were reported, and both disks 3 and 4 showed several thousand checksum errors.
After a second scrub, the situation is as follows:
Code:
[phretor@opentank ~]$ zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gpt/disk0                                   ONLINE       0     0     0
            ada1p2                                      ONLINE       0     0     0
            ada2p2                                      ONLINE       0     0     7
            gptid/f70f9292-4f60-11e2-9b34-00270e2f08e1  ONLINE       0     0     0

errors: 494390 data errors, use '-v' for a list
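To make the table easier to read at a glance, here is a minimal Python sketch that picks out the devices with nonzero READ/WRITE/CKSUM counters from saved `zpool status` text. The `SAMPLE` string is just the config table copied from the output above, and `devices_with_errors` is a name I made up for illustration. It assumes the counters are plain integers; abbreviated values such as `1.2K` would be skipped.

```python
import re

# Config table copied from the `zpool status` output in this post.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1 ONLINE 0 0 0
gpt/disk0 ONLINE 0 0 0
ada1p2 ONLINE 0 0 0
ada2p2 ONLINE 0 0 7
gptid/f70f9292-4f60-11e2-9b34-00270e2f08e1 ONLINE 0 0 0
"""

# A device row is: name, state, then three integer counters.
# The header row does not match because READ/WRITE/CKSUM are not digits.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any nonzero counter."""
    flagged = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, blank lines, and non-table lines are skipped
        name, _state, read, write, cksum = match.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            flagged[name] = counts
    return flagged


if __name__ == "__main__":
    print(devices_with_errors(SAMPLE))
```

On the table above this flags only `ada2p2`, the third disk, with 7 checksum errors.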
Unfortunately, `zpool status -v` hangs the system (possibly because there are too many affected files to list). Meanwhile, I ordered a replacement for the third drive, because I discovered that it is failing too. SMART errors here: http://pastebin.com/XmCBh4E6
So, to conclude, I've got some questions:
- are these data errors a sign of actual damage or data loss, or are they recoverable somehow?
- while waiting for the replacement disk, is there something that I can do to reduce the risk of losing data completely?
- is there a way to enumerate the files affected by data errors (so that I can see if I have a spare copy of these files on some other machines), without hanging the system?
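Regarding the last question, this is a rough sketch of what I have in mind: run the listing command and copy its output to a file line by line, so that nothing has to be held in memory or rendered in the terminal. The helper name `stream_to_file` and the output path are my own placeholders, and I have not verified that this avoids the hang; if the stall happens inside `zpool` itself, redirecting the output will not help.

```python
import subprocess


def stream_to_file(cmd, out_path):
    """Run cmd and copy its stdout to out_path line by line.

    Returns the number of lines written. Lines are written as they
    arrive, so the full listing is never held in memory at once.
    """
    count = 0
    with open(out_path, "w") as out:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for line in proc.stdout:
            out.write(line)
            count += 1
        proc.wait()
    return count


# Hypothetical usage on the NAS (the output path is a placeholder):
#   stream_to_file(["zpool", "status", "-v"], "/tmp/zpool_errors.txt")
```

The resulting file could then be compared against the file lists on my other machines to see which damaged files I still have copies of.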
Last, I am aware that WD Green disks are not the best choice for a NAS. I learned this the hard way. I am planning to migrate to a better hardware configuration, but first I need to take care of these issues.
Thanks in advance for any feedback.