paul.warwicker.1
Dabbler
- Joined
- Apr 30, 2016
- Messages
- 12
I have some permanent errors on one of my volumes, but these are shown as hex codes. For example:
Code:
...
  pool: oracle02
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 2.77M in 0 days 00:08:18 with 0 errors on Mon Apr 29 00:00:12 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        oracle02                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/74ea4d6e-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/762f22f2-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/770f7817-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/77f3f862-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/78e9f9f4-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x722>:<0x0>
        <0x78c>:<0x0>
        <0x2b9>:<0x0>
        <0x2b9>:<0x572>
        <0x6c6>:<0x15>
        <0x6c6>:<0x32>
        <0x6cc>:<0x0>
        <0x6cc>:<0xe>
oracle#
Reading back through older forum posts suggests that the only way to resolve this was to restore from a backup and recreate the pool.
The volume appears to be okay, and the errors I was seeing on 11.2 (https://www.ixsystems.com/community/threads/losing-zfs-pool-overnight.75994/) are no longer causing me a problem on 11.1. Maybe I just got unlucky on the last reboot, because these errors usually cleared on a reboot or when I removed the temporary volumes used during testing.
I found a very useful post at http://unixetc.co.uk/2012/01/22/zfs-corruption-persists-in-unlinked-files/ which discusses the issue. Towards the end, the article suggests starting a scrub and then stopping it immediately.
The net result is that I now have a clean volume. The question is: has anyone else been in this position and doubted the result's reliability? It seems almost too easy!
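For reference, the sequence from that article boils down to two zpool commands (oracle02 is my pool name; substitute your own, and run as root). As I understand it, starting a new scrub causes ZFS to reset its persistent error log, so stale entries pointing at already-deleted files disappear even though the scrub never runs to completion:

```shell
# Start a scrub on the affected pool...
zpool scrub oracle02

# ...then cancel it straight away; -s stops a scrub in progress.
zpool scrub -s oracle02

# Verify: the <0x...>:<0x...> entries should be gone.
zpool status -v oracle02
```

Obviously this only clears the bookkeeping; if the corrupted objects still exist and are reachable, a full scrub should report them again.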
Code:
...
  pool: oracle02
 state: ONLINE
  scan: scrub canceled on Mon Apr 29 22:29:15 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        oracle02                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/74ea4d6e-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/762f22f2-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/770f7817-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/77f3f862-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/78e9f9f4-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0

errors: No known data errors
oracle#
-paul