CompuGlobalHyperMegaNet
Contributor
- Joined
- Sep 13, 2014
- Messages
- 149
I feel a little silly posting this, but please bear with me. I've had a really stressful few days and this is the first time I've encountered any issues with my storage pool, so I'm second-guessing everything and having trouble keeping focus. To cap it off, it appears my online backup has expired (*hangs head in shame*), thanks spam filter!... The point is that I was pretty damn stressed before these storage issues and they've only added to the load.
I'd appreciate some hand holding (so to speak) if someone would be so kind.
====================================
I've just received a couple of emails from my server. The first was received at 20:03 and said-
Code:
The volume tank (ZFS) state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
The second was received at 20:04 (one minute after the first) and said-
Code:
The volume tank (ZFS) state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
I checked the pool status using-
Code:
# zpool status
I also checked the disks via the GUI. One disk had around 524 write errors (I'm kicking myself for not taking accurate notes) but aside from that, there were no errors or checksum errors reported on any of the disks.
I then cleared the errors using-
Code:
zpool clear tank
and initiated a Scrub, which is currently about 40% done with no errors.
The system had been off for a few days before being booted around 19:30 today (the same day as the above emails/errors). The write errors may have occurred whilst I was transferring around 20GB of data to the FreeNAS server or whilst I was hashing the data.
My hardware can be found in my sig.
=========================================
So my questions are as follows.
1. If the scrub completes and there are no errors, what should my course of action be?
2. If there are errors reported, should I power down the system whilst I wait for the replacement disk to arrive and pass the burn-in test, or should I leave the system running?
3. Is there something else I should do before replacing the disk?
4. Once the scrub is complete, should I run a Long S.M.A.R.T. test on the drive in question?
5. What can cause write errors?
P.S. Please forgive me if I'm waffling or asking dumb questions. It's just been one of those weeks...
=====================================
[EDIT] The scrub is now complete and no errors are being reported. I've just run "zpool status" and I get the following-
Code:
[root@freenas ~]# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Mar 27 03:45:04 2017
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada0p2      ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 3h3m with 0 errors on Mon Apr 3 00:34:41 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/fc861721-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            gptid/fcea84f6-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            gptid/fd5098de-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            gptid/fdb415bc-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            gptid/fe17c521-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            gptid/fe7c3f44-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            da5p2                                       ONLINE       0     0     0
            gptid/ff4aa622-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
As you can see, the problematic disk (da5p2) is being shown by its name rather than its gptid like the others. Why is this?
Am I safe to shut the system down and fall into bed with a stiff drink?
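In case it helps anyone reading along, here is a minimal sketch of how the two things I was eyeballing in the "zpool status" output could be checked automatically: devices with non-zero READ/WRITE/CKSUM counters, and member disks listed by device name instead of gptid. This is plain Python, not anything FreeNAS ships. The function names, the SAMPLE text (a trimmed copy of my tank pool) and the list of vdev keywords are my own assumptions, and it only handles plain integer counters.

```python
import re

# Trimmed sample modelled on the tank pool output above (assumption: same layout).
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/fe7c3f44-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0
            da5p2                                       ONLINE       0     0     0
            gptid/ff4aa622-41e7-11e5-b1be-002590f5a510  ONLINE       0     0     0

errors: No known data errors
"""

# A config row looks like: NAME STATE READ WRITE CKSUM, with integer counters.
# Abbreviated counters (e.g. "1.5K") are not handled by this sketch.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")

# Names that describe a vdev grouping rather than a disk (assumed, not exhaustive).
VDEV = re.compile(r"^(mirror|raidz[123]?|replacing|spare|spares|logs|cache)(-\d+)?$")


def parse_config_rows(status_text):
    """Return one dict per config row that carries numeric error counters."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match:
            name, state, read, write, cksum = match.groups()
            rows.append({
                "name": name,
                "state": state,
                "read": int(read),
                "write": int(write),
                "cksum": int(cksum),
            })
    return rows


def devices_with_errors(status_text):
    """Names of rows where any of READ, WRITE or CKSUM is non-zero."""
    return [r["name"] for r in parse_config_rows(status_text)
            if r["read"] or r["write"] or r["cksum"]]


def devices_without_gptid(status_text):
    """Names of member disks that are not listed as gptid/..."""
    pools = set(re.findall(r"^\s*pool:\s*(\S+)", status_text, flags=re.MULTILINE))
    return [r["name"] for r in parse_config_rows(status_text)
            if r["name"] not in pools
            and not VDEV.match(r["name"])
            and not r["name"].startswith("gptid/")]


if __name__ == "__main__":
    print("devices with errors:", devices_with_errors(SAMPLE))
    print("devices not shown by gptid:", devices_without_gptid(SAMPLE))
```

Fed the sample above, it reports no devices with errors and flags da5p2 as the only disk not shown by gptid. To run it against a live system, the SAMPLE text would be swapped for the real "zpool status" output. Note that a boot pool disk such as ada0p2 would be flagged too, which may be perfectly normal.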