yeliaB (Dabbler; joined Jan 10, 2017; 17 messages)
Hello all,
My NAS (config in my signature) started throwing concerning errors on ada1, so I ordered a replacement drive.
When the replacement drive arrived, I OFFLINE'd ada1 using the GUI and shut down the system so I could take the old drive out and put the replacement drive in (there is no hot-swap hardware in this box).
After swapping in the replacement drive, I restarted the system. Getting into the GUI, I looked at the pool, found the old drive and clicked on its Options, and then on Replace. I then identified the replacement drive, and clicked the REPLACE DISK button. All just as written up in the documentation.
The console immediately started spewing these kinds of errors:
Code:
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 20 f2 a0 40 ba 02 00 00 00 00
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): Retrying command
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 20 00 00 40 00 00 00 00 00 00
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): Error 5, Retries exhausted
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 20 02 00 40 00 00 00 00 00 00
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): Error 5, Retries exhausted
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 20 f0 a0 40 ba 02 00 00 00 00
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): Error 5, Retries exhausted
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 e0 20 f2 a0 40 ba 02 00 00 00 00
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Aug 15 09:18:27 server (ada1:ahcich1:0:0:0): Error 5, Retries exhausted
Being slow on the uptake, I head'ed a couple of zpool status outputs and saw that the resilvering wasn't starting. I decided to shut down the system, remove the replacement drive, and put the old drive back in. After rebooting, and in some kind of panic, I got my system back into a state where the GUI describes my pool (tank) as follows:
Code:
tank                                                     0  0  0  DEGRADED
  RAIDZ2                                                 0  0  0  DEGRADED
    ada0p2                                               0  0  0  ONLINE
    ada3p2                                               0  0  0  ONLINE
    ada2p2                                               0  0  0  ONLINE
    REPLACING                                            0  0  0  DEGRADED
      ada1p2                                             0  0  0  ONLINE
      /dev/gptid/69edc2e6-bf5f-11e9-abca-001e67e092ac    0  0  0  OFFLINE
    ada4p2                                               0  0  0  ONLINE
Or, if you'd prefer the shell version:
Code:
# zpool status tank
  pool: tank
 state: DEGRADED
  scan: scrub repaired 0 in 0 days 04:47:13 with 0 errors on Sun Aug 25 05:17:14 2019
config:

        NAME                                              STATE     READ WRITE CKSUM
        tank                                              DEGRADED     0     0     0
          raidz2-0                                        DEGRADED     0     0     0
            gptid/938511ff-d954-11e8-b0a5-001e67e092ac    ONLINE       0     0     0
            gptid/e5e894dd-f1e5-11e6-a4e7-001e67e092ac    ONLINE       0     0     0
            gptid/66d967c3-4d16-11e7-8664-001e67e092ac    ONLINE       0     0     0
            replacing-3                                   DEGRADED     0     0     0
              gptid/65bacdb2-f144-11e6-9a9b-001e67e092ac  ONLINE       0     0     0
              10491244544366002744                        OFFLINE      0     0     0  was /dev/gptid/69edc2e6-bf5f-11e9-abca-001e67e092ac
            gptid/2ba1b21b-4d53-11e7-8664-001e67e092ac    ONLINE       0     0     0

errors: No known data errors
#
Here's the output from glabel; scanning through all this, I believe that /dev/gptid/69edc2e6-bf5f-11e9-abca-001e67e092ac is the dead-on-arrival replacement drive that is no longer attached, and gptid/65bacdb2-f144-11e6-9a9b-001e67e092ac is the original error-throwing drive (ada1) that I added back into the pool:
Code:
# glabel list
Geom name: ada0p2
Providers:
1. Name: gptid/938511ff-d954-11e8-b0a5-001e67e092ac
   Mediasize: 3998639456256 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e1
   secoffset: 0
   offset: 0
   seclength: 7809842688
   length: 3998639456256
   index: 0
Consumers:
1. Name: ada0p2
   Mediasize: 3998639456256 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2

Geom name: ada1p2
Providers:
1. Name: gptid/65bacdb2-f144-11e6-9a9b-001e67e092ac
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e1
   secoffset: 0
   offset: 0
   seclength: 7809842696
   length: 3998639460352
   index: 0
Consumers:
1. Name: ada1p2
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2

Geom name: ada2p2
Providers:
1. Name: gptid/66d967c3-4d16-11e7-8664-001e67e092ac
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e1
   secoffset: 0
   offset: 0
   seclength: 7809842696
   length: 3998639460352
   index: 0
Consumers:
1. Name: ada2p2
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2

Geom name: ada3p2
Providers:
1. Name: gptid/e5e894dd-f1e5-11e6-a4e7-001e67e092ac
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e1
   secoffset: 0
   offset: 0
   seclength: 7809842696
   length: 3998639460352
   index: 0
Consumers:
1. Name: ada3p2
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2

Geom name: ada4p2
Providers:
1. Name: gptid/2ba1b21b-4d53-11e7-8664-001e67e092ac
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e1
   secoffset: 0
   offset: 0
   seclength: 7809842696
   length: 3998639460352
   index: 0
Consumers:
1. Name: ada4p2
   Mediasize: 3998639460352 (3.6T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2

Geom name: da0p1
Providers:
1. Name: gptid/fa0bc54d-efb5-11e6-8c15-001e67e092ac
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 17408
   Mode: r0w0e0
   secoffset: 0
   offset: 0
   seclength: 1024
   length: 524288
   index: 0
Consumers:
1. Name: da0p1
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 17408
   Mode: r0w0e0

Geom name: da1p1
Providers:
1. Name: gptid/fa32753a-efb5-11e6-8c15-001e67e092ac
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 17408
   Mode: r0w0e0
   secoffset: 0
   offset: 0
   seclength: 1024
   length: 524288
   index: 0
Consumers:
1. Name: da1p1
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 17408
   Mode: r0w0e0

Geom name: ada0p1
Providers:
1. Name: gptid/9377a876-d954-11e8-b0a5-001e67e092ac
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r0w0e0
   secoffset: 0
   offset: 0
   seclength: 4194304
   length: 2147483648
   index: 0
Consumers:
1. Name: ada0p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r0w0e0
#
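For a quicker cross-check of which gptid belongs to which partition, the more compact `glabel status` listing can be filtered instead of reading the full `glabel list` dump. The sketch below assumes the usual three-column layout (Name, Status, Components); the helper name `label_to_dev` is made up for illustration and is not part of FreeNAS or FreeBSD.

```shell
#!/bin/sh
# Hypothetical helper: print the component (e.g. ada1p2) for one label.
# Reads `glabel status`-style text on stdin, where each data row is
#   <label name>  <status>  <component>
label_to_dev() {
    # $1 = label to look up, e.g. gptid/65bacdb2-f144-11e6-9a9b-001e67e092ac
    awk -v want="$1" '$1 == want { print $3 }'
}

# On the NAS itself (requires FreeBSD's glabel):
#   glabel status | label_to_dev gptid/65bacdb2-f144-11e6-9a9b-001e67e092ac
```

If the label is not attached at all, as appears to be the case for gptid/69edc2e6-bf5f-11e9-abca-001e67e092ac, the helper prints nothing.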
At this point, I know enough to know that I don't know enough to get myself out of this mess that was admittedly my own doing.
I *think* what I'm looking to do is to remove the offline /dev/gptid/69edc2e6-bf5f-11e9-abca-001e67e092ac, and get rid of the "replacing" status because the original, error-throwing drive (ada1) seems to be online. At that point, I'm hoping I'll be able to once again follow the documentation to replace ada1 with a new replacement drive, which I will test thoroughly before attempting to do so.
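A minimal sketch of that first step, under the assumption that `zpool detach` is the right tool for cancelling an unfinished replace (it is the command ZFS provides for dropping one side of a mirror-like vdev, and a "replacing" vdev is generally treated that way). The GUID is the one shown on the OFFLINE line of the zpool status output above; the wrapper function `cancel_replace` and the `DRY_RUN` switch are illustrative names only, and nothing here has been run against this pool.

```shell
#!/bin/sh
# Sketch only: by default this just prints the command it would run.
POOL="tank"
STALE_GUID="10491244544366002744"   # the OFFLINE member, "was /dev/gptid/69edc2e6-..."

cancel_replace() {
    # $1 = pool name, $2 = GUID (or device name) of the member to drop
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command without touching the pool
        echo "zpool detach $1 $2"
    else
        zpool detach "$1" "$2"
    fi
}

cancel_replace "$POOL" "$STALE_GUID"
```

Setting `DRY_RUN=0` would execute the detach for real; checking `zpool status tank` afterwards should show whether the replacing-3 entry has collapsed back to the single original member.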
Any words of wisdom would be greatly appreciated at this point--thanks for reading this far!