I am running FreeNAS-11.3-U4.1 and, unfortunately, for some time now I have been getting random ZFS errors. Out of nowhere, one of the devices in my pool becomes faulted.
Code:
root@freenas:~ # zpool status -v tank
  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 1.37G in 0 days 00:00:34 with 0 errors on Sun Oct 18 18:45:47 2020
config:

        NAME                                                STATE     READ WRITE CKSUM
        tank                                                DEGRADED     0     0     0
          raidz3-0                                          DEGRADED     0     0     0
            gptid/1f9588fd-181d-11ea-8a68-b9d2674e3063.eli  ONLINE       0     0     0
            gptid/14731eb5-aa63-11e7-ad99-0d581efb06db.eli  ONLINE       0     0     0
            gptid/946e300f-165a-11ea-8a68-b9d2674e3063.eli  ONLINE       0     0     0
            gptid/c3a7a5f9-a99b-11e7-a252-115519ca0956.eli  ONLINE       0     0     0
            gptid/08f95f17-3cf4-11e8-9c0e-1159604507fe.eli  ONLINE       0     0     0
            gptid/c577a1de-a99b-11e7-a252-115519ca0956.eli  FAULTED      6   274     0  too many errors
            gptid/8a84d43a-3e38-11e8-9c0e-1159604507fe.eli  ONLINE       0     0     0
            gptid/c74ea2cb-a99b-11e7-a252-115519ca0956      ONLINE       0     0     0
        cache
          gptid/22609ef9-a583-11ea-90b4-fbdc11792c3d.eli    ONLINE       0     0     0

errors: No known data errors
After a reboot, the pool is online again and seems to behave normally - until the error reappears a while later. The device that is faulted is always a different one.
I am running FreeNAS on
* Proxmox 6.2 Hypervisor
* Intel Xeon E3-1245v6 (CPU set to HOST in Proxmox)
* 64GB Samsung ECC Ram (32 GB for FreeNAS)
* LSI SAS 9207-8i HBA (fully passed through to FreeNAS)
I am aware that running FreeNAS in a virtualized manner is not encouraged, but I think that since I am fully passing through my controller it should hopefully be fine - and it was for over 2 years.
What I have already done:
- Reseat all cables
- Try out a different set of cables
- Reseat HBA
- Update Proxmox and FreeNAS
- Run SMART tests on all drives (I have scheduled SMART tests, of course, and all drives seem to be fine)
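
Since the faulted device is a different one each time, it may help to record which device is affected on every occurrence. The following is only a rough sketch of how that could be done, not something I am actually running: the function name `devices_with_errors` and the `SAMPLE` text are made up for illustration, and the parser only looks at the config table of `zpool status` output in the layout shown above.

```python
# Sketch: list devices from `zpool status` output that are faulted
# or show non-zero READ/WRITE/CKSUM counters.
# Assumption: the config table has the columns NAME STATE READ WRITE CKSUM.

BAD_STATES = {"FAULTED", "UNAVAIL", "REMOVED"}


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) tuples for suspicious rows."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME":
            # Header row of the config table; device rows follow.
            in_config = True
            continue
        if fields[0] == "errors:":
            # The config table ends before the "errors:" summary line.
            in_config = False
            continue
        if not in_config or len(fields) < 5:
            # Skips lines outside the table and group labels such as "cache".
            continue
        name, state = fields[0], fields[1]
        counters = fields[2:5]
        # Counters stay as strings so abbreviated values are still
        # treated as non-zero.
        if state in BAD_STATES or any(c != "0" for c in counters):
            rows.append((name, state, *counters))
    return rows


# Hypothetical sample, shortened from the output above.
SAMPLE = """\
config:

        NAME                                                STATE     READ WRITE CKSUM
        tank                                                DEGRADED     0     0     0
          raidz3-0                                          DEGRADED     0     0     0
            gptid/08f95f17-3cf4-11e8-9c0e-1159604507fe.eli  ONLINE       0     0     0
            gptid/c577a1de-a99b-11e7-a252-115519ca0956.eli  FAULTED      6   274     0  too many errors

errors: No known data errors
"""

for row in devices_with_errors(SAMPLE):
    print(*row)
```

On the real system, the text would come from the output of `zpool status -v tank` instead of the sample, and the result could be appended to a log file with a timestamp to see whether the faults follow a particular port, cable, or drive.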