I noticed today that one of my ZFS pools was unhealthy and that two of its four disks have over 10,000 checksum errors. I am running a scrub now that will take around 11 hours, but searching the forums, I haven't found any cases of pool errors quite like this. My system specs are in my signature. The affected pool consists of four Seagate Exos X18 18TB SAS 12Gb/s Enterprise HDDs. All the drives are connected to the host controller over Mini-SAS.
The only unusual thing I did was copy files from one pool to another over an SSH connection to the bare-metal server, instead of moving them through the network like I always do. I'm unsure whether that could cause all the checksum errors. Also, does anyone know why two of the four drives would have all the errors and the other two would have none? The SMART status is OK on all the drives.
The files on the pool seem OK. Is there any way to "fix" this without rebuilding the entire pool? A rebuild wouldn't be the end of the world, as these are my newest drives, but I have already copied a lot of data to the pool and would rather not redo all that backing up (it took days). Could it be my controller, given that only two drives have checksum errors?
I'm just looking for a starting point for diagnosing the issue. The system works great other than this "unhealthy" flag that the new pool has raised. I have another large storage pool and several smaller pools on the same controller that have never been unhealthy. Thanks, and I hope that you have all the information you need and that this is posted in the appropriate forum. I have not needed to post here very often.