FreeNAS version: 9.2.1.7
6 x 3TB drives in the pool, a mix of Seagate, Toshiba and WD
10GB system RAM
LGA 775-era motherboard (I can get the exact model if needed; I just need to crack open the case)
All drives are in a single RAID-Z3 pool. All data is stored in one of the datasets, no data on the root pool.
2 of the drives are new, 2 are less than a year old, and the other 2 are about 2 years old.
The problem:
Running zpool scrub pool finds some errors, which are repaired, and every drive ends up with checksum errors listed under zpool status pool. Running the scrub again gives a similar result. No matter how many times the scrub is run, errors are found and fixed. The amount of data repaired is usually between ~20MB and ~500MB.
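To compare the per-drive checksum counts from one scrub to the next, a small filter can pull just the non-zero CKSUM entries out of the zpool status output. This is only a minimal sketch: it assumes a Bourne-style shell (sh or bash rather than csh) and the usual NAME/STATE/READ/WRITE/CKSUM table layout, and cksum_errors is a made-up helper name, not part of ZFS or FreeNAS.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints
# "<device> <count>" for every row whose CKSUM column is not 0.
cksum_errors() {
  awk '
    # The header row marks the start of the device table.
    $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
    # A blank line ends the device table.
    in_table && NF == 0 { in_table = 0 }
    # Inside the table, report rows with a non-zero CKSUM column.
    in_table && NF >= 5 && $5 != "0" { print $1, $5 }
  '
}
```

After a scrub finishes, it could be run as: zpool status pool | cksum_errors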
What has already been done:
memtest86+ overnight: no errors
checked each drive's SMART logs: only the Seagate drive has any troubling errors
all non-system datasets were destroyed
each drive was taken out of the pool and given a full badblocks read/write test: no errors
once it passed, each drive was formatted and added back to the pool with zpool replace
after each drive was added and had finished resilvering, a scrub was run with only the system datasets present: no errors
datasets were (re)created and data was copied over the network with rsync
and now I'm back to square one.
Possibilities I have not tested:
The CPU itself is going bad and the corruption is happening in the L1/L2 cache: unlikely, but possible. I don't have anything to swap in.
The motherboard itself is causing corruption, possibly through a fault in the southbridge or SATA controller: I may have enough PCIe SATA cards to check that. Of note, the motherboard doesn't support 3TB drives correctly. If you run Windows on the board, it will not see the drives correctly. FreeNAS has no issue seeing the full capacity.
I recently changed the secondary SATA controller from RAID to AHCI. None of the pool drives are connected to it, however.
The power supply should be stable. I do have another unit I could try but have not yet done so.
Completely delete the pool and create a new one. I'd just have to save the system data etc.
I'm not sure what to do at this point. The motherboard and CPU have been stable for years. This was previously my main server and it never gave me any grief, except for Windows not seeing 3TB drives correctly unless they were connected to a PCIe card. (And no, the "latest" BIOS doesn't fix it. There is a non-OEM BIOS that does, but it causes instability.)