dealy663
Dabbler
- Joined
- Dec 4, 2021
- Messages
- 32
Hi
I've been having some instability problems on my TrueNAS SCALE system. It was crashing pretty infrequently (2-3 times over the past year, up till last week). I updated some hardware (added a GPU) last Friday. Then on Saturday morning it crashed again, but this time when it came back up one of the drives had failed: a single drive, 1 year old, 7200 RPM, 12 TB. So I thought, aha, the drive was flaky and causing the system instability. I removed the drive, proceeded with the GPU install, and got my system back up and running again yesterday. During the process there were many reboots as I was trying to figure out passthrough. Then today I saw a message in the logs indicating that another pool was degraded and one of its mirrored drives was offline. I shut down the machine, checked all the connections, and rebooted, and then that mirrored drive reported that it was OK. A bit later a different mirrored pool reported that it was degraded with one drive offline, while the first pool was reporting 0 errors.
In my frustration I shut down the machine and shook my head, and now I'm sitting here asking if anyone has seen this type of behavior before. How can different drives in different pools develop problems like this? I'm pretty sure the first single-drive failure was real, but the pool failures are hard to explain or understand. I'm running 32 GB of ECC RAM and host 3 apps (TrueCharts, Pi-hole, and Plex). I have 2 VMs: one is the new one with the GPU; the other doesn't put much strain on the system.
I'm running Bluefin, having upgraded to it last weekend. The system crashes started before that, but the first real noticeable drive failure came after the upgrade to Bluefin.
Any suggestions would be appreciated.
Thanks, Derek