James Snell
Explorer · Joined Jul 25, 2013 · Messages: 50
Disclaimer: I haven't bothered to study the architectural details of ZFS. As a user of the file system, I shouldn't really need to know the details either.
I've long operated a wide range of personal servers on non-ECC memory and never had a problem I could even remotely chalk up to a memory error. I could just be lucky, but back in the day I ran a ~6-8 disk Linux LVM with zero redundancy, holding my massive personal media archive, and it never once failed me. I added and removed drives maybe once a year, and replaced the motherboard, CPU, memory, etc. many times too. No data problems, ever, all on consumer-grade hardware. Granted, I favoured Intel for everything, but I never bought ECC memory, due to cost (and being a kid for at least part of that period). Still, I've never had a problem I could remotely trace to memory faults.
I'm now operating a FreeNAS box at home with a mirrored ZFS pool, on non-ECC memory. After skimming various discussions that basically call anyone who doesn't run ECC a "n00b", "idiot" or whatever, I'm feeling like juggling my hardware around, since I can shuffle my gear so that my FreeNAS ends up on a machine with all sorts of "server-grade" glory.
My statement: I'm annoyed by my (mis?)understanding that ZFS is supposed to be very fault tolerant. I'm sure it is in various ways. But being intolerant of memory errors to the point of failing entirely (whole-zpool losses) makes this file system sound like a cool academic toy to me, not something I feel inclined to rely on. What I don't know for certain is how other file systems handle the same sorts of failures. I've read that UFS is "more stable" than ZFS, whatever that means. I've used ReiserFS and ext2/3/4 for many, many purposes, some in extremely fault-intolerant (and insanely expensive) situations, always on non-ECC memory. My only problems were bottlenecks or the weird physics of moving HDD platters (2.5" mechanical drives will never work in helicopters, take my word for that).
Suffice to say, I've lived and breathed professional Unix programming and hardcore hobbyist Linux sysadmin work for over a decade, and this notion of risking everything by not running ECC is really distracting. Am I the luckiest jerk on the planet? Why hasn't this ever come up for me?
It's not like running on ECC hardware is complicated. Heck, I happen to have some great hardware of that sort lying around, so I have options personally. But it costs a lot more, so massive droves of users out there can't or won't afford it, especially with pasts like mine. Am I totally fooling myself that this only applies to ZFS? I'm sure enough untimely failures will trash any system, but in practice, does this sort of thing really happen? Really? (*shakes you, the reader, violently, desperately*) No seriously, does it ever happen in the actual real world? Tell me a freakin' actual story!
I think any seasoned computer user can agree there's no substitute for complete, regular, and multiple backups. That said, the only times I've ever used my own backups were the one occasion a hard drive's PCB died and the rare occasions when someone else deleted data they later decided they wanted.
Perhaps I've lived a profoundly, remarkably sheltered life, and boasting of my experiences could lead others to their demise (and perhaps my own)?
It seems to me that these sorts of errors are so incredibly rare that it's feasible to build software solutions. It's remarkable how much computing power I can get for $300, but I doubt a new motherboard, CPU, and stick of ECC memory will ever fall below $500. I would have thought some BSD/Linux kernel geek would have written a memory-management module that redundantly stores and soft-checks everything in memory, reducing the situations where ECC makes a real difference. Sure, a soft approach is an inefficient use of the gear. So? Don't you have better things to do with your time? People use fancy ATmega CPUs just to drive blinky lights (Arduinos); I bet those chips have more guts than the Apollo spacecraft needed. Efficiency is also measured in the effort it takes to get a thing working for your needs. So use your hardware inefficiently if it buys you +1% stability on something you really depend on.
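To make the idea concrete, here's a minimal user-space sketch of what I mean by "soft-checking": keep a shadow copy plus a checksum for each buffer, verify on every read, and heal the primary from the shadow when it goes bad. All the names here (soft_buf, soft_alloc, soft_write, soft_read) are made up for illustration; a real kernel-level module would obviously need far more than this.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Each buffer carries a redundant shadow copy and a CRC of the data. */
typedef struct {
    size_t   len;
    uint8_t *primary;  /* working copy, handed out on reads     */
    uint8_t *shadow;   /* redundant copy used to repair         */
    uint32_t crc;      /* checksum sealed at write time         */
} soft_buf;

/* Plain bitwise CRC32 (slow, but dependency-free). */
static uint32_t crc32_calc(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)(-(int32_t)(crc & 1)));
    }
    return ~crc;
}

static soft_buf *soft_alloc(size_t len)
{
    soft_buf *sb = malloc(sizeof *sb);
    if (!sb) return NULL;
    sb->len     = len;
    sb->primary = calloc(1, len);
    sb->shadow  = calloc(1, len);
    sb->crc     = crc32_calc(sb->primary, len);
    return sb;
}

/* Write the full buffer: duplicate the data, then seal it with a CRC. */
static void soft_write(soft_buf *sb, const void *src)
{
    memcpy(sb->primary, src, sb->len);
    memcpy(sb->shadow,  src, sb->len);
    sb->crc = crc32_calc(sb->primary, sb->len);
}

/* Verify before handing data back; repair from the shadow if the primary
 * no longer matches its CRC. Returns 0 on success, -1 if both copies are
 * corrupt (detected but uncorrectable). */
static int soft_read(soft_buf *sb, void *dst)
{
    if (crc32_calc(sb->primary, sb->len) != sb->crc) {
        if (crc32_calc(sb->shadow, sb->len) != sb->crc)
            return -1;                              /* both copies bad  */
        memcpy(sb->primary, sb->shadow, sb->len);   /* heal from shadow */
    }
    memcpy(dst, sb->primary, sb->len);
    return 0;
}

int main(void)
{
    soft_buf *sb = soft_alloc(16);
    soft_write(sb, "important bytes");
    sb->primary[3] ^= 0x40;   /* simulate a single bit flip in RAM */
    char out[16];
    printf("read %s after bit flip\n",
           soft_read(sb, out) == 0 ? "repaired" : "failed");
    free(sb->primary); free(sb->shadow); free(sb);
    return 0;
}
```

Doubling memory use and checksumming every read is obviously wasteful, but that's exactly the trade I'm arguing is acceptable on cheap hardware.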
As long as running ZFS practically requires doubly expensive hardware, it'll be an incomplete file system, and woeful tales of misery will be associated with its use. That's especially true when it's bundled with free software bearing names like FreeNAS and NAS4Free: people are going to run such software on the throwaway hardware they already have. If they have to run it on >$500 equipment, they'll just go buy a dedicated hardware NAS unit that has everything set up for them. FreeNAS' recent categorization of UFS as a legacy feature seems to only further provoke this trajectory of user dissent.
Hardware faults will happen. Effectively losing terabytes of data over a potential single-bit error seems remarkably useless. Maybe I'll switch over to UFS with FreeNAS, even with my ECC server in play.
Now, let the crap-fest of hateful responses commence...