200TB-1PB Librarian is new here

Status
Not open for further replies.

mediahound

Dabbler
Joined
Mar 11, 2013
Messages
15
Howdy, hopefully my title got some attention.

I'm here because of ZFS. :P

I currently have or am involved in several projects, all of which are driving me toward FreeNAS and ZFS, but I'm more of a media librarian than a computer expert, so the whole process has so far been postponed: I have no idea what's involved, I don't want to become a sysadmin, I don't understand Unix, I worry about putting data into something I can't ever get back OUT of again, and so on. The three main projects are:
1) Personal Media Library. Sure, I collect data like lots of people here probably do, though primarily educational content like college lecture video recordings, and I want it for a home media server. So far this is tolerable on Windows; it doesn't have to migrate over, but there are advantages if FreeNAS works better.
2) Business Project: low-budget moviemaking and computer graphics, which will suck up terabyte after terabyte. For this one performance actually does matter, but it can be a totally separate system.
3) Historical Archiving. This has the potential to grow quite a bit. For instance, there are thousands of foreign-language medical videotapes and thousands of books and journals from China that I'm hoping to slowly scan in myself over the years, for digital preservation before they degrade any further; that alone might take a few years, since I'm not doing this for profit or full time, just for preservation. If I finish mine and get access to other people's libraries, I hope to scan those in as well. Future scanning/digitizing projects will be archives from Russia and Georgia, but I'm so overwhelmed with what I already have that I've stopped asking who else has material for the queue. For those curious, this is all part of an alternative medicine research project: the material from China is about Chinese medicine, and the material from Russia covers alternative medicine as practiced there. I might also mirror other people's archives on other topics, which would grow the database massively; those would generally be educational material on everything imaginable, similar to the 'CD of the 3rd world' project or the Appropriate Technology Libraries, but with video instead of text. Those first two, though, are what I'm trying to archive and preserve as being possibly within my grasp.

If I work out "a system" for easily digitizing videotapes and physically scanning books/journals, and for reliably archiving, integrity-verifying, and stopping further degradation of that material, I'm hoping to set up other people in foreign countries as well, some with little computer knowledge, to help with that. They could just be mailed extra hard drives when needed and periodically mail back a full one when they have new material to share, since they're in places where internet is slow. The whole archive has to be mirrored in several places, and organized so that 'chunks' can be easily distributed to those most in need: prioritizing the most worthy 4TB (or whatever the largest single drive size is) to mail to, say, someone with a laptop in Guatemala, since they won't have superfast internet to access it all online; it has to be mostly self-contained.

What I have so far is a snake's nest of drives hooked up to a Windows PC until I ran out of drive letters; a couple of hard drive crashes and some lost data later, I'm leaving certain things offline entirely until I work out a better system. :-/ The nightmares of sysadminning and maintenance, moving files from drive 12 to drive 7 to organize around limited space or a dying, crashing drive, are already the #1 impediment to anything else getting done. It's become apparent that growing the collection is pointless when I'm already losing data and having to re-scan material, sometimes for the 3rd time, despite having backups, because a few bits in the middle of a large file corrupted on both drives. I've created md5s, which helps, but creating PARchive files is so incredibly labor-intensive that it can take WEEKS to make parity files for a single drive, so I can't even keep up with that anymore.
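For reference, the md5 workflow I mean looks roughly like this; this is just a sketch assuming GNU coreutils (md5sum) is available and using made-up paths, since right now I do the Windows equivalent by hand:

    # build a checksum manifest for every file under the archive root
    find /mnt/archive -type f -print0 | xargs -0 md5sum > archive-manifest.md5

    # months later: re-verify the whole tree and show only files that
    # changed, failed, or went missing (no output means everything matched)
    md5sum -c archive-manifest.md5 2>&1 | grep -v ': OK$'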

So at the moment I'm seeking advice on how to proceed, including how to migrate data and get used to things.

Because of the thousands of hours of work that will go into either project 2 OR 3, backups are obviously waaaay important. From what I've researched, this may or may not be within ZFS's paradigm: easy migration of data to offsite storage, and likewise easy integration/importing of data. What I'd ideally want is a set of geographically distributed computers that can either add data by importing from a drive (for when bandwidth to/from the country is still expensive) or do realtime FTP-style syncing (for when bandwidth is cheap and uncapped). ZFS seems to be the king of preventing, detecting, and repairing data corruption within a single computer, but I definitely need solutions beyond one single-site computer, since that can be wiped out by one bad power surge, one flood, or one house fire. Most syncs will be sneakernet, mailing 4TB drives around, because neither I nor the people who may be willing to help have uncapped internet so far, but for certain very high-effort, important, or time-critical things we might want a priority/"push" directory that forces a sync/mirror right away.

If anyone has even the faintest clue how to do the above, please share, because I don't. :P But that's what the need is, and it might even break FreeNAS compatibility or require some radical workaround that makes something else easier to use. Cluster-level integrity (if that's the right term) matters to me more than individual-machine survival, which is where ZFS seems to excel most.


Budget is very important! :( Enterprise-level solutions without an enterprise-level budget, especially for the historical archive; I'm a college student trying to prevent the loss of things some people don't seem to care much about. If it's possible to save $600 by not buying a RAID card, that lets me buy 15TB more of storage right now. As close to 100% of the budget as possible needs to go to hard drives, due to the amount of data and the level of redundancy sought, both of which will grow over time. Expandability is important too: if I hit some hard limit on what one low-budget server can contain, how do I expand beyond that while still having it treated like a monolithic data set? Simply add a 2nd server with more drives? There has to be some known, easy-to-follow strategy so that no matter how fast the data grows I'll know what to do, and some kind of low-maintenance "I don't want to be a sysadmin" solution, so that if it just tells me to replace hard drive 4BB_X on rack 2, I can do it without worrying that much.

My job should mostly be to upload data into the array, add drives, replace drives, or replace whole motherboard/server units if a system starts failing or showing some kind of problem that's taking down one node of a cluster, and not have to worry about things like whether some virus corrupted files, or whether I accidentally deleted that subdirectory at 3am in sleep deprivation, or whatever. Much of the data probably won't even be sorted or translated for 10 years, but it has to survive undamaged until things like machine-assisted translation make the process a lot easier, including what will then be easy tasks like, I assume, converting a whole video to English, keyword-marking it, and making it searchable by time code.

Power use may become an issue. Right now most drives sit idle despite being on and connected. ZFS's desire to scrub the whole archive every week, spinning up 24 drives, would probably throw the breaker though. :P Are there any reduced-power modes or ways to structure the archive? I mean, if I'm accessing one file, is it only accessing the drive that file is on, rather than spinning multiple drives up and down just for that one file? Having powered-down archives, either drives or backup servers, is possible: something that turns on, mirrors everything, then turns off every two weeks or so (in between, a backup drive writes changed-file deltas, then the cycle starts over).

Actually the ideal may be to somehow compartmentalize lesser-used data: once it's uploaded, verified, and processed into some desired final state (which may take a while; a project may be open for months before I get the chance), stick it on a 3-4TB drive, mirror that to a backup, and shut both down, periodically rechecking to verify their condition but generally not accessing them much if at all; just shelved and kept. Yet we'd want to know exactly where that data is, what drive it's on, and when it was last checked, and have a separate backup of parity/restore-type data should anything have been corrupted on that drive in the meanwhile, even if both drives lost a few bits, etc., if that makes sense. Total power use IS an issue for a 24/7 file server with this many spindles. I'm already eating ramen and going freegan too much just to afford more hard drives as it is.

The total ultimate size of the archive, with others helping, could well reach a petabyte or more; it will at least reach 200TB. Any business/movie projects would probably reach 100TB without too much difficulty. Whether there's any advantage to merging all the projects into one server (personal/business/charitable), or whether it makes more sense to keep them separate, I'm open to suggestions too. Just one server means more convenience and less power, though having the ability to split it later or merge it back would be another plus.



Feel free to comment on any part of the above. I've read through a few getting-started guides and similar, but I'm already hitting issues; for instance, there aren't even motherboards with enough RAM to put all the drives on just one FreeNAS array if the rule is 1GB of RAM per 1TB of drive storage, and the additional cost of those that come close almost makes it cheaper to get two more consumer-level motherboards: maybe a primary machine (with fairly low data redundancy, equal to RAID6/two parity drives at most) and then a backup computer which is usually powered down, or which is scheduled to turn on, mirror, and turn off. Everything is still open for discussion, including going beyond the single-ZFS-machine level.

I particularly like using USB drives: no special server-class hard drive chassis to worry about, no special hot-swap support in the SATA connectors, no expanding the power supplies or exceeding the designed drive count (i.e. 16, 20, 24 drives) of the whole system, and similar. Also the ease of just pulling a bad drive and replacing it without a screwdriver, or just having all the drives sitting on a bookshelf with a big box fan blowing at them. If I'm low on space and all I have to do is stick a 4-way or 7-way hub off one of the ports to keep adding drives until I can afford a 2nd server, instead of hitting a wall of what a pair of RAID cards can do off the PCIe slots, that's a plus as well, since the performance needs are not excessive right now. I don't know if there's any ZFS provision for this (not from what I've read yet; maybe there should be?), but the ability to have a user connect or power up a given USB drive when told (mounted at a fixed position), to copy files handily off to any other connected drive, could well make sense. Something where the drive is still USB-connected but the power to it is off unless the system signals you to turn on that external drive to grab the files needed. (This would normally be a single-user system; I'm aware that couldn't possibly work in any other scenario.)


So can anyone help me understand which parts of the above FreeNAS/ZFS will DO, and which things I have to plan around, or at least know the difficulties and limitations of (like needing so much RAM), or where I will still need to learn about other software? I want to get an idea of the whole picture before I even start shopping for hardware, because this is being driven by the need for so many drives before anything else.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
WOW. Long post. Welcome to the forum. The best advice I think I can give you is to find someone with FreeBSD (or FreeNAS) experience who is willing to work on this project with you. There are a lot of people who get the "enterprise-class solution without enterprise-class costs", but there are also a lot of people who don't know the hidden details and suddenly lose their data, and are shocked until they post and one of the senior posters tells them what they did wrong. Considering the cost of 200TB+, the data obviously is valuable. I'd consider the cost of paying someone to build the system for you and maybe provide basic support. Typically 90% of the work is setting up the server.

As for your sneakernet option for updates, I don't have any super simple ideas for how you'd implement that except to do it manually. Generally, sneakernet isn't the "ideal" solution for backups; most people just do network transfers over their LAN. ZFS snapshots are the de facto standard for backups between 2 machines that both use ZFS: fast, easy to set up and maintain, and very reliable. Rsync can also work, but I'd question how useful it would be with very large quantities of data. I tried to migrate 20TB of data with it and it would have been extremely slow over gigabit LAN.
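As a rough sketch of the snapshot approach (the pool, dataset, and host names are made up, and you'd want to test on scratch data first), the cycle between two ZFS machines looks something like:

    # on the source machine: take a point-in-time snapshot
    zfs snapshot tank/archive@2013-03-11

    # first run: send the full dataset to the backup machine over ssh
    zfs send tank/archive@2013-03-11 | ssh backuphost zfs receive tank/archive

    # every run after that: send only the blocks changed since last time
    zfs snapshot tank/archive@2013-03-18
    zfs send -i @2013-03-11 tank/archive@2013-03-18 | ssh backuphost zfs receive tank/archive

The incremental sends are what make it fast: only changed blocks cross the wire, not whole files like rsync re-reads.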

If you can find someone who has a vested interest in using FreeNAS (perhaps for a resume bullet) and also has FreeBSD experience, they may do it for you cheap or free. I wouldn't expect that to include support after the server is set up and running, though. You really should find someone with IT experience (and FreeBSD/FreeNAS) to maintain the system. No server system I know of tells you when to replace a drive; it tells you that a drive has errors (leaving you to decide if/when to replace it) or that a disk is missing. The admin is left with the responsibility to determine the correct course of action and to implement it.

If there were a file server that would just spit out "replace disk 4", many admins would be instantly jobless. It's not always as easy to identify a failed disk as you might think. I'm not sure where you are, but if I lived locally I'd almost want to be involved just for the experience; it's not often you have the opportunity to build such a large server with someone else's money. I know I can't afford a 200TB+ server. There are unexpected issues that never come up on a small server (mine's "just" 36TB), but when you scale up to really big sizes you have new issues to deal with. I've done work for small businesses and for friends and relatives just for the experience of setting up something I hadn't actually done before. Small business wants to build a new office with fiber in the walls, a new domain, and new workstations? No problem!

I currently have or am involved in several projects, all of which are driving me toward FreeNAS and ZFS, but I'm more of a media librarian than a computer expert, so the whole process has so far been postponed: I have no idea what's involved, I don't want to become a sysadmin, I don't understand Unix, I worry about putting data into something I can't ever get back OUT of again, and so on.
Those concerns are healthy, completely normal, and completely accurate! The forum has had many people who work in IT lose data because they didn't understand ZFS, FreeBSD, FreeNAS, etc. The fact that you don't want to be a system admin pretty much implies that you must give that responsibility to someone else.

Good luck!
 

mediahound

Dabbler
Joined
Mar 11, 2013
Messages
15
Thanks for the responses... I was familiar with Backblaze already and originally wanted to design around something similar, but since the cost of USB drives has either kept pace with or even beaten bare SATA drives, combined with everything else (like less need for mega power supplies and such), I'm really hoping to do a USB solution unless the nature of the drive market radically changes. The cost per port just seems to be a win right now, and things like 16-port RAID cards, and even chaining the SATA port multipliers like Backblaze used, are fairly expensive compared to the simplicity of a USB hub. Something like 36 drives on one PC I assume is doable, UNLESS the sheer overhead (like RAM requirements; at "1GB of RAM per terabyte" I'm certainly not going to have 200 gigabytes of RAM) makes it work less well.

I am content with using ZFS on "working systems" and then mirroring it on a non-ZFS system that I'm more familiar with, like standard Windows. What I'm doing so far is working (sort of) on Windows; it's just that the overhead and annoyance of all the partitioning of volumes, space on a drive running out and screwing up easy organization by subdirectory, making md5s, and similar is becoming a serious hindrance. :-/

The budget of paying a sysadmin definitely won't work, and the other constraints of the project pretty much mean anyone else contributing will likely be a single user. Help and advice from, say, people here would be welcome, but I'd likely have to do it all by myself; if I'm building a system in rural China and connecting it to the internet, I probably won't be able to find any FreeBSD users there. The best I'd be able to do is tell a friend what to type on the keyboard, or say "hook up the replacement drive like I showed you". This isn't just about my data; it's about figuring out a system I can teach others to run to help with the project. If I can't make a turnkey, low-maintenance solution work, then I'll have no choice but to stick with Windows and redundant drives like I'm doing now. :-/ I've already shown a few others how to do what I do; I've only lost data myself by not fully following my own 'standards' at all times, or because I hadn't figured out the system yet.

That said, I already "sysadmin" the Windows system; I'm just hoping (and was told) that ZFS should be less work than making it all work on Windows is, once you include the overhead of endlessly migrating files around to organize and reorganize and such. I was just hoping that ZFS/FreeNAS could ease things a lot... the impression I had is that it's production-ready (it doesn't lose data when properly configured), just not proof against 'user errors' of sloppy configuration. If it's not production-ready and risks randomly losing data, or reporting data as securely written when it isn't, please let me know! :-/ Ensure I don't screw up with configuration errors, and in theory it should be fine? I'd hope to publicly show what I'm doing here and ask people whether there are any holes in my strategy. My hope is to start with a small part of the data on ZFS (and probably not exclusively on ZFS) and, if it proves to work well, expand radically as I go, migrating from Windows to ZFS even for the backup storage. Or I could make Windows the production system and back up to ZFS, but I really want that end-to-end integrity verification from the moment data is created, even if I treat the backup as a "catastrophe protection" system I normally don't intend to access.

In short, can someone summarize more about WHEN ZFS failures have been occurring on recent systems/builds, even when people have followed, say, the configuration advice in the HOWTOs? I'm hoping true backups (i.e. "RAID is not a backup", which I take totally to heart), kept disconnected from the power/wiring normally and in physically separate locations, SHOULD protect against any conceivable problem. The only thing I can conceive of is some kind of ZFS-caused error propagating to the backups as well: falsely telling me data has integrity or is safely written when in reality it's not; some kind of filesystem-level metadata corruption silently corrupting the data (the very thing ZFS is meant to protect against, including viruses) so that even copying it to the backup drives or running a filesystem compare/synchronization is irrelevant, because it's already been damaged, or the damaged data propagates and overwrites the good data on the backup. Assuming multiple physically independent ZFS-based systems, normally disconnected from one another and in physically separate locations for catastrophe tolerance, I need to know what failure modes I'd be at risk of.


EDIT: I'm also planning on having separate physical NAS boxes if needed; for instance, archival data which is already scanned in generally doesn't need to change, so it could be migrated to a set of drives which is simply put on the shelf and pulled out for checking once every few months (bit rot should be even rarer if the drives aren't spun up at all, I'd assume). Also, the NAS boxes for archival could be separate from the higher-performance boxes for the film/cinema project. Sorting out very general strategies like that comes before I even spec out a single individual NAS box. I just want the ability to easily merge and integrate multiple boxes into one complete dataset in the future when drive prices come down, e.g. when 12TB drives are common and cheap, maybe in 5 years. I'd also like the ability to search the entire dataset as if it's a monolithic whole right now, if it's all powered on at the same time on the same network. (If I'm understanding the terms: to treat as a single volume a set of data that actually resides on four separate but local NAS boxes and something like 30 drives, including parity/hot-spare/backup-type drives.) Am I making any sense, or am I missing the boat? :-/

EDITEDIT: Starting from zero, I'm seeking public suggestions and recommendations on the best way to meet the following criteria:
- A dataset which is expected to grow from at least 70TB up to 200TB or even 1PB in the long run
- Ease of expansion: none of this "drive Y is full, so we need to add drive Z... then we have to upgrade drive G from 1.5TB to 4TB and move ALL the data back onto drive G" drive-letter juggling that I'm forced to do right now. Just throw new drives at the pool as budget allows, for both expansion and hot spares, and pull known-bad drives for warranty claims or junking, etc. (see the sketch after this list).
- Minimum cost per port (so the budget can be spent on just acquiring drives) while keeping full hot-swap ability. SATA can do this but often involves special chassis and similar, adding cost, from what I've seen; USB supports it natively and automatically, assuming I can configure the system to use USB properly.
- Performance for the archival box is NOT highly important; assume a 15 megabyte/sec minimum guaranteed transfer rate under all conditions, even slower than USB 2.0, which I don't think is expecting too much. I do want that to include rebuilding mirrors, resilvering new drives, and possibly sending backup data out a 2nd gigabit Ethernet port with whatever drive bandwidth is spare (it doesn't have to max that connection). That rate would be ideal, since things like multiple video-capture cards digitizing videotapes could readily produce data up to that amount. Let's design the slow box before we work on the complicated 100MB/sec cinema/film NAS.
- Plan for triple redundancy of the dataset as a rule, meaning physically separate locations just synced via IP or whatever. That means even total data loss on a single server running ZFS doesn't worry me too much, as long as ZFS itself will not somehow corrupt data while mirroring to the other sites' backups. I'm hoping this makes things easier, not harder, because a 'perfect' NAS box isn't the goal, just one with a tolerably low failure rate.
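For the expansion bullet above, my understanding from the docs is that growing a pool is basically a one-liner; someone please correct this sketch if I have it wrong (the pool and device names are made up):

    # see the current layout and free space
    zpool status tank
    zpool list tank

    # grow the pool by adding another mirrored pair of drives
    # (note: a vdev can't be removed again once added, so double-check first)
    zpool add tank mirror da10 da11

    # swap a failing drive for a fresh one and let ZFS resilver it
    zpool replace tank da4 da12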
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Wow, your posts are just too darn long to try to respond to all of your questions. At least your English is great and you are making sense. :)

The problem I see is that IF you have your master files on Windows, THAT is where you need to worry about corruption and bit rot. You will never know for certain that every file you back up from Windows to ZFS hasn't already had some type of silent corruption. I've noticed this with several JPG files, and it makes me wonder what other types of files have corruption that I can't see.

The other thing to consider is that, in the very worst case, there have been people who have lost their pools for whatever reason (most of the time I've seen it, it's been some type of user error or misunderstanding of how things work), and there are NO data recovery applications available for ZFS. I'm not even sure that data recovery services understand how to recover a damaged ZFS filesystem. So making sure you don't propagate bad data to your redundant backups is another thing you can never be sure about, unless you have some way of checking all of your files before every backup. I think once you have known-good files on ZFS, there is a lot less risk of problems.

I'm not aware of any ZFS-related bugs that have propagated to backups. I suppose if you really want to be safe, you could back up by 2 different methods, one with "zfs send" and then to the other NAS with rsync, and then compare them?
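Roughly like this, as a sketch (the hostnames, pool names, and paths are all made up):

    # method 1: block-level replication to the first backup NAS
    zfs snapshot tank/archive@weekly
    zfs send tank/archive@weekly | ssh nas1 zfs receive -F backup/archive

    # method 2: file-level copy to the second backup NAS
    rsync -a /mnt/tank/archive/ nas2:/mnt/backup/archive/

    # compare: dry-run rsync with full checksums; anything it lists
    # differs between the source and that backup
    rsync -rcnv --delete /mnt/tank/archive/ nas1:/mnt/backup/archive/

Since the two backups were made by completely different code paths, a bug in one is unlikely to show up in the other.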

I'm not completely clear what you were trying to convey about USB drives. I think USB would be slow and has its own risks of data corruption. If it were my data, I would avoid USB completely.



Anyone else care to jump in here?

- - - Updated - - -

Also, this has turned into a discussion that really doesn't belong in the introductions section of the forums. Unless someone thinks it belongs somewhere else, I'm going to move it to Off-Topic (it's kind of a grey-area topic).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The problem I see is that IF you have your master files on Windows, THAT is where you need to worry about corruption and bit rot. You will never know for certain that every file you back up from Windows to ZFS hasn't already had some type of silent corruption. I've noticed this with several JPG files, and it makes me wonder what other types of files have corruption that I can't see.

This... exactly. I've had some files with garbage in them: pictures, Word docs, whatever. The corruption caught me off guard every time, because there wasn't any apparent reason for it. I'm somewhat convinced it was due to issues beyond my control, and that ZFS will help rein in those problems.

To be honest, if you are so untrusting of FreeBSD/FreeNAS, or don't feel comfortable enough with it to trust it with your data, I wouldn't try to build a system with it at all. ZFS is an amazing file system, but if you have hesitations about properly managing your server, you can make horrible mistakes that wipe out data without knowing it. This is why earlier I said:

I'd consider the cost of paying someone to build the system for you and maybe provide basic support. Typically 90% of the work is setting up the server.

If you can find someone who has a vested interest in using FreeNAS (perhaps for a resume bullet) and also has FreeBSD experience, they may do it for you cheap or free.

If you aren't an IT guy, you are really, really in over your head. The forum has plenty of "IT guys" who have had their butts handed to them (and plenty with lost data to show for it). Once you start talking >80TB, you have a 100% chance of data corruption somewhere. If it's a single file, that sucks, but you won't know where the corruption is on Windows. If it's the file system... well, good luck with that. You'd better have excellent backups.

People pop in here from time to time wanting very, very large servers for cheap. The two are mutually exclusive: you either get very large or you get cheap. This is why system administrators have a job. Just like with a car mechanic, sometimes you just have to pay the mechanic and be glad your car runs. For your size, I don't consider Windows an option at all. (Hint: Microsoft doesn't consider NTFS an option either!)

There's a calculation somewhere showing that if you filled a 4TB drive completely full of data, you WOULD have corruption somewhere that wouldn't be apparent unless you examined the data and knew it was wrong. (Consumer drives are typically specced at one unrecoverable read error per 10^14 bits, and 4TB is about 3.2×10^13 bits, so reading a full drive has very roughly a one-in-three chance of hitting one.) There is no detecting and correcting that error unless you have ZFS or you implement your own detection (par2 or whatever). I'm running just 30TB at home and I was worrying about silent corruption from my Windows server. Now I run ZFS and I sleep better at night.

tl;dr - Pay some admin to help you, or just accept the data loss as the cost of (not) doing business. You aren't going to get the price point you want at the size you want; that's a fact of life. Windows has no valid file system solution: Microsoft has said that large volumes shouldn't be used on NTFS, and they currently have no "good" solution for large volumes, nor any way to find silent corruption, let alone fix it. It's obvious your current strategy is not sustainable, but you (or your organization) don't seem to be committed enough to pay to keep the data around. Somewhere, something is gonna give. The organization will have to pay, or the data will. Pick one and pick wisely.

Edit: If you look above, I said "Good luck!" because I knew exactly where this thread would go. These kinds of threads never end well for the OP, because the OP doesn't really know what he wants (no offense). It always goes the same way: organizations don't want to drop 5 or 6 figures to build and maintain a system, but somehow rationalize the cost savings as "necessary". Same sh*t, different organization. Nothing personal, but organizations need to either say "we want this data and are willing to pay for it" or stop trying to store the data. Either the data is valuable or it isn't. If it's valuable, you'd pay for it. If it's not, you won't. Simple as that.

Anyway, I think I'm done trying to provide help on this topic. Until the organization is willing to jump on board with this with both feet, there really isn't much to discuss. Once they are willing, get an IT guy who wants to do this as a side job, and you won't be asking any of these questions, because your IT guy will know the answers (or know how to find them). Hint: any IT guy who recommends Windows (and by definition NTFS) for something even as big as 50TB is not worth his salt and should be fired.

I don't think anyone in your organization has realized that if you have any file system corruption on Windows, all you can do is run chkdsk and pray it doesn't hurt too badly. On ZFS, any kind of corruption is repairable, as long as you build smartly.
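And "repairable" isn't hand-waving; on ZFS the whole detect-and-fix cycle is two commands (the pool name is made up):

    # read every block in the pool and repair anything whose checksum
    # doesn't match, using the redundant copy/parity
    zpool scrub tank

    # afterwards: shows checksum error counts per disk and, with -v,
    # names any file that was damaged beyond repair
    zpool status -v tank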
 

mediahound

Dabbler
Joined
Mar 11, 2013
Messages
15
The problem I see is that IF you have your master files on Windows, THAT is where you need to worry about corruption and bit rot.

I probably have a few rotted bits already: a slightly distorted JPEG here, a moment of glitched DivX video there. I'm just trying to prevent it getting worse; nothing I have so far is life-or-death critical. About half of it has had md5s made; about a third of that was either archived in RAR with parity files or had PAR files made. But I'd like that corruption to be the last I have to deal with. I'd like new data to go straight into a ZFS array so I can trust it more, now that I know what a problem silent corruption really is. :-/

As for data recovery and such, the big question is: do I back up to a 2nd ZFS NAS box, or simply to a Windows box? The latter is corruptible but more salvageable.

I'm not completely clear what you were trying to convey about USB drives. I think USB would be slow and has its own risks of data corruption. If it were my data, I would avoid USB completely.

Actually, neither am I. :) The system requirements say "SATA", so I was confused... does that mean I can't use IDE, SCSI, or USB drives? USB 3.0 is faster than SATA II speed, and my performance need of 15MB/sec is not excessive. Near as I can tell, USB is no less reliable than SATA, because an external drive is just the same SATA drive you'd have internally, behind an external interface and power supply. I want to avoid the hassle of de-casing the drives if possible, and to have the ability to use sets of USB drives (like four drives in one zpool) to send mirrors of data around via sneakernet. But if FreeNAS specifically disallows the use of a USB drive, obviously that won't work. :(
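The sneakernet part, as far as I understand the docs, would be ZFS's export/import; here's a sketch with made-up device names, and someone tell me if USB breaks this:

    # build a small mirrored pool on two external drives
    zpool create sneaker mirror /dev/da1 /dev/da2

    # ...copy the priority data onto it, then cleanly detach it for mailing
    zpool export sneaker

    # on the receiving machine, after plugging both drives in
    zpool import sneaker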

To be honest, if you are so untrusting of FreeBSD/FreeNAS, or don't feel comfortable enough with it to trust it with your data, I wouldn't try to build a system with it at all.

Gotta learn sometime. :) That's why I said (I think more than once) that I want to start getting into ZFS, not that I'm planning to monolithically change everything overnight. And anything on the ZFS system would also be backed up in Windows format for now.

If making a single box with a lot of space is too hard, multiple 32TB NAS boxes will do fine for now, until the frontier works itself out a bit more. I'm just trying to explore the upper tier... once I get more confidence in understanding how it works, I might build a 2nd box a bit larger, like 48TB.

I've already had notable problems with data corruption on my current dataset, which must be pushing 70TB on Windows. The md5s, the RAR copies with restoration data, and the PARity archives are all workaround hassles which solve the problem but take more time than I want to keep putting in, because obviously once the dataset doubles I'll probably spend three times the time managing all the problems. :-/ If FreeNAS/ZFS is more of a sysadmin hassle rather than less, that doesn't make sense; if the real problem is just me pushing too far into the frontier, help me spec out what can safely be designed as a pretty much known quantity, and I'll just replicate that for now.

It's obvious your current strategy is not sustainable, but you (or your organization) don't seem to be committed enough to pay to keep the data around.

The money has to come from somewhere. :-/ So far it's my pocket, for historical preservation, for things I won't ever even make a profit on. I'm just like the librarian who sees them hauling books to the dumpster and in a panic loads up the pickup, rents a storage garage, and has them sitting in boxes on the floor, probably getting damp or whatever, but it was that or nothing. I'm just trying to do what I can with the limited finances and commitment I have available. It's possible I'll just have to scale back certain things, or give up if I fail to find the right people to help me with the project, but insofar as I can on my own, I'm trying to preserve it in the best quality feasible for now.

Anyway, I think I'm done trying to provide help on this topic.

I'm a little confused at this. :-/ There is no "organization" with funding, just me hating to see knowledge wasted that others don't seem to care much about, while accepting that it's probably just going to end up that way. It's not about people being cheap; the money DOESN'T EXIST, and the data is already degrading in analog formats, or soon going to end up thrown in a storage garage with no environmental control. I do the best that I can, or I do nothing at all, wipe the drives, and just fill them with entertainment data, not caring. Those are basically my choices.

If what I want is considered beyond the frontier, then just help me understand what is within the frontier. Several times I've asked "well, what CAN I do?", like several 32TB boxes independent of one another, but I'm not getting much input there either. The only question is which sucks worse: using Windows with PAR/md5 and redundancy, or trying to learn FreeNAS. If I can't get the advice I need, it will just force me back to Windows, and I'll have to seriously scale back the addition of new data and work more on the preservation of existing data.


I'm not really sure moving it to a different forum helps either, even if it's an intro that sort of turned into a discussion... :-/
 