More TrueNAS Testing with some questions...

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
(my questions are at the bottom of this post if you wanna skip my discussion)

I've got TrueNAS installed in a VMware Fusion virtual machine (Fusion updated to the latest version as of today). I then created 4 virtual hard drives on a virtual SATA bus just to get a feel for how the NAS works.

I went into Pools, went to add a pool, and the first thing I noticed was that TrueNAS offers drive pooling based mainly on function rather than RAID type ... then I realized that TrueNAS only seems to offer ZFS RAID. While this isn't a problem, it was interesting nonetheless...

I looked at the manual options for creating a pool, and the one that really piqued my interest was the dedupe option, though any config with fewer than two drives for dedupe gave me a warning saying the configuration was not safe - a single drive in the dedupe area raised the red flag, while two drives didn't.

So I opted to just go with the default recommendation using all four drives for the pool, which I believe was RAID-Z3?? Not sure ...

Next, I created a volume that I intended to share for an iSCSI test and was surprised to see that it doesn't like it when you try to make the size of the volume larger than 80% of the available storage - which I could override, but I'm looking to test RECOMMENDED configurations here ...

I then went into Shares and created an iSCSI target using the volume I created, initiated a connection to the target from my MBP, then formatted it with APFS.

Then I transferred some files just to see how everything worked, and though I was only getting 6 MB/second data transfer, that was to be expected since the VM is using the same NIC (single Gigabit via a USB-C NIC adapter) as my Mac and it's all running on the same machine ... those transfer speeds seem reasonable to me.

Initial thoughts are that the whole process was very straightforward, and TrueNAS made it easy to use the "I don't know what I'm doing, just do it for me" method while still providing the option to configure at least some of the more detailed parameters involved in each step. Though I think it would be a nice feature for less technical users to offer some kind of a "wizard menu" with canned configurations like "take these raw drives that aren't configured yet, use as much space as you can while offering single-drive redundancy, and give me an iSCSI target I can connect to" ... something along those lines ... but that's just my thoughts on making things even more user-friendly.

Here are my questions:

1)
I haven't read up on it yet, so even a link to where I can learn would be great, but when I saw DEDUPE in the pool options, I started salivating. The idea that I could create a space to write data to where the file system ensures that no duplicate data is ever written while still presenting the illusion that it is: i.e., if I have a file in one folder and later end up copying that same file to another folder, it would still appear to me to be in two different places, yet the back-end file system only keeps a single copy of the file ... if THAT is how deduplication works, then that is what I want. But what threw me off was the need to add drives to the dedupe config. I obviously don't understand how deduplication is implemented in TrueNAS, and my thoughts on how it works are apparently not in harmony with how it actually works. Can someone who understands that technology enlighten me?

2) Why does TrueNAS prefer that we NOT create volumes that utilize more than 80% of the available space in the file system? What am I missing when I assume that I should be able to use all of my drive space for shared volumes?

3) Is it considered "best practice", or is it even any sort of issue, how I choose to format an iSCSI target? I went with Apple's default of APFS, but I'm wondering if there are caveats to choosing one format over another.

4) Does TrueNAS subscribe to the notion that when creating a RAID volume, ZFS is the only way to go, and if you want any other type of RAID, then you need to provide your own hardware card for that or simply install third-party software that will create different kinds of RAID?

I haven't messed with a NAS in almost 10 years except for some Synology boxes a few years ago, so obviously I've missed out on NAS implementation "theory" or "best practices" over that time. I'm willing to read up on things on my own, but I always like to ask those who are involved for the best sources to find that knowledge, which is mainly why I created this post.

Thank you,

Mike
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Does TrueNAS subscribe to the notion that when creating a RAID volume, ZFS is the only way to go, and if you want any other type of RAID, then you need to provide your own hardware card for that or simply install third-party software that will create different kinds of RAID?
I can answer that one quickly; the rest will take more time: TrueNAS is a ZFS appliance. If you don't want to use ZFS, TrueNAS is not the software for you. But why would you want to? ZFS is superior to every single other filesystem as far as the safety of your data is concerned. I run an entire data center worth of ZFS filesystems and I am not looking back. Specifically, don't even think of using "hardware RAID" with TrueNAS. You can still only put ZFS on top of that, and you are not gaining anything. Better to let ZFS handle all the disk drives.
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
I can answer that one quickly; the rest will take more time: TrueNAS is a ZFS appliance. If you don't want to use ZFS, TrueNAS is not the software for you. But why would you want to? ZFS is superior to every single other filesystem as far as the safety of your data is concerned. I run an entire data center worth of ZFS filesystems and I am not looking back. Specifically, don't even think of using "hardware RAID" with TrueNAS. You can still only put ZFS on top of that, and you are not gaining anything. Better to let ZFS handle all the disk drives.
I am particularly interested in the claim that ZFS is "self-healing"... I wonder how that has worked out in real-world usage. Does it keep a log of what it "heals", and how does it know when data needs to be fixed - and, more than that, how does it know how to fix it? I'm assuming that since you can lose a drive in a RAID-Z3, it must use some kind of parity calculation to determine what the data is supposed to be ... but does it take that concept down to the block level somehow, so that it can constantly ensure data integrity even when a drive isn't failing? Seems like more of a logical level of data awareness rather than just raw hardware failure protection.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The ZFS primer is probably a good start:

Yes, ZFS keeps checksums of everything at the block level. So even if you lose more redundant copies than the system can cope with - as long as it's not entire disks - it will tell you exactly which files are damaged beyond repair and which aren't.
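For example, a scrub reads every block and verifies it against its checksum, repairing what it can from redundancy, and the status output afterwards lists any files with unrecoverable errors. Roughly (the pool name "tank" is just an example):

zpool scrub tank
zpool status -v tank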
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
You're apparently curious about the deeper technical details of ZFS. Start here: https://openzfs.org/wiki/System_Administration
Thank you, that was an excellent read. I'm beginning to realize that my questions don't have simple answers - which seems to be because ZFS can be implemented in many different ways, and should be, depending on the function needed and the hardware provided.

I like that it was designed so that the file system can be tuned for scenarios as they are encountered. It doesn't try to be "one size fits all" - or, perhaps better stated, it is not "one configuration fits all"... Given the variety that seems to exist in storage hardware in terms of block sizes, caching, and other factors, it's nice to have a file system that can be tailored to one's specific environment.

I wasn't aware of this issue where drives are configured to endure long timeouts when they can't read a block ... I have to assume that most file systems can't or won't perform that task of retrying a block read, which is why it ended up in drive firmware and, in some cases, as a configuration that cannot be modified.

This definitely is going to make the selection process for drives a bit more involved.

Very interesting ...
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Why does TrueNAS prefer that we NOT create volumes that utilize more than 80% of the available space in the file system? What am I missing when I assume that I should be able to use all of my drive space for shared volumes?
But what threw me off was the need to add drives to the dedupe config.
I'm not sure exactly where you're seeing this. Dedupe is a property of a pool (which consists of one or more vdevs, which in turn consist of one or more drives), or of a dataset. But unless you have lots of RAM (estimates of 5 GB per TB of disk space are common), dedupe isn't the droid you're looking for.
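To put that estimate in perspective, just scaling the 5 GB per TB figure: a 4 TB pool would want roughly 20 GB of RAM for the dedup table alone, 10 TB roughly 50 GB, and 20 TB roughly 100 GB - and that's RAM competing with everything else ZFS would like to cache.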
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42

I'm not sure exactly where you're seeing this. Dedupe is a property of a pool (which consists of one or more vdevs, which in turn consist of one or more drives), or of a dataset. But unless you have lots of RAM (estimates of 5 GB per TB of disk space are common), dedupe isn't the droid you're looking for.
... and I just now finished reading the dedupe overview in the TrueNAS documents ... very discouraging indeed.

I was under the impression that deduplication was (stated in simpleton terms) the file system storing some kind of unique hash signature for every file stored on the volume, and when a file came in that had the same signature as a file that already exists, it would not be committed to block-level storage, though it would still appear in the file tree wherever the user placed it.

OBVIOUSLY, my preconceptions about how deduping actually works in practice were more akin to a pipe dream than a plausible scenario...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
the file system storing some kind of unique hash signature for every file stored on the volume, and when a file came in that had the same signature as a file that already exists, it would not be committed to block-level storage, though it would still appear in the file tree wherever the user placed it
Well, that isn't too far from the truth, although it's block-level rather than file-level. But that requires indexing, and that indexing requires RAM--lots of RAM.
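If you do want to experiment with it on a test pool, dedup is just a dataset property, and the pool can report the size of that index (the dedup table, or DDT). Something along these lines, with placeholder pool/dataset names:

# enable dedup for a single dataset; only data written after this is deduplicated
zfs set dedup=on tank/mydata
# confirm the property
zfs get dedup tank/mydata
# show dedup table statistics for the pool
zpool status -D tank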
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
Here's another general question with a brief history to it ... I wrote an app in Java a few months ago that walks an entire folder tree from the point of entry down to every last branch, and the purpose of that was to keep an offline MySQL copy of my external data volume's file table so I could do fast searches for files, etc. Using Spotlight proved to be a negative with large amounts of data and literally millions of files. Spotlight was CONSTANTLY tagging that drive and consuming resources, so I disabled it for my data volume and wrote the Java program. For the most part, the data didn't move around, so I didn't need to walk the tree very often ... once a month perhaps.

But with something like TrueNAS, is it possible to query the NAS and quickly pull down a copy of the file allocation table instead of needing to walk it ... which can sometimes take hours?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...

But with something like TrueNAS, is it possible to query the NAS and quickly pull down a copy of the file allocation table instead of needing to walk it ... which can sometimes take hours?
Possibly, using ZFS snapshots and zfs diff, which compares 2 snapshots. That will tell you which files changed and how they changed. I've not used that function, but it seems reasonable.

I guess how it would work is you would take a snapshot after an update. Then later, say the next month, take another snapshot and run the diff. Once all the changes are applied to your index, you delete the older snapshot but leave the new one in place for next month's diff. Repeat as desired.
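Untested on my side, but the commands would look roughly like this (pool, dataset, and snapshot names are just examples):

# snapshot after this month's index run
zfs snapshot tank/mydata@index-2021-11
# ... a month later ...
zfs snapshot tank/mydata@index-2021-12
zfs diff tank/mydata@index-2021-11 tank/mydata@index-2021-12
# once the index is updated, keep only the newer snapshot
zfs destroy tank/mydata@index-2021-11

zfs diff prints one line per changed path, prefixed with + (added), - (removed), M (modified), or R (renamed), which should map nicely onto updates to your MySQL index.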
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
Are there any performance considerations to take into account when deciding which share type to implement (SMB / NFS / iSCSI)? Specifically relating to data transfer speeds to the client, not necessarily in terms of resource needs on the NAS.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
So long as your TrueNAS is well tuned, and has the appropriate tunables implemented, it's possible to max out all 3 at line rate.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I'd say that depends on the client and the use case. iSCSI presents a block device. SMB and NFS present files. SMB is well suited to serving Windows and Mac clients. NFS is the standard for Unix. And so on ...
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
So long as your TrueNAS is well tuned, and has the appropriate tunables implemented, it's possible to max out all 3 at line rate.
And I’m assuming that a block-level share like iSCSI does not give the NAS itself access to the files stored inside the share, is that correct?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Yes. But then you cannot mount a block device from more than one client at a time - unless it's a filesystem specifically designed for that, like VMFS. But in the general case, no.

What is your use case for your NAS?
 

EasyGoing1

Dabbler
Joined
Nov 5, 2021
Messages
42
Yes. But then you cannot mount a block device from more than one client at a time - unless it's a filesystem specifically designed for that, like VMFS. But in the general case, no.

What is your use case for your NAS?
Just personal data storage... I have bad luck with external drives and had an incident recently with an 8 TB external formatted with APFS ... several hundred gigs of data just vanished out of nowhere, with no errors thrown or anything. Fortunately, I was able to recover the lost data and copied it to different volumes. The APFS volume still mounts, though when I run Apple's disk check on it, it reports errors that it cannot fix, and three other utilities that can repair APFS volumes all tell me they cannot fix it either.

So, instead of leaving the drive running, I disconnected it, and I'm at the crossroads now of what to do next ... I don't want to continue with straight DAS over external USB, and I decided that a NAS with some redundancy would be my best option since the integrity of the files would be the responsibility of the NAS itself and not my MBP. It's a no-brainer because I seem to go through a catastrophic incident with external USB drives about every three years or so, and I've about worn the fun out of that experience... That's when I almost bought a Synology box the other day, then happened across this website, remembered playing with FreeNAS back in the 2000s, and decided to check out TrueNAS in a VM - and that brought me here with my questions.

But at this point, I'm leaning towards either a home-brewed TrueNAS box or one of those HP MicroServers... and I will connect to the NAS via a dedicated 10G Ethernet link. And though my druthers were to go iSCSI initially, I realize now that doing so eliminates the option of utilizing some NAS features I might want, such as installing a Plex media server, and even ZFS's skills in all things "self-repair" - assuming it does that at the file level ... but now that I think about it ... that might be done at the block level, meaning it doesn't matter what kind of share you have; ZFS will protect whatever I choose.

But ... I'm leaning towards an NFS share, because to me those are close to an iSCSI type of connection while still technically being at the "folder share" level. I've just always been under the assumption that an NFS connection is closer to the kernel than an SMB or CIFS connection, and it's usually more desirable to keep those things as low-level as you can. Though I might be wrong about that - I'm going off the knowledge I had about these things over a decade ago, and today it might not make any difference at all, SMB vs. NFS etc. Being able to mount the NAS volume under my Volumes folder is appealing to me because it would be just like having a DAS device in terms of accessing the data, and I know I can do that with NFS ... not sure about SMB.
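For what it's worth, the kind of mount I have in mind on the Mac side is roughly this (the server IP, pool, and dataset names are made up, and I haven't actually tried it against TrueNAS yet):

sudo mkdir /Volumes/nasdata
# resvport is often needed on macOS because many NFS servers only accept connections from reserved ports
sudo mount -t nfs -o resvport 192.168.10.5:/mnt/tank/mydata /Volumes/nasdata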

Mike
 