Help with building a new TrueNAS Server for VMs

uberwebguru · Feb 19, 2022

I am planning building a new TrueNAS server that will be used mostly for NFS share for VMs in production
Mentioned production because this will not be some home lab stuffs, will be running mission critical stuffs

I am planning for this, so haven't bought anything yet, so based on responses from this thread then i will go ahead and buy what are recommended
Here is hardware am planning to get:

latest TrueNAS scale(or core?)
1 x Dell PowerEdge R730xd 12 LFF
1 x Intel Xeon E5-2699 v4 22 cores (better option in E5-2600 v4 series?)
4 x 32GB DDR4
4 x 20TB seagate Exos X20 HDD for RAID 10
4 x 10G RJ45 on Dell R730xd connected to Netgear 10G RJ45 switch

to utilize the 4 x 10G on TrueNAS server, plan to use all 4 NICs for VMs, come connect for NIC1 and others to NIC2 etc

so will start with just 4 x 20TB for RAID 10 for a single pool and later add more drives either to the pool or for another pool

Anything i need to start to keep in mind?
What performance do i expect from this?

This will be my very first TrueNAS build. I need a shared storage for VMs, and i have chosen TrueNAS as storage OS of choice

uberwebguru · Feb 19, 2022

I am also considering going with the Dell R730xd 24 SFF option and use SSD for very high IOPS
This will be primary storage for several hundreds of VMs, so need very very very high IOPS
Also will like to know what networking gears are better

will be using netgear 10-Gigabit Smart Switch Series (XS728T) to connect the VM servers to the TrueNAS server
VM servers also have 2 x 10G RJ45 connections on host servers
So i believe i should be fine there especially if i get 40G from the TrueNAS server with the 4 x 10G links

uberwebguru · Feb 22, 2022

this is what scares me about NFS as primary storage for VMs

Seagate Exos X20 20TB Enterprise HDD Review

The Seagate Exos X20 is effectively the same drive as the NAS-oriented 20TB IronWolf Pro, but offers a SAS 12Gbps interface.

www.storagereview.com

In average latency at 8K 70/30, the Exos X20 ranged from 1.29ms to 103.21ms in SMB and 4.81ms to 131.36ms in iSCSI; the IronWolf Pro had tighter ranges of 1.06ms to 78.36ms in SMB and 4.15 to 116.71ms in iSCSI.

Now for the max latency numbers, where the Exos X20 remained unimpressive next to the IronWolf Pro. It ranged from 22.02ms to 3797.75ms in SMB and 345.85ms to 2085.53ms in iSCSI; the Exos X20’s ranges were 24.45ms to 981.12ms in SMB and 271.61ms to 1608.67ms in iSCSI.

The final 8K 70/30 numbers are standard deviation. As might be expected at this point, the Exos X20 isn’t a match for the IronWolf Pro, showing almost universally higher standard deviations.

jgreco · Feb 22, 2022

uberwebguru said:
I am planning building a new TrueNAS server that will be used mostly for NFS share for VMs in production
Mentioned production because this will not be some home lab stuffs, will be running mission critical stuffs

I am planning for this, so haven't bought anything yet, so based on responses from this thread then i will go ahead and buy what are recommended
Here is hardware am planning to get:

latest TrueNAS scale(or core?)

Why would you use Scale, a newly developed version that has had an official release for a grand total of twelve hours, and is clearly targeted at KVM hosting and Gluster, over the rock solid Core that's been doing the task you seek for like a decade?

1 x Dell PowerEdge R730xd 12 LFF

Make sure it has an HBA. Not a fake HBA. Not a RAID controller pretending to be an HBA. See

What's all the noise about HBAs, and why can't I use a RAID controller?

1) An HBA is a Host Bus Adapter. This is a controller that allows SAS and SATA devices to be attached to, and communicate directly with, a server. RAID controllers typically aggregate several disks into a Virtual Disk abstraction of some sort...

www.truenas.com

1 x Intel Xeon E5-2699 v4 22 cores (better option in E5-2600 v4 series?)

I usually prefer lower core count higher clock CPU's.

4 x 32GB DDR4

Mmm.

4 x 20TB seagate Exos X20 HDD for RAID 10

4 x 10G RJ45 on Dell R730xd connected to Netgear 10G RJ45 switch

That seems like the classic bad choice of someone who's terrified of greater-than-gigabit networks and has no clue about SFP+ etc. Please go visit

10 Gig Networking Primer

This is a discussion of high speed networking for newcomers, with specific emphasis on practical small- to medium-scale deployments for home labs or small office users. It originated with a forum thread located here that has many pages of...

www.truenas.com

and then we can talk about how rather than this catastro-network, you could get set up on a decent network. You do not want to be LACP'ing together four 10GbaseT connections. That's a recipe for sucktacular disastrophe.

so will start with just 4 x 20TB for RAID 10 for a single pool and later add more drives either to the pool or for another pool

Anything i need to start to keep in mind?
What performance do i expect from this?

Pretty rotten I think.

So a few things. ZFS does not support "RAID 10" and you are asked not to refer to mirror vdevs in that manner. Please see

Terminology and Abbreviations Primer

We realize that new users have a lot to learn when they come to FreeNAS. There's a certain amount of confusion added to discussions when users pick random/approximate terms to describe things. I've spent a lot of time quietly trying to translate terms on the reader's side when reading posts...

www.truenas.com

which is extra-important for all you Dell owners who tend to show up and expect to be able to use a PERC RAID controller, often in RAID mode. Using the correct terminology communicates that you understand what you're doing.

uberwebguru said:
This will be primary storage for several hundreds of VMs, so need very very very high IOPS

Well, two HDD vdevs will give you about 400 write IOPS or 800 read IOPS -- total. That seems like a poor budget for "several hundreds" of VM's. Of course ZFS can manage to do amazing things if you add ARC and L2ARC for read, and if you oversize the pool, you can usually get pretty awesomely high write speeds too, so those are pessimistic (worst-case) figures. If you have a mainly read workload, and you put 256GB of RAM and 1TB of L2ARC into the system, and the working set fits into ARC+L2ARC, you could easily see hundreds of thousands of read IOPS. But again this is workload dependent. You are only promised the IOPS that the underlying hardware can handle (~400/800).

uberwebguru said:
this is what scares me about NFS as primary storage for VMs

What scares you about NFS as primary storage for VMs? You go on to quote an article about HDD's. HDD's have nothing to do with NFS other than you might use HDD's as backing for an NFS export. But you can use SSD too, or (going outside the realm of TrueNAS) even RAM. NFS is commonly used for VM storage, as is iSCSI, so I'd say "get over it". We have to select from the available technologies.

Complaining about the speed of HDD? Well, it's been that way since the beginning of HDD's.

Also mandatory reading:

The path to success for block storage

It seems like I haven't written a sticky for awhile, but just in the last week I've had to cover this topic several times. ZFS does two different things very well. One is storage of large sequentially-written files, such as archives, logs, or data files, where the file does not have the middle...

www.truenas.com

ChrisRJ · Feb 23, 2022

In addition to what @jgreco wrote about IOPS, I would recommend that you do some reading on performance and sizing. And you need to collect as much information as possible about your workload. In this case about what is happening in the VMs. Yes, this is tedious work and most people are overly (in my view) casual about it. I do this professionally for transactional software (roughly like ERP) and the biggest problem is to get detailed information from customers about performance details.

Without sufficient information you only have two options: 1) Put in sufficient buffer (the less you know, the more buffer you need), or 2) run the risk of ending up with insufficient performance.

uberwebguru · Feb 23, 2022

Why scale? because you can add more storage servers and scale horizontally. Can core do that? Having KVM hosting does not mean i will use the feature.

SFP+ can be better than RJ45, but if RJ45 works, some people might prefer them for different reasons. Sure i can use SFP+ but like i said as long as RJ45 works, choosing it can't be the end of the world.

If i said raid 10 for ZFS, well communication is all about explaining things. It is pretty much mirroring the disks so i can use half and be able to lose half the disks. That was the message. Like i said am new to this, so i don't think i have to be expert with the correct words

Reason for HDD was i keep hearing people saying SSDs are not good for NAS. Again reason for the thread to get feedbacks from folks that have actually deployed these things rather than people's personally review online.

What scares me about NFS? is the extra latency due to traveling via network compared to local storage to the server. As you can see from performance results, the latency can be a killer compared go a local drive.

uberwebguru · Feb 23, 2022

ChrisRJ said:
In addition to what @jgreco wrote about IOPS, I would recommend that you do some reading on performance and sizing. And you need to collect as much information as possible about your workload. In this case about what is happening in the VMs. Yes, this is tedious work and most people are overly (in my view) casual about it. I do this professionally for transactional software (roughly like ERP) and the biggest problem is to get detailed information from customers about performance details.

Without sufficient information you only have two options: 1) Put in sufficient buffer (the less you know, the more buffer you need), or 2) run the risk of ending up with insufficient performance.

Do you use SSDs or HDDs? and why are there misconceptions about using SSD for NAS?
How do you deal with latency with NAS? I am more particular about NAS and not iSCSI.

Ericloewe · Feb 23, 2022

uberwebguru said:
Why scale? because you can add more storage servers and scale horizontally.

I've been running Gluster in production for two and a half years (it's been around longer, but I joined ${COMPANY} two and a half years ago). It's not Nirvana, there are no rainbows and the sunshine is filtered by smog. I would not touch it with a ten-foot pole for anything needing IOPS.

Ericloewe · Feb 23, 2022

uberwebguru said:
Reason for HDD was i keep hearing people saying SSDs are not good for NAS.

Well, that's a new one. Do they happen to work for a manufacturer of HDD platters or something?

Rand · Feb 23, 2022

ChrisRJ said:
And you need to collect as much information as possible about your workload. In this case about what is happening in the VMs. Yes, this is tedious work and most people are overly (in my view) casual about it. I do this professionally for transactional software (roughly like ERP) and the biggest problem is to get detailed information from customers about performance details.

Sorry for the OT; I totally agree with you - but I think most people have the problem that there is no simple (enough) way to identify the actual applications behavior in the first place. Even if your current solution provides a way to gather the data on a storage level (btw, whats the way to do this on TNC?) its very difficult to map that to individual applications on a potentially large number of VMs ...

jgreco · Feb 23, 2022

uberwebguru said:
Why scale? because you can add more storage servers and scale horizontally. Can core do that?

For VM storage? Sure. You keep adding hard drives to your Core installation until you get up past a hundred. I feel that the practical limit is probably less than 200. With modern 20TB HDD's, this gives you 4PB of raw space, up to 2PB of pool space, or somewhere between 100TB-1PB of actual usable VM storage space, depending on what you want your free space reserve and mirroring redundancy to be.

Once you get to that point, you add a second server and create a new datastore on that.

If you were mistaking Gluster for some magic way to add dozens of petabytes transparently to your server, forget it. Gluster is an abstraction layer and inherently introduces lots of latency and significant reduction in potential IOPS due to all the additional handling and indirection. Gluster is meant as a way for an organization to have distributed redundant resources such as filesharing, but is never going to be a high performance solution.

uberwebguru said:
and why are there misconceptions about using SSD for NAS?

Bluntly, 'cuz people are stupid. Stupid people do stupid things and then when it blows up, rather than learning from the mistake and figuring out how to do better next time, they just say stupid things about their misadventures and then generalize. I've talked repeatedly on these forums about our adventures here at SOL with SSD on hardware RAID1 (not ZFS) for VM storage, something that had been considered heretical a decade ago, but turns out to be practical if deployed intelligently on compatible workloads. Keywords "Intel 535" in search.

It should be obvious that some SSD's, such as the Intel S3710, are absolutely rock solid competitors to HDD's for most applications, with incredible endurance, but on the other hand you also have the 870 QVO (not EVO) which comes in an 8TB 2.5" model but with very limited endurance. Both of these are PROBABLY stupid choices for SSD for NAS, but might not be in certain circumstances. I fully expect the 870 QVO's would be awesome in a write-once-read-many (WORM) archival scenario for example. Great place to dump your only very slowly changing ISO library.

Drives such as WD's Red SA500 drives are targeted specifically at the middle-of-the-road NAS market, so not only do people use SSD's for NAS, but also vendors target the segment. These drives are not going to be good for heavy VM write environments, but are expected to be reasonable replacements for general NAS workloads.

uberwebguru said:
SFP+ can be better than RJ45, but if RJ45 works, some people might prefer them for different reasons. Sure i can use SFP+ but like i said as long as RJ45 works, choosing it can't be the end of the world.

That's just it, though, RJ45 does not work well. The reason it saw no serious uptake is because it's a crappy technology, which I talk about in the 10 Gig Networking Primer. If you don't care to listen, that's fine, I can't make you. But I'd much rather have a single 40Gbps SFP+ low latency link from my NAS to my hypervisors than a LACP'd pile of cruddy 10GbaseT copper links. The latest hotness is 25Gbps/100Gbps.

uberwebguru said:
If i said raid 10 for ZFS, well communication is all about explaining things. It is pretty much mirroring the disks so i can use half and be able to lose half the disks. That was the message. Like i said am new to this, so i don't think i have to be expert with the correct words

"Well your ideal path to nirvana storing things is to have a lot of very fast thinky things and memory things and links between them that are not draggy, plus you need to be able to talk fast to all the other things without making it hard."

There. I've given you the actual summary answer to your questions, but done it with none of the correct words or proper details. Real helpful huh. Communication is indeed about explaining things, and no one here would dare accuse me of too few words or too little effort. No one expects you to be expert with the correct words. However, it's important to communicate clearly and accurately, and I already explained that we run into lots of

jgreco said:
Dell owners who tend to show up and expect to be able to use a PERC RAID controller, often in RAID mode.

so it really is important to clearly convey your meaning. We're not mad at you for using the wrong words, but please do try to cooperate if you're asked to use the right ones. I provided a link to the Terminology and Abbreviations Primer above to make it easier for you to self-educate if you'd like.

uberwebguru said:
What scares me about NFS? is the extra latency due to traveling via network compared to local storage to the server. As you can see from performance results, the latency can be a killer compared go a local drive.

So then attach all your storage directly to your hypervisors via the latest and fastest high end low latency PCIe 4 based NVMe SSD's and call it a day.

The latency in a single HDD seek is perhaps 10ms, and the latency in a 10G network is much less than that. If your working set is sitting in the NAS's ARC or L2ARC, you can beat HDD seek times very consistently and get crazy good performance out of the stuff that is being frequently accessed, while also paying only a mild penalty for the convenience of having it on shared storage. Shared storage also means that you can have multiple hypervisors with access to the VM's, which is really convenient if you have vMotion or other rapid migration capabilities.

At the end of the day, every storage strategy has upsides and downsides. Expense, capacity, speed, latency, you play all these factors off each other. You might find that there isn't just one strategy that suits all your needs. Our hypervisors here tend to have a combination of NVMe SSD, hardware RAID1 LSI3108/3508 with SSD and HDD, Synology iSCSI as our "low performance" tier, and TrueNAS NFS/iSCSI for certain capacity workloads. Each of these has different strengths and weaknesses.

Ericloewe · Feb 23, 2022

jgreco said:
For VM storage? Sure. You keep adding hard drives to your Core installation until you get up past a hundred. I feel that the practical limit is probably less than 200. With modern 20TB HDD's, this gives you 4PB of raw space, up to 2PB of pool space, or somewhere between 100TB-1PB of actual usable VM storage space, depending on what you want your free space reserve and mirroring redundancy to be.

To elaborate on this 188-200 disks seems eminently doable with four front/rear loaded expansion chassis plus a 2U/4U server, for a total of 18U/20U. Rack weight and power limits may limit this further, and this is without the specialty stuff that crams 90 disks into 4U.
This is not an absolute maximum ceiling, just a general "yeah, don't overdo it" kind of thing.

jgreco · Feb 23, 2022

Ericloewe said:
This is not an absolute maximum ceiling, just a general "yeah, don't overdo it" kind of thing.

One kinda wonders what the largest system iXsystems has deployed is. The web page says "twenty petabytes on a single head unit" but to me that implies a thousand 20TB HDD's, unless my math is very much in error this morning. That is a slightly terrifying number of disks to attach to a single system.

Ericloewe said:
Rack weight

!!?? Raised floors are so '80's. Concrete pad FTW.

Ericloewe said:
power limits may limit this

Sorry you were just askin' for it.

So a thousand 20TB HDD's as three-way mirrors would give you 20PB, or 6PB pool size, or using the 50% rule up to 3PB usable space. Also assuming 250 IOPS per HDD, that works out to 80K write IOPS or a quarter of a million read IOPS.

uberwebguru · Feb 23, 2022

can someone please explain to me if i can use this unifi aggregation switch to connect directly from pfsense router to my servers? I get that they are between core switches and ToR swithces, but can one use to directly connect to servers? Not everyone have giant networks before getting to end servers.

Switch Pro Aggregation

A 32-port, Layer 3 switch. Features: (28) 10G SFP+ ports (4) 25G SFP28 ports (1) USP RPS DC input for power redundancy Switching capacity: 760 Gbps 1.3" LCM color touchscreen with AR switch management Managed with the UniFi Network application: Version 6.1.25 and later

store.ui.com

Thinking of getting these instead of the Netgear RJ45 switches, thanks to @jgreco

Ericloewe · Feb 23, 2022

Those are just market segments, what matters is the feature set.
Be advised though that Ubiquiti does not have a reputation for good mid- or long-term support.

jgreco · Feb 23, 2022

uberwebguru said:
i can use this unifi aggregation switch to connect directly from pfsense router to my servers

?

uberwebguru said:
I get that they are between core switches and ToR swithces

Switches are given these "names" primarily based upon the role in which a device is expected to operate.

A Top-of-Rack switch, for example, will have reversible fans (so that it can be mounted on the rack backside for those of us who prefer that), a large number of ports (24 or more) of a given speed, and then usually four or maybe six or eight very high speed uplinks, usually at least the quad version of the ports on the switch, so, QSFP+ (40GbE) for an SFP+ switch.

An aggregation switch isn't really targeted at ToR and is usually used more where lower bandwidth demands are expected on individual ports, with burst capability. Therefore this thing has 10G SFP+ ports but only 25G SFP28 ports for uplinks.

"Core" switches tend to be whatever suits the need. Here, for example, I use a pair of Dell 8132F (Force10) switches, which were sold as "aggregation" or "core" switches, not as "ToR" possibly because the fans aren't reversible. I use them as core switching, since all the hypervisors have 10G, the dist switches have 10G uplinks, so I do lots of redundant LACP links in a mostly 10G very happy network.

Part of this also has to do with the featureset, such as the 8132F's do layer 3 switching and OSPF routing.

Do note that Ubiquiti has a long proud history of abandoning unpopular products such as the EdgeRouter Infinity. You may be better off shopping for used datacenter grade switchgear on eBay, though it may be somewhat noisier than Ubiquiti's product.

Terms such as "uplink" are just expected-role words for "port". For example, you can definitely get an Intel XXV710 card and hook it up to one of the SFP28 ports on that Ubiquiti. Doesn't matter that it was "designed" for "uplink". The only time it matters is if you have the bad luck to come across something that isn't actually ethernet, like some stacking port options on various switchgear.

So yes you can stick a 10G card in pfSense and hook it to a SFP+ port on the switch, a 25G card in TrueNAS and hook it to an SFP28 port on the switch, etc.

uberwebguru · Feb 23, 2022

Ericloewe said:
Those are just market segments, what matters is the feature set.
Be advised though that Ubiquiti does not have a reputation for good mid- or long-term support.

So what switch with like 24 x 10G ports do you recommend besides these, other than the popular netgear 10G switches?

uberwebguru · Feb 23, 2022

jgreco said:
So yes you can stick a 10G card in pfSense and hook it to a SFP+ port on the switch, a 25G card in TrueNAS and hook it to an SFP28 port on the switch, etc.

Ok cool, wanted to make sure because was wondering if there are some things lacking that makes it suitable for certain layer in the network than others. Yeah these are really cost effective compared to the too expensive netgear 10G switches.

jgreco · Feb 23, 2022

Well, specific features vary, but basic switching is always basic switching.

Cost effective is all relative. Netgear wants $2200 for a XSM4324FS 24xSFP+ switch, but Dell 8024F's go used on eBay for $400 all the time, and they're likely a more robust and featureful switch too. The 8024F's main drawback is that it has no real uplink support to speak of; all the ports are 10G. I've got some, they're fine switches.

You can get a 48-port 8164F with 2x 40G QSFP+ for $700, and the module to add two more QSFP+ for a few hundred more. Used of course.

There was some chatter recently on NANOG about inexpensive switches from FS.COM such as the https://www.fs.com/products/29126.html for $4000, 20-Port Ethernet L3 Fully Managed Plus Switch, 4 x 10Gb SFP+, with 20 x 40Gb QSFP+ and 4 x 100Gb QSFP28,

At that price, I could see just getting a crapton of 40G cards for everything and calling it a day. I don't know anything more about this unit though. It is absolutely NOT an endorsement, but, on the other hand, these things are mostly made out of some silicon foundry's silicon that is widely used and it's likely to work pretty well.

uberwebguru · Feb 23, 2022

@jgreco thanks for pointing out the dell switches, i never really looked at dell when looking at the 10G switches; so a good discovery for me
The prices are for sure much better than Netgear's ridiculous pricing on 10G switches, not to mention netgear is none for home routers.

those 48 ports 10G switches, i see power consumptions are very high. Is the power consumption based on number of ports used? I hope so, else won't make sense getting those if one is not planning on using lots of the ports

Important Announcement for the TrueNAS Community.

Help with building a new TrueNAS Server for VMs

Explorer

Explorer

Explorer

Resident Grinch

Wizard

Explorer

Explorer

Server Wrangler

Server Wrangler

Guru

Resident Grinch

Server Wrangler

Resident Grinch

Explorer

Server Wrangler

Resident Grinch

Explorer

Explorer

Resident Grinch

Explorer

Similar threads