Questions about ZFS Virtualized under Proxmox

h0m3

Cadet
Joined
Jul 30, 2023
Messages
3
Okay. I have some questions about virtualizing TrueNAS under Proxmox and hope you guys can answer them.

Today I have a Proxmox hypervisor running an Alpine Linux VM with ZFS on Linux, on top of a ZFS-managed RAID1 of 8TB NAS disks and 2x 250GB RAID1 special VDEVs. This VM is running with 28GB of RAM.

Today I don't have the SAS controller passed through to the VM because the IOMMU groups don't allow it. So my current strategy is to pass the disks themselves through to the VM: they are not virtual disks and they are dedicated to that VM, but I passed the disks, not the controller. The hypervisor and the VM's system disk are on another 250GB RAID1 disk array unrelated to my storage disks. I understand the risks of doing this and they are acceptable to me (primarily due to external backups).

I read https://www.truenas.com/community/t...ative-for-those-seeking-virtualization.26095/ and understood that the strategy taken there is to virtualize the disks. My understanding is that this is because the hypervisor, the VMs and the ZFS pool are on the same backend storage, leading to a 4x storage penalty. I think my situation is described in the post linked in the header of that topic, but the link is broken.

I plan to export my pool and import it on TrueNAS. My question is: does TrueNAS do something extra on ZFS pools that I should be aware of? I understand that TrueNAS tries to manage physical disk health, and that is (obviously) not possible with a passed-through disk without a controller, but is there anything else I should be aware of beyond the default ZFS behaviour?
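For reference, a minimal sketch of the export/import sequence mentioned above. It assumes a pool called `tank` (a hypothetical name, not from this thread) and uses the standard `zpool export` and `zpool import -d` commands. The helper only builds and prints the command lines; it does not run anything. TrueNAS also has its own pool import function in the web UI, which is usually the preferred route there.

```python
import shlex
from typing import List


def migration_commands(pool: str, search_dir: str = "/dev/disk/by-id") -> List[List[str]]:
    """Build the two zpool commands for moving a pool between systems.

    `pool` and `search_dir` are illustrative parameters; adjust them to
    the real pool name and device directory.
    """
    if not pool or any(ch.isspace() for ch in pool):
        raise ValueError("pool name must be non-empty and contain no whitespace")
    return [
        # Run on the old system: cleanly detach the pool.
        ["zpool", "export", pool],
        # Run on the new system: look for the pool's devices in search_dir.
        ["zpool", "import", "-d", search_dir, pool],
    ]


if __name__ == "__main__":
    for cmd in migration_commands("tank"):
        print(shlex.join(cmd))
```

Exporting before the move matters because it marks the pool as cleanly detached, so the importing system does not need a forced import.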
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You should read the following resource.

You should really passthrough the controller if you want to virtualize TN, otherwise your data will be at serious risk.

Also, I really hope you mean RAIDZ1 when you say RAID1.
 

h0m3

Cadet
Joined
Jul 30, 2023
Messages
3
I mean ZFS Mirror. Not RAID. Poor choice of words.

Anyway, unfortunately passing the controller is not possible unless I do a platform upgrade, something that I have no money for (unless someone is able to donate some motherboards, jk). The controller is in the plans, but since it depends on a full platform upgrade I have to wait.

Based on the guide you linked, I'm more in the "deploy a small FreeNAS VM instance for basic file sharing (small office, documents, scratch space)." category. I accept some possibility of data loss for several reasons; that's why all data is backed up offsite.

When you say that "otherwise your data will be at serious risk.", are you referring to risks related to ZFS or to other risks related to TrueNAS?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I don't virtualize, so my answer is speculative: there may be risks related to virtualizing TN on top of those of virtualizing ZFS.

Others might be able to help you more.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Just last week I was looking at a thread where a forum member had lost his pool after setting it up this way (connecting disks to the VM).

Take the risk if you are happy with that (and well enough prepared for the pool failure), but it's in no way recommended for cases where data is important if you can't do it with the entire controller in PCI passthrough.


That is far from the only case I can recall, just the most recent. Don't think that just because it works for you for a few days you won't have problems somewhere along the line... it seems more-or-less inevitable to me.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Anyway, unfortunately passing the controller is not possible
Currently, you do not meet bare minimum requirements to virtualize TN according to proven standards.

The forum recommendation is pretty straightforward.
Don't go any further into virtualization, other than testing with non-real data, until you meet the requirements.

It is not for our sake. It is for yours.
 

h0m3

Cadet
Joined
Jul 30, 2023
Messages
3
Thanks all for the answers.

From the answers, it looks like there is more than just ZFS on TN that could cause a pool failure. I'm prepared to do a pool recovery if necessary, and I think I would be even on a recommended setup, because I don't trust my data to a single point of failure anyway. But although I've been running ZFS on virtualized hardware for years, that has been on top of a manually configured Linux box, not through any automated NAS setup.

I don't want to do a pool recovery just because I decided to switch to TN. So, from my understanding, as of right now TN does not meet my needs and I won't migrate to it. The risks outweigh the benefits. Thank you all for your insights.
 

zizzithefox

Dabbler
Joined
Dec 18, 2017
Messages
41
Virtualizing TrueNAS under Proxmox is not difficult, and when done correctly, it works. However:
  • You need to use PCI-e passthrough for storage
  • You should avoid GPU passthrough, as it often causes issues unless you have very recent server-grade hardware
  • You should be prepared for some minor issues, typically related to FreeBSD/TrueNAS updates; it just happened to me right now because of 13.0-U5.3. Reverted to U5.2, so be warned.
I say it here, and I deny it here; you can connect the disks directly to the virtual machine in Proxmox, ensuring that the host system ignores them and doesn't enable any caching on them. Does it work? Practically, yes. The frequency of even checksum errors reported on forums is greatly exaggerated, but understand this: THOSE WHO ADVISE AGAINST IT HAVE A POINT, you should not do it.

However, I'm fairly certain that the overwhelming majority of issues occur because people don't ensure the following:
  • Hyper-V: disable disk write caching in Hyper-V and deactivate the disks in Windows through Computer Management
  • Proxmox: set the cache to "Direct sync" for the drives
  • VMware: properly configure RDM with disks as independent persistent and I don't remember what else
  • Etc.
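
As a rough illustration of the Proxmox item, here is a minimal sketch that builds a `qm set` command attaching a physical disk to a VM with `cache=directsync`, which is the CLI value behind the "Direct sync" option in the GUI. The VM ID `100`, the slot `scsi1` and the disk path are hypothetical placeholders. The helper only builds and prints the command; it does not run it.

```python
import re
import shlex
from typing import List

# Cache modes accepted by Proxmox for a VM disk.
VALID_CACHE_MODES = {"none", "writethrough", "writeback", "unsafe", "directsync"}


def passthrough_command(vmid: int, slot: str, disk_by_id: str,
                        cache: str = "directsync") -> List[str]:
    """Build a `qm set` command that hands a whole physical disk to a VM."""
    if not re.fullmatch(r"(scsi|sata|virtio|ide)\d+", slot):
        raise ValueError(f"unexpected disk slot: {slot!r}")
    if cache not in VALID_CACHE_MODES:
        raise ValueError(f"unknown cache mode: {cache!r}")
    # Stable by-id paths avoid surprises when /dev/sdX names change on reboot.
    if not disk_by_id.startswith("/dev/disk/by-id/"):
        raise ValueError("use a /dev/disk/by-id/ path for the disk")
    return ["qm", "set", str(vmid), f"--{slot}", f"{disk_by_id},cache={cache}"]


if __name__ == "__main__":
    # Placeholder values; substitute the real VM ID and disk ID.
    cmd = passthrough_command(100, "scsi1", "/dev/disk/by-id/ata-EXAMPLE_SERIAL")
    print(shlex.join(cmd))
```

This still is not controller passthrough: the guest sees a virtual disk backed by the physical one, so the warnings in this thread still apply.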
I have tried all these settings several times and experienced occasional checksum errors when the system shut down due to power loss, but I always had a RAID-Z2 or triple mirroring, plus another system with frequent replication. As usual, BACKUPS!!!!!
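
For spotting those checksum errors, here is a minimal sketch that scans the device table of `zpool status` output for non-zero CKSUM counters. The sample text below is made up for illustration (pool and device names are hypothetical), and the parser assumes the usual `NAME STATE READ WRITE CKSUM` column layout.

```python
from typing import Dict

# Hypothetical `zpool status` output, not taken from a real system.
SAMPLE = """
  pool: tank
 state: ONLINE
config:

	NAME        STATE     READ WRITE CKSUM
	tank        ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sda     ONLINE       0     0     3
	    sdb     ONLINE       0     0     0

errors: No known data errors
"""


def checksum_errors(status_text: str) -> Dict[str, str]:
    """Return {device: cksum_count} for every row whose CKSUM column is not 0.

    Counts are kept as strings because zpool may abbreviate large
    numbers (for example with a K suffix).
    """
    errors: Dict[str, str] = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True  # header row of the device table
            continue
        if in_table:
            if not fields:
                in_table = False  # a blank line ends the table
                continue
            if len(fields) >= 5 and fields[4] != "0":
                errors[fields[0]] = fields[4]
    return errors


if __name__ == "__main__":
    print(checksum_errors(SAMPLE))
```

In practice the input would come from running `zpool status` after a scrub, rather than from a hard-coded string.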

Once, I lost the latest version of some files, which ZFS promptly informed me about, because I had replaced two disks over a couple of years and had forgotten to disable device caching in the Windows Device Manager (this was when I was using Hyper-V at home for a while).

You should know that if the metadata gets corrupted, you could lose the entire pool. If you still want to do it, go ahead.

If I could bet 1000 euros that you won't lose the pool in the next 5 years because of this, I would.

But, your data, your choice. DON'T DO IT if you don't have backups.
 