Copying a large file randomly freezes on Windows??

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
I recently installed TrueNAS as a virtual machine on Proxmox on my home server.
The server is a 12-core Core i7 with 32GB of DDR3 RAM (picked it up from my workplace...)

On that machine, my TrueNAS VM gets 8GB of RAM and 4 CPU cores to work with.
There are 3 ZFS pools, and each pool has 1 HDD of 6-8TB and some smb shares on the pools' datasets.
There is also an Ubuntu VM installed under Proxmox on that same machine. That Ubuntu mounts some of the shares from TrueNAS for services and such,
and those shares are also mounted as network drives on my Windows machine where I do my work.

Here comes the strange part...
If I copy a large file from the network drive to my local Windows drive, it copies smoothly at peak network bandwidth (1Gbit), but a couple of times throughout the process it randomly freezes for about 10 seconds, the speed drops, and the TrueNAS VM halts too. Then it unfreezes and the copy continues.
This doesn't happen on the Ubuntu VM, only on my Windows machine.

I would really appreciate it if anyone had any idea why this might be happening!
Thanks in advance!
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
There are special considerations for running TrueNAS virtually. And using Proxmox is less well tested than VMware.
How do you pass through the disks to TrueNAS?


Please list all your hardware. There are known problems with some things, like Realtek Ethernet chips, or SMR disks.
Without a hardware listing, it is pretty much guesswork on our end.
 

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
The disks are passed to TrueNAS through the VM configuration.
Scsi(2-4) in the image below are the storage HDDs, connected to the machine through a USB3 4-bay enclosure. Scsi1 is a cache drive connected internally through SATA, but it isn't used in the pool I'm copying the files from. Scsi0 is TrueNAS's system virtual disk, stored on the machine's internal HDD.
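For anyone following along, per-disk passthrough in Proxmox is typically configured with something like the following (the VM ID, slot number, and disk identifier here are hypothetical):

```shell
# List stable identifiers for the physical disks on the Proxmox host
ls -l /dev/disk/by-id/

# Attach one physical disk to VM 100 as a virtual SCSI device.
# The guest sees a QEMU-emulated disk, so all I/O still flows
# through the Proxmox storage stack.
qm set 100 -scsi2 /dev/disk/by-id/ata-EXAMPLE_MODEL_SERIAL
```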

[Attachment: Screenshot 2023-07-23 143716.png]


The machine has the following hardware:
CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (12 threads)
RAM: 4x DIMMs: Kingston 8GB KHX1600C10D3 1333 MT/s (Non-ECC, I think, though TrueNAS says they are ECC)
Chipset: Intel Corporation C600/X79 series chipset
Storage: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller
Network: Intel Corporation 82579V Gigabit Network Connection
GPU: NVIDIA Corporation GK104 [GeForce GTX 660 Ti]

USB3 4-Bay HDD enclosure: JMicron Technology Corp. / JMicron USA Technology Corp. JMS567 SATA 6Gb/s bridge

Storage drives:
1. 1TB HDD for Proxmox and VM virtual disks (Connected internally through motherboard controller)
2. 500GB SSD for caching one of TrueNAS's pool (Connected internally through motherboard controller)
3. 3x 6~8TB SATA3 HDDs (Connected through the USB3 4-bay enclosure)

Note: The HDDs were previously connected through the same USB3 4-bay enclosure while the machine ran just Windows 10 instead of Proxmox, and everything ran at peak speed with no issues at all.

Any thoughts are greatly appreciated!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The disks are passed to TrueNAS through the VM configuration.
Scsi(2-4) in the image below are the storage HDDs connected to the machine through a USB3 4-bay enclosure.
Every part of that is a recipe for a bad day. You should seriously reconsider your setup, as per the info linked by @Arwen. If random freezes are the worst thing that's happening, take it as a win and move your data to a safer setup before your luck runs out.
 

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
take it as a win and move your data to a safer setup before your luck runs out
Can you please elaborate? You've done a great job of painting a picture of shock and horror,
but more information about why the freezes are happening and what the danger in my setup is would be a little more helpful.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Can you please elaborate? You've done a great job of painting a picture of shock and horror,
but more information about why the freezes are happening and what the danger in my setup is would be a little more helpful.
One takeaway is that you have 3 x HDD on one USB3 port, so USB3 acts as a funnel to the drives. Not as bad as USB2 speeds, but the bottleneck still exists.

Next, standard VM practice for RELIABLE disks in a virtualized TrueNAS Core is to pass through the entire SATA or SAS controller, with all the storage devices attached to it. By passing through just the drives, you make the I/O happen at the Proxmox level, not the VM level. Thus, if the Proxmox resources (memory, CPU, I/O) are busy, your TrueNAS Core becomes busy too. At least that was my understanding.
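For comparison, controller passthrough on Proxmox looks roughly like this (the PCI address and VM ID are assumptions for illustration, and IOMMU must be enabled on the host):

```shell
# Find the PCI address of the SATA/SAS controller on the host
lspci -nn | grep -iE 'sata|sas'

# Pass the entire controller to VM 100, so the TrueNAS guest
# talks to the disks directly instead of through QEMU
qm set 100 -hostpci0 0000:03:00.0
```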

But, please READ the resource:

Last, if I am reading your screenshot of the disks correctly, you have 2 x 6TB Western Digital SMR drives. These are known to be incompatible with ZFS, causing exactly the problem you are experiencing: slowdowns! Add that on top of USB, and you have quite a few issues:
  • Using a less tested virtualization host (Proxmox instead of VMware)
  • Not passing through the SATA / SAS HBA, with drives attached, to the VM
  • USB-attached disks as data pool drives
  • 2 x 6TB SMR disks
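As a side note, one way to check a drive for SMR from the TrueNAS shell (device name assumed; smartmontools ships with TrueNAS) is:

```shell
# Print the drive's identity info. Some host-aware SMR drives report
# a "Zoned Device" line, but many device-managed SMR drives do not,
# so matching the model number against the vendor's published
# CMR/SMR lists is the more reliable check.
smartctl -i /dev/da0 | grep -E 'Model|Zoned'
```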
 
Last edited:

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
Wow, that's super helpful. I can definitely see the problems there and the changes I'm gonna make.
Thank you very much!!

Would you say that eSATA is better than USB3 or would that be equally not recommended?
At the moment I have to use the hardware I have.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Next, standard VM practice for RELIABLE disks in a virtualized TrueNAS Core is to pass through the entire SATA or SAS controller, with all the storage devices attached to it. By passing through just the drives, you make the I/O happen at the Proxmox level, not the VM level. Thus, if the Proxmox resources (memory, CPU, I/O) are busy, your TrueNAS Core becomes busy too.
And you're vulnerable to whatever bugs are in the path. PCI passthrough is also somewhat vulnerable to bugs, but has the advantage of being mostly hands-off, as far as the host is concerned.
One takeaway is that you have 3 x HDD on one USB3 port, so USB3 acts as a funnel to the drives. Not as bad as USB2 speeds, but the bottleneck still exists.
Another issue is the historically atrocious quality of consumer-grade external disk chassis in general and USB/SATA bridges in particular - though yours seems to use plain bridges behind a USB hub instead of the dodgier JBOD/RAID chassis controllers.
Would you say that eSATA is better than USB3 or would that be equally not recommended?
eSATA suffers from being a kludge, with dodgy signal integrity. It's not great in practical terms, but it does bypass most or all of the concerns, apart from chassis quality, if you can get it to work.
The whole external disk chassis market is a minefield unless you go all out and buy a serious disk chassis from the likes of Supermicro and connect it up via SAS.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Would you say that eSATA is better than USB3 or would that be equally not recommended?
If you're going to use eSATA, it's likely to include a port multiplier...


Basically, both of your options are poor for permanently connected data pool devices.

You also need to deal with replacing your SMR disks with CMR disks if ZFS is a choice you're making.

At the moment I have to use the hardware I have.
So the recommendation of the forums would be to find an alternative option that doesn't involve ZFS (OMV, unRAID, Xpenology) to use with that hardware.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you're going to use eSATA, it's likely to include a port multiplier...
Important point I did not clarify appropriately: What I said applies to one cable per disk. If any port multipliers are in the mix, you can forget about it being reliable.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Wow, that's super helpful. I can definitely see the problems there and the changes I'm gonna make.
Thank you very much!!

Would you say that eSATA is better than USB3 or would that be equally not recommended?
...
As others have said, eSATA is okay, with the caveat that it supports a single disk per cable. Plus, you must keep the cable short.

At the moment I have to use the hardware I have.
So the recommendation of the forums would be to find an alternative option that doesn't involve ZFS (OMV, unRAID, Xpenology) to use with that hardware.
Unfortunately, TrueNAS has limited hardware support compared to the alternatives. Even though the USB3 enclosure works fine with Windows 10, that is meaningless as far as compatibility with TrueNAS goes.
 

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
Next, standard VM practice for RELIABLE disks in a virtualized TrueNAS Core is to pass through the entire SATA or SAS controller, with all the storage devices attached to it. By passing through just the drives, you make the I/O happen at the Proxmox level, not the VM level. Thus, if the Proxmox resources (memory, CPU, I/O) are busy, your TrueNAS Core becomes busy too. At least that was my understanding.
That totally makes sense. While I do have the option to get a PCIe card, that would also require me to get a new enclosure with individual SATA cables; otherwise, I imagine there isn't much difference between passing a PCIe card with all the drives running over a single eSATA cable to the TrueNAS VM, and just passing the USB3 enclosure box to the VM as a USB device (it shows up in the USB devices list).
For context, this is the USB3 enclosure box I have: Mediasonic PROBOX 4 Bay 3.5” SATA Hard Drive Enclosure

You also need to deal with replacing your SMR disks with CMR disks if ZFS is a choice you're making.
So right now I don't have an option to buy new drives, but here's the deal...
You guys mentioned disk scrubs and resilvers, which happen when a disk needs to be rebuilt due to a failure in a disk array or a pool, right?

What if my setup isn't a typical use case for ZFS?
  • On my TrueNAS, there are 3 HDDs. Each HDD is in its own ZFS pool with its own dataset, so each pool has only a single HDD. Let's call them pools A, B and C.
  • Pools A and B have snapshot tasks that run daily, but there aren't many data changes happening daily: maybe a couple of gigs at most, usually less than 100MB, as there's only one user on that NAS (myself).
  • Pool A is used by a few devices on the network for media files (movies, series, etc.). It has no backups or replication, but has a dedicated cache SSD in it.
  • Pool B's daily snapshots are replicated to Pool C and to cloud storage (2 replication tasks). These backups take no more than 10-15 minutes.
  • On top of that, the pool shares are accessed over a 1Gig network, which caps the USB3 speed and even the HDDs' speed.
Assuming the drives remain SMR, which I understand is a red flag, does that use case work better for the drives' longevity?
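For context, the snapshot and replication scheme above corresponds roughly to commands like these (pool and dataset names are made up):

```shell
# Daily recursive snapshot of pool B's dataset
zfs snapshot -r poolB/data@daily-2023-07-23

# Incremental replication of the new snapshot to pool C
zfs send -i poolB/data@daily-2023-07-22 poolB/data@daily-2023-07-23 \
  | zfs receive -F poolC/backup
```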
 
Last edited:

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
First, that Mediasonic Probox appears to use a SATA port multiplier. If it has a single eSATA connection, then it has to have a SATA port multiplier behind that connection. And for USB3, it would be simpler to put a single USB3-to-SATA adapter chip in front of that same multiplier. Thus, my guess is that regardless of the external interface, you are using a SATA port multiplier. Not recommended, as they tend not to work well.

In regards to the SMR disks, we are not saying you have to replace them. We are saying that the RANDOM FREEZES are a direct result of using SMR disks. Live with them or not, your choice.

There may be other or additional causes of slowdowns, like the SATA port multiplier in the USB3 disk data path, or using Proxmox for disk I/O.

Most of us in the TrueNAS forums love our data. So with known problematic disk drive types (like SMR, USB, or SATA port multipliers), we simply don't bother checking how they might work out in the long term.


That said about SMR drives, I use a Seagate 8TB SMR Archive drive as one of my backup disks. It was the only consumer 8TB disk available at the time I bought it. It works okay, but is slow for its intended purpose (aka writing my backups). Did I mention it was SLOW? Since it still works, I will continue to use it, as I am not backup-window constrained. (So what if it takes a day to run the backup?)
 

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
First, that Mediasonic Probox appears to use a SATA port multiplier. If it has a single eSATA connection, then it has to have a SATA port multiplier behind that connection. And for USB3, it would be simpler to put a single USB3-to-SATA adapter chip in front of that same multiplier. Thus, my guess is that regardless of the external interface, you are using a SATA port multiplier. Not recommended, as they tend not to work well.
The box has both SATA and USB3 and can be connected using either one. So I imagine that for USB3, the adapter chip you mentioned is already inside.
In regards to the SMR disks, we are not saying you have to replace them. We are saying that the RANDOM FREEZES are a direct result of using SMR disks. Live with them or not, your choice.
Yeah, totally. I'm just trying to find a sweet spot that works for my use case and doesn't destroy the drives within a couple of years.
There may be other or additional causes of slowdowns, like the SATA port multiplier in the USB3 disk data path, or using Proxmox for disk I/O.
I'm gonna be reconfiguring that in the coming few days.
Most of us in the TrueNAS forums love our data. So with known problematic disk drive types (like SMR, USB, or SATA port multipliers), we simply don't bother checking how they might work out in the long term.
I totally understand. I love my data too :)
So correct me if I'm wrong, but you guys are effectively speaking to the major use case: a large ZFS pool with dozens of drives, mirroring, parity, snapshots, replication, lots of users, all that jazz... For which, I agree, not having the best drives and best hardware is suicide.
So what if it takes a day to run the backup?
How much data are you usually writing per backup?

With the use case I described, I just wonder if the reduced use ("reduced" = fewer than a studio full of people) and the network speed cap might help the drives live longer (and prosper) even with ZFS...
The other option that @sretalla mentioned I could take is to go with btrfs on unRAID instead of TrueNAS. But I'll do some more research before deciding to go that route.
 
Last edited:

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
So according to this

1 of the drives I have (WD60EDAZ) is SMR, but the other 2 (WD80EDAZ) are actually CMR.
That's interesting because the drive from which the copying was freezing was the WD80 one.
So either this page is incorrect, or the freezing was caused only by the Proxmox IO controller...
Checking your screenshot again: yes, I made a mistake. (I have trouble with white text on a dark background...)
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
How much data are you usually writing per backup?
...
My backup scheme does a ZFS scrub first, then an rsync backup from the NAS to the SMR disk. The amount of writes is probably not more than 100GB every few months. It's a cold-storage backup, kept in an anti-static bag & Seahorse case.
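In shell terms, the scheme is roughly this (pool name and mountpoints assumed):

```shell
# Verify the source pool's data integrity first
zpool scrub tank
zpool status tank   # re-check until the scrub reports it has finished

# Then copy to the mounted SMR backup disk, preserving attributes
rsync -aH /mnt/tank/ /mnt/smr_backup/
```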
 

berengard

Dabbler
Joined
Jul 22, 2023
Messages
19
Checking your screen shot again, yes. I made a mistake, (I have trouble with white text on a dark background...)
No worries, I'm on the dark side. Dark mode for life haha

My backup scheme does a ZFS scrub first, then an rsync backup from the NAS to the SMR disk. The amount of writes is probably not more than 100GB every few months. It's a cold-storage backup, kept in an anti-static bag & Seahorse case.
Very cool. Thanks for sharing
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
In regards to the SMR disks, we are not saying you have to replace them. We are saying that the RANDOM FREEZES are a direct result of using SMR disks. Live with them or not, your choice.
Just to share experience from the forums here...

I have seen many threads where, during some kind of operation (resilver/scrub/other?), ZFS starts to kick SMR disks out of the pool because they return CAM status timeouts... which can result in enough disks being kicked out of a pool to take it completely offline.

It's not just "slowness" you need to be prepared to live with if you're planning on using SMR disks with ZFS.
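If anyone wants to check for this, on TrueNAS CORE (FreeBSD) the symptoms would show up along these lines (assuming shell access on the NAS):

```shell
# CAM timeout errors land in the kernel message buffer
dmesg | grep -i 'cam status'

# A pool that has kicked disks out will list them as FAULTED or REMOVED
zpool status -v
```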
 