Mac OS // SMB // Zfs ARC metadata collapse

Nicolas_Studiokgb

Contributor
Joined
Aug 7, 2020
Messages
130
Hello All, hello @anodos

Sorry to pull this one up from the grave, but
the TrueNAS server has been in production for a few days now and I still have a very annoying SMB performance issue with Mac clients.

I've configured the dataset and the share as we finally agreed:
"apple extension" disabled in services,
and the following settings on the share:

1696877930442.png

I copy the files with FreeFileSync (the destination folder was empty, so there is nothing to "sync", only a full copy of the files).
It took 1h20m to copy 100GB over 1Gb/s Ethernet (it should take less than 30 minutes at worst).
That's about 9500 files in approx. 20 folders/subfolders (.wav, .mxf files).
ARC Requests demand_metadata still gets stuck at 10M for one hour, the server cores crawl at 92%, and the copy speed collapses to less than 1MB/s... the usual issue.

Are there any new remedies for this one?

Let me know :)
 

Nicolas_Studiokgb

Just tried copying the same data the same way to an old AFP server, also over 1Gb/s: 23 minutes.
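
To put the two runs side by side, here is a rough back-of-the-envelope sketch of the average throughput of each copy and of the best case a 1Gb/s link could give. It assumes the 100GB is decimal gigabytes and ignores all protocol overhead; the function names are only illustrative.

```python
def throughput_mb_s(size_gb: float, seconds: float) -> float:
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    return size_gb * 1000 / seconds


def ideal_seconds(size_gb: float, link_gbit_s: float) -> float:
    """Transfer time at raw line rate, ignoring all protocol overhead."""
    return size_gb * 8 / link_gbit_s


SIZE_GB = 100  # assumption: decimal gigabytes

smb_rate = throughput_mb_s(SIZE_GB, 80 * 60)  # SMB run: 1h20m
afp_rate = throughput_mb_s(SIZE_GB, 23 * 60)  # AFP run: 23 minutes
floor_s = ideal_seconds(SIZE_GB, 1.0)         # 1Gb/s link

print(f"SMB: {smb_rate:.1f} MB/s")
print(f"AFP: {afp_rate:.1f} MB/s")
print(f"Line-rate floor: {floor_s / 60:.1f} minutes")
```

Under these assumptions the SMB run averages about 21 MB/s against about 72 MB/s for AFP, and the line-rate floor is a little over 13 minutes.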
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
I would try to eliminate factors. That would e.g. include the following activities:
  • Test network speed with iperf
  • Use on-board NIC instead of QLogic
  • Use different program to copy
  • Try SFTP instead of SMB
  • Use different files (smaller, bigger)
Changing details about those aspects will give you hints where to dig deeper.
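
In the same spirit, one more factor that could be isolated is pure metadata traffic with no file data at all. The following is only a minimal sketch, not a calibrated benchmark: it walks a directory tree and stats every entry, so running it against the mounted share and against a local disk gives a rough comparison. The function name and the example mount path are hypothetical.

```python
import os
import time


def time_metadata_walk(root: str) -> tuple[int, float]:
    """Stat every entry under root; return (entries_seen, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat reads metadata only; no file content is transferred
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


# Example call (hypothetical mount point, adjust to your share):
# entries, seconds = time_metadata_walk("/Volumes/share/project")
# print(f"{entries} entries in {seconds:.2f}s")
```

If the walk alone is already slow on the share, the copy program and the file sizes are less likely to be the deciding factor.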
 

Nicolas_Studiokgb

So I've done some more testing.
I launched another copy of the same set of data with Finder:
ARC Requests demand_metadata stayed below 4M, as during the test sessions we did some months ago.

If I launch the copy of the same set of data again with FreeFileSync,
ARC Requests demand_metadata jumps to 7 to 10M, with moments over 13M, but this time it didn't come with the smbd process overloading the cores.
So the same action done twice gives different behaviour.

But still, as soon as we do intensive metadata work over SMB, we have performance issues.

@ChrisRJ
Regarding NIC testing, no difference was observed during the 2-month test session we've been through.
Mac OS only officially supports SMB, so we are stuck with SMB and SMB has to work. If I try AFP, for example, performance is better, but as AFP is deprecated we can't rely on that setup.
As a post-production company we work with sound files, video files and folders with thousands of files in them. That's the way the whole industry works; it has to work this way.
There are not many "viable", "working", "resource-fork aware" file synchronization programs available on Mac. If anyone has tested something in this kind of setup, let me know. (GUI of course, my users don't even know that the command line exists.)

Thanks a lot :)
 

asap2go

Patron
Joined
Jun 11, 2023
Messages
228
Did you try a Linux SMB client?
If that one is faster, then it's likely an issue with the Mac OS implementation of SMB.
If the problem remains, then it's more likely to be an issue with the workload/protocol/hardware on the server side (e.g. atime=on causing a lot of write operations while copying, although that shouldn't be SMB specific).
 

Nicolas_Studiokgb

Hi

Yes, we know it's an issue with Mac OS, but Apple does what it wants and we folks have to adapt...
@anodos could explain it better than me, but it's the way Mac OS deals with metadata and overloads the Zfs ARC metadata requests.
The idea would be to stop using Mac, but that's not an option because we work in sound post-production and the workflow has to be on Mac.
After a lot of testing with @anodos (thank you so much for that), he managed to limit the issue (it was absolutely unusable at the beginning). Surely the upgrade to the latest Samba version will help more. Let's hope so :)
(atime and sync are both disabled, of course)
 

asap2go

"and overload the Zfs ARC metadata requests."
Okay, I did not know that. Trying a special metadata vdev might help, but that's not easily reversible afaik.
 

Nicolas_Studiokgb

There is 256GB of RAM and the problem happens even when the RAM is completely free. After discussing it on the forum, we decided that a metadata vdev wouldn't help unless the HDDs are overloaded (which is not the case).
I know there are also improvements in the way ARC is handled, and they might appear in a future update.

thanks
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Love the Apple experience. I guess that until they fix it you can only mitigate the issue as you did.
 

Nicolas_Studiokgb

As there are more and more Mac users in companies, I'm sure (I hope) iXsystems is really working on compatibility with Apple clients. (If it works between Macs over SMB, it should be possible to reproduce the same behaviour, even if it's a Mac-only feature.) The ability to separate the settings of "SMB for Mac" and "SMB for the rest of the world" should also be considered if needed.
 

Davvo

I honestly fear there is little they can do.
 

John45622

Contributor
Joined
Dec 2, 2020
Messages
105
I have a similar problem where SMB volumes just hard dismount from OSX when running backup apps like Carbon Copy Cloner or ChronoSync on large folders (10k+ items). I noticed my ARC request metadata stats jump to 60m when this happens. Can someone explain to me what kind of unit 60m is? Bits? Bytes? Or does it mean "million", as in 60M? I assumed "m" would mean "milli", hence my confusion. Thanks! I'm not sure why scanning a folder with 10k items creates an ARC hit of "60m" (whatever unit that is). (The other thread, with screenshot: https://www.truenas.com/community/t...ismount-under-osx-ventura.113610/#post-786158 )

Sometimes just opening such a folder on OSX dismounts the volume after a few seconds.
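
On the unit question: the raw ZFS arcstats counters such as `demand_metadata_hits` and `demand_metadata_misses` are plain event counts, not bits or bytes. How the TrueNAS reporting graph scales and labels them is not something this sketch can confirm. Assuming the FreeBSD-style `sysctl` output format (`name: value`), a minimal sketch for turning two snapshots of those counters into a requests-per-second figure could look like this; the sample numbers are made up for illustration.

```python
PREFIX = "kstat.zfs.misc.arcstats."


def parse_arcstats(sysctl_text: str) -> dict[str, int]:
    """Parse 'kstat.zfs.misc.arcstats.<name>: <value>' lines into a dict."""
    stats = {}
    for line in sysctl_text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and name.startswith(PREFIX) and value.strip().isdigit():
            stats[name[len(PREFIX):]] = int(value)
    return stats


def demand_metadata_rate(before: dict, after: dict, interval_s: float) -> float:
    """Demand metadata requests (hits + misses) per second between snapshots."""
    keys = ("demand_metadata_hits", "demand_metadata_misses")
    delta = sum(after[k] - before[k] for k in keys)
    return delta / interval_s


# Made-up snapshots taken 60 seconds apart (illustration only)
t0 = parse_arcstats(
    "kstat.zfs.misc.arcstats.demand_metadata_hits: 1000000\n"
    "kstat.zfs.misc.arcstats.demand_metadata_misses: 5000\n"
)
t1 = parse_arcstats(
    "kstat.zfs.misc.arcstats.demand_metadata_hits: 1600000\n"
    "kstat.zfs.misc.arcstats.demand_metadata_misses: 6000\n"
)
rate = demand_metadata_rate(t0, t1, 60)
print(f"{rate:.0f} demand metadata requests per second")
```

On a real system the two text snapshots would come from running `sysctl kstat.zfs.misc.arcstats` twice, a known interval apart.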
 

Nicolas_Studiokgb

Hi all
Hi @anodos

Back on this metadata issue... It's getting very scary... clearly

I* tried to copy 708 files (644 megabytes) from my Mac onto the SMB share, into a folder containing 1659 files (audio WAV files)
(drag and drop from one folder to the other with the Mac OS Finder).
The server completely collapsed and cranked up to 20M ARC requests for 33 minutes (!!!)
The Mac showed "preparing to copy to" and then copied during all this time.
33 minutes to copy 644 MB of audio files, with the smbd process at 99% on random cores.

So it's neither a huge number of files nor a huge amount of data to copy. So what is the problem?

This problem is recurrent and random. I can try to reproduce it and it may not behave like that twice
(I've just synchronized another folder from the Mac to the server: 2.9 GB in 26 seconds, same type of audio files).

As we can see, memory is clearly free.

Help

(*: I say "I" to simplify, but it was one of my editors... just to say that each user has their own working habits and we have to find solutions for every use case, especially for such a simple workflow)

1698782246643.png
1698783786129.png
1698782745472.png
 

John45622

Do we know if this happens on TN Scale as well? Thinking of moving to scale if this only happens on core. But I'm not keen on doing that if it doesn't make a difference.
 

Nicolas_Studiokgb

I don't know and I'm not sure I want to migrate if there is no assurance it would solve the problem (and not add some new ones)
 

Davvo

Afaik there should be no difference since the issue is about Apple and how it integrates with SMB.
 

Nicolas_Studiokgb

I think something can clearly be done server side, as we worked a lot on this with @anodos during this year and the last TN release implements corrections to limit the issue. Still, it's not completely gone.
 

John45622

I have ordered more RAM to see if that helps, but the fact that you have almost 10 times more RAM than I do and still have the problems doesn't make me very optimistic. My machine has now reached a point where it's impossible to copy large folders without the volume dismounting. I don't even know how to copy the data off the machine in order to remove TN from our workflow entirely and look into alternative solutions. That said, today our pfSense (Suricata) reported this:

1700390449309.png


.200 is the server. I don't think it relates to anything as I'm not actually routing the storage. All clients and the server are on the same network and the above is the first time this was reported so I highly doubt it means anything.

Maybe TN just isn't the right thing for a mac based media production...

(We also tried NFS but it's just very slow. I get about 30MB/s at best over a 10Gb network. SMB is almost line speed over the same physical connection, so sadly NFS isn't an option either.)
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
Apple SMB problem is an Apple problem; their solution is to use more Apple.

It was the same with their WiFi for a number of years, until they realized they weren't going to get corporates to implement Apple-based WiFi either instead of or alongside other solutions; plus, I don't think Apple was going to invest once meshing became a 'thing'. Counterpoint: SMB has been around how long now?

I doubt they'll be bothered with fixing it considering they have an Apple File Server 'solution' for anything on-prem, meanwhile, anything non-prem is just someone else's computer and not using same protocols.


Of issue; is SMB1 enabled?
 

John45622

Well, there is no such thing as an "apple file server solution" anymore. Mac-based media houses run iSCSI and Fibre Channel servers from DDP, Terrablock, AVID and other suppliers all the time, and they are all based on some Unix derivatives, not Apple hardware or OS. And TN say they supply media houses all the time and even have promo videos on installations in Mac-based media houses. If it boiled down to OSX per se, I'm sure none of these companies would use TN or any other non-Apple server infrastructure. I'll see what more RAM does...
There's even a whitepaper that says TN is "perfect for ProTools": https://www.truenas.com/media-creation/#TrueNAS-PDF-white-paper-truenas-for-audio-production/3/
And another about media production talks about handling "billions of media files". I doubt they would claim this if OSX was a problem per se, as the post-production world is VERY Mac driven.
 