How to get consistent character encoding - SMB share vs. FreeNAS/FreeBSD

Status
Not open for further replies.

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Hi all,

In one environment I support, there are several SMB shares on a FreeNAS 11.1U5 system that are accessed exclusively by Macs, running either the current Mac OS release or the one before it.

Samba charset configuration is as follows:
Code:
dos charset = CP437
unix charset = UTF-8


When I look at the share datasets in a shell on the FreeNAS I have this environment:
Code:
LANG=en_US.UTF-8
MM_CHARSET=UTF-8


Simple characters in file names like German "ä, ö, ü" are displayed correctly when I use e.g. ls, but some names contain characters that are displayed as question marks instead of what the Mac users see on their shares.
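
One way to see what is actually stored in such a name is to dump its code points instead of relying on what ls prints. The following is only a minimal sketch, assuming Python 3 is available in the shell and that the names are stored as UTF-8; the helper names are made up for illustration. It flags every name that contains a character from one of the Unicode "C" categories (control, format, private use, surrogate, unassigned), which are the usual candidates for being rendered as "?".

```python
import os
import unicodedata


def code_points(name):
    """Return the code points of a file name as 'U+XXXX' strings."""
    return ["U+%04X" % ord(ch) for ch in name]


def is_suspicious(name):
    """True if any character belongs to a Unicode 'C' category.

    That covers control (Cc), format (Cf), private use (Co),
    surrogate (Cs) and unassigned (Cn) code points. Bytes that are
    not valid UTF-8 show up as surrogates, because os.listdir()
    decodes names with the 'surrogateescape' error handler.
    """
    return any(unicodedata.category(ch).startswith("C") for ch in name)


def suspicious_names(directory):
    """List the entries of a directory whose names look suspicious."""
    return [n for n in os.listdir(directory) if is_suspicious(n)]
```

Calling suspicious_names() on a share dataset's mount point and printing code_points() for each hit should show which code points are behind the question marks. Accented letters such as "ä" are not flagged, whether they are stored precomposed or as a base letter plus combining mark.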

Any hints on how to get this consistent are most welcome.

Thanks,
Patrick
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It'd be interesting to know exactly what OS X is doing, since this has come up before. Windows just works with the default settings.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
OK, the directories I first stumbled upon all have trailing dots in their names when viewed on a Mac.

So this is what happens: the Mac translates a space (0x20) to 0xF028, but only if it is the last character of the name. The same goes for a period: 0x2E becomes 0xF029.

Sources:
https://valentijn.sessink.nl/?p=652
https://kb.acronis.com/content/39790

This calls for further investigation ...

Patrick
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Well ... looks like this is hard-coded in the Mac SMB client, unless Apple has changed things in recent years:

https://opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/vfs/vfs_utfconv.c.auto.html
Code:
/*
 * Invalid NTFS filename characters are encodeded using the
 * SFM (Services for Macintosh) private use Unicode characters.
 *
 * These should only be used for SMB, MSDOS or NTFS.
 *
 *  Illegal NTFS Char   SFM Unicode Char
 *  ----------------------------------------
 *  0x01-0x1f           0xf001-0xf01f
 *  '"'                 0xf020
 *  '*'                 0xf021
 *  '/'                 0xf022
 *  '<'                 0xf023
 *  '>'                 0xf024
 *  '?'                 0xf025
 *  '\'                 0xf026
 *  '|'                 0xf027
 *  ' '                 0xf028  (Only if last char of the name)
 *  '.'                 0xf029  (Only if last char of the name)
 *  ----------------------------------------
 *
 *  Reference: http://support.microsoft.com/kb/q117258/
 */
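
The table in that comment can be turned into a small lookup for reading the names on the FreeNAS side. This is only a sketch built from the quoted comment, not something taken from Samba or FreeNAS; the names SFM_TO_ASCII and sfm_to_display are made up for illustration. It is meant for display only: some of the decoded characters, such as "/", cannot appear in a real file name on the server.

```python
# Mapping copied from the table in the vfs_utfconv.c comment:
# 0xf001-0xf01f stand for the control characters 0x01-0x1f ...
SFM_TO_ASCII = {0xF000 + c: c for c in range(0x01, 0x20)}
# ... and 0xf020-0xf029 stand for the remaining illegal characters.
SFM_TO_ASCII.update({
    0xF020: ord('"'),
    0xF021: ord("*"),
    0xF022: ord("/"),
    0xF023: ord("<"),
    0xF024: ord(">"),
    0xF025: ord("?"),
    0xF026: ord("\\"),
    0xF027: ord("|"),
    0xF028: ord(" "),  # per the table, only used for a trailing space
    0xF029: ord("."),  # per the table, only used for a trailing period
})


def sfm_to_display(name):
    """Replace SFM private-use characters with the characters they stand for.

    The position of 0xF028/0xF029 is not checked; this sketch simply
    translates them wherever they occur.
    """
    return name.translate(SFM_TO_ASCII)
```

For example, sfm_to_display("Projekt\uf029") gives "Projekt.", which matches the trailing dot the Mac users see.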


I guess I'll try AFP ...

Patrick
 