Linux Jails - Experimental Script

Kailee71

Contributor
Joined
Jul 8, 2018
Messages
110
If I understand @Kris Moore correctly, their minds are open if enough people give feedback on this. At least that is what I read into his comment. I for one am super impressed with what Jailmaker can already do and look forward to seeing it develop further. Personally, I only need one or two extra features that should be possible to implement with reasonable effort: easier implementation of bridging (although a lot of that improvement could come from TNS) and, especially, using zvols or datasets as root fs backing.
Keep up the great work @Jip-Hop!
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

Kailee71

Contributor
Joined
Jul 8, 2018
Messages
110
Oooh there's movement from iX!

Thanks @Kris Moore for your support on this :smile:
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
So far only @cap has reported trying out jailmaker with rootless jails. So before considering to make this the default, I think it requires more testing and documentation.

- can't bind mount directories from available ZFS datasets as-is (have to chown first or possibly alter ACLs, assuming existing files are not already owned by UIDs in the range from 65536 to 131072).

I always welcome additional testing or (documentation) contributions!
That range of IDs may be in use by local users (somewhat unlikely, as by default we increment monotonically from 3000) or when AD is enabled. In FreeNAS / TrueNAS prior to version 12, the default range for AD users was 20000 - 90000000; since then it has been bumped up to start at 100,000,000, but there may be legacy or custom deployments that this will conflict with. A reasonable rule of thumb: if you're using this sort of feature, don't use directory services (LDAP or AD).
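To make the overlap concern concrete, here is a minimal Python sketch of how one might check a host for IDs that collide with the jail's range. It assumes the jail maps 65536 IDs starting at 65536 (so the last mapped ID is 131071); the helper names are made up for illustration and are not part of jailmaker or TrueNAS.

```python
# Host ID range used by the rootless jail, assuming a 65536-wide mapping that
# starts at 65536 (so the last mapped ID is 131071). Adjust for your setup.
JAIL_FIRST = 65536
JAIL_LAST = 65536 + 65536 - 1


def ids_in_jail_range(ids, first=JAIL_FIRST, last=JAIL_LAST):
    """Return the sorted, de-duplicated IDs that fall inside [first, last]."""
    return sorted({i for i in ids if first <= i <= last})


def ranges_overlap(a_first, a_last, b_first, b_last):
    """Return True if the inclusive ranges [a_first, a_last] and [b_first, b_last] intersect."""
    return a_first <= b_last and b_first <= a_last


def local_uids():
    """Collect the UIDs known to the host's user database (Unix only)."""
    import pwd  # standard library, not available on Windows
    return [entry.pw_uid for entry in pwd.getpwall()]


if __name__ == "__main__":
    print("Local UIDs inside the jail range:", ids_in_jail_range(local_uids()))
    # Legacy AD default range quoted above: 20000 - 90000000
    print("Legacy AD range overlaps:", ranges_overlap(20000, 90000000, JAIL_FIRST, JAIL_LAST))
    # Newer AD default start quoted above: 100,000,000
    print("Newer AD start is above the jail range:", 100_000_000 > JAIL_LAST)
```

Whether directory-service users show up in `pwd.getpwall()` likely depends on how enumeration is configured, so an empty result is not proof that there is no conflict.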
 

snicke

Explorer
Joined
May 5, 2015
Messages
74
So far only @cap has reported trying out jailmaker with rootless jails. So before considering to make this the default, I think it requires more testing and documentation.

I see these possible complications:

- docker info inside a rootless jail shows "Native Overlay Diff: false" (not sure if this is normal for docker in a rootless container...)
- can't use host networking when running docker in rootless jail (have to resort to macvlan or bridge networking, which requires specific hardware or additional setup)
- can't bind mount directories from available ZFS datasets as-is (have to chown first or possibly alter ACLs, assuming existing files are not already owned by UIDs in the range from 65536 to 131072).

I always welcome additional testing or (documentation) contributions!
What if this scenario is turned around by running Podman inside the systemd-nspawn containers instead of Docker? Any thoughts about that setup security-wise considering that Podman is running as non-root by default?

I.e., we would have the systemd-nspawn container running as root, but the Podman containers inside the systemd-nspawn container running as non-root. As I understand it, Podman should work as a drop-in replacement for Docker.
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
Podman in an unprivileged container :D

I haven't done much more with the jails due to lack of time.
But I can say that everything I briefly tested with the unprivileged jails seems to work. These were some Docker containers (Immich, Emby, Jellyfin, Paperless NGX and a few more). But it can also be nice to install something directly without Docker inside a jail.
Edit:
I just remembered:
Emby was super slow when I accessed the library (bind mount). I didn't have time to find out the cause. Maybe I'll find time to test something next week.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
I experienced something similar with Plex, but it definitely wasn't what I'd call 'super slow' and wasn't a dealbreaker. In the grand scheme of things it wasn't a big deal: instead of a movie starting playback in 2 seconds, it took 5 seconds.

Not sure what would be the best way to benchmark my bind mounts for testing purposes.
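One rough way to compare a bind mount inside the jail against the same path on the host is to time a metadata walk and a sequential read. The following is only a sketch with made-up function names, not an established benchmark; caching (page cache, ZFS ARC) will skew repeated runs, so results are only indicative.

```python
import os
import sys
import time


def time_walk(root):
    """Walk a directory tree, stat every file, and return (file_count, seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or cannot be accessed
            count += 1
    return count, time.perf_counter() - start


def time_read(path, chunk_size=1024 * 1024, max_bytes=256 * 1024 * 1024):
    """Sequentially read up to max_bytes from a file and return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as fh:
        while total < max_bytes:
            chunk = fh.read(min(chunk_size, max_bytes - total))
            if not chunk:
                break  # end of file
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Usage: python bench.py /path/to/bind/mount [/path/to/large/file]
    if len(sys.argv) > 1:
        files, secs = time_walk(sys.argv[1])
        print(f"stat'ed {files} files in {secs:.2f}s")
    if len(sys.argv) > 2:
        nbytes, secs = time_read(sys.argv[2])
        print(f"read {nbytes} bytes in {secs:.2f}s")
```

Running the same script once inside the jail and once on the host, against the same directory and file, gives a like-for-like comparison of the two access paths.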
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
What if this scenario is turned around by running Podman inside the systemd-nspawn containers instead of Docker? Any thoughts about that setup security-wise considering that Podman is running as non-root by default?

I.e., we would have the systemd-nspawn container running as root, but the Podman containers inside the systemd-nspawn container running as non-root. As I understand it, Podman should work as a drop-in replacement for Docker.

It would be a step in the right direction. I'd still prefer for root in the container to be mapped to another UID/GID.

I tried using podman, but it failed with an error complaining about not being compatible with zfs storage driver.

@Jip-Hop - would the below help if we wanted to ensure specific UIDs are mapped while all others (including root) are mapped to UIDs/GIDs above 65536?

The main problem I'm looking to solve is hardening the security of the jail, especially for root, without needing to change the ownership of the files on the host.
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
It looks like using idmap with the bind mounts will suit what I’m trying to do, so I can do a 1:1 map between non-root users on the container and host, with root in the container effectively being nobody/nogroup on the host.

I think it was mentioned earlier in the thread, but I only just realised that idmap is incompatible with NFSv4 ACLs and only works with POSIX. That's what I observed during testing, unless I've misinterpreted things.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
Odd. It’s noticeably slow and delayed. Running Plex in a Docker container, playback on clients is very slow to start. By comparison, I’ve got Docker in a VM with data mounted by NFS share and playback is instant.
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
As I wrote above, Emby was super slow for me. It took a very long time until the library was loaded and you could access e.g. the movie library or music library from the web interface or AppleTV.
However, I only tested this briefly and haven't done anything at all with the jails recently due to a lack of time. Having just read your post, I looked and saw that I haven't deleted a test jail yet. Emby is even still installed. This time everything is going fast - as you would expect.
I had a power outage last time and the server rebooted. I don't think that's the reason. Perhaps the library was not yet fully processed internally by Emby. I'll try to keep an eye on it.

I actually prefer Jellyfin. But there are no good clients for AppleTV if you also want to access music and photos.
I would rather use no media server at all than use Plex. And I say that as someone who has been there since minute 1, when Elan forked Kodi/XBMC many years ago. Plex requires a mandatory account and an internet connection, its privacy is more than dubious, and there have been security incidents. I've been away from Plex for years. Jellyfin is the best choice from this perspective. Emby is in between.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
I'll give it another go some time. I was just following the official installation method for Debian from the podman website.

I was able to get idmap working well and in practice it ended up working almost as well as an unprivileged LXC, so I'm happy with just using docker compose. I've got another kid on the way, so I feel like anything I learn about podman now will probably vacate my brain once the sleep deprivation kicks in!

In the meantime, I'm going to do a bit more testing with jailmaker, so it's going to remain secondary to my Docker VM on Proxmox for now. It's a bit of a shame you can't do uid/gid mapping on datasets with NFSv4 ACLs. Someone submitted it as a feature request for systemd-nspawn a few months ago and it looks like it was accepted, so I guess it's just a matter of time until it gets implemented in the same way you can do with LXC already. If I was able to just make it so only root in the jail was neutered, even that would be fine.
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
I don't quite understand the problem. It may be a little inconvenient, but it's quick and you don't do it all the time.
I have currently created two users in TrueNAS Scale: one for the test jail "myjail" and one for the Docker container running Emby there (UID: 3050; PUID 3050):

myjail - UID: 65536
docker_3050 - UID: 68586 (= 65536 + 3050)


Both users have NFSv4 read permissions (set in the TrueNAS Scale UI) on the corresponding dataset "center", including its files, which is used as a bind mount.

Code:
startup=1
docker_compatible=1
gpu_passthrough_intel=1
gpu_passthrough_nvidia=0

systemd_nspawn_user_args=--network-bridge=br0 --resolv-conf=bind-host --private-users=65536:65536 --private-users-ownership=chown --bind-ro=/mnt/default/media/center

# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed --property=Type=notify --property=RestartForceExitStatus=133 --property=SuccessExitStatus=133 --property=Delegate=yes --property=TasksMax=infinity --collect --setenv=SYSTEMD_NSP>
systemd_nspawn_default_args=--keep-unit --quiet --boot
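
The host UIDs listed above follow directly from the `--private-users=65536:65536` argument in this config: container ID N appears on the host as base + N. A small Python sketch of that arithmetic, with hypothetical helper names and handling only the numeric BASE[:SIZE] form of the flag:

```python
def parse_private_users(arg):
    """Parse the numeric form '--private-users=BASE[:SIZE]' into (base, size).

    Only the numeric form is handled here; SIZE defaults to 65536 when omitted.
    """
    prefix = "--private-users="
    if not arg.startswith(prefix):
        raise ValueError(f"not a --private-users argument: {arg!r}")
    parts = arg[len(prefix):].split(":")
    if len(parts) not in (1, 2):
        raise ValueError(f"expected BASE or BASE:SIZE, got: {arg!r}")
    base = int(parts[0])
    size = int(parts[1]) if len(parts) == 2 else 65536
    return base, size


def host_id(container_id, base, size):
    """Map a UID/GID inside the jail to the ID it has on the host."""
    if not 0 <= container_id < size:
        raise ValueError(f"container ID {container_id} is outside the mapped range")
    return base + container_id


def container_id(host_id_value, base, size):
    """Map a host UID/GID back to the ID seen inside the jail."""
    if not base <= host_id_value < base + size:
        raise ValueError(f"host ID {host_id_value} is outside the mapped range")
    return host_id_value - base


if __name__ == "__main__":
    base, size = parse_private_users("--private-users=65536:65536")
    print("jail root on host:", host_id(0, base, size))        # 65536
    print("UID 3050 on host:", host_id(3050, base, size))      # 68586
    print("host 68586 in jail:", container_id(68586, base, size))  # 3050
```

This is why the host needs a user with UID 68586 (and one with UID 65536 for the jail's root) to grant dataset permissions to the Emby process running as UID 3050 inside the jail.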


It works. The only strange thing was that it was slow the first time I tried it with Emby. But I didn't notice anything like that with other applications like Navidrome.
I hope to find some time in the coming week. Then I will test something - also with Podman. But I can't promise anything.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
Thanks, I’ll give that a go. I probably overcomplicated things by attempting to use idmap on the bind mounts in addition to offsetting all uids/gids.
 

Jip-Hop

Contributor
Joined
Apr 13, 2021
Messages
118
The --bind-user= flag is quite interesting too. I created a test user named rootlesstest (password disabled, bash login shell) and just tried the --bind-user=rootlesstest flag with a rootless jail, and I could share the home directory of my rootlesstest user inside the jail. There the rootlesstest user magically exists and can access (as well as create) files in the home dir at /var/run/host/home/rootlesstest, even though the user runs with uid 60514 inside the jail and has uid 6000 outside it.
 

skittlebrau

Explorer
Joined
Sep 1, 2017
Messages
54
Weird. My instance of syncthing really does not like running inside jailmaker on my server.

After a short period of indexing, the syncthing database gets corrupted, and this happens even after clearing it and starting with a fresh database. I'm also seeing a lot of dmesg errors on the TrueNAS host complaining about the process ID linked to syncthing, plus some errors mentioning macvlan. The macvlan issues could be something funky going on with my NIC (Intel X540-T2 10G NIC), but that shouldn't affect the syncthing database, since that's running locally on an NVMe SSD.

Other services that rely on large databases (Plex and Photoview) haven't been a problem.
 