This is a strange one, and I'm honestly not sure where to ask about this, or how to proceed with it. But let me give some context and details:
- FreeNAS 11.2 U7
- Supermicro X11SAE-M-O
- 32GB ECC Crucial RAM
- Seasonic Focus Plus Platinum 550W PSU
- half a dozen WD Red HDDs - main storage for media
- misc SSDs (Samsung mainly IIRC) - jails and misc stuff
I have many jails running and have been toying with one for CCTV use, looking at motion and motioneye for this. It mostly works acceptably (especially after updating motion to the latest version). However, one very strange thing happens: after a video is recorded, if you navigate to the list of videos, motioneye (Python scripts) calls out to ffmpeg to convert the video, and the first call hangs. It never returns and it uses no CPU. The same command run from the command line completes immediately. If I kill the ffmpeg process, it becomes a zombie and eventually disappears, after which the Python script continues and the subsequent ffmpeg call (for the next video) succeeds.
I have cleaned out the directory several times, and it keeps happening... meaning it isn't just one corrupt video file or something.
I tried to run truss on the process, and the kernel panicked!
Code:
Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address   = 0x2b0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80b17c5a
stack pointer           = 0x28:0xfffffe08556f4520
frame pointer           = 0x28:0xfffffe08556f4640
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 32635 (truss)
I am somewhat reluctant to repeat this, but I do think the crash is repeatable. I also do not believe this is a memory issue, as I'd expect to see random failures rather than specifically "ffmpeg called by python the first time". I can reproduce this python/ffmpeg hang 100% of the time, even across reboots.
The crashing PID looked like this:
Code:
Tracing pid 32635 tid 101441 td 0xfffff80258532620
kern_ptrace() at kern_ptrace+0x138a/frame 0xfffffe08556f4640
sys_ptrace() at sys_ptrace+0x18d/frame 0xfffffe08556f4880
amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe08556f49b0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe08556f49b0
--- syscall (26, FreeBSD ELF64, sys_ptrace), rip = 0x800ae586a, rsp = 0x7fffffffeae8, rbp = 0x7fffffffeb10 ---
The PID I was trying to run truss on looked like this:
Code:
Tracing command ffmpeg pid 29505 tid 101577 td 0xfffff80233b48000
sched_switch() at sched_switch+0x8ad/frame 0xfffffe0855422460
mi_switch() at mi_switch+0xe6/frame 0xfffffe0855422490
thread_suspend_check() at thread_suspend_check+0x2a7/frame 0xfffffe08554224e0
sleepq_catch_signals() at sleepq_catch_signals+0x1e9/frame 0xfffffe0855422540
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe0855422570
_cv_wait_sig() at _cv_wait_sig+0x167/frame 0xfffffe08554225c0
tty_wait_background() at tty_wait_background+0x19c/frame 0xfffffe0855422690
ttydev_ioctl() at ttydev_ioctl+0x158/frame 0xfffffe08554226e0
devfs_ioctl_f() at devfs_ioctl_f+0x128/frame 0xfffffe0855422740
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe08554227b0
sys_ioctl() at sys_ioctl+0x15c/frame 0xfffffe0855422880
amd64_syscall() at amd64_syscall+0xa38/frame 0xfffffe08554229b0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe08554229b0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8043adf7a, rsp = 0x7fffffffe7f8, rbp = 0x7fffffffe840 ---
This system has been stable for about 12 months now, and nothing has changed recently other than adding a jail to run motion/motioneye in.
Typical. As I proofread the post before clicking post, I noticed something: motioneye is running from the terminal in the background, and I had noticed that when I foregrounded it, the process was suspended. That makes sense given the ffmpeg stack trace above (i.e. sleeping inside the tty device ioctl call). This is good, as it means I can avoid the issue: presumably ffmpeg is trying to write something to stdout and there is nowhere for it to go. I can work around that. The kernel panic is not good, but if it only happens when running truss on a suspended process, I can just avoid doing that.
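To check whether a process is in the situation the trace suggests (attached to a terminal but not in its foreground process group, which is where tty_wait_background() parks a caller), something like the following rough sketch could be dropped into the Python side. The function name is made up for illustration, and this is only an interpretation of the trace, not something verified on this system.

```python
import os


def is_background_on_tty(fd=0):
    """Return True if fd is a terminal and this process is not in the
    terminal's foreground process group (hypothetical helper)."""
    if not os.isatty(fd):
        # No terminal on this descriptor, so no job-control stop can occur on it.
        return False
    try:
        # Compare the terminal's foreground process group with our own.
        return os.tcgetpgrp(fd) != os.getpgrp()
    except OSError:
        # For example, the terminal is not our controlling terminal.
        return False


if __name__ == "__main__":
    print("background on tty:", is_background_on_tty())
```

If this prints True from inside the backgrounded service, any child it spawns that changes terminal settings on the inherited descriptors is likely to be stopped in the same way.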
I have tested this on FreeBSD 12 and it does not panic, so I'm going to say this is solved and leave this here in case someone else hits something similar.
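For anyone who hits the same hang, here is a minimal sketch of one possible workaround: launch the child with its standard streams detached from the terminal and in its own session, so it has no controlling tty to block on. The helper names and file names are hypothetical, and this is not how motioneye itself invokes ffmpeg. The `-nostdin` flag is a standard ffmpeg option that turns off its interactive keyboard handling; the assumption here is that this handling is what issues the terminal ioctl seen in the trace.

```python
import subprocess


def run_without_tty(cmd, timeout=None):
    """Run cmd with stdin from /dev/null and output captured (hypothetical helper)."""
    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,   # child cannot touch the terminal via stdin
        stdout=subprocess.PIPE,     # capture output instead of inheriting the tty
        stderr=subprocess.PIPE,
        start_new_session=True,     # setsid() in the child: no controlling terminal
        timeout=timeout,
    )


def ffmpeg_cmd(src, dst):
    """Build an ffmpeg command line; src and dst are placeholder file names."""
    # -nostdin disables ffmpeg's interactive stdin handling.
    return ["ffmpeg", "-nostdin", "-i", src, dst]


if __name__ == "__main__":
    result = run_without_tty(ffmpeg_cmd("input.avi", "output.mp4"), timeout=600)
    print("exit status:", result.returncode)
```

The other route is to leave the scripts alone and start motioneye itself detached from the terminal (for example via daemon(8) or with its input redirected from /dev/null) rather than backgrounding it from an interactive shell.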