Time not syncing with timeserver

duskwither

Cadet
Joined
Mar 20, 2024
Messages
5
Two separate TrueNAS SCALE machines, both bare metal, are being used as SMB file servers within a Windows AD environment. The timeserver is the Windows domain controller, a VM that gets its time from the ESXi host it runs on. Domain controller time is correct, and Windows clients in the domain have no time issues either. Both are in the UTC timezone, or at least configured to be.

My TrueNAS machines have time drift issues which I can't explain.

Massive offset (I changed the polling interval to 16 seconds to get some faster results):
Code:
root@hostname[~]# ntpq -npcrv
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.151.128.6    .LOCL.           1 u    -   16  377    0.397  +12795.   1.154

associd=0 status=c016 leap_alarm, sync_unspec, 1 event, restart,
version="ntpd 4.2.8p15@1.3728-o Wed Sep 23 11:46:38 UTC 2020 (1)",
processor="x86_64", system="Linux/5.10.142+truenas", leap=11, stratum=16,
precision=-23, rootdelay=0.000, rootdisp=0.000, refid=.,
reftime=(no time),
clock=e9a5783c.85ef52fa  Wed, Mar 20 2024 14:59:08.523, peer=0, tc=3,
mintc=3, offset=+0.000000, frequency=+0.000, sys_jitter=0.000000,
clk_jitter=0.000, clk_wander=0.000


Timedatectl output:
Code:
root@hostname[~]# timedatectl
               Local time: Wed 2024-03-20 14:58:21 UTC
           Universal time: Wed 2024-03-20 14:58:21 UTC
                 RTC time: Wed 2024-03-20 14:58:22
                Time zone: UTC (UTC, +0000)
System clock synchronized: no
              NTP service: n/a
          RTC in local TZ: no


ntp.conf
Code:
root@hostname[~]# cat /etc/ntp.conf
server 10.151.128.6 iburst maxpoll 10 minpoll 4
restrict default ignore
restrict -6 default ignore
restrict 127.0.0.1
restrict -6 ::1
restrict 127.127.1.0
restrict 10.151.128.6 nomodify notrap nopeer noquery


  • Checked the BIOS on both machines; I don't think I can set a timezone there, but the time is not too far off real time.
  • Both machines were upgraded from TrueNAS Core, which had no time sync issues once I ran the ntpdate command from a cron job. The GUI settings in TrueNAS Core were always problematic as well.
  • I checked with tcpdump and can see UDP packets going back and forth between the timeserver and my TrueNAS machines, but nothing seems to be changing.
I'm slowly getting grey hair trying to troubleshoot this problem, any help will be greatly appreciated.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399

duskwither

Samuel Tai said:
On Core, TSC may be improperly flagged with an erroneously high quality. Switching to ACPI or HPET as the timecounter fixes this. See https://www.truenas.com/community/threads/system-is-going-back-in-time-wrong-date.85686/.

There may be equivalent tunables for Scale. For Ubuntu, which is a Debian derivative like Scale, see https://manpages.ubuntu.com/manpages/trusty/en/man4/timecounters.4freebsd.html.

Thanks! I've been looking but can't seem to find any timecounter choices in Scale:

With sysctl I can't find kern.timecounter. There is no kern tree to begin with; sysctl kernel does give me a list, but there is no timecounter in it.
Code:
root@hostname[~]# sysctl kernel
snipped
kernel.tainted = 12289
kernel.threads-max = 1026842
kernel.timer_migration = 1
kernel.traceoff_on_warning = 0
kernel.tracepoint_printk = 0
/snipped


dmesg doesn't give me any direct hints either:
Code:
root@hostname[~]# dmesg | grep -i time
[    0.009983] ACPI: PM-Timer IO Port: 0x808
[    0.065739] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.083298] Calibrating delay loop (skipped), value calculated using timer frequency.. 6399.82 BogoMIPS (lpj=12799656)
[    1.202330] workingset: timestamp_bits=36 max_order=25 bucket_order=0
[   88.747537] systemd-journald[1763]: Received client request to flush runtime journal.
[   89.036563] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
[   94.685450] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer
[  148.900513] systemd-journald[7722]: Received client request to flush runtime journal.
[  150.896067] systemd-journald[7860]: Received client request to flush runtime journal.

Any tips or suggestions?
 

Samuel Tai

Try sysctl -a | grep timecounter.
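
A note for SCALE: kern.timecounter is a FreeBSD sysctl, so on a Linux kernel this grep will most likely come back empty. Linux exposes the same idea as the "clocksource", readable through sysfs. A minimal sketch, assuming the standard sysfs paths are present on the box (the function name and its optional directory argument are only for illustration):

```shell
#!/usr/bin/env bash
# Print the active and the available Linux clocksources.
# The optional directory argument exists only so the function can be
# pointed at a test fixture; by default it reads the real sysfs location.
show_clocksource() {
    local dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    if [ -r "$dir/current_clocksource" ]; then
        echo "current:   $(cat "$dir/current_clocksource")"
        echo "available: $(cat "$dir/available_clocksource")"
    else
        echo "no clocksource information under $dir" >&2
        return 1
    fi
}

# Report what this machine is using; do not abort if sysfs is missing.
show_clocksource || true
```

If the current entry is tsc and the clock keeps wandering, trying one of the other listed sources (for example hpet or acpi_pm, if available) would be the Linux counterpart of the Core workaround. Likewise, dmesg | grep -i clocksource is usually more telling than grepping for "time".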
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
What version of SCALE is this? It looks like Angelfish (which is very much EOL). Since Angelfish we've had several time-related fixes (including ones in newer kernels).
 

duskwither

Oof, I was convinced I installed a supported version but you're right, this is Angelfish. I'll update to a supported version and report back.
 

duskwither

Updated to Cobia and still having similar issues:
Code:
root@hostname[~]# chronyc sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? timeserver.domain.xxxxxx>     1   4   377     1  -3615ms[-3615ms] +/-  10.8s

The ? above means the timeserver (Windows DC) is unusable. Need to do some more troubleshooting, unless someone has a magic tip on why the timeserver seems to be unusable.
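
One thing that may be worth checking, offered as a guess rather than a diagnosis: the "+/- 10.8s" column is chrony's estimated error for the source, and chrony rejects sources whose root distance exceeds its maxdistance setting, which is documented as 3 seconds by default. A rough sketch that converts that field to seconds and compares it with the limit (exceeds_maxdistance is a made-up helper, and treating the displayed error as the root distance is an approximation):

```shell
#!/usr/bin/env bash
# Convert a chronyc error field such as "10.8s", "250ms" or "12us" to
# seconds and compare it with a distance limit.
# Assumption: the limit defaults to 3 s, chrony's documented default for
# maxdistance. Fields without a recognised unit are treated as seconds.
exceeds_maxdistance() {
    local field="$1"
    local limit="${2:-3}"
    awk -v f="$field" -v limit="$limit" 'BEGIN {
        value = f + 0                       # leading numeric part of the field
        unit = f
        sub(/^[0-9.]+/, "", unit)           # what remains is the unit suffix
        if (unit == "ms")      value /= 1000
        else if (unit == "us") value /= 1000000
        else if (unit == "ns") value /= 1000000000
        print ((value > limit) ? "yes" : "no")
    }'
}

exceeds_maxdistance 10.8s    # value from the output above, prints "yes"
```

If that turns out to be the cause, chronyc ntpdata should show the root delay and root dispersion the DC is advertising, which would confirm or rule it out.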
 

Samuel Tai

Your DCs are actually not running in UTC, given the large offset. Try setting reg add HKLM\System\CurrentControlSet\Control\TimeZoneInformation /v RealTimeIsUniversal /t REG_DWORD /d 0x1 on the DCs to actually put them on UTC, and adjust their BIOS clocks accordingly.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Samuel Tai said:
given the large offset.

Is it that large? It's reporting 3615ms, which is only 3.6 seconds. Larger than it ought to be, yes, but not indicative of a timezone error.
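
To put numbers on that: a timezone mix-up shows up as an offset close to a whole number of hours (or, for a few zones, a 15-minute step), while both the 3615 ms here and the roughly 12795 ms from the original ntpq output are a matter of seconds. A rough sketch of that check, assuming a 15-minute step and a 60-second tolerance, both arbitrary choices for illustration (classify_offset is a made-up helper name):

```shell
#!/usr/bin/env bash
# Classify an NTP offset (integer milliseconds) as a plausible timezone
# mix-up or as ordinary clock error.
# Assumption: a timezone mistake appears as a nonzero multiple of
# 15 minutes, give or take a tolerance (default 60 s) for real drift.
classify_offset() {
    local ms="${1#-}"          # absolute value: drop a leading minus sign
    ms="${ms#+}"               # ...or a leading plus sign
    local tol_ms="${2:-60000}"
    local step_ms=900000       # 15 minutes in milliseconds
    # Round to the nearest multiple of 15 minutes.
    local nearest=$(( (ms + step_ms / 2) / step_ms * step_ms ))
    local diff=$(( ms > nearest ? ms - nearest : nearest - ms ))
    if [ "$nearest" -gt 0 ] && [ "$diff" -le "$tol_ms" ]; then
        echo "timezone-like"
    else
        echo "drift"
    fi
}

classify_offset -3615      # chronyc sample from this thread: drift
classify_offset 12795      # ntpq offset from the first post: drift
classify_offset 3603615    # hypothetical one hour plus 3.6 s: timezone-like
```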
 