*SOLVED* Clock Drift / NTPd - CentOS OpenVZ Host - Problems [message #6301]
Wed, 13 September 2006 15:33
Registered: June 2006
I'm not positive this is an OpenVZ issue, but I wanted to post here in case others are observing the same thing.
I've got a CentOS 4.4 machine with OpenVZ installed and a few OpenVZ virtual hosts running on it. The hardware / software is very solid, no problems at all there.
One odd thing has been happening: the clock on the system is drifting at an absurd rate -- upwards of 5 seconds every 5 minutes -- and when I try to use ntp to keep it in sync with a local time source, it refuses to do so.
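To put that drift rate in perspective, here is a rough calculation (a sketch, not a diagnosis). It assumes the commonly documented ntpd limit of about 500 PPM for frequency correction; the function and constant names are made up for illustration.

```python
# Hypothetical helper: express clock drift as parts per million (PPM).
def drift_ppm(drift_seconds: float, interval_seconds: float) -> float:
    """Return drift as parts per million of elapsed wall-clock time."""
    return drift_seconds / interval_seconds * 1_000_000


# Assumption: ntpd can only discipline frequency errors up to roughly 500 PPM.
NTPD_MAX_FREQ_PPM = 500.0

observed = drift_ppm(5, 5 * 60)  # 5 seconds of drift every 5 minutes
print(round(observed))  # -> 16667
print(observed > NTPD_MAX_FREQ_PPM)  # -> True
```

If that assumption holds, a drift this large is far outside what ntpd can correct by adjusting the clock frequency, which would be consistent with it refusing to settle on a time source.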
Right now I've got a crontab hitting "ntpdate" command every 5 minutes, which is working "OK" but clearly it is a band-aid solution.
I know from other reading that VMware hosting boxes have issues that result in time skew, and it seemed not inconceivable that this might also be an issue for OpenVZ.
However, it could be purely a problem with CentOS (I've found some threads in that vein already, but nothing conclusive / to my satisfaction yet).
So .. if anyone has any comments .. they certainly would be appreciated..
Sorry if this is a bit off topic.. clearly if/when I find an answer, I'll post it here just for reference.
[Updated on: Fri, 29 September 2006 02:44] by Moderator
Re: Clock Drift / NTPd - CentOS OpenVZ Host - Problems [message #6884 is a reply to message #6881]
Tue, 26 September 2006 09:20
Registered: June 2006
So, some results with a semi-happy ending. I can't say I understand the mess fully, but at least we have a stopgap that is less hideous.
(0) Starting point: was running with unreliable NTP, booted on the "RHEL-Unsupported" OpenVZ precompiled kernel.
(1) rebooted the machine using a stock nonVZ kernel for CentOS. Let ntp try to keep the clock synched. After 5 minutes it established a reliable synch that seemed to hold.
(2) Rebooted the machine to Stock Production VZ Kernel (not the RHEL "unsupported" VZ kernel). Started NTP. It refused to synch, held only to local pseudo-clock and then began to drift again.
(3) Applied some tweaks to the ntp config, and was able to force NTP to synch.
If anyone has thoughts about why this behaviour occurs, I'm all ears. It does appear to suggest that this OpenVZ kernel on my hardware is part of the problem, since a stock CentOS kernel wasn't giving any problems.
Exact notes follow below:
(0) STARTING POINT KERNEL:
Grub entry describes it thus:
title CentOS (2.6.9-023stab016.2-smp) OpenVZ RHEL Kernel Aug-26-06
kernel /vmlinuz-2.6.9-023stab016.2-smp ro root=/dev/md3 console=ttyS0,38400 acpi=off
This was installed via yum from the OpenVZ RPM repo. With this kernel booted, the clock drifts terribly (approx. 1 second per minute).
(1) Reboot to stock kernel, described by grub entry:
title CentOS (2.6.9-42.0.2.ELsmp) CentOS NO OpenVZ Kernel Aug-24-06
kernel /vmlinuz-2.6.9-42.0.2.ELsmp ro root=/dev/md3 console=ttyS0,38400 acpi=off
With this kernel booted (vz hosts down, vz not running) the NTP synch works great.
(2) Reboot using Official VZ Kernel:
title CentOS (2.6.8-022stab078.14-smp) OpenVZ Official Aug-24-06
kernel /vmlinuz-2.6.8-022stab078.14-smp ro root=/dev/md3 console=ttyS0,38400 acpi=off
With this kernel booted, VZ hosts running, the NTP drift is back.
(3) as per hints pulled from the URL:
Specifically, I added the burst and iburst settings to the server line in my ntp config:
server 13X.6X.1.1X3 burst iburst
Bingo, NTP was synched in <5 minutes.
So, for now I'm running the official stock OpenVZ kernel, and I've got NTP working. (I don't currently need the extra features of the RHEL version, namely MAC addresses for VZ hosts, which is why I had tested that kernel in the recent past.)
Maybe the problem somehow relates to the fact that I'm running a 32-bit kernel on a 64-bit platform (AMD Athlon 64 X2 CPU). My Google searches turned up brief references to cases where NTP gets upset when the kernel was built for a platform somewhat dissimilar from the actual hardware. However, this could be just smoke and mirrors on my part, or something totally unrelated. Still, the fact that a vanilla CentOS kernel was able to run NTP happily on this machine without any kludges does point the finger at the OpenVZ kernel, one way or another.
So. That's the end of the story for now.
Re: *SOLVED* Clock Drift / NTPd - CentOS OpenVZ Host - Problems [message #7153 is a reply to message #6301]
Thu, 05 October 2006 09:06
Registered: June 2006
Thought I should post another followup here, since the issue has evolved a fair bit since my last post. The "burst / iburst" tweak to my ntp config appeared to do the trick: the time was never very far out.
However, I was still seeing not-great values reported via "ntpq -p", i.e., offset values in the thousands (ntpq reports offsets in milliseconds). Observing my /var/log/messages file, it was also clear that the ntp daemon was struggling, with frequent changes of the NTP time source being used and occasional "time warp" clock resets (i.e., adjustments of around 120 seconds being made periodically), approximately as follows:
from /var/log/messages ...
Oct 1 04:20:42 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
Oct 1 04:20:47 hydra ntpd: no servers reachable
Oct 1 04:23:20 hydra ntpd: synchronized to 129.1X3.5.100, stratum 2
Oct 1 04:24:18 hydra ntpd: synchronized to 129.1X3.105.63, stratum 3
Oct 1 04:24:21 hydra ntpd: no servers reachable
Oct 1 04:24:35 hydra ntpd: synchronized to 129.1X3.105.63, stratum 3
Oct 1 04:25:31 hydra ntpd: no servers reachable
Oct 1 04:27:03 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
Oct 1 04:25:16 hydra ntpd: time reset -106.823489 s
Oct 1 04:26:35 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
Oct 1 04:26:39 hydra ntpd: no servers reachable
Oct 1 04:27:54 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
Oct 1 04:28:51 hydra ntpd: synchronized to 129.1X3.105.63, stratum 3
Oct 1 04:28:53 hydra ntpd: no servers reachable
Oct 1 04:29:05 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
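For anyone wanting to quantify this kind of flapping rather than eyeball it, here is a minimal sketch that tallies the ntpd events in a syslog excerpt. The function name and the `SAMPLE` excerpt are illustrative only; the message formats are assumed to match the lines quoted above, and other ntpd versions may word them differently.

```python
import re
from collections import Counter

# A few lines copied from the excerpt above, used as sample input.
SAMPLE = """\
Oct 1 04:20:42 hydra ntpd: synchronized to 129.1X3.1.100, stratum 2
Oct 1 04:20:47 hydra ntpd: no servers reachable
Oct 1 04:25:16 hydra ntpd: time reset -106.823489 s
"""

RESET_RE = re.compile(r"time reset ([+-]?\d+\.\d+) s")


def summarize_ntpd_log(log_text):
    """Count sync / unreachable events and collect clock step sizes in seconds."""
    counts = Counter()
    resets = []
    for line in log_text.splitlines():
        if "ntpd" not in line:
            continue  # ignore other daemons sharing the same log file
        if "no servers reachable" in line:
            counts["unreachable"] += 1
        elif "synchronized to" in line:
            counts["synchronized"] += 1
        match = RESET_RE.search(line)
        if match:
            counts["reset"] += 1
            resets.append(float(match.group(1)))
    return counts, resets


counts, resets = summarize_ntpd_log(SAMPLE)
print(dict(counts), resets)
```

Run against a full /var/log/messages, a high "unreachable" count relative to "synchronized", or any entries in the resets list, would point to the same struggling behaviour described here.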
Yesterday I was puttering around with this a bit more. Some reading I came across suggested that a boot-flag option, "clock=pmtmr", might be of some use. I rebooted with this, but it was clear from dmesg that it was being ignored, telling me:
Warning: clock= override failed. Defaulting to PIT
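So, I tweaked my boot parameters again and removed "acpi=off", since I hadn't yet tested behaviour without it under my "new ntp config" (using burst / iburst). (As far as I can tell, pmtmr is the ACPI power-management timer, so it presumably can't be used while ACPI is disabled.)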
Since the change, NTP behaviour is *exactly* what I would have wanted - the time source is no longer having trouble, offset values are reasonable, and the clock is solid. ie:
[root@hydra log]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
*sxt-np-1.XXXX.D 128.XXX.150.93   2 u  111 1024  377    6.370   -0.550   1.929
+KIL-NP-1.XXXX.D XXX.246.168.9    2 u  133 1024  377    3.472    0.198   5.219
 Router.Phys.OCE XXX.173.5.100    3 u  100 1024  357  704.997  350.089 111.705
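As a rough aid for reading output like this, the sketch below parses the peer rows and flags any peer that looks unhealthy. It relies on two things about "ntpq -p": delay, offset and jitter are in milliseconds, and "reach" is an octal bitmask where 377 means the last eight polls all succeeded. The 100 ms threshold and the function names are arbitrary choices for illustration, and only the "*", "+", "-" and "#" tally prefixes are handled.

```python
# The ntpq -p rows quoted above, used as sample input.
SAMPLE = """\
remote refid st t when poll reach delay offset jitter
*sxt-np-1.XXXX.D 128.XXX.150.93 2 u 111 1024 377 6.370 -0.550 1.929
+KIL-NP-1.XXXX.D XXX.246.168.9 2 u 133 1024 377 3.472 0.198 5.219
Router.Phys.OCE XXX.173.5.100 3 u 100 1024 357 704.997 350.089 111.705
"""


def parse_peers(text):
    """Parse 'ntpq -p' peer rows into dicts; header and ruler lines are skipped."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or fields[0] == "remote":
            continue
        name, tally = fields[0], ""
        if name[0] in "*+-#":  # simplified: only some tally codes handled
            tally, name = name[0], name[1:]
        peers.append({
            "tally": tally,
            "name": name,
            "reach": int(fields[6], 8),   # octal shift register of recent polls
            "delay_ms": float(fields[7]),
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


def suspicious(peers, max_offset_ms=100.0):
    """Names of peers with a large offset or at least one recent missed poll."""
    return [p["name"] for p in peers
            if abs(p["offset_ms"]) > max_offset_ms or p["reach"] != 0o377]


print(suspicious(parse_peers(SAMPLE)))  # -> ['Router.Phys.OCE']
```

Here only the stratum 3 peer is flagged; the two peers ntpd actually marked with "*" and "+" both show sub-millisecond offsets and a full reach register.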
and additionally in /var/log/messages,
Oct 4 17:44:44 hydra ntpd: ntpd email@example.com Sun Aug 13 01:49:12 CDT 2006 (1)
Oct 4 17:44:44 hydra ntpd: ntpd startup succeeded
Oct 4 17:44:44 hydra ntpd: precision = 4.000 usec
Oct 4 17:44:44 hydra ntpd: Listening on interface wildcard, 0.0.0.0#123
Oct 4 17:44:44 hydra ntpd: Listening on interface lo, 127.0.0.1#123
Oct 4 17:44:44 hydra ntpd: Listening on interface eth0, XXX.173.23.235#123
Oct 4 17:44:44 hydra ntpd: Listening on interface eth2, XXX.168.111.235#123
Oct 4 17:44:44 hydra ntpd: kernel time sync status 0040
Oct 4 17:44:44 hydra ntpd: frequency initialized -86.648 PPM from /var/lib/ntp/drift
Oct 4 17:44:51 hydra ntpd: synchronized to XXX.173.5.100, stratum 2
Oct 4 17:44:59 hydra ntpd: kernel time sync disabled 0041
Oct 4 17:46:04 hydra ntpd: kernel time sync enabled 0001
Oct 4 22:16:37 hydra sshd(pam_unix): session opened for user chipmant by (uid=0)
i.e., note that since kernel time sync was enabled at 17:46:04, sync hasn't been lost and the clock hasn't drifted; it is solid now.
So, things are infinitely better now than they were in the original state. I'm not sure if this mess of a history will help anyone else, ever, but I thought I should post the end of the story here... just in case.
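The "frequency initialized -86.648 PPM" line in that log is the frequency correction ntpd loaded from its drift file. A quick conversion (a sketch; the function name is made up) shows how small that is compared with the original 5 seconds every 5 minutes:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Convert a frequency error in PPM to seconds gained or lost per day."""
    return ppm / 1_000_000 * SECONDS_PER_DAY


# Value taken from the drift-file line in the log above.
print(round(ppm_to_seconds_per_day(-86.648), 2))  # -> -7.49
```

So the correction being applied now amounts to a few seconds per day, which is an ordinary figure for PC hardware rather than the runaway drift seen earlier.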