[nSLUG] Why is ntp so terrible?

D G Teed donald.teed at gmail.com
Fri Mar 10 14:44:44 AST 2017


The IPv6 issue was the problem.  Here is the run of my checkdate script
now.  A few systems are still working on getting there, but it is much
better...

Fri Mar 10 14:35:48 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:49 AST 2017
Fri Mar 10 14:35:51 AST 2017
Fri Mar 10 14:35:51 AST 2017
Fri Mar 10 14:35:52 AST 2017
Fri Mar 10 14:35:52 AST 2017
Fri Mar 10 14:35:53 AST 2017
Fri Mar 10 14:35:53 AST 2017
Fri Mar 10 14:35:53 AST 2017
Fri Mar 10 14:35:26 AST 2017
Fri Mar 10 14:35:54 AST 2017
Fri Mar 10 14:35:54 AST 2017
Fri Mar 10 14:35:54 AST 2017
Fri Mar 10 14:35:54 AST 2017
Fri Mar 10 14:35:55 AST 2017
Fri Mar 10 14:35:55 AST 2017
Fri Mar 10 14:35:55 AST 2017
Fri Mar 10 14:35:55 AST 2017
Fri Mar 10 14:35:55 AST 2017
Fri Mar 10 14:35:56 AST 2017
Fri Mar 10 14:35:56 AST 2017
Fri Mar 10 14:35:56 AST 2017
Fri Mar 10 14:35:52 AST 2017
Fri Mar 10 14:35:57 AST 2017
Fri Mar 10 14:36:17 AST 2017
Fri Mar 10 14:35:57 AST 2017
Fri Mar 10 14:35:57 AST 2017
Fri Mar 10 14:35:57 AST 2017
Fri Mar 10 14:35:57 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:58 AST 2017
Fri Mar 10 14:35:59 AST 2017
Fri Mar 10 14:35:59 AST 2017
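
The checkdate script itself is not shown here.  As a rough sketch of how
a list like this could be screened, assuming each line is default `date`
output collected from one host after another in quick succession, the
following Python compares each timestamp with the median of its
neighbours and flags any that sit more than a few seconds away.  The
function names, window size and tolerance are illustrative assumptions,
not part of the original script.

```python
from datetime import datetime
from statistics import median


def parse_date_line(line):
    """Parse default `date` output such as 'Fri Mar 10 14:35:48 AST 2017'."""
    parts = line.split()
    # Drop the time zone token (field 5); all hosts are assumed to share it.
    stamp = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def flag_outliers(lines, window=5, tolerance=3.0):
    """Return (index, line, delta_seconds) for clocks far from their neighbours.

    Each entry is compared with the median of up to `window` entries on
    either side, because the hosts are polled one after another and the
    expected time creeps forward through the list.
    """
    times = [parse_date_line(line) for line in lines]
    # Work in seconds relative to the first entry to keep the arithmetic simple.
    secs = [(t - times[0]).total_seconds() for t in times]
    flagged = []
    for i, own in enumerate(secs):
        lo = max(0, i - window)
        hi = min(len(secs), i + window + 1)
        neighbours = [secs[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            continue
        delta = own - median(neighbours)
        if abs(delta) > tolerance:
            flagged.append((i, lines[i], delta))
    return flagged
```

With these defaults, the 14:35:26 and 14:36:17 entries in the run above
should be flagged as roughly 20 to 30 seconds away from their
neighbours, while the slow climb from :48 to :59 should pass.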

The insidious part of this problem is that ntp is quiet about
the failure in the messages log.  We are getting the equivalent
of a 404 in the HTTP world, or a host not found in DNS, and it
just swims along in useless mode.
Even the SAN drifted off in silence.
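
Since nothing obvious lands in the log, one option is to check the peer
table itself from cron or a monitoring system.  A minimal sketch in
Python, assuming the ten-column `ntpq -p` layout quoted further down in
this thread (a tally character, then remote, refid, st, t, when, poll,
reach, delay, offset, jitter) with one peer per line; the function name
is made up for illustration:

```python
def unreachable_peers(ntpq_output):
    """Return (remote, refid, reach) for peers that have never answered.

    `ntpq_output` is the text printed by `ntpq -p`.  The first column of
    every peer line is a one-character tally code (a space if none), so it
    is cut off before the remaining fields are split.
    """
    bad = []
    for line in ntpq_output.splitlines():
        fields = line[1:].split()
        # Skip the header row, the ==== separator and anything malformed.
        if len(fields) < 10 or fields[0] == "remote":
            continue
        remote, refid = fields[0], fields[1]
        try:
            reach = int(fields[6], 8)  # reach is an octal shift register
        except ValueError:
            continue
        if reach == 0 or refid == ".INIT.":
            bad.append((remote, refid, reach))
    return bad
```

A wrapper could feed it the output of `ntpq -p` and send mail or exit
non-zero when the returned list is not empty.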

GPS for time sync...  Not really the problem I'm concerned about.


On Fri, Mar 10, 2017 at 12:20 PM, Dop Ganger <nslug at fop.ns.ca> wrote:

> On Fri, 10 Mar 2017, D G Teed wrote:
>
>> that isn't the nature of this inaccuracy.
>>
>> ntpq -p on the main ntp server which I will call ntp.example.com:
>>
>> # ntpq -p
>>      remote           refid      st t when poll reach   delay   offset  jitter
>> ==============================================================================
>> *time12.nrc.ca   132.246.11.233   2 u  178 1024  377   19.666    0.291   0.620
>> +time1.chu.nrc.c 209.87.233.52    2 u  254 1024  377   46.781    3.205   8.099
>>
>
> Have you considered setting up a GPS to get a reliable stratum 1 time
> source?
>
>> The ninth entry in the list has the time in the sample as 08:50:45
>>
>> Here is ntpq -p run on the 9th system with 8:50:45:
>>
>> # ntpq -p
>>      remote           refid      st t when poll reach   delay   offset  jitter
>> ==============================================================================
>>  ntp.example.com  .INIT.          16 u    - 1024    0    0.000    0.000   0.000
>>
>
> ntpd stuck in INIT usually means it can't connect to the configured
> upstream. Try watching tcpdump after restarting ntpd to see where packets
> are going?
>
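
Where tcpdump is not convenient, a small client-side probe can show much
the same thing.  The sketch below, in Python, sends a bare SNTP client
request to UDP port 123 over one address family at a time and reports
whether anything comes back.  ntp.example.com is the placeholder name
used in this thread; the function names and the timeout are illustrative
assumptions.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def build_request():
    """48-byte client packet: LI=0, VN=3, Mode=3 (client), rest zero."""
    return b"\x1b" + 47 * b"\0"


def parse_reply(data):
    """Return (stratum, server transmit time as Unix seconds)."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    stratum = data[1]
    # The transmit timestamp occupies bytes 40-47: seconds and fraction.
    secs, frac = struct.unpack("!II", data[40:48])
    return stratum, secs - NTP_EPOCH_OFFSET + frac / 2**32


def probe(host, family, timeout=3.0):
    """Query `host` over one address family and describe the outcome."""
    try:
        infos = socket.getaddrinfo(host, 123, family, socket.SOCK_DGRAM)
    except socket.gaierror as exc:
        return f"no address: {exc}"
    af, socktype, proto, _, addr = infos[0]
    with socket.socket(af, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), addr)
            data, _ = sock.recvfrom(512)
        except OSError as exc:  # includes timeouts and unreachable errors
            return f"no reply: {exc}"
    try:
        stratum, unix_time = parse_reply(data)
    except ValueError as exc:
        return f"bad reply: {exc}"
    return f"stratum {stratum}, server time {unix_time:.3f}"
```

Calling probe("ntp.example.com", socket.AF_INET) and then the same with
socket.AF_INET6 from an affected client would show whether one family
answers while the other times out, which is the pattern a firewall
blocking only IPv6 would be expected to produce.
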
>> One thing I am looking into is IPv6.  This is enabled on the main ntp
>> server, and I notice the systems where IPv6 has been disabled seem to
>> have consistent time.  The others may have attempted IPv6 connections
>> and failed because the firewall blocks IPv6 for most services.  I have
>> changed the firewall on the central ntp server to allow IPv6
>> connections on udp 123 - maybe this will improve the outcomes.
>>
>
> IPv6 can interfere with DNS operations; we usually disable it in our
> environment.
>
> You might also want to look at openntpd - it's less featured than ntpd but
> works simply and well.
>
> Cheers... Dop.
>
> _______________________________________________
> nSLUG mailing list
> nSLUG at nslug.ns.ca
> http://nslug.ns.ca/mailman/listinfo/nslug
>
>