FreeNAS 9.1 system clock woes…

I recently installed a FreeNAS system, not my first mind you, and discovered a problem I had never experienced in any of my FreeNAS installations.  The clock would not keep accurate time!

This is frustrating when you are using Active Directory to authenticate users.  A skew of more than 5 minutes causes errors on client machines and prevents them from accessing shares.  Attempts to reset the clock can be frustrating because FreeNAS handles the time and system BIOS a little differently than most Linux systems.

Let me qualify that last statement.  Most systems give you the choice of keeping the BIOS clock in UTC (Coordinated Universal Time) or in local time.  FreeNAS does the same, but differently.  When setting the time in the system settings, the timezone controls how the system clock is set.  If you choose UTC as your timezone, then FreeNAS will use the local system (BIOS) clock as is.  If you choose a time zone, then the BIOS clock is offset from FreeNAS time by that zone's UTC offset, so that the BIOS clock reads UTC time.
This can be very confusing until you wrap your head around it.

Start Wrapping:
FreeNAS sets the local system clock based on the offset from UTC that you choose.  For example, if you choose UTC then the offset is 0.  This means that FreeNAS time and BIOS time are the same, so setting the BIOS clock to your local time will make FreeNAS show your local time as well.

If, however, you set your timezone to, say, America/New_York (EST), the offset is -5.  FreeNAS will use this offset to set your BIOS clock.  This means that if your BIOS clock says 5:00am then FreeNAS will show 12:00am (5:00am - 5hrs = 12:00am).  This is the preferred method because it allows FreeNAS to control Daylight Saving Time (DST) adjustments.  Using the previous method would mean that you have to manually adjust the BIOS clock each time there is a DST adjustment to be made.
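To make the offset arithmetic concrete, here is a minimal Python sketch. It assumes the BIOS clock holds UTC and uses fixed offsets of -5 (EST) and -4 (EDT) purely for illustration; the BIOS reading and the function name are made up for the example, and this is not how FreeNAS itself does the conversion.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used purely for illustration; a real system reads these
# from its timezone database, which also knows the DST switch dates.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def displayed_time(bios_utc: datetime, zone: timezone) -> datetime:
    """Convert a BIOS clock reading (assumed to be UTC) to the zone's wall time."""
    return bios_utc.replace(tzinfo=timezone.utc).astimezone(zone)


bios = datetime(2014, 1, 15, 5, 0)  # hypothetical BIOS reading: 5:00am
winter = displayed_time(bios, EST)  # standard time, offset -5
summer = displayed_time(bios, EDT)  # daylight time, offset -4

print(winter.strftime("%H:%M %Z"))
print(summer.strftime("%H:%M %Z"))
```

The BIOS reading is the same in both cases; only the offset applied on top of it changes, which is why the BIOS clock never has to be touched when DST starts or ends.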

Now that we have the concept of properly setting the clock out of the way, how do we deal with the loss of time?  My problem came from the improper use of NTP server options, more specifically the Burst and iBurst options.  Understanding Burst and iBurst can be a bit daunting.  There is a great technical description given in this Poll Process document.  The quick explanation is that iburst speeds up the initial time set, while burst applies to every poll of an NTP server after the initial time set.

By default FreeNAS sets the iBurst option to on.  This makes the initial time set during startup go much quicker (seconds vs. minutes), decreasing system boot time.  It is a good option to have, and used properly it will help.  I recommend keeping this option on.

My mistake was assuming that the burst option did the same when adjusting the system time.  It does not!  All burst does is generate a ton of traffic, and most NTP server operators consider this unfriendly and may even block you.  The burst option should be reserved for checking local time servers on your own network.
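As a rough illustration of that rule of thumb, the sketch below builds ntp.conf-style `server` lines in Python: iburst on every server, burst only on servers flagged as being on your own network. The helper name and the `ntp.example.lan` host are made up for the example, and FreeNAS normally writes these lines for you from its NTP server settings, so treat this as a picture of the end result rather than something to paste in.

```python
def server_line(host: str, on_own_network: bool = False) -> str:
    """Build one ntp.conf `server` line.

    iburst is always added (fast initial sync); burst is added only for
    servers flagged as being on your own network.
    """
    options = ["iburst"]
    if on_own_network:
        options.append("burst")
    return " ".join(["server", host] + options)


# Placeholder hosts: one public pool server, one hypothetical local server.
lines = [
    server_line("0.freebsd.pool.ntp.org"),
    server_line("ntp.example.lan", on_own_network=True),
]
print("\n".join(lines))
```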

Why did I turn on burst in the first place?  It turns out that the real source of my problem was that the default time servers given in FreeNAS were not all responding.  FreeNAS checks multiple time servers against each other to verify that they are accurate.  This means you need to have at least 4 to 5 time servers configured.  The process looks at the time retrieved from all time servers and sorts them into sane and insane servers; a minimum number of sane servers is required to find a candidate for sync.  Here is an excerpt from an ntp.org document (read the full document here) that describes this:

7.1.4.3.1. Number of Time Server “Candidates” and “Sane”

The NTP algorithms for selecting good timeservers (truechimers) and separating them from bad ones (falsetickers) are complex, but what it amounts to is the system going through and systematically looking at all the currently remaining servers and seeing if one of them is statistically an outlier, and therefore “insane”. The “insane” server is eliminated, and the cycle repeats. Once you’ve eliminated all the “insane” servers, the remaining “sane” servers are culled through a similar process.

Once you’ve gotten down to the minimum number of “sane” servers, the system considers the rest to be the “candidates” for selection as the One True Clock (a.k.a., “syspeer”). This is the clock that ntpd will sync to. Of course, ntpd will periodically re-calculate the set of “sane” servers, the “candidates”, and the “syspeer”. See section 8.4 of TroubleshootingNTP for information on the “tally” character that shows you which server is considered to be a candidate, insane, etc….

By default, the minimum required number of sane servers is three (3), and the minimum required number of candidates is one (1). However, this can have some very undesirable consequences (see http://lists.ntp.org/pipermail/questions/2003-September/000737.html).

Basically, since I could not get a quorum of time servers to give an accurate clock, FreeNAS was not updating the clock at all, or was possibly adjusting the time on a best guess.
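A back-of-the-envelope way to see the quorum problem, taking the figure of three sane servers from the excerpt above at face value. This is a deliberate simplification in Python: it only counts responders, whereas the real selection process can also throw out servers that do respond but are judged insane. The function and constant names are invented for the sketch.

```python
MIN_SANE = 3  # figure taken from the excerpt above, not read from ntpd


def enough_servers(configured: int, not_responding: int,
                   min_sane: int = MIN_SANE) -> bool:
    """Return True if the servers still responding can meet the minimum."""
    return configured - not_responding >= min_sane


# What happens when two servers go quiet, for different list sizes.
results = {n: enough_servers(n, not_responding=2) for n in (4, 5, 6)}
for configured, ok in results.items():
    print(configured, "configured, 2 down:", "ok" if ok else "no quorum")
```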

The Solution:  Set your BIOS clock to UTC and use the proper timezone setting in FreeNAS.  Then verify that the time servers you are using are responding to your requests (here is a great article at NTP.org), and leave the default settings alone!  You will probably need to add more time servers.  I recommend finding 5 to 6 servers from different organizations (ntp.org, nist.gov, etc.) to increase your odds of finding a candidate.  4 is not enough, because if 2 stop responding you may experience problems, and any more than 10 will take too much time and resources to process.
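One way to check whether your servers are responding is to look at the output of `ntpq -p`. The Python sketch below parses a sample of that output and lists the servers whose reach value is 0, meaning no replies to recent polls. The sample text, the host names and the helper name are invented for illustration; in practice you would paste in the output from your own system.

```python
# Hypothetical `ntpq -p` output; the first character of each peer line
# is the tally character, and "*" marks the peer currently synced to.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp-a.example.o 192.0.2.10       2 u   41   64  377   23.118    1.204   0.611
+ntp-b.example.o 192.0.2.20       2 u   12   64  377   31.902   -0.877   0.954
 ntp-c.example.o .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""


def parse_peers(text: str) -> list:
    """Extract tally character, remote name and reach from ntpq -p output."""
    peers = []
    for line in text.splitlines()[2:]:  # skip the two header lines
        if not line.strip():
            continue
        tally, fields = line[0], line[1:].split()
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "reach": int(fields[6], 8),  # reach is printed in octal
        })
    return peers


peers = parse_peers(SAMPLE)
silent = [p["remote"] for p in peers if p["reach"] == 0]
synced = [p["remote"] for p in peers if p["tally"] == "*"]
print("not responding:", silent)
print("currently synced to:", synced)
```

A server that stays at reach 0 after several poll intervals is a good candidate for replacement.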

There is a plethora of other public time servers.  Google it!

One Reply to “FreeNAS 9.1 system clock woes…”

  1. UPDATE: While this has helped my FreeNAS installation keep time better, I’m still getting instances when the clock suddenly skews by more than the sanity limit. When this happens, the clock will not update at all. Therefore I’ve added a cron job to my solution that restarts the ntpd service every 15 minutes, forcing the clock to update whether the sanity limit is exceeded or not.

    Command: service ntpd restart

    ~frustrating