Synchronization Process
Algorithm Overview
The Network Time Protocol (NTP) employs a multi-stage algorithm to synchronize client clocks with remote servers, mitigating errors from network delays, clock drifts, and faulty sources. The core process begins with data collection from multiple peers, followed by filtering to select the most accurate samples, selection to identify reliable servers (truechimers) while discarding unreliable ones (falsetickers), clustering to refine the candidate set, combining to compute a consensus offset, and finally disciplining the local clock through adjustments in phase and frequency. This design draws from statistical and fault-tolerant principles to achieve sub-millisecond accuracy in favorable conditions, assuming a majority of servers provide correct time. These algorithms are specified for NTP version 4 (NTPv4).[17]
The clock filter algorithm operates per peer, processing round-trip samples to isolate the lowest-delay measurement, which minimizes asymmetric delay effects. For each association, it maintains an eight-stage shift register of measurement tuples (offset, delay, dispersion, timestamp), shifting in new samples and discarding stale ones older than three poll intervals. The samples are sorted by increasing delay, and the peer dispersion (ε) is computed as the weighted sum of stage dispersions: ε = Σ (ε_i / 2^(i+1)) for i from 0 to 7, where ε_i is the dispersion at stage i; this exponential weighting favors the low-delay samples at the front of the sorted list. The root synchronization distance (λ = δ/2 + ε, where δ is the round-trip delay) then serves as a quality metric, increasing over time at rate φ (15 μs/s) to penalize inactive peers. Jitter (ψ) is estimated as the root-mean-square (RMS) of differences between the best offset and other offsets in the register, bounded below by the system precision. This filtering ensures only high-quality samples propagate to higher stages.[18]
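The dispersion weighting and root-distance metric can be sketched as follows. This is an illustrative sketch, not the reference implementation; the function and parameter names are invented for exposition, and the constant PHI is the 15 ppm frequency tolerance named above.

```python
PHI = 15e-6  # frequency tolerance φ: 15 μs/s (15 ppm)

def peer_dispersion(stage_dispersions):
    """Weighted sum over the (up to 8) filter stages, already sorted by
    increasing delay.  Weights halve with each stage: ε = Σ ε_i / 2^(i+1)."""
    return sum(eps / 2 ** (i + 1) for i, eps in enumerate(stage_dispersions))

def root_distance(delta, epsilon, age=0.0):
    """Root synchronization distance λ = δ/2 + ε, grown at rate PHI
    over `age` seconds to penalize an inactive peer."""
    return delta / 2 + epsilon + PHI * age
```

With equal stage dispersions the weighted sum approaches (but never reaches) the single-stage value, so stale, high-delay stages contribute progressively less.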
In the selection phase, the intersection algorithm—adapted from Marzullo's algorithm for interval consensus—evaluates all candidate peers to detect falsetickers using Byzantine fault principles. For each peer, it constructs a correctness interval [θ - λ, θ + λ], where θ is the measured offset and λ the root distance, representing the possible true time values. The algorithm scans for the largest contiguous intersection containing the midpoints of these intervals, requiring agreement from at least a simple majority of candidates (implementations may accept a smaller configurable minimum number of survivors). Peers whose intervals fall outside this intersection are discarded as falsetickers, assuming no more than (N-1)/2 faulty sources in a population of N (up to nearly 50% if N is odd). This step robustly identifies truechimers even when a substantial number of servers are faulty.[19]
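A simplified Marzullo-style scan over interval endpoints can illustrate the idea. This sketch omits the midpoint bookkeeping of the full NTPv4 algorithm and is not the reference implementation; names are invented for exposition.

```python
def select_truechimers(candidates):
    """candidates: list of (offset θ, root distance λ) tuples.
    Finds the largest intersection of correctness intervals [θ-λ, θ+λ],
    tolerating up to (n-1)//2 falsetickers; returns the surviving peers."""
    n = len(candidates)
    # Each interval contributes a lower (-1) and upper (+1) endpoint.
    edges = sorted([(t - l, -1) for t, l in candidates] +
                   [(t + l, +1) for t, l in candidates])
    for allow in range((n - 1) // 2 + 1):   # allowed falsetickers
        need = n - allow
        # Ascending scan: lower edge where `need` intervals are open.
        count, low = 0, None
        for val, typ in edges:
            count -= typ                    # lower endpoint opens an interval
            if count >= need:
                low = val
                break
        # Descending scan: upper edge where `need` intervals are open.
        count, high = 0, None
        for val, typ in reversed(edges):
            count += typ                    # upper endpoint opens, seen from above
            if count >= need:
                high = val
                break
        if low is not None and high is not None and low <= high:
            # Survivors: peers whose interval overlaps the intersection.
            return [(t, l) for t, l in candidates
                    if t - l <= high and t + l >= low]
    return []  # no majority agreement: no truechimers identified
```

A peer whose interval lies far from the majority cluster (a falseticker) is excluded even though its own interval may be narrow.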
The subsequent clustering algorithm refines the survivor list by iteratively discarding the outlier contributing the most to selection jitter (ψ_s) until only three (NMIN) associations remain or no further reduction in ψ_s is possible. Survivors are ranked by a score combining stratum (multiplied by MAXDIST) and root distance (λ), selecting the system peer as the lowest-scoring entry while avoiding unnecessary peer switches if the new candidate matches the current stratum. The combine algorithm then computes the final system offset (Θ) as a weighted average of survivor offsets: Θ = Σ (w_i * θ_i) / Σ w_i, where weights w_i = 1 / λ_i emphasize low-distance peers; system jitter (Ψ) follows as √(ψ_s^2 + ψ_p^2), with ψ_p the RMS peer jitter. This consensus mechanism provides a robust estimate resilient to individual errors.[20][21]
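The combine step is a straightforward weighted average; the following sketch (names invented for exposition, not the reference implementation) shows the weighting by inverse root distance and the quadrature combination of jitters.

```python
import math

def combine(survivors):
    """System offset Θ = Σ (w_i θ_i) / Σ w_i with weights w_i = 1/λ_i.
    survivors: list of (offset θ_i, root distance λ_i)."""
    total_w = sum(1.0 / lam for _, lam in survivors)
    return sum(theta / lam for theta, lam in survivors) / total_w

def system_jitter(psi_s, psi_p):
    """System jitter Ψ = sqrt(ψ_s² + ψ_p²): selection jitter and
    peer jitter combined in quadrature."""
    return math.hypot(psi_s, psi_p)
```

A peer at half the root distance of another receives twice the weight, so low-distance (typically low-stratum, nearby) survivors dominate the consensus offset.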
Finally, the clock discipline algorithm uses a feedback loop to adjust the local clock, treating it as a hybrid phase-locked loop (PLL) and frequency-locked loop (FLL). The combined offset (Θ) drives phase updates, while frequency is tuned via a variable-frequency oscillator (VFO) to minimize drift. Adjustments occur via "slew" (gradual shift) for offsets under 128 ms or "step" (instant correction) for larger ones, with a state machine handling transient errors like spikes. The loop filter computes frequency offset (α) and sets the poll interval adaptively between 64 s (2^6 s) and 131,072 s (≈36 hours, 2^17 s) based on jitter and stability, ensuring long-term synchronization within 1-50 ms over the Internet.[22]
Peer Selection and Clock Filtering
In the Network Time Protocol (NTP), the clock filter algorithm operates on a per-peer basis to process incoming time samples and select the most reliable one for synchronization. For each peer association, the algorithm maintains an eight-stage shift register that stores tuples consisting of the clock offset θ (the estimated time difference between the local and peer clocks), round-trip delay δ, dispersion ε (a measure of the maximum error bound), and arrival timestamp t. Upon receipt of a valid NTP packet, the register shifts existing tuples leftward, discards the oldest entry, and inserts a new tuple derived from the packet's timestamps, with initial dispersion computed from the peer's root dispersion and the sample's processing jitter.[23]
The filtering process prioritizes samples with the lowest round-trip delay to minimize network asymmetry effects, sorting the register stages by increasing δ and selecting the first-stage sample as the candidate. Dispersion values in the register accumulate error bounds, increasing at a rate of PHI ≈ 15 × 10^{-6} seconds per second due to local clock wander, and the algorithm computes an overall peer dispersion as a weighted sum ε = Σ (ε_i / 2^(i+1)) for i=0 to 7, where stages are sorted by increasing delay; this exponential weighting favors low-delay samples. Jitter ψ, representing the variation in offsets, is calculated as the root-mean-square (RMS) of differences between the best offset and others in the register, bounded by the system precision to avoid amplification of errors. This low-delay selection helps mitigate path delays and ensures that only high-quality samples contribute to the peer's state variables (offset, delay, jitter, and dispersion), which are updated only if the new sample's delay is less than that of the current best sample or if the register has fewer than eight entries.[23]
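The per-peer filtering described above can be sketched as a single function over the register contents. This is an illustrative sketch under the stated rules (sort by delay, exponentially weighted dispersion, RMS jitter bounded by precision), not the ntpd implementation; names are invented for exposition.

```python
from math import sqrt

def clock_filter(samples, precision=1e-6):
    """Select the best sample from a peer's shift register.
    samples: list of (offset θ, delay δ, dispersion ε) tuples.
    Returns (best offset, best delay, peer dispersion, jitter)."""
    by_delay = sorted(samples, key=lambda s: s[1])   # ascending delay
    best_offset, best_delay, _ = by_delay[0]         # lowest-delay sample wins
    # Peer dispersion: exponentially weighted sum over the sorted stages.
    dispersion = sum(s[2] / 2 ** (i + 1) for i, s in enumerate(by_delay))
    # Jitter: RMS difference between the best offset and the others,
    # bounded below by the system precision.
    diffs = [(s[0] - best_offset) ** 2 for s in by_delay[1:]]
    jitter = max(sqrt(sum(diffs) / max(len(diffs), 1)), precision)
    return best_offset, best_delay, dispersion, jitter
```

Note that the lowest-delay sample, not the newest one, supplies the offset: a late-arriving packet that traversed a congested path is retained in the register but does not displace a cleaner earlier measurement.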
Peer selection occurs at the system level through a multi-stage process that evaluates all candidate peers to identify and combine the best time sources while discarding unreliable ones. The selection algorithm first applies a falseticker detection mechanism, inspired by Byzantine agreement principles, to cull inaccurate peers (falsetickers). For each peer, it constructs a correctness interval [θ - λ, θ + λ], where λ is the root distance (synchronization distance from the peer to its primary reference, incorporating delay, dispersion, and jitter). These intervals are intersected across all peers; falsetickers are those whose intervals do not overlap with the largest clique (majority intersection) containing more than half the peers, ensuring resilience against up to (N-1)/2 faulty sources. Survivors of this stage, termed truechimers, represent the concordant set of accurate clocks.[24]
Following falseticker removal, the clustering algorithm refines the truechimer set to select the optimal subset for synchronization. Truechimers are sorted by a merit factor, typically stratum number multiplied by a large constant plus the root synchronization distance λ = ε + δ/2, favoring lower-stratum (closer to primary references) and lower-distance peers. Selection jitter ψ_x is then computed as the RMS of offset differences among the candidates; iteratively, the peer contributing the maximum jitter is discarded until ψ_x falls below a threshold or the set size reaches a minimum of three (NMIN). The first-ranked survivor becomes the system peer, providing the primary offset for clock discipline, while the combine algorithm weights the offsets of all survivors by the inverse of their root distances (w_i = 1 / λ_i) to compute the system offset Θ and overall system jitter Ψ = √(ψ_x² + ψ_p²), where ψ_p is peer jitter. This process self-organizes the NTP subnet into a hierarchical structure, dynamically adapting to network conditions and ensuring robust synchronization even with heterogeneous error sources.[25][26]
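The iterative pruning can be sketched as below. This is a simplified sketch: the stopping test in NTPv4 compares the maximum selection jitter against the minimum peer jitter, for which this sketch substitutes the minimum root distance; all names are invented for exposition.

```python
from math import sqrt

def cluster(survivors, nmin=3):
    """Iteratively discard the survivor contributing most to selection
    jitter until NMIN remain or jitter cannot usefully be reduced.
    survivors: list of (offset θ, root distance λ)."""
    survivors = list(survivors)
    while len(survivors) > nmin:
        # Selection jitter of candidate j: RMS distance of its offset
        # from the offsets of the other survivors.
        def sel_jitter(j):
            return sqrt(sum((survivors[j][0] - s[0]) ** 2
                            for k, s in enumerate(survivors) if k != j)
                        / (len(survivors) - 1))
        jitters = [sel_jitter(j) for j in range(len(survivors))]
        worst = jitters.index(max(jitters))
        # Stop when the worst selection jitter is no larger than the
        # smallest root distance (stand-in for minimum peer jitter).
        if jitters[worst] <= min(lam for _, lam in survivors):
            break
        survivors.pop(worst)
    return survivors
```

An offset far from the rest of the cluster produces a large RMS distance to the other survivors, so the outlier is the first candidate pruned.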
Clock Adjustment Methods
The Network Time Protocol (NTP) employs a clock discipline algorithm to adjust the local system clock based on synchronization offsets derived from peer measurements. This algorithm, a hybrid of phase-locked loop (PLL) and frequency-locked loop (FLL) mechanisms, continuously estimates and corrects both phase (time offset) and frequency (drift rate) errors to maintain synchronization accuracy.[1][29] The process runs at fixed intervals, typically every second, and adapts the poll interval (τ) dynamically between 64 seconds (2^6 s) and 131,072 seconds (≈36 hours, 2^17 s) to balance accuracy and network load.[1]
Adjustments are applied in one of two primary methods: stepping or slewing, selected based on the magnitude of the computed offset (θ). For offsets exceeding the step threshold (default 0.128 seconds), the clock is stepped immediately using system calls like settimeofday(), which sets the clock directly to the corrected time. This method resets all peer associations and invalidates prior data, as large discontinuities can indicate significant errors or initialization.[1] If the offset surpasses the panic threshold (1000 seconds), the daemon typically exits to prevent erroneous operation.[1] Stepping is rare in steady-state operation but essential for initial synchronization or recovery from major disruptions.[29]
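The threshold logic above reduces to a simple decision; the following sketch mirrors the prose (constant names are chosen for exposition and do not follow ntpd's source).

```python
STEP_THRESHOLD = 0.128    # seconds: above this, step the clock
PANIC_THRESHOLD = 1000.0  # seconds: above this, refuse to correct

def discipline_action(offset):
    """Choose the adjustment method for a computed offset θ (seconds)."""
    magnitude = abs(offset)
    if magnitude > PANIC_THRESHOLD:
        return "panic"   # daemon exits rather than apply the correction
    if magnitude > STEP_THRESHOLD:
        return "step"    # set the clock directly, e.g. settimeofday()
    return "slew"        # amortize gradually, e.g. adjtime()
```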
For smaller offsets below the step threshold, the clock is slewed gradually using calls like adjtime(), which incrementally adjusts the clock frequency to amortize the phase error over time without introducing discontinuities. This preserves monotonicity in the timescale, avoiding issues in applications sensitive to time jumps. The slew process combines phase and frequency corrections: the phase offset θ is corrected via a proportional term, while the frequency estimate φ is updated as an exponentially weighted average of the form φ_k = φ_{k-1} + w (φ̂_k - φ_{k-1}), where φ̂_k is the frequency indicated by the most recent offset measurements and w ≈ 0.25.[1][29] The hybrid PLL/FLL design uses the PLL for short poll intervals (τ ≤ 1024 seconds) to prioritize phase locking and switches to FLL for longer intervals to emphasize frequency stability.[29]
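The exponentially weighted frequency update can be written in one line; this sketch models only the averaging step, not the full PLL/FLL loop filter, and the names are invented for exposition.

```python
def update_frequency(freq_prev, freq_measured, w=0.25):
    """Move the running frequency estimate a fraction w toward the
    frequency implied by the latest offset measurement:
    φ_k = φ_{k-1} + w (φ̂_k - φ_{k-1})."""
    return freq_prev + w * (freq_measured - freq_prev)
```

With w = 0.25, a single measurement moves the estimate only a quarter of the way toward the new value, so transient network-induced errors in any one measurement are damped rather than applied in full.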
The offset θ itself is computed from NTP packet timestamps as θ = ½[(T2 - T1) + (T3 - T4)], where T1 to T4 represent client send, server receive, server send, and client receive times, respectively; this is refined by the clock filter and selection algorithms to select the best peer measurements before application.[1] Frequency wander is bounded by a tolerance parameter (PHI = 15 ppm), and adjustments incorporate jitter (ψ) and dispersion (ε) estimates to ensure stability, with the root synchronization distance λ = (δ/2) + ε + PHI × (t - t0) guiding peer selection.[1] In non-linear modes, such as during initial frequency estimation (up to 15 minutes) or high-jitter bursts, the algorithm temporarily prioritizes frequency corrections to avoid oscillation.[1] Overall, these methods achieve synchronization errors typically under 1 millisecond on LANs and a few tens of milliseconds on WANs under normal conditions.[29]
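The on-wire offset calculation, together with the standard round-trip delay δ = (T4 - T1) - (T3 - T2) that accompanies it, can be sketched directly from the four timestamps (names invented for exposition):

```python
def offset_delay(t1, t2, t3, t4):
    """On-wire calculation from the four NTP timestamps:
    t1 client send, t2 server receive, t3 server send, t4 client receive.
    Returns (offset θ, round-trip delay δ) in seconds."""
    theta = ((t2 - t1) + (t3 - t4)) / 2   # clock offset
    delta = (t4 - t1) - (t3 - t2)         # round-trip delay
    return theta, delta
```

The offset formula assumes the outbound and return path delays are equal; any asymmetry between them appears directly as an error in θ, which is why the clock filter prefers low-delay samples.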