If the clock stays out of sync for a long time, the max error will
grow large enough to exceed the 10-second default, at which point the
server will still crash as before. But if NTP is properly restored
within a few minutes, the server should remain operational.
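
As a rough illustration of the timing described above, here is a minimal sketch that estimates how long a clock can stay unsynchronized before its error bound crosses the limit. It assumes the bound grows linearly at a fixed worst-case drift rate; the 500 ppm rate, the starting error, and the function name are all assumptions chosen for illustration, not values taken from Kudu.

```python
# Rough estimate of how long a clock can stay unsynchronized before the
# reported maximum error exceeds a limit. All numbers are illustrative.

def seconds_until_limit(limit_usec, current_error_usec, drift_ppm):
    """Return the seconds left until the error bound reaches limit_usec.

    Assumes the bound grows linearly by drift_ppm microseconds for every
    second that the clock remains unsynchronized.
    """
    if drift_ppm <= 0:
        raise ValueError("drift_ppm must be positive")
    remaining_usec = limit_usec - current_error_usec
    if remaining_usec <= 0:
        return 0.0
    return remaining_usec / drift_ppm


LIMIT_USEC = 10_000_000       # the 10-second default mentioned above
ASSUMED_DRIFT_PPM = 500       # assumption: worst-case drift while unsynced
ASSUMED_START_USEC = 100_000  # assumption: error bound when sync was lost

headroom = seconds_until_limit(LIMIT_USEC, ASSUMED_START_USEC, ASSUMED_DRIFT_PPM)
```

Under these assumptions the headroom is measured in hours, which is consistent with a few minutes of lost sync being harmless.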

Re: [Kudu][NTP] Kudu use raft to ensure consensus, what's the responsibility of NTP in KUDU?

NTP synchronization (and specifically enforcing a maximum clock error on each node) helps guarantee Kudu's transaction semantics. This page has more details, as does the Kudu design paper.

The maximum clock error is an NTP concept, and refers to the upper bound on how far the local machine's clock may deviate from whatever NTP clock it's synchronized with. Machines with well-configured NTP installations should guarantee some sort of stable maximum clock error that you can use for Kudu's max_clock_sync_error_usec configuration flag. Unfortunately, I don't understand Kudu transactions well enough to explain its effect on Kudu transaction semantics.
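
To make the role of the flag concrete, here is a minimal sketch of the kind of check it implies. The `ClockReading` type and `clock_is_acceptable` function are hypothetical names made up for this example; this is not Kudu's actual code, only the comparison described above.

```python
from dataclasses import dataclass


@dataclass
class ClockReading:
    """Hypothetical snapshot of what the local NTP client reports."""
    max_error_usec: int  # upper bound on deviation from the NTP source


def clock_is_acceptable(reading, max_clock_sync_error_usec):
    """Return True if the reported error bound is within the configured limit."""
    return reading.max_error_usec <= max_clock_sync_error_usec


# Illustrative values: a 10-second limit and two made-up readings.
limit_usec = 10_000_000
healthy = clock_is_acceptable(ClockReading(max_error_usec=250_000), limit_usec)
too_loose = clock_is_acceptable(ClockReading(max_error_usec=12_000_000), limit_usec)
```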

Re: [Kudu][NTP] Kudu use raft to ensure consensus, what's the responsibility of NTP in KUDU?

Can I summarize it as follows: NTP is used to make MVCC and READ_AT_SNAPSHOT scans accurate. If max_clock_sync_error_usec is large, the scan deviation will be larger too, and vice versa?

Re: [Kudu][NTP] Kudu use raft to ensure consensus, what's the responsibility of NTP in KUDU?

Kudu's transaction system "tags" mutations with a timestamp that has a wall-time component.

Beyond powering distributed consistency semantics (i.e. between different tablets, each running Raft internally), these timestamps can be used to run point-in-time scans if the system is set up appropriately. As part of the algorithm, servers send the timestamps to each other all the time, either directly or through clients, and then use those to update their own clocks.
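
The timestamp exchange described above can be sketched with a generic hybrid logical clock. This is a simplified illustration and not Kudu's implementation: the class name, the (physical, logical) tuple format, and the injectable wall-clock function are all assumptions made for the example.

```python
import time


class HybridClockSketch:
    """Simplified hybrid clock: wall-time microseconds plus a logical counter."""

    def __init__(self, wall_clock_usec=None):
        # The wall clock is injectable so the example is deterministic.
        self._wall = wall_clock_usec or (lambda: time.time_ns() // 1000)
        self._last_physical = 0
        self._logical = 0

    def now(self):
        """Return a timestamp strictly greater than any returned before."""
        physical = self._wall()
        if physical > self._last_physical:
            self._last_physical = physical
            self._logical = 0
        else:
            # Wall time did not advance; order events with the counter.
            self._logical += 1
        return (self._last_physical, self._logical)

    def update(self, remote):
        """Fold a timestamp received from another server into this clock."""
        remote_physical, remote_logical = remote
        physical = self._wall()
        if physical > self._last_physical and physical > remote_physical:
            self._last_physical, self._logical = physical, 0
        elif remote_physical > self._last_physical:
            # The remote server is ahead: jump forward to its time.
            self._last_physical = remote_physical
            self._logical = remote_logical + 1
        elif remote_physical == self._last_physical:
            self._logical = max(self._logical, remote_logical) + 1
        else:
            self._logical += 1
        return (self._last_physical, self._logical)


# A local wall clock frozen at 100 us receives a timestamp from a server at 500 us.
clock = HybridClockSketch(wall_clock_usec=lambda: 100)
first = clock.now()
after_remote = clock.update((500, 3))
```

In this sketch a remote timestamp that is far ahead pulls the local clock forward, so the physical component of later timestamps no longer matches the local wall time.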

If the servers' clocks are out of sync by a lot, then the point-in-time scans lose meaning; moreover, other weird things might happen, like a server crashing and coming back with a lower wall-clock time.

We're working on an alternative that will spare users from having to deal with this problem, but in the meantime I'd suggest setting up NTP properly.

One possible layout that we've seen work in the past is to have a couple of NTP time masters close to or within the Kudu cluster and then have the servers be NTP peers of each other.