Preparing for the Leap Second, 2017 Jan 1 Edition
From a civil time perspective, December 31st will have an extra second added to the end. From the Linux time system’s point of view, the last second of December 31st will repeat itself: another second with the same timestamp as the previous one is inserted at the end of the day.
Since our previous article about the leap second, not much has changed, with the exception of an evolution in how some client drivers handle client timestamps. The following repeats a lot of information from that article and adds some details pertaining to the drivers.
Those of you who were using Apache Cassandra or DataStax Enterprise -- or even running other databases or applications under Linux -- back in 2012 may have had problems when a leap second was added at the end of June. In this blog post, we’ll explain how things have changed since then, what we’ve done to anticipate other problems that may be caused by the leap second, and what you can do to prepare for it.
Livelock in Pre-3.4 Linux Kernel and Pre-7u60 JDK
As explained in Jonathan's 2012 leap-second blog post, many of the failures that occurred in 2012 were caused by a bug in the Linux kernel that caused a livelock in the timer subsystem when the leap second was inserted. Luckily, a fix for that particular problem was applied to the kernel as part of version 3.4.
Determining if Your System is Affected
As an initial assessment, run uname -r to determine the version of the kernel you're running. Kernel versions 3.4 and higher aren’t affected by the bug. For a more comprehensive assessment, and to demonstrate problems that can be caused by the kernel bug, the author of the bug fix wrote two programs that exercise the bug. These are useful diagnostic tools, but they alter the host system's clock, so do not run them on production systems or on systems that contain data you want to keep.
- This program can lock up kernels that still contain the bug.
- This program, run with the -s option, will repeatedly insert leap seconds and check for any timing errors resulting from the insertion.
We've tested both of these programs on Ubuntu images on AWS and verified that they fail on systems with old kernels and succeed on newer ones. You may not see the expected failures on systems running under other forms of virtualization; for instance, we saw different timer-resetting behavior on images running under VirtualBox. If you're a Red Hat Enterprise Linux user with a Red Hat account, Red Hat's lab on the subject may be helpful. It assumes you use RHEL, but if you do, it can determine if your system is susceptible to the livelock without interacting with your system clock.
If you use RHEL 2.6 or higher, your system may be safe from kernel livelocks even on older kernels. There was a workaround applied to the kernel that prevents the livelock from causing problems, though it does not fix the underlying issue. See this bug report and this update report for more information.
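The initial uname -r assessment described above can be scripted when you have many hosts to check. The following is a minimal sketch, not a complete diagnostic: the function names are hypothetical, and it only compares the version number against 3.4, so it cannot tell whether a vendor kernel carries a backported workaround.

```python
import platform
import re

FIXED_KERNEL = (3, 4)  # first kernel series that includes the fix, per this article


def parse_kernel_release(release):
    """Return (major, minor) from a `uname -r` style string such as '3.13.0-24-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def has_livelock_fix(release):
    """True if the release is at least 3.4; says nothing about vendor backports."""
    return parse_kernel_release(release) >= FIXED_KERNEL


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    verdict = "3.4 or newer" if has_livelock_fix(release) else "older than 3.4"
    print("%s: %s" % (release, verdict))
```

A result of "older than 3.4" is a prompt to investigate further, not proof that the host will lock up.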
Java-based applications like Cassandra were particularly affected by this kernel issue because thread parking operations relied on the CLOCK_REALTIME system clock. Recent versions of JDK 7 (7u60+) and all versions of JDK 8 include an enhancement (JDK-6900441) that uses CLOCK_MONOTONIC for these operations instead. CLOCK_MONOTONIC in the general case is not affected by system time changes, such as insertion of a leap second.
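To see the difference between the two clocks from user space, the sketch below measures the same short sleep against each of them. It is written in Python, runs only on Unix-like systems, and is not how the JVM implements thread parking; the helper name elapsed_with is hypothetical.

```python
import time


def elapsed_with(clock_id, sleep_seconds=0.05):
    """Measure a short sleep against the given POSIX clock."""
    start = time.clock_gettime(clock_id)
    time.sleep(sleep_seconds)
    return time.clock_gettime(clock_id) - start


if __name__ == "__main__":
    # CLOCK_REALTIME follows the wall clock, so a step (such as an inserted
    # leap second) during the sleep would show up in this interval.
    print("realtime :", elapsed_with(time.CLOCK_REALTIME))
    # CLOCK_MONOTONIC is not stepped when the wall clock is set.
    print("monotonic:", elapsed_with(time.CLOCK_MONOTONIC))
```

Under normal conditions both intervals are close to the sleep duration. They diverge only if the wall clock is stepped while the measurement is in progress, which is why timeouts are better measured against the monotonic clock.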
We were able to reproduce kernel lockups using pre-7u60 JDKs on pre-3.4 kernels. We have not yet seen a kernel lockup, even with older kernels, with JDK 7u60 and higher. Still, we strongly discourage using this as a workaround -- if you are using a kernel older than 3.4, you are still at risk of a livelock in the kernel.
On newer kernel versions that do not demonstrate this issue, it still may be of value to be at a JDK level greater than or equal to 7u60, as time-sensitive operations will behave more correctly than in older versions.
Timestamp Behavior Over the Leap Second
Cassandra uses monotonically increasing timestamps as of 2.1.3. However, this monotonicity is enforced independently on each node. During an inserted leap second, each Cassandra node will still return timestamps greater than those it previously used, even though the clock was set back one second. The timestamps generated during the inserted leap second will be based on the millisecond value of 31 Dec 2016 23:59:59 (1483228799000XXX) until the second elapses.
For many applications, this interleaved ordering will not affect correct operation. In the case of an inserted leap second, you only need to be concerned if you expect to make multiple changes to a column value in a row during that second. If your application requires that the writetime order of values be the same as their wall-clock insertion order for changes to the same value within one second, you should make sure your strategy for ensuring that property also holds during inserted leap seconds.
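The per-node behavior can be illustrated with a small generator that never returns a value lower than or equal to the last one it handed out. This is a sketch under assumptions, not Cassandra's actual code: the class name is hypothetical and the clock readings are simulated so that the second 23:59:59 repeats.

```python
import calendar


class MonotonicTimestamps:
    """Hands out strictly increasing microsecond timestamps on one node."""

    def __init__(self, clock_micros):
        self._clock_micros = clock_micros  # callable returning wall-clock microseconds
        self._last = 0

    def next(self):
        now = self._clock_micros()
        # If the clock stalled or stepped back, tick one microsecond past the last value.
        self._last = now if now > self._last else self._last + 1
        return self._last


# 31 Dec 2016 23:59:59 UTC as microseconds since the epoch.
LAST_SECOND = calendar.timegm((2016, 12, 31, 23, 59, 59)) * 1000000

# Simulated wall-clock readings: late in 23:59:59, then early in the repeated second.
readings = iter([LAST_SECOND + 700000, LAST_SECOND + 900000,
                 LAST_SECOND + 100000, LAST_SECOND + 300000])
generator = MonotonicTimestamps(lambda: next(readings))
stamps = [generator.next() for _ in range(4)]
print(stamps)
```

The last two readings come from the repeated second and are lower than the ones before them, yet the generator returns 1483228799900001 and 1483228799900002, so ordering on this one node is preserved. Nothing here coordinates with other nodes, which is why ordering across nodes is a separate question.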
Clock Sync Problems Around the Leap Second
Cassandra’s behavior depends on your cluster having well-synced clocks on all your servers. The timestamps on writes and deletes are, in most cases, generated by the coordinator node (though they can also be generated by the client, in the case of drivers like the Python driver that use protocol version 3). Thus, if clocks are out of sync, timestamps on writes that were coordinated by different nodes can be out of order.
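A simplified model shows why this matters. Cassandra resolves conflicting writes to the same cell by keeping the one with the highest timestamp, so a coordinator whose clock runs behind can cause a later write to lose. The function names and clock offsets below are hypothetical and chosen only for illustration.

```python
def coordinator_timestamp(true_micros, clock_offset_micros):
    """Timestamp a coordinator assigns, given its clock error (hypothetical model)."""
    return true_micros + clock_offset_micros


def last_write_wins(writes):
    """Return the value with the highest timestamp from (timestamp, value) pairs."""
    return max(writes)[1]


# Two writes to the same cell, 100 ms apart in real time.
first = (coordinator_timestamp(1000000, 0), "written first")
# The second coordinator's clock runs 500 ms behind.
second = (coordinator_timestamp(1100000, -500000), "written second")

print(last_write_wins([first, second]))  # prints "written first"
```

Although the second write happened later in real time, it carries the lower timestamp and is discarded.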
Ensure that your servers are synchronized with NTP using the same servers. Using external NTP pools carries some risks, however. NTP servers, such as those accessible as part of the ntp.org server pool, can be out of sync with one another or can be misconfigured to add leap seconds at the wrong time, or to not add scheduled leap seconds. If your nodes’ NTP clients use external servers directly, their clocks may drift as they independently compensate for upstream inconsistencies. You can avoid these problems by setting up your own NTP pool that will compensate for inconsistencies between upstream servers and provide consistent time to your nodes as clients.
Leap Seconds and DataStax Drivers
Like Cassandra, some client drivers are also susceptible to the kernel bug around leap seconds and timestamp generation issues.
Kernel Issue Impact
As the java-driver library runs on the JVM, it could, in theory, be susceptible to the kernel bug encountered in June 2012. In testing on kernel 2.6.35-32 with JDK 7u55, we found that no threads were susceptible to the leap second issue. However, since there may be other activities in an application running the java-driver, we strongly recommend upgrading your kernel to 3.4+ and also considering upgrading your JDK version to 7u60+.
The C++, Python, Ruby, and Node.js drivers were also tested on an older kernel version and did not demonstrate any lock up issues after a leap second was inserted. That being said, it is still strongly recommended that you consider upgrading to kernel 3.4+ as these tests were not comprehensive.
Leap Seconds and Client Timestamp Implementations
If you are using client timestamps, you may run into issues similar to those described in the ‘Timestamp Behavior Over the Leap Second’ section. In DataStax client drivers, there are three ways to enable client timestamps:
- Appending ‘USING TIMESTAMP timestamp’ to your CQL query. This is supported for all versions of Cassandra supporting CQL.
- Using the ‘set timestamp’ method on a Statement, for example setDefaultTimestamp in the DataStax Java Driver. This is only available for drivers supporting Cassandra 2.1 and running against Cassandra 2.1+ / DataStax Enterprise 4.7+ clusters.
- Using a timestamp generator. See the table below for the availability and behavior of timestamp generators per-driver.
| driver | enabled by default? | monotonic? | notes |
|---|---|---|---|
| cpp | no (CPP-413) | partially (CPP-412) | docs |
| csharp | no (CSHARP-516) | n/a | No timestamp generator implementation. Client timestamp may be provided on a per-statement basis via Statement.SetTimestamp at user's discretion. |
|  |  |  | Timestamp generator implementation added in 3.2.0. Prior to this release, client timestamp may be provided on a per-execution basis via ClientOptions at user's discretion. |
| php | no (CPP-413) | partially (CPP-412) | docs |
| python |  | no, is based off of time.time() which is subject to system clock changes (PYTHON-676). | docs |
| ruby | no (RUBY-284) | :simple uses Time::now which is subject to system clock changes. :monotonic (TickingOnDuplicate) offers fully monotonic implementation. | docs |
Note that client timestamps require protocol version 3 (introduced in C* 2.1 / DSE 4.7) and thus client timestamp generators will only be used when the driver is configured with protocol version 3 or greater.
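As an illustration of the first option in the list above, the sketch below builds an INSERT statement that carries an explicit timestamp in microseconds. The helper names, table, and column names are hypothetical, and the %s placeholder style assumes a driver that accepts it for non-prepared statements, as the Python driver does.

```python
import time


def now_micros():
    """Wall-clock microseconds since the epoch; subject to system clock changes."""
    return int(time.time() * 1000000)


def insert_with_timestamp(table, key, value, timestamp_micros):
    """Build a CQL INSERT that carries an explicit write timestamp.

    Returns (query, parameters). The id and val columns are hypothetical,
    and the table name is assumed to come from trusted code, not user input.
    """
    query = "INSERT INTO %s (id, val) VALUES (%%s, %%s) USING TIMESTAMP %d" % (
        table, timestamp_micros)
    return query, (key, value)


if __name__ == "__main__":
    query, params = insert_with_timestamp("ks.events", 1, "a", now_micros())
    print(query, params)
```

Because now_micros reads the wall clock directly, it will repeat values from the previous second during an inserted leap second. If write order within that second matters to your application, supply timestamps from a source that does not go backwards.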
In summary, to prepare for the upcoming leap second in January:
- At a bare minimum, make sure you are running Apache Cassandra/DataStax Enterprise and its drivers on kernel version 3.4 or higher. We also recommend using JDK version 7u60 or higher. This should protect you from the livelock problems users experienced in 2012.
- Determine if your application will be affected by out-of-order timestamps during the inserted leap second, and if it will, develop a strategy for preventing any problems.