Get in the Ring with Cassandra and EC2 Part 2
This is the second in a series of posts about making good choices for hosting Cassandra on the Amazon Web Services (AWS) environment. The first post can be found here. In this post, we tackle issues of picking the right Amazon Machine Image (AMI), the decisions you might want to make about disks for best performance, the options available for networking in AWS, and the critical role the system clock plays in Cassandra consistency.
As a reminder, this series of posts serves as a general advice and best practices guide specifically for the Amazon Web Services (AWS) environment. Liberties were taken in some cases to describe local hardware as well as other clouds, but these were generally constrained to information highly relevant to Elastic Compute Cloud (EC2) deployments.
Choosing the Right AMI
Choosing the right AMI should not be taken lightly. What you're essentially doing is trusting an organization to properly install, configure, and bake an operating system on which you'll place your valuable data. Pick only AMIs from a trusted organization. Whether you want to use Debian, Red Hat, or any other Linux distribution, make sure to pick an official image with up-to-date kernels and EC2 fixes.
The recommended AMI is the Amazon Linux AMI; it is the most accurately configured AMI for EC2. For I2 instances, use the Amazon Linux AMI 2013.09.02 or any Linux AMI with a version 3.8 or newer kernel for the best I/O performance.
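If you're unsure whether an image you've launched meets that kernel bar, a quick version-aware comparison works as a sanity check. This is a sketch relying on GNU coreutils' `sort -V`; the "3.8" floor comes from the I2 recommendation above:

```shell
# Compare the running kernel against the 3.8 minimum recommended for
# I2 instance I/O performance. `sort -V` does version-aware ordering.
required="3.8"
running=$(uname -r | cut -d- -f1)
if [ "$(printf '%s\n%s\n' "$required" "$running" | sort -V | head -1)" = "$required" ]; then
  echo "kernel OK: $running"
else
  echo "upgrade kernel: $running is older than $required"
fi
```

If the oldest version in the sorted pair is the requirement itself, the running kernel is at least as new and the check passes.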
If you wish to use Enhanced Networking, you'll need the latest Amazon Linux AMI, or you can follow the instructions to enable it yourself. Some newer AMIs from other Linux distributions should work, but verify that they actually come baked with these new drivers ready for use. In terms of imaging, you must be careful to consider the differences between cloud environments and physical environments.
The DataStax AMI currently uses an Ubuntu 12.04 LTS image found here, along with an upgraded kernel as recommended by Amazon. We chose Ubuntu because of its ease of use, community adoption, and support for the EC2 environment.
EC2 Disk Choices
EC2 instances come configured with a wide variety of ephemeral devices; both the number and the type of disks are set by the instance type you choose. You can also attach Elastic Block Store (EBS) volumes to any instance.
EBS devices work well for some applications, but for Cassandra, a disk-intensive database typically limited by disk I/O, you should never choose EBS devices to house your data. EBS is a service similar to legacy Network Attached Storage (NAS), and it is not recommended for Cassandra for the following reasons.
Before a religious war about local disks vs. networked disks is started, consider this: networked disks are a single point of failure (SPOF). At DataStax, we understand that within a NAS infrastructure you can have multiple layers of redundancy and ways to validate it. But ultimately, you'll never really know how reliable it is until disks start failing. With Cassandra, nodes come and go frequently within the ring, and you can observe that redundancy works as expected.
In April of 2011, AWS had a network outage that Netflix blogged about. This wasn't an isolated incident; similar outages have happened at least twice. In that instance, EBS devices were largely affected by the outage. If the network goes down, disks become inaccessible, and thus your database goes offline. Using networked storage for Cassandra data circumvents the innate, tangible redundancy that a distributed database grants by default.
If you still believe that severed networks are not a problem, then consider this: local disks are faster than NAS. Eliminating the need to traverse the network and stream data across a data center reduces the latency of each Cassandra operation. In the world of database operations today, latencies in the microsecond range are required, not milliseconds. Certainly, you can pay for EBS-Optimized instances, but network variance due to multi-tenancy can still occur. Granted, Provisioned IOPS EBS can alleviate this problem, but it introduces cost and complication. Cassandra is a simple distributed database that works best in an uncomplicated architecture with low-cost "commodity" hardware, even on EC2.
On physical hardware, we first recommend Solid State Drives (SSDs). If you choose rotational disks, use one for the commit log and a RAID0 device for data. On EC2 instances, though, put both the commit log and data on a single RAID0 device; in fact, the DataStax AMI does this by default on instantiation. Should you choose EBS-backed images, move all Cassandra, DataStax Enterprise, and OpsCenter logs onto the RAID0 device as well to ensure that the instance will not lock up due to networking issues.
When you create a RAID0 device, make sure to align the disks on 4K or 1MB boundaries. In addition, setting the blockdev readahead value to 128 sectors with the following command will greatly lower your disk contention: `sudo blockdev --setra 128 <device>`. Using XFS with default format options and the nobootwait option in /etc/fstab is also recommended for Cassandra deployments.
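Putting those pieces together, a two-disk RAID0 setup looks roughly like the sketch below. The device names (/dev/xvdb, /dev/xvdc) and the /raid0 mount point are assumptions, so check your instance's block devices first; the commands are printed rather than executed here, since they require root and destroy data on the member disks:

```shell
# Assumed ephemeral device names; verify with `cat /proc/partitions`.
DISKS="/dev/xvdb /dev/xvdc"
NUM_DISKS=$(echo $DISKS | wc -w)
RA_SECTORS=128                          # 128 sectors * 512 bytes = 64 KB readahead
RA_KB=$((RA_SECTORS * 512 / 1024))

# Print the setup plan (run these as root on a real instance):
cat <<EOF
mdadm --create /dev/md0 --level=0 --raid-devices=$NUM_DISKS $DISKS
blockdev --setra $RA_SECTORS /dev/md0   # ${RA_KB} KB readahead
mkfs.xfs /dev/md0                       # XFS with default options
echo '/dev/md0 /raid0 xfs defaults,nobootwait 0 0' >> /etc/fstab
mount -a
EOF
```

The fstab entry's nobootwait option keeps a missing or slow device from blocking boot, which matters most on the EBS-backed images discussed above.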
Be sure that your AMIs launch with all allowable disks. On some EC2 instance types, such as m3.*, ephemeral disks must be explicitly attached at launch.
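For example, on an m3 instance you can attach both ephemeral disks at launch with an explicit block device mapping. The AMI ID and instance type below are placeholders, and the launch command itself is left commented out since it requires AWS credentials:

```shell
# Block device mapping that attaches both m3 ephemeral disks at launch.
MAPPINGS='[{"DeviceName":"/dev/sdb","VirtualName":"ephemeral0"},{"DeviceName":"/dev/sdc","VirtualName":"ephemeral1"}]'

# On a real launch, pass the mapping to the AWS CLI (placeholder AMI/type):
#   aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m3.xlarge \
#     --block-device-mappings "$MAPPINGS"
echo "$MAPPINGS"
```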
EC2 Network Choices
Different instance types have different networking abilities, and it's good to become familiar with these constraints. Because most Cassandra deployments are typically bound by disk I/O, network speed is rarely in the foreground of decision making. In EC2, instance types with SSD storage, or even the hs1.8xlarge with its 24 storage devices, also come with greater network capacity, so network performance continues to be a non-issue.
However, because EC2 network performance can be inconsistent, a general recommendation is to increase phi_convict_threshold to 12 in the cassandra.yaml file. Otherwise, you may see flapping nodes, which occur when Cassandra's gossip protocol no longer recognizes a node as being UP and periodically marks it as DOWN before getting the UP notification again. Leave phi_convict_threshold at its default setting unless you see flapping nodes.
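As a sketch, the change amounts to one line in cassandra.yaml. The snippet below edits a local stand-in for the file (on a real node you would edit the copy in your Cassandra configuration directory, commonly /etc/cassandra/), and assumes the setting ships commented out at its default of 8:

```shell
# Work on a local copy; on an actual node, edit the real cassandra.yaml.
cat > cassandra.yaml <<'EOF'
# phi_convict_threshold: 8
EOF

# Uncomment the setting and raise it to 12 for EC2's inconsistent network:
sed -i 's/^# *phi_convict_threshold:.*/phi_convict_threshold: 12/' cassandra.yaml
grep phi_convict_threshold cassandra.yaml
```

A rolling restart of the nodes is needed for the new threshold to take effect.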
EC2 now has Enhanced Networking support for i2.* and c3.* instance types launched in VPCs using HVM AMIs that include the ixgbevf driver and have the sriovNetSupport attribute set. This feature affects a very narrow segment of EC2 instances, but it's useful to know that AWS is making progress in this area if you plan to use these instance types.
System Clocks and Cassandra Consistency
Clock skew will happen, especially in a cloud environment, and it causes timestamps to drift apart. Because Cassandra relies heavily on timestamps to resolve overwrites, keeping clocks in sync is of the utmost importance to your Cassandra deployment. If you're handling timestamps in the application tier as well, keeping clocks in sync there is also highly recommended. This can be done by simply installing NTP and ensuring the service is active and running successfully.
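The reason this matters: Cassandra resolves conflicting writes by comparing microsecond-precision timestamps (last write wins), so a node with a fast-running clock can silently win writes it shouldn't. A small sketch using GNU date shows the kind of timestamp a coordinator generates and the property that synced clocks preserve:

```shell
# Microsecond epoch timestamps, the precision Cassandra uses for write
# conflict resolution (the %N format specifier requires GNU date):
ts1=$(date +%s%6N)
sleep 0.01
ts2=$(date +%s%6N)

# With clocks in sync, a later write always carries a larger timestamp:
if [ "$ts2" -gt "$ts1" ]; then
  echo "later write wins: $ts2 > $ts1"
fi
```

When clocks drift across nodes, this ordering guarantee breaks down between machines, which is exactly the situation NTP prevents.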
There have been reports of machines whose clocks become highly skewed within just 24 hours. Although this is rare, don't get too stressed out; you can simply decommission the node and bring another one up in its place. Rather than hunting down the culprit, fully utilize the benefits of running Cassandra in a cloud provider; this lets you spend time moving forward instead of fixing an issue that can be easily forgotten.
Hopefully, this post has given you something to think about, and if you are planning to deploy to AWS, some helpful tips.