DataStax Enterprise 3.0 Documentation

Transparent data encryption

This documentation corresponds to an earlier product version. Make sure this document corresponds to your version.

Transparent data encryption (TDE) protects data at rest. Data at rest is data that has been flushed from the memtable in system memory to the SSTables on disk.

[Diagram: encryption.png]

As shown in the diagram, data stored in the commit log is not encrypted. If you need commit log encryption, store the commit log on a file system that is encrypted at the OS level, for example by using Gazzang. Data can be encrypted using different algorithms, or you can choose not to encrypt data at all. SSTable data files are immutable: they are not written to again after they have been flushed to disk. SSTables are therefore encrypted only once, when they are written to disk.

The high-level procedure for encrypting data is:

  1. Back up SSTables.
  2. Set permissions so that only the user/group running DataStax Enterprise can change the keytab file. If JNA is installed, JNA takes care of setting these permissions.
  3. Ensure that the user encrypting data has been granted ALTER permission on the table containing the data to be encrypted. You can use LIST PERMISSIONS to view the permissions granted to a user.
  4. Specify encryption options when you create a table (column family).
  5. Rewrite all SSTables using nodetool scrub, or use nodetool flush to flush to disk all new data using the current settings for encryption.
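Step 2 can be done by hand when JNA is not installed. The following is a minimal sketch, not a DataStax utility: the function name and the 0o660 default (read and write for owner and group, nothing for others) are assumptions chosen to match "only the user/group running DataStax Enterprise can change the keytab file".

```python
import os
import stat


def restrict_keytab(path, mode=0o660):
    """Limit access to the key file to its owner and group.

    The 0o660 default is an assumption; pass 0o600 to exclude the group.
    """
    os.chmod(path, mode)
    # Return the permission bits actually in effect after the change.
    return stat.S_IMODE(os.stat(path).st_mode)
```

Run it as the user that owns the file, passing the path configured in secret_key_file. The file must also be owned by the user running DataStax Enterprise, which this sketch does not change.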

Requirements

TDE requires a secure local file system to be effective. The encryption certificates are stored locally; therefore, a compromise of the local file system defeats the encryption.

Options

To get the full capabilities of TDE, download the Java Cryptography Extension (JCE), unzip the jar files, and place them under $JAVA_HOME/jre/lib/security. JCE-based products are restricted for export to certain countries by the U.S. Export Administration Regulations.

Limitations and recommendations

Data is not directly protected by TDE when accessed using the following utilities.

Utility        Reason Utility Is Not Encrypted
json2sstable   Operates directly on the sstables.
nodetool       Uses only JMX, so data is not accessed.
sstable2json   Operates directly on the sstables.
sstablekeys    Operates directly on the sstables.
sstableloader  Operates directly on the sstables.
sstablescrub   Operates directly on the sstables.

The local file system can be protected with a third-party whole-disk encryption solution. You can choose SSL, Kerberos authentication, an encrypted file system, or other means to secure nodes.

DataStax recommends that you do not export local file systems if possible. If you must export a local file system, ensure that mounting the file system on the node is a server-side capability.

Compression and encryption introduce performance overhead.

Encrypting Data

You designate encryption on a per-table (column family) basis. When using encryption, each node generates a separate key that is used only for that node’s SSTables.

For example, log in as the default superuser:

./cqlsh -3 -u cassandra -p cassandra

The ALTER TABLE syntax for setting encryption options is the same as the syntax for setting data compression options.

For example, to set compression options in the chores table:

ALTER TABLE chores
  WITH compression_parameters:sstable_compression = 'DeflateCompressor'
  AND compression_parameters:chunk_length_kb = 64;

To set encryption options in the chores table using CQL 3, for example:

ALTER TABLE chores
  WITH compression_parameters:sstable_compression = 'Encryptor'
  AND compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
  AND compression_parameters:secret_key_strength = 128
  AND compression_parameters:chunk_length_kb = 1;

Designating data for encryption using ALTER TABLE doesn't encrypt existing SSTables, just new SSTables that are generated. When setting up data to be encrypted, but not compressed, set the chunk_length_kb option to the lowest possible value, 1, as shown in the previous example. Setting this option to 1 improves read performance by limiting the data that needs to be decrypted for each read operation to 1 KB.
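Because every sub-option is one more AND clause, a semicolon placed in the middle of the statement ends it early. The following sketch assembles the statement from its options so that only the last clause is terminated. The helper name build_alter_table is hypothetical, and the output simply mirrors the statement form used in this section; it does no escaping and is not meant for untrusted input.

```python
def build_alter_table(table, sstable_compression, **sub_options):
    """Return a CQL 3 ALTER TABLE statement in the form used in this section."""
    options = {"sstable_compression": sstable_compression}
    options.update(sub_options)
    clauses = []
    for name, value in options.items():
        # Strings are single-quoted; integers are written bare.
        literal = "'%s'" % value if isinstance(value, str) else str(value)
        clauses.append("compression_parameters:%s = %s" % (name, literal))
    # Only the final clause is terminated with a semicolon.
    return "ALTER TABLE %s\n  WITH %s;" % (table, "\n  AND ".join(clauses))


print(build_alter_table(
    "chores",
    "Encryptor",
    cipher_algorithm="AES/ECB/PKCS5Padding",
    secret_key_strength=128,
    chunk_length_kb=1,
))
```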

Setting encryption and compression together

Encryption and compression occur locally, which performs better than trying to accomplish these tasks on the Cassandra side. Encryption can be set together with compression using a single statement. The single statement in CQL 3 is:

ALTER TABLE chores
  WITH compression_parameters:sstable_compression = 'EncryptingSnappyCompressor'
  AND compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
  AND compression_parameters:secret_key_strength = 128
  AND compression_parameters:chunk_length_kb = 128;

Encryption/compression options and sub-options

Using encryption, your application can read and write to SSTables that use different encryption algorithms or no encryption at all. Using different encryption algorithms to encrypt SSTable data is similar to using different compression algorithms to compress data. This section lists the options and sub-options.

The high-level container options for encryption and/or compression used in the ALTER TABLE statement are:

  • Encryptor
  • EncryptingDeflateCompressor
  • EncryptingSnappyCompressor
  • DeflateCompressor
  • SnappyCompressor (default)

The cipher_algorithm sub-option

The cipher_algorithm options and acceptable secret_key_strength for the algorithms are:

cipher_algorithm            secret_key_strength
AES/CBC/PKCS5Padding        128, 192, or 256
AES/ECB/PKCS5Padding        128, 192, or 256
DES/CBC/PKCS5Padding        56
DESede/CBC/PKCS5Padding     112 or 168
Blowfish/CBC/PKCS5Padding   32-448
RC2/CBC/PKCS5Padding        40-128

You can install custom providers for your JVM. AES-512 is not supported out of the box.
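A client-side check can catch an unsupported combination before the ALTER TABLE statement is sent. The sketch below only transcribes the table above; the names KEY_STRENGTHS and is_supported are hypothetical, and treating the Blowfish and RC2 ranges as inclusive with every integer allowed is an assumption, because the table does not state a step size.

```python
# Allowed secret_key_strength values, transcribed from the table above.
# Ranges are treated as inclusive; the table does not give a step size.
KEY_STRENGTHS = {
    "AES/CBC/PKCS5Padding": (128, 192, 256),
    "AES/ECB/PKCS5Padding": (128, 192, 256),
    "DES/CBC/PKCS5Padding": (56,),
    "DESede/CBC/PKCS5Padding": (112, 168),
    "Blowfish/CBC/PKCS5Padding": range(32, 449),
    "RC2/CBC/PKCS5Padding": range(40, 129),
}


def is_supported(cipher_algorithm, secret_key_strength):
    """True if the pair appears in the table; unknown algorithms are rejected."""
    allowed = KEY_STRENGTHS.get(cipher_algorithm, ())
    return secret_key_strength in allowed
```

A custom JVM provider may accept combinations that this table-based check rejects.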

The secret_key_provider_factory_class sub-option

The secret_key_provider_factory_class is:

com.datastax.bdp.cassandra.crypto.LocalFileSystemKeyProviderFactory

The secret_key_file sub-option

The secret_key_file option is the location of the keyfile. The default location is /etc/dse/conf, but it can reside in any directory.

The chunk_length_kb sub-option

On disk, SSTables are encrypted and compressed by block (to allow random reads). This subproperty of compression defines the size (in KB) of the block and is a power of 2. Values larger than the default might improve the compression rate, but they increase the minimum amount of data that must be read from disk when a read occurs. The default value (64) is a good middle ground for compressing tables.
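The trade-off can be made concrete with a little arithmetic. The sketch below assumes that a read must process every whole block that overlaps the requested byte range, which is a simplified model of the behavior described here rather than a description of the actual storage engine. Both function names are hypothetical.

```python
def is_valid_chunk_length(chunk_length_kb):
    """True if the value is a positive power of 2."""
    return chunk_length_kb > 0 and (chunk_length_kb & (chunk_length_kb - 1)) == 0


def bytes_processed(offset, length, chunk_length_kb):
    """Bytes in the blocks covering [offset, offset + length), for length >= 1."""
    chunk = chunk_length_kb * 1024
    first = offset // chunk
    last = (offset + length - 1) // chunk
    return (last - first + 1) * chunk
```

Under this model, a 50-byte read at offset 100 processes 1024 bytes with chunk_length_kb = 1 and 65536 bytes with the default of 64.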

When you use just encryption and no compression, SSTable sizes differ dramatically from the sizes you get when compression is also used. For example, in an internal test that started with a 3.2M .db file, using these options resulted in a 236K encrypted .db file:

  • sstable_compression = 'EncryptingDeflateCompressor'
  • cipher_algorithm = 'AES/CBC/PKCS5Padding'
  • secret_key_strength = 256
  • secret_key_file = '/home/automaton/newencrypt/keyfile'
  • chunk_length_kb = 128

Altering the table to use the EncryptingDeflateCompressor and the same options as before resulted in a file size of 236K, so combining encryption and compression is probably a good idea.

The iv_length sub-option

Not all algorithms allow you to set this sub-option, and most complain if it is not set to 16 bytes. Either use 16 or accept the default.

The syntax for setting this sub-option is similar to the syntax for setting a compression algorithm:

ALTER TABLE chores
  WITH compression_parameters:sstable_compression = 'EncryptingSnappyCompressor'
  AND compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
  AND compression_parameters:secret_key_strength = 128
  AND compression_parameters:iv_length = 16;

Using nodetool to complete encryption operations

Use the nodetool scrub utility to rewrite all the SSTables. Use nodetool flush to flush to disk all new data using the current settings for encryption.
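The two commands can be scripted per table. This is a sketch under assumptions: the helper names are hypothetical, the keyspace name is whatever keyspace holds the table, and running flush before scrub is a design choice made here so that data still in the memtable is on disk before the SSTables are rewritten. By default the sketch only prints the commands.

```python
import subprocess


def nodetool_commands(keyspace, table, host="localhost"):
    """Return the flush and scrub command lines for one table."""
    base = ["nodetool", "-h", host]
    return [base + ["flush", keyspace, table], base + ["scrub", keyspace, table]]


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # Raises CalledProcessError if nodetool exits with an error.
            subprocess.check_call(argv)
```

Call run_all(nodetool_commands(keyspace, "chores")) on each node to review the commands, then repeat with dry_run=False to execute them.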

About the keytab file

After you designate the data to be encrypted, a keytab file is created in the directory set by the secret_key_file option. If the directory doesn’t exist, it is created. A failure to create the directory probably indicates a permissions problem.

Example values in the keytab file are:

AES/ECB/PKCS5Padding:256:bxegm8vh4wE3S2hO9J36RL2gIdBLx0O46J/QmoC3W3U=
AES/CBC/PKCS5Padding:256:FUhaiy7NGB8oeSfe7cOo3hhvojVl2ijI/wbBCFH6hsE=
RC2/CBC/PKCS5Padding:128:5Iw8EW3GqE6y/6BgIc3tLw==
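The example values suggest that each entry has three colon-separated fields: the cipher algorithm, the key strength in bits, and the key itself encoded as Base64. That layout is inferred from the example and is an assumption, not a documented format. The read-only sketch below parses the example entries on that basis; parse_keytab is a hypothetical name.

```python
import base64

# The example entries shown above.
SAMPLE = """\
AES/ECB/PKCS5Padding:256:bxegm8vh4wE3S2hO9J36RL2gIdBLx0O46J/QmoC3W3U=
AES/CBC/PKCS5Padding:256:FUhaiy7NGB8oeSfe7cOo3hhvojVl2ijI/wbBCFH6hsE=
RC2/CBC/PKCS5Padding:128:5Iw8EW3GqE6y/6BgIc3tLw==
"""


def parse_keytab(text):
    """Yield (cipher_algorithm, strength_bits, key_bytes) for each entry."""
    # Entries may be separated by spaces or by newlines.
    for entry in text.split():
        algorithm, strength, encoded = entry.split(":", 2)
        yield algorithm, int(strength), base64.b64decode(encoded)


entries = list(parse_keytab(SAMPLE))
```

For these entries the decoded key length in bits matches the stated strength, which is consistent with the assumed layout. Treat the real file as read-only and never modify it in place.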

Deleting, moving, or changing the data in the keytab file causes errors when the node restarts, and you lose all your data. Consider storing the file on a network server or encrypting the entire file system of the nodes using a third-party tool.

CassandraFS

The CassandraFS (Cassandra file system) is accessed as part of the Hadoop File System (HDFS) using the configured authentication. If you encrypt the CassandraFS keyspace's sblocks and inode tables, all CassandraFS data gets encrypted.

Using SolrJ-Auth

Follow the instructions in the solrj-auth-README.md file to use the SolrJ-Auth libraries to implement encryption. The solrj-auth-README.md file is located in one of the following directories, depending on the installation type:

Debian installations: /usr/share/doc/dse-libsolr*

RHEL-based installations: /usr/share/doc/dse-libsolr

Binary installations: resources/solr

These SolrJ-Auth libraries are included in the DataStax Enterprise distribution:

Debian installations: /usr/share/dse/clients

Binary installations: <install_location>/clients