Technology · March 4, 2013

Using Transparent Data Encryption in DataStax Enterprise

Robin Schumacher

One of the new security features added in DataStax Enterprise 3.0 is transparent encryption. Let’s take a quick tour of the feature and see how it works.

What is Transparent Data Encryption?

Transparent Data Encryption (TDE) is used to encrypt data at rest so that it cannot be easily read by unauthorized users who gain access to the underlying files used to store table / column family data. TDE is especially good for tables containing sensitive data such as human resource information, social security numbers, credit card data, etc.

How to use Transparent Data Encryption in DataStax Enterprise

TDE is applied at the table / column family level in DataStax Enterprise. The easiest way to use TDE is to specify it at table creation time. TDE is specified using the Cassandra plug-in that handles both data compression and encryption. An example of specifying TDE for a new table might be:

cqlsh:dev> create table emp
... (empid int primary key,
...  first_name varchar,
...  last_name varchar,
...  ssn int)
... with compression_parameters:sstable_compression = 'Encryptor'
...   and compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
...   and compression_parameters:secret_key_strength = 128;

The above sets 128-bit AES encryption for the new table. Several different encryption algorithms may be set for a table:

cipher_algorithm             secret_key_strength (bits)
AES/CBC/PKCS5Padding         128, 192, or 256
AES/ECB/PKCS5Padding         128, 192, or 256
DES/CBC/PKCS5Padding         56
DESede/CBC/PKCS5Padding      112 or 168
Blowfish/CBC/PKCS5Padding    32-448
RC2/CBC/PKCS5Padding         40-128

Custom encryption providers may also be installed and used in DataStax Enterprise.

Once encryption is specified for a new table, there is nothing else a developer or administrator needs to do. No changes need to be made at the application level to read or write data; this is where the ‘transparent’ part of TDE comes into play. Data may be written and read in typical fashion:

cqlsh:dev> insert into emp
... (empid, first_name, last_name, ssn)
... values (1,'laura','jung',213456789);
cqlsh:dev> select * from emp;
empid  | first_name | last_name | ssn
1      |      laura |      jung | 213456789

Data in an existing table can also be encrypted, but extra steps are required. First, the ALTER command is used to specify encryption for the existing table:

ALTER TABLE emp
WITH compression_parameters:sstable_compression = 'Encryptor'
AND compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
AND compression_parameters:secret_key_strength = 128;

However, the existing SSTables on disk are not encrypted until they are rewritten. This is done with the nodetool utility supplied in DataStax Enterprise: the scrub command rewrites all SSTables for the table, while the flush command only flushes new data to disk, where it is then encrypted.
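The two nodetool options might look like the sketch below. It assumes the keyspace is named dev (taken from the cqlsh prompt in the examples) and the table is emp. The run helper is a hypothetical convenience, not part of DataStax Enterprise: it only prints each command unless DRY_RUN=0 is set, so the commands can be reviewed before they touch a node.

```shell
#!/bin/sh
# Keyspace and table names follow the cqlsh examples in this post (assumed).
KEYSPACE="dev"
TABLE="emp"

# Hypothetical helper: prints the command by default so it can be reviewed;
# set DRY_RUN=0 to actually execute it on a node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option 1: rewrite every existing SSTable for the table, so data written
# before the ALTER is stored encrypted as well.
run nodetool scrub "$KEYSPACE" "$TABLE"

# Option 2: only flush in-memory data to disk; the newly written SSTables
# are encrypted, but older SSTables are left as they were.
run nodetool flush "$KEYSPACE" "$TABLE"
```

In practice you would pick one of the two options rather than run both, and repeat it on each node that holds data for the table.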

Encrypting Hadoop Data in DataStax Enterprise

Because DataStax Enterprise swaps out HDFS for the Cassandra File System (CFS) and runs all Hadoop operations on top of Cassandra, it’s possible to use TDE to encrypt all Hadoop data in DataStax Enterprise.

On nodes in DataStax Enterprise that are specified as Hadoop nodes, the inode and sblocks tables store all metadata and operational data. Both tables can be encrypted in exactly the manner described above, so that all Hadoop data is protected.
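As a rough sketch, the statements for the two CFS tables could be generated as shown below. The keyspace name cfs is an assumption and should be checked against your cluster. The encrypt_statements function is a hypothetical helper that only prints CQL for review and does not connect to anything. The cipher settings simply mirror the emp example.

```shell
#!/bin/sh
# Assumed keyspace name for the CFS tables; verify it on your cluster.
CFS_KEYSPACE="cfs"

# Hypothetical helper: prints the same ALTER statement used for emp,
# once for each of the two CFS tables named in this post.
encrypt_statements() {
    echo "USE $CFS_KEYSPACE;"
    for table in inode sblocks; do
        cat <<EOF
ALTER TABLE $table
  WITH compression_parameters:sstable_compression = 'Encryptor'
  AND compression_parameters:cipher_algorithm = 'AES/ECB/PKCS5Padding'
  AND compression_parameters:secret_key_strength = 128;
EOF
    done
}

# Print the statements so they can be reviewed and then run in cqlsh.
encrypt_statements
```

As with any existing table, data already on disk stays unencrypted until the SSTables for these tables are rewritten.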

For Solr/enterprise search nodes in DataStax Enterprise, Solr indexes do not use Cassandra tables to store their data, so TDE cannot fully cover Solr data the way it does on Hadoop and Cassandra nodes. Solr data can be encrypted, but Solr indexes cannot.

Current Limitations

Limitations to the current implementation of TDE in DataStax Enterprise are:

  • The encryption keys are stored on the server and are not separated/pushed out to a different machine.
  • Data in the Cassandra commit log is not encrypted.
  • Encryption adds some minor overhead, the amount of which depends on a number of factors (table size, datatypes used, etc.).

Next Steps

To try out TDE and all features of DataStax Enterprise, download a copy today. DataStax Enterprise is completely free to use in development environments with no restrictions; production deployments, however, require a purchased subscription.

For more information on all of 3.0's security features, please see our online documentation and our “What’s New in DataStax 3.0?” white paper.

