DataStax Enterprise 3.0 Documentation

Configuring and using data auditing

Auditing is implemented as a log4j-based integration. DataStax Enterprise places the audit log in the directory indicated by a log4j property. After the file reaches a size threshold, it rolls over and is renamed. The names of rolled-over files include a numerical suffix that goes up to the value of maxBackupIndex.
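
As a rough illustration of the rollover naming, the sketch below lists the files that would be kept for a given maxBackupIndex. It assumes the standard log4j 1.2 RollingFileAppender convention, in which the active file keeps its name and backups carry the suffixes .1 (most recent) through .maxBackupIndex (oldest). The function name audit_log_files is hypothetical and is not part of DataStax Enterprise.

```python
def audit_log_files(path, max_backup_index):
    """Return the active audit log followed by the backup names kept on disk.

    Assumes log4j 1.2 RollingFileAppender naming: <path>.1 is the most
    recent backup and <path>.<max_backup_index> is the oldest one kept.
    """
    backups = [f"{path}.{i}" for i in range(1, max_backup_index + 1)]
    return [path] + backups


if __name__ == "__main__":
    # Values taken from the defaults shown in this document.
    for name in audit_log_files("/var/log/cassandra/audit.log", 5):
        print(name)
```

With the defaults used in this document (maxFileSize of 200MB and maxBackupIndex of 5), this suggests planning for up to six files of roughly 200MB each.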

The audit logger logs information only on the node where logging is set up. For example, if node 0 has auditing turned on and node 1 does not, updates and other commands issued on node 1 generally do not show up in node 0’s audit log. To get the maximum information from data auditing, turn it on for every node. The log4j integration supports storing the audit data on the file system or in Cassandra.

Auditing is configured through a text file in the file system, so the file is vulnerable to OS-level security breaches. To secure it, store the file on an OS-level encrypted file system, for example by using Gazzang.

Configuring data auditing

You can configure which categories of audit events should be logged and also whether operations against any specific keyspaces should be omitted from audit logging.

To configure data auditing:

  1. Open the log4j-server.properties file in the following directory.

Packaged installs

/etc/dse/cassandra

Binary installs

/resources/cassandra/conf

  2. To configure data auditing, uncomment these properties, and ensure that the default values are set.

    Property                     Default                               Description
    log4j.logger.DataAudit       INFO, A                               Produces INFO-level logs.
    log4j.additivity.DataAudit   false                                 Prevents logging to the root appender.
    log4j.appender.A             org.apache.log4j.RollingFileAppender  Writes the audit log to a file that rolls over at a size threshold.
    log4j.appender.A.File        /var/log/cassandra/audit.log          Sets the file and path of the log file.
    log4j.appender.A.bufferedIO  true                                  True improves performance, but entries are not written in real time; set to false for testing.

    To disable data auditing, comment out log4j.logger.DataAudit, log4j.additivity.DataAudit, and log4j.appender.A. This removes almost all auditing overhead. The audit logger logs at INFO level, so the DataAudit logger must be configured at INFO (or lower) level in log4j-server.properties. Setting the logger to a higher level, such as WARN, prevents any log events from being recorded, but it does not completely disable data auditing: some overhead still occurs beyond that caused by regular processing.

  3. Set other general options to tune the logging. For example, uncomment these properties and accept the following defaults:

    • log4j.appender.A.maxFileSize=200MB
    • log4j.appender.A.maxBackupIndex=5
    • log4j.appender.A.layout=org.apache.log4j.PatternLayout
    • log4j.appender.A.layout.ConversionPattern=%m%n
    • log4j.appender.A.filter.1=com.datastax.bdp.cassandra.audit.AuditLogFilter
  4. Uncomment log4j.appender.A.filter.1.ActiveCategories and set it to ALL or to a combination of these settings:

    Setting  Logging
    ADMIN    Logs describe schema versions, cluster name, version, ring, and other admin events
    ALL      Logs everything: DDL, DML, queries, and errors
    AUTH     Logs login events
    DML      Logs insert, update, delete, and other DML events
    DDL      Logs object and user create, alter, drop, and other DDL events
    DCL      Logs grant, revoke, create user, drop user, and list users events
    QUERY    Logs all queries

    Set the ActiveCategories property to a comma-separated list of the categories to include in the audit log output. By default, this list is empty, so no events are included in the log unless you specify categories. Events are generated even if they are not included in the log, so set this property.

  5. You can disable logging for specific keyspaces. Set this property as follows to prevent logging of operations against the specified keyspaces:

    log4j.appender.A.filter.1.ExemptKeyspaces=do_not_log,also_do_not_log

    To prevent the audit logger from logging information about itself when using the Cassandra log4j appender, exempt the keyspace that the appender logs to.

The audit log section of the log4j-server.properties file should look something like this:

log4j.logger.DataAudit=INFO, A
log4j.additivity.DataAudit=false
log4j.appender.A=org.apache.log4j.RollingFileAppender
log4j.appender.A.File=/var/log/cassandra/audit.log
log4j.appender.A.bufferedIO=true
log4j.appender.A.maxFileSize=200MB
log4j.appender.A.maxBackupIndex=5
log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%m%n
log4j.appender.A.filter.1=com.datastax.bdp.cassandra.audit.AuditLogFilter
log4j.appender.A.filter.1.ActiveCategories=ALL
log4j.appender.A.filter.1.ExemptKeyspaces=do_not_log,also_do_not_log
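
A configuration like the one above can be sanity-checked with a short script before restarting a node. The following is a minimal sketch and not a DataStax tool: the function names are hypothetical, the properties parser handles only plain key=value lines, and the checks cover just the two pitfalls described in the steps (a logger level above INFO and an empty ActiveCategories list). It also assumes that category names must match the listed settings exactly.

```python
VALID_CATEGORIES = {"ADMIN", "ALL", "AUTH", "DML", "DDL", "DCL", "QUERY"}
# log4j levels at INFO or lower, which let INFO-level audit events through.
LEVELS_ALLOWING_INFO = {"ALL", "TRACE", "DEBUG", "INFO"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_audit_config(props):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    logger = props.get("log4j.logger.DataAudit")
    if logger is None:
        problems.append("log4j.logger.DataAudit is not set, so auditing is off")
    else:
        # The logger value has the form "<LEVEL>, <appender>".
        level = logger.split(",")[0].strip().upper()
        if level not in LEVELS_ALLOWING_INFO:
            problems.append(f"DataAudit level {level} suppresses INFO audit events")
    raw = props.get("log4j.appender.A.filter.1.ActiveCategories", "")
    categories = [c.strip() for c in raw.split(",") if c.strip()]
    if not categories:
        problems.append("ActiveCategories is empty, so no events are logged")
    unknown = [c for c in categories if c not in VALID_CATEGORIES]
    if unknown:
        problems.append("unknown categories: " + ", ".join(unknown))
    return problems


if __name__ == "__main__":
    sample = (
        "log4j.logger.DataAudit=WARN, A\n"
        "log4j.appender.A.filter.1.ActiveCategories=DML,DDL\n"
    )
    for problem in check_audit_config(parse_properties(sample)):
        print(problem)
```

To check a real file, read its contents (for example from /etc/dse/cassandra/log4j-server.properties on a packaged install) and pass the text to parse_properties.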

Format of logs

Each log line is a set of name/value pairs. The pairs are separated by the pipe symbol ("|"), and the name and value portions of each pair are separated by a colon. A name/value pair, or field, is included in the log line only if a value exists for that particular event. Some fields always have a value and are always present; others might not be relevant for a given operation. Fields that are present appear in a predictable order, which makes parsing with automated tools easier. For example, the text of a CQL statement is unquoted but, if present, is always the last field in the log line.

Field Label  Field Value               Optional
host         dse node address          no
source       client address            no
user         authenticated user        no
timestamp    system time of log event  no
category     DML/DDL/QUERY for example no
type         API level operation       no
batch        batch id                  yes
ks           keyspace                  yes
cf           column family             yes
operation    textual description       yes

The textual description value for the operation field label is currently only present for CQL.
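
Because the operation field is always last, a parser can cut it off first and then split the remaining fields safely. The following is a minimal sketch of that approach, with hypothetical function names. It assumes that each event occupies a single line in the actual log file (the examples in this document are wrapped for display) and that the timestamp is in milliseconds since the Unix epoch, which the 13-digit values in the examples suggest.

```python
from datetime import datetime, timedelta, timezone


def parse_audit_line(line):
    """Split one audit log line into a dict of field label -> value.

    The operation field, when present, is always last and may itself
    contain '|' or ':' characters, so it is separated before splitting.
    """
    head, sep, operation = line.strip().partition("|operation:")
    fields = {}
    for pair in head.split("|"):
        # Split on the first colon only; values may contain further colons.
        name, _, value = pair.partition(":")
        fields[name] = value
    if sep:
        fields["operation"] = operation
    return fields


def event_time(fields):
    """Convert the timestamp field (assumed milliseconds) to a UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=int(fields["timestamp"]))


if __name__ == "__main__":
    # The CQL query example from this document, joined onto one line.
    sample = (
        "host:/192.168.56.1|source:/192.168.56.101"
        "|user:#<User allow_all groups=[]>"
        "|timestamp:1351003741953|category:QUERY|type:CQL_SELECT"
        "|ks:dsp904|cf:t0|operation:select * from t0;"
    )
    parsed = parse_audit_line(sample)
    print(parsed["category"], parsed["type"], parsed["operation"])
    print(event_time(parsed).isoformat())
```

Optional fields such as batch are simply absent from the returned dict when the event has no value for them, so callers should use fields.get("batch") rather than indexing.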

Auditing is completely separate from authorization, although the data points logged include the client address and the authenticated user, which may be a generic user if the default authenticator is not overridden. Logging of requests can be activated for any or all of the categories listed for log4j.appender.A.filter.1.ActiveCategories (see the ActiveCategories step in Configuring data auditing).

CQL Logging examples

Generally, SELECT queries are placed into the QUERY category. INSERT, UPDATE, and DELETE statements are categorized as DML. CQL statements that affect the schema, such as CREATE KEYSPACE and DROP KEYSPACE, are categorized as DDL.

CQL USE

USE dsp904;

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351003707937|category:DML|type:SET_KS|ks:dsp904|operation:use dsp904;

CLI USE

USE dsp904;

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351004648848|category:DML|type:SET_KS|ks:dsp904

CQL query

SELECT * FROM t0;

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351003741953|category:QUERY|type:CQL_SELECT|ks:dsp904|cf:t0|operation:select * from t0;

CQL BATCH

BEGIN BATCH
  INSERT INTO t0(id, field0) VALUES (0, 'foo')
  INSERT INTO t0(id, field0) VALUES (1, 'bar')
  DELETE FROM t1 WHERE id = 2
APPLY BATCH;

host:192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005482412|category:DML|type:CQL_UPDATE
  |batch:fc386364-245a-44c0-a5ab-12f165374a89|ks:dsp904|cf:t0
  |operation:INSERT INTO t0 ( id , field0 ) VALUES ( 0 , 'foo' )

host:192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005482413|category:DML|type:CQL_UPDATE
  |batch:fc386364-245a-44c0-a5ab-12f165374a89|ks:dsp904|cf:t0
  |operation:INSERT INTO t0 ( id , field0 ) VALUES ( 1 , 'bar' )

host:192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005482413|category:DML|type:CQL_DELETE
  |batch:fc386364-245a-44c0-a5ab-12f165374a89|ks:dsp904|cf:t1
  |operation:DELETE FROM t1 WHERE id = 2

CQL DROP KEYSPACE

DROP KEYSPACE dsp904;

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351004777354|category:DDL|type:DROP_KS
  |ks:dsp904|operation:drop keyspace dsp904;

CQL prepared statement

host:/10.112.75.154|source:/127.0.0.1|user:allow_all
  |timestamp:1356046999323|category:DML|type:CQL_UPDATE
  |ks:ks|cf:cf|operation:INSERT INTO cf (id, name) VALUES (?, ?) [id=1,name=vic]

Thrift batch_mutate

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005073561|category:DML|type:INSERT
  |batch:7d13a423-4c68-4238-af06-a779697088a9|ks:Keyspace1|cf:Standard1

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005073562|category:DML|type:INSERT
  |batch:7d13a423-4c68-4238-af06-a779697088a9|ks:Keyspace1|cf:Standard1

host:/192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>
  |timestamp:1351005073562|category:DML|type:INSERT
  |batch:7d13a423-4c68-4238-af06-a779697088a9|ks:Keyspace1|cf:Standard1

Batch updates

Batch updates, whether received through a Thrift batch_mutate call or in a CQL BEGIN BATCH ... APPLY BATCH block, are logged in the following way: a UUID is generated for the batch, and then each individual operation is reported separately, with an extra field containing the batch id.
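
Since every operation in a batch carries the same batch id, the original batch can be reassembled from the log by grouping on that field. The sketch below shows one way to do this. The function name is hypothetical, and it assumes one event per line in the actual log file, with the operation field last when it is present.

```python
from collections import defaultdict


def group_by_batch(lines):
    """Map each batch id to the list of events logged for that batch.

    Events without a batch field are ignored. Each event is a dict of
    field label -> value, in the order the lines were read.
    """
    batches = defaultdict(list)
    for line in lines:
        # The operation text may contain '|', so separate it first.
        head, sep, operation = line.strip().partition("|operation:")
        fields = dict(pair.partition(":")[::2] for pair in head.split("|"))
        if "batch" not in fields:
            continue
        if sep:
            fields["operation"] = operation
        batches[fields["batch"]].append(fields)
    return dict(batches)


if __name__ == "__main__":
    prefix = "host:192.168.56.1|source:/192.168.56.101|user:#<User allow_all groups=[]>"
    batch = "batch:fc386364-245a-44c0-a5ab-12f165374a89"
    log = [
        f"{prefix}|timestamp:1351005482412|category:DML|type:CQL_UPDATE"
        f"|{batch}|ks:dsp904|cf:t0"
        "|operation:INSERT INTO t0 ( id , field0 ) VALUES ( 0 , 'foo' )",
        f"{prefix}|timestamp:1351005482413|category:DML|type:CQL_DELETE"
        f"|{batch}|ks:dsp904|cf:t1|operation:DELETE FROM t1 WHERE id = 2",
    ]
    for batch_id, events in group_by_batch(log).items():
        print(batch_id, [event["type"] for event in events])
```

Thrift batch_mutate events have no operation field, so the grouped entries for those batches contain only the type, ks, and cf details.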

Configuring auditing for a DSE Search/Solr cluster

By default, DSE Search/Solr nodes need no configuration for data auditing other than setting up the log4j-server.properties file. However, if the filter-mapping element in the Solr web.xml file is commented out, the auditor cannot log anything from Solr, and you need to uncomment the element as described below.

If necessary, uncomment the filter-mapping element in the Solr web.xml.

<filter-mapping>
    <filter-name>DseAuditLoggingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

The Solr web.xml is located in the following directory:

Packaged installations

/usr/share/dse/solr/web/solr/WEB-INF/web.xml

Binary installations

/resources/solr/web/solr/WEB-INF/web.xml
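
To find out whether the filter-mapping element is active without reading the file by hand, a script can parse web.xml and look for it. The following is a minimal sketch with a hypothetical function name. It relies on the fact that Python's xml.etree.ElementTree discards XML comments while parsing, so a commented-out element is not found, and it compares local tag names in case the web-app element declares a default namespace.

```python
import xml.etree.ElementTree as ET


def audit_filter_mapped(web_xml_text, filter_name="DseAuditLoggingFilter"):
    """Return True if an active filter-mapping for filter_name exists.

    Commented-out elements are dropped by the parser and so do not count.
    """
    root = ET.fromstring(web_xml_text)
    for elem in root.iter():
        # Skip anything that is not a regular element (tag is not a string).
        if not isinstance(elem.tag, str):
            continue
        # Strip a "{namespace}" prefix, if any, to get the local tag name.
        if elem.tag.rsplit("}", 1)[-1] != "filter-mapping":
            continue
        for child in elem:
            local = child.tag.rsplit("}", 1)[-1]
            if local == "filter-name" and (child.text or "").strip() == filter_name:
                return True
    return False


if __name__ == "__main__":
    sample = """<web-app>
      <!--
      <filter-mapping>
        <filter-name>DseAuditLoggingFilter</filter-name>
        <url-pattern>/*</url-pattern>
      </filter-mapping>
      -->
    </web-app>"""
    print(audit_filter_mapped(sample))
```

To check a real installation, read the web.xml file from the location listed above and pass its contents to audit_filter_mapped.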

Example of a Solr Audit Log

Here is an example of the data audit log of a Solr query:

host:/10.245.214.159|source:127.0.0.1|user:jdoe|timestamp:1356045339910|category:QUERY
  |type:SOLR_QUERY|ks:wiki|cf:solr|operation:/wiki.solr/select/?q=body:trains