DataStax Enterprise allows you to stream your web and application log information into a database cluster via Apache log4j.
Apache log4j is a Java-based logging framework that provides runtime application feedback. It provides the ability to control the granularity of log statements using an external configuration file (log4j.properties).
With the Cassandra Appender, you can store log4j messages in a column family, where they are available for in-depth analysis using the Hadoop and Solr capabilities provided by DataStax Enterprise. For information about Cassandra logging, see Logging Configuration. Additionally, DataStax provides a Log4j Search Demo.
The log4j utility has three main components: loggers, appenders, and layouts. Loggers are logical log file names, the names known to the Java application, and each logger can be configured independently for its level of logging. Appenders control output. Numerous appenders are available, and multiple appenders can be attached to any logger, which makes it possible to log the same information to multiple outputs. Appenders use layouts to format log entries. In the sample startup messages shown later in this section, each entry shows the level, the thread name, the message timestamp, the source code file, the line number, and the log message.
The available levels are, from most to least verbose: ALL, TRACE, DEBUG, INFO, WARN, ERROR, FATAL, and OFF.
DataStax does not recommend using TRACE or DEBUG in production because of their verbosity and performance impact.
As mentioned above, the messages that appear in the log are controlled via the conf/log4j.properties file. Using this properties file, you can control the granularity down to the Java package and class levels. For example, DEBUG messages from a particular class can be included in the log while messages from other classes remain at a higher level. This helps reduce clutter and makes relevant messages easier to identify. The log is most commonly a file, stdout, or both. The format, behavior (such as file rolling), and so on are also configurable at runtime.
Below are sample log messages from a Cassandra node startup:
INFO [main] 2012-02-10 09:15:33,112 DatabaseDescriptor.java (line 495) Found table data in data directories. Consider using the CLI to define your schema.
INFO [main] 2012-02-10 09:15:33,135 CommitLog.java (line 166) No commitlog files found; skipping replay
INFO [main] 2012-02-10 09:15:33,150 StorageService.java (line 400) Cassandra version: 1.0.7
INFO [main] 2012-02-10 09:15:33,150 StorageService.java (line 401) Thrift API version: 19.20.0
INFO [main] 2012-02-10 09:15:33,150 StorageService.java (line 414) Loading persisted ring state
...
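When analyzing log output outside of log4j, it can help to split each entry into the fields it contains: level, thread name, timestamp, source file, line number, and message. The following is a minimal sketch, not part of DataStax Enterprise, that assumes each entry sits on a single line in the format of the sample messages. The class name LogLineParser and its regular expression are hypothetical and would need adjusting for a different layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits one Cassandra-style log line into its fields. */
final class LogLineParser {
    // Assumed layout: LEVEL [thread] yyyy-MM-dd HH:mm:ss,SSS File.java (line N) message
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\w+)\\s+\\[([^\\]]+)\\]\\s+"
        + "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+"
        + "(\\S+)\\s+\\(line (\\d+)\\)\\s*(.*)$");

    /**
     * Returns {level, thread, timestamp, file, lineNumber, message},
     * or null if the line does not match the assumed layout.
     */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[6];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "INFO [main] 2012-02-10 09:15:33,150 "
            + "StorageService.java (line 400) Cassandra version: 1.0.7";
        String[] fields = parse(sample);
        // Print the extracted fields separated by " | ".
        System.out.println(String.join(" | ", fields));
    }
}
```

Lines that do not follow the assumed layout, such as stack trace continuation lines, return null and would need separate handling.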
The Cassandra Appender provides the capability to store log4j messages in a Cassandra column family.
To enable the Cassandra Appender:
Add resources/log4j-appender/lib/ to your application classpath.
Modify the conf/log4j.properties file, as shown in the example below:
# Cassandra Appender
log4j.appender.CASS=com.datastax.logging.appender.CassandraAppender
log4j.appender.CASS.hosts = 127.0.0.1
log4j.appender.CASS.port = 9160
#log4j.appender.CASS.appName = "myApp"
#log4j.appender.CASS.keyspaceName = "Logging"
#log4j.appender.CASS.columnFamily = "log_entries"
#log4j.appender.CASS.placementStrategy = "org.apache.cassandra.locator.NetworkTopologyStrategy"
#log4j.appender.CASS.strategyOptions = {"DC1" : "1", "DC2" : "3" }
#log4j.appender.CASS.replicationFactor = 1
#log4j.appender.CASS.consistencyLevelWrite = ONE
#log4j.appender.CASS.maxBufferedRows = 256
log4j.logger.com.foo.bar= INFO, CASS
Commented lines are included for reference and to show the default values.
log4j.appender.CASS=com.datastax.logging.appender.CassandraAppender specifies the CassandraAppender class and assigns it the CASS alias. This alias is referenced in the last line.
log4j.appender.CASS.hosts = 127.0.0.1 specifies the Cassandra node to connect to. The hosts property also accepts a comma-delimited list of Cassandra nodes, which allows logging to continue if a node goes down.
Specify replication options in the following lines:
log4j.appender.CASS.placementStrategy = "org.apache.cassandra.locator.NetworkTopologyStrategy"
log4j.appender.CASS.strategyOptions = {"DC1" : "1", "DC2" : "3" }
log4j.logger.com.foo.bar= INFO, CASS specifies that all log messages of level INFO and higher, which are generated from the classes and sub-packages within the com.foo.bar package, are sent to the Cassandra server by the Appender.
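The effect of the logger line can be pictured as two checks: whether the logger name falls under the configured package, and whether the message level meets the configured threshold. The sketch below is a simplified model written for illustration only. It is not log4j's implementation, the class LoggerRuleSketch and its methods are hypothetical, and it ignores cases such as a more specific logger entry overriding the level.

```java
/** Hypothetical, simplified model of which events a logger rule selects. */
final class LoggerRuleSketch {
    // Message levels in increasing order of severity.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

    /** True if loggerName is the configured logger or lies beneath it in the package tree. */
    static boolean inScope(String configured, String loggerName) {
        return loggerName.equals(configured) || loggerName.startsWith(configured + ".");
    }

    /** True if the message level is at or above the configured threshold. */
    static boolean meetsThreshold(Level threshold, Level level) {
        return level.compareTo(threshold) >= 0;
    }

    /** Combines both checks for a rule such as "com.foo.bar = INFO, CASS". */
    static boolean reachesAppender(String configured, Level threshold,
                                   String loggerName, Level level) {
        return inScope(configured, loggerName) && meetsThreshold(threshold, level);
    }

    public static void main(String[] args) {
        String rule = "com.foo.bar";
        // A WARN message from a class in a sub-package is selected.
        System.out.println(reachesAppender(rule, Level.INFO, "com.foo.bar.baz.Widget", Level.WARN));
        // A DEBUG message from the same class is below the INFO threshold.
        System.out.println(reachesAppender(rule, Level.INFO, "com.foo.bar.baz.Widget", Level.DEBUG));
        // A class in a different package is out of scope, whatever its level.
        System.out.println(reachesAppender(rule, Level.INFO, "com.foo.other.Widget", Level.ERROR));
    }
}
```

The trailing "." in the scope check keeps a similarly named package such as com.foo.barbaz from matching the com.foo.bar rule.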
By default, the CassandraAppender records log messages in the Column Family log_entries in the Logging keyspace. The definition of this Column Family is as follows:
cqlsh:Logging> describe columnfamily log_entries;
CREATE COLUMNFAMILY log_entries (
KEY uuid PRIMARY KEY,
app_start_time bigint,
app_name text,
class_name text,
file_name text,
level text,
line_number text,
log_timestamp bigint,
logger_class_name text,
host_ip text,
host_name text,
message text,
method_name text,
ndc text,
thread_name text,
throwable_str_rep text
) WITH
comment='' AND
comparator=text AND
row_cache_provider='ConcurrentLinkedHashCacheProvider' AND
key_cache_size=200000.000000 AND
row_cache_size=0.000000 AND
read_repair_chance=1.000000 AND
gc_grace_seconds=864000 AND
default_validation=text AND
min_compaction_threshold=4 AND
max_compaction_threshold=32 AND
row_cache_save_period_in_seconds=0 AND
key_cache_save_period_in_seconds=14400 AND
replication_on_write=True;
Consider the following log snippet:
09:20:55,470 WARN SchemaTest:68 - This is warn message #163
09:20:55,470 INFO SchemaTest:71 - This is info message #489
09:20:55,471 ERROR SchemaTest:59 - Test exception.
java.io.IOException: Danger Will Robinson, Danger!
at com.datastax.logging.SchemaTest.testSavedEntries(SchemaTest.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
Note that the ERROR entry above includes the stack trace associated with an Exception. The associated rows in the log_entries Column Family appear as follows (queried using cqlsh):
KEY,eea1256e-db24-4cef-800b-843b3b2fb72c | app_start_time,1328894454774 | level,WARN |
log_timestamp,1328894455391 | logger_class_name,org.apache.log4j.Category | message,
This is warn message #163 | thread_name,main |
KEY,f7283a71-32a2-43cf-888a-0c1d3328548d | app_start_time,1328894454774 | level,INFO |
log_timestamp,1328894455064 | logger_class_name,org.apache.log4j.Category | message,
This is info message #489 | thread_name,main |
KEY,37ba6b9c-9fd5-4dba-8fbc-51c1696bd235 | app_start_time,1328894454774 | level,ERROR |
log_timestamp,1328894455392 | logger_class_name,org.apache.log4j.Category | message,
Test exception. | thread_name,main | throwable_str_rep,java.io.IOException: Danger
Will Robinson, Danger!
at com.datastax.logging.SchemaTest.testSavedEntries(SchemaTest.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
....
Not all columns have values because the set of values in logging events depends on the manner in which the event was generated, that is, which logging method was used in the code and the configuration of the column family.
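The app_start_time and log_timestamp columns are stored as bigint values. The sample values above are consistent with milliseconds since the Unix epoch, although that interpretation is an assumption here rather than something the column definition states. Under that assumption, a minimal sketch for turning them into readable times might look like the following. The class LogTimestamps and its methods are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for the bigint time columns, assumed to hold epoch milliseconds. */
final class LogTimestamps {
    /** Interprets a column value as milliseconds since 1970-01-01T00:00:00Z. */
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    /** Time between application start and the moment the event was logged. */
    static Duration sinceAppStart(long appStartTime, long logTimestamp) {
        return Duration.ofMillis(logTimestamp - appStartTime);
    }

    public static void main(String[] args) {
        // Values taken from the WARN row in the sample query output.
        long appStartTime = 1328894454774L;
        long logTimestamp = 1328894455391L;

        // Instant prints in UTC; the log snippet itself shows local time.
        System.out.println("Logged at: " + toInstant(logTimestamp));
        System.out.println("After start: "
            + sinceAppStart(appStartTime, logTimestamp).toMillis() + " ms");
    }
}
```

For the WARN row, the difference between the two columns is 617 ms.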
Storing logging information in Cassandra provides the capability to do in-depth analysis via the DataStax Enterprise platform using Hadoop and Solr.