Cassandra Log Viewing with gltail
gltail by Erlend Simonsen is a real-time log viewer written in Ruby that provides an interesting live view of your log data from multiple servers. Log entries are assigned to blocks/labels, which are shown along the border and track counts/averages per entry. New log entries drop into the view either from the top as events displaying some text (for example an error message), or from the side as activities represented by circles of different sizes (according to some notion of size for that activity).
But instead of using my poor English to describe this, here is the end result of my efforts so far, just to keep everybody interested.
I hope your machine already has Ruby. If not, find the setup instructions for your platform and follow them. I am on a Mac, so I ended up installing Homebrew to get the latest version of Ruby (but that’s a whole different article).
Set up a Cassandra/DSE ring with a few nodes and, once you have the gltail application running, generate some activity on it to fill up the logs. You can find the instructions for DSE setup and running demos on the DataStax web site. If you already have a working Cassandra ring, just try it with those nodes.
You can download gltail from https://github.com/Fudge/gltail. Just do a
git clone https://github.com/Fudge/gltail
and you are all set. To ensure you have all the required packages, make sure you follow the instructions in https://github.com/Fudge/gltail/blob/master/README.
I also ended up tweaking one file on my Mac (not sure whether that will be needed for you, but you may want to do it if your Ruby has trouble finding files).
diff --git a/bin/gl_tail b/bin/gl_tail
index a0ffe2f..4d831ad 100755
--- a/bin/gl_tail
+++ b/bin/gl_tail
@@ -74,6 +74,8 @@ elsif !File.exist?(file)
   exit
 end
 
+$LOAD_PATH.unshift(File.dirname(__FILE__) + '/../lib')
+
 require File.dirname(__FILE__) + '/../lib/gl_tail.rb'
 
 if defined? @print
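To illustrate what the added line does, here is a minimal sketch of how $LOAD_PATH affects require. The temporary directory and the file name demo_helper.rb are made up for this example and are not part of gltail.

```ruby
require 'tmpdir'

# Hypothetical layout: a temporary "lib" directory holding a single file.
Dir.mktmpdir do |root|
  lib_dir = File.join(root, 'lib')
  Dir.mkdir(lib_dir)
  File.write(File.join(lib_dir, 'demo_helper.rb'), "DEMO_HELPER_LOADED = true\n")

  # Without lib_dir on the load path, a bare require cannot find the file.
  begin
    require 'demo_helper'
  rescue LoadError
    puts 'demo_helper not found before unshift'
  end

  # Same idea as the patch: put the lib directory at the front of the load path.
  $LOAD_PATH.unshift(lib_dir)
  require 'demo_helper'
  puts "loaded: #{DEMO_HELPER_LOADED}"
end
```

Putting the directory at the front with unshift (rather than appending it) means files in that directory win over same-named files elsewhere on the load path.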
To allow gltail to read the log files and parse them, there are a few customizations needed as described below.
The gl_tail.yaml I used is attached here. The file should be in the directory where you start bin/gl_tail from.
The servers section defines the list of machines to ping for log files. Each server definition starts with the server name to use, and contains the source for the log files. In this case I chose to just copy the log files down from the servers for illustration purposes, but you can specify the host, user, password and command used to extract those files via ssh from the remote nodes. This is also where the parser for the log file is selected, as well as some additional options like colors (RGB + transparency).
The config section contains some application-level options like the window size and the bounce setting. Play with that last one: with false, the balls just fly off the side; with true, they bounce off a wall and drop into a neat funnel.
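As a rough sketch of the two kinds of server entries, the snippet below loads a small servers section with Ruby's YAML library and prints where each log would come from. The host name, user, password and remote log path are placeholders, and the remote keys (host, user, password, command) are an assumption modelled on the sample configuration shipped with gltail, so check them against the README before relying on them.

```ruby
require 'yaml'

# Hypothetical servers section: one local file and one remote host.
# All names and credentials below are placeholders.
config = YAML.safe_load(<<~CONFIG)
  servers:
    node1:
      source: local
      files: ./test/system1.log
      parser: cassandra
      color: 0.2, 1.0, 0.2, 1.0
    node2:
      host: cassandra2.example.com
      user: loguser
      password: changeme
      command: tail -f -n0
      files: /var/log/cassandra/system.log
      parser: cassandra
      color: 0.2, 1.0, 0.4, 1.0
CONFIG

# Describe each entry: remote if it names a host, local otherwise.
summary = config['servers'].map do |name, opts|
  where = opts['host'] ? "ssh #{opts['user']}@#{opts['host']}" : 'local'
  "#{name}: #{where} -> #{opts['files']} (#{opts['parser']})"
end
puts summary
```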
For each side of the window, there is a section that defines the size of the border, and what blocks/labels are represented in which order. For each block there are a number of options, such as color, the size, whether to clear old values and some more. Plenty of stuff to play with. Most importantly this is where new blocks would be added in case the parser has additional activities or events to add.
# Only use spaces to indent, not tabs
servers:
    server1:
        source: local
        files: ./test/system1.log
        parser: cassandra
        color: 0.2, 1.0, 0.2, 1.0
    server2:
        source: local
        files: ./test/system2.log
        parser: cassandra
        color: 0.2, 1.0, 0.4, 1.0
    server3:
        source: local
        files: ./test/system3.log
        parser: cassandra
        color: 0.2, 1.0, 0.6, 1.0
    server_errors:
        source: local
        files: ./test/system4.log
        parser: cassandra
        color: 0.2, 1.0, 0.8, 1.0
config:
    dimensions: 800x600
    min_blob_size: 0.004
    max_blob_size: 0.04
    highlight_color: orange
    bounce: true
    left_column:
        size: 20
        alignment: -0.99
        blocks:
            errors:
                order: 0
                size: 10
                show: total
                color: 1.0, 0.0, 0.0, 0.0
            dropped messages:
                order: 1
                size: 5
                show: total
                color: 1.0, 0.0, 0.0, 0.0
            warnings:
                order: 2
                size: 10
                show: total
                color: 1.0, 1.0, 0.0, 0.0
            compacted:
                order: 3
                size: 5
                show: total
                color: 1.0, 1.0, 0.2, 0.0
            flushing:
                order: 3
                size: 5
                show: total
                color: 1.0, 1.0, 0.2, 0.0
    right_column:
        size: 20
        alignment: 0.99
        blocks:
            dse servers:
                order: 0
                size: 20
                auto_clean: false
                show: total
            cassandra servers:
                order: 1
                size: 20
                auto_clean: false
                show: total
            loads:
                order: 2
                size: 20
                auto_clean: false
                show: total
            schemas:
                order: 2
                size: 20
                auto_clean: false
                show: total
resolver:
    reverse_ip_lookups: true
    reverse_timeout: 0.5
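Since a parser refers to blocks by name, a typo in a block name is easy to miss. The sketch below is one possible sanity check, not something gltail provides: it loads a trimmed-down configuration, collects the block names from every column section, and reports names that a parser would use but the configuration does not define. It assumes the column sections sit under config, and the list of used names (including the deliberate typo 'flushed') is made up for the example.

```ruby
require 'yaml'

# Trimmed-down copy of the column layout; only the block names matter here.
config = YAML.safe_load(<<~CONFIG)
  config:
    bounce: true
    left_column:
      blocks:
        errors:
          order: 0
        warnings:
          order: 2
        flushing:
          order: 3
    right_column:
      blocks:
        dse servers:
          order: 0
        schemas:
          order: 2
CONFIG

# Collect every block name defined in any *_column section.
columns = config['config'].select { |key, _| key.end_with?('_column') }
defined_blocks = columns.flat_map { |_, column| column['blocks'].keys }

# Names a parser would pass as :block => '...' ('flushed' is a deliberate typo).
used_blocks = ['errors', 'warnings', 'flushed', 'schemas']

missing = used_blocks - defined_blocks
puts "undefined blocks: #{missing.inspect}"
```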
The most interesting piece is of course the parser for the log file. That file is located in the folder lib/gl_tail/parsers and defines what is extracted from the log file. I wrote my own little first version of this, which is included here.
# cassandra.rb - OpenGL visualization of your server traffic
# Copyright 2013 Sven Delmas
#
# Licensed under the GNU General Public License v2 (see LICENSE)
#

# Parser which handles cassandra/system.log (standard log setup) from Apache Cassandra
class CassandraParser < Parser
  def parse( line )
    # main line parse
    _, priority, thread, date, time, fileName, lineNumber, message = /(\S+) \[(\S+)\] (\S+) (\S+) (\S+) \(line (\S+)\) (.*)/.match(line).to_a

    if message
      # We got a message, so let's parse it for interesting stuff
      _, cassandra_version = /Cassandra version: (\S+)/.match(message).to_a
      _, dse_version = /DSE version: (\S+)/.match(message).to_a
      _, dropped_messages = /(\S+) messages dropped in last/.match(message).to_a
      _, compacted = /Compacted (\S+) sstables to/.match(message).to_a
      _, load_endpoint, load = /Endpoint (\S+) state changed LOAD = (\S+)/.match(message).to_a
      _, schema_endpoint, schema = /Endpoint (\S+) state changed SCHEMA = (\S+)/.match(message).to_a
      _, flushing = /Completed flushing (\S+)/.match(message).to_a

      # Activities, one per matching log line
      if dropped_messages
        add_activity(:block => 'dropped messages', :name => server.name)
      end

      if compacted
        # Block name has to match the 'compacted' block in gl_tail.yaml
        add_activity(:block => 'compacted', :name => server.name)
      end

      if flushing
        add_activity(:block => 'flushing', :name => server.name)
      end

      # Events to pop up
      if cassandra_version
        server_name = server.name + '=' + cassandra_version
        server_message = server.name + ' started version: ' + cassandra_version
        add_event(:block => 'cassandra servers', :name => server_name, :message => server_message, :update_stats => true, :color => [0.0, 1.0, 0.0, 0.0])
      end

      if dse_version
        server_name = server.name + '=' + dse_version
        server_message = server.name + ' started version: ' + dse_version
        add_event(:block => 'dse servers', :name => server_name, :message => server_message, :update_stats => true, :color => [0.0, 1.0, 0.0, 0.0])
      end

      if schema_endpoint
        schema_name = schema_endpoint
        add_activity(:block => 'schemas', :name => schema_name)
      end

      # Errors and warnings show up as events carrying the full message
      add_event(:block => 'errors', :name => server.name, :message => message, :update_stats => true, :color => [1.0, 0.0, 0.0, 0.0]) if priority == "ERROR"
      add_event(:block => 'warnings', :name => server.name, :message => message, :update_stats => true, :color => [1.0, 1.0, 0.0, 0.0]) if priority == "WARN"
    end
  end
end
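The regular expressions can be tried outside gltail before wiring them into a parser. The following sketch lifts three of them out of the parser and runs them over a few lines. The sample lines are hand-written to fit the expressions and are not real Cassandra output; the constant names are made up for the example.

```ruby
# Same expressions as in the parser, lifted out so they can be tried in irb.
LINE_RE    = /(\S+) \[(\S+)\] (\S+) (\S+) (\S+) \(line (\S+)\) (.*)/
DROPPED_RE = /(\S+) messages dropped in last/
FLUSH_RE   = /Completed flushing (\S+)/

# Hypothetical sample lines, written to fit the expressions above.
samples = [
  'INFO [FlushWriter:1] 2013-05-01 10:15:00,123 Memtable.java (line 305) Completed flushing /tmp/demo-Data.db',
  'WARN [ScheduledTasks:1] 2013-05-01 10:15:05,456 MessagingService.java (line 658) 12 READ messages dropped in last 5000ms',
  'this line does not match'
]

results = samples.map do |line|
  _, priority, thread, _date, _time, _file, _line_no, message = LINE_RE.match(line).to_a
  next nil unless message # the main expression did not match this line

  {
    priority: priority,
    thread: thread,
    # DROPPED_RE captures the token right before "messages", whatever it is.
    dropped: DROPPED_RE.match(message).to_a[1],
    flushed: FLUSH_RE.match(message).to_a[1]
  }
end

results.each { |result| p result }
```

The match(...).to_a idiom is what keeps this tidy: a failed match returns nil, nil.to_a is an empty array, and every captured variable simply ends up nil.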
As you can see, the main purpose of this code is to define regular expressions that will detect and extract specific information. Once the information is extracted, it can easily be added to a particular block (including a newly defined block in case you are adventurous). Right now the code will extract the DSE and Cassandra version, errors, warnings, dropped messages, schema changes, flush events, and compaction events. I played with a few other log messages, but picking the right messages to extract is nontrivial, as the display may get overloaded quickly. There are of course more possibilities as to what to track:
- DSE Auditing: with the proper setup of log4j appenders, it would be possible to track logins, queries, inserts, and deletes
- Specific errors and warnings
- Endpoint load reports
- Addition or removal of nodes
- Whatever else pops up in the logs that you may need to know about
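As one example from this list, endpoint load reports could be turned into activities whose size follows the reported load. The sketch below is self-contained: StubParser is a stand-in for gltail's Parser class that only records calls, and LoadReportSketch and parse_message are hypothetical names. The :size option is an assumption based on how the parsers bundled with gltail appear to size their activities, and the sample message is hand-written to fit the expression.

```ruby
# Stand-in for gltail's Parser base class so the sketch runs on its own;
# it only records the calls instead of drawing anything.
class StubParser
  attr_reader :activities

  def initialize
    @activities = []
  end

  def add_activity(options)
    @activities << options
  end
end

class LoadReportSketch < StubParser
  # Same expression the Cassandra parser uses for LOAD state changes.
  LOAD_RE = /Endpoint (\S+) state changed LOAD = (\S+)/

  def parse_message(message)
    _, endpoint, load_value = LOAD_RE.match(message).to_a
    return unless endpoint

    # Assumption: the load value is numeric and can serve as the blob size.
    add_activity(:block => 'loads', :name => endpoint, :size => load_value.to_f)
  end
end

sketch = LoadReportSketch.new
# Hypothetical message text, written to fit LOAD_RE.
sketch.parse_message('Endpoint /10.0.0.1 state changed LOAD = 123456.0')
sketch.parse_message('unrelated message')
p sketch.activities
```

In a real parser the body of parse_message would sit inside parse next to the other checks, and the 'loads' block would need to exist in gl_tail.yaml.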
You will probably not use this to do line-by-line analysis; that is not what gltail is made for. What I like about this display method (besides that it’s just plain cool) is that with the right extraction, one can see how certain events correlate. For example, there may be a lot of connection drops while a lot of compaction is going on. It can also quickly alert you to specific errors in a way that is visual and tracks the occurrences.
I hope you find this little overview interesting, or maybe even useful. It’s just to show that there are a lot of fun projects out there that can be used with Cassandra.