The sstable2json utility converts the on-disk SSTable representation of a column family into a JSON-formatted document. Its counterpart, json2sstable, does the opposite: it converts a JSON representation of a column family into an SSTable that Cassandra can use. Converting SSTables this way can be useful for testing and debugging.
Note
Starting with version 0.7, json2sstable and sstable2json must be run in such a way that the schema can be loaded from system tables. This means that cassandra.yaml must be found in the classpath and refer to valid storage directories.
See also: The Import/Export section of http://wiki.apache.org/cassandra/Operations.
sstable2json converts the on-disk SSTable representation of a column family into a JSON-formatted document.
bin/sstable2json [-f OUT_FILE] SSTABLE
  [-k KEY [-k KEY [...]]] [-x KEY [-x KEY [...]]] [-e]
SSTABLE should be a full path to a column-family-name-Data.db file in Cassandra’s data directory. For example, /var/lib/cassandra/data/Keyspace1/Standard1-e-1-Data.db.
-k allows you to include a specific set of keys. Limited to 500 keys.
-x allows you to exclude a specific set of keys. Limited to 500 keys.
-e enumerates the keys only, instead of dumping the column data.
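The options above can be assembled programmatically, for example when a test harness dumps a handful of rows. The following Python sketch is illustrative only: `build_sstable2json_args` is a hypothetical helper, not part of Cassandra, and it assumes the 500-key limit applies to -k and -x separately.

```python
from typing import List, Optional, Sequence

MAX_KEYS = 500  # limit stated for -k and -x; applying it per option is an assumption


def build_sstable2json_args(
    sstable: str,
    include_keys: Sequence[str] = (),
    exclude_keys: Sequence[str] = (),
    out_file: Optional[str] = None,
    enumerate_only: bool = False,
) -> List[str]:
    """Assemble an argument list for bin/sstable2json (hypothetical helper)."""
    if len(include_keys) > MAX_KEYS or len(exclude_keys) > MAX_KEYS:
        raise ValueError("-k and -x each accept at most %d keys" % MAX_KEYS)
    args = ["bin/sstable2json"]
    if out_file is not None:
        args += ["-f", out_file]
    args.append(sstable)
    # Each key gets its own -k or -x flag, as in the usage line.
    for key in include_keys:
        args += ["-k", key]
    for key in exclude_keys:
        args += ["-x", key]
    if enumerate_only:
        args.append("-e")
    return args
```

The resulting list could be passed to `subprocess.run` from the Cassandra installation directory, with cassandra.yaml on the classpath as described in the note above.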
The output of sstable2json for standard column families is:
{
  ROW_KEY:
  {
    [
      [COLUMN_NAME, COLUMN_VALUE, COLUMN_TIMESTAMP, IS_MARKED_FOR_DELETE],
      [COLUMN_NAME, ... ],
      ...
    ]
  },
  ROW_KEY:
  {
    ...
  },
  ...
}
The output for super column families is:
{
  ROW_KEY:
  {
    SUPERCOLUMN_NAME:
    {
      deletedAt: DELETION_TIME,
      subcolumns:
      [
        [COLUMN_NAME, COLUMN_VALUE, COLUMN_TIMESTAMP, IS_MARKED_FOR_DELETE],
        [COLUMN_NAME, ... ],
        ...
      ]
    },
    SUPERCOLUMN_NAME:
    {
      ...
    },
    ...
  },
  ROW_KEY:
  {
    ...
  },
  ...
}
Row keys, column names, and column values are written as the hex representation of their byte arrays. In the actual output, line breaks appear only between row keys.
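Because everything is hex-encoded, the dump is hard to read directly. The Python sketch below decodes a standard column family dump back into text. It rests on assumptions: the sample row is hand-written rather than real tool output, each row key is assumed to map to a list of four-element column arrays, and the underlying bytes are assumed to be UTF-8 text (binary values would need different handling).

```python
import json

# Hypothetical one-row sample in the assumed shape of a standard column family dump.
sample = '{"726f7731": [["6e616d65", "616c696365", 1294532915068, false]]}'


def decode_hex(text):
    """Turn a hex string back into text, assuming the bytes are UTF-8."""
    return bytes.fromhex(text).decode("utf-8", errors="replace")


def decode_rows(document):
    """Decode row keys, column names, and column values of a dump."""
    rows = {}
    for row_key, columns in json.loads(document).items():
        rows[decode_hex(row_key)] = [
            (decode_hex(name), decode_hex(value), timestamp, deleted)
            for name, value, timestamp, deleted in columns
        ]
    return rows


print(decode_rows(sample))
```

Timestamps and the deletion flag are left untouched, since only keys, names, and values are hex-encoded.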
json2sstable converts a JSON representation of a column family into an SSTable that Cassandra can use.
bin/json2sstable -K KEYSPACE -c COLUMN_FAMILY JSON SSTABLE
JSON should be a path to the JSON file.
SSTABLE should be a full path to a column-family-name-Data.db file in Cassandra’s data directory. For example, /var/lib/cassandra/data/Keyspace1/Standard1-e-1-Data.db.
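For testing, the JSON input can be generated from readable data. The Python sketch below hex-encodes row keys, column names, and values, and builds the json2sstable argument list from the usage line above. The helper names are hypothetical, and the input shape (each row key mapped to a list of four-element column arrays, mirroring the sstable2json output) is an assumption that should be checked against a real dump from your Cassandra version.

```python
import json
from typing import Dict, List, Tuple


def encode_hex(text: str) -> str:
    """Hex-encode the UTF-8 bytes of a string."""
    return text.encode("utf-8").hex()


def rows_to_json(rows: Dict[str, List[Tuple[str, str, int]]]) -> str:
    """Serialize {row_key: [(name, value, timestamp), ...]} in the assumed dump shape."""
    document = {
        encode_hex(key): [
            # The final False marks the column as not deleted.
            [encode_hex(name), encode_hex(value), timestamp, False]
            for name, value, timestamp in columns
        ]
        for key, columns in rows.items()
    }
    return json.dumps(document)


def build_json2sstable_args(
    keyspace: str, column_family: str, json_path: str, sstable_path: str
) -> List[str]:
    """Assemble an argument list for bin/json2sstable (hypothetical helper)."""
    return ["bin/json2sstable", "-K", keyspace, "-c", column_family,
            json_path, sstable_path]
```

Write the string returned by `rows_to_json` to a file, then pass that file's path as the JSON argument.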
The sstablekeys utility is shorthand for sstable2json with the -e option. Instead of dumping all of a column family’s data, it dumps only the keys.
bin/sstablekeys SSTABLE
SSTABLE should be a full path to a column-family-name-Data.db file in Cassandra’s data directory. For example, /var/lib/cassandra/data/Keyspace1/Standard1-e-1-Data.db.