DataStax Developer Blog

Brisk 1.0 Beta 2 Released

By Kris Hahn - June 20, 2011 | 3 Comments

DataStax has released Brisk 1.0 Beta 2! You can download Brisk from the DataStax web site.

New Features in Brisk 1.0 Beta 2

The following new features have been added in this release:

Feature: Description

BRISK-12: Apache Pig integration. See the DataStax Documentation for more information about using Pig in Brisk.

BRISK-89: Job Tracker failover. See the DataStax Documentation for more information about using the new brisktool movejt command.

BRISK-207: A new Snappy compression codec, built on Google Snappy, is now used internally for automatic CassandraFS block compression.

BRISK-180: Automap Cassandra column families to Hive tables in the Brisk Hive Metastore.

BRISK-152: Add a second HDFS layer in CassandraFS (CFS) for long-term data storage. This is needed because the blocks column family in CFS requires frequent compactions: Hadoop uses it during MapReduce processing to store small files and temporary data, and compaction cleans up that temporary data once it is no longer needed. CFS now exposes two endpoints, cfs:/// and cfs-archive:///. The blocks column family in cfs-archive:/// has compaction disabled to improve performance for static data stored in CFS.
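The split between the two CassandraFS endpoints described under BRISK-152 can be illustrated with a small sketch. The helper names below (cfs_uri, put_command) and the example paths are hypothetical and not part of Brisk; the sketch only builds URIs and a standard Hadoop shell "fs -put" argument list, and assumes the hadoop command on your Brisk node resolves the cfs and cfs-archive schemes.

```python
# Minimal sketch (hypothetical helpers, not part of Brisk): choose a
# CassandraFS endpoint depending on whether the data is short-lived
# job data or static data kept for the long term.

def cfs_uri(path, long_term=False):
    """Return a CassandraFS URI for an absolute path.

    long_term=False -> cfs:///...          (blocks are compacted regularly)
    long_term=True  -> cfs-archive:///...  (compaction disabled on blocks)
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /logs/2011")
    scheme = "cfs-archive" if long_term else "cfs"
    return scheme + "://" + path


def put_command(local_file, path, long_term=False):
    """Build (but do not run) a Hadoop shell command that copies a local file."""
    return ["hadoop", "fs", "-put", local_file, cfs_uri(path, long_term)]


if __name__ == "__main__":
    # Temporary MapReduce output goes to the regular endpoint.
    print(" ".join(put_command("part-00000", "/tmp/job1/part-00000")))
    # Static data that will rarely change goes to the archive endpoint.
    print(" ".join(put_command("2011-06.log", "/logs/2011-06.log", long_term=True)))
```

The commands are returned as argument lists rather than executed, so the sketch can be inspected without a running cluster. How the hadoop command is launched in a given Brisk installation is not covered here.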

Major Fixes in Brisk 1.0 Beta 2

Brisk 1.0 Beta 2 also includes the following major fixes. For details on all fixes in Beta 2, see the Brisk Jira Project Web site:

Issue: Description

BRISK-126: Remove multiple slf4j warnings

BRISK-203: Use batchMutate instead of insert in HiveCassandraOutputFormat

BRISK-219: Cassandra super columns not mapping in Hive

BRISK-220: Improve performance of hadoop fs -ls

CASSANDRA-2683: Compaction issue causing secondary index corruption

Open Issues

For a description of the open issues in Brisk, see the Brisk Jira Project Web site.

About Brisk

Brisk is an open-source Hadoop and Hive distribution developed by DataStax that uses Apache Cassandra for its core services and storage. Brisk provides Hadoop MapReduce capabilities using CassandraFS, an HDFS-compatible storage layer inside Cassandra. Because CassandraFS replaces HDFS, users can run their existing MapReduce jobs on Cassandra’s peer-to-peer, fault-tolerant, and scalable architecture. Brisk also supports dual workloads, so the same cluster of machines can serve both real-time applications and data analytics without moving data between systems.

Brisk is available under the Apache License v2.0, and contains the following components:



Comments

  1. Yigang Chen says:

    hi, is there a download of Brisk 1.0 for windows? All I need is the cassandra based hadoop; do I still need enterprise? or is it the only option?

  2. don caldwell says:

    is there a discussion group for the source version on github? is there a design document?

  3. don caldwell says:

    Brisk Jira Project Web site does not have an administrator to allow one to log in.
