Technology, July 22, 2013

Support CQL3 tables in Hadoop, Pig and Hive

Alex Liu

The first-generation support is built on the first-generation Hadoop Cassandra driver, which works with Thrift column families. We need to use the second-generation Hadoop Cassandra driver to improve queries on composite columns, which CQL3 tables use under the hood.

The second generation uses the second-generation Hadoop Cassandra driver to query CQL3 tables. Basically, it sets the input and output CQL queries and maps the input and output values to Hive data types.
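As a rough illustration of the type-mapping step, the sketch below translates CQL3 column types into Hive type names. The mapping table, the `binary` fallback, and the function names `cql_to_hive_type` and `hive_columns` are assumptions made for this example; they are not taken from CqlStorageHandler, whose actual defaults may differ.

```python
# Hypothetical CQL3 -> Hive type mapping, for illustration only.
# The real storage handler may map some of these types differently.
CQL_TO_HIVE = {
    "ascii": "string",
    "text": "string",
    "varchar": "string",
    "int": "int",
    "bigint": "bigint",
    "boolean": "boolean",
    "float": "float",
    "double": "double",
    "timestamp": "timestamp",
    "blob": "binary",
}


def cql_to_hive_type(cql_type, default="binary"):
    """Return the Hive type name for a CQL3 type name.

    Types missing from the table fall back to `default`
    (an assumption of this sketch, not documented behavior).
    """
    return CQL_TO_HIVE.get(cql_type.strip().lower(), default)


def hive_columns(cql_columns):
    """Build a Hive column list from (name, cql_type) pairs."""
    return ", ".join(
        f"{name} {cql_to_hive_type(cql_type)}" for name, cql_type in cql_columns
    )
```

For example, `hive_columns([("id", "int"), ("body", "text")])` produces a column list that could be placed inside a Hive `CREATE EXTERNAL TABLE` statement.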

All metadata is retrieved from the system tables system.schema_columnfamilies and system.schema_columns.
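A minimal sketch of that metadata lookup, assuming both system tables identify a table by `keyspace_name` and `columnfamily_name`. The helper name `schema_queries` is hypothetical, and the code only builds the CQL statements with `?` bind markers; executing them needs a CQL client, which is left out here.

```python
# The two system tables that hold table-level and column-level metadata.
SCHEMA_TABLES = ("system.schema_columnfamilies", "system.schema_columns")


def schema_queries(keyspace, table):
    """Return (cql, params) pairs that fetch the metadata of one CQL3 table.

    Hypothetical helper: it prepares the statements but does not run them.
    """
    params = (keyspace, table)
    return [
        (
            f"SELECT * FROM {name} "
            "WHERE keyspace_name = ? AND columnfamily_name = ?",
            params,
        )
        for name in SCHEMA_TABLES
    ]
```

The first statement returns the table-level row and the second returns one row per column, which is the information needed to derive a Hive table definition.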

All CQL3 tables have auto-generated Hive tables that use CqlStorageHandler, which has the following parameters

The push-down condition will be implemented in a similar way to the Pig partition filter push-down. We will also expand the default mappings to include collections.
