DataStax Enterprise 2.0 Documentation

Checking Imported Data

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.

DataStax Enterprise provides the Cassandra Query Language (CQL), a SQL-like language whose syntax is similar to the DDL, DML, and SELECT syntax in SQL. CQL shortens the learning curve for users coming from relational databases, because all object creation and data access operations use familiar syntax. To confirm that a Sqoop import succeeded, you can use the CQL utility. Alternatively, you can use the Cassandra Command Line Interface (CLI) to run the same types of queries.

Using CQL to Check Imported Data

You can use CQL to check the data from the example that imports data into a column family. For example, to check the number of rows imported into the column family, start the CQL utility and run the following statements:

./cqlsh

use newKS;

select count(*) from npa_nxx_cf limit 200000;

The number of records appears.

 count
--------
 105291

To retrieve specific rows by key:

select * from npa_nxx_cf where key IN (626794,212524,512538);

Records appear for Pasadena, New York, and Austin.

 KEY    | city     | lat   | linetype | lon    | npa | nxx | state
--------+----------+-------+----------+--------+-----+-----+-------
 626794 | Pasadena | 34.17 |        L | 118.13 | 626 | 794 |    CA
 212524 | New York | 40.71 |        L | 074.01 | 212 | 524 |    NY
 512538 |   Austin | 30.27 |        L | 097.74 | 512 | 538 |    TX

Validating Import Results in a Cluster

Use this dse command to view the results in the Cassandra File System:

./dse hadoop fs -ls /npa_nxx

Depending on the number of DSE Analytic nodes and the task tracker configuration, the output lists a number of files in the directory named part-m-0000n, where n ranges from 0 to one less than the number of tasks that were executed as part of the Hadoop job.
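
The naming pattern can be illustrated with a short Python sketch. The helper name expected_part_files is hypothetical and is not part of DSE or Hadoop. The sketch assumes map-only output, five-digit zero-padded indexes as in the file names shown here, and numbering that starts at 0, so a job with N tasks would typically produce N files.

```python
def expected_part_files(num_tasks):
    """Return the file names a map-only job with num_tasks tasks would typically write.

    Assumption: indexes start at 0 and are zero-padded to five digits.
    """
    return [f"part-m-{i:05d}" for i in range(num_tasks)]


# Example: a job that ran four map tasks
print(expected_part_files(4))
# ['part-m-00000', 'part-m-00001', 'part-m-00002', 'part-m-00003']
```

Comparing this list with the directory listing is one way to spot a missing output file.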

The contents of these files can be viewed using this command:

./dse hadoop fs -cat /npa_nxx/part-m-00000

Vary the task number in the file name (the 00000) to view the other files. The output looks something like this:

361991,361,991,27.73,097.40,L,TX,Corpus Christi
361992,361,992,27.73,097.40,L,TX,Corpus Christi
361993,361,993,27.73,097.40,L,TX,Corpus Christi
361994,361,994,27.73,097.40,L,TX,Corpus Christi
361998,361,998,27.79,097.90,L,TX,Agua Dulce
361999,361,999,27.80,097.40,W,TX,Padre Island National Seashore
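
As a quick sanity check on output like the rows above, the following Python sketch parses the comma-separated lines and flags rows that look malformed. It is an illustration only: the function name check_rows is hypothetical, and the column order (key, npa, nxx, lat, lon, linetype, state, city) and the rule that the key equals npa followed by nxx are assumptions inferred from the sample rows, not documented guarantees.

```python
import csv
import io

# Assumed column order, inferred from the sample rows and the CQL output
FIELDS = ["key", "npa", "nxx", "lat", "lon", "linetype", "state", "city"]


def check_rows(text):
    """Return (row_count, bad_rows) for comma-separated output text.

    A row is flagged as bad if it does not have exactly eight fields
    or if its key is not npa followed by nxx (an assumed rule).
    """
    count = 0
    bad = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        count += 1
        record = dict(zip(FIELDS, row))
        if len(row) != len(FIELDS) or record["key"] != record["npa"] + record["nxx"]:
            bad.append(row)
    return count, bad


SAMPLE = """\
361991,361,991,27.73,097.40,L,TX,Corpus Christi
361998,361,998,27.79,097.90,L,TX,Agua Dulce
361999,361,999,27.80,097.40,W,TX,Padre Island National Seashore
"""

print(check_rows(SAMPLE))  # (3, [])
```

To run the check on real data, you could redirect the output of the -cat command to a local file and pass the contents of that file to check_rows.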