DataStax Enterprise 2.0 Documentation

Getting Information about the Sqoop Command

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your product version.


Use the --help option of the sqoop import command to get online help on Sqoop command-line options. For example, on the Mac:

cd <install_location>/bin

./dse sqoop import --help

The help output for usage is:

usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]

Cassandra arguments

The help output for Cassandra is:

--cassandra-column-family <cf>
  Sets the target Cassandra column family for the import

--cassandra-create-schema
  If specified, the Cassandra keyspace and column family are created.
  The column family must not already exist when this flag is specified.

--cassandra-keyspace <keyspace>
  Import to <keyspace> in Cassandra

--cassandra-partitioner <partitioner>
  The partitioner class to use for writing to the Column Family.
  The default is RandomPartitioner.

--cassandra-password <passwd>
  Cassandra user password, if necessary

--cassandra-replication-factor <repFactor>
  The replication factor to use for the keyspace. Requirements:
  1) --cassandra-create-schema must be specified, and
  2) the keyspace must not already exist.
  Implies a simple replication strategy. Defaults to 1 if neither
  --cassandra-replication-factor nor --cassandra-strategy-options is
  specified.

--cassandra-row-key <keyCol>
  Specifies which input column to use as the row key

--cassandra-strategy-options <stratOptions>
  Strategy options apply to the keyspace if --cassandra-create-schema
  is specified and the keyspace does not already exist. Implies a
  network topology replication strategy. This option and
  --cassandra-replication-factor are mutually exclusive.

--cassandra-thrift-host <thriftHost>
  Comma-separated list of Cassandra thrift host(s)

--cassandra-thrift-port <thriftPort>
  Cassandra thrift port (default: 9160)

--cassandra-username <user>
  Cassandra user name, if necessary
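
For example, a command that imports a SQL table into a new Cassandra keyspace and column family might look like the following sketch. The connection string, credentials, table, keyspace, column family, and key column names are hypothetical placeholders; --cassandra-create-schema creates the keyspace and column family as part of the import:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/npa_nxx_demo \
                   --username someuser -P \
                   --table npa_nxx \
                   --cassandra-keyspace npa_nxx_ks \
                   --cassandra-column-family npa_nxx_cf \
                   --cassandra-row-key npa_nxx_key \
                   --cassandra-thrift-host 127.0.0.1 \
                   --cassandra-create-schema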

Other arguments

The help output for other arguments is:

Common arguments

--connect <jdbc-uri>                         Specify JDBC connect string

--connection-manager <class-name>            Specify connection manager class name

--connection-param-file <properties-file>    Specify connection parameters file

--driver <class-name>                        Manually specify JDBC driver class to use

--hadoop-home <dir>                          Override $HADOOP_HOME

--help                                       Print usage instructions

 -P                                          Read password from console

--password <password>                        Set authentication password

--username <username>                        Set authentication username

--verbose                                    Print more information while working
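
The common arguments establish the JDBC connection for every import. A minimal sketch of the connection portion of a command, assuming a hypothetical MySQL database and that the MySQL JDBC driver is available to Sqoop; -P prompts for the password on the console rather than exposing it on the command line:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --driver com.mysql.jdbc.Driver \
                   --username someuser -P \
                   --table npa_nxx \
                   --target-dir /npa_nxx_data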

Import control arguments

--append                        Imports data in append mode

--as-avrodatafile               Imports data to Avro data files

--as-sequencefile               Imports data to SequenceFiles

--as-textfile                   Imports data as plain text (default)

--boundary-query <statement>    Set boundary query for retrieving max and min value
                                of the primary key

--columns <col,col,col...>      Columns to import from table

--compression-codec <codec>     Compression codec to use for import

--direct                        Use direct import fast path

--direct-split-size <n>         Split the input stream every 'n' bytes when importing
                                in direct mode

-e,--query <statement>          Import results of SQL 'statement'

--fetch-size <n>                Set number 'n' of rows to fetch from the database when
                                more rows are needed

--inline-lob-limit <n>          Set the maximum size for an inline LOB

-m,--num-mappers <n>            Use 'n' map tasks to import in parallel

--split-by <column-name>        Column of the table used to split work units

--table <table-name>            Table to read

--target-dir <dir>              HDFS plain table destination

--warehouse-dir <dir>           HDFS parent for table destination

--where <where clause>          WHERE clause to use during import

-z,--compress                   Enable compression
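
The import control arguments determine which rows and columns are read from the source and how the work is split across map tasks. A sketch using hypothetical table, column, and directory names, importing a filtered subset with four parallel mappers:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --username someuser -P \
                   --table orders \
                   --columns "order_id,customer_id,total" \
                   --where "total > 100" \
                   --split-by order_id \
                   -m 4 \
                   --as-textfile \
                   --target-dir /sqoop/orders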

Incremental import arguments

--check-column <column>        Source column to check for incremental change

--incremental <import-type>    Define an incremental import of type 'append' or
                               'lastmodified'

--last-value <value>           Last imported value in the incremental check column
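
An incremental import picks up only rows added since a previous run. In the following sketch (names are hypothetical), only rows whose order_id is greater than 1000 are appended to the existing target directory:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --username someuser -P \
                   --table orders \
                   --target-dir /sqoop/orders \
                   --incremental append \
                   --check-column order_id \
                   --last-value 1000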

Output line formatting arguments

--enclosed-by <char>               Sets a required field enclosing character

--escaped-by <char>                Sets the escape character

--fields-terminated-by <char>      Sets the field separator character

--lines-terminated-by <char>       Sets the end-of-line character

--mysql-delimiters                 Uses MySQL's default delimiter set:
                                   fields: ,  lines: \n  escaped-by: \
                                   optionally-enclosed-by: '

--optionally-enclosed-by <char>    Sets a field enclosing character
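
These options control how imported records are formatted when written as text. For example, a tab-delimited import with optionally quoted fields might look like the following sketch (table and directory names are hypothetical):

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --username someuser -P \
                   --table orders \
                   --fields-terminated-by '\t' \
                   --escaped-by '\\' \
                   --optionally-enclosed-by '"' \
                   --target-dir /sqoop/orders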

Input parsing arguments

--input-enclosed-by <char>               Sets a required field encloser

--input-escaped-by <char>                Sets the input escape character

--input-fields-terminated-by <char>      Sets the input field separator

--input-lines-terminated-by <char>       Sets the input end-of-line char

--input-optionally-enclosed-by <char>    Sets a field enclosing character

Hive arguments

--create-hive-table                         Fail if the target hive table
                                            exists

--hive-delims-replacement <arg>             Replace Hive record \0x01 and row
                                            delimiters (\n\r) from imported string
                                            fields with user-defined string

--hive-drop-import-delims                   Drop Hive record \0x01 and row delimiters
                                            (\n\r) from imported string fields

--hive-home <dir>                           Override $HIVE_HOME

--hive-import                               Import tables into Hive (Uses Hive's
                                            default delimiters if none are set.)

--hive-overwrite                            Overwrite existing data in the Hive table

--hive-partition-key <partition-key>        Sets the partition key to use when
                                            importing to hive

--hive-partition-value <partition-value>    Sets the partition value to use when
                                            importing to hive

--hive-table <table-name>                   Sets the table name to use when importing
                                            to hive

--map-column-hive <arg>                     Override mapping for specific column to
                                            hive types.
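
The Hive arguments let the same import also create and load a Hive table. A sketch, assuming Hive is available on the analytics node and using hypothetical names; --hive-drop-import-delims strips Hive's record and row delimiters from imported string fields so they cannot break Hive rows:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --username someuser -P \
                   --table orders \
                   --hive-import \
                   --create-hive-table \
                   --hive-table orders \
                   --hive-drop-import-delims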

HBase arguments

--column-family <family>    Sets the target column family for the import

--hbase-create-table        If specified, create missing HBase tables

--hbase-row-key <col>       Specifies which input column to use as the row key

--hbase-table <table>       Import to <table> in HBase
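
The HBase arguments are standard Sqoop options for importing into HBase rather than Cassandra. A hypothetical sketch, assuming a reachable HBase deployment and placeholder names:

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo \
                   --username someuser -P \
                   --table orders \
                   --hbase-table orders \
                   --column-family cf1 \
                   --hbase-row-key order_id \
                   --hbase-create-table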