DataStax Enterprise 3.0 Documentation

Getting information about the sqoop command


Use the help option of the sqoop import command to get online help on Sqoop command-line options. For example, on Mac OS X:

cd <install_location>/bin

./dse sqoop import --help

The help output for usage is:

usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]

Cassandra arguments

The help output for Cassandra is:

--cassandra-column-family <cf>
  Sets the target Cassandra column family for the import

--cassandra-create-schema
  If specified, Cassandra keyspace and column family are created.
  The column family must not exist if this flag is specified.

--cassandra-keyspace <keyspace>
  Import to <keyspace> in Cassandra

--cassandra-partitioner <partitioner>
  The partitioner class to use for writing to the Column Family.
  The default is RandomPartitioner.

--cassandra-password <passwd>
  Cassandra user password, if necessary

--cassandra-replication-factor <repFactor>
  The replication factor to use for the keyspace. Requirements:
  1) --cassandra-create-schema must be specified, and 2) the
  keyspace must not already exist. Implies a Simple replication
  strategy. Defaults to 1 if neither cassandra-replication-factor
  nor cassandra-strategy-options is specified.

--cassandra-row-key <keyCol>
  Specifies which input column to use as the row key

--cassandra-strategy-options <stratOptions>
  Strategy options apply to the keyspace if cassandra-create-schema
  is specified and the keyspace does not already exist. Implies a
  Network topology replication strategy. This option and
  cassandra-replication-factor are mutually exclusive.

--cassandra-thrift-host <thriftHost>
  Comma separated list of Cassandra thrift host(s)

--cassandra-thrift-port <thriftPort>
  Cassandra thrift port. The default is 9160.

--cassandra-username <user>
  Cassandra user name, if necessary
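
For example, a command along the following lines imports a SQL table into a Cassandra column family. This is a minimal sketch: the JDBC URL, credentials, table, keyspace, column family, and key column names are placeholders, so substitute values from your environment.

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --cassandra-keyspace demo_ks \
     --cassandra-column-family demo_cf \
     --cassandra-row-key id \
     --cassandra-thrift-host 127.0.0.1 \
     --cassandra-create-schema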

Other arguments

The help output for other arguments is:

Common arguments

--connect <jdbc-uri>                         Specify JDBC connect string

--connection-manager <class-name>            Specify connection manager class name

--connection-param-file <properties-file>    Specify connection parameters file

--driver <class-name>                        Manually specify JDBC driver class to use

--hadoop-home <dir>                          Override $HADOOP_HOME

--help                                       Print usage instructions

 -P                                          Read password from console

--password <password>                        Set authentication password

--username <username>                        Set authentication username

--verbose                                    Print more information while working
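
For example, the common arguments can be combined as follows. The JDBC URL and driver class shown are placeholders for whatever source database you use; -P prompts for the password on the console instead of exposing it on the command line.

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --driver com.mysql.jdbc.Driver \
     --username someuser -P \
     --table npa_nxx --verbose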

Import control arguments

--append                        Imports data in append mode

--as-avrodatafile               Imports data to Avro data files

--as-sequencefile               Imports data to SequenceFiles

--as-textfile                   Imports data as plain text (default)

--boundary-query <statement>    Set boundary query for retrieving max and min value
                                of the primary key

--columns <col,col,col...>      Columns to import from table

--compression-codec <codec>     Compression codec to use for import

--direct                        Use direct import fast path

--direct-split-size <n>         Split the input stream every 'n' bytes when importing
                                in direct mode

-e,--query <statement>          Import results of SQL 'statement'

--fetch-size <n>                Set number 'n' of rows to fetch from the database when
                                more rows are needed

--inline-lob-limit <n>          Set the maximum size for an inline LOB

-m,--num-mappers <n>            Use 'n' map tasks to import in parallel

--split-by <column-name>        Column of the table used to split work units

--table <table-name>            Table to read

--target-dir <dir>              HDFS plain table destination

--warehouse-dir <dir>           HDFS parent for table destination

--where <where clause>          WHERE clause to use during import

-z,--compress                   Enable compression
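
For example, the following sketch imports selected columns and rows in parallel to a plain-text HDFS destination; the table, column, and directory names are placeholders.

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --columns "id,npa,nxx" \
     --where "npa > 200" \
     --split-by id -m 4 \
     --target-dir /sqoop/npa_nxx \
     --as-textfile -z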

Incremental import arguments

--check-column <column>        Source column to check for incremental change

--incremental <import-type>    Define an incremental import of type 'append' or
                               'lastmodified'

--last-value <value>           Last imported value in the incremental check column
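
For example, an append-mode incremental import might look like the following sketch, assuming an integer id column whose highest previously imported value was 1000 (both are placeholders).

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --incremental append \
     --check-column id \
     --last-value 1000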

Output line formatting arguments

--enclosed-by <char>               Sets a required field enclosing character

--escaped-by <char>                Sets the escape character

--fields-terminated-by <char>      Sets the field separator character

--lines-terminated-by <char>       Sets the end-of-line character

--mysql-delimiters                 Uses MySQL's default delimiter set:
                                   fields: ,  lines: \n  escaped-by: \
                                   optionally-enclosed-by: '

--optionally-enclosed-by <char>    Sets a field enclosing character
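
For example, to write tab-separated, newline-terminated records with values optionally enclosed in double quotes (the table and directory names are placeholders):

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --target-dir /sqoop/npa_nxx \
     --fields-terminated-by '\t' \
     --lines-terminated-by '\n' \
     --optionally-enclosed-by '\"'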

Input parsing arguments

--input-enclosed-by <char>               Sets a required field encloser

--input-escaped-by <char>                Sets the input escape character

--input-fields-terminated-by <char>      Sets the input field separator

--input-lines-terminated-by <char>       Sets the input end-of-line char

--input-optionally-enclosed-by <char>    Sets a field enclosing character

Hive arguments

--create-hive-table                         Fail if the target hive table
                                            exists

--hive-delims-replacement <arg>             Replace Hive record \0x01 and row
                                            delimiters (\n\r) from imported string
                                            fields with user-defined string

--hive-drop-import-delims                   Drop Hive record \0x01 and row delimiters
                                            (\n\r) from imported string fields

--hive-home <dir>                           Override $HIVE_HOME

--hive-import                               Import tables into Hive (Uses Hive's
                                            default delimiters if none are set.)

--hive-overwrite                            Overwrite existing data in the Hive table

--hive-partition-key <partition-key>        Sets the partition key to use when
                                            importing to hive

--hive-partition-value <partition-value>    Sets the partition value to use when
                                            importing to hive

--hive-table <table-name>                   Sets the table name to use when importing
                                            to hive

--map-column-hive <arg>                     Override the mapping of specific
                                            columns to Hive types
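
For example, a Hive import can be sketched as follows; the source table and Hive table names are placeholders, and --hive-overwrite replaces any existing data in the target Hive table.

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --hive-import \
     --hive-table npa_nxx_hive \
     --hive-overwrite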

HBase arguments

--column-family <family>    Sets the target column family for the import

--hbase-create-table        If specified, create missing HBase tables

--hbase-row-key <col>       Specifies which input column to use as the row key

--hbase-table <table>       Import to <table> in HBase
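
For example, an HBase import follows the same pattern, assuming an HBase deployment is reachable from Sqoop; the table, column family, and key column names are placeholders.

./dse sqoop import --connect jdbc:mysql://127.0.0.1/demo_db \
     --username someuser --password somepass \
     --table npa_nxx \
     --hbase-table npa_nxx_hbase \
     --column-family cf1 \
     --hbase-row-key id \
     --hbase-create-table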