The Sqoop demo migrates the data from a MySQL table to text files in CFS. The Sqoop data migration demo uses the MySQL database and data from the North American Numbering Plan. This data consists of the area-code (NPA) and telephone number (Nxx) for the USA and Canada.
To run the demo, you need:
To run the Sqoop demo on a single node on a Mac, for example, follow these steps.
Install MySQL and download the JDBC driver for MySQL from the MySQL site. This example uses mysql-connector-java-5.0.8-bin.jar.
Put the connector in a directory included in the Sqoop classpath, such as the resources/sqoop subdirectory of your DataStax Enterprise (DSE) installation.
On the MySQL command line, start the MySQL daemon. For example:
sudo ./mysqld_safe --user=mysql
Start MySQL and create the demo database:
sudo ./mysql
CREATE DATABASE npa_nxx_demo;
Then connect to the database and create the table:
USE npa_nxx_demo;
CREATE TABLE npa_nxx (
npa_nxx_key varchar(16) NOT NULL,
npa varchar(3) DEFAULT NULL,
nxx varchar(3) DEFAULT NULL,
lat varchar(8) DEFAULT NULL,
lon varchar(8) DEFAULT NULL,
linetype varchar(1) DEFAULT NULL,
state varchar(2) DEFAULT NULL,
city varchar(36) DEFAULT NULL,
PRIMARY KEY (npa_nxx_key)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Locate the demos/sqoop directory.
The location of the demo directory depends on your platform:
RPM-Redhat or Debian installations
cd /usr/share/dse-demos/sqoop
Tar distribution, such as Mac
cd <install_location>/demos/sqoop
Populate the table by loading the CSV file in the demos/sqoop directory.
LOAD DATA LOCAL INFILE 'npa_nxx.csv'
INTO TABLE npa_nxx_demo.npa_nxx
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
MySQL returns the following message:
Query OK, 105291 rows affected (1.01 sec) Records: 105291 Deleted: 0 Skipped: 0 Warnings: 0
Start DSE as an analytics node. The method you use depends on your platform:
RPM-Redhat or Debian installations
Edit /etc/default/dse, set HADOOP_ENABLED=1, and run this command:
/etc/init.d/dse service start
Tar distribution, such as Mac
Make the bin directory in the DSE installation directory, the current directory and run the dse cassandra command using the -t option.
cd <install_location>/bin
sudo ./dse cassandra -t
The -t option starts Hadoop and marks the node for Analytics.
Use the dse command in the bin directory to migrate the data from the MySQL table to text files in the CFS directory, npa_nxx.
Note
Use the database username and password or -P instead of --password to be prompted for the database password. If the database account is not password-protected, just omit the password option.
sudo ./dse sqoop import --connect jdbc:mysql://127.0.0.1/npa_nxx_demo \
--username root \
--password <password> \
--table npa_nxx \
--target-dir /npa_nxx
DSE returns this message: INFO mapreduce.ImportJobBase: Retrieved 105291 records.
To run this demo and import data to nodes in a cluster, the database permissions must be granted to the nodes. For example, use the grant ALL command to grant MySQL access to the hosts.