Technology · June 7, 2019

DataStax Bulk Loader Pt. 4 — Unloading

Brian Hess
$ dsbulk unload -url /tmp/unload -k dsbulkblog -t iris_with_id
total | failed | rows/s | mb/s | kb/row | p50 ms |  p99ms | p999ms
  150 |      0 |    232 | 0.01 |   0.05 | 171.44 | 171.97 | 171.97
Operation UNLOAD_20190314-170354-717718 completed successfully in 0 seconds.
$ dsbulk unload -url /tmp/unload -k dsbulkblog -t iris_with_id 2> /dev/null
Operation UNLOAD_20190314-170542-812259 failed: connector.csv.url target directory: /tmp/unload must be empty.
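The second run fails because the target directory still holds the files written by the first run. A minimal sketch of a guard that could sit in front of the command is shown here. The helper name `dir_is_empty` is hypothetical and not part of dsbulk, treating a missing directory as empty is an assumption, and the dsbulk call itself is left commented out so the snippet runs without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dsbulk): succeeds when the path does not
# exist yet (assumed acceptable) or is a directory with no entries at all,
# hidden files included.
dir_is_empty() {
  local dir="$1"
  [ ! -e "$dir" ] && return 0      # nothing there yet
  [ -d "$dir" ] || return 1        # exists but is not a directory
  [ -z "$(ls -A "$dir")" ]         # directory exists: empty listing means empty
}

target=/tmp/unload
if dir_is_empty "$target"; then
  echo "ok: $target is empty, safe to unload"
  # dsbulk unload -url "$target" -k dsbulkblog -t iris_with_id
else
  echo "skip: $target is not empty" >&2
fi
```

Whether to clear the directory or pick a fresh one when the check fails is left to the caller; the sketch only reports the condition.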
$ dsbulk unload -url /tmp/unload -k dsbulkblog -t iris_with_id --connector.csv.fileNameFormat "iris-%0,6d.csv"
$ dsbulk unload -k dsbulkblog -t iris_with_id
$ dsbulk unload -k dsbulkblog -t iris_with_id 2> /dev/null | sed 's/Iris-//g' > /tmp/unload/iris_shortname.csv
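The pipeline above relies on the rows going to standard output when no `-url` is given, with `2> /dev/null` dropping the log messages from stderr. The effect of the `sed` step can be checked without a cluster. In this sketch the function names `sample_rows` and `strip_iris_prefix` are hypothetical, and the sample rows are illustrative lines in an assumed column layout, not captured dsbulk output.

```shell
#!/usr/bin/env bash
# Illustrative rows only: the header and column order are assumptions,
# not output copied from dsbulk.
sample_rows() {
  printf '%s\n' \
    'id,petal_length,petal_width,sepal_length,sepal_width,species' \
    '1,1.4,0.2,5.1,3.5,Iris-setosa' \
    '51,4.7,1.4,7.0,3.2,Iris-versicolor'
}

# Same sed expression as in the pipeline above: remove every "Iris-" prefix.
strip_iris_prefix() {
  sed 's/Iris-//g'
}

sample_rows | strip_iris_prefix
```

Because the substitution is global and not anchored to a column, it would also rewrite any other field containing `Iris-`; that is harmless for this table but worth keeping in mind for other data.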
$ dsbulk unload -k dsbulkblog -t iris_with_id -m "id,species"
$ dsbulk unload -query "SELECT id, species FROM dsbulkblog.iris_with_id"
$ dsbulk unload -query "SELECT id, species FROM dsbulkblog.iris_with_id WHERE Token(id) > :start AND Token(id) <= :end"
$ dsbulk unload -query "SELECT id, species, writetime(species) AS writetime FROM dsbulkblog.iris_with_id"
$ dsbulk unload -query "SELECT id, species FROM dsbulkblog.iris_with_id WHERE id IN (101,102,103,104,105)"
$ dsbulk unload -query "SELECT id, species FROM dsbulkblog.iris_with_id WHERE Token(id) > 0 AND Token(id) < 100000000000000000"
$ cqlsh -e "CREATE TABLE dsbulkblog.iris_with_search (id int PRIMARY KEY, petal_length double, petal_width double, sepal_length double, sepal_width double, species text);"
$ cqlsh -e "CREATE SEARCH INDEX IF NOT EXISTS ON dsbulkblog.iris_with_search"
$ dsbulk load -url /tmp/dsbulkblog/iris.csv -k dsbulkblog -t iris_with_search
$ dsbulk unload -query "SELECT id, petal_length, petal_width, sepal_length, sepal_width, species FROM dsbulkblog.iris_with_search WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'" --executor.continuousPaging.enabled false
$ dsbulk unload -query "SELECT * FROM dsbulkblog.iris_with_search WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'" --executor.continuousPaging.enabled false
Operation directory: /tmp/logs/UNLOAD_20190320-180708-312514
total | failed | rows/s | mb/s | kb/row | p50ms | p99ms | p999ms
   50 |      0 |    162 | 0.01 |   0.05 | 26.80 | 26.87 |  26.87
$ dsbulk unload -k dsbulkblog -t iris_with_id -delim "\t"
$ dsbulk unload -k dsbulkblog -t iris_with_id -nullStrings "N/A"
$ dsbulk unload -k dsbulkblog -t president_birthdates -delim "\t" --codec.date "EEEE MMMM d, y GGGG"