Technology
July 8, 2019

DataStax Bulk Loader Pt. 5 — Counting

Brian Hess
$ dsbulk count -k dsbulkblog -t iris_with_id
$ dsbulk count -k dsbulkblog -t iris_with_id --stats.modes global
$ dsbulk count -k dsbulkblog -t iris_with_id -stats global
Operation directory: /tmp/logs/COUNT_20190314-171517-238903.
total | failed | rows/s | mb/s | kb/row | p50 ms | p99ms | p999ms
  150 |      0 |    400 | 0.00 |   0.00 |  18.68 | 18.74 |  18.74
Operation COUNT_20190314-171517-238903 completed successfully in 0 seconds.
150
$ dsbulk count -k dsbulkblog -t iris_with_id --log.verbosity 0
150
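With `--log.verbosity 0`, the row count is the only thing printed on standard output, so a script can capture it directly. The following is a minimal sketch of that idea; `expect_count` is a hypothetical helper name (not part of dsbulk), and the expected value of 150 is taken from the example above.

```shell
# Hypothetical helper: compare a captured row count against an expected value.
# $1 = actual count (as printed by dsbulk), $2 = expected count.
# Returns 0 on match, 1 on mismatch, 2 if $1 is not a plain integer.
expect_count() {
  case "$1" in
    ''|*[!0-9]*)
      echo "not a row count: '$1'" >&2
      return 2
      ;;
  esac
  if [ "$1" -eq "$2" ]; then
    echo "OK: $1 rows"
  else
    echo "MISMATCH: expected $2, got $1" >&2
    return 1
  fi
}

# Usage (assumes dsbulk is installed and the dsbulkblog keyspace exists):
#   rows=$(dsbulk count -k dsbulkblog -t iris_with_id --log.verbosity 0)
#   expect_count "$rows" 150
```

The integer check is there because a failed run may print something other than a number; treating that as a distinct error code keeps it from being confused with a genuine mismatch.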
$ dsbulk count -k dsbulkblog -t iris_with_id --log.verbosity 0 --stats.mode hosts
/127.0.0.1:9042 150 100.00
$ dsbulk count -k dsbulkblog -t iris_with_id --log.verbosity 0 --stats.mode ranges
-9223372036854775808 -9223372036854775808 150 100.00
$ cqlsh -e "CREATE TABLE dsbulkblog.iris_clustered(id INT, petal_length DOUBLE, petal_width DOUBLE, sepal_length DOUBLE, sepal_width DOUBLE, species TEXT, PRIMARY KEY ((species), id))"
$ dsbulk load -url /tmp/dsbulkblog/iris.csv -k dsbulkblog -t iris_clustered
$ dsbulk count -k dsbulkblog -t iris_clustered --log.verbosity 0 --stats.mode partitions
'Iris-virginica' 50 33.33
'Iris-versicolor' 50 33.33
'Iris-setosa' 50 33.33
Total rows per host:
/127.0.0.1:9042 150 100.00
Total rows per token range:
-9223372036854775808 -9223372036854775808 150 100.00
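The per-partition lines shown above have a simple shape (partition key, row count, percentage), which makes them easy to post-process. As a sketch under that assumption, the hypothetical helper `sum_partition_rows` below adds up the per-partition counts so they can be compared with the table total. It assumes the partition lines come first and stops at the first `Total rows per ...` heading, as in the output above.

```shell
# Hypothetical helper: sum per-partition row counts read from stdin.
# Assumed line shape: <partition key> <row count> <percentage>.
# The count is read as the second-to-last field, so a partition key that
# contains spaces does not shift it. Processing stops at the first
# "Total rows per ..." heading so host and range lines are not counted.
sum_partition_rows() {
  awk '
    /^Total rows per/ { exit }
    NF >= 3 && $(NF-1) ~ /^[0-9]+$/ { total += $(NF-1) }
    END { print total + 0 }
  '
}

# Usage (assumes dsbulk is installed and iris_clustered is loaded):
#   dsbulk count -k dsbulkblog -t iris_clustered --log.verbosity 0 \
#     --stats.mode partitions | sum_partition_rows
```

For the three species partitions of 50 rows each, the sum should match the 150 rows reported for the table as a whole.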
$ dsbulk count -query "SELECT id FROM dsbulkblog.iris_with_id WHERE petal_width = 2 ALLOW FILTERING"
Operation directory: /tmp/logs/COUNT_20190314-171916-543786.
total | failed | rows/s | mb/s | kb/row | p50 ms |  p99ms | p999ms
    6 |      0 |     18 | 0.00 |   0.00 | 130.81 | 131.07 | 131.07
Operation COUNT_20190314-171916-543786 completed successfully in 0 seconds.
6
$ dsbulk count -query "SELECT id FROM dsbulkblog.iris_with_id WHERE petal_width = 2 AND token(id) > :start AND token(id) <= :end ALLOW FILTERING"