DataStax Developer Blog



Archive



Welcome to the DataStax Tech Blog Archive. Here’s where you can find the latest insights and technical articles from top experts and fellow peers on distributed systems, data management strategies, Star Wars/Star Trek debates, and best practices for building cloud applications that are always on, scale effortlessly, and deliver instant insight.

Be sure to check out our hours of free training and downloads on DataStax Academy. For industry topics and general DataStax news, head over to our company blog. Get the latest developer news and updates at the DataStax Developer Blog.

Graph Storytelling with Studio 2.0.0

By Bob Briody - June 13, 2017

One sign of great data visualization is that you can quickly and accurately interpret the provided information without having to think much about the mechanics of the visualization itself. In Studio 2.0.0 we added a few features to the Graph View that enable this type of seamless storytelling.


DSE 5.1 Resource Manager Part 2 – Process Security

By Jacek Lewandowski - June 6, 2017

DSE Resource Manager comes with a customizable implementation of the mechanism used to control the driver and executor lifecycles. In particular, we provide an alternative to the default mechanism that allows processes to be run as separate system users. Read this post to learn how this affects the security of your DSE cluster, how it can be configured, and how you can verify what it actually does. We also include a step-by-step guide that demonstrates how it works.


DSE 5.1 Resource Manager, Part 1 – Network Connections Security

By Jacek Lewandowski - May 30, 2017

DSE Resource Manager is a custom version of the Spark Standalone cluster manager. It provides the functionality of a Spark Master when running Apache Spark™ applications with DSE. Since the introduction of Spark in DSE 4.5, DSE Analytics has enhanced the open source Spark Master implementation with: automatic management of the Spark Master and Spark Worker lifecycles; Apache Cassandra®-based high availability; distributed and fault-tolerant storage of Spark Master recovery data; and pain-free configuration for client applications. In 5.1, the introduction of the DSE Resource Manager adds even more to our custom integration, providing greater ease of use, security, and stability.


From CFS to DSEFS

By Piotr Kołaczkowski - May 23, 2017

Cassandra File System (CFS) is the default distributed file system in the DataStax Enterprise platform in versions 2.0 to 5.0. Its primary purpose is to support Hadoop and Spark workloads with temporary Hadoop-compatible storage. In DSE 5.1, CFS has been deprecated and replaced with a much improved DataStax Enterprise File System (DSEFS). DSEFS is available as an option in DSE 5.0, and was made the default distributed file system in DSE 5.1.


DataStax Drivers Fluent APIs for DSE Graph are out!

By Kevin Gallardo - May 16, 2017

Following the DataStax Enterprise 5.1 release, DataStax released its first non-beta versions of the Fluent APIs for DSE Graph. This new feature brings the DataStax Enterprise Drivers into full compatibility with the Apache TinkerPop GLVs, and we included additional functionality to make developing graph applications even faster and easier.


DSE 5.1: Automatic Optimization of Spark SQL Queries Using DSE Search

By Russell Spitzer - May 9, 2017

DSE Search (Apache Solr based) and DSE Analytics (Apache Spark based) may seem to be designed for orthogonal use cases. Search is optimized for quick, generic searches over your Big Data, while Analytics is optimized for reading your entire dataset for processing. But there is a sweet spot where Analytics can benefit greatly from the enhanced indexing capabilities of Search. Previously in DSE this synergy could only be accessed from the RDD API, but now with DSE 5.1 we bring DSE Search together with DSE Analytics in Spark SQL and DataFrames.


Introducing DSE Graph Frames

By Artem Aliev - May 2, 2017

The DseGraphFrame package provides the Spark-based API for bulk operations and analytics on DSE Graph. It is inspired by Databricks’ GraphFrame library and supports a subset of the Apache TinkerPop Gremlin graph traversal language. It supports reading DSE Graph data into a GraphFrame and writing GraphFrames from any format supported by Spark into DSE Graph.


DSE Continuous Paging Tuning and Support Guide

By Russell Spitzer - April 25, 2017

Continuous paging (CP) is a new method of streaming bulk amounts of records from DataStax Enterprise to the DataStax Java Driver. It is disabled by default and can be activated and used only when running with DataStax Enterprise (DSE) 5.1. When activated, all read operations executed from the CassandraTableScanRDD will use continuous paging. Continuous paging is an opt-in feature and can be enabled as a Spark configuration option.


Property Graph Algorithms

By Marko A. Rodriguez - April 10, 2017

The term property graph has come to denote an attributed, multi-relational graph. That is, it is a graph in which the edges are labeled and both vertices and edges can have any number of key/value properties associated with them.
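
To make the definition concrete, here is a minimal sketch of a property graph as a plain in-memory data structure. It is only an illustration of the definition above, not the DSE Graph or TinkerPop API; the class names (Vertex, Edge, PropertyGraph), the helper methods, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Vertex:
    """A vertex with an id and any number of key/value properties."""
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A labeled, directed edge that also carries key/value properties."""
    label: str
    out_v: int  # id of the vertex the edge leaves
    in_v: int   # id of the vertex the edge enters
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    """Hypothetical container: vertices keyed by id plus a list of edges."""
    vertices: Dict[int, Vertex] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_vertex(self, vid: int, **props: Any) -> Vertex:
        vertex = Vertex(vid, dict(props))
        self.vertices[vid] = vertex
        return vertex

    def add_edge(self, label: str, out_v: int, in_v: int, **props: Any) -> Edge:
        edge = Edge(label, out_v, in_v, dict(props))
        self.edges.append(edge)
        return edge

    def out_neighbors(self, vid: int, label: str) -> List[int]:
        """Ids of vertices reached from vid over edges with the given label."""
        return [e.in_v for e in self.edges if e.out_v == vid and e.label == label]


def build_example() -> PropertyGraph:
    """Build a tiny multi-relational graph with two different edge labels."""
    g = PropertyGraph()
    g.add_vertex(1, name="marko")
    g.add_vertex(2, name="vadas")
    g.add_vertex(3, name="lop")
    g.add_edge("knows", 1, 2, weight=0.5)
    g.add_edge("created", 1, 3, weight=0.4)
    return g
```

Because every edge carries a label, the same pair of vertices can be related in several ways at once, which is what makes the graph multi-relational; the properties dictionaries on both classes are what make it attributed.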



DataStax Enterprise is powered by the best distribution of Apache Cassandra™.

© 2017 DataStax, All Rights Reserved. DataStax, Titan, and TitanDB are registered trademarks of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache Cassandra, Apache, Tomcat, Lucene, Solr, Hadoop, Spark, TinkerPop, and Cassandra are trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.