
Introducing DataStax Enterprise Graph

By Robin Schumacher, SVP and Chief Product Officer, April 12, 2016

Today we announced an exciting new component of DataStax Enterprise (DSE): DSE Graph, which is a scale-out graph database used to manage complex and highly connected data. DSE Graph is a critical part of the largest upcoming release in our company’s history that includes new versions of our server, management/monitoring, and development tools. 

Although we will be providing more news and some great talks on DSE Graph at our European Summit being held in London on April 19th-20th, let me give you some more insight into why we’ve announced DSE Graph now and what you can expect in the weeks to come.

Why Graph?

First, let me remind you why we acquired Aurelius back in 2015 and why we believe graph database support is such an important addition to DSE.

There’s little argument that today’s applications that must deliver real-time value at scale (what we call cloud applications [1]) are intensely multi-faceted. For example, a modern retail cloud application includes modules such as product catalogs, user profile management, fraud detection, recommendation and personalization engines, shopping cart, clickstream/log analysis, and others.

It’s becoming increasingly common for these components to have distinct data model support requirements. Because of this, a database that provides adaptive data management (or multi-model) functionality will deliver a simpler and more agile solution for quickly bringing cloud applications to market.

Our customers have been asking us to solve their multi-data model needs for a while now, and we’ve answered with our next version of DSE, which has built-in multi-model capabilities that provide support for key-value, tabular, JSON / document, and graph data models.

Having graph as part of the DSE platform now enables us to serve not only the lower and middle parts of today’s data model continuum where data complexity and relationships are concerned, but also the highest end, so we can support the parts of a cloud application that need to manage complex and highly connected data.

[Figure: the multi-data-model continuum]

And what parts or use cases of an application are solved with a graph database? While a variety of graph-shaped business problems are commonly addressed through a combination of batch and/or ETL processes, the standard issues encountered by our customers today need a real-time response and include:

  • Master Data Management (e.g. Customer 360): A graph is the best model for critical customer and business data, along with their relationships, that are consolidated across business units and then queried and maintained by various transactional and BI business applications.
  • Recommendation and Personalization: Relevant and personalized recommendations and other customizations for a user can be best identified in a large graph of other users and entity interactions. A graph is well suited to help recommend products, next actions, or advertising based on a user’s information, past behavior, and interactions.
  • IoT, Network Asset Management and Monitoring: A graph is a good model for managing network assets (with their properties or configurations) and how they relate to each other over time. A graph can be used to manage and monitor the network, optimize resource allocation, detect and fix problems, etc. This can also include IoT use cases where the assets are devices or machines that generate time-series data (status records, event data, etc.).
  • Security Management and Fraud Detection: In a complex and highly interrelated network of users, entities, transactions, events, and interactions, a graph database can help determine which entity, transaction, or interaction is fraudulent, poses a security risk, or is a compliance concern.
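To make the recommendation use case a little more concrete, here is a minimal sketch in plain Python of the kind of question a graph answers naturally: "what did people who bought the same things as me also buy?" It is an in-memory illustration only and does not use DSE Graph or any DataStax API; the purchase data, the `PURCHASES` name, and the `recommend` function are all hypothetical and chosen for this example.

```python
from collections import Counter

# Hypothetical purchase graph: each user vertex maps to the set of product
# vertices it is connected to by a "bought" edge. Illustrative data only.
PURCHASES = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"laptop", "monitor", "keyboard"},
    "dave": {"phone"},
}


def recommend(user, purchases, limit=3):
    """Suggest products bought by users who share purchases with `user`.

    Walks user -> product -> other user -> other product, and weights each
    candidate product by how many purchases the two users have in common.
    """
    owned = purchases.get(user, set())
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # no shared purchases, so no path through this user
        for product in items - owned:
            scores[product] += overlap
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:limit]]


if __name__ == "__main__":
    print(recommend("alice", PURCHASES))
```

In a real graph database the same walk would be expressed as a traversal over stored vertices and edges rather than a loop over a dictionary, which is what lets it run in real time against far larger data sets.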

A Sneak Peek at DSE Graph

How does DSE Graph solve the business problems associated with these and similar use cases? Let’s take a quick look at the technology under the hood of DSE Graph and find out.

DSE Graph is built on the foundation of three open source projects. The first is Apache TinkerPop™.

TinkerPop is the standard for open source graph computing frameworks and is in use by every mainstream graph database today. It enables operational database and data analytic systems to offer graph computing capabilities to their users.

Part of TinkerPop is Gremlin, which is the standard language for graph databases. What SQL is to an RDBMS, Gremlin is to graph. Not only does DataStax make full use of TinkerPop in DSE Graph, but we are heavy contributors to the project.

Next is the Titan graph database. We used Titan as a model for DSE Graph; however, we’ve gone far beyond Titan’s basic scale-out capabilities. So much so, in fact, that we estimate that 90% of DSE Graph is completely new code and 10% is Titan-inspired. That said, because of its reliance on TinkerPop, DSE Graph is compatible with Titan, which means that existing Titan users can migrate their application code to DSE Graph [2].

Then comes Cassandra. DSE Graph utilizes an enterprise-certified version of Apache Cassandra™ for its persistent datastore and inherits all of Cassandra’s key benefits including constant uptime, write/read/active-everywhere functionality, linear scalability, predictable low-latency response times, and operational maturity. To that foundation, DSE Graph adds other performance-enhancing capabilities that include an adaptive query optimizer, locality-driven graph data partitioner, distributed query execution engine, and various graph-specific index structures.

Next comes integration with all of DSE’s enterprise-class features. This rounds out the complete enterprise graph solution and includes advanced security protection, built-in analytics and enterprise search functionality, visual management, monitoring, and development/driver tooling.

Such integration is not superficial but is baked in throughout the platform. For example, you can easily run OLAP tasks on graph data using our Apache Spark-based DSE Analytics, and employ the same Gremlin language you use for standard transactional work. There’s no special development work or languages like SparkSQL required; everything is handled transparently by DSE.

We’ve also got you covered on the management and development tooling sides. DataStax OpsCenter will provide visual management and monitoring support for DSE Graph, and we’re bringing out a new web-based development solution that will help you visually interact with and query DSE Graph databases (and all other aspects of DSE down the road). Finally, all of the DataStax drivers have been updated to support DSE Graph, which means you can use one connector for CQL, Gremlin, SparkSQL, etc.  

There’s more (e.g. expert support, software lifecycle management, etc.), but I think you understand why we’re so excited to have DSE Graph be a part of our enterprise DBMS platform and why we believe it’s a hand-in-glove fit for the use cases I described earlier.

More to Come

As we get closer to the release of DSE Graph and our next major version of DSE, we’ll be providing more information in upcoming posts. However, if you’d like to learn more now, please visit our DSE Graph webpage. You’ll find papers, FAQs, videos, and much more on graph technology and DSE Graph in particular.

If you have any other questions that those materials don’t answer, please be sure to contact us.

[1] We define a cloud application as one that consists of many endpoints, including browsers, mobile devices, and/or machines, that are geographically distributed. Such applications are intensely transactional, always available, and instantaneously and intelligently responsive no matter the number of users or machines using the application.

[2] Data from existing Titan databases can be migrated over to DSE Graph using the platform’s supplied load utilities.


