The Distributed Data Show

The Distributed Data Show is your weekly source for the latest news and technical expertise to help you succeed in building large-scale distributed systems. Brought to you by the DataStax Developer Advocate team, we go in-depth with DataStax engineers and special guests from the broader data community. New episodes each Tuesday.

Subscribe here to get weekly blog updates! DDS is also available on YouTube, iTunes, and your favorite podcast provider.


Featured Episodes


Building CICD Pipelines in the Modern Age with Christopher Bradford

Many DSE users have very long upgrade cycles due to time and complexity concerns. Using the CICD methodology, Christopher Bradford has taken up the challenge of making the upgrade path both faster and lower risk. Today we dive in and take a look at what he has been up to.

Highlights!
1:00 Common issues with normal operations.
1:40 What is CICD?
4:00 Testing Considerations.
6:05 What does all this really do?
9:49 Where can I do this?


Cassandra Data Modeling Tools

In this episode, Jeff and Adron have a quick topical discussion of some tools they're using to get work done with CQL and databases in general. Adron discusses using JetBrains DataGrip and what it has enabled him to do. Jeff then interjects with some additional thoughts and asks: what if Cassandra is not your only database? Adron elaborates on how DataGrip works with many other databases, so someone facing work across a wide spectrum of sources can tackle it all with DataGrip. Finally, Jeff and Adron get into what they'd like to see in next-generation IDEs and what tooling they'd like to have to get the job done!


A Developer's Journey with Cristina Veale

David chats with Cristina about her non-traditional path into the tech sector, and the journey that lies ahead as she takes on her new role of Developer Advocate at DataStax.

Highlights!
0:16 - David asks Cristina how she became a developer.
1:35 - David asks if this journey led Cristina into Developer Relations.
3:30 - David asks Cristina how she came to be a Developer Advocate with DataStax.
5:19 - David asks Cristina about her DataStax bootcamp experience, and her transition from frontend to distributed databases.
9:20 - What's next for Cristina on the Developer Advocate team at DataStax.


Spring Boot: From the Trenches with Frank Moley

Spring Boot is a powerful framework that helps developers build applications fast. In this episode, Frank explains how he first came to Spring and what its good and bad sides are.


Application Development with Graph Data with Dr. Denise Gosnell and Dave Bechberger

Denise Gosnell interviews Dave Bechberger live at Data Day Texas about the challenges of developing Graph-based applications, recommended approaches, and the resources available for developers new to Graph.

Highlights!
00:42 Challenges for application development with Graph
01:21 Performance issues that dev teams run into when issuing Graph "queries"
02:17 What is branching factor?
03:08 Top 3 recommendations when data modeling with Graph
04:30 Debate over edges or vertices, when to use each?
05:25 "Even in the relational world, everyone loves 3rd normal form until it doesn't work"
06:04 Where do development teams gain skills using Graph databases?
07:21 What kind of resources are recommended when learning the Gremlin query language?
08:32 Dave's releasing a new book, "Graph Databases in Action"
09:29 Advice on building Graph-based applications for production


Kafka and Cassandra with Tim Berglund

Jeff and Tim talk about the most common questions developers have about Kafka and three great ways to combine Kafka with Cassandra in your applications.

Highlights:
0:30 - Debating the great innovations of human history.
1:38 - Questions developers ask Tim about Kafka: there are lots of specific questions about tuning producer/consumer throughput, but also further up the stack in terms of stream processing APIs and bridging synchronous and asynchronous interactions.
4:17 - Jeff's first fail with Kafka: publishing just a value to a topic configured as a key-value topic.
5:50 - Best practices for working with microservices: having services communicate through durable logs of immutable events is a great pattern. When these services also expose synchronous APIs, they often need to query for other data. Logs like Kafka don't do well with complex queries like full text search, geospatial, etc., and that's where incorporating Cassandra and DataStax Enterprise makes sense.
8:31 - Pattern 1 for combining Kafka and Cassandra: a service consumes events from a stream, performs computation, and produces new events. The service may provide an API and need to grab data from Cassandra.
9:33 - Pattern 2: Cassandra-centric view. Use Kafka as a pipe for data ingest into Cassandra. This is great when you want to leverage Cassandra's multi-DC replication.
10:29 - Pattern 3: Cassandra into Kafka. It is possible to do change data capture (CDC) from Cassandra and other databases via connectors plugged into Kafka Connect.
11:32 - Wrapping up: the challenges of finding time to code when leading DevRel teams, having outside interests and hobbies.


Recent Episodes


Upgrading Cassandra Clusters with Carlos Rolo

Patrick McFadin talks with “Cassandra Archaeologist” Carlos Rolo of Pythian about best practices for upgrades in Apache Cassandra clusters.

Highlights!
0:15 - Patrick welcomes Carlos to the episode and explains why he’s known as the “Cassandra archaeologist”.
2:10 - Carlos shares his tips for upgrades: read the documentation, make sure you’re doing backups and testing them, validate that your drivers work on a new version.
3:18 - Doing a test upgrade is recommended, especially from an application perspective.
4:00 - Upgrading across multiple version numbers takes some work; you need to go a step at a time and upgrade SSTables.
5:14 - Online upgrades are the norm - this is a major reason why people choose Cassandra.
6:23 - Changes affecting upgrades to Cassandra 4.0 - check the release notes!
7:13 - The most difficult upgrades are the ones involving transition away from the legacy Thrift API.
9:35 - The final word - make sure to read the docs when doing upgrades.


Searching For Success with Alice Lottini and Roberto Carrera

This week the EMEA DataStax crew takes over the DDS to share feedback about the DataStax Conference and the announcements made during the keynotes. It was also an occasion to highlight the talk "Timeseries at scale" given by Alice and Patrick.

Highlights!
0:22 - Presentation of Alice
0:53 - Presentation of Patrick
1:31 - Feedback about the talk "TimeSeries at scale" - (Patrick) Requirements + Use Cases
2:28 - How did the talk go? - (Alice) Overall feedback and attendee profiles + questions
4:10 - How do you feel about this conference? Have you met interesting folks already?
5:46 - What did you think about the keynote and related announcements?


Cassandra at Netflix and Version 4 Wishlist with Vinay Chella

Host Aleks Volochnev sits down with Netflix Cloud Database Architect Vinay Chella to discuss Full Query Logging, how Sidecar makes ops people happy, and why Netflix already plans to migrate to version 4. A lot is discussed, so stay tuned!

Highlights!
00:00 Welcome
00:25 Introduction
01:53 Vinay's Talk I @ Accelerate
02:00 Full Query Logging
03:00 Vinay's Talk II @ Accelerate
03:20 What are you working on right now?
03:25 Sidecar
05:40 Performance Monitoring
06:40 Netflix's Technical Blog
07:05 Version Four
08:05 Async Internode Messaging
08:40 Zero Copy Streaming
10:12 Chaos Engineering vs Cassandra
12:15 Working with the Apache Community
14:10 Open-source contributions
15:00 How to become a Cassandra Contributor
15:58 Favourite Bug
17:13 Numbers?!
18:17 Data Density
19:20 Thank you!


Constellation Tech Preview

Adron and Kat (Kathryn Erickson) kick off this episode with a focused look at the DataStax Accelerate 2019 Conference! Adron and Kat elaborate on DataStax Desktop and also AppStax! The conversation wraps up with details around DataStax Enterprise Graph and the future direction of that technology. Afterwards, Amanda joins Matthias Broecheler for more discussion of the Desktop and AppStax technology. Matthias explains the focus, the ideas behind it, and the core features of AppStax that will change how development is done!


Apache Cassandra 4.0 Improvements with TheLastPickle Guys

TheLastPickle, DataStax Accelerate, and exciting updates coming in Apache Cassandra 4.0. Cedrick talks with Jon Haddad and Alex Dejanovski from TheLastPickle about their presentations at DataStax Accelerate, the Apache Cassandra tools managed by TLP, and the new updates coming with Cassandra 4.0.


Performance Heaven with Intel Optane + DSE with Donnie Roberson

Donnie Roberson of the DataStax Partner team joins the show to talk about the amazing performance results observed in running DataStax Enterprise 6 on Intel's latest generation hardware, including the Xeon processors and Optane DCPMM, and when and where you might be able to get your hands on this technology.

Highlights!
0:00 - Jeff welcomes Donnie to the show and gets distracted by the drone races in the Accelerate expo hall.
1:54 - Donnie came to Cassandra from the Hadoop world and worked on the DataStax support team before making a move into partner management and working on cool projects like the DataStax Docker images.
2:57 - The DSE 6.0 release in 2018 introduced a Thread Per Core architecture to allocate token ranges to specific cores within each Cassandra node.
4:17 - Intel has recently come out with the newest generation Xeon processors and Optane DCPMM. These two technologies are "a match made in performance heaven".
5:08 - Intel gave DataStax access to a private lab with 4 machines with this new hardware - 364GB RAM, 80 core Intel 2nd generation scalable processors, 2 Optane DCPMM 1.2 GB drives per machine.
5:51 - The workloads tested included 90% reads / 10% writes, 50% reads / 50% writes, 10% reads / 90% writes. On the write-heavy test they observed 448 K ops/s (112 K ops/s per node). This is 2x performance over current NVMe drives. Read-heavy workloads saw up to 5x performance improvement, with sub-millisecond latencies.
7:18 - This hardware is available from Intel now; it will also be coming to Google Cloud Platform within the next couple of quarters.
8:11 - Jeff starts making up architectures on the fly involving Intel, GCP and DataStax Constellation.
9:10 - The highlight of the Accelerate conference for Donnie is the opportunity to meet Cassandra users and DataStax customers.


To watch more videos, please visit the Distributed Data Show YouTube Channel!


Cassandra is amazing for scalability. Masterless (any node can be coordinator) architecture helps to add a new node really really easy comparing to other DBs. As Vinay says Simian Army can't do any harm to Cassandra.

Erol Shaban

User, YouTube

Amazing. I will start looking at Cassandra Code.

Abhisar Mohapatra

User, YouTube