
About the Author

Jonathan is a co-founder of DataStax. Before DataStax, Jonathan was Project Chair of Apache Cassandra for six years, where he built the Cassandra project and community into an open-source success. Previously, Jonathan built an object storage system based on Reed-Solomon encoding for data backup provider Mozy that scaled to petabytes of data and gigabits per second throughput.

company
2 June, 2021

Why Nutanix Beam Selected Apache Pulsar over Apache Kafka

Authors: Jonathan Ellis

Apache Pulsar™ is used by hundreds of companies to solve distributed messaging problems at scale. Some of these use cases are well-publicized, like Splunk’s or Verizon Media’s or…

developers
27 January, 2021

Why Apache Pulsar as a Service is Essential to the Modern Data Stack

Authors: Jonathan Ellis

Messaging has been on DataStax’s radar for several years. A significant motivator for this is the increasing popularity of microservice-based architectures. Briefly, microservice…

developers
24 February, 2020

The Future of Databases

Authors: Jonathan Ellis
Cloud, DSE Graph

DataStax Co-Founder Jonathan Ellis on the Future of Databases: Last month at Data Day Texas, DataStax Co-Founder and CTO Jonathan Ellis shared his predictions for what we’ll see in…

company
17 July, 2019

Momentum, Change, and Moving to the Future

Authors: Billy Bosworth, Jonathan Ellis

For several years DataStax has worked with customers in hybrid and multi-cloud environments. Recently, our roadmap has intensified and accelerated in response to customer demands.…

company
8 November, 2018

DataStax and the Cassandra Community

Authors: Jonathan Ellis
Community

DataStax’s contributions to the Cassandra community cover a wide range of needs, all oriented towards your success in building Cassandra-based applications. I think of these as the…

company
25 September, 2018

5 Lessons in Distributed Databases

Authors: Jonathan Ellis

A few weeks ago I had the pleasure of speaking at the Distributed Data Summit in San Francisco, a conference exploring the latest developments in Apache Cassandra and other distributed…

developers
10 May, 2016

Materialized View Performance in Cassandra 3.x

Authors: Jonathan Ellis
Apache Cassandra™

Materialized views (MV) landed in Cassandra 3.0 to simplify common denormalization patterns in Cassandra data modeling. This post will cover what you need to know about MV…

developers
20 July, 2015

Cassandra 2.2

Authors: Jonathan Ellis

Cassandra 2.2 is now generally available, including the following highlights: Microsoft Windows is now ready for production deployments. JSON data can now be inserted, updated,…

developers
16 July, 2014

Cassandra 2.1: now over 50% faster

Authors: Jonathan Ellis

Besides improvements to compaction and repair, 2.1 brings dramatic improvements to the core read and write paths. The two most important changes were: Adding response grouping to…

developers
15 July, 2014

Off-heap memtables in Cassandra 2.1

Authors: Jonathan Ellis

Moving data structures off of the Java heap to native memory is important to keep up with datasets that continue to grow, while the JVM stays stuck at heap sizes of about 8GB. As of…

developers
12 June, 2014

Cassandra architecture and performance, mid 2014

Authors: Jonathan Ellis

The impending release of Cassandra 2.1 is a good time to look at how Cassandra is doing against the distributed NoSQL competition. This is an update of my summary of the top…

developers
11 June, 2014

How not to benchmark Cassandra: a case study

Authors: Jonathan Ellis
Apache Cassandra™

I recently wrote about how not to benchmark Cassandra and some of the principles involved in benchmarking Cassandra and other databases correctly. Let’s take a look at how to apply…

developers
31 May, 2014

Cassandra vs failures, latency edition

Authors: Jonathan Ellis

A common misconception is that masterless databases like Cassandra are designed to tolerate network partitions, which are quite possible but relatively uncommon within a single data…

developers
4 February, 2014

How not to benchmark Cassandra

Authors: Jonathan Ellis

As Cassandra continues to increase in popularity, it's natural that more people will benchmark it against systems they're familiar with as part of the evaluation process.…

developers
27 January, 2014

Improving compaction in Cassandra with cardinality estimation

Authors: Jonathan Ellis

Wasteful Bloom filter allocation: Compaction is the process whereby Cassandra merges its log-structured data files to evict obsolete or deleted rows. These data files (sstables) are…

developers
4 November, 2013

Pluggable metrics reporting in Cassandra 2.0.2

Authors: Jonathan Ellis

Guest post by Chris Burroughs. Starting in 1.1, Apache Cassandra® began exposing its already bountiful internal metrics using the popular Metrics library. The number of metrics has…

developers
1 November, 2013

Cassandra 2.0.1, 2.0.2, and a quick peek at 2.0.3

Authors: Jonathan Ellis
Releases

The first two minor releases after Cassandra 2.0.0 contained many bug fixes, but also some new features and enhancements. For the benefit of those who don't read the CHANGES…

developers
3 October, 2013

Rapid read protection in Cassandra 2.0.2

Authors: Jonathan Ellis

Rapid read protection allows Cassandra to tolerate node failure without dropping a single request. We designed it for 2.0, but it took some extra time to get the corner cases worked…

developers
3 September, 2013

What’s under the hood in Cassandra 2.0

Authors: Jonathan Ellis

The headlining features in 2.0 are lightweight transactions, CQL enhancements, and triggers. But 2.0 also features a lot of internal optimizations and improvements! Performance…

developers
1 September, 2013

Why Cassandra Doesn’t Need Vector Clocks

Authors: Jonathan Ellis

One of the notable features of Amazon's 2007 Dynamo paper was the use of vector clocks for conflict resolution. However, the two most prominent systems designed by engineers who…

developers
23 July, 2013

Lightweight transactions in Cassandra 2.0

Authors: Jonathan Ellis

Background: When discussing the tradeoffs between availability and consistency, we say that a distributed system exhibits strong consistency when a reader will always see the most…

developers
5 June, 2013

Does CQL support dynamic columns / wide rows?

Authors: Jonathan Ellis

The transition to CQL has been tough for some people who were used to the existing Thrift-based data model. A common misunderstanding is that CQL does not support dynamic columns or…

13 January, 2013

2012 in review: Performance

Authors: Jonathan Ellis

Four years ago, well before starting DataStax, I evaluated the then-current crop of distributed databases and explained why I chose Cassandra. In a lot of ways, Cassandra was the…

developers
2 January, 2013

Cassandra 1.2.0 released

Authors: Jonathan Ellis

The new year is here, and so is Cassandra 1.2.0! Key improvements include: Virtual nodes, which improve the granularity of capacity increases and dramatically improve repair and…
