Relational database management systems are flexible and often very performant; they ruled the database scene throughout the 1980s and 1990s. But like any tool, they have limits. NoSQL databases began to appear in the 2000s to fill the gaps left by relational database management systems.
The moniker “NoSQL” originally started as a hashtag, and while it has become the de facto name for this style of system, the term “non-relational” is more accurate: many NoSQL databases embrace SQL (Structured Query Language) because it is such a well-understood, declarative language.
What all NoSQL databases have in common is that they offer features that are difficult to achieve with a traditional relational database management system. Beyond that, they generally fall into one of the following four categories:
1) Key Value Stores
Key value stores offer a simple lookup of a value by its key. Think of a dictionary. Data is arranged by words (the keys) and you can easily get to each word’s definition (the value).
Because of the simplicity of their structure, key value stores are relatively easy to distribute and scale out; they are also highly performant for equality-based searches with simple payloads.
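As a rough illustration of why this structure is easy to distribute, the sketch below spreads keys across several in-memory "nodes" by hashing each key. The class name ShardedKeyValueStore, the node count, and the use of SHA-256 are assumptions made for the example; real key value stores use more sophisticated placement schemes.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key value store that spreads keys across in-memory 'nodes'."""

    def __init__(self, node_count=3):
        # Each node is just a dict here; a real system would use separate servers.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so the same key always lands on the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        # An equality lookup only ever touches the one node that owns the key.
        return self._node_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("apple", "a round fruit")
store.put("zebra", "a striped animal")
print(store.get("apple"))
```

Because a lookup needs nothing but the key, no node has to coordinate with any other to answer it, which is what makes scaling out comparatively simple.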
2) Wide Columnar Stores
A wide columnar store takes the idea of a key value store to the next level. Data is still distributed by a key, but the value is a structured set of rows and columns. This tabular payload allows for storage of more complex, structured data. Depending on the implementation, there may or may not be a strict schema applied to this payload.
Wide columnar stores offer many of the semantics of relational databases while retaining the ability to distribute the data and scale out the system.
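The layout described above can be sketched as a nested mapping: a key selects a partition, and the value is a small table of rows and columns. The names WideColumnTable, "partition key" and "clustering key" are assumptions borrowed for illustration, and this toy applies no schema to the columns.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide column layout: partition key -> rows -> columns."""

    def __init__(self):
        # partition key -> {clustering key -> {column name -> value}}
        self.partitions = defaultdict(dict)

    def upsert(self, partition_key, clustering_key, **columns):
        # Create the row if needed, then set or overwrite the given columns.
        row = self.partitions[partition_key].setdefault(clustering_key, {})
        row.update(columns)

    def read_partition(self, partition_key):
        # Return the rows of one partition, ordered by clustering key.
        rows = self.partitions.get(partition_key, {})
        return [(ck, rows[ck]) for ck in sorted(rows)]


readings = WideColumnTable()
readings.upsert("sensor-1", "2024-01-01T00:05", temperature=21.5)
readings.upsert("sensor-1", "2024-01-01T00:00", temperature=21.0, humidity=40)
readings.upsert("sensor-2", "2024-01-01T00:00", temperature=18.2)

for timestamp, columns in readings.read_partition("sensor-1"):
    print(timestamp, columns)
```

Only the partition key decides where data lives, so each partition could sit on a different machine while still exposing a table-like view of its own rows.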
3) Document Databases
Document databases store each record as a self-describing document, typically in JSON or a similar format, whose fields can be nested and can vary from one document to the next. Developers often like working with document databases because this flexibility lets them match the payload stored in the database to the specific objects they interact with in their code. This allows them to avoid the “impedance mismatch” often experienced when mapping tabular data stored in a database onto the in-memory objects used by applications.
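To make the point about matching stored payloads to application objects concrete, here is a minimal sketch in which a nested object is saved as a single JSON document and read back unchanged. The Customer and Address classes and the in-memory collection dict are hypothetical stand-ins, not the API of any particular document database.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Customer:
    name: str
    addresses: list = field(default_factory=list)


# Hypothetical in-memory "collection": document id -> JSON text.
collection = {}


def save(doc_id, customer):
    # The whole object graph becomes one document; no join tables are needed.
    collection[doc_id] = json.dumps(asdict(customer))


def load(doc_id):
    raw = json.loads(collection[doc_id])
    addresses = [Address(**a) for a in raw["addresses"]]
    return Customer(name=raw["name"], addresses=addresses)


original = Customer("Ada", [Address("London", "UK"), Address("Paris", "FR")])
save("customer:1", original)
restored = load("customer:1")
print(restored == original)
```

In a relational design the addresses would usually live in a separate table and be joined back to the customer on every read; here the document has the same shape as the object.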
4) Graph Databases
The term “relational” in a relational database management system does not refer to the fact that two things are related; instead, it refers to the fact that such a system is based on relational algebra. A mathematical “relation” is a set of “tuples”, which corresponds closely to a table and its rows.
While a relational database management system can relate tables to one another through joins, a graph database takes this concept to the next level. Graph databases are based on mathematical graph theory, and they represent data as a set of vertices and edges (terms borrowed from geometry). Data stored in a graph database can be thought of as “pre-joined” because all the connections between entities are laid out ahead of time.
Graph databases excel at handling use cases where much of the value lies in how the data is related. Examples include fraud detection, network analysis, and web connectivity (Google co-founder Larry Page famously developed the PageRank algorithm, together with Sergey Brin, based on graph connectivity).
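Because their data is so heavily interconnected, graph databases are the hardest of the NoSQL types to distribute across multiple nodes: however the graph is split, some edges will cross node boundaries, and traversals that follow them must hop between machines.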
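The vertex-and-edge model, and the kind of traversal a fraud-detection query might run, can be sketched with an adjacency list and a breadth-first search. The Graph class, the within_hops method and the sample account data are all hypothetical; real graph databases expose dedicated query languages for this.

```python
from collections import defaultdict, deque


class Graph:
    """Toy undirected graph stored as an adjacency list."""

    def __init__(self):
        # vertex -> set of directly connected vertices
        self.edges = defaultdict(set)

    def add_edge(self, a, b):
        # Store the connection in both directions so it is "pre-joined".
        self.edges[a].add(b)
        self.edges[b].add(a)

    def within_hops(self, start, max_hops):
        """Breadth-first search: map each reachable vertex to its hop count."""
        distance = {start: 0}
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            if distance[vertex] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in self.edges.get(vertex, ()):
                if neighbor not in distance:
                    distance[neighbor] = distance[vertex] + 1
                    queue.append(neighbor)
        del distance[start]
        return distance


g = Graph()
g.add_edge("alice", "card-1")
g.add_edge("bob", "card-1")
g.add_edge("bob", "phone-7")
g.add_edge("carol", "phone-7")

# Which entities are linked to alice through at most two connections?
print(g.within_hops("alice", 2))
```

Each hop simply follows stored edges rather than computing a join. It also hints at the distribution problem: if neighboring vertices lived on different machines, every hop could become a network call.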
Which NoSQL Database is Right for Me?
Choosing the right NoSQL database involves understanding the capabilities of each tool and matching those capabilities to the requirements of the application. The idea of “polyglot persistence”—using a variety of tools to store a system’s data based on the specific needs of the application—has become prominent, especially in systems that demand extreme performance at extreme scale. Having a toolbelt full of NoSQL databases and the know-how to use the right one for the right job is a powerful concept.