Highly Available and Scalable Distributed Database

Apache Cassandra™ is a distributed database that delivers the high availability, performance, and linear scalability today’s most demanding applications require. It offers operational simplicity and effortless replication across cloud service providers, data centers, and geographies, and it can handle petabytes of information and thousands of concurrent operations per second across hybrid cloud environments.


DataStax Commitment to Open Source Cassandra

From its inception, Cassandra has been the premier distributed database on the market, and here at DataStax, we remain committed to continuing that legacy. DataStax offers production-certified Cassandra distributions plus 24x7x365 expert support to ensure all Cassandra users can make the most of this powerful database.


#1 OSS Committer

DataStax has contributed the majority of open-source Cassandra code commits, and we are one of the driving forces behind Apache Cassandra 4.0.


Apache Cassandra Experts

DataStax solutions are developed and updated from the open-source Cassandra project, and the DataStax team has been an integral part of the Cassandra project since its inception.


Open Source Leadership

DataStax provides open-source leadership in other database-related projects (like Apache TinkerPop™) as part of our commitment to open source.

Apache Cassandra™ Architecture

The data management needs of the average large organization have changed dramatically over the last ten years, requiring data architects, operators, designers, and developers to rethink the databases they use as their foundation. The proliferation of large-scale, globally distributed data led to the birth of Apache Cassandra™, one of the world’s most powerful and now most popular NoSQL databases. Read this white paper to learn how Cassandra was born, how it’s evolved, how it operates, and what DataStax Distribution of Apache Cassandra™ adds to the equation.


"High availability is extremely important to us and our users, and that was the first thing that caught our eye with Apache Cassandra and DataStax. High availability with reliable performance is a big win for us."

Daniel Chia

Software Engineer, Coursera

DataStax Distribution of Apache Cassandra

Develop and scale your applications with confidence using DataStax Distribution of Apache Cassandra and support from the Cassandra experts.

Blog

How to do Joins in Apache Cassandra™ and DataStax Enterprise

For years, a critique directed at NoSQL databases was that you couldn't do join queries like those possible in an RDBMS. While this is true for some NoSQL databases, we thought it would be helpful to remind Apache Cassandra™ users that join operations are indeed now possible with Cassandra. There are a couple of ways that you can join Cassandra tables together and query them:

- Use Apache Spark's SparkSQL™ with Cassandra (either open source or in DataStax Enterprise - DSE).
- Use DataStax-provided ODBC connectors with Cassandra and DSE.

In this post we'll first illustrate how to perform SQL joins [1] with Cassandra tables using SparkSQL, and then look at how to use DataStax's ODBC connector to easily create join queries [2] that can be used to create dashboards with BI software like Tableau [3].

Creating Join Queries Using Spark and Cassandra

While you can create your own Cassandra and Spark combination clusters using open source, it's a lot easier to use DSE, as it bundles and certifies Spark with Cassandra as part of its analytics package. To use Spark in DSE, you simply start one or more nodes in analytics mode, and you're ready to roll. DSE ships with a Weather Application Demo that shows how DSE Analytics works; we'll use a couple of the objects in that demo, the monthly and station tables, to illustrate a simple join operation. For more details on how to set up the demo, and to view the much more complicated join queries used in the application, please refer to our online documentation.

The workflow is: start one or more DSE nodes in analytics mode, indicate which keyspace to use (in this case, the "weathercql" keyspace), and then create the join operation with standard SparkSQL syntax. For this example, we join data from the monthly and station tables, store the results in an RDD (resilient distributed dataset) called "results" via the SparkSQL CassandraSQLContext, and then iterate through and print the results. (A reconstructed sketch of these commands appears at the end of this post.)

Creating Join Queries with Cassandra and ODBC

You can create join queries on Cassandra data outside of Spark by using DataStax's free ODBC driver (we also supply an ODBC driver for Spark). This means that any developer, DBA, BI, or ETL tool that has ODBC connectivity can connect to and query data in Cassandra. It is important to note that join operations done with the current ODBC driver should not involve large tables, as the performance may not be acceptable for most queries that target big clusters.

Let's take a look at how this works with one of the most popular BI tools on the market: Tableau. The steps below show a simple way to execute a freehand SQL join query using Tableau [3] and DataStax Enterprise 4.6.

1. First, we create an ODBC connection/datasource to DataStax Enterprise (Fig 1: DataStax Cassandra ODBC Connector).
2. Next, we open Tableau [3] and connect to Cassandra using the ODBC connection created in the previous step (Fig 2: Connecting to Cassandra using ODBC).
3. Then, we code our join query using Tableau's Custom SQL Query [4] editor (Fig 3: Custom SQL against DataStax Enterprise) to create a dashboard that displays the join query's results (Fig 4: Tableau Dashboard).

And Coming Soon... Joins on Steroids with Graph!

We announced our acquisition of Aurelius, the company behind the Titan open source graph database, back in February of this year. If you understand what a graph database can do, then you know that one thing it does very well is handle the traversal of multiple relationships between vertices (entities in an RDBMS world) without any need to create indexes or materialized views to overcome join performance inefficiencies in an RDBMS. In short, a graph database represents the ultimate in joins where ease of use and performance are concerned. Coming soon in DSE will be DSE Graph, which will provide just this type of capability along with multi-model support in DSE.

As an example of how graph can dramatically reduce the complexity of join operations, the comparison below shows a sample RDBMS join query for a recommendation engine application versus how the exact same query is handled in a graph database (Fig 5: RDBMS Join Query vs. Graph Database Join Query). Big difference, wouldn't you say?

Conclusion

Joining Cassandra tables together with SQL-styled queries can be carried out in multiple ways today, with each method being easy to use and code. For more information on creating joins on Cassandra data, please refer to the online documentation. You can find downloads of DSE and our ODBC drivers on our downloads page.

[1] Customers need to consider the costs of creating such ad hoc queries against distributed databases.
[2] Customers need to run thorough query performance assessments when using this option.
[3] Customers can use any Business Intelligence or ETL solution that supports standard JDBC/ODBC.
[4] Custom SQL is for illustrative purposes only. You can use any methodology supported by your BI vendor to create reports or dashboards. ETL jobs can be created in a similar manner using your favorite ETL tool.
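
The command and query listings from the original post are not reproduced above. As a rough sketch of the Spark workflow it describes (start a node in analytics mode, open the DSE Spark shell, point a CassandraSQLContext at the weathercql keyspace, and run the join), the session would look something like this. It assumes a DSE 4.6-era install and the CassandraSQLContext API of the bundled Spark Cassandra connector; the column names in the query are illustrative rather than the demo's exact schema:

    # Start a DSE node with Spark (analytics) mode enabled
    $ bin/dse cassandra -k

    # Open the Spark shell that ships with DSE
    $ bin/dse spark

    // Inside the shell (Scala):
    import org.apache.spark.sql.cassandra.CassandraSQLContext

    val csc = new CassandraSQLContext(sc)   // sc is the shell's SparkContext
    csc.setKeyspace("weathercql")           // query the demo keyspace

    // Join the monthly and station tables, then collect and print the results
    val results = csc.sql(
      "SELECT s.name, m.year, m.month, m.mean " +
      "FROM monthly m JOIN station s ON m.id = s.id")
    results.collect.foreach(println)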

Learn More
Blog

How to Move Data from Relational Databases to DataStax Enterprise / Cassandra using Sqoop

By Robin Schumacher | March 21, 2012

When I'm at conferences, I always have the same conversation. People come up to me who are excited about and sold on using Cassandra, but they want to migrate part or all of a particular RDBMS to Cassandra and they don't know how to go about it. In the past, I've always felt bad that I've never had a great answer for them, but with the release of DataStax Enterprise 2.0, things have gotten much easier.

DataStax Enterprise 2.0 includes support for Sqoop, which is a tool designed to transfer data between an RDBMS and Hadoop. Given that DataStax Enterprise combines Cassandra, Hadoop, and Solr together into one big data platform, you can now use Sqoop to move data not only into a Hadoop system, but into Cassandra as well. Let me show you how it works.

Setting Up

Sqoop works via JDBC, so really the only prerequisite you'll have to deal with is downloading the JDBC driver for your source RDBMS (e.g. Oracle, MySQL, SQL Server, etc.) and putting it in a directory where Sqoop has access to it (we recommend the /sqoop subdirectory of the main DataStax Enterprise installation). For this exercise, I'm going to migrate data from a MySQL database over to DataStax Enterprise, so I downloaded the JDBC driver from the MySQL website, unzipped it, and put it in my /sqoop subdirectory:

    robinsmac:sqoop robin$ pwd
    /Users/robin/dev/dse-2.0/resources/sqoop
    robinsmac:sqoop robin$ ls -l
    total 10296
    -rw-r--r--@  1 robin staff    4132 Mar  2 17:05 CHANGES.txt
    -rw-r--r--@  1 robin staff     719 Mar  2 17:05 DISCLAIMER.txt
    -rw-r--r--@  1 robin staff   15760 Mar  2 17:05 LICENSE.txt
    -rw-r--r--@  1 robin staff     251 Mar  2 17:05 NOTICE.txt
    -rw-r--r--@  1 robin staff    1096 Mar  2 17:05 README.txt
    drwxr-xr-x@  6 robin staff     204 Mar  2 17:05 bin
    drwxr-xr-x@  3 robin staff     102 Mar  2 17:05 conf
    drwxr-xr-x@  3 robin staff     102 Mar  2 17:05 lib
    drwxr-xr-x@ 10 robin staff     340 Oct  3 04:44 mysql-connector-java-5.1.18
    -rw-r--r--@  1 robin staff  789885 Oct  3 04:44 mysql-connector-java-5.1.18-bin.jar
    -rw-r--r--@  1 robin staff 3834947 Mar  5 16:42 mysql-connector-java-5.1.18.tar.gz
    -rw-r--r--@  1 robin staff  604406 Mar  2 17:05 sqoop-1.4.1-dse-20120216.054945-6.jar

Migrating Schema and Data

The MySQL source table that I'm migrating to DataStax Enterprise has a little over 100,000 rows in it and looks like this:

    CREATE TABLE `npa_nxx` (
      `npa_nxx_key` varchar(16) NOT NULL,
      `npa` varchar(3) DEFAULT NULL,
      `nxx` varchar(3) DEFAULT NULL,
      `lat` varchar(8) DEFAULT NULL,
      `lon` varchar(8) DEFAULT NULL,
      `linetype` varchar(1) DEFAULT NULL,
      `state` varchar(2) DEFAULT NULL,
      `city` varchar(36) DEFAULT NULL,
      PRIMARY KEY (`npa_nxx_key`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1$$

The command I'll use to migrate both the table and its data to a Cassandra column family is the following:

    ./dse sqoop import --connect jdbc:mysql://127.0.0.1/dev \
      --username root \
      --table npa_nxx \
      --cassandra-keyspace dev \
      --cassandra-column-family npa_nxx_cf \
      --cassandra-row-key npa_nxx_key \
      --cassandra-thrift-host 127.0.0.1 \
      --cassandra-create-schema

The dse command is located in the /bin directory of the DataStax Enterprise install. I first pass the IP address and database of the MySQL server I want to use, followed by the username (I'm not using a password for the super user on MySQL right now; yes, I know, bad practice…). I then indicate what MySQL table I want to migrate. After I've set all the MySQL parameters, I designate a new Cassandra keyspace to use, followed by the name I want to give my new column family object. Lastly, I tell Sqoop what the primary key of the column family will be and the IP address of the Cassandra node I want to connect to, and then pass a parameter telling Sqoop to create my new keyspace (note that you can use existing keyspaces if you'd like).

The submission produces the following output and end result:

    12/03/06 08:58:56 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    12/03/06 08:58:56 INFO tool.CodeGenTool: Beginning code generation
    12/03/06 08:58:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `npa_nxx` AS t LIMIT 1
    12/03/06 08:58:56 INFO orm.CompilationManager: HADOOP_HOME is /Users/robin/dev/dse-2.0-EAP3-SNAPSHOT/resources/hadoop/bin/..
    Note: /tmp/sqoop-robin/compile/2e2b8b85fba83ccf1f52a8ee77c3b12f/npa_nxx.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    12/03/06 08:58:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-robin/compile/2e2b8b85fba83ccf1f52a8ee77c3b12f/npa_nxx.jar
    12/03/06 08:58:56 WARN manager.MySQLManager: It looks like you are importing from mysql.
    12/03/06 08:58:56 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
    12/03/06 08:58:56 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    12/03/06 08:58:56 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
    12/03/06 08:58:57 INFO mapreduce.ImportJobBase: Beginning import of npa_nxx
    12/03/06 08:58:58 INFO cfs.CassandraFileSystem: CassandraFileSystem.uri : cfs:///
    12/03/06 08:58:58 INFO config.DatabaseDescriptor: Loading settings from file:/Users/robin/dev/dse-2.0-EAP3-SNAPSHOT/resources/cassandra/conf/cassandra.yaml
    12/03/06 08:58:58 INFO config.DatabaseDescriptor: DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
    12/03/06 08:58:58 INFO config.DatabaseDescriptor: Global memtable threshold is enabled at 329MB
    12/03/06 08:58:58 INFO snitch.DseDelegateSnitch: Setting my role to Cassandra
    12/03/06 08:58:58 INFO config.DseConfig: Loading settings from file:/Users/robin/dev/dse-2.0-EAP3-SNAPSHOT/resources/dse/conf/dse.yaml
    12/03/06 08:58:58 INFO config.DseConfig: Load of settings is done.
    12/03/06 08:58:58 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`npa_nxx_key`), MAX(`npa_nxx_key`) FROM `npa_nxx`
    12/03/06 08:58:58 WARN db.TextSplitter: Generating splits for a textual index column.
    12/03/06 08:58:58 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
    12/03/06 08:58:58 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
    12/03/06 08:58:58 INFO mapred.JobClient: Running job: job_201203051624_0002
    12/03/06 08:58:59 INFO mapred.JobClient:  map 0% reduce 0%
    12/03/06 08:59:05 INFO mapred.JobClient:  map 25% reduce 0%
    12/03/06 08:59:06 INFO mapred.JobClient:  map 50% reduce 0%
    12/03/06 08:59:07 INFO mapred.JobClient:  map 75% reduce 0%
    12/03/06 08:59:08 INFO mapred.JobClient:  map 100% reduce 0%
    12/03/06 08:59:08 INFO mapred.JobClient: Job complete: job_201203051624_0002
    12/03/06 08:59:08 INFO mapred.JobClient: Counters: 14
    12/03/06 08:59:08 INFO mapred.JobClient:   Job Counters
    12/03/06 08:59:08 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=13439
    12/03/06 08:59:08 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    12/03/06 08:59:08 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    12/03/06 08:59:08 INFO mapred.JobClient:     Launched map tasks=4
    12/03/06 08:59:08 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
    12/03/06 08:59:08 INFO mapred.JobClient:   File Output Format Counters
    12/03/06 08:59:08 INFO mapred.JobClient:     Bytes Written=0
    12/03/06 08:59:08 INFO mapred.JobClient:   FileSystemCounters
    12/03/06 08:59:08 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=88472
    12/03/06 08:59:08 INFO mapred.JobClient:     CFS_BYTES_READ=587
    12/03/06 08:59:08 INFO mapred.JobClient:   File Input Format Counters
    12/03/06 08:59:08 INFO mapred.JobClient:     Bytes Read=0
    12/03/06 08:59:08 INFO mapred.JobClient:   Map-Reduce Framework
    12/03/06 08:59:08 INFO mapred.JobClient:     Map input records=105291
    12/03/06 08:59:08 INFO mapred.JobClient:     Spilled Records=0
    12/03/06 08:59:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=340000768
    12/03/06 08:59:08 INFO mapred.JobClient:     Map output records=105291
    12/03/06 08:59:08 INFO mapred.JobClient:     SPLIT_RAW_BYTES=587
    12/03/06 08:59:08 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 10.724 seconds (0 bytes/sec)
    12/03/06 08:59:08 INFO mapreduce.ImportJobBase: Retrieved 105291 records

I can then log into the Cassandra CQL utility and check that my column family and data are there:

    robinsmac:bin robin$ ./cqlsh
    Connected to Test Cluster at localhost:9160.
    [cqlsh 2.0.0 | Cassandra 1.0.8 | CQL spec 2.0.0 | Thrift protocol 19.20.0]
    Use HELP for help.
    cqlsh> use dev;
    cqlsh:dev> select count(*) from npa_nxx_cf limit 200000;
     count
    --------
     105291

Conclusions

That's it. There are a lot more parameters you can use for Sqoop; typing ./dse sqoop import help will list them all for you. To try out DataStax Enterprise with Sqoop, download a copy of the software – it's completely free for development use. Thanks for your support of DataStax and Cassandra!

Learn More

Advanced Database Capabilities

DataStax Enterprise (DSE) goes far beyond Apache Cassandra capabilities with double the horsepower, operational simplicity, and advanced security.


DSE Advanced Performance

DSE includes twice the horsepower of Apache Cassandra, delivering twice the throughput to handle twice the workloads with the same hardware. Plus, DataStax Bulk Loader makes loading and unloading data a snap.
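
For a sense of what that looks like in practice, here is a minimal DataStax Bulk Loader (dsbulk) round trip; the keyspace, table, and file names are illustrative:

    # Load a CSV file (with a header row) into a table
    dsbulk load -url customers.csv -k sales -t customers -header true

    # Unload the same table back out to CSV files
    dsbulk unload -k sales -t customers -url /tmp/customers_export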


DSE NodeSync

A major challenge of Apache Cassandra is operational management. Repairing nodes for synchronization is an intensely manual process that requires the right expertise. DSE NodeSync removes that pain, eliminating 90% of such manual operations, so even novice DBAs and DevOps engineers can run DSE like seasoned pros.
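
As an illustration, NodeSync is switched on per table through a CQL table option; the keyspace and table names here are hypothetical:

    -- Enable continuous background repair for one table
    ALTER TABLE sales.customers WITH nodesync = { 'enabled' : 'true' };

Once enabled, the cluster continuously validates and repairs that table's data in the background, replacing scheduled manual repair runs.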


DSE Advanced Security

Apache Cassandra includes only basic security such as login and password. DSE adds comprehensive, enterprise-grade security, including authentication, authorization, transparent data encryption, JDBC drivers with built-in security, and auditing by user or profile.
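
For context, the built-in basics in open-source Cassandra amount to CQL role-based authentication and authorization along these lines (role and keyspace names are illustrative); DSE layers unified authentication, encryption, and auditing on top:

    -- Create a login role and grant it read-only access to one keyspace
    CREATE ROLE analyst WITH PASSWORD = 'ChangeMe123!' AND LOGIN = true;
    GRANT SELECT ON KEYSPACE sales TO analyst;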

Blog
Cassandra’s Journey — Via the Five Stages of Grief

New technologies usually need to fight their way into the hearts of the people who will end up using them. This fight is often long and hard, and Apache Cassandra didn't have it any easier than any of the other technological developments of our time.

In 2008 the database world was a wild place. Large data infrastructures were testing the limits of relational databases, and companies like Google and Amazon were about to run out of options for handling their massive data volumes. At the time I was working at an education company called Hobsons, and was one of those infrastructure engineers trying to get more scale out of my tired old databases. Cassandra caught my eye as something with a great foundation in computer science that also solved many of the issues I was having. But not everyone was as convinced as I was.

If you're not familiar with the Kübler-Ross model of grieving, also known as The Five Stages of Grief, it describes the way most people end up dealing with loss and change. Looking back, I realize now that the en masse abandonment of relational databases in favor of something more appropriate for the new world of big data, Cassandra, very much followed this same model. Here's how it happened from my point of view in the trenches of data infrastructure.

Stage 1: Denial - The individual believes the prognosis is somehow mistaken and clings to a false, preferable reality.

In 2008, Apache Cassandra was the closing curtain on a 30-year era of database technology, so denial was an easy and obvious response to it in the early years. Of course, many of the new databases being released weren't exactly of the highest quality. Coming from a database with years and years of production vetting, it was easy to throw some shade at the newcomers, and Cassandra was in that camp. But it could do things relational databases couldn't, like stay online when physical nodes fail or scale online by just adding more servers. Administrators called it a toy and developers called it a fad — just some kids trying to be cool.

Cassandra kept growing, though — and solving real problems. The replication story was unmatched and was catching a lot of attention. There were ways to replicate a relational database, but it was hard and didn't work well. Data integrity required one primary database with all others being secondary or read-only, and failure modes contributed to a lot of offline pages displayed on websites. But generally speaking, people only want to make the effort to fix things when they absolutely have to, and for now, relational databases weren't really broken.

Stage 2: Anger - The individual recognizes that denial cannot continue and becomes frustrated.

Slowly but surely, people started to move notable use cases with real production workloads over to Cassandra. There were happy users talking about incredible stories of scale and resiliency! The company names attached to these stories became less cutting-edge and more mainstream, and it was becoming clear to many that this wasn't just a fad. It was starting to make a real impact and could be coming to a project meeting soon.

I remember one of my first consulting gigs at a big-name company. I was working with the development team on some data models, and in the back of the room was a group of engineers, arms crossed, not looking happy. When I talked to them, they made it quite clear that this change was not welcome, and that "this is going to ruin the company." They were the Oracle database administrators, and they saw this at best as a bad idea and at worst as a threat to their livelihood. In the ensuing months I experienced similar tense moments with other groups of engineers.

Stage 3: Bargaining - The individual tries to postpone the inevitable and searches for an alternate route.

Despite roadblocks and delay tactics, the needs of businesses everywhere dictated a move to high-scaling technologies like Apache Cassandra. It was solving real problems in a way no other database could, no matter how much "tuning" you did on your other solutions. This led to situations where teams started negotiating the terms of a Cassandra roll-out. One team I worked with wasn't allowed to put Cassandra in any critical path close to customers. Ironically, when the systems in the critical path started failing, the only system that could withstand the conditions that led to their failure was the much-maligned Cassandra cluster.

Then a new breed of database appeared that tried to capitalize on the fear of non-relational databases. It was called NewSQL and promised full ACID transactions along with Cassandra-like resiliency, but NewSQL never quite worked out when real-world failures presented themselves. That's how infrastructure goes: it burns half-baked ideas to the ground and calls in a welcoming party for the good ideas.

Stage 4: Depression - "I'm so sad, why bother with anything?"

Cassandra started gaining traction in every corner of the tech world. As the solutions implemented to avoid this inevitability failed, fighting the future became less and less appealing. There was a massive growth period when the early adopters became late adopters, and they were talking. The relational database holdouts finally just stopped talking about it and did something else. Many decided to move to data warehousing, where they could put their amazing SQL skills to use via complex queries.

Stage 5: Acceptance - The individual embraces the inevitable future.

And then, there was a moment, and nobody knows exactly when it was, that Cassandra became a mainstream database. It might have been when everywhere you looked there was yet another great use case being talked about. As the saying went, anyone doing something at scale on the Internet was probably using Cassandra. For me, the moment I realized Cassandra had finally been accepted was when I saw large numbers of database administrators signing up for training on DataStax Academy. It was like a big shift had occurred in the day-in, day-out world of databases. Application developers were always pushing the cutting edge, but administrators had to keep those applications running until they were replaced, and their new foundation of choice was Cassandra.

When you think about it, you really see the same reaction to every new paradigm-shifting technology. The early days of the computer, the Internet, and now blockchain all faced the same fear and doubt as the early days of Cassandra. Collectively, we deny the truth, rage at inevitability, scramble for an alternative, fall into despair, and finally accept and embrace our new reality. What comes after Cassandra is anyone's guess, but as with people, usually the best kind of change comes little by little and goes almost completely unnoticed until it's staring you in the face, and you say, "Wow — you've changed!"

Here's to the Cassandra of the past, the present, and the future.

Get the Blog
Blog
The Four Main Challenges with Apache Cassandra™

Enterprises are increasingly flocking to open source technology because of its accessibility, theoretical cost-effectiveness, and ability to attract top talent. According to the 2018 Open Source Program Management Survey, 53% of companies say their organization has an open source software program or plans to establish one within the next year, and according to the 2016 Global Developer Report, 98% of developers use open source tools—even when they're not supposed to.

Here at DataStax we're HUGE Apache Cassandra fans! We based our technology on Cassandra for good reason: it's fast, flexible, and foundational. Enterprises can build their data management strategies on it and be confident they'll be able to scale with their growth. That said, as with other open source tools, Cassandra does present certain challenges at the enterprise level. While these challenges are easily overcome with the right strategy and resources, we think it's worth exploring exactly what they are, the hidden costs associated with them, and why most enterprises end up needing a little extra help to tap into the full potential of Cassandra.

1. Rising maintenance costs

Open source solutions are becoming more and more popular in the enterprise because they're easier to adopt and they eliminate licensing fees, along with the extensive contract negotiations that can be stressful and time-consuming. However, while open source tools may be free to deploy, they come with hidden ongoing maintenance costs that can have a significant impact on total cost of ownership (TCO) beyond the cost of acquiring the software. When companies move to open source, they end up either investing in internal talent to develop and maintain the technology or depending on a network of third-party developers, especially the open source community. Contributions are voluntary and are made when a contributor has the time, not necessarily when an organization has a need. Still, companies that use open source depend on these contributions for things like maintenance, bug fixes, and new features. These dependencies introduce a lot of risk, making it more difficult for enterprises to meet service-level agreements and bringing the potential of downtime and the costs associated with lost business.

2. Security, compliance, and governance risk

HIPAA, Sarbanes-Oxley, GDPR—oh my. Different industries in different countries are forced to comply with different regulations, and one of the main reasons open source projects fail or run into issues is security compliance. It's often difficult for organizations to implement global security standards to ensure compliance, particularly in hybrid cloud environments, which makes the complete adoption and use of open source software that much more challenging. Failure to comply with these regulations exposes organizations in regulated industries to significant financial and reputational risk. While Cassandra does offer some built-in security features out of the box—like role-based authentication and authorization—these features, by themselves, can't guarantee security for organizations that operate in heavily regulated industries.

3. Ad hoc support from multiple sources

Because Cassandra is free, it's easy to adopt. This ease of implementation, however, comes with its own challenges. Individual teams usually end up implementing the database on an ad hoc basis. As the deployment scales and multiplies across the organization, the need for support services increases. In many cases, organizations end up with a patchwork quilt of support and services from a variety of different sources: some in-house resources, the open source community, and third-party agencies. All of these come with varying levels of Cassandra expertise and response time. It's not the most efficient, cost-effective, or reliable approach, to say the least.

4. Limited Apache Cassandra expertise

Cassandra boasts a robust community that offers a rich set of collective knowledge, but much of that knowledge isn't organized in an intuitive way. Implementing and configuring Cassandra involves a significant learning curve, and most companies find that it's very difficult and costly to hire in-house expertise because there's a limited supply of talent. Employees usually end up educating themselves on Cassandra using a combination of open source documentation, help from the community, and trial and error. This slows down adoption and puts an enormous administrative burden on IT.

While open source software can help organizations achieve their goals, it is not without its drawbacks. Hidden costs, security risks, a patchwork network of support services, and a lack of expertise are all reasons why organizations struggle with open source adoption. The good news is that, with the right partner, you can unlock the full power of Cassandra without any of the downsides. That's the ticket to helping your organization realize its full potential.

eBook: The 5 Main Benefits of Apache Cassandra™
READ NOW

Get the Blog
Webinar
Speed Dating with Apache Cassandra™

Microservices, security and compliance, multi-tenant data centers, cluster sizing … there’s a lot to consider when thinking about your data platform! Join us for an online meetup featuring experts from DataStax and our partner, software consulting firm Expero, to get bite-sized lightning talks covering these topics and more. We’ve curated a list of the most critical topics into this speed dating format to help you unlock the potential in your organization by maximizing the effectiveness of your data platform.

Get the Webinar