What the Growth in Multi-Data Centers Means for Databases and Big Data
A recent article in InfoWorld contained some interesting statistics about the growth of multi-data center deployments. In its latest poll of data center managers, the Uptime Institute found that 80 percent of respondents have built a new data center or upgraded an existing facility within the past five years.
Another study of the North American data center market, conducted by Digital Realty Trust, found that 92 percent of respondents said their companies will definitely or probably expand their data center space in 2012, the highest percentage reported in six years.
Couple this news with the fact that data centers exist primarily to hold (gasp!) data, and it’s easy to see that the need for databases that span, and interact across, multiple data centers is only going to escalate, likely at a rapid clip.
But what does a multi-data center database look like? Is it just log shipping, mirroring between data centers, master-slave replication, or something else? To be sure, there are use cases where those options work just fine, but increasingly I’m seeing the following short list of requirements:
- The ability for a single logical database to span 1 to N data centers, not just two.
- Multi-directional syncs between data centers, not just one-way. Put another way: truly location-independent, read-and-write-anywhere freedom.
- Built-in network intelligence, so that data is transferred smartly between data centers to minimize bandwidth overload and latency issues.
- The ability to support all key types of data traffic across data centers (real-time, analytic, search, and so on).
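To make the second requirement concrete, here is a toy sketch of multi-directional sync. It is only an illustration, not how any particular product implements replication: the `DataCenter` class, the `sync` function, the site names, and the integer timestamps are all hypothetical, and conflicts are resolved with a simple "newest timestamp wins" rule that I am assuming for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """One site holding a full copy of the logical database (toy model)."""
    name: str
    # key -> (timestamp, value); the timestamp decides which write wins
    rows: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        """Accept a local write, keeping only the newest version of a key."""
        current = self.rows.get(key)
        if current is None or timestamp > current[0]:
            self.rows[key] = (timestamp, value)

    def read(self, key):
        """Return the locally stored value, or None if the key is unknown."""
        entry = self.rows.get(key)
        return None if entry is None else entry[1]


def sync(data_centers):
    """Multi-directional sync: every site ends up with the newest version of every key."""
    merged = {}
    for dc in data_centers:
        for key, (ts, value) in dc.rows.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    for dc in data_centers:
        dc.rows = dict(merged)


# Three hypothetical sites, each accepting writes from its own local users.
us, eu, asia = DataCenter("us-east"), DataCenter("eu-west"), DataCenter("ap-south")
us.write("user:1", "Alice", timestamp=1)
eu.write("user:2", "Bob", timestamp=2)
asia.write("user:1", "Alice Smith", timestamp=3)  # newer update to the same key

sync([us, eu, asia])
print(us.read("user:1"), eu.read("user:1"), asia.read("user:2"))
```

Every site accepts writes, and after a sync every site can serve every key locally. A real system would also have to deal with clock skew, partial failures, and bandwidth limits, which is where the third requirement comes in.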
The reasons for needing a multi-data center database vary. Some use cases involve nothing more than the desire for a good disaster recovery plan. But the majority revolve around keeping one logical database in sync across 1 to N physical data centers while delivering the fastest possible response times to the users each data center serves in its locale.
Pulling this off isn’t easy unless you start with the right database architecture and feature set. For example, master-slave designs are often impractical because the read-write-anywhere requirement can’t be met when all writes must go through a single master.
Fortunately, Cassandra’s architecture is tailor-made for multiple data centers. Its peer-to-peer design, coupled with online scale-out and full redundancy, leaves no single point of failure and delivers continuous availability, which makes it ideal for multi-data center environments.
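As a rough illustration of what this looks like in practice, Cassandra lets a keyspace declare how many replicas to keep in each data center through its `NetworkTopologyStrategy` replication class. The small Python helpers below are a sketch of my own, not part of any driver: `create_keyspace_cql` and `local_quorum` are hypothetical names, `app_data`, `DC1`, and `DC2` are placeholder identifiers (real data center names must match what your cluster reports), and the exact CQL syntax can differ between Cassandra versions.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CQL CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to the number of replicas kept there.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    options = ", ".join(parts)
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{options}}};"


def local_quorum(replicas):
    """Replicas that must respond for a quorum within one data center."""
    return replicas // 2 + 1


# Placeholder keyspace and data center names, three replicas in each site.
statement = create_keyspace_cql("app_data", {"DC1": 3, "DC2": 3})
print(statement)
print(local_quorum(3))
```

With three replicas per data center, a quorum inside one site needs two of them, so reads and writes issued at Cassandra’s `LOCAL_QUORUM` consistency level can be answered without waiting on a remote data center.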
Further, DataStax Enterprise supports all of this not only for Cassandra, but also for Hadoop and for Apache Solr enterprise search in database clusters. This covers both on-premises data centers and cloud deployments.
So if you’re one of the many who are running and expanding your organization’s data center footprint, and you need a database built from the ground up to support multiple data centers, download DataStax Enterprise now and give it a try. We think you’ll be pleased with the result, just like the many customers who are happily running their databases across multiple data centers.