Designing a Future-Proof Data Architecture
This is an excerpt from the DataStax whitepaper Moving to a Modern Architecture, which delves into the eight key ways to prepare for and manage a modern data architecture.
What does a data architecture that can withstand nearly anything thrown at it—both now and in the future—look like?
Without a doubt, that’s a rather foggy crystal ball to gaze into, but there is a way to bring things more clearly into focus. By reviewing the most often cited modern application data needs and issues that typically cause IT architectures to buckle and collapse, you can design a data blueprint that can hold up even under the nastiest of weather.
Let’s start with some of the most common requirements and then go into others that have more recently come about with the current evolution of database systems, digital applications, and their radically distributed deployments.
Volume and Traffic
More data + more users = database headaches. That’s a well-understood and proven formula where databases are concerned. And today’s modern applications, with their keep-all-data-online requirement and concurrent user traffic that can spike 1,000x in an hour, will easily wreck even the best-laid-out data architecture.
TIP: The keys to future-proofing a data design against these two issues are to (1) replace legacy scale-up approaches with a divide-and-conquer method that scales out the data layer, which means (2) using a masterless database architecture that removes both data-volume and user-traffic bottlenecks through an elastic design. The architecture should work in a uniform way whether it runs on-premises, in the cloud, or in a hybrid deployment.
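To make the divide-and-conquer idea concrete, here is a minimal sketch of consistent-hash partitioning, the technique masterless databases commonly use to spread data evenly across nodes. The `HashRing` class and its parameters are illustrative inventions, not any vendor's actual implementation; the key property shown is that adding a node reassigns only a fraction of the keys, which is what makes elastic scale-out possible.

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy consistent-hash ring. Each node owns many small token ranges
    ("virtual nodes"), so data spreads evenly and adding a node moves
    only the keys that fall into the new node's ranges."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.tokens = []   # sorted token values
        self.owners = {}   # token -> owning node
        for node in nodes:
            self.add_node(node)

    def _token(self, value):
        # Hash to a point on the ring (a large integer).
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Insert vnodes tokens for this node; existing tokens are untouched,
        # so only keys landing on the new tokens change owners.
        for i in range(self.vnodes):
            t = self._token(f"{node}#{i}")
            self.owners[t] = node
            self.tokens.append(t)
        self.tokens.sort()

    def node_for(self, key):
        # A key belongs to the first token at or after its hash (wrapping).
        idx = bisect_right(self.tokens, self._token(key)) % len(self.tokens)
        return self.owners[self.tokens[idx]]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")   # deterministic: same key, same node
```

Because no node is special, any node can be asked for any key's location, which is the "masterless" part: there is no single coordinator to bottleneck on as volume and traffic grow.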
Downtime and Availability
Trust us when we tell you that you have far more downtime risk than you think you do.
Studies done by the Uptime Institute humble even the cockiest IT professional who thinks their systems won’t go down—especially those who trust the cloud to save them. Even with all the advances we have in technology, Uptime’s studies show that outages are actually increasing and—take a deep breath—cloud providers are now the second most commonly cited reason for IT service failure (No. 1 is still on-premises data center issues).
TIP: If you’ve put a scale-out masterless data foundation in place, make sure it’s one with built-in redundancy in compute, storage, and data. If this three-legged data architecture stool is in place, then you have a good shot at continuous availability versus simply high availability.
Location Independence
If you want a future-proof data architecture, it needs to be location independent rather than location dependent.
Location independence means that your data can live anywhere and can not only be read but also written everywhere. There are three reasons you need it. First, as just described, properly used data replication keeps multiple copies of your data in different locations, which allows for constant uptime rather than the downtime that is otherwise all but assured at some point.
Second, it provides uniform customer response times. Putting your data where your customers are means they can access it just as fast in any location, which is especially important since many application architectures are multi-homed.
Last, it protects against vendor lock-in and vendor location limitations. Having your data held hostage by a particular platform vendor who is limited in location support, with the only way out being a costly and heavy-lifting migration initiative, is not wise.
TIP: Be sure to pay attention to the fine print of your database vendor’s architecture diagrams and descriptions. Many can distribute data to multiple locations for read operations, but they can’t do the same for write activity. You need both, plus a transparent and easy way to put your data where it’s needed and sync it up with other copies of it around the globe.
Contextual Transactions
User and application database interactions have evolved far beyond the standard transactions offered in legacy relational engines.
For example, take a typical credit card purchase. The authorization process used by the credit card vendor draws on many different contexts to avoid a fraudulent event. Beyond the standard database transaction, it involves search operations that review historical purchase activity, analytics run on that history to confirm it’s in line with the current purchase, and finally the approval step that returns a response to the application and user. It’s one transaction, completed in split-second fashion so the user doesn’t grow impatient and move on to something else.
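The authorization flow just described can be sketched as a single function that combines the three workload types. Everything here is a hypothetical simplification: the in-memory `HISTORY` dict stands in for a search query, and `mean` stands in for a real analytics model, but the shape (search, then analytics, then transactional write, all in one request) is the contextual-transaction pattern.

```python
from statistics import mean

# Hypothetical purchase history per card; in production this would be
# a search/analytics query against the database, not an in-memory dict.
HISTORY = {
    "card-1": [25.0, 40.0, 18.5, 33.0],
}

def authorize(card, amount, threshold=3.0):
    """Contextual transaction sketch: search historical activity,
    run a simple analytic check, then approve or decline."""
    history = HISTORY.get(card, [])   # search step: past activity
    if not history:
        return "decline"              # no context, be conservative
    typical = mean(history)           # analytics step: spending profile
    if amount > threshold * typical:
        return "decline"              # out of line with past purchases
    HISTORY[card].append(amount)      # transactional step: record purchase
    return "approve"
```

The point of the sketch is that no single step is exotic on its own; the challenge is running all three against the same data inside one split-second request, which is what drives the workload-convergence argument below.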
There are various industry terms for contextual transactions—hybrid transactional/analytical processing (HTAP), translytical processing, and so on. Whichever name you use, the idea is the same: the days when a simple, isolated ACID transaction was all an application needed are long gone.
Instead, a data architecture that forgoes the typical separation of database workloads is required today. This is why Gartner, in their “There is Only One DBMS Market” report, says: “The separation of the DBMS market into operational and analytic tasks has outlived its usefulness. Now each type of use case can be addressed by a single vendor or product. Data and analytics leaders responsible for data management strategies must view DBMS as one market to drive new capabilities.”
TIP: Your future-proof data architecture needs a DBMS that handles contextual transaction management, with full support for transactional, analytical, and search workloads.
Thanks for reading this excerpt from the DataStax whitepaper Moving to a Modern Architecture. Tune in next week for the next excerpt, or download the full whitepaper.