
Confessions of an Oracle DBA – Part 1

By Robin Schumacher - June 11, 2013

This post kicks off a short series on how a hard-core Oracle guy came to see that NoSQL databases are here to stay and are able to handle things that Oracle was never meant to handle.

Some Background

I’ve been a database geek for over 20 years now, with much of that time spent as an Oracle professional. As an Oracle DBA and architect, I’ve gotten to design and manage some fairly prominent applications – everything from large Oracle Financials systems for major insurance companies, to Oracle-driven medical apps running right in hospital delivery rooms tracking every aspect of newborn births, to Oracle RAC platforms for trading systems that handled millions of dollars in transactions per hour.

Because of all the experience I was gaining, I wrote many tech articles for Oracle Magazine and other IT outlets, wrote several books on Oracle and SQL Server design, development, and tuning, and spoke at conferences on how to make Oracle sing. As a DBA, even though I was well versed in other databases (e.g., Teradata), wherever I chose to go I found myself using Oracle as my .44 Magnum for apps needing heavy lifting, and SQL Server as my .38 backup for smaller jobs.

When I joined a major database tools company and began creating software tools for DBAs, I kept my Oracle and SQL Server focus (although I worked with Sybase as well) and helped build data modeling, admin, development, and monitoring tools that are still in use today. I like to think I knew the Oracle V$ views better than most and could diagnose a badly running database via wait events faster than the majority.

At that point, it was safe to say I was absolutely a career Oracle professional.

The Web Changed Everything 

When I left the database tools company and joined MySQL in 2005 to start the product management group there, I had my eyes opened to two pretty important things. First, the Web had brought with it data management challenges that were very different from those of your standard corporate app, which mostly served one, and sometimes two, sites.

Second, those data challenges had changed the databases that IT pros were selecting, and MySQL was at the forefront of the shift. Why was this happening?

Let’s be honest: cost had a lot to do with it (it’s hard to beat free…). I still remember nearly laughing myself hoarse when a very well-known Web forum devoted to Oracle users experienced a hiccup that visibly showed MySQL being used as the underlying database. When questioned, the site’s owner publicly admitted, “Hey, I can’t afford Oracle.”

But cost wasn’t the total story of why companies (either pure Web or brick-and-mortar with a Web presence) moved to databases like MySQL; there were technical reasons too.

Some of the new data management patterns the Web brought with it necessitated breaking orthodox RDBMS practices. Data engines that didn’t support ACID transactions (gasp!) were used because of the speed they delivered both for fast-arriving data and for query use cases.

Scale-up architectures were displaced by commodity-box, scale-out, divide-and-conquer sharded implementations that proved to handle growing capacity needs better than the traditional single-skyscraper-box approach. Simple but good-enough replication mechanisms in open source databases helped spread users and load across these scale-out farms, even though small amounts of data loss or inconsistency were possible.
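To make the divide-and-conquer sharding idea concrete, here is a minimal sketch (mine, not taken from any customer system) of how an application of that era might route rows to shards by hashing a key. The shard host names and the use of Python are purely illustrative assumptions.

import hashlib

# Hypothetical static shard map: applications of that era often kept a fixed
# list of MySQL hosts and routed each row by hashing its key.
SHARDS = [
    "mysql-shard-0.example.com",
    "mysql-shard-1.example.com",
    "mysql-shard-2.example.com",
    "mysql-shard-3.example.com",
]

def shard_for(user_id: str) -> str:
    """Pick the shard that owns this key: hash it, then take it modulo the shard count."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# All reads and writes for a given user land on the same shard.
print(shard_for("user-42"))

The catch, of course, is everything that isn’t in the sketch: re-sharding when a box fills up, cross-shard queries, and keeping all of those moving parts healthy – exactly the operational burden described below.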

Breaking tried-and-true legacy RDBMS rules gave seasoned DBAs heart palpitations, but despite the cries of “toy databases” and predictions of disaster, the open source relational databases more than held their own – they proved you didn’t necessarily need Oracle to have a well-performing database infrastructure.

More Changes

Around the time we sold MySQL to Sun (2008), I began noticing a few new trends in the businesses I visited. First, while the sharded scale-out implementations had been lauded early on, they were now proving hard to manage.

I vividly remember sitting in the conference room of a major software supplier to the airlines as they told me they were tearing down the scale-out architecture they had implemented just a few years earlier. Back then, I had sat in the same conference room with the same people, who were quite happy with their sharded design; now the architecture was proving too unmanageable and too costly to care for from a personnel perspective.

I saw such designs fail elsewhere too, but for other reasons. One of the primary issues was that RDBMS master-slave architectures afforded only read scale-out, not write scale-out. Companies I met with couldn’t deal with the write bottleneck inherent in master-slave setups, and even something like Oracle RAC couldn’t help because it still relied on shared storage.
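As a rough illustration of that read-versus-write asymmetry, here is a minimal sketch (again my own, with made-up host names) of the routing logic a master-slave deployment typically implies: reads can fan out across replicas, but every write funnels into the one master.

import random

# Hypothetical master-slave topology: one writable master and several
# read-only replicas. Host names are illustrative only.
MASTER = "mysql-master.example.com"
REPLICAS = [
    "mysql-replica-1.example.com",
    "mysql-replica-2.example.com",
    "mysql-replica-3.example.com",
]

def route(statement: str) -> str:
    """Send reads to any replica; send every write to the single master."""
    if statement.lstrip().upper().startswith("SELECT"):
        return random.choice(REPLICAS)  # reads scale out as replicas are added
    return MASTER                       # writes never scale out

print(route("SELECT * FROM orders WHERE id = 1"))  # some replica
print(route("INSERT INTO orders VALUES (...)"))    # always the master

Adding replicas buys more read capacity, but the write path never gets any wider – and that was the bottleneck these companies kept running into.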

Moreover, the need for write scale-out extended to a further requirement: being able to write in multiple geographic locations and then have the data synchronized in some manner. Standard master-to-master replication wouldn’t do for a variety of technical reasons (e.g., performance, too many locations); instead, true read/write-anywhere across multiple data centers and cloud availability zones was the request.

Another issue I heard voiced repeatedly was the need to store all types of data, especially the kinds being produced by social media and other similar businesses. The RDBMS model – whether proprietary or open source – was proving inhospitable to such data formats.

Two other factors were also coming on with a vengeance. Data volumes had always been growing, but most of that growth had been tied to data warehouses and similarly styled repositories. No more. What I heard now was the need to keep heavy data volumes online for line-of-business applications that couldn’t tolerate the sometimes-high response times associated with data warehouses.

Further, the speed at which data was coming in was increasing for some of these companies, oftentimes astronomically. A lot of this data was time-series in nature and couldn’t be consumed quickly enough by an RDBMS, even when non-transactional storage engines were used.

As with the start of Web 1.0, things were beginning to change again.

In Part Two, I’ll talk about how these things and more kicked off the NoSQL movement, and why its impact on Oracle will likely be even greater than what happened with the open source RDBMSs.


