The Five Minute Interview – ReachLocal
May 23, 2012
This article is one in a series of quick-hit interviews with companies using Apache Cassandra and/or DataStax Enterprise for key parts of their business. For this interview, we spoke with Jeff Hagins who is CTO of ReachLocal, which is headquartered in Woodland Hills, CA.
DataStax: Jeff, thanks for talking with us today. What do you guys do at ReachLocal?
Jeff: ReachLocal provides end-to-end Internet advertising services to small and medium-sized businesses in eight countries. We handle things like paid search, display advertising campaigns, social media marketing and reputation management, and more. We also manage campaigns where the customer’s web site information is actually served up via our own Web proxy servers so we can capture analytics data regarding end user interactions.
DataStax: Sounds like you have to manage tons of data then, correct?
Jeff: Absolutely. The amount of information we have to deal with is beyond the scalability limits of traditional RDBMSs. And that’s where Cassandra and DataStax Enterprise come into the picture. We needed something to store the large volumes of data we generate, which is then replicated across many different data centers around the world.
DataStax: How heavy is the load that you deal with today and is it growing?
Jeff: Our growth is fairly dramatic from a company standpoint. Our current environment handles 50 million events a day, and we’re looking at growing that fivefold very shortly. We’re confident that DataStax Enterprise can scale to where we see ourselves being in the future, which is over one billion events per day.
DataStax: Does the need for continuous availability play a part in why you’re using DataStax Enterprise?
Jeff: Yes. We have an SLA requirement of 100% availability, and we meet it by balancing our workloads across six different data centers around the globe with DataStax Enterprise. If one data center goes down, the traffic just shifts over to a different data center.
The easy-to-use replication capability of Cassandra is key for us. We drop information into what we call our core data centers, and Cassandra takes care of replicating that data out to all the other data centers.
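The replication setup Jeff describes is configured per keyspace in Cassandra. A minimal sketch in modern CQL, assuming NetworkTopologyStrategy; the keyspace name, data center names, and replication factors below are illustrative, not ReachLocal’s actual settings:

```sql
-- Illustrative only: names and replication factors are assumptions.
-- Writes accepted in any data center are replicated to the others
-- according to the per-DC factors declared here.
CREATE KEYSPACE analytics
  WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'us_west': 3,
    'eu_central': 3
  };
```

Once a keyspace is declared this way, Cassandra handles cross-data-center replication automatically, which is what lets traffic shift to another data center when one goes down.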
We currently run about 16–20 nodes per data center for each database cluster, but that will grow along with our data volumes. We see each cluster reaching hundreds of nodes in the future.
DataStax: What part does Hadoop play in your system?
Jeff: Prior to DataStax Enterprise, we were actually trying to architect a platform that integrated search and analytics with Cassandra. Frankly, you saved us a ton of work with DataStax Enterprise.
Some of the analytic information we collect is summarized in real time through Cassandra’s counters feature, but there is a lot of other information that needs post-processing for reporting purposes, and that’s where Hadoop comes in.
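The counters feature Jeff mentions keeps running totals updated in place as events arrive. A hedged CQL sketch; the table and column names are hypothetical, not ReachLocal’s schema:

```sql
-- Illustrative sketch of a Cassandra counter table; names are hypothetical.
CREATE TABLE page_hits (
  site_id text,
  day text,
  hits counter,
  PRIMARY KEY (site_id, day)
);

-- Each recorded event increments the running total directly:
UPDATE page_hits SET hits = hits + 1
  WHERE site_id = 'example-site' AND day = '2012-05-23';
```

This is what makes real-time summarization cheap: the aggregate is maintained at write time, while anything that can’t be expressed as a simple increment falls to Hadoop-style post-processing.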
DataStax: How do you monitor and manage everything?
Jeff: We just started using your OpsCenter tool, which has already proven invaluable for understanding hot spots and rebalancing clusters. I hear nothing but great things from the technical team using it.
DataStax: Had you tried running your applications on other databases prior to Cassandra and DataStax Enterprise?
Jeff: Yes. We actually chose Cassandra about two years ago after a thorough evaluation that included other options like MongoDB and general-purpose RDBMSs. At the time, Cassandra’s low write latency and its ability to handle multi-data-center replication clinched it for us. Now Cassandra is even faster for both reads and writes, and with DataStax Enterprise, the whole package is everything we could have hoped for.
DataStax: Thanks again for the time Jeff.
Jeff: No problem.
For more information on ReachLocal, visit: http://www.reachlocal.com/