The Five Minute Interview – Hallo
This article is one in a series of quick-hit interviews with companies using Apache™ Cassandra and/or DataStax Enterprise (DSE) for key parts of their business. During this interview, we spoke with Adrian Rodriguez, CTO at Hallo.
DataStax: Adrian, thanks for chatting with us today. What can you tell us about Hallo?
Rodriguez: Hallo is a social voice application that lets you quickly and easily send voice and audio messages to friends, family and the world. The app contains two main features.
The first is private messaging, which is a very utilitarian feature that lets people quickly send audio messages to each other either 1:1 or in groups. It’s great for situations where you need to send a message that is too detailed for texting but you don’t want to make a phone call.
The second feature, called World, allows users to post voice clips on a timeline within the app. After a user posts the clip, Hallo notifies all of that user’s followers of the new content and lets them immediately listen to the new clip.
DataStax: When did you realize you needed a big data solution like DataStax?
Rodriguez: We realized pretty early in the development process that we would need a big data solution. We had built out the pieces for private communication and developed a certain workflow using MySQL. But the ideas kept flowing and eventually the World feature came up.
We knew that we could get celebrities on the app, so we began simulating the traffic and saw just how large this could grow. With hundreds of thousands of users, and potentially millions one day, we needed a solution that could handle a high volume of data and scale along with us.
We quickly realized that MySQL couldn’t handle the real-time workflows and huge data queries, although it is still very useful for any transactional flows that we have. A celebrity post would spike the CPUs on the database servers, which was a warning that the setup couldn’t scale. We ran through a few options, such as vertically scaling the database machines, MySQL replication, read slaves, and sharding, but those would be a real pain from a development and operations resources perspective, and hard to code. So we needed to look at other options.
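As an illustration of why application-level sharding can be hard to code and operate, here is a minimal sketch of naive modulo-based shard routing. It is not Hallo's code: the function names, the user ID format, and the shard counts are all hypothetical. It shows that growing a cluster from four shards to five reassigns most keys (about 80% in expectation), so the data behind them would have to be migrated.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard with a stable hash (hypothetical routing rule)."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def fraction_moved(user_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of users whose shard changes when the shard count changes."""
    moved = sum(
        1 for u in user_ids
        if shard_for(u, old_shards) != shard_for(u, new_shards)
    )
    return moved / len(user_ids)


# Hypothetical user IDs, for illustration only.
user_ids = [f"user-{i}" for i in range(10_000)]
moved = fraction_moved(user_ids, old_shards=4, new_shards=5)
print(f"{moved:.0%} of users would move to a different shard")
```

Distributed stores such as Cassandra handle data placement and rebalancing themselves when a node is added, which is the operational burden the interview describes avoiding.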
DataStax: What other technologies did you evaluate?
Rodriguez: We looked at Redis, Riak, Mongo and Cassandra. But Cassandra really hit a home run. It was easy to set up and scaling wasn’t a problem since we just had to add another node. It’s easy to manage operationally and we didn’t see CPUs spike at all.
We also looked at Hadoop for our real-time streaming workflows and realized it just wouldn’t scale right; the latency was too high. Because of our real-time streaming workflow, we decided to use the Storm project, which came out of BackType, now part of Twitter. So we went live with DataStax Enterprise at launch and it’s worked great for us.
DataStax: What benefits have you experienced since integrating DSE into the system?
Rodriguez: The scalability of Cassandra is amazing, and that was by far the most important factor in our decision. We get spikes of traffic when a celebrity posts, and their followers generate a significant amount of read and write traffic during the spike. The writes in turn generate more read traffic until it tapers off. With MySQL that would slow to a crawl, but Cassandra handles it fine.
Cassandra really delivers amazing performance, which is critical because of the high write throughput we experience when people post to all their followers’ timelines. And then we get a huge read spike right after when all the followers listen to the message at around the same time.
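The write-then-read pattern described here is often called fan-out on write: one post becomes one timeline write per follower. The sketch below illustrates the idea with in-memory dictionaries standing in for database tables. It is an assumption-laden illustration, not Hallo's schema or code; the class, method, and field names are hypothetical.

```python
from collections import defaultdict


class TimelineStore:
    """In-memory stand-in for per-user timeline storage (hypothetical)."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> set of follower IDs
        self.timelines = defaultdict(list)  # user -> [(author, clip_id), ...]
        self.writes = 0                     # counts timeline writes performed

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def post_clip(self, author: str, clip_id: str) -> None:
        # Fan-out on write: one post produces one write per follower.
        for follower in self.followers.get(author, ()):
            self.timelines[follower].append((author, clip_id))
            self.writes += 1

    def read_timeline(self, user: str, limit: int = 10):
        # Each follower reads only their own timeline, newest first.
        return list(reversed(self.timelines.get(user, [])[-limit:]))


store = TimelineStore()
for i in range(1000):
    store.follow(f"fan{i}", "celebrity")

store.post_clip("celebrity", "clip-1")
print(store.writes)                  # one write per follower
print(store.read_timeline("fan7"))   # the follower's own timeline
```

In this sketch a single post from an account with 1,000 followers triggers 1,000 timeline writes, and each follower then reads only their own timeline, which mirrors the write spike followed by a read spike that Rodriguez describes.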
Cassandra is very easy to implement and we had no trouble setting up the system. DSE really helped us tune our system correctly, and it’s certified for real-time workflows which is especially important to us. Plus DSE’s OpsCenter makes it very easy to manage operations on the cluster.
DataStax also gives great 24/7 production support, which really helped us. Even when we are experimenting with the tool at the code level, I receive immediate responses from your support team and usually have my issue solved within the hour. Virtual nodes are awesome on Cassandra 1.2 too; I was trying out Cassandra with AMI and figured, “OK, I’ll send a support request.” It was instantly taken care of.
DataStax: Thanks for sharing your insights with us, Adrian. We’re glad that you are having such a great experience with DataStax.