Behind the Innovator: Hornet Finds the Perfect Match with DataStax Luna 🖤
Welcome to the next installment of our Q&A series: Behind the Innovator.
Behind the Innovator takes a peek behind the scenes with learnings and best practices from leading architects, operators, and developers building cloud-native, data-driven applications with Apache Cassandra™ and open-source technologies in unprecedented times.
This week’s conversation with Matthew Hirst and Nate Mitchell of Hornet covers support for Cassandra, along with managing clusters and upgrades. Hornet is a social networking site for the gay community, active around the world and currently supporting more than 30 million users.
Here’s what they had to say.
1. Can you share your technical background to date? Key accomplishments, achievements, etc.
Nate Mitchell, Head of DevOps, Hornet: I have worked in infrastructure, Linux and cloud services, including several years at AWS. While I was there, I worked a lot with DynamoDB and ElasticSearch. I joined Hornet as I wanted to concentrate on something that would be based on my experience and give me a chance to build something of my own, rather than looking at bits and pieces for others.
2. What is your current priority for the team and what are you trying to achieve?
Matthew: Our current priorities are around stability and availability, making sure that the service is running and that latency is low. Alongside this, we have our engineering goals to manage, and we need to make sure that we meet our sprint targets. We are looking ahead at how we grow as a service too, and how that will change the demands on our infrastructure over time.
Nate: My priority - and a big part of why I joined - is around how we use Infrastructure as Code to support the service. While there were a lot of things in the cloud when I joined, they were being managed manually rather than automated. Changing this has been a big project for me, and it’s something that will continue over time as we migrate and update our systems.
3. What other systems are in place at Hornet, and how does Cassandra support them?
Matthew: We use a range of different databases for different tasks across our infrastructure. We have Apache Cassandra in place to support our social feeds and messaging services, for example - those applications have high write volumes and Cassandra is ideal for those. We have ElasticSearch in place for data exploration and search, and we use Redis for caching and information sharing where fast reads are needed.
Alongside these, we have PostgreSQL in place for more general tasks. We started out with a lot of PostgreSQL, but over time Cassandra has moved in for a lot of those areas where data volumes and write speed requirements have gone up.
Nate: When I started, Cassandra was already one of the main databases to support and run. We looked at where the hotspots were and where more support was needed, and Cassandra was always down the priority list - it just ran and kept running. The stability it had in place already was amazing. I came from working with DynamoDB every day, and I didn’t think any other database could steal a place in my heart, but Cassandra did.
4. What are some key learnings and challenges you've experienced while working on this current project?
Nate: When I started, we were always conscious of the impact that a database failure could have on services. If Postgres failed, then the data would be difficult to get back and it would be time consuming. Cassandra provided that resilience and reliability over time - we ran for more than two years with no downtime at all. We did have one issue around our feeds cluster this year, and we turned to the Apache Software Foundation Slack group for help, and the suggestion was to talk to DataStax and the team from The Last Pickle.
They provided us with consulting on how to improve our cluster health and what we were doing - overall, we were on the right track, but we could improve things by updating our versions to be more current. We used DataStax Luna to get advice on our open source deployment - originally we planned on getting standalone consulting days, but the Luna Professional subscription provides us with a year of support. We also purchased three health check days.
Matthew: As we implement the findings and recommendations from the consultancy, we should see some concrete improvements and cost savings. For example, we should be able to keep our current cluster sizing in place while the business looks to more than double our daily active users. This represents an immediate potential cost saving.
5. What's the vision for this project as a business, and how do you support that?
Matthew: There are some big goals to increase adoption and use of the app within the gay community - we should provide more support for group discussions and social activities alongside the direct conversations and interactions that we were known for in the past. That means more messages and more interaction, and that will increase the volume of data that we have to support. From our perspective, we are preparing our infrastructure now to manage that so we have the best possible experience for users.
Nate: I think the work we have been doing as a team around Infrastructure as Code has been successful so far, and we are in a good place now to continue that work over time. This should help our future development, as we can build our Cassandra clusters out. We are looking at scaling up our existing clusters and building new ones so we can scale out horizontally and support all those new services too.
As a team, we are looking at Kubernetes - we see a future where things will be Kubernetes all the way down over time, but we are not there yet.
6. Do you have any advice for your younger self that you would share now?
Matthew: I think keeping ahead of your upgrade curve is a good piece of advice. We had one of our clusters running on Cassandra 2.0, which is a long way behind the current supported versions. Being that far behind did make upgrading harder, but we were able to manage our approach and make it less painful over time. Keeping current also means you can take advantage of more new functionality and improvements.
I would suggest keeping on top of all the new approaches that are getting put together, but don’t use them in anger without good business reasons. For example, machine learning can’t solve everything, even though it gets a lot of hype. We have done a lot of work around how to use computational statistics in our work - it’s great for some things, and not for others. We had a lot of success around moderation and image recognition, for instance, preventing abuse of our message services with pictures that people might not want to receive.
We built that and it was a great success. However, we tried the same approach for recommendations and it did not produce the results we wanted. We can carry on looking at how we use models, statistics and what gets called machine learning over time, to see where it might be useful for us in the future.
Nate: I’d tell myself that the overwhelming feeling of incompetence will never go away, and that it is OK to have that feeling. If it does go, then you're probably stagnating and need to move on. There are too many things changing all the time for you to ever know them all, so be humble and recognize where your strengths are and where to ask for help.