Five Minute Interview – Wiggle
Sports Retailer Wiggle Replaces SQL Server With DataStax Enterprise For Online Shopping Cart’s Recommendation Engine
This article is one in a series of quick-hit interviews with companies using Apache Cassandra and DataStax Enterprise for key parts of their business. For this interview, we spoke with Brett Lawrence, a Lead Engineer on the marketing front-end team at Wiggle.
“Relational databases were unable to implement the technology and applications we needed.”
DataStax: Brett, can you provide some background on Wiggle?
Brett: Wiggle is an on-line sporting goods retailer. We started in ’99, in what was essentially the back of a bike shop, and have grown extremely rapidly since then to ship globally in the cycle, run and triathlon industry. We’ve grown into a 500-person company serving more than 100 countries and generating more than £140M in revenue in 2012.
DataStax: What application at Wiggle does DataStax Enterprise support?
Brett: About three years ago, we wanted to add some personalization to our website showing customers products that they might like, rather than just the single product they were on. We wanted to implement an algorithm that would show them which product had been bought in the same basket as the one they were looking at, as well as which products had been viewed by other people in the same setting as the product they were viewing.
We initially implemented both algorithms on Microsoft SQL Server. We found that the second algorithm, the “also viewed” one, wouldn’t perform anywhere near well enough to scale in production. This is because there are a lot more products in an average view than there are in a basket. Our system at the time wasn’t capable of supporting the load, and we didn’t have a plan at that time to upgrade it so that it could. So we started exploring new technologies.
We contracted a third party to build us a solution using Cassandra, which would give the “also viewed” part of our recommendation engine enough capacity for five or six years. That project went very successfully until about six months ago, when the automated MapReduce process we had implemented outside of Cassandra, which runs every four hours, started to fail.
Based on our evaluation, we decided to move to a production enterprise version of the software that would be supported and would have a predictable roadmap. So it was natural for us to re-implement this algorithm on DataStax Enterprise, and the changes from the software we initially used to the current DataStax Enterprise are incredible. We’re thinking about other things that we can do with it as well now.
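The “bought in the same basket” and “also viewed” recommendations Brett describes can both be read as item co-occurrence counting: for each product, count how often every other product appears in the same basket or viewing session, then show the most frequent companions. The Python sketch below illustrates that general technique only. It is not Wiggle’s implementation, and the function names and sample baskets are hypothetical. In a MapReduce setting, emitting the product pairs would roughly correspond to the map step and summing the counts to the reduce step.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_occurrence(groups):
    """Count, for each product, how often every other product shares a group with it.

    `groups` is an iterable of baskets or viewing sessions (hypothetical input shape).
    """
    counts = defaultdict(Counter)
    for group in groups:
        # De-duplicate so a product repeated within one basket/session counts once.
        for a, b in permutations(set(group), 2):
            counts[a][b] += 1
    return counts


def recommend(counts, product, n=3):
    """Return up to n products most often seen alongside `product`."""
    # Sort by descending count, then by name so ties are resolved deterministically.
    ranked = sorted(counts.get(product, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Made-up sample data for illustration only.
baskets = [
    ["helmet", "gloves", "bottle"],
    ["helmet", "gloves"],
    ["helmet", "bottle", "pump"],
    ["gloves", "socks"],
]
counts = build_co_occurrence(baskets)
print(recommend(counts, "helmet"))  # ['bottle', 'gloves', 'pump']
```

The number of pairs grows with the square of the group size, which is consistent with Brett’s observation that the “also viewed” data, with many more products per view than per basket, was the harder of the two to scale.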
DataStax: When you began migrating off Microsoft SQL Server, did you evaluate other NoSQL solutions such as MongoDB and Couchbase?
Brett: We looked at them and used Mongo for other small projects. At the time it seemed like the integration that DataStax Enterprise provided between MapReduce and Cassandra would make the whole process much more streamlined for us.
DataStax: You’ve mentioned scale issues that you were able to overcome, and that the integration with the batch analytics was nice. Were there other features or characteristics of Cassandra in particular, that you found very helpful?
Brett: We were aware of the multi-data center support, which at the time we thought would be useful to us. We’re now starting to put a plan in place to host data in multiple data centers globally.
DataStax: How large is your deployment?
Brett: We’ve got four nodes. Our data is probably relatively small in the big data sense: before we reduce it, I think we’re looking at 40 or 50 gig per node. In terms of rows, we’ve got around 160 million in the way we store them, again before we reduce the data. After we reduce it we end up with about 20 gig per node, which is 18 months of data. When we initially forecasted our deployment we prepared for five years’ worth of data, so we expect to just leave it running for five years.
DataStax: And how do you manage the deployment?
Brett: DataStax OpsCenter helped drive us toward DataStax Enterprise and make a full purchase because now we could actually hand over maintenance to our operations team instead of leaving it with developers.
DataStax: How would you summarize the benefits that you’ve gotten from Cassandra, or what you’re expecting of DataStax Enterprise?
Brett: Relational databases were unable to implement the technology and applications we needed, and the solution that we’ve developed generates demonstrable revenues. I think Cassandra is another tool that is appropriate for certain jobs and if you don’t have it in your infrastructure, you’re limited.
DataStax: If somebody came to you, brand new to NoSQL, maybe coming from the relational world, what advice would you give them? What do’s and don’ts would you pass on to them?
Brett: We had some challenges with the analytics, and with wrapping our heads around how to use them properly. From a knowledge and learning perspective, the main challenge for us was being able to hand over maintenance to our operations department. We were very used to SQL Server and relational databases, and it was difficult to persuade them to buy into NoSQL. But the benefits have certainly proven themselves since then.
For more information on Wiggle, see: http://www.wiggle.com/.