Moving Hosting From AWS to Astra DB to Stop Jumping Through Hoops
As Chang Xiao, founder of Siggy.ai, was building out his artificial intelligence-driven Shopify app, he found key specialized operations he needed were only available on Astra DB from DataStax.
Siggy.ai is an artificial intelligence-driven recommendation app for Shopify webshops. Instead of drawing on the often-used “other customers also bought” recommendation, Siggy.ai populates the recommendation fields with related products based on the product catalog itself. This makes it useful for even newly opened shops that have little or no historical shopping behavior to draw on.
The app is currently being optimized in a beta test with some big webshops. Siggy.ai founder Chang Xiao has worked with a lot of open source technology and initially chose Apache HBase to cover his database needs.
“I’ve really enjoyed the open-source concept for a very long time,” Chang said. “But I quickly found out that as a startup with one person, HBase was too difficult to manage. I had to install a lot of different software on top of each other and manage the server itself. I lacked expertise in this, so I started looking for a good substitute.”
Chang is a certified Amazon Web Services (AWS) solution architect. So his first choice was to build the entire Siggy.ai stack on AWS. But as he dug deeper into Apache Cassandra®, it dawned on him that this wouldn’t work.
DataStax Astra DB gave Siggy.ai all the functions
As Chang went through the functions needed to build his recommendation plugin, it became apparent that the possibilities on Amazon Keyspaces were too limited. For example, he needed count queries to fetch the index status from the database and virtual tokens to handle large amounts of data.
When a Shopify webshop installs the Siggy.ai app, it will index the entire product catalog so the algorithm in the app can present recommendations based on that. The count query function lets the app display a visual of the percentage of the products that are indexed by the algorithm.
“This is an important feedback function as it shows the shop owner that our app is working. Without it, I need to go through a very long and convoluted way — effectively jumping through hoops on AWS Keyspace. It’s a very common use case. A lot of applications need to count the number of items in a database and present it back to the user,” Chang said.
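The indexing progress check described above can be sketched in Python along the following lines. This is only an illustration, not Siggy.ai's actual code: the table name `indexed_products`, the column `shop_id`, and the helper `indexed_percentage` are hypothetical, and `session` is assumed to be a connected session from the Python driver for Cassandra.

```python
def indexed_percentage(session, shop_id, catalog_size):
    """Return the percentage of a shop's catalog that has been indexed.

    `session` is assumed to behave like a connected cassandra-driver
    Session. The table and column names are hypothetical.
    """
    if catalog_size <= 0:
        return 0.0  # avoid dividing by zero for an empty catalog

    # COUNT(*) is the aggregate the article refers to as the "count query".
    row = session.execute(
        "SELECT COUNT(*) FROM indexed_products WHERE shop_id = %s",
        (shop_id,),
    ).one()

    indexed = row[0]  # first (and only) column of the result row
    # Cap at 100 in case the catalog shrank after indexing started.
    return min(100.0, 100.0 * indexed / catalog_size)
```

The returned value could then drive a progress indicator for the shop owner. Restricting the count to a single partition (here, one shop) keeps the query from scanning the whole table.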
As he wrote in a blog post during this part of his development process: “DataStax literally wrote the code (…) This is a huge benefit and relief for any developers who are implementing Cassandra only to find the limitations with AWS Keyspaces.”
Besides the Cassandra functions crucial to Siggy.ai, Amazon Keyspaces also doesn’t support functions like MIN(), MAX() and the user-defined functions that make Cassandra truly powerful.
Chang Xiao primarily does his programming in Python. So when he went back over the documentation for the Python driver for Cassandra, he had a lightbulb moment.
“Since DataStax wrote everything in the documentation, all the functions should work with their service. So now I just do what it says in the documentation and it works,” Chang said.
Astra DB relieved Siggy.ai of headaches
When Chang hit “publish” on his short blog post, he had already made up his mind. He was going to migrate his data from his existing HBase database to DataStax Astra DB, even though most of the Siggy.ai stack runs on AWS. His research and proof-of-concept work showed that it was technically feasible to run the app like this. Apart from the limited features on Amazon Keyspaces, two other issues had been bugging him.
“I’d been thinking about the shutdown of Parler since January of 2021. The big players have cost-optimized services. But if they decide to take us down for whatever reason, open-source software will let us quickly switch to another platform,” Chang said.
This might be mostly a theoretical and ideological issue for a product recommendation app like Siggy.ai. But when Chang went on a hiking vacation in the White Mountains of New Hampshire, he was met with a very real, critical and relatable issue. On July 4th, 2021, the HBase database crashed, and it kept crashing throughout his vacation.
Every few hours, and at least once a day, the Apache Thrift™ server that HBase used to communicate with the app would crash without any apparent cause. This had a cascading effect on Siggy.ai: product recommendations were not being displayed on the Shopify stores running active tests.
“Things just went haywire. It was really the worst possible time for me to be diagnosing server issues, restarting the Thrift server and checking if it’s running. I was on my laptop in the hotel with poor reception and my wife just stared at me like ‘what’s wrong with you?’ That’s when I finally decided to make the switch,” Chang said.
The whole process, from research into Cassandra through proof-of-concept, code refactoring and finally data migration to Astra DB, took just 14 days. Chang could now work without any crashes or other unexplained delays. One month later, he finished the first iteration of the Siggy.ai app. Since then, he has had the peace of mind to go weeks without checking on the servers.
Siggy.ai to use K8ssandra for app development
“Everything just works now. That peace of mind is the biggest benefit for me. We now use K8ssandra for our production and will be extending it to use it for our development environment,” Chang said.
The next steps for the Shopify product recommendation app will be to add new algorithms and extend the use of Cassandra by streaming data for real-time personalization of Shopify shops. As the beta testing comes to an end, Chang has given himself a year to find out what Siggy.ai can evolve into.
“If it doesn’t work out, I’ll have gained a lot of new skills and great experiences that I can apply somewhere else. I haven’t interacted much in the DataStax Discord channel yet. But I hope we get to the point where we’re scaling so much that I get challenges that go beyond the documentation,” Chang said.
The top 3 reasons Siggy.ai migrated to DataStax
- Siggy.ai needed count queries and virtual tokens. Both were available on DataStax Astra DB.
- Shutdowns of independent apps hosted on the cloud services of tech giants made Siggy.ai founder Chang Xiao worried about the plug being arbitrarily pulled on his app.
- With Astra DB and K8ssandra, development is easy and the servers are always on.