DataStax Developer Blog

Multi-Datacenter Cassandra on 32 Raspberry Pis

By Brandon Van Ryswyk - August 19, 2014

Here at DataStax, my fellow intern Daniel Chin and I built a 32-node DataStax Enterprise cluster running on Raspberry Pis! We are showcasing the always-on, fault-tolerant nature of Cassandra by letting anybody take down an entire data center with the press of a Big Red Button in our lobby.

[Photo: wide view of the display in the lobby]

Being able to withstand a data center going down is not just an edge case; it is an absolute necessity for the highly available applications Cassandra powers. While the cloud is far more flexible for production use, nothing beats a big shiny hardware display for a demo.

Our main goal for this project was to take the abstract concept of fault tolerance and make it something you can see in action and interact with. We built upon the work of Travis Price and the DataStax Sales Engineering team, who pioneered the use of Raspberry Pis to demonstrate Cassandra.

[Photo: close-up of the cluster]

The Build Process:

The Hardware:

We wanted the overall display to look clean and professional enough to be appropriate for the lobby at our headquarters, but expose enough of the technology to be a compelling and unique demo.

As DataStax is a software company, fabricating hardware came with a unique set of challenges. (“Hey, do we have a shop vac?” “No.”) I ended up drawing on my experience with Solidworks (a popular CAD program) from high school FIRST Robotics to design all of the acrylic, and had it laser-cut at a local machine shop. The assorted mounting hardware and the pedestal were sourced from McMaster-Carr.

[Photo: rows of Raspberry Pis]

The Electronics:

Each Pi is running at its factory clock settings and is completely unmodified. To avoid latency problems and to ensure our Pis stayed online, we moved from WiFi to Ethernet cables and switches.

To get power to each Pi, we use micro USB cables connected to five-port USB hubs, which are in turn plugged into two power strips, one for each data center. This makes setup easy and doesn’t require building any custom power distribution rails.

Our large red button is connected to an Arduino that actuates a power relay to cut AC power to the network switch for Datacenter RED. The Arduino provides timing control, and makes the button inoperable during the network outage.
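
The sketch below approximates that button logic; the pin numbers, relay polarity, and outage duration are placeholder assumptions rather than our exact firmware. The blocking delay is what keeps the button inoperable while the datacenter is dark.

    // Approximation of the Big Red Button controller; pins, polarity,
    // and timing are assumed values, not our production firmware.
    const int BUTTON_PIN = 2;                 // big red button, wired to ground
    const int RELAY_PIN  = 7;                 // relay feeding AC to the RED switch
    const unsigned long OUTAGE_MS = 60000UL;  // how long the datacenter stays dark

    void setup() {
      pinMode(BUTTON_PIN, INPUT_PULLUP);      // a press pulls the pin LOW
      pinMode(RELAY_PIN, OUTPUT);
      digitalWrite(RELAY_PIN, HIGH);          // HIGH = relay closed = switch powered
    }

    void loop() {
      if (digitalRead(BUTTON_PIN) == LOW) {   // button pressed
        digitalWrite(RELAY_PIN, LOW);         // cut AC power to the network switch
        delay(OUTAGE_MS);                     // blocking wait doubles as the lockout:
                                              // presses during the outage are ignored
        digitalWrite(RELAY_PIN, HIGH);        // restore power; the nodes rejoin
        delay(500);                           // brief debounce before re-arming
      }
    }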

[Photo: Daniel assembling the cluster]

The Software:

The cluster is set up as a two-datacenter DSE 4.5 cluster, with OpsCenter 5.0 running to show the status of all the nodes. As we expected, running a high-performance enterprise database on a computer with a single-core 700 MHz processor and 512MB of RAM is not trivial.
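
What lets the demo survive losing an entire datacenter is Cassandra’s NetworkTopologyStrategy, which places replicas in each datacenter independently. As a rough sketch (our actual schema isn’t shown here, and the datacenter names and replica counts below are assumptions based on our “RED” naming), a keyspace for a cluster like this could be defined as:

    -- Illustrative keyspace definition; datacenter names and replica
    -- counts are assumed, not taken from our actual schema.
    CREATE KEYSPACE demo
      WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'RED': 2,
        'BLUE': 2
      };

With replicas pinned to both datacenters, cutting power to RED leaves every row still readable from BLUE.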

We are using vnodes, and have throttled the cassandra.yaml settings to the least resource-intensive values we could to squeeze C* into our hardware constraints.
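
To give a concrete flavor of that throttling, here is a sketch of the kinds of overrides involved; the values are illustrative for a Cassandra 2.0-era node, not our exact configuration:

    # Illustrative low-resource cassandra.yaml overrides; example values only.
    concurrent_reads: 2                   # default is 32; the Pi has one core
    concurrent_writes: 2                  # default is 32
    concurrent_compactors: 1              # one compaction at a time
    compaction_throughput_mb_per_sec: 4   # default is 16; SD cards are slow
    key_cache_size_in_mb: 4               # shrink caches to save heap
    rpc_server_type: sync                 # one thread per client connection
    rpc_min_threads: 1
    rpc_max_threads: 4

    # The JVM heap itself is capped in cassandra-env.sh, for example:
    #   MAX_HEAP_SIZE="192M"
    #   HEAP_NEWSIZE="48M"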

With Cassandra running, each Pi has 8 to 11 megabytes of free RAM.

For reference, our documentation currently recommends 16 cores and 24GB of RAM for a production system.

[Photo: wall of Ethernet cables]

While this cluster won’t be setting speed benchmarks any time soon, we hope it gets people excited about Cassandra and its incredible always-on capabilities!

[Photo: the pedestal with the Big Red Button]



Comments

  1. michael says:

    Hey, what’s that Dell monitor?

  2. Pepijn says:

    Maybe you already did this, but since you said the raspis run unmodified:

    You can change the memory split to give less memory to the GPU. The default is 64MB; I use 8MB, I think. That’d more than double the free memory.

    1. Derpatron says:

      You can’t be using 8. The minimum requirement for the binary blob for the GPU is 16MB.

  3. Brett says:

    Would love to see a video of the demo!

  4. Kyle O says:

    Very impressive project. Can you upload a video of it in action, or do those interested need to visit the DataStax HQ?

  5. Bryan Stone says:

    I’d love to see some benchmark data for how this performs!

  6. Conor says:

    Is there any video available to watch? I’d love to see it in action!

  7. Jero says:

    Could you share the cassandra.yaml that the Pis are using?

  8. J says:

    It would be cooler to show data integrity after a net split. http://aphyr.com/posts/294-call-me-maybe-cassandra/

  9. Justin says:

    Cool project, and display. +1 for the requests of the above commenters.

  10. Alex says:

    Awesome project.

    Great plug for FIRST Robotics!

    http://www.usfirst.org

  11. John says:

    Which SD card did you think was up to par for your public display?

  12. ZeuS says:

    You should start Chaos Monkey on that and record what’s gonna happen

  13. Anon says:

    What does the red button do? This post needs a better intro.

  14. James says:

    What USB hubs are you using? I have trouble finding good ones for powering Pis :)

  15. Daniel says:

    One thing I’d love to know more about is the particulars of the C* configuration tweaks used on these memory-starved tiny pieces of hardware. It could be very useful for others who are trying to install on virtual dev machines, less-than-optimal systems for testing, and similar.

  16. Richard Ney says:

    I second Daniel’s comment; can someone share the setup used for running this C* cluster on the Pis? I’d love to replicate a smaller version of this test.

  17. Ma Diga says:

    I third Daniel and Richard’s follow-up…

  18. Juan says:

    What power supply splitter did you use to be able to still power your Pis?
