The Five Minute Interview – Amara Health Analytics

By Conor Self, December 3, 2013

Amara Health Analytics Uses DataStax Enterprise To Proactively Protect Patients’ Health

This article is one in a series of quick-hit interviews with companies using Apache Cassandra and DataStax Enterprise for key parts of their business. For this interview, we spoke with Steve Nathan, CEO of Amara Health Analytics, and Dwight Hare, the company’s Chief Software Architect.

“DataStax Enterprise translates to lower total cost for us to deliver our system over time and more agility in meeting customers’ needs.”

Steve Nathan
CEO, Amara Health Analytics

DataStax: How does Amara Health Analytics help its customers?

Steve: We provide real-time decision support for clinicians to help in the early identification of hospital patients at risk of serious, rapidly progressive disease states, like sepsis. These are patients who are potentially going to crash over the next hours or days, so an intervention is extremely urgent. That means the data streams we analyze are real-time in nature.

DataStax: How do customers use your software?

Steve: For the end-user clinicians it’s very straightforward. They receive a text message alert from our system on their smartphone or tablet that directs their attention to an at-risk patient. The message includes key clinical variables to provide context for the alert.

In terms of IT, our software connects with multiple hospital data sources. Broadly speaking, there are three forms of data we analyze: structured – such as medical codes and other numeric values; unstructured clinician narrative – such as doctors’ notes, operative reports, and discharge summaries; and real-time physiologic signals from patient monitoring devices. Much of this data comes to us in Health Level 7 (HL7) format, but we can handle any other format the hospital may have.
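As a rough illustration of the structured side of that data, the sketch below splits a pipe-delimited, HL7 v2 style message into segments and fields. The sample message, the `parse_hl7` helper, and the choice of HL7 v2 are assumptions made for illustration; the interview does not say which HL7 version or parser Amara uses.

```python
# Hypothetical helper: not Amara's code, and not a full HL7 parser.
def parse_hl7(message: str) -> dict[str, list[list[str]]]:
    """Group the segments of a pipe-delimited HL7 v2 style message by segment ID."""
    segments: dict[str, list[list[str]]] = {}
    # HL7 v2 separates segments with carriage returns and fields with "|".
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


# Made-up sample message: one patient, two numeric observations.
SAMPLE = "\r".join([
    "MSH|^~\\&|MONITOR|HOSP|AMARA|HOSP|201312030830||ORU^R01|MSG0001|P|2.3",
    "PID|1||PAT123",
    "OBX|1|NM|HR^Heart rate||112|/min",
    "OBX|2|NM|TEMP^Temperature||38.6|Cel",
])

parsed = parse_hl7(SAMPLE)
# In an OBX segment, field 3 identifies the observation and field 5 holds its value.
observations = {obx[3].split("^")[0]: float(obx[5]) for obx in parsed["OBX"]}
print(observations)
```

A production system would more likely rely on an established HL7 library, since real messages include escape sequences, repeated fields, and components that this sketch ignores.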

DataStax: What technical or business challenges drove you toward a NoSQL solution versus a traditional relational database?

Dwight: A significant challenge is that we have a lot of structured and unstructured data, and it’s very dynamic: we might start getting a feed of data that we didn’t know about when we built the system. The code we’ve developed is written in Java, as a normal server-side, back-end Java application. It’s actually an interpreter, because we wanted the system to be very dynamic and able to ingest new data sources that we didn’t know about when we designed it.

We needed something that was very dynamic in the sense that the schema could change at run time, in real time. We also ingest large amounts of lengthy narrative texts, nursing notes, and things like that. We needed to do a lot of searching over text, so we wanted a Solr kind of query capability as compared to what you get in SQL. SQL just didn’t meet our needs at all because it’s very hard-wired. You’ve got to define the schema in advance, and it doesn’t do very well with unstructured data in terms of manipulating it and searching it. We decided very quickly on a NoSQL database.

We looked around, and Cassandra and DataStax Enterprise were the obvious leaders there. We ingest all of this data into a variety of column families. The main one, which we call a patient timeline, is time-series in nature. We put all of the data that’s streaming in for patients into a timeline for each patient in chronological order. We have a picture of everything that’s happened to that patient from the moment they walked into the hospital until they leave, and even after they leave if labs or other results come in later. Then we generate other column families from that.
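The per-patient timeline idea can be sketched in a few lines of plain Python. This is an in-memory stand-in only: the `PatientTimelines` class, the event kinds, and the sample values are hypothetical, and the real system stores this in Cassandra column families rather than Python lists.

```python
import bisect
import itertools
from collections import defaultdict
from datetime import datetime


class PatientTimelines:
    """In-memory stand-in for a time-series store keyed by patient (illustrative only)."""

    def __init__(self):
        # patient_id -> list of (timestamp, sequence, kind, value), sorted by timestamp
        self._rows = defaultdict(list)
        # The sequence number breaks ties so events with equal timestamps never compare values.
        self._seq = itertools.count()

    def add(self, patient_id, when, kind, value):
        # insort keeps each patient's events chronological, even if they arrive out of order.
        bisect.insort(self._rows[patient_id], (when, next(self._seq), kind, value))

    def timeline(self, patient_id):
        return [(when, kind, value) for when, _, kind, value in self._rows.get(patient_id, [])]


timelines = PatientTimelines()
timelines.add("PAT123", datetime(2013, 12, 3, 9, 0), "note", "Patient febrile overnight")
timelines.add("PAT123", datetime(2013, 12, 3, 8, 30), "heart_rate", 112)
# A lab result that comes in after discharge still lands in the right place.
timelines.add("PAT123", datetime(2013, 12, 5, 14, 0), "lab_wbc", 13.2)

kinds = [kind for _, kind, _ in timelines.timeline("PAT123")]
print(kinds)
```

Keying by patient and ordering by timestamp mirrors the general shape of a time-series layout, where one patient's events are read back as a single ordered slice.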

We also generate a report column family, and that’s where we perform the analysis. We have a combination of natural language processing, machine learning, and rules-based analytics. We have built up a large set of fairly complex rules that deduce the state of the patients based on the data we’re getting and then identify risks to their health.

We’re looking for an accumulation of evidence that indicates a degradation in the patient’s state, so we have complex rules. The outputs of the rules are derived data.
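A minimal sketch of rules that accumulate evidence and emit derived facts might look like the following. The `Rule` class, the rule names, the thresholds, and the `min_evidence` cutoff are all hypothetical and chosen only to make the example run; they are not Amara's rules and not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str                        # name of the derived fact the rule produces
    applies: Callable[[dict], bool]  # predicate over the latest observations


# Illustrative thresholds only; real rules would be far more complex.
RULES = [
    Rule("fever", lambda obs: obs.get("temp_c", 37.0) > 38.0),
    Rule("tachycardia", lambda obs: obs.get("heart_rate", 0) > 90),
    Rule("rapid_breathing", lambda obs: obs.get("resp_rate", 0) > 20),
]


def derive_facts(observations: dict) -> list[str]:
    """Return the names of all rules whose condition holds (the derived data)."""
    return [rule.name for rule in RULES if rule.applies(observations)]


def should_alert(observations: dict, min_evidence: int = 2) -> bool:
    """Alert only when enough independent pieces of evidence have accumulated."""
    return len(derive_facts(observations)) >= min_evidence


latest = {"temp_c": 38.6, "heart_rate": 112, "resp_rate": 18}
facts = derive_facts(latest)
print(facts, should_alert(latest))
```

Keeping the derived facts separate from the raw observations reflects the split described in the interview between raw timeline data and the outputs of the rules.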

The raw data from the patient timeline and the derived facts are in the report. We build up this very large data set and then we can run queries over it to generate reports. One of the important things about our product is that in order to really demonstrate its value, we have to show that the alerts we’re sending are having a significant impact on the care of patients. So we want to show over a period of time that patients who got alerts, and therefore received treatment sooner, have better outcomes.

The great thing about Cassandra, DataStax and Solr is we can slice and dice data and generate reports in a variety of ways. We can show which effects are due to our alerts vs. due to other factors.

DataStax: You said you never really looked at relational databases for this application – but did you evaluate other NoSQL providers?

Dwight: Yes, we looked at many of them. We chose Cassandra and DataStax because we like open-source software and if you get a community you don’t get locked in as much. Also, the hook-up to Solr was really important for us. The search capabilities and enterprise stability right out of the box made our development so much faster.

DataStax: So it sounds like Solr was the main draw.

Dwight: Yeah, we absolutely had to have that, and that’s what sent us directly to you.

DataStax: When you go into a hospital, how do you configure your Cassandra database?

Dwight: We use two models. For the local model, we provide a local install with replication. Another great thing about your system is that it gives us replication automatically, so when we’re going for 100% uptime we can perform a rolling upgrade. We designed our software to be similar in that we can have multiple nodes and have one fail over to the next, so we can do rolling upgrades without any downtime.

In the SaaS model, which I prefer, we have a geographically distributed data center that gives us the capability to do all these things. In the hospital it really depends on their environment. Basically, our uptime goal is to be as available as the data sources we’re getting.

DataStax: In terms of the manageability of the systems, obviously for your SaaS, I’m assuming you take care of everything. Then if it’s on premises, do they take care of the database?

Dwight: No, we manage everything remotely and set up the system with alerts so that if any system goes down, it’s noticed. Then we can send an alert saying something’s down. Sometimes a data feed from the hospital goes down. We’ll notice that and contact them right away. We’re watching Cassandra in all of that.
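One simple way to notice a feed that has gone quiet is to track when each feed last delivered data and flag any that exceed a silence threshold. The sketch below assumes hypothetical feed names and a ten-minute threshold; the interview does not describe how Amara's monitoring is actually implemented.

```python
from datetime import datetime, timedelta


def stale_feeds(last_seen: dict[str, datetime], now: datetime,
                max_silence: timedelta) -> list[str]:
    """Return the names of feeds that have been silent for longer than max_silence."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_silence)


# Hypothetical feeds and timestamps.
now = datetime(2013, 12, 3, 9, 0)
last_seen = {
    "lab_results": datetime(2013, 12, 3, 8, 58),
    "bedside_monitors": datetime(2013, 12, 3, 8, 40),
}

down = stale_feeds(last_seen, now, max_silence=timedelta(minutes=10))
print(down)
```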

DataStax: What benefits have you experienced using DataStax Enterprise?

Dwight: DataStax Enterprise gives us the ability to create and modify the schema in the live system. The fact that the schema doesn’t have to be hard-wired is part of our overall design. Let’s say we get a new type of data about a patient that we haven’t seen before. DataStax Enterprise allows us to introduce that new information into a live system without having to re-install it or modify the code.

So it’s a very dynamic system that doesn’t necessarily need to know in advance every kind of data that’s going to come in to the system. But if we introduce a new type of data, it can flow into the system and immediately become available for analysis and reporting. We just update the queries we’re using to access that additional data. So we can modify the system and move it forward in a very dynamic, quick way.
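The idea of a new kind of data flowing in without a schema change can be illustrated with a plain-Python stand-in. The `DynamicStore` class and the `lactate` field are hypothetical; this is not the DataStax Enterprise API, only a sketch of the behavior Dwight describes, where only the query changes when a new field appears.

```python
class DynamicStore:
    """Toy store in which rows may carry fields that were never declared in advance."""

    def __init__(self):
        self.rows = []
        self.known_fields = set()

    def ingest(self, row: dict) -> set:
        """Store the row and return any field names seen for the first time."""
        new_fields = set(row) - self.known_fields
        self.known_fields |= new_fields
        self.rows.append(dict(row))
        return new_fields

    def select(self, *fields: str) -> list[dict]:
        """Return the requested fields from every row that has all of them."""
        return [{f: row[f] for f in fields}
                for row in self.rows if all(f in row for f in fields)]


store = DynamicStore()
store.ingest({"patient_id": "PAT123", "heart_rate": 112})
# A new type of measurement arrives; no reinstall or code change is needed to store it.
added = store.ingest({"patient_id": "PAT123", "lactate": 3.1})
# Only the query is updated to read the new field.
result = store.select("patient_id", "lactate")
print(added, result)
```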

Steve: In business terms, DataStax Enterprise translates to lower total cost for us to deliver our system over time and more agility in meeting customers’ needs.

DataStax: What advice would you give someone who was migrating from a relational database to a NoSQL database?

Dwight: Mostly it’s just the learning curve for people who have only dealt with SQL databases. I guess the only advice I could give would be to not even approach this thinking in terms of databases, because there are so many expectations people have when you say relational data.

For more information on Amara Health Analytics, see:


