Apache Hadoop is an ideal platform for consolidating large-scale data from a variety of real-time and legacy sources. It complements existing data management solutions with new analyses and processing tools. Enterprises today collect and generate more data than ever before and Hadoop was designed for the fast, reliable analysis of both structured data and complex data. As a result, many enterprises deploy Hadoop alongside their legacy IT systems, which allows them to combine old data and new data sets in powerful new ways.
Hadoop, however, is overly complex to deploy and its architecture, including Name Node, secondary Name Node, etc acts as a single point of failure (SPOF). Additionally, enabling multi-data center and the cloud is extremely tough with the traditional Hadoop stack, unlike Cassandra, which is designed for multi-data center and the cloud. DataStax Enterprise combines Cassandra and Hadoop to create a powerful new Hadoop distribution.