Apache Cassandra is an open source distributed database system that is designed for storing and managing large amounts of data across commodity servers. Cassandra can serve as both a real-time operational data store for online transactional applications and a read-intensive database for large-scale business intelligence (BI) systems.
Originally developed at Facebook, Cassandra uses symmetric peer-to-peer nodes instead of master or named nodes, so the cluster has no single point of failure (SPoF). Cassandra automatically partitions data across all the nodes in the database cluster, but the administrator controls which data is replicated and how many copies of it are kept.
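As a rough illustration of the partitioning and replication just described, the Python sketch below places keys on a hash ring of peer nodes and picks a configurable number of replicas for each key. It is a simplified model, not Cassandra's own code: the `Ring` class, the `token_for` helper and the node names are hypothetical, and MD5 is used only to keep the example short. A real cluster uses a configurable partitioner (Murmur3Partitioner by default), virtual nodes and a replication strategy set per keyspace.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a position on the ring (MD5 chosen purely for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy hash ring: every node is a peer and owns one token (hypothetical model)."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication_factor must be between 1 and the node count")
        self.replication_factor = replication_factor
        # Sort nodes by the token derived from their (assumed unique) names.
        self._ring = sorted((token_for(name), name) for name in nodes)
        self._tokens = [token for token, _ in self._ring]

    def replicas_for(self, partition_key: str):
        """Return the nodes that would hold copies of the given key."""
        # The first replica is the first node whose token is >= the key's token,
        # wrapping around to the start of the ring if necessary.
        start = bisect.bisect_left(self._tokens, token_for(partition_key)) % len(self._ring)
        # Further replicas are the next distinct nodes walking clockwise.
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas_for("user:42"))
```

Because no node in this model is special, any of them could compute the same replica list for a key, which is the property that removes the single point of failure. Raising `replication_factor` trades storage for availability.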
After Facebook open-sourced the code in 2008, Cassandra became an Apache Incubator project in 2009 and a top-level Apache project in 2010. As of this writing, organizations that have deployed Cassandra include Netflix, Digg, Adobe, Twitter, HP, IBM, Rackspace, Cisco and Reddit.
The name Cassandra was inspired by the beautiful mystic seer in Greek mythology whose predictions for the future were never believed.
This tutorial from DataStax provides an excellent overview of Cassandra: