Apache Cassandra: A Deep Dive
The definition of Big Data continues to evolve and has been the subject of recent discussion. Alongside variety, volume, and velocity (which the usual suspects handle well), two further facets have been introduced: complexity and distribution. These two facets require a different type of solution.
While you can manually shard your data (as with Oracle or MySQL) or extend the master-slave paradigm to handle data distribution, a modern big data solution should solve the problem of distribution in a straightforward and elegant manner, without manual intervention or external sharding. Apache Cassandra was designed to solve the problem of data distribution. It remains the best database for low-latency access to large volumes of data while still allowing for multi-region replication. We will discuss how Cassandra solves the problem of data distribution and availability at scale.
This class will cover:
• Data Partitioning
• Local Storage Model
• The Write Path
• The Read Path
• Multi-Datacenter Deployments
• Upcoming Features (1.2 and beyond)
To get the most benefit from this class, attend the "Getting Started with Cassandra" workshop.
Level: Intermediate