Apache Cassandra

A highly scalable, distributed wide-column NoSQL database designed to handle massive amounts of data across many commodity servers, providing high availability with no single point of failure.

Cheat Sheet

Prime Use Case

When you need linear horizontal scalability for write-heavy workloads across multiple data centers and can tolerate eventual consistency.

Critical Tradeoffs

  • Optimized for writes at the expense of complex read patterns
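  • Eventual consistency by default; stronger consistency is tunable per query at the cost of latency and availability
  • No joins and no general multi-partition ACID transactions (lightweight transactions are limited to a single partition)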
  • High operational overhead regarding compaction and repair

Killer Senior Insight

Cassandra is essentially a distributed hash map where every node is equal; its 'Query-First' data modeling requirement means you must design your tables specifically to satisfy your UI/API queries, not to normalize your data.
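
As a rough illustration of query-first modeling, the Python sketch below mimics one table designed for a single query ("latest messages in a chat"). The names `MessagesByChat`, `chat_id`, and `sent_at` are hypothetical, and an in-memory dict stands in for the cluster; this is not Cassandra's API.

```python
from bisect import insort
from collections import defaultdict


class MessagesByChat:
    """Hypothetical stand-in for one table: partition key -> rows sorted by clustering key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, chat_id, sent_at, body):
        # chat_id plays the partition key, sent_at the clustering key.
        # insort keeps each partition ordered by sent_at as rows arrive.
        insort(self._partitions[chat_id], (sent_at, body))

    def latest(self, chat_id, limit):
        # The one query this table exists to serve: newest messages in one chat.
        rows = self._partitions.get(chat_id, [])
        return [body for _, body in list(reversed(rows))[:limit]]


table = MessagesByChat()
table.insert("chat-1", 3, "c")
table.insert("chat-1", 1, "a")
table.insert("chat-1", 2, "b")
table.insert("chat-2", 1, "x")
print(table.latest("chat-1", 2))
```

A second query shape (say, "all messages by a given user") would get its own table keyed by user, with the data written twice, rather than a join.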

Recognition

Common Interview Phrases

Need for 'Always-On' availability
Write throughput above 100k operations per second
Global multi-region deployment requirements
Time-series data or append-only logs
Requirement to scale storage linearly by adding nodes

Common Scenarios

  • IoT sensor data ingestion
  • User activity tracking and analytics
  • Messaging and chat history
  • Recommendation engine feature stores
  • E-commerce shopping carts and session management

Anti-patterns to Avoid

  • Applications requiring ACID transactions across multiple tables
  • Systems needing ad-hoc reporting or complex SQL joins
  • Small datasets that fit comfortably on a single relational instance
  • Workloads with frequent deletes, or updates that repeatedly overwrite the same rows (tombstone buildup and extra compaction work)

The Problem

The Fundamental Issue

The 'Write Wall' and Single Point of Failure (SPOF) inherent in traditional master-slave relational databases.

What breaks without it

Master nodes become a bottleneck for write operations

Failover mechanisms introduce downtime during leader election

Vertical scaling hits a hard physical limit and becomes disproportionately expensive as it approaches that limit

Why alternatives fail

Relational DBs (Postgres/MySQL) struggle with multi-master write synchronization

MongoDB's single-primary replica sets (one primary per shard) can make writes briefly unavailable during elections

Standard caches (Redis) don't provide the same durability or disk-based storage capacity

Mental Model

The Intuition

Imagine a circular table where every guest is equally responsible for holding a piece of a giant encyclopedia. If one guest leaves, their neighbors already have a copy of their pages. To find a fact, you just need to know which guest's name starts with the right letter.

Key Mechanics

  1. Consistent Hashing: Hashes each partition key to a token that determines which nodes on the ring store the data
  2. Gossip Protocol: Peer-to-peer communication for node state and health discovery
  3. LSM-Trees (Log-Structured Merge-Trees): Turn random writes into sequential disk I/O by buffering writes in Memtables and flushing them to immutable SSTables
  4. Tunable Consistency: Lets developers pick read (R) and write (W) replica counts per query; R + W > N (N = replication factor) gives strong consistency, lower values trade consistency for latency
  5. Hinted Handoff: A coordinator temporarily stores writes destined for a down node and replays them when it returns, helping replicas converge
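
The consistent-hashing step can be sketched in a few lines of Python. This is a simplified model under stated assumptions: MD5 is only a stable stand-in hash (Cassandra's default partitioner is Murmur3), the `Ring` class and its `vnodes` parameter are hypothetical, and replica selection simply walks clockwise without rack or data-center awareness.

```python
import bisect
import hashlib


def token(value):
    # Stand-in hash for illustration; not the hash Cassandra uses.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=8):
        # Each node owns several tokens (virtual nodes) to smooth the distribution.
        self._ring = sorted((token(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key, rf=3):
        # Find the first token after the key's token, then walk clockwise
        # collecting distinct nodes until rf replicas are found.
        start = bisect.bisect_right(self._tokens, token(partition_key))
        owners = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == rf:
                break
        return owners


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("sensor-42"))
```

Because placement depends only on the key's token and the ring layout, any node can compute the replica set for a key without asking a central coordinator.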

Framework

When it's the best choice

  • When the write-to-read ratio is high
  • When zero-downtime is a hard requirement
  • When data volume is expected to grow into the Petabyte range

When to avoid

  • When you need to perform 'GROUP BY' or 'JOIN' operations on the fly
  • When your data access patterns are unpredictable
  • When you have a low-volume, high-complexity relational schema

Fast Heuristics

  • If Multi-Region Write + High Availability → Cassandra
  • If Complex Relationships + ACID → PostgreSQL
  • If Low Latency Key-Value + Small Dataset → Redis

Tradeoffs

Strengths

  • Near-linear horizontal scalability (doubling the nodes roughly doubles throughput)
  • No single point of failure (Peer-to-peer architecture)
  • High write performance due to append-only storage engine
  • Flexible schema for wide-column attributes

Weaknesses

  • Read latency can be high due to checking multiple SSTables
  • Tombstones (deletion markers) can degrade read performance if not managed
  • Secondary indexes exist but perform poorly at scale
  • Requires deep knowledge of data modeling to avoid 'hot partitions'
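
The write-path strength and the read-path weaknesses above come from the same storage design. The toy model below is a sketch only: `TinyLSM` and its `memtable_limit` are hypothetical, there is no commit log or bloom filter, and compaction drops tombstones immediately, whereas Cassandra keeps them for a grace period.

```python
TOMBSTONE = object()  # marker written in place of a deleted value


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []  # immutable dicts, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch the in-memory table; disk writes happen in bulk on flush.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just another write: it records a tombstone.
        self.put(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def get(self, key):
        # Returns (value, tables_checked); the newest version of a key wins.
        checked = 1
        if key in self.memtable:
            return self._visible(self.memtable[key]), checked
        for table in reversed(self.sstables):
            checked += 1
            if key in table:
                return self._visible(table[key]), checked
        return None, checked

    @staticmethod
    def _visible(value):
        return None if value is TOMBSTONE else value

    def compact(self):
        # Merge all SSTables into one, keeping the newest version of each key
        # and dropping tombstoned keys.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [{k: v for k, v in merged.items() if v is not TOMBSTONE}]
```

In this model a read for a missing key has to check every table, and the count only shrinks after `compact()` runs, which is the intuition behind both the read-latency and the compaction-overhead bullets.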

Alternatives

ScyllaDB

When it wins

When you need maximum performance with lower hardware footprint (C++ rewrite of Cassandra)

Key Difference

Shard-per-core, shared-nothing design written in C++, so there are no JVM garbage collection pauses

DynamoDB

When it wins

When you want a fully managed serverless experience on AWS

Key Difference

Proprietary AWS service with auto-scaling but less control over internal configuration

CockroachDB

When it wins

When you need horizontal scale but require full SQL and ACID compliance

Key Difference

Uses Raft consensus for strong consistency, which adds latency to writes compared to Cassandra's AP focus

Execution

Must-hit talking points

  • Explain the Partition Key vs. Clustering Key distinction clearly
  • Mention 'LSM-Trees' and why they make writes fast (sequential vs random I/O)
  • Discuss 'Quorum' (RF=3, R=2, W=2) to demonstrate understanding of consistency trade-offs
  • Highlight the 'Compaction' process and its impact on disk space and I/O
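
The quorum talking point reduces to simple arithmetic, sketched here in Python. The function names are illustrative, not part of any driver API.

```python
def quorum(rf):
    # Majority of replicas for a given replication factor.
    return rf // 2 + 1


def reads_see_latest_write(rf, r, w):
    # Read and write replica sets overlap in at least one node when R + W > RF.
    return r + w > rf


def replicas_that_may_fail(rf, level):
    # How many replicas can be down while `level` acknowledgements are still reachable.
    return rf - level


print(quorum(3), reads_see_latest_write(3, 2, 2), replicas_that_may_fail(3, quorum(3)))
```

With RF=3, quorum reads and writes (R=2, W=2) overlap on at least one replica and still tolerate one replica being down; R=1, W=1 is faster but gives no overlap guarantee.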

Anticipate follow-ups

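  • Q: How do you handle hot partitions? (Salting or better key selection)
  • Q: What happens during a network partition? (CAP theorem: it favors AP by default, tunable per query)
  • Q: How do you handle deletes in a distributed system? (Tombstones and the grace period, gc_grace_seconds)
  • Q: How does Cassandra handle multi-DC replication? (NetworkTopologyStrategy with a replication factor per data center; LOCAL_QUORUM keeps consistency checks inside one data center)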
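
For the hot-partition follow-up, salting can be sketched as below. The helper names and the bucket count of 8 are assumptions for illustration; the point is that one logical key becomes several physical partition keys.

```python
import hashlib


def salted_key(base_key, row_id, buckets=8):
    # Derive the bucket from a stable hash of the row id so the same row
    # always lands in the same bucket, while rows spread across buckets.
    digest = hashlib.sha256(row_id.encode()).digest()
    return f"{base_key}#{digest[0] % buckets}"


def keys_to_read(base_key, buckets=8):
    # Reading the whole logical partition now means querying every bucket.
    return [f"{base_key}#{b}" for b in range(buckets)]


print(salted_key("celebrity-feed", "event-1"))
print(keys_to_read("celebrity-feed", buckets=4))
```

The trade-off is explicit: writes for a hot key spread across up to `buckets` partitions, but a full read fans out to all of them and merges the results client-side.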

Red Flags

Using Cassandra like a relational database (Normalizing data)

Why it fails: Leads to client-side joins which are extremely slow and negate the benefits of the distributed system.

Selecting a low-cardinality partition key

Why it fails: Creates 'Hot Partitions' where a few nodes handle most of the traffic while the rest stay idle, so the cluster bottlenecks on those nodes no matter how many you add.

Relying heavily on Secondary Indexes

Why it fails: Secondary indexes in Cassandra are local to each node, so a query that does not also restrict the partition key must fan out across the cluster (scatter-gather), which kills performance.