ScyllaDB
Cheat Sheet
Prime Use Case
When you require sub-millisecond latencies at petabyte scale and need to eliminate the non-deterministic 'Stop-the-World' GC pauses inherent in JVM-based NoSQL systems.
Critical Tradeoffs
- Extreme performance vs. high operational complexity
- Predictable P99s vs. strict data modeling requirements
- High hardware efficiency vs. a less ubiquitous managed-service ecosystem than DynamoDB's
Killer Senior Insight
ScyllaDB isn't just 'fast Cassandra'; it's a specialized operating system for data that uses the Seastar framework to bypass the Linux kernel's scheduler and page cache, treating each CPU core as an independent execution unit.
Recognition
Common Interview Phrases
Common Scenarios
- Real-time bidding (RTB) in AdTech
- High-velocity IoT sensor data ingestion
- User profile stores for massive-scale social networks
- Fraud detection systems requiring millisecond lookups
Anti-patterns to Avoid
- Small datasets (< 1TB) where the overhead of sharding outweighs the benefits
- Applications requiring complex JOINs or ACID transactions across multiple tables
- Prototyping phases where developer velocity is more important than performance
The Problem
The Fundamental Issue
The 'In-Memory Bottleneck' and CPU inefficiency caused by thread contention, context switching, and Garbage Collection in traditional NoSQL systems.
What breaks without it
Unpredictable latency spikes (P99/P999) due to Java GC cycles
Underutilized multi-core CPUs due to global lock contention
Higher TCO, because software overhead forces you to add nodes to reach the same performance targets
Why alternatives fail
Cassandra's JVM architecture introduces non-deterministic pauses that are impossible to tune away at extreme scale
DynamoDB can become prohibitively expensive at extreme sustained throughput due to its pricing model
Relational DBs cannot scale horizontally for writes without complex, manual application-level sharding
Mental Model
The Intuition
Imagine a massive warehouse. Instead of every worker crowding through one giant door and waiting for a manager to clear the path, every single worker has their own private door, their own forklift, and their own section of shelving. They never have to talk to or wait for each other.
Key Mechanics
Shard-per-core architecture: Each CPU core manages its own memory and NIC queues
Seastar engine: An asynchronous, non-blocking programming framework
Direct I/O: Bypassing the Linux Page Cache to avoid kernel-level locking
Autonomous self-optimizing background tasks: Dynamic controllers for compaction and repair
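The shard-per-core mechanic can be sketched in miniature: hash each partition key to exactly one shard, so only one core ever owns a given partition and no locks are needed. This is an illustration only; Scylla actually assigns ownership via Murmur3 partitioner tokens, not the MD5 hash used here.

```python
import hashlib

# Toy model of shard-per-core routing (illustration only; Scylla uses
# Murmur3 tokens). Each partition key maps deterministically to one
# shard, so only that core ever touches the partition: no locks, no
# cross-core contention.

NUM_SHARDS = 8  # e.g. one shard per physical core

def shard_for_key(partition_key: str) -> int:
    """Map a partition key to its owning shard deterministically."""
    digest = hashlib.md5(partition_key.encode()).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % NUM_SHARDS

# Every request for the same key lands on the same shard.
assert shard_for_key("user:42") == shard_for_key("user:42")

# Distinct keys spread across all shards.
print(sorted({shard_for_key(f"user:{i}") for i in range(1000)}))
```

The same idea extends down the stack: each shard also owns its slice of memory and its own NIC queue, which is why adding cores scales near-linearly.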
Framework
When it's the best choice
- High-throughput write-heavy workloads
- Predictable low-latency requirements at scale
- Large-scale Cassandra migrations where API compatibility is required
When to avoid
- Strong consistency requirements (ACID) across multiple rows
- Rapidly changing schemas that require frequent full-table scans
- Low-budget projects where managed serverless services suffice
Fast Heuristics
Tradeoffs
Strengths
- Up to 10x throughput compared to Apache Cassandra
- Consistent low latency (no GC pauses)
- Lower TCO due to higher data density per node
- Drop-in compatibility with Cassandra (CQL) and DynamoDB APIs
Weaknesses
- Steep learning curve for partition-key-centric data modeling
- Compaction strategy tuning is critical and can be complex
- C++ memory management makes debugging 'weird' crashes harder than in Java-based systems
Alternatives
Apache Cassandra
When it wins
When the team has deep existing Java expertise and hasn't hit performance ceilings.
Key Difference
JVM-based, thread-per-request model vs. Scylla's shard-per-core C++ model.
DynamoDB
When it wins
When operational overhead must be zero and the workload is highly bursty.
Key Difference
Fully managed serverless vs. Scylla's cluster-based architecture.
Aerospike
When it wins
When the primary use case is a pure key-value store with an extreme focus on Flash/NVMe optimization.
Key Difference
Optimized for KV lookups rather than the wide-column/CQL flexibility of Scylla.
Execution
Must-hit talking points
- Mention 'Shard-per-core' architecture and how it eliminates lock contention
- Discuss the 'Seastar' framework and its shared-nothing approach
- Explain why bypassing the Linux Page Cache (Direct I/O) is critical for NVMe performance
- Differentiate between Compaction Strategies (STCS vs LCS) and their impact on write amplification
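The STCS-vs-LCS talking point reduces to write amplification (WA): LCS rewrites data at every level as it trickles down (bounded read amplification, higher WA), while STCS rewrites a record roughly once per size tier it passes through. A back-of-envelope sketch, with illustrative fanout and tier ratios only (real WA depends on workload and tuning):

```python
import math

# Rough write-amplification estimates (illustrative figures only).
# STCS: a record is rewritten about once per size tier it crosses.
# LCS: a record can be rewritten up to `fanout` times per level as it
# moves down, in exchange for reads touching far fewer SSTables.

def stcs_write_amp(total_bytes, memtable_bytes, tier_ratio=4):
    # Number of size tiers between a flushed memtable and the largest SSTable.
    return max(1, math.ceil(math.log(total_bytes / memtable_bytes, tier_ratio)))

def lcs_write_amp(total_bytes, level0_bytes, fanout=10):
    levels = max(1, math.ceil(math.log(total_bytes / level0_bytes, fanout)))
    return levels * fanout  # pessimistic: full fanout rewrite per level

print(stcs_write_amp(1 << 40, 1 << 27))  # ~1 TB table, 128 MB memtables -> 7
print(lcs_write_amp(1 << 40, 1 << 29))   # ~1 TB table, 512 MB L0 -> 40
```

The takeaway for an interview: LCS buys predictable reads at the cost of several times more write I/O, which is why write-heavy workloads usually stay on size-tiered (or Scylla's incremental) compaction.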
Anticipate follow-ups
- Q: How does Scylla handle 'Hot Partitions' through its per-core scheduler?
- Q: Explain the trade-offs of Lightweight Transactions (LWT) using Paxos or Raft.
- Q: How do you handle cross-datacenter replication and its CAP theorem implications?
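For the LWT follow-up, the contract worth stating is conditional apply: the write succeeds only if the `IF` condition holds, and the response reports whether it was applied. A toy in-process model of `INSERT ... IF NOT EXISTS` semantics follows; it shows the contract but none of the Paxos/Raft round trips that make real LWT several times slower than a plain write:

```python
# Toy compare-and-set illustrating LWT (`IF NOT EXISTS`) semantics.
# Real LWT pays extra consensus round trips (Paxos, or Raft-based in
# newer Scylla); this single-process model only demonstrates the
# conditional-update contract, not the coordination cost.

class Table:
    def __init__(self):
        self.rows = {}

    def insert_if_not_exists(self, key, value):
        """INSERT ... IF NOT EXISTS: applied only when the key is absent."""
        if key in self.rows:
            return False, self.rows[key]  # [applied]=false plus the current row
        self.rows[key] = value
        return True, value

t = Table()
print(t.insert_if_not_exists("user:1", "alice"))  # (True, 'alice')
print(t.insert_if_not_exists("user:1", "bob"))    # (False, 'alice') -- lost the race
```

The losing writer gets the existing row back, which is exactly how CQL reports a failed LWT: an `[applied]` column of `false` alongside the current values.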
Red Flags
Using 'ALLOW FILTERING' in production queries.
Why it fails: It forces a full cluster scan, defeating the purpose of partition-key-based indexing and causing massive latency spikes.
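The cost gap is easy to quantify with a toy model: a partition-key lookup touches one partition, while a filtered scan (what `ALLOW FILTERING` permits) must read every row in the table. The table shape and sizes below are invented for illustration:

```python
# Toy cost model: partition-key lookup vs. full filtered scan.
# 10,000 partitions; 1 in 100 rows matches the filter predicate.

table = {f"p{i}": [{"pk": f"p{i}", "city": "NYC" if i % 100 == 0 else "SF"}]
         for i in range(10_000)}

def lookup_by_partition(pk):
    rows = table.get(pk, [])
    return rows, len(rows)          # rows, rows_read

def scan_with_filter(predicate):
    rows_read, hits = 0, []
    for rows in table.values():     # must visit every partition
        for row in rows:
            rows_read += 1
            if predicate(row):
                hits.append(row)
    return hits, rows_read

_, read_indexed = lookup_by_partition("p100")
_, read_scanned = scan_with_filter(lambda r: r["city"] == "NYC")
print(read_indexed, read_scanned)   # 1 vs 10000: the scan reads the whole table
```

On a real cluster the scan also fans out to every node and every shard, so the latency hit compounds with cluster size.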
Ignoring the Partition Key design and creating 'fat' partitions.
Why it fails: Poorly chosen keys lead to hot shards, where one CPU core is pegged at 100% while others are idle, creating a system-wide bottleneck.
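The hot-shard failure mode can be simulated directly: under shard-per-core routing, a single over-popular partition key pins its entire load on one core. The 90/10 skew below is an invented workload, and the MD5-based routing is a stand-in for Scylla's Murmur3 tokens:

```python
import hashlib
from collections import Counter

# Simulated hot partition: one "celebrity" key receives 90% of traffic,
# so the single shard that owns it absorbs almost all the work while
# the other cores sit near idle.

NUM_SHARDS = 8

def shard_for_key(pk: str) -> int:
    return int.from_bytes(hashlib.md5(pk.encode()).digest()[:8], "big") % NUM_SHARDS

requests = ["user:celebrity"] * 9000 + [f"user:{i}" for i in range(1000)]
load = Counter(shard_for_key(pk) for pk in requests)

print(load)  # one shard carries >90% of the 10,000 requests
```

This is why partition keys should be chosen (or composed with a bucketing suffix) so that no single key value dominates the write or read distribution.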