ScyllaDB
Cheat Sheet
Prime Use Case
When you require sub-millisecond latencies at petabyte scale and need to eliminate the non-deterministic 'Stop-the-World' GC pauses inherent in JVM-based NoSQL systems.
Critical Tradeoffs
- Extreme performance vs. high operational complexity
- Predictable P99s vs. strict data modeling requirements
- High hardware efficiency vs. a less ubiquitous managed-service ecosystem than DynamoDB's
Killer Senior Insight
ScyllaDB isn't just 'fast Cassandra'; it's a specialized operating system for data that uses the Seastar framework to bypass the Linux kernel's scheduler and page cache, treating each CPU core as an independent execution unit.
Recognition
Common Interview Phrases
Common Scenarios
- Real-time bidding (RTB) in AdTech
- High-velocity IoT sensor data ingestion
- User profile stores for massive-scale social networks
- Fraud detection systems requiring millisecond lookups
Anti-patterns to Avoid
- Small datasets (< 1TB) where the overhead of sharding outweighs the benefits
- Applications requiring complex JOINs or ACID transactions across multiple tables
- Prototyping phases where developer velocity is more important than performance
The Problem
The Fundamental Issue
The 'In-Memory Bottleneck' and CPU inefficiency caused by thread contention, context switching, and Garbage Collection in traditional NoSQL systems.
What breaks without it
Unpredictable latency spikes (P99/P999) due to Java GC cycles
Underutilized multi-core CPUs due to global lock contention
Higher TCO, because software overhead forces you to add nodes to reach the same performance targets
Why alternatives fail
Cassandra's JVM architecture introduces non-deterministic pauses that are impossible to tune away at extreme scale
DynamoDB can become prohibitively expensive at extreme sustained throughput due to its pricing model
Relational DBs cannot scale horizontally for writes without complex, manual application-level sharding
Mental Model
The Intuition
Imagine a massive warehouse. Instead of every worker crowding through one giant door and waiting for a manager to clear the path, every single worker has their own private door, their own forklift, and their own section of shelving. They never have to talk to or wait for each other.
Key Mechanics
Shard-per-core architecture: Each CPU core manages its own memory and NIC queues
Seastar engine: An asynchronous, non-blocking programming framework
Direct I/O: Bypassing the Linux Page Cache to avoid kernel-level locking
Autonomous self-optimizing background tasks: Dynamic controllers for compaction and repair
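The shard-per-core mechanic can be sketched in miniature: hash each partition key to exactly one shard, so only one core ever owns a given partition and no locks are needed. This is an illustration only; Scylla actually assigns ownership via Murmur3 partitioner tokens, not the MD5 hash used here.

```python
import hashlib

# Toy model of shard-per-core routing (illustration only; Scylla uses
# Murmur3 tokens). Each partition key maps deterministically to one
# shard, so only that core ever touches the partition: no locks, no
# cross-core contention.

NUM_SHARDS = 8  # e.g. one shard per physical core

def shard_for_key(partition_key: str) -> int:
    """Map a partition key to its owning shard deterministically."""
    digest = hashlib.md5(partition_key.encode()).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % NUM_SHARDS

# Every request for the same key lands on the same shard.
assert shard_for_key("user:42") == shard_for_key("user:42")

# Distinct keys spread across all shards.
print(sorted({shard_for_key(f"user:{i}") for i in range(1000)}))
```

The same idea extends down the stack: each shard also owns its slice of memory and its own NIC queue, which is why adding cores scales near-linearly.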
Framework
When it's the best choice
- High-throughput write-heavy workloads
- Predictable low-latency requirements at scale
- Large-scale Cassandra migrations where API compatibility is required
When to avoid
- Strong consistency requirements (ACID) across multiple rows
- Rapidly changing schemas that require frequent full-table scans
- Low-budget projects where managed serverless services suffice
Fast Heuristics
Tradeoffs
Strengths
- Up to 10x throughput compared to Apache Cassandra
- Consistent low latency (no GC pauses)
- Lower TCO due to higher data density per node
- Drop-in compatibility with Cassandra (CQL) and DynamoDB APIs
Weaknesses
- Steep learning curve for partition-key-centric data modeling
- Compaction strategy tuning is critical and can be complex
- C++ memory management makes debugging 'weird' crashes harder than in Java-based systems
Alternatives
Apache Cassandra
When it wins
When the team has deep existing Java expertise and hasn't hit performance ceilings.
Key Difference
JVM-based, thread-per-request model vs. Scylla's shard-per-core C++ model.
DynamoDB
When it wins
When operational overhead must be zero and the workload is highly bursty.
Key Difference
Fully managed serverless vs. Scylla's cluster-based architecture.
Aerospike
When it wins
When the primary use case is a pure key-value store with an extreme focus on Flash/NVMe optimization.
Key Difference
Optimized for KV lookups rather than the wide-column/CQL flexibility of Scylla.
Execution
Must-hit talking points
- Mention 'Shard-per-core' architecture and how it eliminates lock contention
- Discuss the 'Seastar' framework and its shared-nothing approach
- Explain why bypassing the Linux Page Cache (Direct I/O) is critical for NVMe performance
- Differentiate between Compaction Strategies (STCS vs LCS) and their impact on write amplification
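The STCS-vs-LCS talking point reduces to write amplification (WA): LCS rewrites data at every level as it trickles down (bounded read amplification, higher WA), while STCS rewrites a record roughly once per size tier it passes through. A back-of-envelope sketch, with illustrative fanout and tier ratios only (real WA depends on workload and tuning):

```python
import math

# Rough write-amplification estimates (illustrative figures only).
# STCS: a record is rewritten about once per size tier it crosses.
# LCS: a record can be rewritten up to `fanout` times per level as it
# moves down, in exchange for reads touching far fewer SSTables.

def stcs_write_amp(total_bytes, memtable_bytes, tier_ratio=4):
    # Number of size tiers between a flushed memtable and the largest SSTable.
    return max(1, math.ceil(math.log(total_bytes / memtable_bytes, tier_ratio)))

def lcs_write_amp(total_bytes, level0_bytes, fanout=10):
    levels = max(1, math.ceil(math.log(total_bytes / level0_bytes, fanout)))
    return levels * fanout  # pessimistic: full fanout rewrite per level

print(stcs_write_amp(1 << 40, 1 << 27))  # ~1 TB table, 128 MB memtables -> 7
print(lcs_write_amp(1 << 40, 1 << 29))   # ~1 TB table, 512 MB L0 -> 40
```

The takeaway for an interview: LCS buys predictable reads at the cost of several times more write I/O, which is why write-heavy workloads usually stay on size-tiered (or Scylla's incremental) compaction.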
Anticipate follow-ups
- Q: How does Scylla handle 'Hot Partitions' through its per-core scheduler?
- Q: Explain the trade-offs of Lightweight Transactions (LWT) using Paxos or Raft.
- Q: How do you handle cross-datacenter replication and its CAP theorem implications?
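For the LWT follow-up, the contract worth stating is conditional apply: the write succeeds only if the `IF` condition holds, and the response reports whether it was applied. A toy in-process model of `INSERT ... IF NOT EXISTS` semantics follows; it shows the contract but none of the Paxos/Raft round trips that make real LWT several times slower than a plain write:

```python
# Toy compare-and-set illustrating LWT (`IF NOT EXISTS`) semantics.
# Real LWT pays extra consensus round trips (Paxos, or Raft-based in
# newer Scylla); this single-process model only demonstrates the
# conditional-update contract, not the coordination cost.

class Table:
    def __init__(self):
        self.rows = {}

    def insert_if_not_exists(self, key, value):
        """INSERT ... IF NOT EXISTS: applied only when the key is absent."""
        if key in self.rows:
            return False, self.rows[key]  # [applied]=false plus the current row
        self.rows[key] = value
        return True, value

t = Table()
print(t.insert_if_not_exists("user:1", "alice"))  # (True, 'alice')
print(t.insert_if_not_exists("user:1", "bob"))    # (False, 'alice') -- lost the race
```

The losing writer gets the existing row back, which is exactly how CQL reports a failed LWT: an `[applied]` column of `false` alongside the current values.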
Red Flags
Using 'ALLOW FILTERING' in production queries.
Why it fails: It forces a full cluster scan, defeating the purpose of partition-key-based indexing and causing massive latency spikes.
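The cost gap is easy to quantify with a toy model: a partition-key lookup touches one partition, while a filtered scan (what `ALLOW FILTERING` permits) must read every row in the table. The table shape and sizes below are invented for illustration:

```python
# Toy cost model: partition-key lookup vs. full filtered scan.
# 10,000 partitions; 1 in 100 rows matches the filter predicate.

table = {f"p{i}": [{"pk": f"p{i}", "city": "NYC" if i % 100 == 0 else "SF"}]
         for i in range(10_000)}

def lookup_by_partition(pk):
    rows = table.get(pk, [])
    return rows, len(rows)          # rows, rows_read

def scan_with_filter(predicate):
    rows_read, hits = 0, []
    for rows in table.values():     # must visit every partition
        for row in rows:
            rows_read += 1
            if predicate(row):
                hits.append(row)
    return hits, rows_read

_, read_indexed = lookup_by_partition("p100")
_, read_scanned = scan_with_filter(lambda r: r["city"] == "NYC")
print(read_indexed, read_scanned)   # 1 vs 10000: the scan reads the whole table
```

On a real cluster the scan also fans out to every node and every shard, so the latency hit compounds with cluster size.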
Ignoring the Partition Key design and creating 'fat' partitions.
Why it fails: Poorly chosen keys lead to hot shards, where one CPU core is pegged at 100% while others are idle, creating a system-wide bottleneck.
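The hot-shard failure mode can be simulated directly: under shard-per-core routing, a single over-popular partition key pins its entire load on one core. The 90/10 skew below is an invented workload, and the MD5-based routing is a stand-in for Scylla's Murmur3 tokens:

```python
import hashlib
from collections import Counter

# Simulated hot partition: one "celebrity" key receives 90% of traffic,
# so the single shard that owns it absorbs almost all the work while
# the other cores sit near idle.

NUM_SHARDS = 8

def shard_for_key(pk: str) -> int:
    return int.from_bytes(hashlib.md5(pk.encode()).digest()[:8], "big") % NUM_SHARDS

requests = ["user:celebrity"] * 9000 + [f"user:{i}" for i in range(1000)]
load = Counter(shard_for_key(pk) for pk in requests)

print(load)  # one shard carries >90% of the 10,000 requests
```

This is why partition keys should be chosen (or composed with a bucketing suffix) so that no single key value dominates the write or read distribution.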