The Question
Design
Unified Data Lakehouse Platform
Design a scalable data platform that processes both high-throughput real-time streams and large-scale batch workloads on a single, unified storage layer. The system must support ACID transactions and schema evolution, and must expose a SQL-based interface for analytical queries, while keeping ingestion latency low and historical storage cost-efficient.
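To make the ACID and schema-evolution requirements concrete, here is a minimal in-memory sketch of the snapshot-based commit pattern that table formats in this space are generally built around: each commit produces a new immutable snapshot, and a writer succeeds only if the snapshot it started from is still the current one. This is a toy model under stated assumptions, not the Apache Iceberg API; the names `Table`, `Snapshot`, `CommitConflict`, `append_files`, and `add_column` are hypothetical and chosen for illustration only.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Immutable view of the table: a schema plus the data files it references."""
    snapshot_id: int
    schema: tuple
    data_files: tuple


class CommitConflict(Exception):
    """Raised when a writer's base snapshot is no longer the current one."""


class Table:
    """Toy table (hypothetical, not a real library class).

    All state lives in an append-only list of snapshots; the last entry is
    the current table state that readers see.
    """

    def __init__(self, schema):
        self._lock = threading.Lock()
        self._snapshots = [Snapshot(0, tuple(schema), ())]

    def current(self):
        return self._snapshots[-1]

    def _commit(self, base, schema, data_files):
        # Optimistic concurrency: compare the writer's base with the current
        # snapshot and swap in the new one only if they still match.
        with self._lock:
            if base.snapshot_id != self.current().snapshot_id:
                raise CommitConflict(
                    f"base snapshot {base.snapshot_id} is stale; "
                    f"current is {self.current().snapshot_id}"
                )
            snap = Snapshot(base.snapshot_id + 1, tuple(schema), tuple(data_files))
            self._snapshots.append(snap)
            return snap

    def append_files(self, base, new_files):
        """Atomically add data files (for example, Parquet objects) to the table."""
        return self._commit(base, base.schema, base.data_files + tuple(new_files))

    def add_column(self, base, name):
        """Schema evolution as a metadata-only commit: no data file is rewritten."""
        if name in base.schema:
            raise ValueError(f"column {name!r} already exists")
        return self._commit(base, base.schema + (name,), base.data_files)
```

In this model a streaming writer and a batch writer would both read `table.current()`, stage their files, and call `append_files` with that base; whichever commits second gets `CommitConflict` and has to refresh its base and retry. Readers holding an older `Snapshot` keep a consistent view because snapshots are never mutated.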
Kafka
S3
Apache Iceberg
Flink
Spark
Trino
Kubernetes
Glue
Parquet
ACID
CDC
March 17, 2026