Distributed Metrics Monitoring and Aggregation System
Design a high-scale distributed system capable of ingesting, storing, and querying 10 million metric data points per second from a global fleet of microservices. The system must support real-time alerting with sub-10-second end-to-end latency and provide fast analytical queries for long-term trend visualization (up to 1 year of data). Address challenges such as high-cardinality tags, data compression, and the trade-off between write throughput and query latency in a cloud-native environment.
Kafka, ClickHouse, Flink, Redis, gRPC, Protobuf, Prometheus, S3, ZooKeeper
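The stated ingest rate and retention window imply a storage budget that is worth estimating before choosing components. The following is a rough back-of-envelope sketch, not a definitive sizing: only the ingest rate and the 1-year retention come from the problem statement, while the bytes-per-point figures, the 10-second reporting interval, and the 1-minute rollup interval are assumptions picked purely for illustration.

```python
# Back-of-envelope sizing for the metrics pipeline.
# Only INGEST_RATE and RETENTION_DAYS come from the problem statement;
# every other constant is an assumption chosen for illustration.
INGEST_RATE = 10_000_000        # data points per second (problem statement)
RETENTION_DAYS = 365            # 1 year of data (problem statement)
RAW_BYTES_PER_POINT = 16        # assumed: 8-byte timestamp + 8-byte float value
COMPRESSED_BYTES_PER_POINT = 2  # assumed: after columnar/delta-style compression
REPORT_INTERVAL_S = 10          # assumed: each series emits one point every 10 s
ROLLUP_INTERVAL_S = 60          # assumed: 1-minute rollups for long-range queries

SECONDS_PER_DAY = 86_400


def points_per_year(rate=INGEST_RATE, days=RETENTION_DAYS):
    """Total raw data points written over the retention window."""
    return rate * SECONDS_PER_DAY * days


def storage_tb(bytes_per_point, points):
    """Storage in decimal terabytes for a given per-point footprint."""
    return points * bytes_per_point / 1e12


def active_series(rate=INGEST_RATE, interval=REPORT_INTERVAL_S):
    """Distinct series needed to produce `rate` points/s at one point per interval."""
    return rate * interval


def rollup_points_per_year(series, interval=ROLLUP_INTERVAL_S, days=RETENTION_DAYS):
    """Points kept per year if each series is downsampled to one point per interval."""
    return series * (SECONDS_PER_DAY // interval) * days


if __name__ == "__main__":
    total = points_per_year()
    series = active_series()
    print(f"raw points per year:      {total:.4e}")
    print(f"raw storage (TB):         {storage_tb(RAW_BYTES_PER_POINT, total):,.2f}")
    print(f"compressed storage (TB):  {storage_tb(COMPRESSED_BYTES_PER_POINT, total):,.2f}")
    print(f"active series:            {series:,}")
    print(f"1-min rollup points/year: {rollup_points_per_year(series):.4e}")
```

Under these assumptions, a year of raw data is about 3.15e14 points, or roughly 5,000 TB uncompressed and about 630 TB at 2 bytes per point. Downsampling to 1-minute rollups cuts the point count sixfold. This is why compression and tiering older data to object storage such as S3 tend to dominate the design discussion.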