Intelligent Adaptive Rate Limiter
Design an intelligent rate-limiting system for a global-scale ML inference API (e.g., Search or Recommendations). The system must differentiate between legitimate high-burst users and malicious scrapers or DDoS actors. Focus on achieving sub-2ms decision latency while handling 500k+ QPS. Detail the data pipelines for real-time feature extraction, the model architecture for risk scoring, and how to minimize false positives through offline/online evaluation strategies. Explain how you would manage the trade-off between system protection and user experience using an ML-driven dynamic thresholding approach.
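As a starting point for the dynamic-thresholding part of the prompt, here is a minimal sketch, not a reference design. It assumes a risk score in [0, 1] is already available (for example from a LightGBM model) and that a normalized system-load signal in [0, 1] exists. All names (`dynamic_threshold`, `TokenBucket`, `base`, `floor`) and the constants (0.9, 0.5, the cost multiplier 4.0) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def dynamic_threshold(load: float, base: float = 0.9, floor: float = 0.5) -> float:
    """Risk score at or above which a request is rejected outright.

    Assumption: the threshold tightens linearly from `base` (idle system)
    to `floor` (fully loaded system). Real systems may use a learned curve.
    """
    load = min(max(load, 0.0), 1.0)  # clamp load into [0, 1]
    return base - (base - floor) * load


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size, in tokens
    refill_per_sec: float  # steady-state refill rate
    tokens: float          # tokens currently available
    last_ts: float         # timestamp of the last refill, in seconds

    def allow(self, now: float, risk: float, load: float) -> bool:
        # Refill tokens for the time elapsed since the last call.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_ts = now

        # Hard reject when risk reaches the load-dependent threshold.
        if risk >= dynamic_threshold(load):
            return False

        # Below the threshold, riskier requests cost more tokens, so a
        # low-risk bursty user keeps most of the burst allowance.
        cost = 1.0 + 4.0 * risk
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In this sketch a legitimate high-burst client with a low risk score spends about one token per request, while a borderline client drains the same bucket several times faster, and the hard-reject cutoff only becomes stricter when the system is under load. That is one way to express the protection versus user-experience trade-off; the bucket state itself would plausibly live in a shared store such as Redis.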
LightGBM, Flink, Kafka, Redis, Spark, Feast, Envoy, Prometheus
10