Name: InterviewGPT
Rating: 4.8 (100 reviews)

The Question

Distributed Web Crawler Design

Design a highly scalable, distributed system capable of crawling and indexing a significant portion of the web. The system must efficiently manage URL discovery, prioritize content fetching, and strictly adhere to website-specific politeness policies while handling petabytes of data.

Redis

Bloom Filter

PostgreSQL

Distributed Workers
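
As one possible starting point for the URL deduplication and politeness requirements in the question, here is a minimal single-process sketch. It assumes a Bloom filter for "already seen" checks and a fixed per-host delay as the politeness policy. The class names (BloomFilter, Frontier), the delay_seconds parameter, and the example URLs are all hypothetical and chosen only for illustration. In a distributed design these in-memory structures would likely be replaced by shared stores, and the politeness rule would also need to respect each site's robots.txt.

```python
import hashlib
import heapq
from urllib.parse import urlsplit


class BloomFilter:
    """Fixed-size Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


class Frontier:
    """URL frontier that drops repeated URLs and spaces out fetches per host."""

    def __init__(self, delay_seconds=1.0):
        self.delay_seconds = delay_seconds  # assumed fixed politeness delay
        self.seen = BloomFilter()
        self.heap = []          # entries: (earliest_fetch_time, sequence, url)
        self.next_allowed = {}  # host -> earliest time its next fetch may start
        self.sequence = 0       # tie-breaker that preserves insertion order

    def add(self, url, now):
        """Schedule url; return False if the filter reports it as already seen."""
        if url in self.seen:
            return False
        self.seen.add(url)
        host = urlsplit(url).netloc
        ready_at = max(now, self.next_allowed.get(host, now))
        # Simplification: the host's slot is reserved at enqueue time.
        self.next_allowed[host] = ready_at + self.delay_seconds
        heapq.heappush(self.heap, (ready_at, self.sequence, url))
        self.sequence += 1
        return True

    def pop_ready(self, now):
        """Return the next URL whose scheduled time has arrived, else None."""
        if self.heap and self.heap[0][0] <= now:
            return heapq.heappop(self.heap)[2]
        return None


if __name__ == "__main__":
    frontier = Frontier(delay_seconds=2.0)
    frontier.add("https://a.example/1", now=0.0)
    frontier.add("https://a.example/2", now=0.0)  # same host, scheduled later
    frontier.add("https://b.example/1", now=0.0)
    print(frontier.pop_ready(now=0.0))
    print(frontier.pop_ready(now=0.0))
    print(frontier.pop_ready(now=0.0))
```

Because a Bloom filter can report false positives, this sketch may occasionally skip a URL it has never fetched. That trade-off buys a fixed memory footprint, which is usually the reason to choose the structure at web scale.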