DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
The Question
SQL

Consecutive High Attendance Records

You are given a table Stadium containing stadium visit data with columns id (integer), visit_date (date), and people (integer). The id is an auto-incrementing value that corresponds with the chronological order of the visit_date. Write a SQL query to identify all sequences of three or more consecutive IDs where each visit in that sequence had a population of 100 or more people. The result should include the original row details and be ordered by the visit date.
PostgreSQL
CTE
Window Function
Gaps and Islands
Questions & Insights

Clarifying Questions

Are there gaps in the `id` sequence? The problem states "consecutive IDs." In Gaps and Islands problems, we must clarify if "consecutive" means id, id+1, id+2 or simply the chronological order. Based on the prompt "As the id increases, the date increases," we will treat the integer id as the sequence generator.
Can the same date have multiple entries? The prompt states visit_date has unique values, so it is a natural key or a unique constraint.
What is the scale of the data? For a stadium tracking daily visits, the dataset is likely in the magnitude of thousands or tens of thousands of rows. Standard window functions in PostgreSQL are highly efficient for this volume.
What if a sequence is longer than 3? The requirement is "three or more." Any row belonging to a cluster of \ge 3 consecutive IDs meeting the criteria must be returned.
Assumptions:
Primary Key:id is the unique identifier and follows a strictly increasing chronological order.
Table Type: This is an Event Fact Table capturing attendance metrics per day.
Data Quality: We assume people is non-null. If NULLs exist, they would be excluded by the >= 100 predicate.

Thinking Process

To solve this "Gaps and Islands" problem, I will follow a logical progression:
Filter First: We only care about days where people >= 100. Filtering early reduces the dataset for window function processing.
Identify Islands: Once filtered, the id values will no longer be strictly consecutive if some days had < 100 people. For example, if IDs are [1, 2, 3, 5, 6, 7], there is a gap at 4.
The "Difference" Trick: To group these "islands," I will subtract a monotonic sequence (ROW_NUMBER) from the id.
For id values 1, 2, 3, the row numbers are 1, 2, 3. Differences: 1-1=0, 2-2=0, 3-3=0.
For id values 5, 6, 7, the row numbers are 4, 5, 6. Differences: 5-4=1, 6-5=1, 7-6=1.
Rows sharing the same difference belong to the same consecutive block.
Aggregate & Filter: Use another window function (COUNT(*) OVER(...)) to count how many records exist in each island.
Final Selection: Filter for counts \ge 3 and sort by visit_date.
Implementation Breakdown

Problem Set

Goal: Find blocks of \ge 3 consecutive IDs where people >= 100.
Constraint 1: Attendance threshold \ge 100.
Constraint 2: Consecutive IDs (minimum 3).
Constraint 3: Order by visit_date.
Edge Cases:
Exactly 3 rows in a sequence.
More than 3 rows (e.g., 5 rows should all be returned).
Multiple separate sequences in the same table.

Approach

CTEs (Common Table Expressions): To break the logic into readable steps (Filtering -> Grouping -> Counting).
Window Functions:
ROW_NUMBER(): To generate a sequence for the difference calculation.
COUNT(...) OVER(...): To determine the size of each island without collapsing rows.
PostgreSQL Specifics: Standard ANSI SQL compliance ensures this works perfectly in PostgreSQL.
Computational Cost:O(N \log N) due to the sorting required for window functions, where N is the number of rows with \ge 100 people.

Implementation

Wrap Up

Advanced Topics

Indexing: An index on people would speed up the initial filter. However, since the final result requires an order by id (via ROW_NUMBER), a composite index on (people, id) would be ideal to allow the engine to filter and then scan in order.
Performance in PostgreSQL: If the table is massive, ROW_NUMBER() can be memory-intensive. However, for a "Stadium" table, the row count is unlikely to exceed the capacity of a standard work_mem setting.
Alternative - LEAD/LAG: One could use LEAD and LAG to check if the current row + 1 and current row + 2 meet the criteria, or current - 1 and current + 1, etc. This is often more verbose and harder to maintain than the Gaps and Islands approach, as it requires hardcoding the "3 or more" logic into multiple conditional branches.
Parallelism: PostgreSQL can parallelize the FilteredStadium scan and the window function computation if the dataset size warrants it.