The Question
SQL

Best-Selling Products per Category with Tie-Breakers

You are given two tables: 'products' (containing product_id, product_name, and category_name) and 'product_sales' (containing product_id, sales_quantity, and rating). Write a PostgreSQL query to identify the top-performing product in every category. The primary metric for performance is 'sales_quantity'. In the event of a tie in sales, the product with the higher 'rating' should be prioritized. If multiple products remain tied after checking both metrics, include all of them. The final result should display the category name and product name, sorted alphabetically by the category name.
Tags: PostgreSQL, CTE, Window Function, RANK
Questions & Insights

Clarifying Questions

Grain of the Input Tables: Is the product_sales table a transactional log (many rows per product) or a summary table (one row per product)?
Assumption: Based on the schema provided (`product_id`, `sales_quantity`, `rating`), this is a summary table where each product_id appears once with its total sales and average rating.
Handling Absolute Ties: If two products in the same category have identical sales_quantity AND identical rating, should both be returned, or just one?
Assumption: The prompt asks to include all products that remain tied, so I will use RANK(), which keeps ties. I will also mention ROW_NUMBER() in case exactly one record per category is strictly required.
Null Values: How should we handle products with NULL sales or ratings?
Assumption: We assume sales_quantity and rating are populated for active products. If NULLs do exist, where they sort is dialect-dependent: some databases treat NULL as the largest value and others as the smallest. PostgreSQL treats NULL as larger than any non-null value, so ORDER BY ... DESC places NULLs first by default. I will add NULLS LAST to the ordering so they are treated as the lowest priority.
Schema Relationships:
products (Dimension Table): product_id is the Primary Key (PK).
product_sales (Fact/Summary Table): product_id is a Foreign Key (FK) referencing products. This is a 1:1 relationship based on the summary table assumption.
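
The schema assumptions above can be sketched as DDL. This is only an illustration: the column types, the NOT NULL choices, and the use of a primary key on product_sales.product_id to enforce the one-row-per-product assumption are not given in the problem statement.

```sql
-- Hypothetical DDL: column types and constraint choices are assumptions,
-- not part of the original problem statement.
CREATE TABLE products (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT NOT NULL,
    category_name TEXT NOT NULL
);

CREATE TABLE product_sales (
    -- PRIMARY KEY enforces one summary row per product (the 1:1 assumption);
    -- REFERENCES enforces the foreign key back to products.
    product_id     INTEGER PRIMARY KEY REFERENCES products (product_id),
    sales_quantity INTEGER,        -- left nullable so NULL handling matters
    rating         NUMERIC(3, 2)   -- e.g. 4.75; precision is an assumption
);
```

If product_sales were instead a transactional log, the primary key on product_id would not hold and the sales would need to be aggregated per product before ranking.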

Thinking Process

Join Strategy: I need to combine the descriptive attributes (category_name, product_name) from the products table with the performance metrics (sales_quantity, rating) from the product_sales table. An INNER JOIN is appropriate here as a product must have sales data to be considered "best-selling."
Ranking Logic: The core of the problem is a "Top N per Group" puzzle.
Partitioning: We need to group the calculation by category_name.
Ordering: Within each category, we sort by sales_quantity descending. For ties, we sub-sort by rating descending.
Window Function Selection:
RANK(): If two products have the same sales and rating, both get rank #1.
ROW_NUMBER(): If two products tie, it arbitrarily picks one.
Decision: I will use RANK() to respect the data if a perfect tie exists, as it is the most analytically honest approach unless the business specifically asks for a single row.
Filtering: Wrap the ranking logic in a Common Table Expression (CTE), then filter for rows where the rank is 1.
Final Output: Select the required columns and apply the final alphabetical sort on category_name.
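
To make the RANK() versus ROW_NUMBER() choice concrete, here is a small self-contained comparison. The rows in the inline VALUES list are made up for illustration, and the column aliases are hypothetical names.

```sql
-- Illustrative rows only: two products tie on both sales and rating.
WITH sample (category_name, product_name, sales_quantity, rating) AS (
    VALUES
        ('Audio', 'Headphones A', 500, 4.5),
        ('Audio', 'Headphones B', 500, 4.5),
        ('Audio', 'Speaker C',    500, 4.1)
)
SELECT
    category_name,
    product_name,
    RANK()       OVER w AS rank_with_ties,      -- tied rows share a rank
    ROW_NUMBER() OVER w AS row_number_no_ties   -- every row gets a unique number
FROM sample
WINDOW w AS (
    PARTITION BY category_name
    ORDER BY sales_quantity DESC, rating DESC
);
```

Both headphone rows receive rank 1 from RANK() and the speaker receives rank 3, while ROW_NUMBER() assigns 1 and 2 to the tied pair in an unspecified order. Filtering on rank 1 therefore keeps both tied products, whereas filtering on row number 1 keeps only one of them.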
Implementation Breakdown

Problem Set

Goal: Identify the top product per category based on sales, then rating.
Constraints:
Use PostgreSQL syntax.
Secondary tie-break on rating.
Final sort on category_name ascending.
Edge Cases:
Categories with only one product.
Products with identical sales but different ratings.
Products with identical sales and identical ratings.
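
The three edge cases can be exercised with a handful of rows. The data below is hypothetical and assumes the two tables from the problem statement already exist with compatible column types.

```sql
-- Hypothetical test rows; names and numbers are made up for illustration.
INSERT INTO products (product_id, product_name, category_name) VALUES
    (1, 'Desk Lamp',    'Home'),    -- only product in its category
    (2, 'Novel X',      'Books'),   -- ties on sales, higher rating
    (3, 'Novel Y',      'Books'),   -- ties on sales, lower rating
    (4, 'Headphones A', 'Audio'),   -- ties on sales and rating
    (5, 'Headphones B', 'Audio');   -- ties on sales and rating

INSERT INTO product_sales (product_id, sales_quantity, rating) VALUES
    (1, 120, 4.0),
    (2, 300, 4.8),
    (3, 300, 4.2),
    (4, 500, 4.5),
    (5, 500, 4.5);
```

With these rows, a correct solution should return both headphones for Audio, only Novel X for Books, and Desk Lamp for Home.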

Approach

Technologies: PostgreSQL.
Window Function: RANK() to handle the partitioning and multi-level ordering.
CTEs: Used for readability and to allow filtering on the window function result (which cannot be done in a WHERE clause directly).
Join: INNER JOIN between products and product_sales.
Complexity: O(N log N) due to the sort required by the window function, where N is the number of products.

Implementation
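
A minimal sketch of the query described in the approach, using the table and column names from the problem statement. The CTE name ranked_products and the alias sales_rank are illustrative choices, and NULLS LAST reflects the assumption that missing metrics should rank lowest.

```sql
WITH ranked_products AS (
    SELECT
        p.category_name,
        p.product_name,
        -- RANK() keeps ties: products equal on both metrics share rank 1.
        RANK() OVER (
            PARTITION BY p.category_name
            ORDER BY ps.sales_quantity DESC NULLS LAST,
                     ps.rating         DESC NULLS LAST
        ) AS sales_rank
    FROM products AS p
    INNER JOIN product_sales AS ps
        ON ps.product_id = p.product_id
)
SELECT category_name, product_name
FROM ranked_products
WHERE sales_rank = 1
-- product_name is added only to make the order of tied rows deterministic.
ORDER BY category_name, product_name;
```

Swapping RANK() for ROW_NUMBER() in this sketch would return exactly one row per category, at the cost of picking arbitrarily among products that tie on both metrics.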

Wrap Up

Advanced Topics

PostgreSQL Optimization - `DISTINCT ON`: In PostgreSQL, the DISTINCT ON (expression) clause is often more performant and concise for "Top 1 per Group" problems.
    -- Returns exactly one row per category (ties are not preserved).
    SELECT DISTINCT ON (p.category_name)
           p.category_name, p.product_name
    FROM products p
    JOIN product_sales ps ON p.product_id = ps.product_id
    ORDER BY p.category_name, ps.sales_quantity DESC, ps.rating DESC;
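Note: This strictly returns only ONE row per category, even if there's a tie, and which of the tied rows is kept is not predictable. It therefore does not meet this problem's requirement to include all tied products; use it only when a single row per category is acceptable.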
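Indexing: An index on the join key product_id supports the join. A composite index on product_sales(sales_quantity DESC, rating DESC) is less likely to help on its own: the window partitions by category_name, which lives in products, so the planner will usually still need to sort the joined rows by category before ranking.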
Query Plan: The execution plan will likely show a Hash Join (if the tables are large), followed by a Sort on the partition and ordering keys and a WindowAgg step. On very large datasets PostgreSQL may parallelize parts of the plan, such as the scans and the join; check the actual plan with EXPLAIN ANALYZE rather than assuming it.
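
One way to check these indexing and plan expectations is to create the join-key index and inspect the plan directly. This is a sketch: the index name idx_product_sales_product_id and the CTE and alias names are hypothetical, and the plan you see will depend on table sizes and statistics.

```sql
-- Index on the join key. Skip this if product_id is already the primary key
-- of product_sales, since a primary key creates such an index implicitly.
CREATE INDEX IF NOT EXISTS idx_product_sales_product_id
    ON product_sales (product_id);

-- EXPLAIN ANALYZE executes the query and reports the actual plan and timings.
EXPLAIN (ANALYZE, BUFFERS)
WITH ranked_products AS (
    SELECT
        p.category_name,
        p.product_name,
        RANK() OVER (
            PARTITION BY p.category_name
            ORDER BY ps.sales_quantity DESC, ps.rating DESC
        ) AS sales_rank
    FROM products AS p
    INNER JOIN product_sales AS ps
        ON ps.product_id = p.product_id
)
SELECT category_name, product_name
FROM ranked_products
WHERE sales_rank = 1
ORDER BY category_name;
```

In the output, look for the join node type, whether a Sort appears beneath the WindowAgg node, and how many rows are discarded by the filter on sales_rank.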