Here are 5 SQL questions that often appear in Data Scientist Online Assessments (OAs), especially from companies like Meta, Amazon, Google, and fintech/startups. Each includes a short problem, a sample schema, and a clean, optimized SQL solution with explanations.
1. Find the Top 3 Products by Sales in Each Category
-- Tables
Products(product_id, product_name, category)
Sales(sale_id, product_id, sale_date, revenue)
-- Solution
SELECT category, product_id, total_revenue
FROM (
SELECT
p.category,
s.product_id,
SUM(s.revenue) AS total_revenue,
RANK() OVER (PARTITION BY p.category ORDER BY SUM(s.revenue) DESC) AS rnk
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.category, s.product_id
) ranked
WHERE rnk <= 3
ORDER BY category, total_revenue DESC;
Key Concepts: RANK(), partitioning, aggregation.
2. Find Customers Who Never Purchased
-- Tables
Customers(customer_id, name)
Orders(order_id, customer_id, order_date)
-- Solution
SELECT c.customer_id, c.name
FROM Customers c
LEFT JOIN Orders o
ON c.customer_id = o.customer_id
WHERE o.customer_id IS NULL;
Key Concepts: LEFT JOIN +
IS NULL filtering to find missing relations.
3. Running Total of Daily Active Users
-- Tables
UserActivity(user_id, activity_date)
-- Solution
WITH daily AS (
SELECT activity_date, COUNT(DISTINCT user_id) AS daily_users
FROM UserActivity
GROUP BY activity_date
)
SELECT
activity_date,
SUM(daily_users) OVER (ORDER BY activity_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM daily
ORDER BY activity_date;
4. Second Highest Salary
-- Table
Employee(id, name, salary)
-- Solution (ANSI SQL)
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
OFFSET 1 ROW FETCH NEXT 1 ROW ONLY;
-- MySQL alternative
SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee);
5. Find Users Who Made Purchases on Consecutive Days
-- Table
Purchases(user_id, purchase_date)
-- Solution
SELECT DISTINCT p1.user_id
FROM Purchases p1
JOIN Purchases p2
ON p1.user_id = p2.user_id
AND p2.purchase_date = p1.purchase_date + INTERVAL '1 day';
Summary: These problems exercise window functions, joins, aggregation, subqueries, and date arithmetic — all common in data scientist OAs.