gitGood.dev

SQL Interview Questions That Actually Get Asked in 2026

Pat
19 min read

Here's something that surprises a lot of people: SQL is more relevant in 2026 than it was ten years ago. Every AI/ML pipeline, every analytics dashboard, every backend service - they all talk to a database at some point. And when you sit down for a technical interview at any company that touches data (which is all of them), SQL is going to come up.

But here's the thing - most SQL interview prep guides are stuck in 2018. They'll drill you on obscure syntax nobody uses, while skipping the window functions and CTEs that interviewers actually love to ask about.

This guide is different. These are the questions that actually show up in interviews at companies ranging from startups to FAANG. I've organized them into tiers so you can focus on what matters for your target role.

We'll use a consistent schema throughout, so you can follow along:

-- Our sample schema
CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(100),
    budget DECIMAL(12, 2)
);

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    department_id INT REFERENCES departments(department_id),
    manager_id INT REFERENCES employees(employee_id),
    salary DECIMAL(10, 2),
    hire_date DATE
);

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100),
    category VARCHAR(50),
    price DECIMAL(10, 2)
);

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    employee_id INT REFERENCES employees(employee_id),
    product_id INT REFERENCES products(product_id),
    quantity INT,
    order_date DATE,
    total_amount DECIMAL(10, 2)
);

Let's get into it.

Tier 1: The Fundamentals (Everyone Gets These)

Every SQL interview starts here. If you stumble on these, it's a red flag - no matter what role you're interviewing for. The good news is they're straightforward once you've practiced them a few times.

SELECT, WHERE, GROUP BY, HAVING, ORDER BY

These are the building blocks. You need to know the order of execution (which is different from the order you write them):

  1. FROM - pick the table
  2. WHERE - filter rows
  3. GROUP BY - group remaining rows
  4. HAVING - filter groups
  5. SELECT - pick columns
  6. ORDER BY - sort results

This matters because it explains why you can't use a column alias in WHERE but you can in ORDER BY. Interviewers love testing whether you understand this.
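A quick illustration (annual_salary is an alias invented for this example):

```sql
-- Fails in most databases: WHERE runs before SELECT,
-- so the alias doesn't exist yet
SELECT salary * 12 AS annual_salary
FROM employees
WHERE annual_salary > 100000;

-- Works: ORDER BY runs after SELECT, so the alias is visible
SELECT salary * 12 AS annual_salary
FROM employees
ORDER BY annual_salary DESC;
```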

JOINs

You need to know four types cold:

  • INNER JOIN - only matching rows from both tables
  • LEFT JOIN - all rows from the left table, matching rows from the right (NULLs where no match)
  • RIGHT JOIN - opposite of LEFT JOIN (rarely used in practice - just flip your tables)
  • FULL OUTER JOIN - all rows from both tables, NULLs where no match on either side

The key insight: LEFT JOIN is the workhorse. You'll use it probably 80% of the time in real work.

Aggregations

COUNT, SUM, AVG, MIN, MAX - know them all. The tricky part is understanding COUNT(*) vs COUNT(column) vs COUNT(DISTINCT column):

  • COUNT(*) counts all rows, including rows whose columns contain NULLs
  • COUNT(column) counts non-NULL values in that column
  • COUNT(DISTINCT column) counts unique non-NULL values
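One query makes the difference concrete (the comments assume a hypothetical employees table with 10 rows, 2 of which have a NULL department_id):

```sql
SELECT
    COUNT(*) AS total_rows,                       -- 10: every row counts
    COUNT(department_id) AS non_null_depts,       -- 8: NULL values are skipped
    COUNT(DISTINCT department_id) AS unique_depts -- number of distinct non-NULL ids
FROM employees;
```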

NULL Handling

NULLs trip up a lot of candidates. Remember:

  • NULL = NULL does not evaluate to TRUE - it evaluates to NULL (UNKNOWN), which WHERE treats as false
  • Use IS NULL / IS NOT NULL for comparisons
  • COALESCE(a, b, c) returns the first non-NULL value
  • NULLIF(a, b) returns NULL if a equals b (great for avoiding division by zero)
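Here are both helpers in one query - a sketch against our sample schema:

```sql
SELECT
    d.department_name,
    -- COALESCE: show 0 instead of NULL for departments with no employees
    COALESCE(SUM(e.salary), 0) AS total_salary,
    -- NULLIF: if budget is 0, the divisor becomes NULL and the result
    -- is NULL rather than a division-by-zero error
    COALESCE(SUM(e.salary), 0) / NULLIF(d.budget, 0) AS spend_ratio
FROM departments d
LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_name, d.budget;
```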

Fundamental Interview Questions

Q1: "Find all employees who don't belong to any department."

SELECT employee_id, first_name, last_name
FROM employees
WHERE department_id IS NULL;

Simple, but the interviewer is checking that you use IS NULL instead of = NULL. You'd be surprised how many people get this wrong.

Q2: "For each department, find the number of employees and the average salary. Only show departments with more than 5 employees, sorted by average salary descending."

SELECT
    d.department_name,
    COUNT(e.employee_id) AS employee_count,
    ROUND(AVG(e.salary), 2) AS avg_salary
FROM departments d
LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_name
HAVING COUNT(e.employee_id) > 5
ORDER BY avg_salary DESC;

This tests GROUP BY, HAVING, JOINs, and aggregations all at once. One subtlety: because HAVING COUNT(e.employee_id) > 5 already filters out any department with five or fewer employees, the LEFT JOIN here ends up behaving exactly like an INNER JOIN - the LEFT JOIN would only matter if you also wanted zero-employee departments in the output. Pointing that out is a nice touch. Also note that HAVING uses COUNT(e.employee_id) rather than the alias, since HAVING is evaluated before SELECT in most databases.

Q3: "List all employees along with their department name. Include employees with no department and departments with no employees."

SELECT
    e.first_name,
    e.last_name,
    COALESCE(d.department_name, 'Unassigned') AS department_name
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.department_id;

This checks that you know FULL OUTER JOIN and can use COALESCE to handle NULLs gracefully.

Q4: "Find the total revenue per product category, but only for orders placed in 2025. Replace any NULL category with 'Uncategorized'."

SELECT
    COALESCE(p.category, 'Uncategorized') AS category,
    SUM(o.total_amount) AS total_revenue
FROM orders o
INNER JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= '2025-01-01'
  AND o.order_date < '2026-01-01'
GROUP BY COALESCE(p.category, 'Uncategorized')
ORDER BY total_revenue DESC;

Common mistake here: using YEAR(order_date) = 2025. That returns the right rows, but wrapping the column in a function makes the predicate non-sargable, so the database can't use an index on order_date. Range comparisons keep the index usable, and mentioning that in an interview earns you points.

Tier 2: Intermediate (Most Interviews Hit These)

This is where interviews get interesting. If you're going for any backend, data, or full-stack role, expect questions at this level. Window functions in particular have become a favorite among interviewers - they separate candidates who've just memorized syntax from those who actually think in SQL.

Subqueries vs JOINs

Most subqueries can be rewritten as JOINs and vice versa. Know when to use each:

  • JOINs are generally more readable and often better optimized by the query planner
  • Subqueries are useful when you need to filter based on an aggregate, or when the subquery result is small
  • Correlated subqueries (where the subquery references the outer query) can be slow - the database might execute them once per row
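A concrete example of the tradeoff - finding employees who have placed at least one order, written both ways against our schema:

```sql
-- Subquery with EXISTS: can stop scanning orders at the first match
SELECT e.first_name, e.last_name
FROM employees e
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.employee_id = e.employee_id
);

-- JOIN version: needs DISTINCT, or you get one row per order
SELECT DISTINCT e.first_name, e.last_name
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id;
```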

CTEs (Common Table Expressions)

CTEs using the WITH clause are one of the best things to happen to SQL readability. They let you break complex queries into named, logical steps:

WITH monthly_sales AS (
    SELECT
        employee_id,
        DATE_TRUNC('month', order_date) AS month,
        SUM(total_amount) AS revenue
    FROM orders
    GROUP BY employee_id, DATE_TRUNC('month', order_date)
)
SELECT * FROM monthly_sales WHERE revenue > 10000;

Use CTEs when your query has multiple logical steps. They make your SQL readable and debuggable, which interviewers love to see.

Window Functions

This is the big one. Window functions let you perform calculations across rows related to the current row without collapsing them into groups. The key players:

  • ROW_NUMBER() - unique sequential number for each row
  • RANK() - rank with gaps (1, 2, 2, 4)
  • DENSE_RANK() - rank without gaps (1, 2, 2, 3)
  • LAG(column, n) - value from n rows before
  • LEAD(column, n) - value from n rows after
  • SUM(column) OVER (...) - running or partitioned sum

The OVER clause is what makes it a window function. You can PARTITION BY (like GROUP BY but without collapsing rows) and ORDER BY within the window.
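Here's the ranking trio side by side; the comments show how a two-way tie for second place plays out:

```sql
SELECT
    first_name,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,  -- 1, 2, 3, 4 (arbitrary among ties)
    RANK()       OVER (ORDER BY salary DESC) AS rnk,      -- 1, 2, 2, 4 (gap after ties)
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk -- 1, 2, 2, 3 (no gap)
FROM employees;
```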

CASE Statements

CASE is SQL's if/else. You'll use it for conditional logic inside queries:

SELECT
    first_name,
    salary,
    CASE
        WHEN salary >= 150000 THEN 'Senior'
        WHEN salary >= 100000 THEN 'Mid'
        ELSE 'Junior'
    END AS salary_band
FROM employees;

Self-Joins

A self-join is just a table joined to itself. The classic use case is finding an employee's manager (since both are in the employees table).

Intermediate Interview Questions

Q5: "For each employee, show their name, salary, and the rank of their salary within their department (highest salary = rank 1)."

SELECT
    e.first_name,
    e.last_name,
    d.department_name,
    e.salary,
    RANK() OVER (
        PARTITION BY e.department_id
        ORDER BY e.salary DESC
    ) AS salary_rank
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

The interviewer wants to see that you know PARTITION BY vs ORDER BY inside window functions. They might also ask when you'd use DENSE_RANK instead - the answer is when you don't want gaps in ranking (if two people tie for #2, the next person is #3 with DENSE_RANK, but #4 with RANK).

Q6: "Find employees who earn more than the average salary of their department."

-- Approach 1: Using a CTE
WITH dept_avg AS (
    SELECT
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT
    e.first_name,
    e.last_name,
    e.salary,
    ROUND(da.avg_salary, 2) AS dept_avg_salary
FROM employees e
JOIN dept_avg da ON e.department_id = da.department_id
WHERE e.salary > da.avg_salary;

-- Approach 2: Using a window function
SELECT first_name, last_name, salary, dept_avg_salary
FROM (
    SELECT
        first_name,
        last_name,
        salary,
        AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_salary
    FROM employees
) sub
WHERE salary > dept_avg_salary;

Showing both approaches demonstrates flexibility. Mention that the CTE version is often easier to read, while the window function version avoids a second pass over the data.

Q7: "Show each employee's name, their order total for each month, and the previous month's order total."

WITH monthly_totals AS (
    SELECT
        e.employee_id,
        e.first_name,
        e.last_name,
        DATE_TRUNC('month', o.order_date) AS order_month,
        SUM(o.total_amount) AS monthly_total
    FROM employees e
    JOIN orders o ON e.employee_id = o.employee_id
    GROUP BY e.employee_id, e.first_name, e.last_name, DATE_TRUNC('month', o.order_date)
)
SELECT
    first_name,
    last_name,
    order_month,
    monthly_total,
    LAG(monthly_total, 1) OVER (
        PARTITION BY employee_id
        ORDER BY order_month
    ) AS prev_month_total
FROM monthly_totals
ORDER BY employee_id, order_month;

LAG and LEAD are interview favorites. The key thing to mention: they operate on the ordered result within each partition. If there's no previous row, LAG returns NULL (you can specify a default as a third argument).

Q8: "Find all employees and their direct manager's name using a self-join."

SELECT
    e.first_name AS employee_first_name,
    e.last_name AS employee_last_name,
    m.first_name AS manager_first_name,
    m.last_name AS manager_last_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

LEFT JOIN is important here - the CEO or top-level manager won't have a manager_id, and you don't want to exclude them from results.

Q9: "Classify each department's total salary spend as 'Over Budget', 'Within Budget', or 'Under Budget' (where under budget means spending less than 80% of the budget)."

SELECT
    d.department_name,
    d.budget,
    SUM(e.salary) AS total_salary_spend,
    CASE
        WHEN SUM(e.salary) > d.budget THEN 'Over Budget'
        WHEN SUM(e.salary) < d.budget * 0.8 THEN 'Under Budget'
        ELSE 'Within Budget'
    END AS budget_status
FROM departments d
LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_id, d.department_name, d.budget;

This combines aggregation, JOINs, and CASE. A common mistake is forgetting to include d.budget in the GROUP BY clause (since it's not in an aggregate function).

Tier 3: Advanced (Senior/Data Roles)

If you're interviewing for a senior backend role, a data engineering position, or anything at a company that takes databases seriously, you'll face these. This tier is less about writing fancy queries and more about understanding how databases actually work under the hood.

Query Optimization

The biggest thing interviewers test here is whether you understand execution plans and indexing:

  • EXPLAIN / EXPLAIN ANALYZE - shows you what the database is actually doing with your query
  • Indexes - B-tree indexes speed up lookups, but slow down writes. Know when to add them.
  • Covering indexes - an index that contains all the columns a query needs, so the database never has to touch the actual table
  • Composite indexes - multi-column indexes follow the "leftmost prefix" rule. An index on (a, b, c) helps queries filtering on a, or a AND b, or a AND b AND c, but NOT just b or c
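A quick sketch of the leftmost-prefix rule (idx_emp_dept_hire is a made-up index name for illustration):

```sql
CREATE INDEX idx_emp_dept_hire ON employees(department_id, hire_date);

-- Can use the index: filters start from the leftmost column
SELECT * FROM employees WHERE department_id = 3;
SELECT * FROM employees WHERE department_id = 3 AND hire_date > '2024-01-01';

-- Cannot use it efficiently: skips the leftmost column
SELECT * FROM employees WHERE hire_date > '2024-01-01';
```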

Recursive CTEs

These let you traverse hierarchical data - org charts, category trees, bill of materials. The pattern is always the same: a base case UNION ALL with a recursive step.

Transaction Isolation Levels

Know the four standard levels and what problems each one prevents:

  • READ UNCOMMITTED - dirty reads possible (rarely used)
  • READ COMMITTED - no dirty reads, but non-repeatable reads possible (PostgreSQL default)
  • REPEATABLE READ - no dirty or non-repeatable reads, but phantom reads possible (MySQL InnoDB default)
  • SERIALIZABLE - full isolation, but lowest concurrency
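Setting the level is a one-liner, though the exact syntax and placement vary by database - this is the PostgreSQL form:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ...queries in this transaction now run under REPEATABLE READ...
COMMIT;
```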

Deadlocks

Two transactions each holding a lock the other needs. How to prevent them:

  • Access tables in a consistent order
  • Keep transactions short
  • Use appropriate isolation levels
  • Consider using SELECT ... FOR UPDATE with timeouts
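That last bullet looks like this in PostgreSQL, where lock_timeout caps how long the SELECT will wait for the row lock:

```sql
-- Give up after 2 seconds instead of waiting on a lock indefinitely
SET lock_timeout = '2s';

BEGIN;
-- Lock the row up front; a concurrent transaction doing the same
-- will wait (or time out) here rather than deadlocking mid-update
SELECT salary FROM employees WHERE employee_id = 1 FOR UPDATE;
UPDATE employees SET salary = salary * 1.10 WHERE employee_id = 1;
COMMIT;
```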

Advanced Interview Questions

Q10: "Given the employees table with a self-referential manager_id, write a query to show the full management chain for every employee (from CEO down to individual contributor)."

WITH RECURSIVE org_chart AS (
    -- Base case: top-level employees (no manager)
    SELECT
        employee_id,
        first_name,
        last_name,
        manager_id,
        1 AS level,
        CAST(first_name || ' ' || last_name AS VARCHAR(1000)) AS chain
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive step: find direct reports
    SELECT
        e.employee_id,
        e.first_name,
        e.last_name,
        e.manager_id,
        oc.level + 1,
        CAST(oc.chain || ' > ' || e.first_name || ' ' || e.last_name AS VARCHAR(1000))
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT employee_id, first_name, last_name, level, chain
FROM org_chart
ORDER BY chain;

Recursive CTEs are one of those topics that looks intimidating but follows a simple pattern. Always explain the base case and the recursive step separately. The interviewer will also likely ask about safeguards against infinite recursion - mention MAXRECURSION in SQL Server or adding a depth limit in your WHERE clause.

Q11: "A query that joins employees, orders, and products is running slowly. Walk me through how you'd diagnose and fix it."

This is more of a discussion question, but here's a structured answer:

-- Step 1: Look at the execution plan
EXPLAIN ANALYZE
SELECT e.first_name, p.product_name, SUM(o.total_amount)
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= '2025-01-01'
GROUP BY e.first_name, p.product_name;

-- Step 2: Check for missing indexes
-- Look for sequential scans on large tables in the EXPLAIN output

-- Step 3: Add targeted indexes
CREATE INDEX idx_orders_employee_date ON orders(employee_id, order_date);
CREATE INDEX idx_orders_product ON orders(product_id);

-- Step 4: Consider a covering index if this query runs frequently
CREATE INDEX idx_orders_covering ON orders(
    employee_id, product_id, order_date
) INCLUDE (total_amount);

Walk through the thought process: check the plan, identify full table scans, add indexes on join columns and filter columns, consider covering indexes for hot queries. Mention that you'd also check table statistics are up to date (ANALYZE in PostgreSQL).

Q12: "Simulate a PIVOT to show total order amounts per employee per product category, with categories as columns."

SELECT
    e.first_name,
    e.last_name,
    SUM(CASE WHEN p.category = 'Electronics' THEN o.total_amount ELSE 0 END) AS electronics,
    SUM(CASE WHEN p.category = 'Clothing' THEN o.total_amount ELSE 0 END) AS clothing,
    SUM(CASE WHEN p.category = 'Food' THEN o.total_amount ELSE 0 END) AS food,
    SUM(CASE WHEN p.category = 'Software' THEN o.total_amount ELSE 0 END) AS software
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
JOIN products p ON o.product_id = p.product_id
GROUP BY e.employee_id, e.first_name, e.last_name
ORDER BY e.last_name;

Not every database supports PIVOT syntax natively (PostgreSQL doesn't, MySQL doesn't). The conditional aggregation pattern above works everywhere. The downside is you need to know the categories ahead of time - for dynamic pivoting, you'd typically need application-level code or dynamic SQL.

Q13: "Explain what would happen if two transactions simultaneously try to update the same employee's salary. How does isolation level affect this?"

This is a conceptual question. Here's the kind of answer that impresses:

-- Transaction A
BEGIN;
UPDATE employees SET salary = salary * 1.10 WHERE employee_id = 1;
-- hasn't committed yet...

-- Transaction B (runs concurrently)
BEGIN;
UPDATE employees SET salary = salary * 1.05 WHERE employee_id = 1;
-- This will BLOCK until Transaction A commits or rolls back

At READ COMMITTED (Postgres default): Transaction B waits for A to commit, then applies its update to A's committed value. No lost update.

At REPEATABLE READ (MySQL InnoDB default): Transaction B still waits, but what happens after A commits depends on the database. MySQL InnoDB lets B's update proceed against A's committed value. PostgreSQL detects that the row changed after B's snapshot was taken and aborts B with a serialization error ("could not serialize access due to concurrent update"), so the application must retry.

At SERIALIZABLE: The database would detect the conflict and roll back one transaction, forcing a retry.

The key point to make: in practice, you should design your application to handle retries gracefully, and keep transactions as short as possible to minimize lock contention.

Scenario Questions (The Ones That Really Test You)

These are the open-ended, "real world" questions that senior interviewers love. There's often more than one valid solution - what matters is your approach and whether you can explain the tradeoffs.

Scenario 1: "Find users who placed an order in every month of Q4 2025"

This is a classic "relational division" problem. You need to find employees whose distinct count of order months equals the number of months in the quarter.

WITH q4_orders AS (
    SELECT
        employee_id,
        DATE_TRUNC('month', order_date) AS order_month
    FROM orders
    WHERE order_date >= '2025-10-01'
      AND order_date < '2026-01-01'
    GROUP BY employee_id, DATE_TRUNC('month', order_date)
)
SELECT
    e.employee_id,
    e.first_name,
    e.last_name
FROM employees e
JOIN q4_orders q ON e.employee_id = q.employee_id
GROUP BY e.employee_id, e.first_name, e.last_name
HAVING COUNT(DISTINCT q.order_month) = 3;

Why this works: We first get distinct employee-month combinations, then count them per employee. If an employee has orders in all 3 months (October, November, December), they show up. The = 3 is the number of months in Q4.

Common mistake: forgetting to deduplicate months first. If you just COUNT orders, you'll get the wrong answer for anyone with multiple orders in a single month.

Scenario 2: "Calculate a running total of order amounts per employee, ordered by date"

Running totals are a bread-and-butter window function use case:

SELECT
    e.first_name,
    e.last_name,
    o.order_date,
    o.total_amount,
    SUM(o.total_amount) OVER (
        PARTITION BY o.employee_id
        ORDER BY o.order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
ORDER BY e.employee_id, o.order_date;

A note on the frame clause: when a window function has an ORDER BY, the default frame is actually RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, not ROWS. Spelling out the ROWS frame shows the interviewer you understand window frames, and it genuinely changes the result when dates tie. They might follow up by asking for a 3-month rolling average over monthly totals, which would use ROWS BETWEEN 2 PRECEDING AND CURRENT ROW.

A subtle gotcha: if two orders have the same date, ROWS and RANGE behave differently. ROWS treats each row individually, while RANGE would include all rows with the same date value. For running totals, ROWS is usually what you want.
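Here's a sketch of that rolling-average follow-up, assuming the monthly_totals CTE from Q7 (one row per employee per month) is in scope:

```sql
SELECT
    employee_id,
    order_month,
    monthly_total,
    -- Average over this month and the two before it, per employee.
    -- The first two months average over fewer than 3 rows.
    AVG(monthly_total) OVER (
        PARTITION BY employee_id
        ORDER BY order_month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS rolling_3mo_avg
FROM monthly_totals;
```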

Scenario 3: "Find the second-highest salary in each department"

This is probably the most-asked SQL interview question of all time, and for good reason - there are several valid approaches and each reveals something about how you think.

-- Approach 1: Using DENSE_RANK (cleanest)
WITH ranked_salaries AS (
    SELECT
        e.first_name,
        e.last_name,
        e.salary,
        d.department_name,
        DENSE_RANK() OVER (
            PARTITION BY e.department_id
            ORDER BY e.salary DESC
        ) AS salary_rank
    FROM employees e
    JOIN departments d ON e.department_id = d.department_id
)
SELECT first_name, last_name, salary, department_name
FROM ranked_salaries
WHERE salary_rank = 2;

-- Approach 2: Using a correlated subquery (more traditional)
SELECT
    e.first_name,
    e.last_name,
    e.salary,
    d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE e.salary = (
    SELECT MAX(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
      AND e2.salary < (
          SELECT MAX(e3.salary)
          FROM employees e3
          WHERE e3.department_id = e.department_id
      )
);

Use DENSE_RANK over RANK here. If two employees tie for the highest salary, DENSE_RANK still assigns rank 2 to the next salary down. With RANK, the next salary would be rank 3 and your query would return nothing.

The correlated subquery approach works but is harder to read and typically slower. Showing both and explaining why you prefer the window function version is a strong move.

How to Actually Prepare

Knowing the answers isn't enough - you need to be able to write SQL under pressure, on a whiteboard or in a shared editor. Here's how to make these concepts stick:

Practice on real data. Set up a PostgreSQL or MySQL instance (Docker makes this trivial) and load a sample dataset. Write queries against it. See the results. Break things. The muscle memory matters.

Talk through your approach. In an interview, thinking out loud is more important than getting the syntax perfect. Start with "I need to join these tables, group by this, filter on that" before you write a single line.

Know your dialect. PostgreSQL, MySQL, and SQL Server have differences. Know which one the company uses and be aware of syntax quirks (e.g., LIMIT vs TOP, ILIKE vs LOWER() LIKE, string concatenation operators).
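For example, "top 10 earners" is spelled differently across the major dialects:

```sql
-- PostgreSQL / MySQL / SQLite
SELECT * FROM employees ORDER BY salary DESC LIMIT 10;

-- SQL Server
SELECT TOP 10 * FROM employees ORDER BY salary DESC;

-- Standard SQL (works in PostgreSQL; SQL Server wants an OFFSET clause first)
SELECT * FROM employees ORDER BY salary DESC FETCH FIRST 10 ROWS ONLY;
```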

Understand execution order. FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT. This explains so many "weird" SQL behaviors. When an interviewer asks why something doesn't work, the answer is usually execution order.

Practice the hard problems on a timer. Give yourself 15 minutes per problem. If you can solve Tier 2 problems in that time, you're in good shape for most interviews. Tier 3 problems might take longer, and that's fine - the interviewer is usually looking for your approach more than a complete, perfect answer.

Don't memorize - understand. If you understand that a window function is just an aggregation that doesn't collapse rows, you can derive the syntax. If you understand that a CTE is just a named subquery, you'll know when to use one. The patterns are more important than the specifics.

SQL interviews in 2026 are less about trick questions and more about problem-solving. Companies want to see that you can take a business requirement, translate it into a query, and reason about performance. If you can do the problems in this guide comfortably, you're well-prepared for the vast majority of SQL interviews out there.

Now go fire up a database and start practicing. You've got this.