50 SQL Interview Questions & Answers for 2026 (All Levels)
A comprehensive list of SQL interview questions from basic SELECT statements to advanced window functions and optimization. Each question includes a clear answer with code examples you can practice before your next interview.
Why SQL Interview Prep Matters in 2026
SQL remains the most requested technical skill for data analysts, backend engineers, data engineers, and even product managers. According to Stack Overflow's 2025 survey, SQL ranked as the third most-used programming language overall and the number one language for data roles.
In 2026, SQL interview questions have shifted. Interviewers care less about memorizing syntax and more about problem-solving: can you translate a business question into a working query? Can you optimize a slow query? Can you design a schema that scales?
This guide covers 50 questions across all difficulty levels. Each answer includes the reasoning behind it, not just the query. Read through once, then practice the queries in AI2SQL with the built-in demo database.
Basic SQL Interview Questions (1-15)
These questions test your understanding of SQL fundamentals. Every candidate should nail these.
1. What is the difference between WHERE and HAVING?
WHERE filters individual rows before any grouping happens. HAVING filters groups after GROUP BY has been applied.
-- WHERE: filter rows before grouping
SELECT department, COUNT(*) AS emp_count
FROM employees
WHERE salary > 50000
GROUP BY department;
-- HAVING: filter groups after aggregation
SELECT department, COUNT(*) AS emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
2. What is the difference between DELETE, TRUNCATE, and DROP?
DELETE removes specific rows and can be rolled back. TRUNCATE removes all rows instantly and resets auto-increment counters (cannot be rolled back in most databases). DROP removes the entire table structure and all data permanently.
-- DELETE: removes rows, can use WHERE, logged
DELETE FROM orders WHERE status = 'cancelled';
-- TRUNCATE: removes all rows, faster, minimal logging
TRUNCATE TABLE temp_logs;
-- DROP: removes the entire table
DROP TABLE IF EXISTS old_backups;
3. What are the different types of JOINs?
SQL has five main JOIN types:
- INNER JOIN returns only matching rows from both tables
- LEFT JOIN returns all rows from the left table plus matching rows from the right
- RIGHT JOIN returns all rows from the right table plus matching rows from the left
- FULL OUTER JOIN returns all rows from both tables
- CROSS JOIN returns every combination (cartesian product)
-- INNER JOIN
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id;
-- LEFT JOIN: includes employees with no department
SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id;
4. What is a PRIMARY KEY?
A primary key is a column (or combination of columns) that uniquely identifies each row in a table. It enforces two constraints: uniqueness (no duplicate values) and NOT NULL (cannot be empty). Each table can have only one primary key.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL
);
5. What is the difference between UNION and UNION ALL?
UNION combines results from two queries and removes duplicate rows. UNION ALL combines results but keeps all duplicates. UNION ALL is faster because it skips the deduplication step.
-- UNION: removes duplicates
SELECT city FROM customers
UNION
SELECT city FROM suppliers;
-- UNION ALL: keeps duplicates, faster
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;
6. What does DISTINCT do?
DISTINCT removes duplicate rows from the result set. It applies to all selected columns, not just the first one.
-- Unique cities
SELECT DISTINCT city FROM customers;
-- Unique city + state combinations
SELECT DISTINCT city, state FROM customers;
7. Explain NULL in SQL. How do you check for NULL values?
NULL represents a missing or unknown value. It is not equal to zero, an empty string, or false. You cannot use = to compare NULL values. Use IS NULL or IS NOT NULL instead.
-- Correct way to check for NULL
SELECT * FROM orders WHERE shipped_date IS NULL;
-- COALESCE: return first non-null value
SELECT name, COALESCE(phone, email, 'No contact') AS contact
FROM customers;
8. What is the order of execution in a SQL query?
SQL does not execute in the order you write it. The actual execution order is:
- FROM and JOIN (identify tables)
- WHERE (filter rows)
- GROUP BY (create groups)
- HAVING (filter groups)
- SELECT (choose columns)
- DISTINCT (remove duplicates)
- ORDER BY (sort results)
- LIMIT / OFFSET (paginate)
This is why you cannot use a column alias from SELECT in your WHERE clause but you can use it in ORDER BY.
9. What is the difference between CHAR and VARCHAR?
CHAR(n) is fixed-length and always stores exactly n characters, padding shorter strings with spaces. VARCHAR(n) is variable-length and stores only the actual characters. Use CHAR for fixed-format data like country codes (US, UK). Use VARCHAR for everything else.
10. Write a query to find duplicate records.
SELECT email, COUNT(*) AS duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY duplicate_count DESC;
11. What is an alias and why use one?
An alias gives a temporary name to a table or column using the AS keyword. It makes queries more readable, especially when dealing with JOINs across multiple tables or computed columns.
SELECT
e.first_name AS name,
d.department_name AS dept,
e.salary * 12 AS annual_salary
FROM employees AS e
JOIN departments AS d ON e.dept_id = d.id;
12. What is the difference between IN and EXISTS?
IN checks if a value matches any value in a list or subquery result. EXISTS checks whether a subquery returns any rows at all. For large datasets, EXISTS is usually faster because it stops scanning once it finds the first match.
-- IN: evaluates entire subquery
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders);
-- EXISTS: stops at first match
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
13. How do you insert data from one table into another?
-- INSERT INTO ... SELECT
INSERT INTO archived_orders (id, customer_id, total, order_date)
SELECT id, customer_id, total_amount, order_date
FROM orders
WHERE order_date < '2025-01-01';
14. What is a FOREIGN KEY?
A foreign key is a column that references the primary key of another table. It enforces referential integrity, meaning you cannot insert a value that does not exist in the referenced table.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
total_amount DECIMAL(10,2),
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
15. Write a query to get the second highest salary.
-- Using LIMIT/OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;
-- Using a subquery
SELECT MAX(salary) AS second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Intermediate SQL Interview Questions (16-30)
These questions go beyond syntax and test your ability to write multi-step queries.
16. What is a self-join? When would you use one?
A self-join joins a table to itself. It is used when rows in the same table have a hierarchical or comparative relationship, like employees and their managers.
-- Find each employee and their manager's name
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
17. Explain GROUP BY with an example.
GROUP BY groups rows that share the same values in specified columns, allowing you to run aggregate functions (COUNT, SUM, AVG) on each group.
-- Total revenue by product category
SELECT
category,
COUNT(*) AS total_orders,
SUM(amount) AS revenue,
AVG(amount) AS avg_order_value
FROM orders o
JOIN products p ON o.product_id = p.id
GROUP BY category
ORDER BY revenue DESC;
18. What is a subquery? What are correlated vs non-correlated subqueries?
A subquery is a query nested inside another query. A non-correlated subquery runs independently and executes once. A correlated subquery references the outer query and executes once per outer row.
-- Non-correlated: runs once
SELECT * FROM employees
WHERE dept_id IN (SELECT id FROM departments WHERE location = 'NYC');
-- Correlated: runs once per employee
SELECT * FROM employees e
WHERE salary > (
SELECT AVG(salary) FROM employees
WHERE dept_id = e.dept_id
);
19. What is a CTE (Common Table Expression)?
A CTE is a temporary named result set defined with the WITH keyword. It makes complex queries readable by breaking them into logical steps. CTEs exist only for the duration of the query.
WITH monthly_sales AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(total_amount) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month,
ROUND((revenue - LAG(revenue) OVER (ORDER BY month))
/ LAG(revenue) OVER (ORDER BY month) * 100, 1) AS growth_pct
FROM monthly_sales
ORDER BY month;
20. What is the difference between a VIEW and a TABLE?
A table stores data physically on disk. A view is a saved query that acts like a virtual table. Views do not store data; they execute the underlying query each time you select from them. Views are useful for simplifying complex queries, enforcing security (exposing only certain columns), and creating reusable query logic.
CREATE VIEW active_customers AS
SELECT c.id, c.name, c.email, COUNT(o.id) AS order_count
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY c.id, c.name, c.email;
-- Use it like a table
SELECT * FROM active_customers WHERE order_count > 5;
21. Write a query to find customers who have never placed an order.
-- Using LEFT JOIN + NULL check
SELECT c.id, c.name, c.email
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.id IS NULL;
-- Using NOT EXISTS
SELECT c.id, c.name, c.email
FROM customers c
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
22. What is CASE WHEN and how do you use it?
CASE WHEN is SQL's conditional expression, similar to if-else in other languages. It can be used in SELECT, WHERE, ORDER BY, and even inside aggregate functions.
SELECT
name,
salary,
CASE
WHEN salary >= 120000 THEN 'Senior'
WHEN salary >= 80000 THEN 'Mid'
WHEN salary >= 50000 THEN 'Junior'
ELSE 'Entry'
END AS level
FROM employees;
-- CASE inside an aggregate
SELECT
department,
COUNT(CASE WHEN status = 'active' THEN 1 END) AS active_count,
COUNT(CASE WHEN status = 'inactive' THEN 1 END) AS inactive_count
FROM employees
GROUP BY department;
23. How do you calculate a running total?
SELECT
order_date,
total_amount,
SUM(total_amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
24. What is the difference between INNER JOIN and LEFT JOIN?
INNER JOIN returns only rows where there is a match in both tables. LEFT JOIN returns all rows from the left table, and for rows with no match in the right table, the right-side columns are filled with NULL.
-- INNER JOIN: only customers with orders
SELECT c.name, o.id AS order_id
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;
-- Result: 850 rows
-- LEFT JOIN: all customers, including those without orders
SELECT c.name, o.id AS order_id
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id;
-- Result: 1000 rows (150 with NULL order_id)
25. How do you handle duplicate rows in a result?
-- Option 1: DISTINCT
SELECT DISTINCT customer_id, product_id FROM orders;
-- Option 2: GROUP BY
SELECT customer_id, product_id, COUNT(*) AS times_ordered
FROM orders
GROUP BY customer_id, product_id;
-- Option 3: ROW_NUMBER to keep only the latest
WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY customer_id, product_id
ORDER BY order_date DESC
) AS rn
FROM orders
)
SELECT * FROM ranked WHERE rn = 1;
26. What are aggregate functions? Name five.
Aggregate functions perform a calculation on a set of rows and return a single value. The five most common are:
- COUNT(*) counts rows
- SUM(column) totals numeric values
- AVG(column) calculates the mean
- MIN(column) finds the smallest value
- MAX(column) finds the largest value
SELECT
department,
COUNT(*) AS headcount,
SUM(salary) AS total_payroll,
AVG(salary) AS avg_salary,
MIN(salary) AS lowest_salary,
MAX(salary) AS highest_salary
FROM employees
GROUP BY department;
27. What is a transaction? Explain ACID properties.
A transaction is a sequence of SQL operations executed as a single unit. It either completes fully or not at all. ACID stands for:
- Atomicity: All operations succeed or all are rolled back
- Consistency: The database moves from one valid state to another
- Isolation: Concurrent transactions do not interfere with each other
- Durability: Once committed, data survives crashes
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
-- If both succeed:
COMMIT;
-- If anything fails:
ROLLBACK;
28. Write a query to find the top 3 products by revenue per category.
WITH product_revenue AS (
SELECT
p.category,
p.product_name,
SUM(oi.quantity * oi.unit_price) AS revenue,
RANK() OVER (
PARTITION BY p.category
ORDER BY SUM(oi.quantity * oi.unit_price) DESC
) AS rank
FROM order_items oi
JOIN products p ON oi.product_id = p.id
GROUP BY p.category, p.product_name
)
SELECT category, product_name, revenue
FROM product_revenue
WHERE rank <= 3
ORDER BY category, rank;
29. What is the difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?
All three assign a number to each row based on ordering, but they handle ties differently:
- ROW_NUMBER(): Always unique (1, 2, 3, 4). Ties get arbitrary order.
- RANK(): Ties get the same number, then skips (1, 2, 2, 4).
- DENSE_RANK(): Ties get the same number, no skip (1, 2, 2, 3).
SELECT
name,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
30. Write a query to pivot data (rows to columns).
-- Using CASE + GROUP BY (works in all databases)
SELECT
product_id,
SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 1 THEN amount ELSE 0 END) AS jan,
SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 2 THEN amount ELSE 0 END) AS feb,
SUM(CASE WHEN EXTRACT(MONTH FROM order_date) = 3 THEN amount ELSE 0 END) AS mar
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2026
GROUP BY product_id;
Advanced SQL Interview Questions (31-40)
These questions separate strong candidates from average ones. Expect these at senior data roles and FAANG-level interviews.
31. What are window functions? How do they differ from aggregate functions?
Window functions perform calculations across a set of rows related to the current row without collapsing them into a single result. Unlike aggregate functions with GROUP BY, window functions keep all individual rows in the output.
-- Aggregate: collapses to one row per department
SELECT department, AVG(salary) FROM employees GROUP BY department;
-- Window function: keeps all rows, adds department average
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) AS dept_avg,
salary - AVG(salary) OVER (PARTITION BY department) AS diff_from_avg
FROM employees;
32. Explain LAG() and LEAD() with a practical example.
LAG() accesses data from a previous row. LEAD() accesses data from a following row. Both are window functions.
-- Month-over-month revenue comparison
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
ROUND(
(revenue - LAG(revenue, 1) OVER (ORDER BY month))
/ LAG(revenue, 1) OVER (ORDER BY month) * 100, 1
) AS mom_growth_pct,
LEAD(revenue, 1) OVER (ORDER BY month) AS next_month_revenue
FROM monthly_revenue;
33. What is a recursive CTE? Write an example.
A recursive CTE references itself to traverse hierarchical data like org charts, category trees, or graph structures.
-- Build an org chart hierarchy
WITH RECURSIVE org_tree AS (
-- Base case: top-level managers (no manager)
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: find direct reports
SELECT e.id, e.name, e.manager_id, ot.level + 1
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT id, name, level
FROM org_tree
ORDER BY level, name;
34. What are indexes? When should you NOT create an index?
An index is a data structure that speeds up data retrieval at the cost of slower writes and extra storage. You should NOT create an index when:
- The table is small (under a few thousand rows)
- The column has very low cardinality (like a boolean or status with 2-3 values)
- The table has heavy write traffic and few reads
- The column is rarely used in WHERE, JOIN, or ORDER BY
-- Create an index on frequently queried columns
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date);
-- Check if your query uses the index
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 42 AND order_date >= '2026-01-01';
35. What is the difference between a clustered and non-clustered index?
A clustered index determines the physical order of rows on disk. Each table can have only one clustered index (typically the primary key). A non-clustered index creates a separate structure with pointers to the actual rows. A table can have many non-clustered indexes. Think of a clustered index as the table of contents, and non-clustered indexes as the book's index at the back.
36. Write a query to find gaps in a sequence.
-- Find missing invoice numbers using LEAD
SELECT invoice_number + 1 AS gap_start,
next_num - 1 AS gap_end
FROM (
SELECT invoice_number,
LEAD(invoice_number) OVER (ORDER BY invoice_number) AS next_num
FROM invoices
) t
WHERE next_num - invoice_number > 1;
37. How do you implement pagination efficiently on large tables?
The standard LIMIT/OFFSET approach becomes slow on large tables because the database still scans all skipped rows. Keyset pagination (also called seek method) is faster.
-- Slow: LIMIT/OFFSET (scans 10,000 rows to skip them)
SELECT * FROM products
ORDER BY id
LIMIT 20 OFFSET 10000;
-- Fast: Keyset pagination (jumps directly)
SELECT * FROM products
WHERE id > 10000
ORDER BY id
LIMIT 20;
38. What is a materialized view?
A materialized view stores the query result physically on disk, unlike a regular view that re-executes the query each time. It is useful for expensive aggregation queries that do not need real-time data. You refresh it manually or on a schedule.
-- Create materialized view
CREATE MATERIALIZED VIEW monthly_summary AS
SELECT
DATE_TRUNC('month', order_date) AS month,
COUNT(*) AS total_orders,
SUM(total_amount) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date);
-- Refresh when needed
REFRESH MATERIALIZED VIEW monthly_summary;
39. Explain the difference between OLTP and OLAP.
OLTP (Online Transaction Processing) handles day-to-day operations: inserts, updates, short queries. It uses normalized schemas with many small tables. Examples: PostgreSQL, MySQL for your app's database.
OLAP (Online Analytical Processing) handles complex analytical queries across large datasets. It uses denormalized schemas (star/snowflake) optimized for reads. Examples: BigQuery, Snowflake, Redshift.
In interviews, this often leads to follow-up questions about data warehousing and ETL pipelines.
40. Write a query to calculate a 7-day moving average.
SELECT
order_date,
daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7d
FROM (
SELECT
order_date,
SUM(total_amount) AS daily_revenue
FROM orders
GROUP BY order_date
) daily
ORDER BY order_date;
SQL Performance & Optimization Questions (41-45)
These questions test whether you can write queries that actually run fast in production.
41. What is an execution plan and how do you read one?
An execution plan shows how the database engine will execute your query: which indexes it uses, how it joins tables, and the estimated cost. Use EXPLAIN (or EXPLAIN ANALYZE for actual timing) to see it.
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2026-01-01'
GROUP BY c.name
ORDER BY order_count DESC;
Key things to look for: Seq Scan (full table scan, usually bad on large tables), Index Scan (good), Nested Loop vs Hash Join (depends on data size), and the actual time vs estimated rows.
42. Why is SELECT * considered bad practice?
SELECT * causes problems in production:
- Reads unnecessary columns, wasting I/O and memory
- Prevents the optimizer from using covering indexes
- Breaks application code when columns are added or removed
- Transfers more data over the network
Always specify the exact columns you need.
43. How do you optimize a slow query?
Follow this systematic approach:
- Run
EXPLAIN ANALYZEto see the execution plan - Look for full table scans on large tables
- Add indexes on columns used in WHERE, JOIN, and ORDER BY
- Rewrite correlated subqueries as JOINs
- Replace
SELECT *with specific columns - Avoid functions on indexed columns in WHERE clauses
- Consider partitioning for very large tables
-- Bad: function on indexed column prevents index use
SELECT * FROM orders
WHERE YEAR(order_date) = 2026;
-- Good: range comparison uses the index
SELECT * FROM orders
WHERE order_date >= '2026-01-01' AND order_date < '2027-01-01';
44. What is database normalization? Explain 1NF, 2NF, and 3NF.
Normalization is the process of organizing data to reduce redundancy.
- 1NF (First Normal Form): Each column contains atomic values (no arrays or nested data). Each row is unique.
- 2NF (Second Normal Form): Meets 1NF plus every non-key column depends on the entire primary key (no partial dependencies).
- 3NF (Third Normal Form): Meets 2NF plus no non-key column depends on another non-key column (no transitive dependencies).
In practice, most production databases aim for 3NF. OLAP systems intentionally denormalize for query speed.
45. What is table partitioning and when should you use it?
Partitioning splits a large table into smaller physical pieces based on a column value (usually date or ID range). Queries that filter on the partition key only scan relevant partitions instead of the entire table.
-- Range partitioning by date (PostgreSQL)
CREATE TABLE orders (
id SERIAL,
customer_id INT,
total_amount DECIMAL(10,2),
order_date DATE NOT NULL
) PARTITION BY RANGE (order_date);
CREATE TABLE orders_2025 PARTITION OF orders
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
CREATE TABLE orders_2026 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');
Use partitioning when a table exceeds millions of rows and queries consistently filter on a specific column.
Database Design Interview Questions (46-50)
These questions test your ability to think beyond queries and design systems that scale.
46. Design a schema for an e-commerce system.
A minimal e-commerce schema needs these core tables:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL,
category VARCHAR(50),
stock_quantity INT DEFAULT 0
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
total_amount DECIMAL(10,2),
status VARCHAR(20) DEFAULT 'pending',
order_date TIMESTAMP DEFAULT NOW()
);
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INT REFERENCES orders(id),
product_id INT REFERENCES products(id),
quantity INT NOT NULL,
unit_price DECIMAL(10,2) NOT NULL
);
Key design decisions: separate order_items table (many-to-many between orders and products), store unit_price in order_items (price may change later), use status for order lifecycle tracking.
47. What is the difference between a star schema and a snowflake schema?
Both are used in data warehousing. A star schema has a central fact table connected directly to dimension tables (one level). A snowflake schema normalizes dimension tables into sub-dimensions (multiple levels). Star schemas are simpler and faster for queries. Snowflake schemas save storage but require more JOINs.
48. How would you design a schema for a social media "likes" feature?
CREATE TABLE likes (
user_id INT NOT NULL REFERENCES users(id),
post_id INT NOT NULL REFERENCES posts(id),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (user_id, post_id) -- prevents duplicate likes
);
-- Create index for counting likes per post
CREATE INDEX idx_likes_post ON likes(post_id);
-- Get like count for a post
SELECT COUNT(*) FROM likes WHERE post_id = 123;
-- Check if user liked a post
SELECT EXISTS (
SELECT 1 FROM likes WHERE user_id = 42 AND post_id = 123
);
The composite primary key (user_id, post_id) prevents duplicate likes at the database level.
49. What are stored procedures and when should you use them?
Stored procedures are precompiled SQL code blocks stored in the database. They accept parameters, can contain control flow logic, and execute as a single unit. Use them for complex business logic that must run close to the data, multi-step operations that need transaction safety, and frequently executed operations that benefit from precompilation.
-- PostgreSQL function example
CREATE OR REPLACE FUNCTION transfer_funds(
sender_id INT,
receiver_id INT,
amount DECIMAL
) RETURNS VOID AS $$
BEGIN
UPDATE accounts SET balance = balance - amount
WHERE id = sender_id AND balance >= amount;
IF NOT FOUND THEN
RAISE EXCEPTION 'Insufficient funds';
END IF;
UPDATE accounts SET balance = balance + amount
WHERE id = receiver_id;
END;
$$ LANGUAGE plpgsql;
50. How do you handle soft deletes vs hard deletes?
A hard delete removes the row permanently with DELETE. A soft delete marks the row as deleted using a flag column, keeping the data for audit trails or recovery.
-- Soft delete: add a column
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMP NULL;
-- "Delete" a user
UPDATE users SET deleted_at = NOW() WHERE id = 42;
-- Query only active users
SELECT * FROM users WHERE deleted_at IS NULL;
-- Create a view for convenience
CREATE VIEW active_users AS
SELECT * FROM users WHERE deleted_at IS NULL;
Soft deletes are preferred when you need audit logs, undo functionality, or regulatory compliance (GDPR data retention). Hard deletes are better when storage is a concern or when data must be permanently removed.
How to Practice SQL Interviews
Reading answers is not enough. You need to write queries with your own hands to build the muscle memory that interviews require.
- Use a real database. AI2SQL has a built-in demo database you can query instantly. No setup needed. Write the queries from this guide and verify they work.
- Practice under time pressure. Set a 5-minute timer per question. In real interviews, you will not have unlimited time.
- Explain your thought process out loud. Interviewers want to hear how you break down a problem, not just see the final query. Practice narrating: "First I need to join these two tables, then aggregate by category..."
- Study execution plans. Run
EXPLAIN ANALYZEon your queries. Understand why one approach is faster than another. - Recreate real scenarios. Take a business question ("Which customers spent the most last quarter?") and translate it to SQL. This is exactly what interviews test.
Start practicing now with AI2SQL's built-in database. Describe what you need in plain English and see the SQL generated instantly.
Common Mistakes to Avoid
These mistakes cost candidates in interviews every day. Avoid them.
- Using SELECT * in your answer. Always specify columns. It shows you understand performance and maintainability.
- Forgetting NULL handling. NULL breaks equality checks. Use
IS NULL,COALESCE, orNULLIFwhere appropriate. - Confusing WHERE and HAVING. WHERE filters rows before grouping. HAVING filters groups after aggregation. Mixing them up signals a weak foundation.
- Not using aliases. When joining multiple tables, unaliased column names create confusion. Always alias your tables and computed columns.
- Writing everything in one giant query. Break complex logic into CTEs. It makes your query readable and shows you can think in steps.
- Ignoring edge cases. What if the table is empty? What if there are ties? What about NULL values in JOIN columns? Address these before the interviewer asks.
- Not asking clarifying questions. Before writing SQL, ask: Which database? How large is the table? Are there indexes? What does "most recent" mean? Asking smart questions is a signal of seniority.
- Forgetting ORDER BY. SQL does not guarantee row order without ORDER BY. If the question asks for a "top N" result, always include it.
Frequently Asked Questions
What SQL topics are most commonly asked in interviews?
The most commonly asked SQL interview topics are JOINs (INNER, LEFT, RIGHT, FULL), aggregation with GROUP BY and HAVING, subqueries, window functions (ROW_NUMBER, RANK), indexing, and the difference between WHERE and HAVING. Expect at least one question on each of these in a typical technical interview.
How many SQL questions are typically asked in a technical interview?
Most technical interviews include 3 to 8 SQL questions. Phone screens typically have 2-3 conceptual questions, while on-site or take-home rounds may include 4-6 hands-on query writing exercises. Data engineering roles tend to have more SQL questions than general backend roles.
Should I practice SQL on a real database before my interview?
Yes. Reading answers is not enough. Practice writing queries against a real database to build muscle memory. Tools like AI2SQL provide a built-in demo database where you can run queries instantly without any setup. Aim for at least 20-30 practice queries before your interview.
Do I need to know database-specific syntax for SQL interviews?
Usually not. Most interviews test standard ANSI SQL. However, if the job description mentions a specific database (PostgreSQL, MySQL, SQL Server), it helps to know that dialect's unique features. For example, PostgreSQL's array types, MySQL's GROUP_CONCAT, or SQL Server's TOP instead of LIMIT.
What is the best way to prepare for SQL interviews in 2026?
Start by reviewing core concepts (JOINs, aggregation, subqueries, window functions). Then practice writing queries on a real database. Study execution plans with EXPLAIN to understand performance. Finally, practice explaining your thought process out loud, since interviewers want to hear how you approach problems, not just see the final query.