Advanced SQL Techniques
Master the most powerful SQL features! Learn window functions, CTEs, and advanced patterns used by professionals.
🚀 Welcome to Advanced SQL
Congratulations on making it to the final chapter! 🎉 You've learned the fundamentals of SQL. Now it's time to unlock the advanced techniques that separate beginners from experts!
🎯 What Makes These Techniques "Advanced"?
Think of SQL like learning a musical instrument:
- Beginner: You know the notes (SELECT, WHERE, JOIN)
- Intermediate: You can play songs (Complex queries, subqueries)
- Advanced: You can improvise and compose (Window functions, CTEs, complex analytics)
Advanced SQL lets you solve problems that would be nearly impossible with basic queries!
Window Functions
Perform calculations across related rows
Use case: Rankings, running totals, moving averages
CTEs
Create temporary named result sets
Use case: Simplify complex queries, recursive operations
Advanced Patterns
Clever techniques for common problems
Use case: Pivoting, deduplication, gaps & islands
Why Learn These Techniques?
- 📊 Analytics: Generate reports and insights that basic SQL can't
- ⚡ Performance: Often faster than multiple queries or application logic
- 💼 Career: These skills are highly valued in data roles
- 🧩 Elegance: Solve complex problems with clean, readable code
🪟 Window Functions Introduction
Window functions perform calculations across a set of rows that are related to the current row - like looking through a "window" at related data!
🎯 Real-Life Analogy: Class Rankings
Imagine a teacher calculating student rankings:
- Regular aggregate: "The class average is 85" - one number for everyone
- Window function: "You scored 92, which ranks you #3 in the class" - each student sees their rank AND keeps their individual score
Key difference: Window functions give you aggregated info WITHOUT collapsing rows like GROUP BY does!
Window Function Syntax:
SELECT
column1,
column2,
WINDOW_FUNCTION() OVER (
PARTITION BY column3
ORDER BY column4
) AS result
FROM table;
- OVER: Indicates this is a window function
- PARTITION BY: Divide rows into groups (optional, like GROUP BY)
- ORDER BY: Define the order within each partition (optional)
Difference Between GROUP BY and Window Functions:
-- GROUP BY (collapses rows)
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
-- Result: One row per department
-- Window Function (keeps all rows)
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) as dept_avg
FROM employees;
-- Result: All rows, each showing their department's average
🔍 Common Window Functions:
- ROW_NUMBER(): Assigns unique sequential numbers
- RANK(): Assigns ranks with gaps for ties
- DENSE_RANK(): Assigns ranks without gaps
- SUM/AVG/COUNT(): Aggregate functions as window functions
- LEAD/LAG(): Access next/previous row values
🔢 ROW_NUMBER() - Assigning Sequential Numbers
ROW_NUMBER() assigns a unique number to each row within a partition!
🎯 Real-Life Analogy: Race Positions
In a race, each runner gets a position number:
- 1st place, 2nd place, 3rd place, etc.
- Even if two runners have the same time, they get different positions
- ROW_NUMBER() works the same way - always unique, sequential numbers!
Try It: ROW_NUMBER() Example
🔍 What Just Happened?
- PARTITION BY department: Split data into groups (Engineering, Sales)
- ORDER BY salary DESC: Within each department, sort by salary (highest first)
- ROW_NUMBER(): Assigned 1, 2, 3... within each department
- Alice is #1 in Engineering ($95k), Bob is #2, Eve is #3
- Frank is #1 in Sales ($75k), Carol is #2, David is #3
Practical Use Case: Finding Top N Per Group
Want to find the top 2 highest-paid employees in each department?
-- Step 1: Use ROW_NUMBER in a subquery
WITH ranked_employees AS (
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) as rn
FROM employees
)
-- Step 2: Filter for top 2
SELECT name, department, salary
FROM ranked_employees
WHERE rn <= 2;
This is a very common interview question! 💡
🏆 RANK() and DENSE_RANK()
RANK() and DENSE_RANK() handle ties differently than ROW_NUMBER()!
🎯 Real-Life Analogy: Olympic Medals
Imagine a race where two runners tie for 1st place:
- ROW_NUMBER(): 1, 2, 3, 4 (one gets 1st, other gets 2nd arbitrarily)
- RANK(): 1, 1, 3, 4 (both get 1st, next is 3rd - skips 2)
- DENSE_RANK(): 1, 1, 2, 3 (both get 1st, next is 2nd - no skip)
Try It: Compare ROW_NUMBER, RANK, and DENSE_RANK
Results Breakdown:
| Student | Score | ROW_NUMBER | RANK | DENSE_RANK |
|---|---|---|---|---|
| Alice | 95 | 1 | 1 | 1 |
| Bob | 90 | 2 | 2 | 2 |
| Carol | 90 | 3 ⬅️ unique | 2 ⬅️ tie | 2 ⬅️ tie |
| David | 85 | 4 | 4 ⬅️ skipped 3 | 3 ⬅️ no skip |
| Eve | 85 | 5 | 4 | 3 |
| Frank | 80 | 6 | 6 ⬅️ skipped 5 | 4 ⬅️ no skip |
🔍 When to Use Each:
- ROW_NUMBER(): When you need unique numbers (deduplication, pagination)
- RANK(): When ties should share the same rank, and you want gaps (traditional ranking)
- DENSE_RANK(): When ties should share the same rank, but you don't want gaps (compact ranking)
📊 Aggregate Window Functions
Use familiar aggregates (SUM, AVG, COUNT) as window functions to calculate running totals, moving averages, and more!
🎯 Real-Life Analogy: Running Total
Imagine tracking your daily spending:
- Day 1: Spent $20 → Running total: $20
- Day 2: Spent $35 → Running total: $55
- Day 3: Spent $15 → Running total: $70
Window functions let you calculate this running total in SQL without complex self-joins!
Try It: Running Total (Cumulative Sum)
🔍 What's Happening:
- SUM(amount) OVER (ORDER BY date): Calculates cumulative sum
- Day 1: 100
- Day 2: 100 + 150 = 250
- Day 3: 250 + 200 = 450
- Each row sees the sum of all rows up to and including itself!
Try It: Moving Average
Window Frame Clauses:
Control which rows are included in the calculation:
- ROWS BETWEEN 2 PRECEDING AND CURRENT ROW: Current + 2 previous rows
- ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: All rows from start to current (running total)
- ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: Previous + current + next row
Real-World Example: Compare to Department Average
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) as dept_avg,
salary - AVG(salary) OVER (PARTITION BY department) as diff_from_avg
FROM employees;
Shows each employee's salary compared to their department average!
📝 Common Table Expressions (CTEs)
CTEs let you create temporary named result sets that you can reference in your main query!
🎯 Real-Life Analogy: Cooking with Prep Steps
Think of making a complex dish:
- Without CTEs: "Mix eggs with flour, sugar, butter, vanilla while simultaneously chopping vegetables and marinating chicken..." (confusing!)
- With CTEs: "Step 1: Make batter. Step 2: Chop vegetables. Step 3: Combine everything." (clear steps!)
CTEs break complex queries into readable, manageable steps!
CTE Syntax:
WITH cte_name AS (
-- Your SELECT query here
SELECT ...
)
SELECT * FROM cte_name;
- WITH: Starts the CTE definition
- cte_name: Name you give to your temporary result set
- AS (...): The query that populates the CTE
- Main query: Uses the CTE like a regular table
Try It: Basic CTE Example
🔍 Why Use CTEs?
- Readability: Break complex queries into logical steps
- Reusability: Reference the same CTE multiple times in one query
- Maintainability: Easier to debug and modify
- Recursive queries: Can reference themselves (advanced!)
Multiple CTEs in One Query:
WITH
high_value_customers AS (
SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING SUM(amount) > 1000
),
recent_orders AS (
SELECT *
FROM orders
WHERE order_date > '2024-01-01'
)
SELECT *
FROM recent_orders
WHERE customer_id IN (SELECT customer_id FROM high_value_customers);
You can define multiple CTEs separated by commas!
💡 CTE Practical Examples
Let's see real-world use cases for CTEs!
Example 1: Deduplication - Find Duplicates and Remove Them
🔍 Deduplication Explained:
- CTE ranks each email: First occurrence gets rn=1, duplicates get rn=2, 3, etc.
- WHERE rn = 1: Keeps only the first occurrence
- Alice's record from 2024-01-01 kept, 2024-01-05 removed
- Bob's record from 2024-01-02 kept, 2024-01-07 removed
Example 2: Comparing to Previous Row
LAG() and LEAD() Functions:
- LAG(column): Gets value from previous row
- LEAD(column): Gets value from next row
- LAG(price, 2): Gets value from 2 rows back
Great for comparing sequential data (time series, rankings, etc.)
🔁 Recursive CTEs
Recursive CTEs can reference themselves - perfect for hierarchical data like organization charts or file trees!
🎯 Real-Life Analogy: Family Tree
To find all your descendants:
- Base case: Start with yourself
- Recursive step: Find your children
- Repeat: For each child, find their children
- Continue: Until no more descendants found
Recursive CTEs work the same way!
Recursive CTE Syntax:
WITH RECURSIVE cte_name AS (
-- Base case (anchor)
SELECT ...
WHERE ...
UNION ALL
-- Recursive case
SELECT ...
FROM cte_name -- References itself!
WHERE ...
)
SELECT * FROM cte_name;
Try It: Generate Number Sequence
🔍 How It Works:
- Iteration 1: Base case returns 1
- Iteration 2: Recursive part adds 1 → 2
- Iteration 3: 2 + 1 → 3
- ... continues until...
- Iteration 10: WHERE n < 10 becomes false, stops
Real-World Example: Organization Hierarchy
Common Recursive CTE Use Cases:
- 🌳 Organizational charts: Find all reports under a manager
- 📁 File systems: List all files in nested folders
- 🛣️ Graph traversal: Find all connected nodes
- 📅 Date sequences: Generate calendar dates
- 🔢 Number series: Fibonacci, factorials, etc.
🎨 Advanced SQL Patterns
Clever techniques for solving common real-world problems!
1. Pivot Tables
Turn rows into columns
Use case: Sales reports, crosstab data
2. Unpivot
Turn columns into rows
Use case: Normalize wide tables
3. Gaps and Islands
Find missing sequences
Use case: Missing dates, broken sequences
4. Running Differences
Calculate changes over time
Use case: Trends, growth rates
Pattern: Pivot - Turn Rows into Columns
🔍 How Pivot Works:
Original data (rows):
Laptop, Jan, 5000 Laptop, Feb, 6000 Phone, Jan, 3000 Phone, Feb, 3500
After pivot (columns):
Product | Jan | Feb | Total Laptop | 5000 | 6000 | 11000 Phone | 3000 | 3500 | 6500
The CASE statement creates conditional columns!
Pattern: Find Missing Dates (Gaps)
More Advanced Patterns:
- Percentiles: Find median, quartiles (NTILE function)
- Year-over-year growth: Compare same period different years
- First/Last value: FIRST_VALUE(), LAST_VALUE() window functions
- Conditional aggregation: Different aggregates based on conditions
Practice Exercises
Part 1: Guided Exercises
Guided Exercise 1: Rank Products by Sales
Task: Use DENSE_RANK() to rank products by total sales amount.
Part 2: Independent Practice
Challenge 1: Calculate Running Average
Difficulty: Hard
Task: Create a query that shows each day's temperature along with the running average of all days up to that point.
📝 Knowledge Check Quiz
1. What's the key difference between window functions and GROUP BY?
2. What does a CTE (Common Table Expression) do?
3. When would you use DENSE_RANK() instead of RANK()?
🎓 Course Conclusion - You Did It!
Congratulations! 🎉🎊
You've completed all 10 chapters of the SQL course! You've gone from absolute beginner to mastering advanced SQL techniques. That's a huge accomplishment!
What You've Learned:
Chapter 1-2
SQL basics, SELECT, WHERE, filtering data
Chapter 3-4
JOINs, relationships, aggregate functions, GROUP BY
Chapter 5-6
Subqueries, INSERT, UPDATE, DELETE
Chapter 7-8
CREATE/ALTER tables, indexes, performance optimization
Chapter 9
Transactions, ACID properties, data integrity
Chapter 10
Window functions, CTEs, advanced patterns
🚀 Where to Go From Here:
- 💼 Practice, practice, practice: The best way to master SQL is by using it on real projects
- 📊 Work with real datasets: Try Kaggle datasets, public databases, or your own projects
- 🔍 Learn database-specific features: PostgreSQL, MySQL, SQL Server each have unique capabilities
- ⚡ Study query optimization: Learn about execution plans, query tuning
- 🏗️ Database design: Master normalization, schema design patterns
- 🔐 Security: Learn about SQL injection, permissions, encryption
Keep Learning:
- 📚 Documentation: Read official docs for PostgreSQL, MySQL, SQLite
- 💻 Practice platforms: LeetCode, HackerRank, SQLZoo
- 🎯 Real projects: Build a database for a personal project
- 👥 Community: Join SQL forums, Stack Overflow, Reddit r/SQL
Thank You! 🙏
Thank you for completing this course! You now have the SQL skills to build powerful data-driven applications. Keep practicing, keep learning, and most importantly - have fun with SQL!
Happy Querying! 🎉