Chapter 2

SELECT & WHERE Clauses

Master the art of querying data! Learn how to retrieve exactly what you need from databases using powerful filtering and sorting techniques.

SELECT Statement Deep Dive

You learned basic SELECT in Chapter 1. Now let's explore its full power!

🎯 Real-Life Analogy: SELECT is Like Shopping at a Store

Imagine walking into a grocery store:

  • SELECT * - "I'll take everything in the store!" (Grab all products)
  • SELECT name, price - "I only want to see product names and prices" (Specific items)
  • WHERE price < 100 - "Show me only items under $100" (Filter what you see)
  • ORDER BY price - "Arrange items from cheapest to most expensive" (Sort results)

SELECT lets you be very specific about what data you want to see!

SELECT Syntax Breakdown:

SELECT column1, column2, column3
FROM table_name
WHERE condition
ORDER BY column1
LIMIT number;
  • SELECT: Choose which columns to retrieve
  • FROM: Specify the table
  • WHERE: Filter rows (optional)
  • ORDER BY: Sort results (optional)
  • LIMIT: Restrict number of rows (optional)

Selecting Specific Columns:

Try It: Select Specific Columns

Retrieve only the columns you need to make queries faster and results cleaner:

Output will appear here...

🔍 Behind the Scenes: Why Not Always Use SELECT *?

While SELECT * is convenient, it has drawbacks:

  1. Performance: Fetching all columns is slower, especially with large tables
  2. Network: More data transferred = longer wait times
  3. Memory: Your application uses more RAM
  4. Clarity: Specific columns make your intent clear

Best Practice: Only select columns you actually need!

Column Aliases (Renaming Columns):

You can rename columns in your results using AS for better readability:

Try It: Using Column Aliases

Output will appear here...

Note:

The AS keyword is optional - SELECT name employee_name works too, but AS makes it more readable!

Calculated Columns:

You can perform calculations right in your SELECT statement:

Try It: Calculate Monthly Salary

Output will appear here...

Common Calculations in SELECT:

-- Arithmetic operations
SELECT
    price * quantity AS total_cost,
    price * 0.1 AS tax,
    price * 1.1 AS price_with_tax
FROM orders;

-- String concatenation (combines text)
SELECT
    first_name || ' ' || last_name AS full_name
FROM users;

DISTINCT - Removing Duplicates:

Use DISTINCT to get unique values only:

Try It: Find Unique Departments

Output will appear here...

🔍 How DISTINCT Works:

When you use DISTINCT:

  1. SQL retrieves all matching rows
  2. Sorts and compares them to find duplicates
  3. Returns only one copy of each unique row

Note: DISTINCT can be slow on large datasets because it requires comparing rows!

The WHERE Clause - Filtering Data

WHERE is your superpower for finding exactly what you need in massive databases!

🎯 Real-Life Analogy: WHERE is Like a Search Filter

Think about shopping on Amazon:

  • No filter: See ALL products (overwhelming!)
  • Price < $50: Only affordable items
  • Rating >= 4 stars: Only good quality
  • Category = "Electronics": Only electronics
  • In stock = Yes: Only available items

WHERE clause works exactly like these filters - it lets you narrow down results!

WHERE Clause Syntax:

SELECT column1, column2
FROM table_name
WHERE condition;

The condition can use comparison operators, logical operators, and special operators we'll learn next!

Try It: Basic WHERE Example

Output will appear here...

🔍 What Happens Behind the Scenes?

When SQL executes a WHERE clause:

  1. Table Scan: SQL examines each row in the table
  2. Condition Check: Tests if the row meets the WHERE condition
  3. Filter: If TRUE, includes the row; if FALSE, skips it
  4. Return Results: Returns only rows that passed the test

On a table with 1 million employees, WHERE can reduce results to just a few rows!

Comparison Operators

Comparison operators let you compare values in your WHERE clause:

= (Equal)

Exact match

WHERE age = 25

!= or <> (Not Equal)

Not equal to

WHERE status != 'Active'

> (Greater Than)

Larger values

WHERE salary > 80000

< (Less Than)

Smaller values

WHERE age < 30

>= (Greater or Equal)

Greater than or equal to

WHERE score >= 90

<= (Less or Equal)

Less than or equal to

WHERE price <= 100

Using Comparison Operators:

Try It: Find High Earners

Output will appear here...

Try It: Find Employees NOT Earning $90,000

Output will appear here...

String Comparisons:

When comparing strings (text):

  • Use single quotes: WHERE name = 'Alice'
  • Comparisons are usually case-sensitive: 'Alice' != 'alice'
  • You can use >, < for alphabetical ordering

Try It: Alphabetical Comparison

Output will appear here...

Logical Operators (AND, OR, NOT)

Combine multiple conditions to create powerful filters!

🎯 Real-Life Analogy: Combining Search Filters

Shopping online with multiple filters:

  • AND: "Show laptops that are under $1000 AND have 16GB RAM" (both conditions must be true)
  • OR: "Show products from Nike OR Adidas" (either condition can be true)
  • NOT: "Show all products NOT in the 'Clearance' category" (exclude items)

AND Operator:

All conditions must be TRUE for the row to be included:

Try It: Using AND

Output will appear here...

🔍 How AND Works:

SQL checks each condition:

Condition 1 Condition 2 Result
TRUE TRUE TRUE
TRUE FALSE FALSE
FALSE TRUE FALSE
FALSE FALSE FALSE

Only when ALL conditions are TRUE does the row get included!

OR Operator:

At least ONE condition must be TRUE:

Try It: Using OR

Output will appear here...

🔍 How OR Works:

Condition 1 Condition 2 Result
TRUE TRUE TRUE
TRUE FALSE TRUE
FALSE TRUE TRUE
FALSE FALSE FALSE

If ANY condition is TRUE, the row is included!

NOT Operator:

Reverses the condition (TRUE becomes FALSE, FALSE becomes TRUE):

Try It: Using NOT

Output will appear here...

Combining Multiple Operators:

You can combine AND, OR, and NOT. Use parentheses for clarity!

Try It: Complex Conditions

Output will appear here...

Operator Precedence (Order of Operations):

SQL evaluates operators in this order:

  1. ( ) - Parentheses (highest priority)
  2. NOT
  3. AND
  4. OR (lowest priority)

Best Practice: Always use parentheses to make your intent clear!

LIKE Operator and Wildcards

Search for patterns in text data - incredibly useful for partial matches!

🎯 Real-Life Analogy: LIKE is Like a Fuzzy Search

Think about searching on Google or Netflix:

  • "john" - Exact match only
  • "john%" - Anything starting with "john": Johnson, Johnny, Johnathan
  • "%john%" - Anything containing "john": John, Johnny, Elton John
  • "j_hn" - Single character wildcard: John, Jahn

Wildcard Characters:

  • % - Matches zero or more characters (any length)
  • _ - Matches exactly one character
-- Examples:
WHERE name LIKE 'A%'      -- Starts with A
WHERE name LIKE '%son'    -- Ends with "son"
WHERE name LIKE '%ar%'    -- Contains "ar" anywhere
WHERE name LIKE 'J_hn'    -- J, any char, hn (John, Jahn)

Try It: Find Names Starting with 'A'

Output will appear here...

Try It: Find Names Containing 'son'

Output will appear here...

Try It: Using Underscore Wildcard

Output will appear here...

Common LIKE Patterns:

-- Starts with 'A'
WHERE name LIKE 'A%'

-- Ends with 'son'
WHERE name LIKE '%son'

-- Contains 'tech'
WHERE name LIKE '%tech%'

-- Second letter is 'a'
WHERE name LIKE '_a%'

-- Exactly 5 characters
WHERE code LIKE '_____'

-- Starts with 'S' and ends with 'n'
WHERE name LIKE 'S%n'

🔍 Performance Note:

LIKE with wildcards can be slow on large tables:

  • Fast: LIKE 'A%' - can use indexes
  • Slow: LIKE '%son' or LIKE '%ar%' - requires full table scan

For better performance on large datasets, consider full-text search features!

Case Sensitivity:

In most SQL databases, LIKE is case-insensitive by default, but this varies:

  • MySQL: Case-insensitive (default)
  • PostgreSQL: Case-sensitive (use ILIKE for case-insensitive)
  • SQLite: Case-insensitive for ASCII characters

IN Operator

Check if a value matches any value in a list - cleaner than multiple OR conditions!

🎯 Real-Life Analogy: IN is Like a Guest List

Imagine a bouncer checking names at a club:

  • Old way (OR): "Are you John OR Mary OR Steve OR Lisa?" (tedious!)
  • Better way (IN): "Is your name on this list?" (checks guest list)

IN makes checking multiple values much simpler!

IN Operator Syntax:

WHERE column_name IN (value1, value2, value3, ...)

Returns TRUE if the column value matches ANY value in the list.

Comparison: OR vs IN

-- Using multiple OR (verbose)
WHERE department = 'Engineering'
   OR department = 'Sales'
   OR department = 'Marketing'

-- Using IN (cleaner!)
WHERE department IN ('Engineering', 'Sales', 'Marketing')

Both do the same thing, but IN is much more readable!

Try It: Find Employees with Specific Salaries

Output will appear here...

Try It: IN with Numbers

Output will appear here...

NOT IN - Exclude Values:

Find rows that DON'T match any value in the list:

Try It: Using NOT IN

Output will appear here...

🔍 Behind the Scenes: How IN Works

When SQL evaluates IN:

  1. Takes the column value from each row
  2. Compares it against each value in the IN list
  3. If ANY match is found, returns TRUE
  4. If no matches, returns FALSE

Think of it as an efficient series of OR comparisons!

Note: IN vs OR Performance

IN and OR have similar performance, but IN is:

  • More readable and maintainable
  • Easier to modify (just add to the list)
  • Preferred by SQL style guides

BETWEEN Operator

Find values within a range - perfect for dates, numbers, and more!

🎯 Real-Life Analogy: BETWEEN is Like a Range Filter

Shopping for a house:

  • Old way: "Show houses >= $200k AND <= $300k" (wordy)
  • Better way: "Show houses BETWEEN $200k and $300k" (clear!)

BETWEEN is cleaner and more intuitive for range queries!

BETWEEN Syntax:

WHERE column_name BETWEEN value1 AND value2

Important: BETWEEN is inclusive - it includes both endpoints!

BETWEEN 10 AND 20 includes 10, 11, 12, ... 19, 20

Comparison: Range Check vs BETWEEN

-- Using >= and <= (verbose)
WHERE salary >= 70000 AND salary <= 90000

-- Using BETWEEN (cleaner!)
WHERE salary BETWEEN 70000 AND 90000

Both are equivalent, but BETWEEN is more readable!

Try It: Find Mid-Range Salaries

Output will appear here...

Try It: BETWEEN with Alphabetical Range

Output will appear here...

NOT BETWEEN:

Find values OUTSIDE a range:

Try It: Using NOT BETWEEN

Output will appear here...

🔍 Common Use Cases for BETWEEN:

  • Salary ranges: BETWEEN 50000 AND 100000
  • Age ranges: BETWEEN 25 AND 40
  • Date ranges: BETWEEN '2024-01-01' AND '2024-12-31'
  • Alphabetical ranges: BETWEEN 'A' AND 'M'

Important Notes:

  • BETWEEN is inclusive (includes both boundary values)
  • First value should be smaller than second value
  • Works with numbers, dates, and strings
  • Can use NOT BETWEEN to exclude a range

Handling NULL Values

NULL represents missing or unknown data - it's NOT the same as zero or empty string!

🎯 Real-Life Analogy: NULL is Like "I Don't Know"

Imagine filling out a form:

  • Age: 0 - You're a newborn baby (actual value)
  • Age: "" - Empty string (invalid for a number)
  • Age: NULL - "I prefer not to answer" or "Unknown" (no data)

NULL means "this information is missing or not applicable"

Key Points About NULL:

  • NULL is NOT zero, NOT empty string, NOT false
  • NULL means "unknown" or "missing data"
  • You CANNOT use = or != with NULL
  • Must use IS NULL or IS NOT NULL
  • Any comparison with NULL returns NULL (not true or false)

Why = NULL Doesn't Work:

-- WRONG - This won't work!
WHERE email = NULL

-- CORRECT - Use IS NULL
WHERE email IS NULL

-- WRONG - This won't work either!
WHERE email != NULL

-- CORRECT - Use IS NOT NULL
WHERE email IS NOT NULL

Reason: NULL means "unknown", and you can't compare with unknown!

Try It: Find Rows with NULL Values

Output will appear here...

Try It: Find Rows WITHOUT NULL Values

Output will appear here...

🔍 NULL in Calculations:

NULL spreads through calculations:

  • 5 + NULL = NULL
  • 10 * NULL = NULL
  • 'Hello' || NULL = NULL

Any operation involving NULL results in NULL!

Handling NULL in Queries:

-- Find employees with no manager assigned
WHERE manager_id IS NULL

-- Find complete employee records (no missing data)
WHERE email IS NOT NULL
  AND phone IS NOT NULL
  AND address IS NOT NULL

-- Using COALESCE to replace NULL with a default value
SELECT name, COALESCE(email, 'No email') AS email_address
FROM employees;

Try It: Combining NULL Checks

Output will appear here...

ORDER BY - Sorting Results

Control how your results are arranged - alphabetically, numerically, or by date!

🎯 Real-Life Analogy: ORDER BY is Like Organizing Books

Imagine organizing your bookshelf:

  • ORDER BY title ASC - Alphabetically A to Z
  • ORDER BY title DESC - Reverse alphabetically Z to A
  • ORDER BY year DESC - Newest to oldest
  • ORDER BY author, title - By author, then by title within each author

ORDER BY Syntax:

SELECT columns
FROM table
WHERE condition
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
  • ASC - Ascending (default): A to Z, 1 to 10, oldest to newest
  • DESC - Descending: Z to A, 10 to 1, newest to oldest

Try It: Sort by Salary (Ascending)

Output will appear here...

Try It: Sort by Salary (Descending)

Output will appear here...

Sorting by Multiple Columns:

Sort by one column, then by another within ties:

Try It: Multi-Column Sort

Output will appear here...

🔍 How Multi-Column Sorting Works:

With ORDER BY department ASC, salary DESC:

  1. First, sort all rows by department alphabetically
  2. Within each department, sort by salary (highest first)
  3. Result: Departments in A-Z order, with highest-paid employees listed first in each department

Try It: Sort by Calculated Column

Output will appear here...

NULL Values in ORDER BY:

  • By default, NULL values appear LAST in ASC order
  • NULL values appear FIRST in DESC order
  • Some databases let you control this with NULLS FIRST or NULLS LAST

Common ORDER BY Patterns:

-- Alphabetical by name
ORDER BY name ASC

-- Most recent first (dates)
ORDER BY created_date DESC

-- By multiple columns
ORDER BY country ASC, city ASC, name ASC

-- Using column position (not recommended, but works)
ORDER BY 1, 3 DESC  -- Sort by 1st column, then 3rd column descending

LIMIT and OFFSET - Pagination

Control how many rows to return and where to start - essential for working with large datasets!

🎯 Real-Life Analogy: LIMIT is Like Page Size

Think about browsing search results on Google:

  • LIMIT 10 - "Show only the first 10 results"
  • OFFSET 10 - "Skip the first 10 results, then show the next ones"
  • LIMIT 10 OFFSET 20 - "Show results 21-30" (page 3 of results)

This is how websites show "Page 1, Page 2, Page 3..." navigation!

LIMIT Syntax:

SELECT columns
FROM table
ORDER BY column
LIMIT number;

-- With OFFSET
SELECT columns
FROM table
ORDER BY column
LIMIT number OFFSET skip_count;

Try It: LIMIT Only

Output will appear here...

Using OFFSET for Pagination:

OFFSET skips a specified number of rows before returning results:

Try It: Page 1 (First 10 employees)

Output will appear here...

Try It: Page 2 (Next 10 employees)

Output will appear here...

Try It: Page 3 (Next 10 employees)

Output will appear here...

🔍 Pagination Formula:

To calculate OFFSET for any page:

page_size = 10  -- rows per page
page_number = 3 -- which page you want

OFFSET = (page_number - 1) * page_size

Example for page 3:
OFFSET = (3 - 1) * 10 = 20

Query: LIMIT 10 OFFSET 20

Common Pagination Patterns:

-- First page (1-10)
LIMIT 10 OFFSET 0

-- Second page (11-20)
LIMIT 10 OFFSET 10

-- Third page (21-30)
LIMIT 10 OFFSET 20

-- Show top 5 only
LIMIT 5

-- Get a sample of 100 rows
LIMIT 100

Important Notes:

  • Always use ORDER BY with LIMIT/OFFSET for consistent results
  • Without ORDER BY, row order is unpredictable
  • OFFSET 0 is the same as no OFFSET (starts at first row)
  • Some databases use different syntax (e.g., SQL Server uses TOP)

🔍 Performance Consideration:

Large OFFSET values can be slow:

  • OFFSET 0 - Fast (starts at beginning)
  • OFFSET 10000 - Slower (database must scan and skip 10,000 rows)

For large datasets, consider using cursor-based pagination instead!

Practice Exercises

Test your skills with these hands-on challenges!

Part 1: Guided Exercises

Guided Exercise 1: Find Mid-Range Salary Employees

Task: Write a query to find all employees earning between $80,000 and $90,000, sorted by salary (highest first).

Output will appear here...

Guided Exercise 2: Names Starting with Specific Letters

Task: Find employees whose names start with 'A', 'B', or 'C', sorted alphabetically.

Output will appear here...

Guided Exercise 3: Mid-Range Salaries

Task: Find employees earning between $65,000 and $85,000 (inclusive), show only name and salary.

Output will appear here...

Guided Exercise 4: Multiple Department Filter

Task: Find employees in Engineering, Marketing, or HR departments, sorted by department then name.

Output will appear here...

Guided Exercise 5: Employees with Missing Data

Task: Find employees who don't have an email address on file.

Output will appear here...

Part 2: Independent Practice

Now try these on your own! No pre-written code - you've got this!

Challenge 1: Top 10 Earners

Difficulty: Easy | Time: 5 minutes

Task: Find the top 10 highest-paid employees. Show name, department, and salary. Sort by salary (highest first).

Output will appear here...

Challenge 2: Entry-Level Positions

Difficulty: Easy | Time: 7 minutes

Task: Find employees earning less than $60,000 who work in Engineering or Sales. Sort by department and then salary.

Output will appear here...

Challenge 3: Name Pattern Search

Difficulty: Medium | Time: 8 minutes

Task: Find employees whose names contain "son" (like Johnson, Wilson, etc.). Show name and department, sorted alphabetically by name.

Output will appear here...

Challenge 4: Department Analysis

Difficulty: Medium | Time: 10 minutes

Task: Find all employees NOT in Engineering or Sales departments, who earn between $70,000 and $100,000. Show all columns, sorted by salary descending.

Output will appear here...

Challenge 5: Pagination Practice

Difficulty: Medium | Time: 10 minutes

Task: Show the second page of employees (rows 11-20) when sorted alphabetically by name. Display name, department, and salary.

Output will appear here...

Knowledge Check Quiz

Test your understanding of SELECT and WHERE clauses:

1. Which operator should you use to check if a value is NULL?

2. What does BETWEEN 10 AND 20 include?

3. Which wildcard matches exactly ONE character in a LIKE pattern?

4. What does ORDER BY name DESC do?

5. Which query shows rows 21-30 in a paginated result?

6. What's the best way to find employees in Sales, Marketing, or HR?

7. With AND operator, when is a row included in results?

8. What does LIKE 'J%n' match?