👋🏽 Hey, it’s Ismail. Welcome to data nomads lab newsletter on learning data analytics, career growth, networking, building portfolios, and interview skills to break into tech role as a high-performer.
In Lesson 4, we explored SQL aggregations—learning how to count, sum, average, and group our data to get big-picture insights.
If you missed that one, you can go back and check it out
but here’s a quick refresher: aggregations help us summarize data across multiple rows, answering questions like “What's our average sale amount?” or “How many customers do we have in each region?”
But what if your questions get more complex?
What if you need to find songs that are longer than the average length? Or
customers who spend more than most other customers? Or
artists who have released more albums than usual?
This is where SQL subqueries come in—queries nested inside other queries that help us answer multi-step questions with a single SQL statement.
Think of subqueries like a helper who runs off to find some information, then brings it back so you can use it to answer a bigger question.
It’s like asking “Which songs are longer than average?” where you first need to know “What's the average song length?”
Why Subqueries matter
As data users, we often need to:
Compare individual items against averages (like finding songs longer than the average length)
Filter data based on information from other tables (like finding all songs by a specific artist)
Answer questions that require multiple steps (like finding customers who spend more than average)
Create temporary results to use in another question
Subqueries let us do all of this in a single SQL statement, without having to run multiple separate queries.
What we will cover
In this lesson, we will explore:
What subqueries are and how they work (in simple terms)
Different types of subqueries and when to use them
Real-world examples using a music store database
How to write your own subqueries
Tips for making your subqueries work better
By the end of this lesson, you will be able to write SQL queries that answer complex questions about your data—taking your skills to the next level.
Let’s dive in!
Understanding SQL Subqueries
What is a Subquery?
A subquery is simply a query inside another query. The inner query runs first, and its result is used by the outer (main) query to complete its job.
Here’s a simple example:
SELECT
Name AS "Song Name",
Milliseconds/60000 AS "Length (minutes)"
FROM
tracks
WHERE
Milliseconds > (SELECT AVG(Milliseconds) FROM tracks);
In this example:
The part in parentheses
(SELECT AVG(Milliseconds) FROM tracks)
is the subqueryIt calculates the average song length
The main query then finds all songs longer than this average
It’s like first asking “What's the average song length?” and then using that answer to find “Which songs are longer than that?”
Syntax and Structure
Subqueries are always enclosed in parentheses ( )
and can appear in different parts of a SQL statement:
In the SELECT part (to calculate values)
In the FROM part (as a temporary table)
In the WHERE part (for filtering)
In the HAVING part (for filtering after grouping)
The basic pattern looks like this:
SELECT column_name(s)
FROM table_name
WHERE column_name operator (SELECT column_name FROM table_name WHERE condition);
How Subqueries Work?
Let’s look at two main types of subqueries:
Non-correlated Subqueries
These are independent subqueries that run once, and their result is passed to the outer query
-- Find songs with above-average length
SELECT
Name AS "Song Name",
Milliseconds/60000 AS "Length (minutes)"
FROM
tracks
WHERE
Milliseconds > (SELECT AVG(Milliseconds) FROM tracks);
How it works:
The database first calculates the average song length
Then it finds all songs longer than this average
Correlated Subqueries
These subqueries reference the outer query and run once for each row processed by the outer query.
-- Find artists with more albums than the average artist
SELECT
artists.Name AS "Artist Name",
(SELECT COUNT(*) FROM albums WHERE albums.ArtistId = artists.ArtistId) AS "Number of Albums"
FROM
artists
WHERE
(SELECT COUNT(*) FROM albums WHERE albums.ArtistId = artists.ArtistId) >
(SELECT AVG(album_count) FROM
(SELECT COUNT(*) AS album_count FROM albums GROUP BY ArtistId)
);
How it works:
For each artist, it counts how many albums they have
It also calculates the average number of albums per artist
It shows only artists with more albums than average
Types of Subqueries
Scalar Subqueries
These return a single value (one row, one column). They're often used in comparisons.
-- Find songs with length above average, and show the average too
SELECT
Name AS "Song Name",
Milliseconds/60000 AS "Length (minutes)",
(SELECT AVG(Milliseconds)/60000 FROM tracks) AS "Average Length (minutes)"
FROM
tracks
WHERE
Milliseconds > (SELECT AVG(Milliseconds) FROM tracks);
Column Subqueries
These return a single column with multiple rows. They are typically used with operators like IN.
-- Find all songs by AC/DC
SELECT
tracks.Name AS "Song Name"
FROM
tracks
WHERE
AlbumId IN (
SELECT AlbumId
FROM albums
WHERE albums.ArtistId IN (
SELECT ArtistId
FROM artists
WHERE Name = 'AC/DC'
)
);
Table Subqueries (derived tables)
These return multiple rows and columns, creating a temporary table. They are used in the FROM part.
-- Find the top 3 genres with the highest average track length
SELECT
genre_stats.Genre,
genre_stats.AvgLength
FROM (
SELECT
genres.Name AS Genre,
AVG(tracks.Milliseconds)/60000 AS AvgLength
FROM
tracks
JOIN
genres ON tracks.GenreId = genres.GenreId
GROUP BY
genres.Name
) AS genre_stats
ORDER BY
genre_stats.AvgLength DESC
LIMIT 3;
Practical Applications of Subqueries
Finding Items Above Average
One of the most common uses of subqueries is to find records that exceed average values.
-- Find customers who spend more than the average customer
SELECT
FirstName || ' ' || LastName AS "Customer Name",
(SELECT SUM(Total) FROM invoices WHERE invoices.CustomerId = customers.CustomerId) AS "Total Spent"
FROM
customers
WHERE
(SELECT SUM(Total) FROM invoices WHERE invoices.CustomerId = customers.CustomerId) >
(SELECT AVG(customer_total) FROM
(SELECT SUM(Total) AS customer_total FROM invoices GROUP BY CustomerId)
)
ORDER BY
"Total Spent" DESC;
This query:
Calculates how much each customer has spent
Calculates the average spending per customer
Shows only customers who spent more than average
Finding related records
Subqueries are great for finding records that are related to other records in specific ways.
-- Find all albums by artists who have more than 1 album
SELECT
albums.Title AS "Album Title",
artists.Name AS "Artist"
FROM
albums
JOIN
artists ON albums.ArtistId = artists.ArtistId
WHERE
artists.ArtistId IN (
SELECT ArtistId
FROM albums
GROUP BY ArtistId
HAVING COUNT(*) > 1
)
ORDER BY
artists.Name, albums.Title;
This query:
Finds artists who have more than one album
Shows all albums by those artists
Comparing groups
Subqueries can help compare groups of data against each other.
-- Find genres with more songs than the average genre
SELECT
genres.Name AS "Genre",
COUNT(*) AS "Number of Songs"
FROM
tracks
JOIN
genres ON tracks.GenreId = genres.GenreId
GROUP BY
genres.Name
HAVING
COUNT(*) > (
SELECT AVG(song_count)
FROM (
SELECT COUNT(*) AS song_count
FROM tracks
GROUP BY GenreId
)
)
ORDER BY
"Number of Songs" DESC;
This query:
Counts how many songs are in each genre
Calculates the average number of songs per genre
Shows only genres with above-average song counts
Performance tips and best practices
When to use subqueries?
Subqueries are most useful when:
You need to compare values against averages or totals
You need to filter based on information from another table
You want to avoid creating temporary tables
Making subqueries run faster
Keep subqueries simple: The simpler a subquery, the faster it will run.
Consider using JOINs: Sometimes a JOIN can be faster than a subquery.
-- Subquery approach
SELECT
tracks.Name AS "Song Name"
FROM
tracks
WHERE
GenreId IN (
SELECT GenreId
FROM genres
WHERE Name = 'Rock'
);
-- Equivalent JOIN approach (often faster)
SELECT
tracks.Name AS "Song Name"
FROM
tracks
JOIN
genres ON tracks.GenreId = genres.GenreId
WHERE
genres.Name = 'Rock';
Use EXISTS instead of IN for large datasets: The EXISTS operator often performs better because it stops once it finds a match.
-- Using IN
SELECT
artists.Name AS "Artist"
FROM
artists
WHERE
ArtistId IN (
SELECT ArtistId
FROM albums
WHERE Title LIKE '%Live%'
);
-- Using EXISTS (often more efficient)
SELECT
artists.Name AS "Artist"
FROM
artists a
WHERE
EXISTS (
SELECT 1
FROM albums
WHERE albums.ArtistId = a.ArtistId
AND Title LIKE '%Live%'
);
Avoid too much nesting: Try not to nest subqueries too deeply (more than 2-3 levels) as it makes queries hard to read and can slow things down.
Why subqueries are useful?
Think of subqueries like having a helper who can quickly answer a question while you are in the middle of asking a bigger question.
For example, if you want to know “Which songs are longer than average?”, you first need to know “What's the average song length?” A subquery lets you answer that second question right inside your main question.
Without subqueries, you would have to:
Run a separate query to find the average song length
Write down that number
Run another query using that number to find the longer songs
Subqueries let you do all of this in one step, making your work more efficient and your code cleaner.
Conclusion: mastering SQL subqueries
Congratulations! You've now explored SQL subqueries and learned how they can help you answer complex questions about your data. Let's recap what we've covered:
Key takeaways
Subqueries are queries nested within other queries, allowing you to perform multi-step operations in a single SQL statement.
Different types of subqueries serve different purposes:
Scalar subqueries return a single value (like the average song length)
Column subqueries return a list of values (like all album IDs for an artist)
Table subqueries return a temporary table with multiple rows and columns
Subqueries can appear in different parts of a SQL statement:
In the SELECT part (to calculate values)
In the FROM part (as a temporary table)
In the WHERE part (for filtering)
In the HAVING part (for filtering after grouping)
There are two main types of subqueries:
Non-correlated subqueries run once and pass their result to the outer query
Correlated subqueries reference the outer query and run once for each row processed
Performance tips to remember:
Sometimes JOINs are faster than subqueries
EXISTS often performs better than IN for large datasets
Keep your subqueries as simple as possible
Real-world applications
The techniques you have learned in this lesson can help you:
Find songs that are longer than average
Identify customers who spend more than others
Discover which music genres are most popular
Find artists who have released multiple albums
Answer complex business questions with a single query
By mastering subqueries, you have added a powerful tool to your SQL toolkit that will help you tackle more complex data challenges.
Stay tuned for more examples and walkthroughs!
If this lesson helped you, please share and support Data Nomads Lab on Substack. Your support helps keep this content free and accessible to learners around the world.