Saturday, January 26, 2008

Left Joins in SQL - TUTORIAL

Audience: Those with a basic understanding of the SQL language, and have a reasonable understanding of how to use the WHERE clause.

I'm sure most of you familiar with SQL have all seen the term "left join" around the place. Up until now, you have probably gotten away with never having to use it.

I recently had to learn about left joins, because a simple query I was using was not returning a complete set of records, as I had intended it to.

As with all innovations, the "left join" exists to solve a problem. I'll highlight the problem a left join can help us solve, using an example.

Let's say you have a database that stores people's names, and their favourite colours. The database could have two tables, and look like this:

Names (table)



Colours (table)


As you can see, the two tables are linked on the fave_colour column in the Names table, which relates to a colour_id in the Colours table.

So, let's say we want to run a query that will retrieve everyone's name, and their favourite colour. You might think the proper way to do this would be easy:

SELECT name, colour FROM names, colours WHERE fave_colour = colour_id

The above query would return the following results:



Looks ok.. but hang on. There are some names missing. Have a look at the Names table again, and you will see a "Simon" and a "Carol". Why haven't they showed up in the query result?

To get an answer, let's take another look at the query we used:

SELECT name, colour FROM names, colours WHERE fave_colour = colour_id

This query is asking to retrieve all names and colours "where the value in the fave_colour column matches a value in the colour_id column".

Have a look at the fave_colour column in the Names table, next to Simon and Carol. You will see that Simon and Carol have a fave_colour value of 4.

Have a look at the Colours table. There is no colour_id of 4. Hence, if we run a query that asks only to return records where the fave_colour column matches a colour_id column, there will be no match for Simon and Carol! Hence, Simon and Carol's records will not be returned.

The way to solve this problem is to slightly change our query. Here, again, is the query we started with:

SELECT name, colour FROM names, colours WHERE fave_colour = colour_id

We need to make two changes to this query. For the first change, I am going to remove the comma between names and colours, and put the words LEFT JOIN in its place. For the second change, I am going to remove the word WHERE, and put the word ON in its place. Here's what the new query would look like:

SELECT name, colour FROM names LEFT JOIN colours ON fave_colour = colour_id

So, what do these changes do?

In the new query, you will notice on each side of the words LEFT JOIN are the names of two tables. In this case, names is on the immediate left, and colours is on the immediate right. The table on the left hand side of LEFT JOIN will have all of its selected records returned, regardless if there is a match in the table on the right hand side. Then, just like a WHERE clause, we need to say which two columns we want to look for a match on. When using LEFT JOIN, we just replace the word WHERE with ON.

So running the new query will give us the following results:

SELECT name, colour FROM names LEFT JOIN colours ON fave_colour = colour_id



There we go. All relevant records from the Names table have now been returned, along with the favourite colour (if properly recorded).

No comments: