SQL basics 3¶
JOIN¶
- In SQL language join allows selecting columns from multiple tables to be shown in the result set
- Join is used in most cases on production environments
- Join is defined with primary and foreign keys connecting tables
-
We are going to go through several join types in this section:
- Inner join
- Left join
- Right join
-
Basically left and right join are considered outer join methods
- Figure below presents how data from two tables is gathered in different join methods
Inner join¶
- Inner join only shows values which have corresponding value stored on another table
- Using join in select query requires the following:
- Tables are presented in FROM clause with INNER JOIN clause
- ON clause is used after FROM clause and key values connecting tables must be given with ON
- All tables where columns will be selected to result set must be included in FROM clause and key columns included in ON clause
- Alias name for tables can be used for the whole query when alias is set in FROM clause
Example
Select student's ID and name (firstname and surname) from Student table, grade and date from Grade table and course name from Course table. Order results by student's surname.
SELECT S.studentID, S.firstname, S.surname, G.grade, G.date, C.name
FROM Course C
INNER JOIN Grade G ON C.courseID = G.courseID
INNER JOIN Student S ON S.studentID = G.studentID
ORDER BY S.surname;
Left and Right join¶
- Outer join methods can be used to show also values that have no corresponding value stored on another table
-
Description of outer join methods: Left join: select all data from the left table Right join: select all data from the right table
-
As can be seen from the example below, Student and Attendance tables are connected with StudentID
- Let's see how outer joins work on these tables
Left join¶
- All values from the left table are returned
SELECT firstname, surname, age, courseID
FROM Student left join Attendance
ON Student.StudentID = Attendance.StudentID;
- This query will return the following result set
firstname | surname | age | courseID |
---|---|---|---|
John | Addington | 23 | 2 |
Emma | Garrick | 22 | Null |
Sam | Marlee | 24 | 2 |
Sam | Marlee | 24 | 3 |
Mia | Hailey | 23 | 2 |
Mia | Hailey | 23 | 3 |
- As can be seen from the result set above, on the second row there is Null value present since Emma does not have any course attendances stored in the Attendance table
Right join¶
- All values from the right table are returned
SELECT firstname, surname, age, courseID
FROM Student right join Attendance
ON Student.StudentID = Attendance.StudentID;
- This query will return the following result set
firstname | surname | age | courseID |
---|---|---|---|
John | Addington | 23 | 2 |
Sam | Marlee | 24 | 2 |
Sam | Marlee | 24 | 3 |
Mia | Hailey | 23 | 2 |
Mia | Hailey | 23 | 3 |
- This result set has only five rows since Emma does not have any course attendances stored in the Attendance table (All Attendance table rows are returned)
UNION¶
- While join is used to gather multiple columns from two or more tables into the result set, union is used to gather rows from multiple tables
- Below is a figure presenting how union works
- The syntax for union statement is as follows
SELECT statement 1
UNION
SELECT statement 2
- Important: Take note of these when using union:
- SELECT statements must have equal amount of columns in same order
- Corresponding columns from both tables must be the same data type
- ORDER BY will be present after the select statements and will affect the whole result set
Example
List all persons from company's project including employees and partners. Order the result set by age starting from oldest.
SELECT * from Employee
UNION
SELECT * from Partner
ORDER BY age DESC;
- The following result set will be returned
+----+-----------+------------+-----+------------+
| ID | firstname | surname | age | position |
+----+-----------+------------+-----+------------+
| 3 | Mike | Bellington | 50 | Manager |
| 2 | Lucy | Smith | 40 | Secretary |
| 3 | Sam | Morton | 24 | Developer |
| 1 | John | Cason | 23 | Team lead |
| 1 | Mia | Hailey | 23 | Researcher |
| 2 | Emma | Braxton | 22 | Developer |
| 4 | Sandra | Blythe | 21 | Researcher |
+----+-----------+------------+-----+------------+
Subqueries¶
- Subqueries are always created inside the main query
- Subqueries can be nested
- Below is the syntax for using subqueries in MySQL
SELECT statement 1 -- Main query
(SELECT statement 2) -- First subquery
(SELECT statement 3) -- Second subquery
...
- Queries are executed started from the most inner subquery [3 → 2 → 1]
- Result from inner query is passed to outer
Example
Select all employees who come from the same city as employee with ID of 255.
SELECT firstname, surname, age
FROM Employee
WHERE city =
(SELECT city
FROM Employee
WHERE ID = 255);
IN subqueries¶
- You can only return one row from regular subquery
- IN subquery allows the return of multiple rows
Example
Select firstname and surname of employees who work in a team having budget over 100000.
SELECT firstname, surname
FROM Employee
WHERE team_name IN
(SELECT team_name
FROM Teams
WHERE budget > 100000);
ALL and ANY subqueries¶
- IN subquery cannot be used if comparison operators (< or >) are present in query
- Then you must use either ANY or ALL subquery
- Description: ANY: Any value from subquery matches the condition ALL: All values from subquery match the condition
Example
Select employees with salary greater than all salaries of employees from development team (TeamID = 3).
SELECT firstname, surname, salary
FROM Employees
WHERE salary > ALL
(SELECT salary
FROM Employee
WHERE TeamID = 3);
EXISTS subquery¶
- EXISTS subquery can be used for testing whether particular rows exits in a table
- SELECT statement will always be used with * selector (all columns)
- Result will be one of the following
- True: Query returns at least one row that matches the condition
- False: Query does not return any rows
Example
Select teams having at least one player.
SELECT teamID, team_name
FROM Team
WHERE EXISTS
(SELECT *
FROM Player
WHERE Team.teamID = Player.teamID);
In this query team will be selected if it's teamID is present at least once in Player table.