How can I join data to my table that isn't available for everyone without losing results?

51 views Asked by At

I have a main table, Table 1, that includes columns for student ID and some other info. There is also Table 2 which includes columns for student ID, survey question, and response.

Essentially, I want to append Table 2 to Table 1, but:

  • I only want to bring in the responses for the "Did you receive a GPA of 3.0 or higher?" survey question.
  • This question was not asked to students before 04/01/23.
  • For those who were not asked this question, I'd like for their row to just have a null after appending the tables.

Table 1 Example:

ID Date
1 03/01/23
2 03/05/23
3 04/10/23
4 04/12/23
5 04/21/23

Table 2 Example:

ID Date Question Response
1 03/01/23 What grade are you in? 9
2 03/05/23 What grade are you in? 10
3 04/10/23 What grade are you in? 11
3 04/10/23 Did you receive a GPA of 3.0 or higher? Yes
4 04/12/23 What grade are you in? 9
4 04/12/23 Did you receive a GPA of 3.0 or higher? Yes
5 04/21/23 What grade are you in? 11
5 04/21/23 Did you receive a GPA of 3.0 or higher? No

Example Desired Output:

ID Date Question Response
1 03/01/23 Did you receive a GPA of 3.0 or higher? null
2 03/05/23 Did you receive a GPA of 3.0 or higher? null
3 04/10/23 Did you receive a GPA of 3.0 or higher? Yes
4 04/12/23 Did you receive a GPA of 3.0 or higher? Yes
5 04/21/23 Did you receive a GPA of 3.0 or higher? No

How can I extract this question only into my results without excluding the students who were not asked it? If I add a WHERE clause:

SELECT t1.*, t2.Question, t2.Response
FROM Table_1 t1
LEFT JOIN Table_2 t2
ON t1.ID = t2.ID
WHERE t2.Question IN ('Did you receive a GPA of 3.0 or higher?')

Then it will exclude students 1 and 2. If I don't add the WHERE clause, then it will pull in all of the survey questions that, in my real data set, will generate 100+ rows per student which leaves me with too large a dataset.

2

There are 2 answers

0
CHill60 On

Instead of using WHERE add that condition to the ON-Clause e.g.

SELECT T1.*, t2.Question, t2.Response
FROM Table_1 t1
LEFT JOIN @Table_2 t2 ON t1.ID = t2.ID AND t2.Question = ('Did you receive a GPA of 3.0 or higher?')
0
Asgar On

You are almost there just use

SELECT t1.*, IFNULL(t2.Question,'Did you receive a GPA of 3.0 or higher?'), t2.Response
FROM Table_1 t1
LEFT JOIN Table_2 t2
ON t1.ID = t2.ID
AND t2.Question IN ('Did you receive a GPA of 3.0 or higher?')
ORDER BY t1.`ID`

You were using WHERE

LEFT JOIN Table_2 t2
ON t1.ID = t2.ID
wHERE t2.Question IN ('Did you receive a GPA of 3.0 or higher?')

which makes the LEFT JOIN pointless, instead, use the second case t2.Question IN ('Did you receive a GPA of 3.0 or higher?') within the ON clause which will join your data accordingly.