My question is: what is the problem in my thinking?
Why is the left join not working as I thought?
A working answer was given (without explanation), here.
More info way down.
I have this sql fiddle displaying the problem I am facing when using left outer join
Structure
CREATE TABLE IF NOT EXISTS `mychanges` (
`object_id` int(11) unsigned NOT NULL,
`version_stamp` datetime DEFAULT NULL,
`object_name` varchar(255) DEFAULT NULL,
`project` int(11) unsigned DEFAULT NULL,
PRIMARY KEY (`object_id` ,`version_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO mychanges VALUES (1266, "2014-09-24 09:55:40", "bbbb", 2 );
INSERT INTO mychanges VALUES (1236, "2014-09-24 09:55:40", "aaa", 2 );
INSERT INTO mychanges VALUES (1226, "2014-09-24 09:55:40", "zzz", 5 );
INSERT INTO mychanges VALUES (1216, "2014-09-24 09:55:40", "xxxx", 8 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:51:40", "name1", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:52:40", "name2", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:53:40", "name3", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:54:40", "name4", null );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:56:40", "name5", null );
SELECT
SELECT mychanges.object_id AS objectid1, mychanges.object_name AS objectname1, mychanges.version_stamp AS version_stamp1, change2.project as project2, change2.version_stamp as version_stamp2, change2.object_name as objectname2 FROM mychanges
LEFT JOIN (SELECT * from mychanges AS x WHERE x.project IS NULL) AS change2
ON change2.object_id = mychanges.object_id
WHERE mychanges.project = 10
What I want to get:
The values inserted by
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:51:40", "name1", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:52:40", "name2", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:53:40", "name3", 10 );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:54:40", "name4", null );
INSERT INTO mychanges VALUES (1256, "2014-09-24 09:56:40", "name5", null );
So in written language: all records that have as project a 10 and all records that have the same object id with a record that has the project id of 10.
Why?
This is a changeset table. A user is changing an object name and hits enter. This generates entries in form of a change. These changes, can, sadly, have a project id of null. The object id is unique though.
What am I thinking?
I thought: right we have all entries with project_id NULL. Left we have all records with project_id with 10. If we join them using object_id
, we get both, without duplicates. But instead we get duplicates of name4 and name5 and not name1, name2, and name3.
Why not?
No. You can't join that because the values
NULL
is not comparable with any other value, not even withNULL
. If you have null, you need to handle it explicitly by usingifNULL()
function.EDIT
On second look, I realize that you are trying to join with the
object_id
which is of course not null. So let us see what is happening in your query. If I may break your query in 2 parts, the first part is:This produces 3 records with
object_id
= 1256And the 2nd part is,
This produces 2 records with
object_id
= 1256.When you join these 2 parts using condition
change2.object_id = mychanges.object_id
it produces 3 X 2 = 6 records in the result. The duplicate originates as for each record in left side, there are 2 records in the right side.This is exactly what the SQL is supposed to behave.