How to find orphaned rows without having to specify all possible FK relationships?

98 views Asked by At

(I have already seen this answer, but this does not answer my specific question).

I have a table - in MSSQLServer - named Files, with many rows. There are FK relationships to Files from many other tables. I would like to delete all orphaned rows from Files (i.e. any row where there is no FK relationship to it from any other table).

If I attempt to delete a row manually from within the Data view (in SQL Server view of VS) then this will succeed if the row is an orphan, but fail if there is an FK reference from any table - so (as I would expect) the database is checking this. But how can I either delete all such orphaned rows, or, preferably, retrieve them in a SQL query to inspect them.

1

There are 1 answers

1
GarethD On

To address something you said in the comments:

doesn't this still require you to specify the FK relationships? If I'm wrong, please post an example of how to use it against the File table where you don't know about any of the FK relationships defined in other tables

This is not possible without specifying the tables to check against, but while you may not know what all the FKs are off the top of you head, as long as they are defined the system does and you can query that to build your query dynamically to check all child tables:

DECLARE @ObjectName sysname = N'dbo.SomeTable';

DECLARE @sql NVARCHAR(MAX);

SELECT  @sql = CONCAT('SELECT * FROM ', @ObjectName, ' AS p WHERE NOT EXISTS(', STRING_AGG(t.ChildTable, ' UNION ALL '), ')')
FROM
        (
            SELECT  ChildTable = CONCAT(
                                       'SELECT 1 FROM ', QUOTENAME(s.name), '.', QUOTENAME(t.name), ' AS c WHERE ',
                                       STRING_AGG(CONCAT('c.', c2.name, ' = p.', c.name), ' AND ')
                                       )
            FROM    sys.foreign_key_columns AS fkc
                    INNER JOIN sys.columns AS c
                        ON  c.object_id = fkc.referenced_object_id
                        AND c.column_id = fkc.referenced_column_id
                    INNER JOIN sys.foreign_keys AS fk
                        ON fk.object_id = fkc.constraint_object_id
                    INNER JOIN sys.tables AS t
                        ON t.object_id = fk.parent_object_id
                    INNER JOIN sys.schemas AS s
                        ON s.schema_id = t.schema_id
                    INNER JOIN sys.columns AS c2
                        ON  c2.object_id = fkc.parent_object_id
                        AND c2.column_id = fkc.parent_column_id
            WHERE   fkc.referenced_object_id = OBJECT_ID(@ObjectName, 'U')
            GROUP BY
                    s.name, t.name, fk.object_id
        ) AS t
HAVING COUNT(*) > 0;

PRINT @SQL;
--EXECUTE sp_executesql @sql;

So with Adventureworks if you use HumanResources.Employee as the input table, you end up with SQL something like:

SELECT  *
FROM    HumanResources.Employee AS p
WHERE   NOT EXISTS
(
    SELECT  1
    FROM    HumanResources.EmployeeDepartmentHistory AS c
    WHERE   c.BusinessEntityID = p.BusinessEntityID
    UNION ALL
    SELECT  1
    FROM    HumanResources.EmployeePayHistory AS c
    WHERE   c.BusinessEntityID = p.BusinessEntityID
    UNION ALL
    SELECT  1
    FROM    HumanResources.JobCandidate AS c
    WHERE   c.BusinessEntityID = p.BusinessEntityID
    UNION ALL
    SELECT  1
    FROM    Production.Document AS c
    WHERE   c.Owner = p.BusinessEntityID
    UNION ALL
    SELECT  1
    FROM    Purchasing.PurchaseOrderHeader AS c
    WHERE   c.EmployeeID = p.BusinessEntityID
    UNION ALL
    SELECT  1
    FROM    Sales.SalesPerson AS c
    WHERE   c.BusinessEntityID = p.BusinessEntityID
);

Examples on db<>fiddle