SQL - Split query data stream into 2 separate tables [Theoretical Optimisation]

430 views Asked by At

I am writing some SQL code to be run in MapBasic (MapInfo's Programming language). The best way to describe the question is with an example:

I want to select all records where ShipType="Barge" into a query named Barges and I want all the remaining records to be put in a query OtherShips.

I could simply use the following SQL commands:

select * from ShipsTable where  ShipType = "Barge" into Barges
select * from ShipsTable where  ShipType <> "Barge" into OtherShips

That's fine and all but I can't help but feel that this is inefficient. Won't SQL be searching through the database twice? Won't it find the rows of data that fit the 2nd Query during the processing of the 1st?

Instead, it would be faster if there was a command like:

select * from ShipsTable where ShipType = "Barge" into Barges ELSE into OtherShips

My question is, can you do this? Is there a command that fits this spec?

Thanks,

2

There are 2 answers

1
T_Bacon On BEST ANSWER

MapBasic does provide you access to MapInfo's 'Invert Selection' which would give you anything that wasn't selected from your first query (assuming your first query does return results). You can call it by using it's menu ID (found in Menu.def) which is 311 or if you include menu.def at the top of the file you can reference it through the constant M_QUERY_INVERTSELECT.

eg.

Select * from ShipsTable where  ShipType = "Barge" into Barges
Run Menu Command 311

or Run Menu Command M_QUERY_INVERTSELECT if you have included the menu definitions file.

I believe this would give you better performance than doing a second selection as per your example but you wouldn't be able to then name the results table with an alias without doing another selection. Depends on your use case whether this is worth using or not, for a large query that takes quite a while it could well save on some processing time.

1
Martin Smith On

You could do this quite easily in SSIS with a conditional split and two different destinations.

But not really in TSQL.

However for "fun" some possibilities are looked at below.

You could create a partitioned view but the requirements that you need to meet for this are quite arduous and the execution plan just loads it all into a spool and then reads the spool twice with two different filters anyway.

CREATE TABLE Barges
(
Id INT,
ShipType VARCHAR(50) NOT NULL CHECK (ShipType = 'Barge'),
PRIMARY KEY (Id, ShipType)
)

CREATE TABLE OtherShips
(
Id INT,
ShipType VARCHAR(50) NOT NULL CHECK (ShipType <> 'Barge'),
PRIMARY KEY (Id, ShipType)
)

CREATE TABLE ShipsTable
(
ShipType VARCHAR(50) NOT NULL
)

go

CREATE VIEW ShipsView
AS
SELECT *
FROM Barges
UNION ALL
SELECT *
FROM OtherShips

GO

INSERT INTO ShipsView(Id, ShipType)
SELECT ROW_NUMBER() OVER(ORDER BY @@SPID), ShipType
FROM ShipsTable

enter image description here

Or you could use the OUTPUT clause and composable DML but that would require inserting both sets of rows into the first table and then cleaning out the unwanted rows afterwards (the second table would only get the correct rows and not need any clean up).

CREATE TABLE Barges2
(
ShipType VARCHAR(50) NOT NULL
)

CREATE TABLE OtherShips2
(
ShipType VARCHAR(50) NOT NULL
)

CREATE TABLE ShipsTable2
(
ShipType VARCHAR(50) NOT NULL
)

INSERT INTO Barges2
SELECT *
FROM 
(
INSERT INTO OtherShips2
OUTPUT INSERTED.*
SELECT *
FROM ShipsTable2
) D
WHERE D.ShipType = 'Barge';

DELETE FROM OtherShips2 WHERE ShipType = 'Barge';

enter image description here