Data transfer: INSERT INTO in chunks of 10,000 rows at a time

3.9k views

I have a 6M-row table that needs to be sent to another server each day. It seems to never finish when I test with:

insert into serverA.dbName.dbo.table1
select * from serverB.dbName.dbo.table1

Is there such a thing as "do 10,000 rows at a time"?

Something like:

insert into serverA.dbName.dbo.table1
select * from serverB.dbName.dbo.table1
**by numOfRowEachTime=10000**

Otherwise I will write a loop.

1 Answer
Answered by Solomon Rutzky

6 million rows in a single transaction is a sizable chunk to begin with, and DML statements across linked servers aren't great either. Two methods to consider (both have worked for me) are:

  1. If there is an easy way to copy them in order (such as an ID field) then you can:

    • Keep track of the last ID sent in a new table on ServerB
    • Write a proc on ServerB that does a loop similar to the following:

      DECLARE @StartingID INT = 1,
              @EndingID INT,
              @DataToTransport NVARCHAR(MAX);
      
      SELECT @StartingID = StartingID FROM dbo.ProcessStatus;
      
      WHILE (1 = 1)
      BEGIN
         ;WITH cte AS
         (
            SELECT TOP (10000) t1.ID,
                   ROW_NUMBER() OVER (ORDER BY t1.ID ASC) AS [RowNum]
            FROM   dbo.Table1 t1
            WHERE  t1.ID >= @StartingID
            ORDER BY t1.ID ASC -- TOP without ORDER BY is nondeterministic
         )
         SELECT TOP (1) @EndingID = cte.[ID] FROM cte
         ORDER BY cte.[RowNum] DESC;
      
         IF (@@ROWCOUNT = 0)
         BEGIN
            UPDATE dbo.ProcessStatus SET StartingID = 1;
            BREAK; -- reset to start at the beginning and exit
         END;
      
         SET @DataToTransport = CONVERT(NVARCHAR(MAX), (
             SELECT t1.* FROM dbo.Table1 t1
             WHERE t1.ID BETWEEN @StartingID AND @EndingID
             FOR XML RAW));
      
         EXEC [ServerA].[dbName].[dbo].Table1_ImportFromServerB @DataToTransport;
      
         -- Persist the *next* starting point so a failed run resumes past
         -- the chunk that was just sent, rather than re-sending it
         UPDATE dbo.ProcessStatus SET StartingID = (@EndingID + 1);
      
         SET @StartingID = (@EndingID + 1);
      END;
      
    • Write a proc on ServerA -- [dbName].[dbo].Table1_ImportFromServerB -- that accepts the NVARCHAR(MAX) input parameter, converts it to XML, and does an INSERT INTO Table1 SELECT ... FROM either @XMLvariable.nodes() or OPENXML
    • Sending over a batch wrapped up as XML to a stored proc over a Linked Server is much faster than doing a straight INSERT over a Linked Server
    • You have to send the XML as NVARCHAR(MAX) because the XML datatype is not allowed across a Linked Server
    • The [ProcessStatus] table just keeps track of the current position so that, if the process fails, it can resume where it left off. You can also set a time limit or a counter of X iterations and schedule the proc to run from a SQL Agent job every N minutes or so (which is how I have always done it).
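As a sketch of the receiving side, assuming a shape not shown in the answer: the one-row [ProcessStatus] table and the [ID]/[Col1] columns below are placeholders for Table1's real schema.

```sql
-- Hypothetical one-row status table on ServerB, matching the loop above:
CREATE TABLE dbo.ProcessStatus (StartingID INT NOT NULL);
INSERT INTO dbo.ProcessStatus (StartingID) VALUES (1);
GO

-- Hypothetical receiving proc on ServerA; [ID] and [Col1] are stand-ins
-- for Table1's actual columns.
CREATE PROCEDURE [dbo].[Table1_ImportFromServerB]
   @DataToTransport NVARCHAR(MAX)
AS
BEGIN
   SET NOCOUNT ON;

   -- FOR XML RAW on the sender produces <row ID=".." Col1=".." /> elements
   DECLARE @Rows XML = CONVERT(XML, @DataToTransport);

   INSERT INTO dbo.Table1 ([ID], [Col1])
      SELECT x.[row].value('@ID',   'INT'),
             x.[row].value('@Col1', 'NVARCHAR(100)')
      FROM   @Rows.nodes('/row') AS x([row]);
END;
```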


  2. Use SQLCLR to make use of the SqlBulkCopy class that is designed for this type of task. It is essentially the same API that is used by BCP, OPENROWSET(BULK...), and BULK INSERT. You can write your own CLR-based stored procedure, or there is one already done and available for free in the SQL# library (which I am the author of, but again, it is free). The stored procedure is called DB_BulkCopy and allows for setting a batch size.
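For comparison, the same batch-size control exists in the plain BULK INSERT path mentioned above; the export step and file path here are illustrative, not part of the answer.

```sql
-- Export on ServerB first, e.g. from a command prompt (illustrative path):
--    bcp dbName.dbo.table1 out C:\xfer\table1.dat -n -S ServerB -T
-- Then load on ServerA, committing every 10,000 rows:
BULK INSERT dbo.Table1
FROM 'C:\xfer\table1.dat'
WITH (DATAFILETYPE = 'native', BATCHSIZE = 10000, TABLOCK);
```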