I have two databases and I need to ensure that all records in one database have a matching record in the other. I will call these DB-SQL and DB-Legacy
This would be easy enough if both had an SQL interface, but unfortunately I only have this type of access to one, the other I have a 'find record / first / next' type interface.
The method that I have chosen to perform this task is to transfer DB-SQL to a clientdataset via the code below:
var
lQuery: TADOQuery;
lProvider: TDataSetProvider;
lDataSet: TClientDataSet;
begin
lQuery := TADOQuery.Create(nil);
lProvider := TDataSetProvider.Create(nil);
lDataSet := TClientDataSet.Create(nil);
// we don't need either of these and should speed things up
lDataSet.disablecontrols;
lQuery.DisableControls;
try
lQuery.Connection := aConnection;
lQuery.SQL.Add('SELECT FieldA, FieldB, FieldC, 0 as FoundInGIS');
lQuery.SQL.Add('FROM TableA');
// following two lines needed to allow us to modify the FoundInGIS field in the clientdataset
lQuery.open;
lquery.fieldbyname('FoundInGIS').Readonly := false;
lProvider.DataSet := lQuery;
lDataSet.Data := lProvider.Data;
lDataSet.fieldbyname('FoundInGIS').readonly := false;
lDataSet.LogChanges := false;
// index by FieldA for quick searching by FindKey later
lDataSet.IndexFieldNames := 'FieldA';
finally
lQuery.Free;
lProvider.Free;
end;
This is based on code at http://www.podgoretsky.com/ftp/docs/Delphi/D5/dg/5_ds3.html#20536
This will then allow me to iterate over DB-legacy with First/Next until EOF, searching the ClientDataSet using FindKey to ensure all records in DB-Legacy exist in DB-SQL. By setting the FoundInGIS tag to 1, I can then filter by this value to find all records that are in DB-SQL but not in DB-Legacy.
My problem is that one of our databases is significantly bigger than the others, at 3,310,510 records. lQuery has the correct number of records, however at the end of the procedure lDataSet only has around 2,500,000.
Now, I want to use the cds to make use of the FindKey method, which isn't supported in TADOQuery, however it's not much use if it ignores 1/3 of the records! I'm guessing that there may be an integer overflow somewhere within the DataSetProvider or the ClientDataSet, though it's a bit naughty that it doesn't raise an exception if so! Has anyone else had this kind of issue, and is there a way of sorting it (perhaps by downloading the data in smaller chunks or using another way of populating CDS)?
The SQL-DB in this case is Oracle, but code will also need to work with SQL-Server, though I doubt it's a DB issue.
Edit: I'm getting some slightly different behaviour now. When I try removing some of the fields from the query it runs fine. All of the fields run OK individually, but it just can't handle all of them (which supports my overflow hypothesis). I am now, however getting an exception being raised occasionally. The exception is
'Format '%s' invalid or incompatible with argument'
This is misleading, as delving down the Debug DCUs shows the error is being raised by
SafeArrayCheck(SafeArrayCopy(VarToDataPacket(Value), FSavedPacket));
in TCustomClientDataSet.SetData (DBClient line 1482). This raises an ESafeArrayError (AResult = -2147024882) which becomes a 'Unexpected variant or safe array error', but it can't handle a subsequent call to FormatStr.
OK - after playing with the number of fields brought down I was convinced that TClientDataSet couldn't insert all records in one chunk (2 fields inserted all records, all but one inserted about 2,900,000 and all of them inserted c2,500,000). The same result was experienced by calling SetProvider and using lDataSet.open.
I'd already convinced myself that the issue wasn't on the lQuery or TDataSetProvider side, as the TCustomProvider.GetData was returning the correct recordcount after calling GetRecords.
In the end I have been able to sort it out by splitting the data into 100,000 record chunks as follows:
This appears to work fine, and even gives me an opportunity to attach it to a progress bar if I see fit.
Still not impressed that there wasn't a sensible exception raised in the VCL code, though.